NOMATEN HYBRID-SEMINAR May 22: Machine Learning in Protein Classification, Modeling and Design

Online: https://meet.goto.com/NCBJmeetings/nomaten-seminar
In-person: NOMATEN seminar room (102)

Thursday, May 22nd, 2025, 1 PM (CET)

Machine Learning in Protein Classification, Modeling and Design

Dominik Gront, Ph.D., D.Sc.
University of Warsaw

 

Abstract:

Machine learning has been applied in the natural sciences for many years, including in bioinformatics and molecular modeling. However, it is only within the last decade that the rapid development of deep learning, along with the increased availability of data and computational power, has enabled a qualitative leap in the application of these methods. In my presentation, I will showcase selected examples of machine learning techniques developed and applied in our laboratory, illustrating their potential in protein analysis and modeling.

The first example involves the use of clustering methods to classify cytochrome P450 sequences, employing similarity measures based on local alignments and unsupervised grouping algorithms. The outcome of this work is P450Atlas: an automated tool for detecting and annotating new sequences within the P450 superfamily.

The second example demonstrates the application of deep convolutional neural networks to support coarse-grained methods in protein structure modeling. Our method, deepBBQ, enables the reconstruction of complete protein backbone structures based solely on the positions of Cα atoms. The model, trained on a large structural dataset, achieves accuracy comparable to experimental data.

In a third project, we focus on the application of generative deep learning models (e.g., variational autoencoders) to generate realistic conformations of protein fragments. The aim of this approach is to enrich the conformational space in molecular simulations and improve the modeling of intrinsically disordered or partially known protein structures.

During the presentation, I will also discuss the challenges of training machine learning models on biological data and potential directions for further development of these methods in the context of biomolecular modeling.

 

Bio:

Dominik Gront, Ph.D., D.Sc.

Dominik Gront received his Ph.D. in chemical sciences with distinction in 2006 from the Faculty of Chemistry at the University of Warsaw. He subsequently completed two international postdoctoral fellowships: at the University of Virginia in Charlottesville in the group of Prof. Władek Minor, and at the University of Washington in the laboratory of Prof. David Baker. In 2016, he obtained his habilitation at the Faculty of Chemistry of the University of Warsaw, where he has been employed as an associate professor since 2020. Dr. Gront’s scientific work focuses on the development of generative neural networks for protein modeling and the classification of proteins from the cytochrome P450 superfamily. Since 2019, Dr. Gront has also been a board member of Rosetta Commons, an organization dedicated to the development and maintenance of the Rosetta software suite.

 

 

 



This project has received funding from the European Union Horizon 2020 research and innovation programme under grant agreement No 857470 and from European Regional Development Fund via Foundation for Polish Science International Research Agenda PLUS programme grant No MAB PLUS/2018/8.
The project is co-financed from the state budget within the framework of the undertaking of the Minister of Science and Higher Education "Support for the activities of Centers of Excellence established under Horizon 2020".

Grant: 5 143 237,70 EUR
Total value: 29 971 365,00 EUR
Date of signing the funding agreement: December 2023

The purpose of the undertaking is to support entities of the higher education and science system that have received funding from the European Union budget in the competition H2020-WIDESPREAD-2018-2020/WIDESPREAD-01-2018-2019: Teaming Phase 2. The support covers the preparation, implementation and updating of activities; the maintenance of material resources necessary for carrying out activities; the acquisition and modernization of scientific and research apparatus; the maintenance and development of the personnel potential necessary for the implementation of activities; and the dissemination of the results of scientific activities.