Dina Schneidman

Integrative Structure Modeling in the Age of Deep Learning

Dina Schneidman

The Hebrew University of Jerusalem, Israel

Deep learning models such as AlphaFold2 and RoseTTAFold enable high-accuracy protein structure prediction. However, large protein complexes remain challenging to predict because of their size and the complexity of the interactions between multiple subunits. Integrative structure modeling is often used to characterize the structures and dynamics of large macromolecular assemblies by combining various types of input information, such as available protein structures and models, cross-linking mass spectrometry, cryo-electron microscopy, and small-angle X-ray scattering. Recent deep-learning progress in protein structure prediction has improved structural coverage for domains, and even for protein-protein interactions, both of which are essential inputs for integrative structure modeling. I will present CombFold, a hierarchical and combinatorial assembly algorithm that predicts the structures of large protein complexes from pairwise interactions between subunits predicted by AlphaFold2. We test the method on a benchmark of large heteromeric assemblies (up to 30 chains and 18,000 amino acids) and obtain a success rate of ~70%. Distance restraints based on cross-linking mass spectrometry can guide the assembly. Moreover, we design a deep learning model that predicts the optimal distance range for a cross-linked residue pair from the structures of the residues' neighborhoods. These tools will be useful in expanding structural coverage beyond monomeric proteins.
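To make the idea of cross-link distance restraints concrete, here is a minimal sketch of how a candidate assembly might be scored by the fraction of cross-links it satisfies. This is not CombFold's implementation: the function name, the coordinate layout, and the 30 Å cutoff are all illustrative assumptions, and the abstract's learned per-pair distance range would replace the single fixed cutoff used here.

```python
import math

# Hypothetical upper bound (in angstroms) for a satisfied cross-link.
# This is an illustrative value, not one taken from the abstract.
DEFAULT_MAX_DIST = 30.0


def crosslink_satisfaction(coords, crosslinks, max_dist=DEFAULT_MAX_DIST):
    """Return the fraction of cross-links whose residue pair lies within max_dist.

    coords:     dict mapping (chain_id, residue_number) -> (x, y, z)
    crosslinks: list of ((chain_id, residue_number), (chain_id, residue_number))
    """
    if not crosslinks:
        return 0.0
    satisfied = 0
    for res_a, res_b in crosslinks:
        # Euclidean distance between the two representative atoms.
        if math.dist(coords[res_a], coords[res_b]) <= max_dist:
            satisfied += 1
    return satisfied / len(crosslinks)


# Toy assembly with three residues on three chains (made-up coordinates).
coords = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("B", 25): (12.0, 5.0, 0.0),
    ("C", 7): (40.0, 0.0, 0.0),
}
crosslinks = [
    (("A", 10), ("B", 25)),
    (("A", 10), ("C", 7)),
]
score = crosslink_satisfaction(coords, crosslinks)
```

A score like this could be used to rank alternative assemblies of the same subunits, with a higher satisfied fraction indicating better agreement with the cross-linking data.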
