Deep generative modeling of temperature-dependent structural ensembles of proteins.
Giacomo Janson, Alexander Jussupow, Michael Feig
Abstract
Deep learning has revolutionized protein structure prediction, but capturing conformational ensembles remains a challenge. Molecular dynamics (MD) simulations can describe biomolecular dynamics, but they are computationally expensive. Alternatively, deep learning models trained on MD can generate structural ensembles at reduced cost. We present aSAM (atomistic structural autoencoder model), a latent diffusion model trained on MD to generate heavy-atom protein ensembles. aSAM models atoms in a latent space to accurately sample side chain and backbone torsion angle distributions. Additionally, the aSAMt version generates ensembles conditioned on temperature. Trained on mdCATH, aSAMt captures temperature-dependent ensemble properties and demonstrates generalization beyond training temperatures. By comparing aSAMt ensembles to long MD simulations of fast-folding proteins, we find that high-temperature training enhances the ability of generators to explore energy landscapes. aSAMt can also capture experimentally observed thermal behavior of proteins. This work generalizes deep learning ensemble generation towards the inclusion of environmental conditions.