Decoding protein binding plasticity via integrated deep ribosome display and deep learning.
Mengtong Tang, Jiawei Li, Zhixi Li, Jingsong Cui, Hao Qi
Abstract
Plasticity in protein interactions is central to understanding biological networks and to de novo protein design. However, its systematic exploration remains impeded by the astronomical dimensionality of sequence space. Here, we present a platform that combines deep experimental screening with deep learning to decode interaction plasticity. By developing a ribosome display system stripped of all known ribosome termination and rescue functions, we produce a comprehensive dataset of 47.8 million unique peptides spanning a broad spectrum of Streptactin-binding activity. A deep learning architecture trained on sequence context, enrichment dynamics, and subsequence abundance achieves high accuracy (Pearson's r = 0.902) in predicting Streptactin-binding activity. Through sequence dimensionality reduction, exhaustive subsequence elucidation, and enriched motif elicitation, we identify 799 strong-binding sequences containing a canonical motif and 219 sequences harboring novel motifs with divergent docking conformations. These findings reveal an unanticipated depth and breadth in protein-binding plasticity. We propose that this integrated experimental-AI framework will facilitate the systematic exploration of protein interactions and enable the data-driven design of synthetic peptides.
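As an illustration of two quantities named above, the following minimal Python sketch computes a per-peptide enrichment score between two selection rounds and the Pearson correlation between predicted and measured scores. The abstract does not specify how enrichment is calculated, so the log2 frequency ratio with a pseudocount is an assumption, and the function names `log2_enrichment` and `pearson_r` are hypothetical. The sketch is not the authors' pipeline.

```python
import math


def log2_enrichment(count_before, count_after, total_before, total_after,
                    pseudocount=1.0):
    """Log2 ratio of a peptide's read frequency after vs. before selection.

    Assumed definition (not taken from the paper): the pseudocount is added
    to counts and totals so that unseen peptides do not produce log(0).
    """
    freq_before = (count_before + pseudocount) / (total_before + pseudocount)
    freq_after = (count_after + pseudocount) / (total_after + pseudocount)
    return math.log2(freq_after / freq_before)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In this sketch, a model's predictions would be compared against measured enrichment scores by calling `pearson_r(predicted, measured)`; a value near 1 indicates that the predictions track the measurements linearly.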