Three-Segment Protein Labeling Using a Highly Efficient and Cysteine-Less Split Intein Identified with Computational Prediction of Aggregation Properties.
Christoph Humberg, Jonas Kröger, Shmuel Pietrokovski, Henning D Mootz
Abstract
Split inteins are versatile tools in protein engineering. We envisaged a new tandem protein trans-splicing (PTS) scheme to assemble a protein from three segments, each of which can be treated individually with regard to its cysteine oxidation or chemical labeling status. However, only a single highly efficient cysteine-less split intein has been reported so far. Split intein activities are currently not predictable and require time-consuming biochemical characterizations. We aimed to accelerate the discovery of novel split inteins with high splicing efficiency by computational sequence analysis. Inspired by our previous finding, which linked the reduced splicing efficiency of characterized split intein fragments to soluble, β-sheet-rich amyloid-like aggregates, we confirmed the inverse correlation between predicted aggregation propensity and splicing efficiency for new intein candidates by size-exclusion chromatography and biochemical analysis. The LCGC14 intein emerged as a second available cysteine-less split intein with virtually quantitative splicing efficiency, significantly expanding protein engineering opportunities independent of thiol chemistries or under oxidizing conditions. We exploit its orthogonality to the cysteine-less CLm intein to assemble proteins from three selectively labeled segments, as demonstrated for a trimodular non-ribosomal peptide synthetase (NRPS). Predicting split intein efficiency from sequence is a significant advance toward streamlining future discovery processes.