
Scientists at the University of Washington Develop a Deep Learning Method for Protein Sequence Design that Incorporates the Complete Non-Protein Atomic Context

Researchers at the University of Washington have developed a deep learning technique for protein sequence design, aimed in particular at enzymes and at small-molecule binders and sensors. The method, known as LigandMPNN, is designed to address a shortcoming of existing methods such as Rosetta and ProteinMPNN, which struggle to model non-protein atoms and molecules. Modeling them is a key requirement for precisely designing protein sequences that interact with small molecules, nucleotides, and metals.

Current methods often overlook non-protein atoms and molecules, which are crucial for designing enzymes, protein-DNA/RNA interactions, and protein-small molecule and protein-metal binders. LigandMPNN addresses this by extending the ProteinMPNN architecture to include the full non-protein atomic context. It uses protein-ligand graphs to model interactions and encodes ligand atom geometries with neural networks, producing sequences and side-chain conformations specific to the non-protein interactions.

LigandMPNN uses a graph-based approach. As in ProteinMPNN, protein residues are treated as nodes, with edges drawn to each residue's nearest neighbors based on Cα-Cα distances. In the protein-ligand graphs that LigandMPNN builds, ligand atoms serve as nodes alongside the protein residues, so the graph captures protein-ligand interactions and the geometric relationships between them.
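To make the graph construction concrete, here is a minimal sketch in Python of nearest-neighbor edges based on Cα-Cα distances, plus a per-residue list of nearby ligand atoms. It is an illustration under stated assumptions, not the LigandMPNN implementation: the function names `knn_edges` and `ligand_context`, the neighbor counts, and the coordinates are all hypothetical, and ligand atoms are picked here by distance to the Cα atom purely for simplicity, since the article does not say how they are selected.

```python
import math

def knn_edges(ca_coords, k):
    """Return directed edges (i, j) from each residue i to its k nearest
    residues j, ranked by C-alpha to C-alpha Euclidean distance."""
    edges = []
    for i, ci in enumerate(ca_coords):
        # Distance from residue i to every other residue, paired with its index.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(ca_coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges

def ligand_context(ca_coords, ligand_coords, m):
    """For each residue, return the indices of its m closest ligand atoms.
    Assumption: distance is measured from the C-alpha atom."""
    context = []
    for ci in ca_coords:
        dists = sorted(
            (math.dist(ci, la), a) for a, la in enumerate(ligand_coords)
        )
        context.append([a for _, a in dists[:m]])
    return context

# Hypothetical toy structure: four residues on a line, two ligand atoms.
ca = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0)]
ligand = [(0, 3, 0), (12, 3, 0)]

residue_edges = knn_edges(ca, k=2)               # residue-residue edges
nearby_atoms = ligand_context(ca, ligand, m=1)   # residue-ligand links
```

In a real model each node and edge would also carry learned geometric features. The sketch only shows the connectivity.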

In testing, LigandMPNN outperformed previously used methods: sequence recovery for residues interacting with small molecules, nucleotides, and metals improved by roughly 20-30%. LigandMPNN was also far more efficient, running roughly 250 times faster than Rosetta.
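Sequence recovery is commonly computed as the fraction of positions at which the designed residue matches the native one. The article does not spell out the evaluation protocol, so the following is only a sketch of that standard definition. The function name `sequence_recovery`, the sequences, and the set of ligand-contacting positions are hypothetical.

```python
def sequence_recovery(native, designed, positions=None):
    """Fraction of the selected positions where the designed residue
    equals the native residue. If positions is None, use every position."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if positions is None:
        positions = range(len(native))
    positions = list(positions)
    if not positions:
        raise ValueError("no positions selected")
    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)

# Hypothetical example: positions 2, 3 and 5 are assumed to contact a ligand.
native_seq = "MKTAYIAK"
designed_seq = "MKSAYLAK"

overall = sequence_recovery(native_seq, designed_seq)
near_ligand = sequence_recovery(native_seq, designed_seq, positions=[2, 3, 5])
```

Restricting `positions` to the residues near a ligand mirrors the comparison described above, where recovery is reported for residues that interact with small molecules, nucleotides, or metals.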

In conclusion, LigandMPNN fills a gap in protein sequence design methodology by explicitly including non-protein atoms and molecules. This graph-based approach has been shown to improve performance significantly, achieving higher sequence recovery and more accurate side-chain packing. The method also shows great potential for designing small-molecule-binding and DNA-binding proteins with high affinity and specificity, promising valuable contributions to protein engineering. Credit for this research goes to the team at the University of Washington, and additional details can be found in the published research paper.
