Researchers from Exscientia and the University of Oxford have developed ABodyBuilder3, an improved model for predicting antibody structures. Accurate structure prediction supports the design of monoclonal antibodies, which play a central role in immune responses and in therapeutic applications. The new model improves on its predecessor, ABodyBuilder2, by predicting the complementarity-determining region (CDR) loops more accurately. These loops are crucial to an antibody’s ability to bind antigens.
ABodyBuilder3 uses language model embeddings to improve its predictions and introduces refined relaxation techniques for better structure estimation. It also replaces the ensemble-based confidence approach used previously with predicted per-residue lDDT-Cα scores for uncertainty estimation. This change significantly reduces computational cost while improving the correlation between the model’s predicted uncertainty and the root-mean-square deviation (RMSD) of its structures.
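As a rough illustration of the quantity behind these confidence scores, the sketch below computes a per-residue lDDT over Cα atoms from a predicted and a reference structure. It follows the commonly used lDDT definition (a 15 Å inclusion radius and tolerance thresholds of 0.5, 1, 2, and 4 Å), which is an assumption here; the function name `lddt_ca` is hypothetical and this is not ABodyBuilder3’s own code.

```python
import numpy as np


def lddt_ca(pred, ref, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue lDDT over C-alpha atoms (illustrative sketch).

    pred, ref: arrays of shape (n_residues, 3), coordinates in Angstroms.
    Returns an array of shape (n_residues,) with values in [0, 1].
    The cutoff and thresholds are the commonly used defaults, assumed here.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Pairwise C-alpha distance matrices for both structures.
    d_pred = np.linalg.norm(pred[:, None, :] - pred[None, :, :], axis=-1)
    d_ref = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    # Score only pairs that are close in the reference, excluding self-pairs.
    mask = (d_ref < cutoff) & ~np.eye(len(ref), dtype=bool)
    diff = np.abs(d_pred - d_ref)
    # For each pair, the fraction of tolerance thresholds it satisfies.
    preserved = np.mean([diff < t for t in thresholds], axis=0)
    counts = mask.sum(axis=1)
    totals = (preserved * mask).sum(axis=1)
    # Residues with no neighbours inside the cutoff get a score of 0.
    return np.divide(totals, counts, out=np.zeros(len(ref)), where=counts > 0)
```

Because the score compares internal distances only, it needs no structural superposition. A model that outputs a predicted version of this score for each residue can report confidence from a single forward pass.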
The updates span data curation, sequence representation, and structure refinement. The model can also assess many therapeutic antibody candidates efficiently, which makes it a scalable tool for the medical field.
The new model was developed using vectorization and optimizations from OpenFold, which made it more than three times faster while keeping memory usage efficient. The training data was taken from the Structural Antibody Database (SAbDab), with outliers, ultra-long CDRH3 loops, and low-resolution structures filtered out. Using large validation and test sets focused on human antibodies improved the model’s robustness.
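The filtering step can be pictured with a minimal sketch like the one below. The record fields, the function name, and both threshold values are illustrative assumptions rather than details taken from this article or from the SAbDab API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StructureRecord:
    """Hypothetical summary of one antibody structure entry."""
    pdb_id: str
    resolution: float      # Angstroms; a larger value means lower resolution
    cdrh3_length: int      # number of residues in the CDRH3 loop
    is_outlier: bool = False


# Illustrative thresholds only; treat both values as assumptions.
MAX_RESOLUTION = 3.5
MAX_CDRH3_LENGTH = 30


def filter_training_set(
    records: Iterable[StructureRecord],
    max_resolution: float = MAX_RESOLUTION,
    max_cdrh3_length: int = MAX_CDRH3_LENGTH,
) -> List[StructureRecord]:
    """Drop outliers, low-resolution structures, and ultra-long CDRH3 loops."""
    return [
        r
        for r in records
        if not r.is_outlier
        and r.resolution <= max_resolution
        and r.cdrh3_length <= max_cdrh3_length
    ]
```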
Additionally, ProtT5 language model embeddings replaced the one-hot encoding used in ABodyBuilder2, improving the model’s stability. The researchers generated separate ProtT5 embeddings for the heavy and light chains and combined them to represent the full variable region, which yielded more refined structure models, particularly for the CDRH3 and CDRL3 loops.
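The idea of embedding each chain separately and then combining the results can be sketched as follows. The `placeholder_embed` function is a stand-in that returns random vectors, not a call to ProtT5; the embedding width of 1024 and the choice to join the chains along the residue axis are assumptions made for illustration.

```python
import numpy as np

EMBED_DIM = 1024  # assumed per-residue embedding width


def placeholder_embed(sequence, dim=EMBED_DIM):
    """Stand-in for a protein language model: one vector per residue.

    A real pipeline would run the sequence through the language model's
    encoder here; random values are used only to keep the sketch runnable.
    """
    rng = np.random.default_rng(len(sequence))
    return rng.standard_normal((len(sequence), dim))


def embed_variable_region(heavy, light, embed=placeholder_embed):
    """Embed heavy and light chains separately, then join them.

    Returns an array of shape (len(heavy) + len(light), dim), with the
    heavy-chain residues first (an assumed convention).
    """
    heavy_emb = embed(heavy)
    light_emb = embed(light)
    return np.concatenate([heavy_emb, light_emb], axis=0)
```

Embedding the chains independently means each chain’s representation does not depend on which partner chain it is paired with; the structure model then sees both sets of per-residue vectors together.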
ABodyBuilder3 predicts the structures of the CDR loops accurately. High confidence scores from the model correlate strongly with low RMSD in critical regions such as CDRH3, so the scores offer a useful estimate of uncertainty.
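One way to check this kind of relationship is to compute the RMSD over a chosen region for each predicted structure and correlate it with the model’s confidence scores. The sketch below assumes the predicted and reference coordinates are already superposed and use one atom per residue; the function names are hypothetical.

```python
import numpy as np


def region_rmsd(pred, ref, residue_idx):
    """RMSD in Angstroms over selected residues.

    pred, ref: arrays of shape (n_residues, 3), assumed already superposed.
    residue_idx: indices of the residues in the region, e.g. a CDRH3 loop.
    """
    pred = np.asarray(pred, dtype=float)[residue_idx]
    ref = np.asarray(ref, dtype=float)[residue_idx]
    sq_dist = ((pred - ref) ** 2).sum(axis=1)
    return float(np.sqrt(sq_dist.mean()))


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    return float(np.corrcoef(x, y)[0, 1])
```

Given one mean confidence score and one region RMSD per structure, a strongly negative `pearson_r` between the two lists would indicate that higher confidence goes with lower error.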
These improvements make ABodyBuilder3 more scalable and accurate than its predecessors, largely because of the language model embeddings. Its computational efficiency and predictive accuracy could benefit the development of therapeutic antibodies. Future work could include self-distillation, pre-training on synthetic datasets, and combining ensemble approaches with predicted lDDT (pLDDT) scores for better results.