Scientists at the Massachusetts Institute of Technology (MIT) have developed a computational model aimed at simplifying protein engineering. Protein engineering typically starts from a natural protein with a desirable trait, such as the ability to emit fluorescent light, and uses rounds of random mutation to arrive at better versions of that protein. The researchers tested their technique on green fluorescent protein (GFP) and on a protein from adeno-associated virus (AAV), which is used to deliver DNA for gene therapy. The model showed promising results on these well-characterised data sets, and the scientists plan to apply it next to data on voltage indicator proteins.
The model is designed to predict superior versions of a protein from a limited amount of data. Protein design has traditionally been challenging because of the complex relationship between the DNA sequence that encodes a protein and that protein’s structure and function. The new model aims to simplify the process by indicating a clearer path to an improved protein. Using it, the scientists generated proteins carrying mutations predicted to yield better versions of GFP and of the AAV protein.
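The idea of using a predictive model to choose mutations, rather than relying on random mutation alone, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_predictor` is a made-up stand-in for a trained model, the five-letter sequences are invented, and scoring every single-point mutant is only one simple way to use such a predictor, not necessarily what the MIT team did.

```python
# The 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, original in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield seq[:i] + aa + seq[i + 1:]


def toy_predictor(seq):
    """Hypothetical stand-in for a trained fitness model.

    It simply counts positions that match a made-up target sequence.
    """
    target = "MKTAY"
    return sum(a == b for a, b in zip(seq, target))


def propose_best_mutant(seq, predictor):
    """Return the single-point mutant with the highest predicted fitness."""
    return max(single_mutants(seq), key=predictor)


start = "MATAY"
best = propose_best_mutant(start, toy_predictor)
print(start, toy_predictor(start), "->", best, toy_predictor(best))
```

Repeating the proposal step from each new best sequence would give a simple greedy walk towards higher predicted fitness. How well such a walk works depends on how reliable the predictor is away from the measured data.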
The team trained a convolutional neural network (CNN) on experimental data consisting of GFP sequences and their brightness, the trait the researchers were trying to improve. From this data the model could construct a “fitness landscape”, a map from each sequence to its predicted fitness, but accurately predicting the path a protein should take towards higher fitness can still be difficult. To tackle this, the researchers used a computational technique to “smooth” the fitness landscape, which allowed the model to predict improved GFP sequences.
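To make the idea of smoothing more concrete, the sketch below blends each sequence's measured fitness with the average fitness of the sequences that are one mutation away from it. This neighbour averaging is an assumption chosen for simplicity, since the article does not say which smoothing technique the researchers used. The sequences, fitness values, and the `weight` and `rounds` parameters are all invented for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def smooth_fitness(fitness, rounds=1, weight=0.5):
    """Blend each sequence's fitness with the mean of its one-mutation neighbours.

    fitness maps sequence -> measured value. weight sets how strongly the
    neighbours pull on each value, and rounds repeats the blending step.
    """
    seqs = list(fitness)
    neighbours = {s: [t for t in seqs if hamming(s, t) == 1] for s in seqs}
    current = dict(fitness)
    for _ in range(rounds):
        updated = {}
        for s in seqs:
            if neighbours[s]:
                mean_nb = sum(current[t] for t in neighbours[s]) / len(neighbours[s])
                updated[s] = (1 - weight) * current[s] + weight * mean_nb
            else:
                # A sequence with no measured neighbours keeps its own value.
                updated[s] = current[s]
        current = updated
    return current


# Made-up measurements with one sharp spike at "AAC".
measured = {"AAA": 1.0, "AAC": 5.0, "ACC": 1.2, "CCC": 1.4}
smoothed = smooth_fitness(measured)
for seq in measured:
    print(seq, measured[seq], "->", round(smoothed[seq], 2))
```

In this toy data the isolated spike at `AAC` is pulled down and its neighbours are pulled up, so the landscape becomes less jagged. A model trained on the smoothed values sees a gentler slope to follow than one trained on the raw, noisy measurements.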
The method was also applied successfully to the viral capsid of AAV, optimising the capsid’s ability to package a DNA payload. The scientists now plan to apply their computational technique to data on voltage indicator proteins, with the aim of predicting improved versions of those proteins.
Ultimately, the researchers aim to simplify protein engineering by making it easier to move from an initial, naturally occurring protein to an enhanced version through predicted mutations. The method could be used by other laboratories to improve proteins for medical treatments and to support neuroscience research. The model builds on previous efforts to engineer GFP to produce a stronger fluorescent signal more efficiently, and it offers an improved ability to optimise this trait.