MIT researchers have developed a computational approach that uses limited data to predict which mutations would improve a protein’s performance. The researchers used their model to create optimized versions of two naturally occurring proteins. One was green fluorescent protein (GFP), a molecule used to track cellular processes within the body. The other was a protein from adeno-associated virus (AAV), a virus used to deliver DNA for gene therapy.

The research team, led by Regina Barzilay from MIT’s School of Engineering and Ila Fiete from the Department of Brain and Cognitive Sciences, also plans to apply the technique to proteins used to monitor neuron activity. This could yield better-engineered alternatives to the proteins currently in use and advance the understanding and treatment of neural disorders.

Protein design is difficult because the mapping from DNA sequence to protein structure and function is complex. Reaching a desired protein may require a series of mutations, any one of which could render the protein nonfunctional. The researchers compare the process to a trek through mountainous terrain, with many peaks blocking the way to the destination. The new computational technique aims to make that journey considerably easier.

The number of possible protein sequences is enormous, because each position in a sequence can hold any of 20 amino acids. Testing them all experimentally is practically impossible, so computational modeling plays a crucial role in identifying the most promising variants.
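To give a sense of scale, the short Python sketch below counts how many sequences exist for a protein of a given length, and how many lie within one or two substitutions of a starting sequence. The length of 238 is an illustrative assumption (roughly the size of GFP), and the function names are hypothetical; the counting itself is plain combinatorics and is not taken from the researchers' work.

```python
from math import comb

AMINO_ACIDS = 20  # size of the standard amino-acid alphabet


def total_sequences(length):
    """Number of distinct sequences of the given length."""
    return AMINO_ACIDS ** length


def variants_within(length, max_mutations):
    """Number of sequences that differ from a fixed parent sequence
    at no more than `max_mutations` positions (parent included)."""
    return sum(
        comb(length, k) * (AMINO_ACIDS - 1) ** k
        for k in range(max_mutations + 1)
    )


length = 238  # assumed, illustrative protein length
print(variants_within(length, 1))  # 4523
print(variants_within(length, 2))  # 10185806
print(len(str(total_sequences(length))))  # decimal digits in the full count
```

Even two substitutions away from the parent there are already about ten million candidates, which is why exhaustive laboratory testing is not feasible.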

The team trained a convolutional neural network (CNN) on experimental GFP data to build a fitness landscape: a three-dimensional map that shows how fit a given protein is and how far it has diverged from the original protein. Peaks in the landscape represent fitter proteins, and valleys represent less fit ones.

Predicting the route a protein must follow to reach a fitness peak is challenging, because a protein may have to accept mutations that make it less fit before it can climb higher. To overcome this hurdle, the researchers used a computational technique to “smooth” the fitness landscape, which allowed the CNN model to reach higher fitness peaks more easily.
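The effect of smoothing can be illustrated with a toy example. The Python sketch below uses a made-up one-dimensional landscape and a simple moving average as a stand-in smoother; this is an assumption chosen for illustration, not the researchers' actual smoothing method, and the names `smooth` and `hill_climb` are hypothetical. A greedy climber stalls on the first small peak of the rugged landscape, but reaches the highest point when it follows the smoothed version.

```python
def smooth(fitness, radius):
    """Replace each value with the mean of the values within `radius` positions."""
    smoothed = []
    for i in range(len(fitness)):
        window = fitness[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def hill_climb(landscape, start):
    """Greedy ascent: move to the better adjacent position until none improves."""
    position = start
    while True:
        neighbors = [
            p for p in (position - 1, position + 1)
            if 0 <= p < len(landscape)
        ]
        best = max(neighbors, key=lambda p: landscape[p])
        if landscape[best] <= landscape[position]:
            return position  # no neighbor is strictly better: a (local) peak
        position = best


# Made-up rugged landscape: an overall upward trend with a dip after every step.
rugged = [1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8]

stuck_at = hill_climb(rugged, start=0)
reached = hill_climb(smooth(rugged, radius=2), start=0)

print(stuck_at, rugged[stuck_at])  # 1 3
print(reached, rugged[reached])    # 11 8
```

On the raw landscape the climber stops at position 1 because both neighbors are lower. After averaging, the small dips disappear and the underlying upward trend becomes visible, so the same greedy rule carries the climber to position 11, where the original fitness is highest.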

The researchers tested the technique on GFP and AAV. For GFP, they optimized the protein’s brightness, the feature they set out to enhance. For AAV, they optimized the viral capsid for greater DNA packaging capacity.

The team now plans to apply the technique to voltage indicator proteins. Numerous labs have tried to improve these proteins over the past twenty years without success. With the new computational technique, the researchers hope to achieve more substantial improvements from a dataset much smaller than those used previously.

Funding for the research was provided by various organizations including the U.S. National Science Foundation, the Abdul Latif Jameel Clinic for Machine Learning in Health, the DARPA Accelerated Molecular Discovery Program, and the National Institutes of Health.
