
ESM3 by EvolutionaryScale: a generative model for biological research

AI startup EvolutionaryScale has launched ESM3, a generative language model aimed at “programming biology”. With 98 billion parameters in its largest version, ESM3 uses artificial intelligence (AI) to generate and prototype new proteins. The company’s main focus is proteomics: the study of the function, composition, structure, interactions, and cellular activities of the proteins present in a biological system.

Protein synthesis is a crucial biological process in which a ribosome reads messenger RNA (mRNA) and assembles a specific protein according to the genetic code. Nearly all forms of life share this code, which maps three-nucleotide codons to 20 standard amino acids. ESM3 is designed to read and model the resulting protein sequences, enabling the generation of proteins on demand.
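The translation step described above can be sketched in a few lines of Python. This is only an illustration of how the standard genetic code maps codons to amino acids; it is not part of ESM3, and the names `CODON_TABLE` and `translate` are made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# Each character is a one-letter amino acid code; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry lookup table: "UUU" -> "F", "UUC" -> "F", ..., "GGG" -> "G".
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string, read in frame from its first base,
    into a protein sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))  # prints "MAW"
```

The table has 64 codons but only 20 distinct amino acids, which is why several codons map to the same letter.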

The language model is trained on billions of naturally occurring proteins. The most significant challenge in building it was tokenizing a protein’s three-dimensional structure and its function. Scientists overcame this hurdle by developing a method to represent structure and function, like the amino acid sequence itself, as sequences of symbols drawn from a discrete alphabet.
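To make the idea of tokenization concrete, here is a minimal sketch of how an amino acid sequence can be turned into integer tokens for a language model. It is an assumption-laden toy: the vocabulary, the special tokens, and the `encode`/`decode` helpers are hypothetical and are not ESM3's actual tokenizer, and the sketch covers only the sequence, not the structure or function encodings the article mentions.

```python
# Hypothetical vocabulary: two special tokens plus the 20 standard amino acids.
# (Illustrative only; not the vocabulary used by ESM3.)
SPECIAL_TOKENS = ["<pad>", "<mask>"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = SPECIAL_TOKENS + list(AMINO_ACIDS)
TOKEN_TO_ID = {token: index for index, token in enumerate(VOCAB)}


def encode(sequence: str) -> list[int]:
    """Map each one-letter amino acid code to its integer token ID."""
    return [TOKEN_TO_ID[residue] for residue in sequence]


def decode(token_ids: list[int]) -> str:
    """Map token IDs back to amino acid letters, skipping special tokens."""
    return "".join(
        VOCAB[i] for i in token_ids if VOCAB[i] not in SPECIAL_TOKENS
    )


ids = encode("MAW")
print(ids)
print(decode(ids))  # prints "MAW"
```

Once every aspect of a protein is expressed as tokens like these, a generative model can be trained to fill in masked positions, which is the general mechanism behind generating new sequences.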

In a demonstration of ESM3’s capabilities, EvolutionaryScale used it to generate a novel green fluorescent protein (GFP), a type of protein that gives several marine organisms, such as corals and jellyfish, their characteristic glow. Such proteins are very rare in nature. The company estimates that the novel protein, called esmGFP, represents the equivalent of over 500 million years of natural evolution.

EvolutionaryScale is making the ESM3 model openly available in the hope that it will help scientists explore new areas of protein design and synthetic biology. This could lead to solutions for significant global problems.

Despite the benefits, the company recognizes the potential risks that come with the dual-use and open-source nature of ESM3, and it plans to mitigate them through a Responsible Development Framework. Tools for programming biology, whether AI-driven like ESM3 and AlphaFold or laboratory techniques like CRISPR gene editing, could enable the development of proteins that absorb carbon or degrade plastic, and could even lead to new medicines. These advances could help address diseases and environmental problems that have challenged scientists for decades.
