MIT engineering students Irene Terpstra ’23 and Rujul Gandhi ’22 are collaborating with the MIT-IBM Watson AI Lab to advance artificial intelligence (AI) systems using natural language processing (NLP), taking advantage of the vast amount of natural language data available. Terpstra is applying AI algorithms, particularly large language models, to computer chip design. Gandhi’s research, meanwhile, focuses on translating natural language instructions into machine-friendly representations and on improving speech recognition for low-resource languages.
Terpstra and her mentors are combining large pre-trained language models, including ChatGPT, Llama 2, and Bard, with NGspice, an open-source circuit simulator, and a reinforcement learning algorithm to optimize chip design. The project involves developing an AI system that iterates on designs in response to textual prompts describing the desired chip modifications; the system’s output can then guide adjustments to the physical chip until it meets the target parameters. The ultimate goal is an AI system that can design chips independently, merging the logical reasoning of language models with the optimization capability of reinforcement learning algorithms.
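To make that loop concrete, here is a minimal sketch of how a propose-simulate-score cycle might look. Everything here is an assumption rather than the team’s actual implementation: the `propose` function stands in for a call to an LLM such as Llama 2, `amplifier.cir` is a hypothetical baseline netlist, and the gain-parsing regex assumes the netlist contains a `.meas` statement that prints a line such as `gain = 4.0e+01`.

```python
import os
import re
import subprocess
import tempfile

def simulate(netlist: str) -> str:
    """Run ngspice in batch mode (-b) on a SPICE netlist and return its stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".cir", delete=False) as f:
        f.write(netlist)
        path = f.name
    try:
        result = subprocess.run(["ngspice", "-b", path],
                                capture_output=True, text=True, timeout=60)
        return result.stdout
    finally:
        os.unlink(path)

def gain_from(output: str) -> float:
    """Extract a measured gain from the simulator output (assumes the netlist's
    .meas statement prints a line such as 'gain = 4.0e+01')."""
    match = re.search(r"gain\s*=\s*([-+0-9.eE]+)", output)
    return float(match.group(1)) if match else float("-inf")

def propose(netlist: str, goal: str) -> str:
    """Placeholder for the LLM step: a model such as Llama 2 would rewrite
    the netlist here to satisfy the textual design goal."""
    raise NotImplementedError("plug in an LLM call here")

# Hypothetical propose-simulate-score loop: keep the best-scoring design.
netlist = open("amplifier.cir").read()  # hypothetical baseline design
best_score = float("-inf")
for step in range(20):
    candidate = propose(netlist, goal="increase gain to 40 dB")
    score = -abs(gain_from(simulate(candidate)) - 40.0)  # closer to target is better
    if score > best_score:
        netlist, best_score = candidate, score
```

In a full reinforcement learning setup, the score would serve as the reward signal shaping how the proposal policy evolves over many iterations, rather than simply filtering candidates as this sketch does.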
Gandhi, for her part, is focused on making communication between humans and AI smoother and more effective. She is building a parser that translates natural language into a representation machines can understand. The parser uses T5, a pre-trained encoder-decoder model, to identify atomic propositions, the smallest logical units embedded in an instruction. The model breaks instructions into sub-tasks and matches them against the actions and objects available in the AI’s environment; if a sub-task cannot be executed, the system can request assistance from the user. The parser also handles logical dependencies, such as carrying out a specific task until a certain event occurs.
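As an illustration of the parsing step, the sketch below uses the Hugging Face implementation of T5 to split an instruction into sub-tasks. The `parse:` prefix, the semicolon-separated output format, and the stock `t5-small` checkpoint are all assumptions of this sketch; the actual parser would rely on a model fine-tuned for this task, which the base weights are not.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# The stock checkpoint loads and runs, but producing useful parses would
# require fine-tuning on instruction/proposition pairs (assumed here).
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def parse_instruction(text: str) -> list[str]:
    """Map a natural-language instruction to a list of atomic sub-tasks,
    assuming the model emits them separated by semicolons."""
    inputs = tokenizer("parse: " + text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return [part.strip() for part in decoded.split(";") if part.strip()]

for subtask in parse_instruction("Ring the bell until the door opens, then wait"):
    # Each sub-task would next be matched against the actions and objects
    # available in the agent's environment; unmatched sub-tasks would
    # trigger a request for help from the user.
    print(subtask)
```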
Gandhi is also working on improving speech models for low-resource languages. For languages with limited or no transcriptions, the model learns to recognize repeated sound sequences and treats them as individual words or concepts. These inferred words are collected into a pseudo-vocabulary, which can then be used to label data for downstream applications.
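The pseudo-vocabulary idea can be illustrated with a toy sketch. It assumes the audio has already been discretized into sequences of acoustic unit IDs (for example, by clustering self-supervised speech features); recurring unit n-grams are then promoted to pseudo-words.

```python
from collections import Counter

# Toy input: utterances already discretized into acoustic unit IDs
# (the discretization step is outside this sketch).
utterances = [
    [3, 7, 7, 1, 9, 4, 2, 3, 7, 7, 1],
    [9, 4, 2, 5, 3, 7, 7, 1, 8],
    [5, 9, 4, 2, 3, 7, 7, 1],
]

def repeated_ngrams(seqs, n, min_count=2):
    """Count unit n-grams across utterances; recurring ones are
    candidate pseudo-words."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}

# Assign each recurring sound sequence a pseudo-word ID, longest first.
# The resulting pseudo-vocabulary can label data lacking transcriptions.
pseudo_vocab = {}
for n in (4, 3):
    for gram in repeated_ngrams(utterances, n):
        pseudo_vocab[gram] = f"word_{len(pseudo_vocab)}"

print(pseudo_vocab)
# e.g. {(3, 7, 7, 1): 'word_0', (9, 4, 2): 'word_1', ...}
```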
The pair believe the advances emerging from their research could serve a wide range of applications: smoother communication with AI systems and software, better voice assistants, more effective tools for native languages and dialects, and potential aids for translation and interpretation.