MIT's Shaping the Future of Work Initiative officially launched on January 22. Co-directed by MIT Professors Daron Acemoglu, David Autor, and Simon Johnson, the initiative seeks to examine the factors negatively impacting job quality and employment opportunities for workers without a four-year college degree. The goal is to propose novel solutions that set the…
Dr. Benjamin Warf, a renowned neurosurgeon from Boston Children's Hospital, has been virtually present in Brazil, aiding and mentoring residents as they perform delicate surgery on a model of a baby's brain. This has been made possible through a digital avatar of Dr. Warf, developed by the medical simulator and augmented reality (AR) company, EDUCSIM,…
Tamara Broderick, a former participant in the Women's Technology Program at MIT, found her path back to the institution years later, this time as a faculty member. Broderick, a tenured associate professor in the Department of Electrical Engineering and Computer Science (EECS), works in the field of Bayesian inference, a statistical approach to measure…
As traditional semiconductor technologies approach their physical limits, the demand for computing power continues to rise, largely driven by the rapid expansion of artificial intelligence (AI). Addressing this conundrum, Lightmatter, a company founded by three MIT alumni, has developed pioneering computing technology that harnesses the properties of light to bolster data processing and transport.
Lightmatter’s…
Generative AI is being recognized for its capacity to produce both text and images. Applying generative AI to produce realistic synthetic data about various scenarios can help businesses improve services, reroute planes, or upgrade software platforms, especially in cases where real-world data is scarce or sensitive.
For the past three years, a…
The concept of "Interactive Fleet Learning" (IFL) captures a significant development in the field of robotics and artificial intelligence. Large groups of robots, or fleets, have emerged from laboratories to perform practical tasks in real-world settings. Examples include Waymo's fleet of over 700 self-driving cars operating in several cities and the industrial application…
Recent advances in text-to-image generation using diffusion models have produced impressive results with high-quality, realistic images. Yet, despite these successes, diffusion models, including Stable Diffusion, often struggle to follow prompts correctly, particularly when spatial or common-sense reasoning is needed. These shortcomings become evident in four key scenarios: negation, numeracy, attribute assignment, and spatial relationships…
CoarsenConf is a novel architecture for molecular conformer generation, a critical task in computational chemistry where the goal is to predict stable, low-energy 3D molecular structures, or conformers, from a given 2D molecule. This process is crucial for applications such as drug discovery and protein docking, which depend on accurate spatial and…
Deep learning's remarkable success can be partially attributed to its ability to extract useful representations of complex data, a process often achieved via Self-Supervised Learning (SSL). However, the core process by which SSL algorithms achieve this has largely remained a mystery. A recent paper to appear at ICML 2023 provides the first comprehensive mathematical model…
The paper "Training Diffusion Models with Reinforcement Learning" presents a technique for training diffusion models, which are known for generating high-dimensional outputs, with reinforcement learning (RL). The paper's key idea is to optimize diffusion models directly for particular objectives rather than broadly matching the training data. A significant point is to improve the model's performance on atypical…
The development of more powerful and complex virtual assistants, like GPT-4, Claude-2, Bard, and Bing Chat, is facilitated by the use of Reinforcement Learning with Human Feedback (RLHF). However, despite their achievements, certain issues arise within the RLHF process. The reward learning stage relies on human preference data in the form of comparisons to train…
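The reward learning stage mentioned above typically fits a reward model to pairwise human comparisons. A minimal sketch of the standard Bradley-Terry-style objective commonly used for this (the function name and inputs are illustrative, not taken from any specific system):

```python
import math

def preference_loss(reward_preferred, reward_rejected):
    """Negative log-likelihood that the human-preferred response wins,
    under a Bradley-Terry model of pairwise comparisons:
    P(preferred > rejected) = sigmoid(r_preferred - r_rejected)."""
    diff = reward_preferred - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# The loss shrinks as the reward model separates the pair more confidently.
confident = preference_loss(2.0, -1.0)   # large margin, small loss
ambiguous = preference_loss(0.0, 0.0)    # 50/50 guess, loss = ln 2
assert confident < ambiguous
```

Minimizing this loss over a dataset of comparisons pushes the scalar reward of preferred responses above rejected ones, which is what makes the learned reward usable as an RL training signal.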
Rethinking the Role of PPO in RLHF
The process of Reinforcement Learning with Human Feedback (RLHF) relies on a dominant RL optimizer, Proximal Policy Optimization (PPO), an important component of the training behind powerful virtual assistants such as GPT-4, Claude-2, Bard, and Bing Chat. However, current RLHF processes exhibit a tension between the reward learning phase,…
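PPO's role as the optimizer here rests on its clipped surrogate objective, which limits how far each update can move the policy. A simplified single-sample version (a sketch of the standard PPO formula, not code from the paper):

```python
def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO's per-sample clipped surrogate objective.

    ratio: pi_new(a|s) / pi_old(a|s), the policy probability ratio.
    advantage: estimated advantage of the sampled action.
    Takes the minimum of the unclipped and clipped terms, so large
    policy updates stop earning extra objective value.
    """
    unclipped = ratio * advantage
    clipped_ratio = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
    return min(unclipped, clipped_ratio * advantage)

# With a positive advantage, pushing the ratio past 1 + clip_eps
# yields no additional benefit:
assert ppo_clipped_objective(1.5, 1.0) == ppo_clipped_objective(5.0, 1.0)
```

This clipping is what keeps the fine-tuned policy from drifting too far from the previous policy in a single step, a property RLHF pipelines depend on for stable training.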