MIT researchers have developed a technique to train robots on multiple tasks by combining and optimising data from a variety of sources. At the core of their work is a type of generative AI known as a ‘diffusion model’, which learns from a specific dataset how to complete a task. The innovation lies in combining the strategies, or policies, learnt by these models into a single general policy. The research team calls the method Policy Composition (PoCo).
The MIT team found that by training each diffusion model on a different dataset, such as one assembled from human video demonstrations and another composed of examples from teleoperation of a robotic arm, they could then merge the policies learnt by all the models. The combined policy satisfies the objectives of each individual policy, so the whole becomes greater than the sum of its parts.
Because the policies are trained separately, the research team could mix and match them to get the best results for a given task. Users can also add a new modality or domain by training one additional diffusion policy on the corresponding dataset, without restarting the entire process from the beginning.
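The idea of composing separately trained policies can be illustrated with a toy sketch. It is not the researchers' implementation: the article does not say how the policies are merged, so the weighted sum of noise predictions used here is an assumption, and every name in it (`make_toy_policy`, `compose`, `sample_action`) is hypothetical. Each "policy" is a hand-written stand-in for a trained diffusion model, and the sampler is a simplified, deterministic denoising loop.

```python
import random


def make_toy_policy(target_action):
    """Hypothetical stand-in for a diffusion policy trained on one dataset.

    A real policy would be a neural network. This toy version "predicts
    noise" as the offset between the current action and the action its
    dataset prefers.
    """
    target = list(target_action)

    def predict_noise(action, step):
        return [a - t for a, t in zip(action, target)]

    return predict_noise


def compose(policies, weights):
    """Combine policies by a weighted sum of their noise predictions.

    Assumption: this is one plausible way to merge diffusion policies,
    chosen for illustration only.
    """
    total = sum(weights)
    norm = [w / total for w in weights]

    def predict_noise(action, step):
        preds = [p(action, step) for p in policies]
        return [
            sum(w * pred[i] for w, pred in zip(norm, preds))
            for i in range(len(action))
        ]

    return predict_noise


def sample_action(policy, action_dim, num_steps=50, step_size=0.2, seed=0):
    """Start from random noise and repeatedly remove the predicted noise."""
    rng = random.Random(seed)
    action = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    for step in reversed(range(num_steps)):
        noise = policy(action, step)
        action = [a - step_size * n for a, n in zip(action, noise)]
    return action


# Two policies, each standing in for a model trained on a different dataset.
video_policy = make_toy_policy([1.0, 0.0])   # e.g. human video demonstrations
teleop_policy = make_toy_policy([0.0, 1.0])  # e.g. teleoperated robot arm
combined = compose([video_policy, teleop_policy], [0.5, 0.5])

# Adding a new domain means building one more policy and composing again;
# the existing policies are reused unchanged.
sim_policy = make_toy_policy([1.0, 1.0])     # e.g. simulation data
extended = compose([video_policy, teleop_policy, sim_policy], [1, 1, 1])

print(sample_action(combined, action_dim=2))
print(sample_action(extended, action_dim=2))
```

In this toy setting the composed sampler settles on an action that balances the preferences of every policy it was given, and changing the weights shifts that balance towards one data source or another.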
PoCo improved task performance by up to 20% compared with baseline techniques. The team applied the method both in simulation and on real-world robotic arms, with tasks ranging from using a hammer to flipping an object with a spatula.
The research, to be presented at the Robotics: Science and Systems Conference, marks the first step in a broader plan to apply the technique to more complex problems. The researchers aim to move on to tasks that require a robot to manipulate multiple tools, and to incorporate larger datasets to improve the robot’s performance.
This work was partly funded by Amazon, the National Science Foundation, Singapore’s Defense Science and Technology Agency, and the Toyota Research Institute, though the specific amounts have not been disclosed. With PoCo demonstrating a significant improvement in robotic performance, it may be a matter of when, not if, we see home robots capable of a range of handyman tasks.