Robots are widely used in automated assembly, but they adapt poorly to high-mix, low-volume manufacturing. Robot learning, which lets robots acquire assembly skills from demonstrations rather than hand-scripted routines, offers a potential solution. However, teaching robots to assemble parts from raw sensor data is difficult because these tasks are long-horizon and demand high precision, which calls for new training methods.
Researchers have proposed several approaches for training assembly policies directly from raw sensory input, most notably Reinforcement Learning (RL) and Imitation Learning (IL). RL struggles with long-horizon tasks and sparse rewards, making it a poor fit for assembly. IL sidesteps these issues, and in small-data regimes users need only collect a modest number of demonstrations themselves, which keeps the data-collection burden low. Learning from such a small dataset, however, introduces its own challenges.
One major hurdle is fitting complex demonstrated behavior from raw images, especially for long-horizon, high-precision tasks. The choice of policy architecture and action-prediction mechanism strongly affects how well the model learns from limited data. Recent work suggests that predicting chunks of multiple future actions and representing the policy as a conditional diffusion model can both improve performance.
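To make this concrete, below is a minimal sketch of a chunk-predicting, diffusion-style policy in PyTorch. The class and parameter names (ChunkDiffusionPolicy, chunk_size, num_denoise_steps) are illustrative assumptions rather than the paper's implementation, and the denoising update is heavily simplified compared to a real DDPM/DDIM scheduler; the point is only the interface: condition on an image embedding, iteratively denoise a chunk of future actions, and execute that chunk.

```python
# Sketch of a conditional "diffusion-style" policy predicting an action chunk.
# Names and the denoising update are simplified assumptions for illustration.
import torch
import torch.nn as nn

class ChunkDiffusionPolicy(nn.Module):
    def __init__(self, obs_dim=512, action_dim=7, chunk_size=8, hidden=256):
        super().__init__()
        self.chunk_size = chunk_size
        self.action_dim = action_dim
        # Denoiser: takes (noisy action chunk, observation embedding, timestep)
        # and predicts the noise component to remove.
        self.denoiser = nn.Sequential(
            nn.Linear(chunk_size * action_dim + obs_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, chunk_size * action_dim),
        )

    def forward(self, obs_emb, noisy_chunk, t):
        # obs_emb: (B, obs_dim); noisy_chunk: (B, chunk, action_dim); t: (B, 1)
        flat = noisy_chunk.flatten(1)
        pred_noise = self.denoiser(torch.cat([flat, obs_emb, t], dim=-1))
        return pred_noise.view(-1, self.chunk_size, self.action_dim)

    @torch.no_grad()
    def sample(self, obs_emb, num_denoise_steps=10):
        # Start from Gaussian noise and iteratively denoise into an action chunk.
        B = obs_emb.shape[0]
        chunk = torch.randn(B, self.chunk_size, self.action_dim)
        for step in reversed(range(num_denoise_steps)):
            t = torch.full((B, 1), step / num_denoise_steps)
            pred_noise = self(obs_emb, chunk, t)
            chunk = chunk - pred_noise / num_denoise_steps  # simplified update, not a real scheduler
        return chunk

# Usage: encode the current camera image (encoder not shown), then sample a
# chunk of 8 future end-effector actions and execute them before re-planning.
policy = ChunkDiffusionPolicy()
obs_emb = torch.randn(1, 512)          # stand-in for an image-encoder output
action_chunk = policy.sample(obs_emb)  # shape (1, 8, 7)
```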
Precise or “bottleneck” regions, such as the moment just before an insertion, where minor inaccuracies lead to failure, are another significant hurdle for learning robust behavior. Structured data augmentation and noising methods have been proposed to handle these regions.
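As a rough illustration of what such noising can look like, the sketch below perturbs the end-effector pose recorded just before an insertion and labels the delta that moves back toward the demonstrated pose as the corrective action. The function name, pose format, and noise model are assumptions for illustration, not the authors' code.

```python
# Sketch of structured noising around a bottleneck state with corrective
# relabeling. Names and the pose layout (xyz + quaternion) are assumptions.
import numpy as np

def noise_and_correct(bottleneck_pose, num_samples=20, noise_scale=0.01):
    """bottleneck_pose: (7,) demonstrated end-effector pose (xyz + quaternion)."""
    augmented = []
    for _ in range(num_samples):
        noise = np.zeros(7)
        noise[:3] = np.random.normal(scale=noise_scale, size=3)  # positional noise only
        perturbed = bottleneck_pose + noise
        # Corrective action: a delta-pose command pointing back to the demo pose.
        corrective_action = bottleneck_pose - perturbed
        augmented.append((perturbed, corrective_action))
    return augmented

# Each (perturbed state, corrective action) pair is added to the training set
# so the policy learns to recover from small misalignments near insertion.
pairs = noise_and_correct(np.array([0.5, 0.0, 0.2, 0.0, 0.0, 0.0, 1.0]))
```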
A novel technique automatically resets the simulator to bottleneck states, applies “disassembly” actions to perturb the scene, and synthesizes corrective actions by reversing the disassembly sequence. This enables structured data noising across a much wider variety of situations, making the model more robust to environmental variation.
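The sketch below illustrates the reversal idea under the simplifying assumption that actions are reversible delta-pose commands recorded during the simulated disassembly; the function name, data layout, and toy usage are hypothetical stand-ins rather than the actual pipeline.

```python
# Sketch: turn a recorded disassembly perturbation into a corrective assembly
# segment by reversing time and negating each delta action. Assumes reversible
# delta-pose actions; names are illustrative.
import numpy as np

def reverse_disassembly(disassembly_states, disassembly_actions):
    """disassembly_actions[i] moves the scene from state i to state i+1, away
    from the assembled bottleneck pose. Reversing the order and negating each
    delta yields state/action pairs that move back toward assembly."""
    corrective_states = disassembly_states[::-1]
    corrective_actions = [-np.asarray(a) for a in reversed(disassembly_actions)]
    return list(zip(corrective_states, corrective_actions))

# Toy usage: three perturbation steps that each pulled the part 1 mm upward;
# the synthesized correction pushes it back down toward the insertion pose.
states = [np.array([0.5, 0.0, 0.200 + 0.001 * i]) for i in range(1, 4)]
actions = [np.array([0.0, 0.0, 0.001])] * 3
correction = reverse_disassembly(states, actions)
```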
Researchers have also studied ways to automatically expand the dataset with whole trajectories through iterative model-development cycles across tasks. By collecting successful or partially successful rollouts during policy evaluation and incorporating data from parallel tasks, the dataset can grow without additional human effort.
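A minimal sketch of such a self-improvement loop is shown below. The train_policy, rollout, and is_success functions are placeholders, and the loop only captures the collect-filter-retrain structure, not the authors' actual pipeline.

```python
# Sketch of iterative dataset expansion: evaluate the current policy, keep
# successful rollouts, fold them back into the training set, and retrain.
# train_policy, rollout, and is_success are hypothetical placeholder callables.

def expand_dataset(initial_demos, train_policy, rollout, is_success,
                   num_rounds=3, rollouts_per_round=100):
    dataset = list(initial_demos)
    policy = train_policy(dataset)
    for _ in range(num_rounds):
        new_trajectories = []
        for _ in range(rollouts_per_round):
            traj = rollout(policy)        # run the policy in simulation
            if is_success(traj):          # keep only (partially) successful attempts
                new_trajectories.append(traj)
        dataset.extend(new_trajectories)  # grow the dataset without new human demos
        policy = train_policy(dataset)    # retrain on the expanded dataset
    return policy, dataset
```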
In conclusion, the proposed JUICER pipeline offers a comprehensive approach to learning high-precision manipulation from a small number of demonstrations. By combining diffusion-policy architectures with dataset expansion through structured noising and iterative model-development cycles, JUICER achieves significant improvements in overall task success compared to baseline methods. The released tools and datasets will allow the research community to explore and build on these advances in robot learning for assembly tasks.