Augmented Reality (AR) presents unique challenges that require understanding the world from a first-person rather than a third-person perspective. Synthetic data, which has proven valuable for training third-person vision models, remains underused in embodied egocentric perception. A major obstacle is accurately simulating human movements and behaviors, which are needed to drive embodied cameras that capture true-to-life egocentric views of a 3D environment.
To tackle this, researchers at ETH Zurich and Microsoft have developed EgoGen, a new synthetic data generator. At its core is a human motion synthesis model that uses egocentric visual inputs of a virtual human to perceive the surrounding 3D environment. The model combines collision-avoiding motion primitives with a two-stage reinforcement learning strategy, resulting in a closed-loop system in which the virtual human's embodied perception and movement are tightly coupled. Unlike prior approaches, it requires no predefined global path and applies directly to dynamic environments.
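To make the closed-loop idea concrete, the sketch below shows a minimal perception-driven stepping loop in Python: the virtual human renders an egocentric observation, a policy picks a short motion primitive, the body moves, and perception repeats. The class and method names (EgoDepthSensor, MotionPrimitivePolicy, rollout) are illustrative assumptions for this sketch, not EgoGen's actual API.

```python
# Minimal sketch of a closed-loop, egocentric perception-driven motion loop.
# All class and method names are illustrative assumptions; they do not
# reflect EgoGen's actual implementation.

import numpy as np


class EgoDepthSensor:
    """Hypothetical egocentric sensor mounted on the virtual human's head."""

    def render(self, scene, head_pose):
        # Placeholder: render a depth map from the current head pose.
        return np.zeros((64, 64), dtype=np.float32)


class MotionPrimitivePolicy:
    """Hypothetical RL policy mapping egocentric observations to a short
    motion primitive (a few frames of body poses plus a root displacement)."""

    def act(self, ego_observation, goal_direction):
        # Placeholder: return a collision-avoiding motion primitive.
        return {"poses": np.zeros((10, 63)),
                "root_displacement": goal_direction * 0.1}


def rollout(scene, policy, sensor, head_pose, goal, num_steps=100):
    """Closed loop: perceive egocentrically, choose a primitive, move, repeat."""
    trajectory = []
    for _ in range(num_steps):
        obs = sensor.render(scene, head_pose)              # embodied perception
        direction = goal - head_pose[:3]
        primitive = policy.act(obs, direction / (np.linalg.norm(direction) + 1e-8))
        head_pose[:3] += primitive["root_displacement"]    # apply the motion
        trajectory.append(primitive["poses"])
        if np.linalg.norm(goal - head_pose[:3]) < 0.2:     # goal reached
            break
    return trajectory
```

Because each action is conditioned on what the embodied camera currently sees, no global path needs to be planned in advance, which is what lets such a loop react to dynamic scenes.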
EgoGen allows existing real-world egocentric datasets to be seamlessly augmented with synthetic images. In experiments, this improved the performance of current algorithms on tasks such as mapping and localization for head-mounted cameras, egocentric camera tracking, and human mesh recovery from egocentric views, underlining EgoGen's utility for strengthening existing methods and its potential to advance egocentric computer vision research.
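In practice, augmenting a real egocentric dataset with synthetic images often amounts to concatenating or ratio-sampling two datasets during training. The sketch below shows one common way to do this with PyTorch; RealEgoDataset and SyntheticEgoDataset are assumed placeholders for this example, not classes shipped with EgoGen.

```python
# Sketch: blending real and synthetic egocentric images for training.
# RealEgoDataset / SyntheticEgoDataset are assumed placeholders, not EgoGen APIs.

import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class RealEgoDataset(Dataset):
    """Placeholder for a real-world egocentric dataset."""

    def __len__(self):
        return 1000

    def __getitem__(self, idx):
        return torch.rand(3, 224, 224), 0  # (image, dummy label)


class SyntheticEgoDataset(Dataset):
    """Placeholder for synthetically rendered egocentric images."""

    def __len__(self):
        return 5000

    def __getitem__(self, idx):
        return torch.rand(3, 224, 224), 1


# Simple concatenation; a weighted sampler could instead enforce a fixed
# real-to-synthetic ratio per batch.
mixed = ConcatDataset([RealEgoDataset(), SyntheticEgoDataset()])
loader = DataLoader(mixed, batch_size=32, shuffle=True)
```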
EgoGen provides an easy-to-use, scalable data generation pipeline that is effective across key tasks and fully open source, helping researchers create realistic egocentric training data. Its flexibility makes it useful beyond these evaluated tasks, in areas such as human-computer interaction, virtual reality, and robotics. With its open-source release, the researchers hope EgoGen will spur innovation in egocentric perception and contribute to the wider field of computer vision research.
For more details, see the Paper and Code. Credit for this research goes to the researchers behind the project.