A team of researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and Google Research has developed an image-to-image diffusion model called Alchemist, which allows users to modify the material properties of objects in photos. The system adjusts attributes such as roughness, metallicity, albedo (an object's intrinsic color), and transparency, and works on both real and AI-generated images.
The tool is built on a denoising diffusion model and lets users adjust an object's material properties through an easy-to-use, slider-based interface. This approach outperforms previous methods by giving users control over low-level material attributes, something earlier diffusion-based editing models have largely overlooked.
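The paper's architecture and training details aren't reproduced in this article, but the core idea of feeding scalar "slider" values into a denoising network can be illustrated with a rough sketch. The minimal PyTorch example below is an assumption-laden toy, not the authors' implementation: the network, the attribute ordering, the conditioning-by-addition scheme, and the crude sampling loop are all illustrative stand-ins.

```python
import torch
import torch.nn as nn

class AttributeConditionedDenoiser(nn.Module):
    """Toy denoiser: predicts noise from a noisy image, the original
    image, a timestep, and a vector of scalar material attributes
    (the "slider" values). Purely illustrative, not Alchemist itself."""

    def __init__(self, channels=3, hidden=64, n_attrs=4):
        super().__init__()
        # Embed the four hypothetical slider values
        # (roughness, metallic, albedo, transparency).
        self.attr_embed = nn.Sequential(
            nn.Linear(n_attrs, hidden), nn.SiLU(), nn.Linear(hidden, hidden)
        )
        self.time_embed = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden)
        )
        # Image-to-image: the noisy image is concatenated with the
        # original image so the edit stays anchored to the input photo.
        self.net = nn.Sequential(
            nn.Conv2d(channels * 2, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x_noisy, x_orig, t, attrs):
        h = torch.cat([x_noisy, x_orig], dim=1)
        # Inject attribute and timestep embeddings by broadcasting
        # them across the spatial dimensions after the first conv.
        cond = self.attr_embed(attrs) + self.time_embed(t)
        h = self.net[0](h) + cond[:, :, None, None]
        for layer in self.net[1:]:
            h = layer(h)
        return h  # predicted noise

@torch.no_grad()
def edit_material(model, image, attrs, steps=50):
    """Simplified reverse-diffusion loop conditioned on slider values.
    A real system would use a proper DDPM/DDIM sampler."""
    x = torch.randn_like(image)
    for i in reversed(range(steps)):
        t = torch.full((image.shape[0], 1), i / steps)
        eps = model(x, image, t, attrs)
        x = x - eps / steps  # crude Euler-style update for illustration
    return x

model = AttributeConditionedDenoiser()
image = torch.rand(1, 3, 64, 64)
# Hypothetical slider values: roughness, metallic, albedo, transparency.
sliders = torch.tensor([[0.0, 0.8, 0.0, 0.0]])  # push "metallic" up
edited = edit_material(model, image, sliders)
print(edited.shape)  # torch.Size([1, 3, 64, 64])
```

The design choice the sketch is meant to convey is that each slider maps to a continuous scalar the denoiser is conditioned on, so a single model can produce a smooth range of edits rather than a fixed set of presets.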
Previously, adjusting the material properties of an object in an image with software like Photoshop required many separate manual steps. Alchemist simplifies this process, giving users fine-grained control over specific material properties from a single input image.
The potential applications for this technology are extensive. In video game design, it could help artists rapidly refine textures and adjust the appearance of models, increasing efficiency. Alchemist could also be applied to film visual effects, graphic design, video production, and even the generation of robotic training data, where its precision and photorealism would be valuable.
Despite its advantages, Alchemist has a few limitations. It often struggles to infer scene illumination accurately, so its edits sometimes fail to match the user's input, and in some cases it produces physically implausible transparency effects. The research team plans to address these issues in future work.
The researchers believe Alchemist's ability to infer material properties from images could prove instrumental in exploring connections between an object's visual and physical attributes, charting a path for future AI work in photo editing and scene-level 3D graphics. The team will present its findings at CVPR in June.