Google and MIT Scientists Unveil SynCLR: An AI System for Training Visual Representations Solely from Synthetic Images and Synthetic Captions, without Any Real Data

Can visual representations be learned from synthetic data alone? New research from Google Research and MIT CSAIL explores whether large-scale curated datasets for training state-of-the-art visual representations can be built from synthetic data produced by off-the-shelf generative models. The approach, known as Learning from Models, curates data through the controls that generative models expose: latent variables, conditioning variables, and hyperparameters. Because models are more compact than datasets, they are easier to store and share, and a model can generate an unlimited number of samples, albeit with limited variability.

In developing this method, the researchers use generative models to rethink the granularity of visual classes. A single caption can produce many images that all match it, so every image generated from the same caption can be treated as a member of one class. This caption-level granularity sits between self-supervised methods like SimCLR, where each image is effectively its own class, and supervised training, where classes are broad categories. The team found that caption-level granularity outperforms both, and that the method extends easily through online class (or data) augmentation, which in principle allows scaling to an unlimited number of classes.
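To make the caption-level grouping concrete, here is a minimal sketch of a multi-positive contrastive loss in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the article does not spell out the loss, and the function name, the temperature value, and the toy embeddings are all hypothetical. Each image is an anchor, every other image generated from the same caption counts as a positive, and the loss is the cross-entropy between a uniform distribution over those positives and a softmax over similarities to all other images.

```python
import math


def multi_positive_contrastive_loss(embeddings, caption_ids, temperature=0.1):
    """Mean cross-entropy between a uniform target over same-caption images
    and the softmax over cosine similarities to every other image.

    Assumes non-zero embedding vectors. Hypothetical helper for illustration.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    z = [normalize(vec) for vec in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        others = [j for j in range(n) if j != i]  # an anchor never matches itself
        positives = [j for j in others if caption_ids[j] == caption_ids[i]]
        if not positives:
            continue  # anchors without a same-caption partner are skipped
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in others
        }
        # Numerically stable log of the softmax normalizer.
        top = max(logits.values())
        log_norm = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        # Uniform target over positives: average their negative log-probabilities.
        losses.append(-sum(logits[j] - log_norm for j in positives) / len(positives))
    if not losses:
        raise ValueError("no anchor has another image with the same caption")
    return sum(losses) / len(losses)


# Toy example: four 2-D "image embeddings", two images per caption.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(multi_positive_contrastive_loss(toy, [0, 0, 1, 1]))  # captions match the geometry
print(multi_positive_contrastive_loss(toy, [0, 1, 0, 1]))  # captions contradict it
```

In the toy example the first call groups nearby embeddings under the same caption and the second groups distant ones, so the first loss comes out lower. Treating each image as its own class would remove all positives other than augmented views, while using broad category labels would merge many captions into one group.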

The team’s results show that Synthetic Contrastive Learning (SynCLR) is competitive with representations learned from real data. On fine-grained classification tasks, SynCLR surpasses CLIP by 3.3% for ViT-B and 1.5% for ViT-L, and performs comparably to DINO v2 models, which are derived from a pre-trained ViT-g model. On semantic segmentation, SynCLR surpasses MAE pre-trained on ImageNet by 6.2 and 4.1 mIoU for ViT-B and ViT-L respectively. These results point to the potential of representation learning with synthetic data, including for dense prediction tasks.

The team suggests several ways to improve the caption set: using more sophisticated language models, improving the sampling ratios among distinct concepts, and expanding the library of in-context examples. It will be interesting to see how far representation learning with synthetic data can go!
