Researchers from MIT have developed an image dataset that simulates peripheral vision in machine learning models, improving their object detection capabilities. However, even with this modification, the AI models still fell short of human performance. The researchers also found that object size and visual clutter, factors that strongly affect human performance, had little effect on the AI models' performance. Moving forward, these insights could help create machine learning models that more closely resemble how humans interpret visual information, improving technology like driver safety systems and display interfaces.

One crucial use for peripheral vision is directing our eyes to collect more information. “Modeling peripheral vision… can help us understand the features in a visual scene that make our eyes move to collect more information,” explains Anne Harrington MEng ’23. A better understanding of how peripheral vision works in AI models could help predict human behavior, making interactions with machines like cars or user interfaces more effective and accommodating.

To simulate peripheral vision, the MIT team began with the texture tiling model, a technique used to replicate peripheral vision in humans. This method represents a loss of visual detail by transforming images to seem more textural the farther they move into the periphery. By modifying this model, the team was able to transform images in a flexible way that didn’t require predicting where the AI would focus its “vision”. This technique produced a large dataset of transformed images, which were used to train several computer vision models and test their object detection proficiency against humans.
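The idea of progressively discarding detail with distance from the point of fixation can be illustrated with a minimal sketch. The code below is not the texture tiling model itself, which synthesizes texture statistics; it is a much cruder stand-in that simply blurs each pixel more strongly the farther it sits from an assumed fixation point. The function names (`box_blur`, `peripheral_transform`) and the linear blur-versus-eccentricity schedule are illustrative assumptions, not part of the researchers' method.

```python
import numpy as np

def box_blur(img, k):
    """Box-blur a 2-D array with a (2k+1) x (2k+1) window; k=0 returns a copy."""
    if k == 0:
        return img.astype(float).copy()
    h, w = img.shape
    pad = np.pad(img.astype(float), k, mode="edge")
    # Integral image with a leading zero row/column for fast window sums.
    integral = np.zeros((h + 2 * k + 1, w + 2 * k + 1))
    integral[1:, 1:] = pad.cumsum(axis=0).cumsum(axis=1)
    win = 2 * k + 1
    total = (integral[win:, win:] - integral[:h, win:]
             - integral[win:, :w] + integral[:h, :w])
    return total / win ** 2

def peripheral_transform(img, fix_y, fix_x, scale=0.1):
    """Blur an image more strongly with distance (eccentricity) from a
    fixation point -- a crude stand-in for peripheral loss of detail."""
    h, w = img.shape
    ys, xs = np.mgrid[:h, :w]
    ecc = np.hypot(ys - fix_y, xs - fix_x)   # eccentricity in pixels
    levels = (ecc * scale).astype(int)       # blur level per pixel
    out = np.empty((h, w), dtype=float)
    for k in range(levels.max() + 1):
        mask = levels == k
        if mask.any():
            out[mask] = box_blur(img, k)[mask]
    return out
```

A key property the real model shares with this toy version is that the transformation is applied over the whole image given a fixation point, so the same machinery can generate many transformed variants of one source image for training.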

The results showed that, while training models from scratch with the new dataset improved their performance, fine-tuning an existing model resulted in only minor improvements. Regardless of the training method, AI systems were consistently unable to match human performance, especially when it came to detecting objects in the far periphery. This may suggest that these models do not use contextual cues in the same way humans do when detecting objects. Further studies aim to understand these differences better and develop a model that can predict human behavior in the visual periphery.

This research contributes to our understanding of the role our peripheral vision plays in real-world tasks, explains Justin Gardner, an associate professor in the Department of Psychology at Stanford University. Despite recent advancements, AI models still struggle to match human performance, suggesting that peripheral vision is not merely lower-quality vision caused by fewer photoreceptors, but is instead optimized for real-world tasks. This study could spur further AI research that examines the neuroscience of human vision and improves technological interfaces for human use. The extensive image database provided by the researchers, which mimics peripheral vision in humans, will be a significant tool in this future research.
