MIT researchers are replicating peripheral vision—a human’s ability to detect objects outside their direct line of sight—in AI systems, which could enable these machines to more effectively identify imminent dangers or predict human behavior. By training machine learning models on an extensive image dataset designed to imitate peripheral vision, the team found these models were better at recognizing objects in the visual periphery. Despite measurable improvements, the AI systems still underperformed humans. Researchers also discovered that, unlike in humans, the size of objects and the degree of visual clutter in a scene did not significantly affect AI performance.
The research, published by lead author Anne Harrington MEng ’23 and co-author Vasha DuTell, may support the development of machine-learning models that more closely resemble human vision. The technology could make displays more accessible and easier to view, or could predict how humans will react—particularly useful in automotive settings, for instance.
To simulate peripheral vision, the researchers used a method known as the ‘texture tiling model,’ which models peripheral vision in humans by transforming images to represent the progressive loss of visual detail away from the point of gaze. The researchers modified this model so it could transform images flexibly, without prior knowledge of where the viewer’s eyes are focused.
Using this modified technique, the researchers created a large dataset of transformed images that represent the loss of detail in a human’s peripheral vision. They then trained and tested multiple AI models on an object detection task.
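The idea of degrading an image to mimic peripheral detail loss can be illustrated with a much simpler stand-in than the texture tiling model itself: blurring an image progressively with distance from a chosen gaze point. The sketch below is an assumption-laden simplification, not the researchers’ method; the `foveate` function and its parameters are illustrative names.

```python
# Minimal sketch (NOT the texture tiling model): approximate peripheral
# detail loss by blurring more strongly with eccentricity from a gaze point.
import numpy as np
from scipy.ndimage import gaussian_filter

def foveate(image, gaze, n_bands=4, max_sigma=6.0):
    """Blend progressively blurred copies of a 2-D `image` so that blur
    increases with distance from the `gaze` (row, col) point."""
    h, w = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    ecc = np.hypot(rows - gaze[0], cols - gaze[1])
    ecc = ecc / ecc.max()  # normalize eccentricity to [0, 1]

    out = np.zeros_like(image, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bands + 1)
    for i in range(n_bands):
        # Blur strength grows with the band's distance from the gaze point.
        sigma = max_sigma * edges[i + 1]
        blurred = gaussian_filter(image.astype(float), sigma=sigma)
        mask = (ecc >= edges[i]) & (ecc <= edges[i + 1])
        out[mask] = blurred[mask]
    return out
```

Applying such a transform to every image in a dataset, with gaze points chosen independently of image content, gives training data whose detail falls off with eccentricity—the general shape of what the modified model produces, though the actual texture tiling model synthesizes texture statistics rather than applying simple blur.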
The team found that training models from scratch on this dataset led to major improvements in object detection, while using the dataset to fine-tune an existing model for a new task yielded smaller gains. Even so, the AI models were not as proficient as humans, particularly at detecting objects in the far periphery, indicating that the models may not be using context in the same way humans do. The researchers plan to investigate these differences further and continue working toward a model that can accurately predict human behavior in the visual periphery. The research was funded by the Toyota Research Institute and the MIT CSAIL METEOR Fellowship.