Peripheral vision, the ability of most humans to see objects outside their direct line of sight, albeit in less detail, does not exist in AI. However, researchers at MIT have made significant progress towards it by developing an image dataset that simulates peripheral vision in machine learning models. The research showed that models trained with this dataset became better at identifying objects in the visual periphery, but still fell short of human abilities.
The researchers, led by MIT postdoctoral researcher Vasha DuTell, used the image dataset to train models that mimic the human eye’s ability to detect shapes and movement outside its direct focus. Unlike in humans, however, neither the size of objects nor the amount of visual clutter in a scene significantly affected the models’ performance. DuTell suggests these results point to a fundamental issue with the models and emphasises the need to better understand what is missing from their design.
The study’s lead author, Anne Harrington MEng ’23, posits that accurately modelling peripheral vision could enable a more comprehensive understanding of the factors in a visual scene that engage our attention, leading to better prediction of human behaviour.
The research used a texture tiling model, a technique that replicates human peripheral vision by modelling how visual information is lost with distance from the point of fixation. The researchers modified the model to be more flexible and used it to generate a dataset of images that reproduce the loss of detail experienced in human peripheral vision.
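To make the idea of eccentricity-dependent information loss concrete, here is a minimal sketch of how one might degrade an image more heavily the further a pixel lies from a fixation point. This is a simplified stand-in, not the researchers’ texture tiling model (which pools texture statistics rather than simply blurring); the function name, sigma values, and ring radii below are all illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_peripheral_loss(image, fixation, base_sigma=0.5, sigma_scale=0.05):
    """Crudely mimic peripheral detail loss: pixels further from the
    fixation point receive a stronger Gaussian blur.

    NOTE: a hypothetical illustration, not the MIT texture tiling model.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Eccentricity: distance of each pixel from the fixation point, in pixels.
    ecc = np.hypot(ys - fixation[0], xs - fixation[1])

    # Precompute a few blur levels, stronger at larger eccentricity.
    ring_edges = [50, 100, 200]  # illustrative ring radii in pixels
    sigmas = [base_sigma + sigma_scale * r for r in [0] + ring_edges]
    blurred = [gaussian_filter(image.astype(float), s) for s in sigmas]

    # Assign each pixel the blur level of the eccentricity ring it falls in.
    levels = np.digitize(ecc, bins=ring_edges)
    return np.choose(levels, blurred)
```

A real texture tiling model replaces content in each peripheral region with matched texture statistics instead of blur, which is why the researchers’ dataset preserves the “summary” appearance of clutter rather than just softening it.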
The models were then trained on this dataset, and their performance was compared with that of humans in an object-detection task. Training models from scratch on the dataset produced the largest performance improvements, while fine-tuning existing models yielded smaller gains. None of the models, however, matched human performance, especially at detecting objects in the far periphery.
Their findings suggest that machine learning models do not use context in the same way humans do. The researchers aim to continue investigating these differences to develop a model that can accurately predict human performance in the visual periphery. The ultimate goal is to enable AI systems that can alert drivers to unseen hazards, for example.
The researchers plan to make their dataset publicly available to encourage further studies in computer vision. Their findings support the view that human peripheral vision is not merely a consequence of our limited photoreceptors but an adaptation optimised for real-world tasks. The work highlights the need for AI research to learn from the neuroscience of human vision, a task the authors’ dataset of images reflecting peripheral vision should considerably aid.