The Impact of Perceptual Straightness in Computer Vision
Category: Computer Science | Sunday, May 14, 2023, 22:53 UTC

MIT researchers have discovered that a specific training method can help computer vision models learn more perceptually straight representations, like humans do. This could help create more accurate predictions, such as in autonomous vehicles.
Imagine sitting on a park bench, watching someone stroll by. While the scene may constantly change as the person walks, the human brain can transform that dynamic visual information into a more stable representation over time. This ability, known as perceptual straightening, helps us predict the walking person's trajectory.
Unlike humans, computer vision models don't typically exhibit perceptual straightness; instead, they represent visual information in a highly unpredictable way. But if machine-learning models had this ability, it might enable them to better estimate how objects or people will move.
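Perceptual straightness is commonly quantified as the curvature of the trajectory a model's feature vectors trace out over successive video frames: the average angle between consecutive steps, where a smaller angle means a straighter, more predictable trajectory. A minimal NumPy sketch of that curvature metric (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def trajectory_curvature(features):
    """Mean angle (degrees) between successive difference vectors of a
    sequence of feature vectors; lower values mean a straighter trajectory."""
    diffs = np.diff(features, axis=0)                      # steps between frames
    diffs = diffs / np.linalg.norm(diffs, axis=1, keepdims=True)
    cosines = np.sum(diffs[:-1] * diffs[1:], axis=1)       # cos of turn angles
    return np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0))).mean()

# A trajectory that moves along a line has curvature ~0 degrees.
t = np.linspace(0.0, 1.0, 10)[:, None]
straight = np.hstack([t, 2.0 * t])                         # points on a line
print(trajectory_curvature(straight))
```

A model whose features for a natural video yield a low value of this metric is, in this sense, more perceptually straight.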
MIT researchers have discovered that a specific training method can help computer vision models learn more perceptually straight representations, like humans do. Training involves showing a machine-learning model millions of examples so it can learn a task.
The researchers found that training computer vision models using a technique called adversarial training, which makes them less reactive to tiny errors added to images, improves the models' perceptual straightness.
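The core idea of adversarial training is to perturb each training input in the direction that most increases the loss, then take the gradient step on those perturbed inputs. A toy sketch of that loop, using FGSM-style sign-of-gradient perturbations on a logistic-regression stand-in for a vision model (the dataset, hyperparameters, and analytic gradients here are illustrative assumptions, not the researchers' setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy two-class data (a stand-in for images): class 1 near (2, 2), class 0 near (-2, -2).
X = rng.normal(size=(200, 2)) + np.array([[2.0, 2.0]])
X[100:] -= 4.0
y = np.concatenate([np.ones(100), np.zeros(100)])

w = np.zeros(2)
eps, lr = 0.1, 0.1
for _ in range(200):
    # FGSM: nudge each input in the direction that most increases the loss.
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    X_adv = X + eps * np.sign(grad_x)
    # Ordinary gradient step, but computed on the perturbed inputs.
    grad_w = (sigmoid(X_adv @ w) - y) @ X_adv / len(y)
    w -= lr * grad_w

acc = ((sigmoid(X @ w) > 0.5) == y).mean()
print(acc)
```

Training against these worst-case perturbations makes the model's outputs less sensitive to small input changes, which is the property the researchers found correlates with straighter representations.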
The team also discovered that perceptual straightness is affected by the task one trains a model to perform. Models trained to perform abstract tasks, like classifying images, learn more perceptually straight representations than those trained to perform more fine-grained tasks, like assigning every pixel in an image to a category.
For example, a model trained to classify images develops internal activations that represent "dog," which allow it to detect a dog in any image of one. Perceptually straight representations keep that "dog" representation stable when there are small changes in the image, making the model more robust.
By gaining a better understanding of perceptual straightness in computer vision, the researchers hope to uncover insights that could help them develop models that make more accurate predictions. For instance, this property might improve the safety of autonomous vehicles that use computer vision models to predict the trajectories of pedestrians, cyclists, and other vehicles.
"One of the take-home messages here is that taking inspiration from biological systems, such as human vision, can both give you insight about why certain things work the way that they do and also inspire ideas to improve neural networks," says Vasha DuTell, an MIT postdoc and co-author of a paper exploring perceptual straightness in computer vision.
Joining DuTell on the paper are lead author Anne Harrington, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); Ayush Tewari, a postdoc; Mark Hamilton, a graduate student; Simon Stent, research manager at Woven Planet; Ruth Rosenholtz, principal research scientist in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of CSAIL. The research was presented at the International Conference on Learning Representations.
After reading a 2019 paper from a team of New York University researchers about perceptual straightness in humans, DuTell, Harrington, and their colleagues set out to measure the perception of stability and movement in computer vision models.