Deep Neural Networks Develop Their Own Unnatural Invariances for Sensory Classification

Category Computer Science

tldr #

In a new study, MIT neuroscientists have found that deep neural networks trained to identify objects or words also give the same responses to images or sounds that bear no resemblance to the target. The paper suggests that these models develop their own idiosyncratic 'invariances', leading them to treat strange, unrecognizable images and sounds as equivalent to natural ones. The findings provide a new way to evaluate how well such models mimic the organization of human sensory perception.


content #

Human sensory systems are very good at recognizing objects that we see or words that we hear, even if the object is upside down or the word is spoken by a voice we've never heard. Computational models known as deep neural networks can be trained to do the same thing, correctly identifying an image of a dog regardless of what color its fur is, or a word regardless of the pitch of the speaker's voice. However, a new study from MIT neuroscientists has found that these models often also respond the same way to images or words that have no resemblance to the target.

This research is the first of its kind to demonstrate that deep neural networks trained for sensory classification develop their own invariances, extending even to stimuli that humans cannot recognize.

When these neural networks were used to generate an image or a sound that they responded to in the same way as a specific natural input, such as a picture of a bear, most of them produced images or sounds that were unrecognizable to human observers. This suggests that these models build up their own idiosyncratic "invariances": they respond the same way to stimuli with very different features.

The findings offer a new way for researchers to evaluate how well these models mimic the organization of human sensory perception, says Josh McDermott, an associate professor of brain and cognitive sciences at MIT and a member of MIT's McGovern Institute for Brain Research and Center for Brains, Minds, and Machines.

"This paper shows that you can use these models to derive unnatural signals that end up being very diagnostic of the representations in the model," says McDermott, who is the senior author of the study. "This test should become part of a battery of tests that we as a field are using to evaluate models." .

Jenelle Feather Ph.D. '22, who is now a research fellow at the Flatiron Institute Center for Computational Neuroscience, is the lead author of the open-access paper, which appears in Nature Neuroscience. Guillaume Leclerc, an MIT graduate student, and Aleksander Mądry, the Cadence Design Systems Professor of Computing at MIT, are also authors of the paper.

Different perceptions

In recent years, researchers have trained deep neural networks that can analyze millions of inputs (sounds or images) and learn common features that allow them to classify a target word or object roughly as accurately as humans do. These models are currently regarded as the leading models of biological sensory systems.

It is believed that when the human sensory system performs this kind of classification, it learns to disregard features that aren't relevant to an object's core identity, such as how much light is shining on it or what angle it's being viewed from. This is known as invariance, meaning that objects are perceived to be the same even if they show differences in those less important features.

"Classically, the way that we have thought about sensory systems is that they build up invariances to all those sources of variation that different examples of the same thing can have," Feather says. "An organism has to recognize that they're the same thing even though they show up as very different sensory signals." .

The researchers wondered if deep neural networks that are trained to perform classification tasks might develop similar invariances. To answer this, the team assembled a dataset of natural images and sound recordings of 67 spoken words. They then trained two types of deep neural networks (convolutional networks for the images and recurrent networks for the audio) to classify the objects in the images and the words in the recordings.
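
The paper's exact architectures are not described above, but the sketch below, written with PyTorch, shows roughly what these two model families look like; the layer sizes, class counts, and input formats are illustrative assumptions, not the networks used in the study.

```python
# A minimal sketch (not the paper's models): a small convolutional image
# classifier and a recurrent word classifier, both in PyTorch. All sizes
# and class counts here are illustrative assumptions.
import torch
import torch.nn as nn

class ImageClassifier(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):                     # x: (batch, 3, height, width)
        return self.classifier(self.features(x).flatten(1))

class WordClassifier(nn.Module):
    def __init__(self, num_words: int = 67, n_mels: int = 64):
        super().__init__()
        # A GRU reads the spectrogram frame by frame; the final hidden
        # state is mapped to one of the word labels.
        self.rnn = nn.GRU(input_size=n_mels, hidden_size=128, batch_first=True)
        self.classifier = nn.Linear(128, num_words)

    def forward(self, spec):                  # spec: (batch, time, n_mels)
        _, hidden = self.rnn(spec)
        return self.classifier(hidden[-1])
```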

Once the networks had been trained, the team tuned them to respond to specific images or sounds from the natural dataset. When the networks were tested with these stimuli, they responded correctly. However, when the researchers used the networks to generate a picture or a sound that elicited the same response, the images and sounds they produced were not recognizable to humans.
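
The generation step described above can be pictured as an activation-matching optimization. The sketch below is a hedged illustration using an off-the-shelf PyTorch classifier rather than the networks trained in the study; the model, the choice of layer, the loss, the step size, and the iteration count are all assumptions made for the example, not the paper's procedure.

```python
# Hedged sketch: starting from noise, adjust an input image until a chosen
# hidden layer of a trained network responds to it the way it responds to a
# natural reference image. Model, layer, and hyperparameters are illustrative.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1).eval()
for p in model.parameters():
    p.requires_grad_(False)

activations = {}
def save_activation(_module, _inputs, output):
    activations["feat"] = output

# Record activations at an intermediate stage of the network.
model.layer3.register_forward_hook(save_activation)

def layer_response(x):
    model(x)
    return activations["feat"]

reference = torch.rand(1, 3, 224, 224)        # stand-in for a natural image
with torch.no_grad():
    target = layer_response(reference)

synth = torch.rand(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([synth], lr=0.01)

for _ in range(500):
    optimizer.zero_grad()
    loss = F.mse_loss(layer_response(synth.clamp(0.0, 1.0)), target)
    loss.backward()
    optimizer.step()

# After optimization, `synth` drives the chosen layer much like the reference
# does, yet it typically looks nothing like a natural image to a human.
```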

The fact that the networks could generalize their responses to stimuli other than the natural examples used in the training set suggested that they had created their own invariances.

"It is surprising that, when we relax the constraint of the network to use the same stimulus as during training, the network still has this invariance built up," Feather says.

Testing generalization

To confirm that the networks had truly learned invariances, the team tested them on another dataset: images of the same objects at various sizes, and audio recordings of words spoken at different pitches. Both types of networks responded in a highly invariant manner, suggesting that they had learned to abstract away from the unimportant particulars of the original stimuli.
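
As a hedged illustration of this kind of check, one could present rescaled copies of the same image to an off-the-shelf classifier and measure how stable its outputs stay; the model and the cosine-similarity metric below are assumptions for the example, not the study's stimuli or analysis.

```python
# Probe size invariance: rescale an image, feed it back at the network's
# expected resolution, and compare the outputs to the original response.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1).eval()

def logits_at_scale(image, scale):
    # Rescale the image, then resize back to the network's input size.
    scaled = F.interpolate(image, scale_factor=scale, mode="bilinear")
    resized = F.interpolate(scaled, size=(224, 224), mode="bilinear")
    with torch.no_grad():
        return model(resized)

image = torch.rand(1, 3, 224, 224)            # stand-in for a natural image
baseline = logits_at_scale(image, 1.0)
for scale in (0.5, 0.75, 1.25, 1.5):
    similarity = F.cosine_similarity(baseline, logits_at_scale(image, scale)).item()
    print(f"scale {scale}: similarity to original response = {similarity:.3f}")
```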

In addition, the researchers wanted to understand how the networks managed to recognize other images and sounds that elicited similar responses. To do this, they analyzed the activations of the units in the networks as the networks reacted to different stimuli. They found that the activations were larger and more widely distributed when the networks were presented with the generated stimuli than with the natural input.

Overall, the team found that the responses of the networks were highly robust across different types of stimuli, suggesting that they had learned to generalize across very different inputs.

"We are starting to figure out how the models construct invariance," McDermott says. "The current results suggest that the mechanism by which this invariance is constructed is perhaps more intricate than we initially thought, leading to strange, unrecognizable images and sounds." .

