Robot 'Acoustic Swarms' for Enhanced Audio Control and Privacy
Technology · Thursday, September 21, 2023, 21:47 UTC
Introducing a smart speaker system that uses robotic ‘acoustic swarms’ to pinpoint and manage sounds, promising both enhanced audio control and privacy in busy settings.
In virtual meetings, it’s easy to keep people from talking over each other. Someone just hits mute. But for the most part, this ability doesn’t translate easily to recording in-person gatherings. In a bustling cafe, there are no buttons to silence the table beside you. The ability to locate and control sound — isolating one person talking from a specific location in a crowded room, for instance — has challenged researchers, especially without visual cues from cameras.
A team led by researchers at the University of Washington has developed a shape-changing smart speaker, which uses self-deploying microphones to divide rooms into speech zones and track the positions of individual speakers. With the help of the team’s deep-learning algorithms, the system lets users mute certain areas or separate simultaneous conversations, even if two adjacent people have similar voices. Like a fleet of Roombas, each about an inch in diameter, the microphones automatically deploy from, and then return to, a charging station. This allows the system to be moved between environments and set up automatically. In a conference room meeting, for instance, such a system might be deployed instead of a central microphone, allowing better control of in-room audio.
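To make the zone idea concrete: once the system has separated each speaker's audio and estimated where they sit, muting an area is essentially a geometric filter over the separated streams. The Python sketch below is purely illustrative, with hypothetical names (SpeakerStream, mute_zone); it is not the team's code, and it assumes the hard work of separation and localization has already been done upstream.

```python
# Conceptual sketch (not the UW team's code): given per-speaker audio
# streams with estimated 2D positions, zone-based muting reduces to a
# point-in-region test. All names and structures here are illustrative.
from dataclasses import dataclass

@dataclass
class SpeakerStream:
    position: tuple[float, float]  # estimated (x, y) on the table, meters
    samples: list[float]           # separated audio for this speaker

def mute_zone(streams, zone_center, zone_radius):
    """Zero out any speaker stream whose estimated position falls
    inside a circular 'mute zone' chosen by the user."""
    cx, cy = zone_center
    for s in streams:
        x, y = s.position
        if (x - cx) ** 2 + (y - cy) ** 2 <= zone_radius ** 2:
            s.samples = [0.0] * len(s.samples)  # silence this speaker
    return streams
```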
The team will publish its findings today (September 21) in Nature Communications. "If I close my eyes and there are 10 people talking in a room, I have no idea who’s saying what and where they are in the room exactly. That’s extremely hard for the human brain to process. Until now, it’s also been difficult for technology," said co-lead author Malek Itani, a UW doctoral student in the Paul G. Allen School of Computer Science & Engineering. "For the first time, using what we’re calling a robotic ‘acoustic swarm,’ we’re able to track the positions of multiple people talking in a room and separate their speech."
Previous research on robot swarms has required overhead or on-device cameras, projectors, or special surfaces. The UW team’s system is the first to accurately distribute a robot swarm using only sound.
The team’s prototype consists of seven small robots that spread themselves across tables of various sizes. As they move from their charger, each robot emits a high-frequency sound, like a bat navigating, using this frequency and other sensors to avoid obstacles and move around without falling off the table. The automatic deployment allows the robots to place themselves for maximum accuracy, permitting greater sound control than if a person set them. The robots disperse as far from each other as possible, since greater distances make it easier to differentiate and locate people who are speaking. Today’s consumer smart speakers have multiple microphones, but because they are clustered on the same device, they are too close to allow for this system’s mute and active zones.
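The "spread as far apart as possible" behavior can be pictured with a simple farthest-point placement heuristic. The sketch below is an assumption-laden stand-in, not the robots' actual onboard logic: it greedily places each robot at whichever candidate point on the table maximizes its distance to the nearest robot already placed.

```python
# Illustrative dispersion heuristic (a stand-in for the robots' real
# behavior): greedily place each robot to maximize its distance to the
# nearest robot placed so far, on a rectangular table.
import math

def disperse(n_robots, table_w, table_h, grid=50):
    """Choose n_robots points on a table_w x table_h table (meters),
    each as far as possible from the robots already placed."""
    candidates = [(i * table_w / grid, j * table_h / grid)
                  for i in range(grid + 1) for j in range(grid + 1)]
    placed = [(0.0, 0.0)]  # first robot starts at the charger corner
    while len(placed) < n_robots:
        best = max(candidates,
                   key=lambda c: min(math.dist(c, p) for p in placed))
        placed.append(best)
    return placed

print(disperse(7, 1.2, 0.8))  # seven robots on a 1.2 m x 0.8 m table
```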
"If I have one microphone a foot away from me, and another microphone two feet away, my voice will arrive at the microphone that’s a foot away first. If so, we can tell where I am," said co-lead author Bugra STim, a UW doctoral student in the Allen School and the Department of Electrical & Computer Engineering.