The Cocktail Party Problem

What is the Challenge?

“One of our most important faculties is our ability to listen to, and follow, one speaker in the presence of others. This is such a common experience that we may take it for granted; we may call it “the cocktail party problem.” No machine has been constructed to do just this, to filter out one conversation from a number jumbled together.”

Colin Cherry, 1957

How do people hear one person in a crowded room?

We humans have a sophisticated skill for listening to one person in a crowded room. Our ears gather and amplify sounds, converting them into left and right (binaural) frequency-domain audio signals. Our brain then analyzes these binaural signals, picking out auditory glimpses of sounds as they rise above the ambient noise level.

If our brain finds a sound interesting, we believe it quickly creates a basic spatial audio propagation model describing how sounds from that location in 3D space reach each ear. Using that model, the brain can determine the direction of the sound with amazing accuracy and, to a lesser extent, its distance, which it judges better in the presence of reverberation.
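To make the idea of a propagation model concrete, here is a minimal sketch of one textbook cue for direction: the interaural time difference, found by cross-correlating the left and right signals. It is an illustration only, not a description of how the brain works or of any particular product. The ear spacing, sampling rate, far-field geometry, and all function names are assumptions made for this example.

```python
import math
import random

SPEED_OF_SOUND = 343.0   # m/s, approximate value in air at room temperature
EAR_SPACING = 0.18       # m, assumed distance between the two "ears"
SAMPLE_RATE = 48_000     # Hz, assumed sampling rate


def delay_signal(signal, delay):
    """Delay a signal by a whole number of samples, padding with zeros."""
    if delay <= 0:
        return list(signal)
    return [0.0] * delay + list(signal[:len(signal) - delay])


def estimate_lag(left, right, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation.

    A positive result means the right signal arrives later than the left.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_azimuth(lag):
    """Convert a lag in samples to an angle in degrees (far-field assumption).

    Positive angles mean the source is toward the left ear.
    """
    itd = lag / SAMPLE_RATE  # interaural time difference in seconds
    sin_theta = max(-1.0, min(1.0, itd * SPEED_OF_SOUND / EAR_SPACING))
    return math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    rng = random.Random(0)
    source = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    left = source
    right = delay_signal(source, 12)   # sound reaches the right ear 12 samples later
    lag = estimate_lag(left, right, max_lag=26)
    print("estimated lag (samples):", lag)
    print("estimated angle (degrees):", round(lag_to_azimuth(lag), 1))
```

The search range of 26 samples is roughly the largest delay that the assumed ear spacing allows at this sampling rate. Real ears also use level differences and spectral cues, which this sketch ignores.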

Even more importantly, however, we believe the brain also uses that propagation model to enhance sounds originating from that specific location while suppressing those from all others, continuously updating the model as it keeps listening to sounds from that location.
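The simplest engineering analogue of enhancing one location while suppressing others is a two-sensor delay-and-sum filter, sketched below. This is a standard textbook technique shown purely for illustration; it is not claimed to be what the brain does, nor the method behind any specific product. The lags, signal lengths, and function names are hypothetical.

```python
import random


def shift(signal, delay):
    """Shift a signal by `delay` samples (positive = later), zero-padded."""
    n = len(signal)
    out = [0.0] * n
    for i, x in enumerate(signal):
        j = i + delay
        if 0 <= j < n:
            out[j] = x
    return out


def delay_and_sum(left, right, steer_lag):
    """Undo the assumed lag on the right channel, then average both channels.

    Sound arriving with exactly `steer_lag` adds coherently; sound arriving
    with any other lag is misaligned and partially cancels.
    """
    aligned_right = shift(right, -steer_lag)
    return [0.5 * (l + r) for l, r in zip(left, aligned_right)]


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


if __name__ == "__main__":
    rng = random.Random(0)
    target = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
    masker = [rng.uniform(-1.0, 1.0) for _ in range(4000)]

    # Assumed geometry: the target reaches the right sensor 5 samples late,
    # the masker reaches it 8 samples early.
    target_out = delay_and_sum(target, shift(target, 5), steer_lag=5)
    masker_out = delay_and_sum(masker, shift(masker, -8), steer_lag=5)

    print("target power kept:", round(power(target_out) / power(target), 2))
    print("masker power kept:", round(power(masker_out) / power(masker), 2))
```

With only two sensors and noise-like signals, the misaligned masker loses roughly half its power while the target passes almost unchanged. That modest gain is why this should be read as a sketch of the principle rather than a working solution to the problem.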

Audiologists refer to this phenomenon as Spatial Release from Masking. It explains not only how we can hear individual people or conversations in a noisy restaurant or on a city sidewalk, but also why we lose this ability when our hearing is impaired by age-related hearing loss or damage.

Optical Analogy

We are all familiar with portrait mode on our smartphones, where the background and foreground are blurred to draw attention to the subject. Like our eyes, the optical lenses in cameras have the ability to focus on and sharpen objects within what is called the 'depth of field' while blurring all others.

[Image: in-focus image of a wood fence]


We believe that our auditory system does something similar. We have the ability to focus on an acoustic subject and blur out sound sources in the foreground and background, as well as above, below, and to the sides.

Wave Sciences has invented the first general solution to the Cocktail Party Problem by mimicking the brain’s Spatial Release from Masking: its ability to spatially filter the incoming sound field, reducing competing speech and noise sources (a.k.a. maskers) that are not co-located with the target source.
