SciTech

Cheers to the Cocktail Party Problem

The diagram shows auditory cortical activity across the scalp, as measured by decoders trained at different time lags. (credit: Courtesy of O’Sullivan, J.A., Power, A.J., Mesgarani, N., Rajaram, S., Foxe, J.J., et al.)

In a room filled with countless conversations and constant noise, how does the brain manage to focus on a single speaker? In the world of neuroscience and psychology, this is known as the cocktail party problem.

Since 2016, the Misha Mahowald Prize has been awarded annually to the best and brightest in neuromorphic engineering. In December 2021, the prize went to two teams, one of which made groundbreaking contributions to understanding the famous cocktail party problem.

The group, known as the Telluride Auditory Attention Team, included Dr. Barbara Shinn-Cunningham, director of Carnegie Mellon’s Neuroscience Institute. The other members of the winning team were Edmund Lalor (Principal Investigator, University of Rochester); James O'Sullivan (Trinity College Dublin); Alan Power (Trinity College Dublin and University of Cambridge); Nima Mesgarani (University of California, San Francisco); Siddharth Rajaram (Boston University); John Foxe (University of Rochester); Malcolm Slaney (Google Research); and Shihab Shamma (University of Maryland).

Their work began in 2012 at the Telluride Neuromorphic Cognition Engineering Workshop, held annually in Telluride, Colorado. With funding from the European Union, the Cognitively Controlled Hearing Aid Project was born.

The team created a real-time Auditory Attention Decoding (AAD) system that measured auditory attention selection using electrical signals recorded from the scalp. To mimic the cocktail party setting, the system was tested in multi-speaker environments. The team published its findings in Cerebral Cortex in July 2015.

The researchers showed that even a single trial of unfiltered electroencephalography (EEG) recordings was enough to decode selective attention in a multi-speaker environment. They also found that the relevant neural processing unfolds quickly: decoders trained at time lags of roughly 200-250 milliseconds after a stimulus, such as the start of a conversation, showed the strongest correlation between reconstruction accuracy and listeners’ behavior. In that window, the brain can lock onto a single conversation of interest and react accordingly.
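For readers curious how such a decoder works in principle, here is a minimal Python sketch of the stimulus-reconstruction idea behind this style of AAD: a ridge-regression decoder reconstructs the attended speech envelope from time-lagged EEG, and attention is assigned to whichever speaker’s envelope best matches the reconstruction. Everything here, including the simulated signals, the sampling rate, the channel count, and the 200-millisecond cortical delay, is an illustrative assumption rather than the team’s actual pipeline.

```python
import numpy as np

def lagged_eeg(eeg, n_lags):
    """Pair each time point with the EEG that follows it, since the
    cortical response lags the speech it encodes. Returns (samples, channels*lags)."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for lag in range(n_lags):
        X[:n - lag, lag * c:(lag + 1) * c] = eeg[lag:]
    return X

def train_decoder(eeg, envelope, n_lags, ridge=1e2):
    """Ridge regression mapping lagged EEG to the attended speech envelope."""
    X = lagged_eeg(eeg, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

def decode_attention(eeg, env_a, env_b, weights, n_lags):
    """Reconstruct an envelope from EEG; the better-correlated speaker wins."""
    recon = lagged_eeg(eeg, n_lags) @ weights
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return ("A" if r_a > r_b else "B"), r_a, r_b

rng = np.random.default_rng(0)
fs, n_ch = 64, 8                      # assumed sampling rate (Hz) and channel count
n = 60 * fs                           # one minute of simulated data
smooth = np.ones(fs // 4) / (fs // 4)

# Two speakers' speech envelopes: slowly varying positive signals.
env_a = np.convolve(rng.random(n), smooth, mode="same")
env_b = np.convolve(rng.random(n), smooth, mode="same")

# Simulated EEG: the attended envelope (speaker A) appears in every channel
# about 200 ms late, buried in noise; the ignored speaker leaves no trace.
delay = int(0.2 * fs)
eeg = 0.5 * np.roll(env_a, delay)[:, None] + rng.standard_normal((n, n_ch))

# The decoder integrates ~250 ms of EEG, echoing the lags in the study.
# For brevity this trains and tests on the same minute of data; a real
# evaluation would use held-out trials.
n_lags = int(0.25 * fs)
w = train_decoder(eeg, env_a, n_lags)
print(decode_attention(eeg, env_a, env_b, w, n_lags))
```

Even with these bare-bones ingredients, the reconstruction correlates noticeably better with the attended speaker than with the ignored one, which is the essence of single-trial attention decoding.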

Previous approaches to the cocktail party problem involved mapping the cortical pathways that track auditory input, but mainly through invasive cortical surface recordings and magnetoencephalography (MEG). The Telluride Auditory Attention Team likewise recorded cortical activity, but opted for EEG, which is both non-invasive, unlike cortical surface recordings, and far more affordable than MEG.

In the future, the team hopes to use its findings to build better brain-computer interfaces and hearing aids for people with hearing impairments.