Attention to sound

Scientific meeting

Starts: 14 November 2018, 09:30

Ends: 15 November 2018, 17:30

Location

Kavli Royal Society Centre, Chicheley Hall, Newport Pagnell, Buckinghamshire, MK16 9JJ

Overview

Theo Murphy international scientific meeting organised by Dr Alain de Cheveigné, Professor Maria Chait and Dr Malcolm Slaney.

Sound waves. Credit: seyfettinozel

Some sounds are safe to ignore, others require attention. New paradigms and analysis techniques are emerging that enhance our understanding of how the auditory brain makes this choice, and pave the way for novel applications such as the cognitive control of a hearing aid. We gathered neuroscientists, experts in brain signal encoding, and people involved in developing and marketing devices.

The schedule of talks, together with speaker abstracts and biographies, is below. Recorded audio of the presentations will be available on this page shortly.

Attendance of invited discussants was supported by the H2020 project COCOHA.

Confirmed invited discussants include:

  • Dr Aurélie Bidet-Caulet, Lyon Neuroscience Research Centre, France
  • Dr Jennifer Bizley, University College London, UK
  • Dr Gregory Ciccarelli, Massachusetts Institute of Technology Lincoln Laboratory, USA
  • Professor Maarten De Vos, University of Oxford, UK
  • Professor Fred Dick, Birkbeck University of London and University College London, UK
  • Professor Tom Francart, KU Leuven, Belgium
  • Dr Jens Hjortkjær, Technical University of Denmark, Denmark
  • Dr Christophe Micheyl, Starkey France, Lyon Neuroscience Research Center and Ecole Normale Supérieure, France
  • Professor Lucas Parra, The City College of New York, USA
  • Dr Tobias Reichenbach, Imperial College London, UK

Attending this event

This event has taken place.

Enquiries: contact the Scientific Programmes team

Schedule of talks

14 November

Session 1 09:30-12:40

Motivation/industry

3 talks

Chairs

Professor Maria Chait, University College London, UK

09:30-09:50 Introduction

09:50-10:10 Towards intention controlled hearing aids: experiences from eye-controlled hearing aids

Professor Thomas Lunner, Eriksholm Research Centre, Denmark and Linköping University, Sweden

Abstract

A hearing impairment causes a reduced ability to segregate acoustic sources. This creates problems in switching between and following speech streams in complex scenes with multiple talkers. Current hearing aid beamforming technologies rely on a listener's ability to point the head towards a source of interest. However, this is very difficult in a conversation with spatially separated talkers where rapid switches between talkers take place. In this talk Professor Lunner will show that eye-gaze position signals can be picked up electrically in the ear canal through electrooculography, and that these signals can be used for fast intentional eye-gaze control towards the source of interest in a complex listening scene such as a restaurant. Experiments in which eye-gaze signals are combined with motion sensors and beamformers show that substantial benefits in the form of improved speech intelligibility are possible for hearing-impaired listeners. Results also indicate that eye control combined with head movements is faster and more precise than head movements alone. The presentation will include several videos to show the use cases.
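
As an illustration only, the sketch below shows how a gaze-derived azimuth could steer a simple two-microphone delay-and-sum beamformer toward the attended talker. The function name, microphone spacing and sampling rate are assumptions made for the example, not details of the Eriksholm system.

```python
import numpy as np

def steer_delay_and_sum(left, right, gaze_azimuth_deg, fs=16000,
                        mic_spacing=0.15, c=343.0):
    """Steer a two-microphone delay-and-sum beamformer toward a gaze-derived azimuth.

    left, right      : 1-D arrays of microphone samples.
    gaze_azimuth_deg : azimuth of the attended talker estimated from EOG
                       (0 degrees = straight ahead); an assumed input here.
    """
    # Time difference of arrival for a far-field source at the given azimuth.
    tdoa = mic_spacing * np.sin(np.deg2rad(gaze_azimuth_deg)) / c
    # Apply the compensating delay to the right channel via a frequency-domain phase shift.
    n = len(right)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    right_delayed = np.fft.irfft(np.fft.rfft(right) * np.exp(-2j * np.pi * freqs * tdoa), n=n)
    # Summing the aligned channels reinforces the source in the gaze direction.
    return 0.5 * (left + right_delayed)
```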

10:10-10:30 Discussion

10:30-11:00 Coffee

11:00-11:20 Path to product: cost, benefits and risks

Dr Simon Carlile, Starkey Hearing Technologies, USA

Abstract

For the hearing impaired listener, current hearing aid technology largely fails with the 'cocktail party problem'. In dynamic, conversational turn-taking, the intent of the listener determines the target talker. Using EEG, the measured focus of attention can be used as a proxy for intent. Other potential applications of EEG include measurement of listening effort and speech comprehension, and automated hearing aid fitting.

A primary challenge for the productisation of EEG-based systems is the development of an appropriate electrophysiological front-end. Challenges include: (1) sensor arrays that provide for a commercially acceptable industrial design; (2) electrodes that do not require conductive paste yet overcome the problems of electrical noise and movement artefact; and (3) platform constraints such as limited power, ultra-low-current implementation and A2D architectures.

On-line EEG analysis is computationally expensive and requires middle-tier or cloud-based platforms, which are only appropriate for applications that are not time sensitive (<10 milliseconds). The analysis time window needs to be short, yet current Bluetooth communication protocols introduce latencies of hundreds of milliseconds. Cloud computation adds further delays that depend on the availability and quality of the cellular network.

Such implementation challenges are not trivial and will require a concerted investment of resources. This approach, however, has the potential to solve the principal and refractory failing of current hearing aid technology, as well as to enable the development of adaptive devices that much better fit individual needs.

11:20-11:40 Discussion

11:40-12:00 The need for auditory attention

Dr Malcolm Slaney, Google AI Machine Hearing, USA

Abstract

Understanding attention is key to many auditory tasks. In this talk Dr Slaney would like to summarise several aspects of attention that have been used to better understand how humans use attention in our daily lives. This work extends from top-down and bottom-up models of attention that are useful for solving the cocktail party problem, to the use of eye-gaze and face-pose information to better understand speech in human-machine and human-human-machine interactions. The common thread throughout all this work is the use of implicit signals such as auditory saliency, face pose and eye gaze as part of a speech-processing system. Dr Slaney will show algorithms and results from speech recognition, speech understanding, addressee detection, and selecting the desired speech from a complicated auditory environment. All of this is grounded in models of auditory attention and saliency.

12:00-12:20 Discussion

12:20-12:40 General discussion

12:40-14:00 Lunch

Session 2 14:00-17:20

Decoding attention

4 talks

Chairs

Professor Torsten Dau, Technical University of Denmark, Denmark

14:00-14:20 The transformation from auditory to linguistic representations across auditory cortex is rapid and attention-dependent

Professor Jonathan Simon, University of Maryland, USA

Abstract

Professor Simon shows that magnetoencephalography (MEG) responses to continuous speech can be used to directly study lexical as well as acoustic processing. Source localised MEG responses to passages from narrated stories were modelled as linear responses to multiple simultaneous predictor variables, reflecting both acoustic and linguistic properties of the stimuli. Lexical variables were modelled as an impulse at each phoneme, with values based on the phoneme cohort model, including cohort size, phoneme surprisal and cohort entropy.
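
For readers unfamiliar with these predictors, the short sketch below illustrates how phoneme surprisal and cohort entropy can be computed from a cohort model. The inputs are hypothetical, and this is a minimal illustration rather than the analysis code used in the study.

```python
import numpy as np

def cohort_predictors(cohort_probs, next_phoneme_prob):
    """Compute the two lexical predictors used as impulse values at a phoneme.

    cohort_probs      : probabilities of the words remaining in the cohort after
                        this phoneme (any distribution; renormalised below).
    next_phoneme_prob : probability mass of the previous cohort consistent with
                        this phoneme, i.e. P(phoneme | previous cohort).
    """
    p = np.asarray(cohort_probs, dtype=float)
    p = p[p > 0]
    p = p / p.sum()                              # renormalise the surviving cohort
    surprisal = -np.log2(next_phoneme_prob)      # how unexpected this phoneme was
    entropy = -np.sum(p * np.log2(p))            # remaining uncertainty about the word
    return surprisal, entropy
```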

Results indicate significant left-lateralised effects of phoneme surprisal and cohort entropy. The response to phoneme surprisal, peaking at ~115 ms, arose from auditory cortex, whereas the response reflecting cohort entropy, peaking at ~125 ms, was more ventral, covering the superior temporal sulcus. These short latencies suggest that acoustic information is rapidly used to constrain the word currently being heard. These differences in localisation and timing are consistent with two stages of lexical processing, with phoneme surprisal being a local measure of how informative each phoneme is, and cohort entropy reflecting the state of lexical activation via lexical competition. An additional left-lateralised response to word onsets peaked at ~105 ms.

The effect of selective attention was also investigated using a two speaker mixture, one attended and one ignored. Responses reflect the acoustic properties of both speakers, but reflect lexical processing only for the attended speech. While previous research has shown that responses to semantic properties of words in unattended speech are suppressed, these results indicate that even processing of word forms is restricted to attended speech.

14:20-14:40 Discussion

14:40-15:00 Bottom-up auditory attention using complex soundscapes

Professor Mounya Elhilali, Johns Hopkins University, USA

Abstract

Recent explorations of task-driven (top-down) attention in the auditory modality draw a picture of a dynamic system in which attentional feedback modulates the sensory encoding of sounds in the brain to facilitate detection of events of interest and, ultimately, perception, especially in complex soundscapes. Complementing these processes are mechanisms of bottom-up attention that are dictated by the acoustic salience of the scene itself but still engage a form of attentional feedback. Often, studies of auditory salience have relied on simplified or well-controlled auditory scenes to shed light on the acoustic attributes that drive the salience of sound events. Unfortunately, the use of constrained stimuli, in addition to a lack of well-established benchmarks of salience judgments, hampers the development of comprehensive theories of bottom-up auditory attention. Here, Professor Elhilali will explore auditory salience in complex and natural scenes. She will discuss insights from behavioural, neural and computational explorations of bottom-up attention and their implications for our current understanding of auditory attention in the brain.

15:00-15:20 Discussion

15:20-16:00 Coffee

16:00-16:20 Speaker-independent auditory attention decoding without access to clean speech sources

Professor Nima Mesgarani, Columbia University, USA

Abstract

Speech perception in crowded acoustic environments is particularly challenging for hearing impaired listeners. Assistive hearing devices can suppress background noises that are sufficiently different from speech; however, they cannot lower interfering speakers without knowing the speaker on which the listener is focusing. One possible solution to determine the listener’s focus is auditory attention decoding in which the brainwaves of listeners are compared with sound sources in an acoustic scene to determine the attended source, which can then be amplified to facilitate hearing. In this talk, Professor Mesgarani addresses a major obstacle in actualising this system, which is the lack of access to clean sound sources in realistic situations where only mixed audio is available. He proposes a novel speech separation algorithm to automatically separate speakers in mixed audio without any need for prior training on the speakers. The separated speakers are compared to evoked neural responses in the auditory cortex of the listener to determine and amplify the attended speaker. These results show that auditory attention decoding with automatically separated speakers is as accurate and fast as using clean speech sounds. Moreover, Professor Mesgarani demonstrates that the proposed method significantly improves both the subjective and objective quality of the attended speaker. By combining the latest advances in speech processing technologies and brain-computer interfaces, this study addresses a major obstacle in actualisation of auditory attention decoding that can assist individuals with hearing impairment and reduce the listening effort for normal hearing subjects in adverse acoustic environments.
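
The comparison step of auditory attention decoding is often implemented as stimulus reconstruction: a linear decoder estimates the attended speech envelope from the EEG, and the candidate (here, automatically separated) sources are ranked by their correlation with that estimate. The sketch below assumes a pre-trained decoder and pre-computed envelopes; it is a generic illustration of this approach, not Professor Mesgarani's implementation.

```python
import numpy as np

def decode_attended(eeg, decoder, candidate_envelopes):
    """Select the attended source by envelope reconstruction and correlation.

    eeg                 : (time, features) array of (possibly time-lagged) EEG channels.
    decoder             : (features,) weights of a pre-trained linear
                          stimulus-reconstruction model.
    candidate_envelopes : list of (time,) envelopes of the separated speakers.
    """
    reconstructed = eeg @ decoder          # estimate of the attended speech envelope
    scores = [np.corrcoef(reconstructed, env)[0, 1] for env in candidate_envelopes]
    # The speaker whose envelope best matches the reconstruction is taken as attended.
    return int(np.argmax(scores)), scores
```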

16:20-16:40 Discussion

16:40-17:00 On the encoding and decoding of natural auditory stimulus processing using EEG

Professor Ed Lalor, University of Rochester, USA and Trinity College Dublin, Ireland

Abstract

Over the past few years there has been a surge in efforts to model neurophysiological responses to natural sounds. This has included a variety of methods for decoding brain signals to say something about how a person is engaging with and perceiving the auditory world. In this talk Professor Lalor will discuss recent efforts to improve these decoding approaches and broaden their utility. In particular, he will focus on three related factors: 1) how we represent the sound stimulus, 2) what features of the data we focus on, and 3) how we model the relationship between stimulus and response. Professor Lalor will present data from several recent studies in which he has used different stimulus representations, different EEG features and different modelling approaches in an attempt to lead to more useful decoding models and more interpretable encoding models of brain responses to sound.
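
The encoding models discussed here are commonly linear temporal response functions (TRFs) fitted with regularised regression. The sketch below shows a minimal forward TRF estimate for a single EEG channel under those assumptions; the variable names and ridge parameter are illustrative, not taken from the talk.

```python
import numpy as np

def fit_trf(stimulus, eeg, lags, alpha=1.0):
    """Fit a linear temporal response function mapping a stimulus feature to one EEG channel.

    stimulus : (time,) stimulus representation (e.g. the acoustic envelope).
    eeg      : (time,) response of a single EEG channel.
    lags     : iterable of integer sample lags to include in the model.
    alpha    : ridge regularisation strength.
    """
    # Time-lagged design matrix: one column per lag (edge samples wrapped by
    # np.roll would be trimmed in practice).
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])
    # Ridge regression: w = (X'X + alpha*I)^-1 X'y.
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ eeg)
    return w  # the TRF: one weight per lag
```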

17:00-17:20 Discussion

17:20-17:40 General discussion

15 November

Session 3 08:30-12:00

Visual modality

4 talks

Chairs

Professor Adrian KC Lee, University of Washington, USA

08:30-08:50 Facilitation and inhibition in visual selective attention

Professor Heleen Slagter, University of Amsterdam, The Netherlands

Abstract

Visual selective attention is thought to facilitate performance both through enhancement and inhibition of sensory processing of goal-relevant and irrelevant (or distracting) information. While much insight has been gained over the past few decades into the neural mechanisms underlying facilitatory effects of attention, much less is known about inhibitory mechanisms in visual attention. In particular, it is still unclear whether target facilitation and distractor inhibition are simply different sides of the same coin or whether they are controlled by distinct neural mechanisms. Moreover, recent work indicates that suppression of visual distractors only emerges when information about the distractor can be derived directly from experience, consistent with a predictive coding model of expectation suppression. This also raises the question of how visual attention and expectation interact to bias information processing. In this talk, Professor Slagter will discuss recent findings from several behavioural and EEG studies that examined how expectations about upcoming target or distractor locations and/or features influence facilitatory and inhibitory effects of attention on visual information processing and representation, using ERPs, multivariate decoding analyses, and inverted encoding models. Collectively, these findings confirm an important role for alpha oscillatory activity in top-down biasing of visual attention to, and sharpening of representations of, target locations. Yet they also show that target facilitation and distractor suppression are differentially influenced by expectation and rely at least in part on different neural mechanisms, with distractor suppression selectively occurring after stimulus presentation. This latter finding raises the question of whether voluntary preparatory inhibition is possible at all.

08:50-09:10 Discussion

09:10-09:30 Rhythmic structures in visual attention: behavioural and neural evidence

Professor Huan Luo, Peking University, China

Abstract

In a crowded visual scene, attention must be efficiently and flexibly distributed over time and space to accommodate different task contexts. In this talk, Professor Luo will present several studies from her lab investigating the temporal structure of visual attention. First, using a time-resolved behavioural measurement, the group demonstrates that attentional behavioural performance contains temporal fluctuations (theta-band, alpha-band, etc), suggesting that neuronal oscillatory profiles might be directly revealed at the behavioural level. These behavioural oscillations display a temporally alternating relationship between locations, suggesting that attention samples multiple items in a time-based rhythmic manner. Second, by employing EEG recordings in combination with a TRF approach, the group extracted object-specific neuronal impulse responses during multi-object selective attention. The results show that attention rhythmically switches among visual objects every ~200 ms, and that the spatiotemporal sampling profile adaptively changes in various task contexts. Finally, by using MEG recordings in combination with a decoding approach, the group demonstrates that attention fluctuates between attended orientation features in a theta-band rhythm, suggesting that feature-based attention is mediated by rhythmic sampling similar to that for spatial attention. In summary, attention is not stationary but dynamically samples multiple visual objects in a periodic or serial-like way. This work advocates a central role for temporal organisation in attention, which flexibly and efficiently organises resources in the time dimension.
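
A minimal sketch of the time-resolved behavioural analysis is given below: a finely sampled accuracy time course is detrended and its amplitude spectrum inspected for theta- or alpha-band peaks. This is a generic illustration under assumed inputs, not the lab's analysis pipeline.

```python
import numpy as np

def behavioural_spectrum(accuracy, fs):
    """Amplitude spectrum of a detrended, time-resolved accuracy time course.

    accuracy : (time,) mean accuracy at each probed cue-target interval.
    fs       : sampling rate of the behavioural time course in Hz (1 / probe spacing).
    """
    t = np.arange(len(accuracy))
    trend = np.polyval(np.polyfit(t, accuracy, 2), t)   # slow (non-oscillatory) trend
    detrended = accuracy - trend
    spectrum = np.abs(np.fft.rfft(detrended * np.hanning(len(detrended))))
    freqs = np.fft.rfftfreq(len(detrended), d=1.0 / fs)
    # Peaks in the theta/alpha range suggest rhythmic sampling of attention.
    return freqs, spectrum
```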

09:30-09:50 Discussion

09:50-10:20 Coffee

10:20-10:40 Attention across sound and vision: effects of perceptual load

Professor Nilli Lavie FBA, University College London, UK

Abstract

Load Theory of attention and cognitive control offers a hybrid model that combines capacity limits in perception with automaticity of processing. The model proposes that perception has limited capacity but proceeds automatically and involuntarily in parallel on all stimuli within capacity: relevant as well as irrelevant. Much evidence has accumulated to support Load Theory in vision research so far. However, the cross-modal effects of perceptual load across the senses are less clear. In her talk, Professor Lavie will present recent work on the effects of visual perceptual load on auditory perception and the related neural activity as assessed with magnetoencephalography. The results showed that the level of unattended auditory perception and the related neural signal critically depends on the level of perceptual processing load in the visual attention task. Task conditions of high perceptual load, which take up all capacity with attended task processing, lead to reduced processing of unattended stimuli. In contrast, in conditions of low perceptual load that leave spare capacity, ignored task-irrelevant stimuli are nevertheless perceived and elicit a neural response. These findings demonstrate the value of understanding the role of attention in auditory processing within the framework of Load Theory.

10:40-11:00 Discussion

11:00-11:20 Networks controlling attention in vision (and audition)

Professor Barbara Shinn-Cunningham, Carnegie Mellon University, USA

Abstract

Neuroimaging with fMRI shows that there are distinct networks biased towards the processing of visual and auditory information. These networks include inter-digitated areas in frontal cortex as well as corresponding primary and secondary sensory regions. In Professor Shinn-Cunningham's studies, she sees these distinct frontal regions consistently in individual subjects across multiple studies spanning years; however, the inter-digitated structural organization of the 'visual' and 'auditory' regions in frontal cortex is not clear using standard methods for co-registering and averaging fMRI results across subjects. Although the networks that include these inter-digitated frontal control regions are 'sensory biased', they are also recruited to process information in the other sensory modality as needed. Specifically, areas that are always engaged by auditory attention are recruited when visual tasks require processing of temporal structure, but not when the same visual inputs are accessed for tasks requiring processing of spatial information. Conversely, processing of auditory spatial information preferentially engages the visually biased brain network – a network that is traditionally associated with spatial visual attention. This visuo-spatial network includes retinotopically organised spatial maps in parietal cortex. Recent EEG results from Professor Shinn-Cunningham's lab confirm that auditory spatial attention makes use of the parietal maps in the 'visual spatial attention' network. Together, these results reveal that visual networks for attention are a shared resource used by the auditory system.

11:20-11:40 Discussion

11:40-12:00 General discussion

12:00-13:30 Lunch

Session 4 13:30-17:30

Auditory modality

4 talks

Chairs

Professor Stephen David, Oregon Health & Science University, USA

13:30-13:50 Objective, reliable, and valid? Measuring auditory attention

Professor Jonas Obleser, University of Lübeck, Germany

Abstract

Auditory attention is a fascinating feat. For example, it is most astonishing how our brain 'does away' with considerable differences in sound pressure between a behaviourally relevant sound source and other interferences. Meanwhile, auditory attention has remained this elusive phenomenon: do we really understand enough just yet of auditory attention to build machines that attend, or machines that help us attend? Illustrated by behavioural, electrophysiological, and functional imaging data from his own lab and others, Professor Obleser will take stock of the evidence: are top-down selective-attention abilities indeed a stable, trait-like feature of the individual listener, with predictable decline in older adults? And, what are we really getting from our current go-to neural measures of auditory attention, speech tracking aka 'neural entrainment' versus alpha-power fluctuations? Luckily, Professor Obleser will probably be out of time as the talk reaches the main question: what are we measuring when we measure auditory attention?

13:50-14:10 Discussion

14:10-14:30 Auditory selective attention: lessons from distracting sounds

Dr Elana Golumbic, Bar Ilan University, Israel

Abstract

A fundamental assumption in attention research is that, since processing resources are limited, the core function of attention is to manage these resources and allocate them among concurrent stimuli or tasks, according to current behavioural goals and environmental needs. However, despite decades of research, we still do not have a full characterisation of the nature of these processing limitations, or ‘bottlenecks’ – ie what processes can be performed in parallel and where the need for attentional selection kicks in. This question is particularly pertinent in the auditory system, which has been studied far less extensively than the visual system, and which is proposed to have a wider capacity for parallel processing of incoming stimuli.

In this talk Dr Golumbic will discuss a series of experiments studying the depth of processing applied to task-irrelevant sounds and their neural encoding in auditory cortex. She will look at how this is affected by the acoustic properties, temporal structure, and linguistic structure of unattended sounds, as well as by overall acoustic load and task demands, in an attempt to understand which levels suffer most from processing bottlenecks. In addition, she will discuss what we can learn about the capacity for parallel processing of auditory stimuli by pushing the system to its limits and requiring the division of attention among multiple concurrent inputs.

14:30-14:50 Discussion

14:50-15:30 Coffee

15:30-15:50 The neuro-computational architecture of auditory attention

Professor Elia Formisano, Maastricht University, The Netherlands

Abstract

Auditory attention is a crucial component of real-life listening and is required, for instance, to enhance a particularly relevant aspect of a sound or to separate a sound of interest from noisy backgrounds. When listening to simple tones, attending to a certain frequency range induces a rapid and specific adaptation of neuronal tuning, which ultimately results in enhanced processing of that frequency range and suppression of the other frequencies. But which are the neural mechanisms enabling attentive selection and enhancement when listening to complex real-life sounds and scenes? At which levels of neural sound representation does attention operate? And how do these mechanisms depend on the specific behavioural requirements? High-resolution fMRI and computational modelling of sound representations both provide a relevant contribution to address these questions. Sub-millimetre fMRI enables distinguishing the activity and connectivity of neuronal populations across cortical layers non-invasively in humans (laminar fMRI). This is required for disentangling feedforward/feedback processing in primary and non-primary auditory areas and the communication between auditory and other areas (eg frontal areas). Modelling of sound representations allows formulating well-defined hypotheses on the nature of simple and complex features processed in the network of auditory areas and how the neural sensitivity for these features is affected by attention and behavioural task demands. The combination of laminar fMRI and sound representation models is thus ideally positioned to unravel the neural circuitry and the computational architecture of auditory attention in naturalistic listening scenarios.

15:50-16:10 Discussion

16:10-16:30 How attention modulates processing of mildly degraded speech to influence perception and memory

Professor Ingrid Johnsrude, Western University, Canada

Abstract

Professor Johnsrude and colleagues have previously demonstrated that, whereas the pattern of brain (fMRI) activity elicited by clearly spoken sentences does not seem to depend on attention, patterns are markedly different when attending or not attending to highly intelligible but degraded (6-band noise vocoded) sentences (Wild et al, J Neurosci, 2012). They have replicated and extended this work to sentences that, although slightly degraded (12-band noise vocoded), can be reported word-for-word with 100% accuracy. Even for these very intelligible materials, a marked dissociation was observed in patterns of brain activity when people attended to the sentences compared to when they were performing a multiple object tracking task. Furthermore, in both of these experiments, memory for degraded items was enhanced by attention, whereas memory for clear sentences was not, suggesting that even perfectly intelligible but degraded sentences are processed in a qualitatively different, attentionally gated, way compared to clear sentences. Supported by a Canadian Institutes of Health Research operating grant (MOP 133450) and a Canadian Natural Sciences and Engineering Research Council Discovery grant (3274292012).
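
Noise vocoding, the degradation used here, splits speech into frequency bands, extracts each band's envelope and uses it to modulate band-limited noise; the number of bands (6 or 12 above) controls intelligibility. The sketch below is a minimal vocoder written under those assumptions, not the stimulus-generation code from the study.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(speech, fs, n_bands=12, f_lo=100.0, f_hi=8000.0):
    """Noise-vocode a speech signal using logarithmically spaced analysis bands."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)
    noise = np.random.randn(len(speech))
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfilt(sos, speech)
        envelope = np.abs(hilbert(band))   # band envelope (often lowpass-filtered in practice)
        carrier = sosfilt(sos, noise)      # band-limited noise carrier
        out += envelope * carrier
    # Match the overall level of the original signal.
    return out * (np.sqrt(np.mean(speech ** 2)) / np.sqrt(np.mean(out ** 2)))
```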

16:30-16:50 Discussion

16:50-17:10 General discussion

17:10-17:30 Concluding session
