Interference in spoken communication: Evaluating the corrupting and disrupting effects of other voices, 2016-2019

The datasets comprise behavioural responses to speech stimuli. The stimuli are either simplified analogues of spoken sentence-length utterances or syllables (datasets 1-4 and 6) or signal-processed natural syllables (dataset 5). For the utterances, the responses are the transcriptions entered by the participant using a keyboard. For the syllables, the responses are key presses indicating the perceived identity of the initial consonant.

Much of the information necessary to understand speech (acoustic-phonetic information) is carried by the changes in frequency over time of a few broad peaks in the frequency spectrum of the speech signal, known as formants. The project aims to investigate how listeners presented with mixtures of target speech and interfering formants are able to group together the appropriate formants, and to reject others, such that the speech of the talker we want to listen to can be understood. Interfering sounds can have two kinds of effect: energetic masking, in which the neural response of the ear to the target is swamped by the response to the masker, and informational masking, in which the "auditory brain" fails to separate readily detectable parts of the target from the masker. The project will explore the informational masking component of interference, often the primary factor limiting speech intelligibility, using stimulus configurations that eliminate energetic masking. It will explore how speech-like interferers affect intelligibility, distinguishing the circumstances in which the interferer takes up some of the available perceptual processing capacity from those in which specific properties of the interferer intrude into the perception of the target speech. Our approach is to use artificial speech-like stimuli with precisely controlled properties, to accompany target speech with carefully designed interferers that offer alternative grouping possibilities, and to measure how manipulating the properties of these interferers affects listeners' abilities to recognise the target speech in the mixture.

In everyday life, talking with other people is important not only for sharing knowledge and ideas, but also for maintaining a sense of belonging to a community. Most people take it for granted that they can converse with others with little or no effort. Successful communication involves understanding what is being said and being understood, but it is quite rare to hear the speech of a particular talker in isolation. Speech is typically heard in the presence of interfering sounds, which are often the voices of other talkers. The human auditory system, which is responsible for our sense of hearing, therefore faces the challenge of identifying which parts of the sounds reaching our ears have come from which talker. Solving this "auditory scene analysis" problem involves separating the sound elements arising from one source (e.g., the voice of the talker to whom you are attending) from those arising from other sources, so that the identity and meaning of the target source can be interpreted by higher-level processes in the brain. Over the course of evolution, humans have been exposed to a variety of complex listening environments, and so we are generally very successful at understanding the speech of one person in the presence of other talkers. This contrasts with listening machines, which often fail in adverse conditions, such as when automatically transcribing a conversation in an open-plan office.
Human listeners with hearing impairment often find these environments especially difficult, even when using the latest developments in hearing-aid or cochlear-implant design, and so can struggle to communicate effectively in such conditions.
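To make the notion of a "formant analogue" concrete, the sketch below synthesises a sine-wave analogue in Python: each formant is replaced by a single sinusoid that follows a time-varying frequency track, and an extraneous competitor formant is mixed in to offer an alternative grouping possibility. This is a minimal illustration only; the frequency tracks, amplitudes, and mixing are invented, not the project's actual stimulus-generation procedures or parameter values.

```python
# A minimal sketch (not the project's actual synthesis code) of a
# sine-wave formant analogue: each formant is replaced by one sinusoid
# whose frequency follows a time-varying track. All tracks and
# amplitudes below are invented for illustration.
import numpy as np

SR = 44_100                               # sample rate (Hz)
t = np.arange(int(SR * 1.0)) / SR         # 1-second time axis

def sine_track(freqs, amp):
    """Sinusoid whose instantaneous frequency follows `freqs` (Hz).

    Integrating frequency (cumulative sum / SR) gives the phase, so the
    sinusoid glides smoothly rather than stepping between frequencies.
    """
    phase = 2.0 * np.pi * np.cumsum(freqs) / SR
    return amp * np.sin(phase)

# Hypothetical tracks for the target's first three formants (F1-F3),
# with weaker amplitudes for the higher formants, loosely echoing the
# spectral rolloff of natural speech.
f1 = np.linspace(350.0, 700.0, t.size)                  # rising glide
f2 = 1500.0 + 300.0 * np.sin(2.0 * np.pi * 2.0 * t)     # slow modulation
f3 = np.full(t.size, 2500.0)                            # flat
target = sine_track(f1, 1.0) + sine_track(f2, 0.5) + sine_track(f3, 0.25)

# An extraneous competitor formant: a track that offers the listener an
# alternative grouping possibility when mixed with the target.
fx = 1500.0 - 300.0 * np.sin(2.0 * np.pi * 2.0 * t)     # mirror of F2
mixture = target + sine_track(fx, 0.5)
mixture /= np.max(np.abs(mixture))        # normalise to +/-1 for playback
```

Because each sinusoid carries the formant-frequency variation but occupies only a narrow region of the spectrum at any instant, a mixture like this can be arranged so that the competitor does not energetically mask the target, isolating the informational masking component that the project investigates.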


Geographic Coverage:

GB

Temporal Coverage:

2016-09-01/2019-08-31

Resource Type:

dataset

Available in Data Catalogs:

UK Data Service

Topics: