The auditory cortex selectively hears what we listen to
Fig. 1. Areas of the superior temporal gyrus responsible for the perception of spoken language. The primary auditory cortex receives information from the thalamus, where it arrives (through several intermediate stages) from the organ of hearing, the cochlea. This information is structured from the very beginning: it is divided into frequencies. The auditory cortex takes the first steps toward understanding what is heard by filtering auditory information and combining it with data from the other senses. Wernicke's area, which occupies the most posterior part of the superior temporal gyrus, recognizes words and plays a key role in speech understanding. Image from D. Purves, G. J. Augustine, D. Fitzpatrick, et al., editors. Neuroscience. 2nd edition. Sunderland (MA): Sinauer Associates; 2001.
People are able to listen to and understand each other even in a room where everyone is talking at once. How the brain picks out the relevant sounds from a complex acoustic background is unknown. American neuroscientists, working with patients who had electrodes implanted over the superior temporal gyrus during treatment for epilepsy, discovered that the activity of neurons in the secondary auditory cortex reflects the speech of the person the subject is listening to. From the activity of these neurons, a specially trained computer program can determine which of two speakers the subject is attending to and reconstruct the words heard.
Isolating the speech of one specific person from a multi-voiced chorus is a technically very difficult task, as developers of automatic speech recognition systems are well aware. Our brain, however, copes with it easily, though how it does so is not really known. Presumably, at some stage in the processing of auditory information the speech of the person we are listening to is cleared of "extraneous impurities," but when and where this happens is again unclear.
Nima Mesgarani and Edward Chang of the University of California, San Francisco studied the functioning of neurons in the secondary auditory cortex (Fig. 1) in three patients with epilepsy who, in preparation for surgery, had microelectrodes implanted over the superior temporal gyrus (Fig. 2).
Fig. 2. Location of the electrodes on the subjects' brains. Shades of red show how much the signal from each electrode differs between speech perception and silence. Image from the discussed article in Nature.
It had previously been shown that neurons in the secondary auditory cortex "encode" (reflect) the spoken language a person perceives. Computer programs have been developed that, after special training, can reconstruct the timbre of the speaker's voice and even recognize spoken words from the activity of these neurons (Formisano et al., 2008. "Who" is saying "what"? Brain-based decoding of human voice and speech; Pasley et al., 2012. Reconstructing Speech from Human Auditory Cortex). But those experiments were carried out on subjects who listened to the speech of only one speaker. Mesgarani and Chang decided to find out what information the auditory cortex neurons would reflect if there were two speakers but the subject was asked to listen to only one of them.
The experiments used recordings of two voices, male and female. They uttered meaningless seven-word phrases such as "ready tiger go to red two now" or "ready ringo go to green five now." The first, third, fourth and seventh words were always the same. The second word, tiger or ringo, served as the cue for the subject: one of these words was displayed on a screen in front of him, and he had to listen for which of the two speakers pronounced it. In fifth place was a word denoting one of three colors (red, blue or green), and in sixth place one of three numerals (two, five or seven). The subject had to report which number and which color were named by whichever of the two speakers uttered the key word. The phrases were combined so that the two voices simultaneously named different numbers and colors.
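The trial design described above can be sketched in a few lines of code. This is our own illustrative reconstruction, not the authors' stimulus-generation code; the function and variable names are ours.

```python
import random

# Phrase template from the study: "ready <call sign> go to <color> <number> now".
CALL_SIGNS = ["tiger", "ringo"]
COLORS = ["red", "blue", "green"]
NUMBERS = ["two", "five", "seven"]

def make_phrase(call_sign, color, number):
    return f"ready {call_sign} go to {color} {number} now"

def make_trial(rng):
    """Pick two simultaneous phrases that differ in call sign, color and number."""
    cs1, cs2 = rng.sample(CALL_SIGNS, 2)   # each voice gets a different call sign
    c1, c2 = rng.sample(COLORS, 2)         # the two voices name different colors
    n1, n2 = rng.sample(NUMBERS, 2)        # ... and different numbers
    target = rng.choice([0, 1])            # which voice the subject must attend to
    phrases = (make_phrase(cs1, c1, n1), make_phrase(cs2, c2, n2))
    cue = [cs1, cs2][target]               # call sign shown on the screen
    answer = (c1, n1) if target == 0 else (c2, n2)  # correct color and number
    return phrases, cue, answer

rng = random.Random(0)
phrases, cue, answer = make_trial(rng)
```

The constraint that the two voices always name different colors and numbers is what makes the subject's answer diagnostic of which voice he attended to.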
The authors used a previously developed program to reconstruct the sound signal from the activity of neurons in the auditory cortex. The program was first "trained", and during training the subjects listened to the voices one at a time, not both at once. Once the program had learned to reconstruct spectrograms of single phrases well, the main phase of the experiment began. Now the subjects listened to two voices simultaneously, and the spectrograms the program reconstructed from the neuronal activity were compared with the real spectrograms of the phrases spoken by the two speakers.
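The core idea of such stimulus reconstruction is a learned linear map from multi-electrode activity to a spectrogram. The following is a minimal sketch of that idea on simulated data, assuming a ridge-regression decoder; it is our simplification, not the program actually used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: neural responses R (time x electrodes) generated so that they
# linearly predict a spectrogram S (time x frequency bands) plus noise.
n_t, n_elec, n_freq = 2000, 16, 32
W_true = rng.normal(size=(n_elec, n_freq))   # unknown mapping we try to recover
R_train = rng.normal(size=(n_t, n_elec))
S_train = R_train @ W_true + 0.1 * rng.normal(size=(n_t, n_freq))

def fit_reconstruction(R, S, alpha=1.0):
    """Ridge regression: W = (R'R + alpha*I)^-1 R'S."""
    n = R.shape[1]
    return np.linalg.solve(R.T @ R + alpha * np.eye(n), R.T @ S)

W = fit_reconstruction(R_train, S_train)

# "Reconstruct" a spectrogram from held-out neural activity and compare it with
# the true one via correlation, as the study does when scoring reconstructions.
R_test = rng.normal(size=(200, n_elec))
S_hat = R_test @ W
S_test = R_test @ W_true
corr = np.corrcoef(S_hat.ravel(), S_test.ravel())[0, 1]
```

Training on single-speaker data and then applying the fixed map to mixed-speaker trials is exactly what makes the result informative: any "filtering" seen in the reconstruction must come from the neurons, not the decoder.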
It turned out that when the subject completed the task successfully (that is, correctly named the color and number pronounced by the voice that said the key word), the spectrogram reconstructed from his neurons reflected the speech of only one of the two speakers: the one he was supposed to listen to (Fig. 3). If the subject made a mistake, the reconstructed spectrogram did not resemble the speech of the "correct" speaker but reflected either an unintelligible mixture or correlated with the spectrogram of the second, "distracting" speaker. As a rule, in the first case the subject could not correctly reproduce the words of either speaker, and in the second he reported the number and color named by the "distracting" voice.
Fig. 3. Examples of oscillograms and spectrograms of the spoken phrases (a–d) and of the spectrogram reconstructions made by a computer program from the activity of neurons in the auditory cortex (e–h). a, b - phrases spoken by the two voices, SP1 (male) and SP2 (female), separately. c, d - phrases spoken by the two voices simultaneously (in panel d, blue and red mark the regions in which the voice of the first or the second speaker, respectively, is louder). e, f - spectrograms reconstructed by the program from the activity of auditory-cortex neurons while the two phrases were heard one at a time. g, h - the same while both phrases were heard simultaneously (g - the subject listens to the first voice, h - to the second). Image from the discussed article in Nature.
At the final stage, the authors used another computer program, a regularized linear classifier (see Linear classifier), trained to distinguish the two voices and the spoken words from the activity of auditory-cortex neurons while single phrases were heard. When this program processed the activity of the same neurons while two voices were heard simultaneously, it successfully identified both the voice (male or female) and the words (color and number) spoken by the speaker the subject was attending to. In the trials where the subject completed the task correctly, the program identified the voice in 93%, the color in 77.2%, and the number in 80.2% of cases from the activity of his neurons. In the trials where the subject made a mistake, the program either produced a random result or recognized the "distracting" voice and the words it spoke.
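A regularized linear classifier of this kind can be illustrated with a toy example. The sketch below is our stand-in, not the study's decoder: it codes the two classes (e.g., "attending to speaker 1" vs "speaker 2") as +1/-1 and fits the weights by ridge regression on simulated activity patterns.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated "neural activity" patterns: two classes shifted apart along a
# random direction w_sep, with unit Gaussian noise in every feature.
n_per_class, n_feat = 200, 50
w_sep = rng.normal(size=n_feat)
X = rng.normal(size=(2 * n_per_class, n_feat))
y = np.r_[np.ones(n_per_class), -np.ones(n_per_class)]
X += 1.5 * np.outer(y, w_sep) / np.sqrt(n_feat)

def fit_ridge_classifier(X, y, alpha=1.0):
    """Regularized linear classifier: least squares on +/-1 labels."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def predict(w, X):
    return np.sign(X @ w)

w = fit_ridge_classifier(X, y)

# Evaluate on fresh data drawn from the same two classes.
X_test = rng.normal(size=(200, n_feat))
y_test = np.r_[np.ones(100), -np.ones(100)]
X_test += 1.5 * np.outer(y_test, w_sep) / np.sqrt(n_feat)
accuracy = (predict(w, X_test) == y_test).mean()
```

The regularization term `alpha * I` keeps the weights stable when there are few trials relative to the number of electrodes, which is the typical regime in such recordings.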
Thus, the study showed that in the secondary auditory cortex, speech information is reflected in a “filtered” form: the work of neurons encodes the speech of the person to whom the subject is listening. Although we still do not know the mechanisms of this filtering, it is already possible to determine from the activity of neurons in the auditory cortex which of the two speakers a person is listening to and identify the words heard.
Source: Nima Mesgarani, Edward F. Chang. Selective cortical representation of attended speaker in multi-talker speech perception // Nature. 2012. V. 485. P. 233–236.
See also: The auditory cortex is responsible for the integration of hearing and touch, "Elements", 24.10.2005.
Alexander Markov
Physiology of pathways and centers of the auditory system
First-order neurons (bipolar neurons) lie in the spiral ganglion, which runs parallel to the organ of Corti and follows the turns of the cochlea.
One branch of each bipolar neuron forms a synapse on an auditory receptor; the other runs toward the brain, forming the auditory nerve.
The auditory nerve fibers exit the internal auditory canal and reach the brain in the area of the so-called cerebellopontine angle (this is the anatomical border between the medulla oblongata and the pons).
Second-order neurons form the complex of auditory nuclei in the medulla oblongata. In the description below we follow a simplified anatomical scheme according to which this complex is divided into a dorsal part and a ventral part, the latter in turn consisting of anterolateral and posterolateral subdivisions.
Each of these three divisions of the auditory nuclei has an independent representation of the organ of Corti.
As can be seen in the figure, advancing the recording microelectrode from the dorsal to the ventral nucleus reveals neurons with gradually decreasing characteristic frequencies. This means that the principle of tonotopic organization is observed: the frequency projection of the organ of Corti as a whole is repeated in an orderly manner within each division of the auditory nuclear complex.
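The frequency layout that these tonotopic maps inherit can be illustrated with the Greenwood position-frequency function for the human cochlea. This formula is standard in auditory physiology but is not taken from this text; it is included only as an illustration of what "orderly frequency projection" means.

```python
# Greenwood map for the human cochlea: f(x) = A * (10**(a*x) - k),
# where x is the relative distance along the basilar membrane from the
# apex (x = 0, low frequencies) to the base (x = 1, high frequencies).
A, a, k = 165.4, 2.1, 0.88   # commonly cited human parameters

def greenwood(x):
    """Characteristic frequency (Hz) at relative cochlear position x in [0, 1]."""
    return A * (10 ** (a * x) - k)

# Characteristic frequency rises monotonically from apex to base - the
# orderly arrangement that the auditory nuclei and cortex reproduce.
freqs = [greenwood(i / 10) for i in range(11)]
```

The map spans roughly 20 Hz at the apex to about 20 kHz at the base, matching the human hearing range.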
The axons of the neurons of the auditory nuclei ascend into the overlying structures of the auditory analyzer both ipsilaterally and contralaterally.
The next level of the auditory system lies at the level of the pons and is represented by the nuclei of the superior olive (medial and lateral) and the nucleus of the trapezoid body.
At this level, binaural analysis of sound signals (from both ears) is already carried out. The projections of the auditory pathways onto these pontine nuclei are also organized tonotopically.
Most neurons of the superior olivary nuclei are excited binaurally. Two categories of binaural neurons have been found: some are excited by sound signals from both ears (EE-type), while others are excited by one ear but inhibited by the other (EI-type).
The nucleus of the trapezoid body receives a predominantly contralateral projection from the auditory nuclear complex, and accordingly its neurons respond predominantly to sound stimulation of the contralateral ear. Tonotopy is found in this nucleus as well.
The axons of the cells of the pontine auditory nuclei run in the lateral lemniscus. The main part of its fibers (mainly from the medial olive) synapses in the inferior colliculus; the rest go to the thalamus and end on the neurons of the medial geniculate body, as well as in the superior colliculus.
In addition, some fibers of the lateral lemniscus innervate the contralateral inferior colliculus, forming the commissure of Probst.
The inferior colliculus, located on the dorsal surface of the midbrain, is the most important center for the analysis of sound signals.
At this level, apparently, the analysis of sound signals needed for orienting reactions to sound is completed. The main part of the cellular elements of the inferior colliculus is localized in its central nucleus.
The axons of the cells of the inferior colliculus travel in its brachium to the medial geniculate body. Some axons, however, go to the opposite colliculus, forming the intercollicular commissure.
The medial geniculate body is the thalamic center of the auditory system. It is divided into a magnocellular part and a parvocellular (principal) part.
Axons of neurons of the parvocellular part of the geniculate body form the auditory radiation and are directed to the auditory area of the cortex.
The magnocellular part of the medial geniculate body receives projections from the inferior colliculus. Tonotopy is also seen in this thalamic nucleus: low frequencies are represented in the lateral part, and high frequencies in the medial part, of the nucleus.
The auditory cortex is the highest center of the auditory system and lies in the temporal lobe. In humans it includes fields 41, 42 and, partially, 43.
Each of these zones is tonotopic, that is, it contains a complete representation of the neuroepithelium of the organ of Corti. The spatial representation of frequencies in the auditory areas is combined with the columnar organization of the auditory cortex, which is especially pronounced in the primary auditory cortex.