Duplex theory
The Duplex theory proposed by Lord Rayleigh (1907) provides an explanation for the ability of
humans to localise sounds by time differences between the sounds reaching each ear (ITDs) and
differences in sound level entering the ears (interaural level differences, ILDs). However, the
question remains which of the two cues, ITD or ILD, is dominant.
The duplex theory states that ITDs are used to localise low frequency sounds, in particular, while
ILDs are used in the localisation of high frequency sound inputs. However, the frequency ranges for
which the auditory system can use ITDs and ILDs significantly overlap, and most natural sounds
will have both high and low frequency components, so that the auditory system will in most cases
have to combine information from both ITDs and ILDs to judge the location of a sound source.[1] A
consequence of this duplex system is that it is also possible to generate so-called "cue trading" or
"timeintensity trading" stimuli on headphones, where ITDs pointing to the left are offset by ILDs
pointing to the right, so the sound is perceived as coming from the midline. A limitation of the
duplex theory is that the theory does not completely explain directional hearing, as no explanation is
given for the ability to distinguish between a sound source directly in front and behind. Also the
theory only relates to localising sounds in the horizontal plane around the head. The theory also
does not take into account the use of the pinna in localisation.(Gelfand, 2004)
Experiments conducted by Woodworth (1938) tested the duplex theory by using a solid sphere to
model the shape of the head and measuring the ITDs as a function of azimuth for different
frequencies. The model head had a distance between the two ears of approximately 22-23 cm. Initial
measurements found that there was a maximum time delay of approximately 660 μs when the sound
source was placed at 90° azimuth, directly opposite one ear. This time delay corresponds to the period of
a sound wave with a frequency of about 1500 Hz. The results showed that when a sound had a
frequency below 1500 Hz, the period of the wave is greater than this maximum time delay between the
ears. Therefore there is an unambiguous phase difference between the sound waves entering the ears,
providing an acoustic localisation cue. For a sound input with a frequency closer to 1500 Hz, the period of
the sound wave is similar to the natural time delay. Therefore, due to the size of the head and the
distance between the ears, the phase difference becomes ambiguous and localisation errors start to be made.
When a high frequency sound with a frequency greater than 1500 Hz is used, the wavelength
is shorter than the distance between the two ears; a head shadow is produced, and ILDs provide cues for
localising the sound.
Feddersen et al. (1957) also conducted experiments taking measurements on how ITDs alter with
changing the azimuth of the loudspeaker around the head at different frequencies. But unlike the
Woodworth experiments human subjects were used rather than a model of the head. The experiment
results agreed with the conclusion made by Woodworth about ITDs. The experiments also
concluded that there is no difference in ITDs when sounds are presented from directly in front or
behind, at 0° and 180° azimuth. The explanation for this is that the sound source is equidistant from both
ears. Interaural time differences alter as the loudspeaker is moved around the head. The maximum
ITD of 660 μs occurs when a sound source is positioned at 90° azimuth to one ear.
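The spherical-head approximation used in these experiments can be sketched numerically. The head radius and speed of sound below are assumed round values for illustration, not figures taken from the original studies:

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate ITD (seconds) for a spherical head, in the spirit of
    Woodworth's model: ITD = (r / c) * (theta + sin(theta)), where theta
    is the azimuth in radians from the midline."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# A source at 90 degrees azimuth gives the maximum delay, on the order
# of the ~660 microseconds reported in the text.
print(round(woodworth_itd(90) * 1e6))  # microseconds
```

With these assumed values the maximum delay comes out near 656 μs, consistent with the ~660 μs measurement quoted above.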
right AVCN. The result of having input from both cochleas is an increase in the firing rate of the
MSO units. The neurons in the MSO are sensitive to the difference in the arrival time of sound at
each ear, also known as the interaural time difference (ITD). Research shows that if stimulation
arrives at one ear before the other, many of the MSO units will have increased discharge rates. The
axons from the MSO continue to higher parts of the pathway via the ipsilateral lateral lemniscus
tract.(Yost, 2000)
The lateral lemniscus (LL) is the main auditory tract in the brainstem connecting SOC to the
inferior colliculus. The dorsal nucleus of the lateral lemniscus (DNLL) is a group of neurons
separated by lemniscal fibres that are predominantly destined for the inferior colliculus
(IC). In studies using an unanesthetized rabbit the DNLL was shown to alter the sensitivity of the IC
neurons and may alter the coding of interaural timing differences (ITDs) in the IC.(Kuwada et al.,
2005) The ventral nucleus of the lateral lemniscus (VNLL) is a chief source of input to the inferior
colliculus. Research using rabbits shows the discharge patterns, frequency tuning and dynamic
ranges of VNLL neurons supply the inferior colliculus with a variety of inputs, each enabling a
different function in the analysis of sound.(Batra & Fitzpatrick, 2001) In the inferior colliculus (IC)
all the major ascending pathways from the olivary complex and the central nucleus converge. The
IC is situated in the midbrain and consists of a group of nuclei, the largest of which is the central
nucleus of the inferior colliculus (CNIC). The greater part of the ascending axons forming the lateral
lemniscus terminate in the ipsilateral CNIC; however, a few follow the commissure of Probst
and terminate on the contralateral CNIC. The axons of most of the CNIC cells form the brachium of
IC and leave the brainstem to travel to the ipsilateral thalamus. Cells in different parts of the IC tend
to be monaural, responding to input from one ear, or binaural and therefore respond to bilateral
stimulation.
The spectral processing that occurs in the AVCN and the ability to process binaural stimuli, as seen
in the SOC, are replicated in the IC. Lower centres of the IC extract different features of the
acoustic signal such as frequencies, frequency bands, onsets, offsets, changes in intensity and
localisation. The integration or synthesis of acoustic information is thought to start in the CNIC.
(Yost, 2000)
Sound localization
Sound localization refers to a listener's ability to identify the location or origin of a detected sound
in direction and distance. It may also refer to the methods in acoustical engineering to simulate the
placement of an auditory cue in a virtual 3D space (see binaural recording, wave field synthesis).
The sound localization mechanisms of the mammalian auditory system have been extensively
studied. The auditory system uses several cues for sound source localization, including time- and
level-differences between both ears, spectral information, timing analysis, correlation analysis, and
pattern matching.
These cues are also used by animals, but there may be differences in usage, and there are also
localization cues which are absent in the human auditory system, such as the effects of ear
movements.
Sound from the right side has a higher level at the right ear than at the left ear, because the
head shadows the left ear. These level differences are highly frequency dependent and they
increase with increasing frequency.
For frequencies below 800 Hz, mainly interaural time differences are evaluated (phase delays); for
frequencies above 1600 Hz, mainly interaural level differences are evaluated. Between 800 Hz and
1600 Hz there is a transition zone, where both mechanisms play a role.
Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to
the sides. Humans can discern interaural time differences of 10 microseconds or less.[5][6]
Evaluation for low frequencies
For frequencies below 800 Hz, the dimensions of the head (ear distance 21.5 cm, corresponding to
an interaural time delay of 625 μs) are smaller than the half wavelength of the sound waves. So the
auditory system can determine phase delays between both ears without confusion. Interaural level
differences are very low in this frequency range, especially below about 200 Hz, so a precise
evaluation of the input direction is nearly impossible on the basis of level differences alone. As the
frequency drops below 80 Hz it becomes difficult or impossible to use either time difference or
level difference to determine a sound's lateral source, because the phase difference between the ears
becomes too small for a directional evaluation.
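The ~800 Hz limit quoted above follows directly from the geometry: phase cues stay unambiguous while half a wavelength exceeds the ear distance. A minimal check, assuming a speed of sound of 343 m/s (the exact figures depend on the value assumed):

```python
speed_of_sound = 343.0  # m/s, assumed value for air
ear_distance = 0.215    # m, as given in the text

# Maximum interaural delay for a source directly to one side
# (straight-line path difference, ignoring diffraction around the head).
max_itd_us = ear_distance / speed_of_sound * 1e6

# Phase delays stay unambiguous while the half wavelength exceeds the
# ear distance, i.e. for frequencies below c / (2 * d).
f_limit = speed_of_sound / (2 * ear_distance)

print(round(max_itd_us), round(f_limit))
```

This yields roughly 627 μs and 798 Hz, close to the 625 μs and 800 Hz figures in the text.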
Evaluation for high frequencies
For frequencies above 1600 Hz the dimensions of the head are greater than the length of the sound
waves. An unambiguous determination of the input direction based on interaural phase alone is not
possible at these frequencies. However, the interaural level differences become larger, and these
level differences are evaluated by the auditory system. Also, group delays between the ears can be
evaluated, an effect that is more pronounced at higher frequencies; that is, if there is a sound onset, the delay
of this onset between the ears can be used to determine the input direction of the corresponding
sound source. This mechanism becomes especially important in reverberant environments. After a
sound onset there is a short time frame where the direct sound reaches the ears, but not yet the
reflected sound. The auditory system uses this short time frame for evaluating the sound source
direction, and keeps this detected direction as long as reflections and reverberation prevent an
unambiguous direction estimation.
The mechanisms described above cannot be used to differentiate between a sound source ahead of
the hearer or behind the hearer; therefore additional cues have to be evaluated.
Signal processing
Sound processing of the human auditory system is performed in so-called critical bands. The
hearing range is segmented into 24 critical bands, each with a width of 1 Bark or 100 Mel. For a
directional analysis the signals inside the critical band are analyzed together.
The auditory system can extract the sound of a desired sound source out of interfering noise. So the
auditory system can concentrate on only one speaker if other speakers are also talking (the cocktail
party effect). With the help of the cocktail party effect, sound from interfering directions is perceived
as attenuated compared to the sound from the desired direction. The auditory system can increase the
signal-to-noise ratio by up to 15 dB, which means that interfering sound is perceived to be
attenuated to half (or less) of its actual loudness.
inside the critical bands in such a strong way, but the directional cues become unstable, because
there is a mix of sound of several reflection directions. As a result no new directional analysis is
triggered by the auditory system.
This first detected direction from the direct sound is taken as the found sound source direction, until
other strong loudness attacks, combined with stable directional information, indicate that a new
directional analysis is possible. (see Franssen effect)
Animals
Since most animals have two ears, many of the effects of the human auditory system can also be
found in animals. Therefore interaural time differences (interaural phase differences) and interaural
level differences play a role for the hearing of many animals. But the influences on localization of
these effects are dependent on head sizes, ear distances, the ear positions and the orientation of the
ears.
Head tilting
For sound localization in the median plane (the elevation of the sound), two detectors positioned
at different heights can also be used. In animals, however, rough elevation information is
gained simply by tilting the head, provided that the sound lasts long enough to complete the
movement. This explains the innate behavior of cocking the head to one side when trying to localize
a sound precisely. To get instantaneous localization in more than two dimensions from time-difference or amplitude-difference cues requires more than two detectors.
shifted above or below the elevation of the horizontal plane. This is due to the asymmetry in
placement of the ear openings in the owl's head, such that sounds from below the owl reach the left
ear first and sounds from above reach the right ear first.[12] IID is a measure of the difference in the
level of the sound as it reaches each ear. In many owls, IIDs for high-frequency sounds (higher than
4 or 5 kHz) are the principal cues for locating sound elevation.
Parallel processing pathways in the brain
The axons of the auditory nerve originate from the hair cells of the cochlea in the inner ear.
Different sound frequencies are encoded by different fibers of the auditory nerve, arranged along
the length of the auditory nerve, but codes for the timing and level of the sound are not segregated
within the auditory nerve. Instead, the ITD is encoded by phase locking, i.e. firing at or near a
particular phase angle of the sinusoidal stimulus sound wave, and the IID is encoded by spike rate.
Both parameters are carried by each fiber of the auditory nerve.[13]
The fibers of the auditory nerve innervate both cochlear nuclei in the brainstem, the cochlear
nucleus magnocellularis (mammalian anteroventral cochlear nucleus) and the cochlear nucleus
angularis (see figure; mammalian posteroventral and dorsal cochlear nuclei). The neurons of the
nucleus magnocellularis phase-lock, but are fairly insensitive to variations in sound pressure, while
the neurons of the nucleus angularis phase-lock poorly, if at all, but are sensitive to variations in
sound pressure. These two nuclei are the starting points of two separate but parallel pathways to the
inferior colliculus: the pathway from nucleus magnocellularis processes ITDs, and the pathway
from nucleus angularis processes IID.
Parallel processing pathways in the brain for time and level for sound localization in the owl
In the time pathway, the nucleus laminaris (mammalian medial superior olive) is the first site of
binaural convergence. It is here that ITD is detected and encoded using neuronal delay lines and
coincidence detection, as in the Jeffress model; when phase-locked impulses coming from the left
and right ears coincide at a laminaris neuron, the cell fires most strongly. Thus, the nucleus
laminaris acts as a delay-line coincidence detector, converting distance traveled to time delay and
generating a map of interaural time difference. Neurons from the nucleus laminaris project to the
core of the central nucleus of the inferior colliculus and to the anterior lateral lemniscal nucleus.
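The delay-line coincidence detection just described can be sketched with a toy simulation: each model neuron applies a different internal delay to one ear's signal, and the neuron whose delay compensates the external ITD responds most strongly. The sample rate, tone frequency, and shift below are illustrative values, not physiological ones:

```python
import math

FS = 100_000        # sample rate in Hz (assumed)
N = 2000            # 20 ms of signal
TRUE_SHIFT = 20     # right ear lags by 20 samples = 200 microseconds

left = [math.sin(2 * math.pi * 500 * n / FS) for n in range(N)]
# Simulate the later arrival at the right ear; the tone is exactly
# periodic over N samples, so a circular shift is a faithful delay.
right = [left[(n - TRUE_SHIFT) % N] for n in range(N)]

def coincidence(shift):
    """Score for one model 'laminaris neuron' whose delay line adds
    `shift` samples of internal delay to the left input; the inner
    product peaks when the internal delay matches the external ITD."""
    delayed = [left[(n - shift) % N] for n in range(N)]
    return sum(a * b for a, b in zip(delayed, right))

best = max(range(-50, 51), key=coincidence)
print(best, best / FS * 1e6)  # detected shift and the ITD in microseconds
```

The winning unit recovers the 200 μs delay, which is the essence of converting an arrival-time difference into a place code.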
In the sound level pathway, the posterior lateral lemniscal nucleus (mammalian lateral superior
olive) is the site of binaural convergence and where IID is processed. Stimulation of the
contralateral ear inhibits and that of the ipsilateral ear excites the neurons of the nuclei in each brain
hemisphere independently. The degree of excitation and inhibition depends on sound pressure, and
the difference between the strength of the inhibitory input and that of the excitatory input
determines the rate at which neurons of the lemniscal nucleus fire. Thus the response of these
neurons is a function of the difference in sound pressure between the two ears.
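The subtractive excitation-inhibition mechanism described above can be sketched as a toy rate model. The gain, baseline, and maximum rate are illustrative numbers, not physiological measurements:

```python
def ei_rate(excitatory_db, inhibitory_db, gain=5.0, baseline=100.0, max_rate=200.0):
    """Toy excitation-inhibition unit: the excitatory (ipsilateral) level
    raises the drive, the inhibitory (contralateral) level lowers it, and
    the firing rate tracks the difference, clipped to [0, max_rate]."""
    drive = baseline + gain * (excitatory_db - inhibitory_db)
    return max(0.0, min(max_rate, drive))

# Equal levels give the baseline rate; a 10 dB excitatory advantage
# raises the rate, and a 10 dB inhibitory advantage lowers it.
print(ei_rate(65, 65), ei_rate(70, 60), ei_rate(60, 70))
```

The output rate is thus a function of the interaural level difference alone, as the text describes.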
The time and sound-pressure pathways converge at the lateral shell of the central nucleus of the
inferior colliculus. The lateral shell projects to the external nucleus, where each space-specific
neuron responds to acoustic stimuli only if the sound originates from a restricted area in space, i.e.
the receptive field of that neuron. These neurons respond exclusively to binaural signals containing
the same ITD and IID that would be created by a sound source located in the neuron's receptive
field. Thus their receptive fields arise from the neurons' tuning to particular combinations of ITD
and IID, simultaneously in a narrow range. These space-specific neurons can thus form a map of
auditory space in which the positions of receptive fields in space are isomorphically projected onto
the anatomical sites of the neurons.[14]
Significance of asymmetrical ears for localization of elevation
The ears of many species of owls are asymmetrical. For example, in barn owls (Tyto alba), the
placement of the two ear flaps (operculi) lying directly in front of the ear canal opening is different
for each ear. This asymmetry is such that the center of the left ear flap is slightly above a horizontal
line passing through the eyes and directed downward, while the center of the right ear flap is
slightly below the line and directed upward. In two other species of owls with asymmetrical ears,
the saw-whet owl and the long-eared owl, the asymmetry is achieved by different means: in saw-whets, the skull is asymmetrical; in the long-eared owl, the skin structures lying near the ear form
asymmetrical entrances to the ear canals, which is achieved by a horizontal membrane. Thus, ear
asymmetry seems to have evolved on at least three different occasions among owls. Because owls
depend on their sense of hearing for hunting, this convergent evolution in owl ears suggests that
asymmetry is important for sound localization in the owl.
Ear asymmetry causes sound originating from below eye level to sound louder in the left ear,
and sound originating from above eye level to sound louder in the right ear. Asymmetrical ear
placement also causes IID for high frequencies (between 4 kHz and 8 kHz) to vary systematically
with elevation, converting IID into a map of elevation. Thus, it is essential for an owl to have the
ability to hear high frequencies. Many birds have the neurophysiological machinery to process both
ITD and IID, but because they have small heads and low frequency sensitivity, they use both
parameters only for localization in the azimuth. Through evolution, the ability to hear frequencies
higher than 3 kHz, the highest frequency of owl flight noise, enabled owls to exploit elevational
IIDs, produced by small ear asymmetries that arose by chance, and began the evolution of more
elaborate forms of ear asymmetry.[15]
Another demonstration of the importance of ear asymmetry in owls is that, in experiments, owls
with symmetrical ears, such as the screech owl (Otus asio) and the great horned owl (Bubo
virginianus), could not be trained to locate prey in total darkness, whereas owls with asymmetrical
ears could be trained.[16]
Neural interactions
In vertebrates, inter-aural time differences are known to be calculated in the superior olivary
nucleus of the brainstem. According to Jeffress,[17] this calculation relies on delay lines: neurons in
the superior olive which accept innervation from each ear with different connecting axon lengths.
Some cells are more directly connected to one ear than the other, thus they are specific for a
particular inter-aural time difference. This theory is equivalent to the mathematical procedure of
cross-correlation. However, because Jeffress' theory is unable to account for the precedence effect,
in which only the first of multiple identical sounds is used to determine the sounds' location (thus
avoiding confusion caused by echoes), it cannot be entirely used to explain the response.
Furthermore, a number of recent physiological observations made in the midbrain and brainstem of
small mammals have cast considerable doubt on the validity of Jeffress' original ideas.[18]
Neurons sensitive to ILDs are excited by stimulation of one ear and inhibited by stimulation of the
other ear, such that the response magnitude of the cell depends on the relative strengths of the two
inputs, which in turn, depends on the sound intensities at the ears.
In the auditory midbrain nucleus, the inferior colliculus (IC), many ILD sensitive neurons have
response functions that decline steeply from maximum to zero spikes as a function of ILD.
However, there are also many neurons with much more shallow response functions that do not
decline to zero spikes.
Binaural fusion
Binaural fusion (or binaural integration) is a cognitive process that involves the "fusion" of
different auditory information presented binaurally, or to each ear. In humans, this process is
essential in understanding speech as one ear may pick up more information about the speech stimuli
than the other.
The process of binaural fusion is important for computing the location of sound sources in the
horizontal plane (sound localization), and it is important for sound segregation.[1] Sound
segregation refers to the ability to identify acoustic components from one or more sound sources.[2]
The binaural auditory system is highly dynamic and capable of rapidly adjusting tuning properties
depending on the context in which sounds are heard. Each eardrum moves one-dimensionally; the
auditory brain analyzes and compares movements of both eardrums to extract physical cues and
synthesize auditory objects.[3]
When stimulation from a sound reaches the ear, the eardrum deflects in a mechanical fashion, and
the three middle ear bones (ossicles) transmit the mechanical signal to the cochlea, where hair cells
transform the mechanical signal into an electrical signal. The auditory nerve, also called the
cochlear nerve, then transmits action potentials to the central auditory nervous system.[3]
In binaural fusion, inputs from both ears integrate and fuse to create a complete auditory picture at
the brainstem. Therefore, the signals sent to the central auditory nervous system are representative
of this complete picture, integrated information from both ears instead of a single ear.
Binaural fusion is responsible for what is known as the cocktail party effect, the ability of a listener
to hear a particular speaker against other interfering voices.[3]
The binaural squelch effect is a result of nuclei of the brainstem processing timing, amplitude, and
spectral differences between the two ears. Sounds are integrated and then separated into auditory
objects. For this effect to take place, neural integration from both sides is required.[4]
Anatomy
As sound travels into the inner ear of mammals, it encounters the hair cells that line
the basilar membrane of the cochlea in the inner ear.[5] The cochlea receives auditory information
to be binaurally integrated. At the cochlea, this information is converted into electrical impulses that
travel by means of the cochlear nerve, which spans from the cochlea to the ventral cochlear nucleus,
which is located in the pons of the brainstem.[6] The lateral lemniscus projects from the cochlear
nucleus to the superior olivary complex (SOC), a set of brainstem nuclei that consists primarily of
two nuclei, the medial superior olive (MSO) and the lateral superior olive (LSO), and is the major
site of binaural fusion. The subdivision of the ventral cochlear nucleus that concerns binaural fusion
is the anterior ventral cochlear nucleus (AVCN).[3] The AVCN consists of spherical bushy cells and
globular bushy cells and can also transmit signals to the medial nucleus of the trapezoid body
(MNTB), whose neurons project to the MSO. Transmissions from the SOC travel to the inferior
colliculus (IC) via the lateral lemniscus. At the level of the IC, binaural fusion is complete. The
signal ascends to the thalamocortical system, and sensory inputs to the thalamus are then relayed to
the primary auditory cortex.[3][7][8][9]
Function
The ear functions to analyze and encode a sound's dimensions.[10] Binaural fusion is responsible
for avoiding the creation of multiple sound images from a sound source and its reflections. The
advantages of this phenomenon are more noticeable in small rooms, decreasing as the reflective
surfaces are placed farther from the listener.[11]
Sound localization
Sound localization is the ability to correctly identify the directional location of sounds. The
position of a sound stimulus in the horizontal plane is called its azimuth; its position in the vertical
plane is referred to as its elevation. The time, intensity, and spectral differences in the sound arriving at the two ears are used
in localization. Localization of low frequency sounds is accomplished by analyzing interaural time
difference (ITD). Localization of high frequency sounds is accomplished by analyzing interaural
level difference (ILD).[4]
Mechanism
Binaural hearing
Action potentials originate in the hair cells of the cochlea and propagate to the brainstem; both the
timing of these action potentials and the signal they transmit provide information to the SOC about
the orientation of sound in space. The processing and propagation of action potentials is rapid, and
therefore, information about the timing of the sounds that were heard, which is crucial to binaural
processing, is conserved.[15] Each eardrum moves in one dimension, and the auditory brain
analyzes and compares the movements of both eardrums in order to synthesize auditory objects.[3]
This integration of information from both ears is the essence of binaural fusion. The binaural system
of hearing involves sound localization in the horizontal plane, contrasting with the monaural system
of hearing, which involves sound localization in the vertical plane.[3]
are particularly related to the proper function of the SOC, and there is increasing evidence that
morphological abnormalities within the brainstem, namely in the SOC, of autistic individuals are a
cause of the hearing difficulties.[17] The neurons of the MSO of individuals with autism display
atypical anatomical features, including atypical cell shape and orientation of the cell body as well as
stellate and fusiform formations.[18] Data also suggests that neurons of the LSO and MNTB
contain distinct dysmorphology in autistic individuals, such as irregular stellate and fusiform shapes
and a smaller than normal size. Moreover, a significant depletion of SOC neurons is seen in the
brainstem of autistic individuals. All of these structures play a crucial role in the proper functioning
of binaural fusion, so their dysmorphology may be at least partially responsible for the incidence of
these auditory symptoms in autistic patients.[17]
Acoustic location
Swedish soldiers operating an acoustic locator in 1940
Acoustic location is the science of using sound to determine the distance and direction of
something. Location can be done actively or passively, and can take place in gases (such as the
atmosphere), liquids (such as water), and in solids (such as in the earth).
Active acoustic location involves the creation of sound in order to produce an echo, which is
then analyzed to determine the location of the object in question.
Passive acoustic location involves the detection of sound or vibration created by the object
being detected, which is then analyzed to determine the location of the object in question.
Both of these techniques, when used in water, are known as sonar; passive sonar and active sonar
are both widely used.
Acoustic mirrors and dishes, when using microphones, are a means of passive acoustic localization,
but when using speakers they are a means of active localization. Typically, more than one device is used,
and the location is then triangulated from the several devices.
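The triangulation step mentioned above can be sketched for two devices in a plane: each locator reports a bearing toward the source, and the two bearing lines are intersected. The bearing convention (degrees counterclockwise from the +x axis) is chosen here purely for illustration:

```python
import math

def triangulate(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect two bearing lines from known locator positions.
    Bearings are degrees counterclockwise from the +x axis (an
    illustrative convention). Assumes the bearings are not parallel
    and neither is vertical."""
    x1, y1 = p1
    x2, y2 = p2
    t1 = math.tan(math.radians(bearing1_deg))
    t2 = math.tan(math.radians(bearing2_deg))
    # Solve y - y1 = t1 * (x - x1) and y - y2 = t2 * (x - x2).
    x = (y2 - y1 + t1 * x1 - t2 * x2) / (t1 - t2)
    y = y1 + t1 * (x - x1)
    return x, y

# Two locators 1000 m apart hear the same source; the bearing lines
# cross at the estimated source position.
print(triangulate((0, 0), 45, (1000, 0), 135))
```

With the example bearings the lines cross at roughly (500, 500), midway between and in front of the two locators.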
As a military air defense tool, passive acoustic location was used from mid-World War I[1] to the
early years of World War II to detect enemy aircraft by picking up the noise of their engines. It was
rendered obsolete before and during World War II by the introduction of radar, which was far more
effective (but interceptable). Acoustic techniques had the advantage that they could 'see' around
corners and over hills, due to sound refraction.
The civilian uses include locating wildlife[2] and locating the shooting position of a firearm.[3]
Sonar
SONAR (Sound Navigation And Ranging) or sonar is a technique that uses sound
propagation under water (or occasionally in air) to navigate, communicate or to detect other vessels.
There are two kinds of sonar: active and passive. A single active sonar can localize in range and
bearing as well as measuring radial speed. However, a single passive sonar can only localize in
bearing directly, though target motion analysis can be used to localize in range, given time. Multiple
passive sonars can be used for range localization by triangulation or correlation, directly.
For more information on this item, see the article on Sonar.
Time-of-arrival localization
Having speakers/ultrasonic transmitters emitting sound at known positions and time, the position of
a target equipped with a microphone/ultrasonic receiver can be estimated based on the time of
arrival of the sound. The accuracy is usually poor under non-line-of-sight conditions, where there
are blockages in between the transmitters and the receivers. [11]
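A minimal sketch of the time-of-arrival idea, assuming three transmitters at known positions, a shared clock, and line-of-sight propagation (the situation the text notes degrades accuracy is exactly when this last assumption fails). The linearisation used is standard trilateration:

```python
import math

SPEED = 343.0  # speed of sound in air, m/s (assumed)

def toa_position(anchors, arrival_times, emit_time=0.0):
    """2D receiver position from times of arrival, given three
    transmitters at known positions that emitted at a known time.
    Subtracting the first range equation from the other two turns the
    quadratic system into a linear 2x2 system in (x, y)."""
    d = [SPEED * (t - emit_time) for t in arrival_times]
    (x1, y1), (x2, y2), (x3, y3) = anchors
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d[0] ** 2 - d[1] ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d[0] ** 2 - d[2] ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Transmitters at three corners of a room; the receiver sits at (30, 40).
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
truth = (30.0, 40.0)
times = [math.hypot(truth[0] - x, truth[1] - y) / SPEED for x, y in anchors]
print(toa_position(anchors, times))
```

With noise-free times the estimate recovers (30, 40) exactly; in practice, blocked or reflected paths inflate the measured times and bias the solution, as noted above.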
Seismic surveys
Seismic surveys involve the generation of sound waves to measure underground structures. Source
waves are generally created by percussion mechanisms located near the ground or water surface,
typically dropped weights, vibroseis trucks, or explosives. Data are collected with geophones, then
stored and processed by computer. Current technology allows the generation of 3D images of
underground rock structures using such equipment.
For more information, see Reflection seismology.
Ecotracer
Ecotracer is an acoustic locator that was used to determine the presence and position of ships in
fog. Some could detect targets at distances up to 12 kilometers. Static walls could detect aircraft up
to 30 miles away.
Types
There were four main kinds of system:[12]
Personal/wearable horns
Transportable steerable horns
Static dishes
Static walls
Impact
American acoustic locators were used in 1941 to detect the Japanese attack on the fortress island of
Corregidor in the Philippines.
Other
Because the cost of the associated sensors and electronics is dropping, the use of sound ranging
technology is becoming accessible for other uses, such as for locating wildlife.[13]
Fig. 2: If a sound arrives at the left ear before the right ear,
the impulse in the left auditory tract will reach X sooner
than the impulse in the right auditory tract reaches Y.
Neurons 4 or 5 may therefore receive coincident inputs.
pyramidal neuron may not induce long-term potentiation. However, this same stimulation paired
with a simultaneous strong stimulation from another neuron will strengthen both synapses. This
process suggests that two neuronal pathways converging on the same cell may both strengthen if
stimulated coincidentally.
Animal echolocation
A depiction of the ultrasound signals emitted by a bat, and the echo from a nearby object.
Echolocation, also called biosonar, is the biological sonar used by several kinds of animals.
Echolocating animals emit calls out to the environment and listen to the echoes of those calls that
return from various objects near them. They use these echoes to locate and identify the objects.
Echolocation is used for navigation and for foraging (or hunting) in various environments. Some
blind humans have learned to find their way using clicks produced by a device or by mouth.
Echolocating animals include some mammals and a few birds; most notably microchiropteran bats
and odontocetes (toothed whales and dolphins), but also in simpler form in other groups such as
shrews, one genus of megachiropteran bats (Rousettus) and two cave-dwelling bird groups, the so-called cave swiftlets in the genus Aerodramus (formerly Collocalia) and the unrelated oilbird,
Steatornis caripensis.
Early research
The term echolocation was coined by Donald Griffin, whose work with Robert Galambos was the
first to conclusively demonstrate its existence in bats in 1938.[1][2]
Long before that, however, the 18th century Italian scientist Lazzaro Spallanzani had, by means of a
series of elaborate experiments, concluded that bats navigate by hearing and not by vision.[3]
Echolocation in odontocetes was not properly described until two decades later, by Schevill and
McBride.[4]
Principle
Echolocation is the same as active sonar, using sounds made by the animal itself. Ranging is done
by measuring the time delay between the animal's own sound emission and any echoes that return
from the environment. The relative intensity of sound received at each ear as well as the time delay
between arrival at the two ears provide information about the horizontal angle (azimuth) from
which the reflected sound waves arrive.[5]
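The ranging step can be sketched numerically. This is a minimal illustration, not a model of any particular animal; the speed of sound in air and the example delay are assumptions:

```python
# Sketch of echo ranging: distance from the two-way travel time of an echo.
# The speed of sound in air (~343 m/s at 20 degrees C) is an assumed constant.

SPEED_OF_SOUND_AIR = 343.0  # m/s

def target_distance(echo_delay_s, c=SPEED_OF_SOUND_AIR):
    """Distance to a reflecting object given the emission-to-echo delay.

    The sound travels out and back, so the one-way distance is half the
    total path covered during the delay.
    """
    return c * echo_delay_s / 2.0

# A 10 ms delay puts the target roughly 1.7 m away.
distance_m = target_distance(0.010)
```

The halving is the key point: the delay measures the round trip, not the one-way distance.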
Unlike some man-made sonars that rely on many extremely narrow beams and many receivers to
localize a target (multibeam sonar), animal echolocation has only one transmitter and two receivers
(the ears). Echolocating animals have two ears positioned slightly apart. The echoes returning to the
two ears arrive at different times and at different loudness levels, depending on the position of the
object generating the echoes. The time and loudness differences are used by the animals to perceive
distance and direction. With echolocation, the bat or other animal can see not only where it is going
but also how big another animal is, what kind of animal it is, and other features.[citation needed]
Bats
Microbats use echolocation to navigate and forage, often in total darkness. They generally emerge
from their roosts in caves, attics, or trees at dusk and hunt for insects into the night. Their use of
echolocation allows them to occupy a niche where insects are often abundant (many emerge at
night, when there are fewer predators), where there is less competition for food, and where there
are fewer other species that may prey on the bats themselves.
Microbats generate ultrasound via the larynx and emit the sound through the open mouth or, much
more rarely, the nose. The latter is most pronounced in the horseshoe bats (Rhinolophus spp.).
Microbat calls range in frequency from 14,000 to well over 100,000 Hz, mostly beyond
the range of the human ear (the typical human hearing range is considered to be from 20 Hz to
20,000 Hz). Bats may estimate the elevation of targets by interpreting the interference patterns
caused by the echoes reflecting from the tragus, a flap of skin in the external ear.[6] There are two
hypotheses about the evolution of echolocation in bats. The first suggests that laryngeal
echolocation evolved twice in Chiroptera, once in the Yangochiroptera and once in the Horseshoe
bats (Rhinolophidae).[7][8] The second proposes that laryngeal echolocation had a single origin in
Chiroptera, was subsequently lost in the family Pteropodidae (all megabats), and later evolved as a
system of tongue-clicking in the genus Rousettus.[9]
Individual bat species echolocate within specific frequency ranges that suit their environment and
prey types. This has sometimes been used by researchers to identify bats flying in an area simply by
recording their calls with ultrasonic recorders known as "bat detectors". However, echolocation calls
are not always species-specific, and some bats overlap in the types of calls they use, so recordings of
echolocation calls cannot be used to identify all bats. In recent years, researchers in several countries
have developed "bat call libraries" containing recordings of identified local bat species, known as
"reference calls", to assist with identification.
Since the 1970s there has been an ongoing controversy among researchers as to whether bats use a
form of processing known from radar termed coherent cross-correlation. Coherence means that the
phase of the echolocation signals is used by the bats, while cross-correlation just implies that the
outgoing signal is compared with the returning echoes in a running process. Today most, but not all,
researchers believe that bats use cross-correlation, but in an incoherent form, termed a filter bank
receiver.
When searching for prey, they produce sounds at a low rate (10-20 clicks/second). During the search
phase the sound emission is coupled to respiration, which is again coupled to the wingbeat. This
coupling appears to dramatically conserve energy as there is little to no additional energetic cost of
echolocation to flying bats.[10] After detecting a potential prey item, microbats increase the rate of
pulses, ending with the terminal buzz, at rates as high as 200 clicks/second. During approach to a
detected target, the duration of the sounds is gradually decreased, as is the energy of the sound.
descent with modification, and resulting in the diversity of the Microchiroptera today.[11][12][13][14][15][16]
Acoustic features
Describing the diversity of bat echolocation calls requires examination of the frequency and
temporal features of the calls. It is the variations in these aspects that produce echolocation calls
suited for different acoustic environments and hunting behaviors.[17][18][19][20][21]
Frequency Modulation and Constant Frequency: Echolocation calls can be composed of
two different types of frequency structures: frequency modulated (FM) sweeps, and constant
frequency (CF) tones. A particular call can consist of one, the other, or both structures. An
FM sweep is a broadband signal; that is, it contains a downward sweep through a range of
frequencies. A CF tone is a narrowband signal: the sound stays constant at one frequency
throughout its duration.
Intensity: Echolocation calls have been measured at intensities anywhere between 60 and
140 decibels.[22] Certain microbat species can modify their call intensity mid-call, lowering
the intensity as they approach objects that reflect sound strongly. This prevents the returning
echo from deafening the bat.[23] Additionally, the so-called "whispering bats" have adapted
low-amplitude echolocation so that their prey, moths, which are able to hear echolocation
calls, are less able to detect and avoid an oncoming bat.[24]
Harmonic composition: Calls can be composed of one frequency, or multiple frequencies
comprising a harmonic series. In the latter case, the call is usually dominated by a certain
harmonic ("dominant" frequencies are those present at higher intensities than other
harmonics present in the call).[citation needed]
Call duration: A single echolocation call (a call being a single continuous trace on a sound
spectrogram, and a series of calls comprising a sequence or pass) can last anywhere from 0.2
to 100 milliseconds in duration, depending on the stage of prey-catching behavior that the
bat is engaged in. For example, the duration of a call usually decreases when the bat is in the
final stages of prey capture; this enables the bat to call more rapidly without overlap of call
and echo. Reducing duration comes at the cost of having less total sound available for
reflecting off objects and being heard by the bat.[citation needed]
Pulse interval: The time interval between subsequent echolocation calls (or pulses)
determines two aspects of a bat's perception. First, it establishes how quickly the bat's
auditory scene information is updated. For example, bats increase the repetition rate of their
calls (that is, decrease the pulse interval) as they home in on a target. This allows the bat to
get new information regarding the target's location at a faster rate when it needs it most.
Secondly, the pulse interval determines the maximum range at which bats can detect objects. This
is because bats can only keep track of the echoes from one call at a time; as soon as they
make another call they stop listening for echoes from the previously made call.[25] For
example, a pulse interval of 100 ms (typical of a bat searching for insects) allows sound to
travel roughly 34 meters in air, so a bat can only detect objects up to 17 meters away (the
sound has to travel out and back). With a pulse interval of 5 ms (typical of a bat in the final
moments of a capture attempt), the bat can only detect objects up to 85 cm away. The bat
therefore constantly has to choose between getting quickly updated information and detecting
objects far away.
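The pulse-interval arithmetic above can be sketched as follows. This is a minimal illustration assuming a speed of sound of 340 m/s (the figure that the 34 m example implies):

```python
# Maximum detection range set by the pulse interval: a bat stops listening
# for echoes of one call once it emits the next, so an echo must return
# within one pulse interval. Speed of sound in air assumed ~340 m/s.

SPEED_OF_SOUND_AIR = 340.0  # m/s

def max_detection_range(pulse_interval_s, c=SPEED_OF_SOUND_AIR):
    """Farthest object whose echo returns before the next call is emitted."""
    # Halve the total path because the sound must travel out and back.
    return c * pulse_interval_s / 2.0

search_range = max_detection_range(0.100)  # 100 ms search phase: 17.0 m
buzz_range = max_detection_range(0.005)    # 5 ms terminal buzz: 0.85 m
```

The two calls reproduce the 17 m and 85 cm figures from the text.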
FM Signal Advantages
The major advantage conferred by an FM signal is extremely precise range discrimination, or
localization, of the target. J.A. Simmons demonstrated this effect with a series of elegant
experiments that showed how bats using FM signals could distinguish between two separate targets
even when the targets were less than half a millimeter apart. This ability is due to the
broadband sweep of the signal, which allows for better resolution of the time delay between the call
and the returning echo, thereby improving the cross correlation of the two. Additionally, if harmonic
frequencies are added to the FM signal, then this localization becomes even more precise.[26][27]
[28][29]
One possible disadvantage of the FM signal is a decreased operational range of the call. Because the
energy of the call is spread out among many frequencies, the distance at which the FM-bat can
detect targets is limited.[30] This is in part because any echo returning at a particular frequency can
only be evaluated for a brief fraction of a millisecond, as the fast downward sweep of the call does
not remain at any one frequency for long.[31]
CF Signal Advantages
The structure of a CF signal is adaptive in that it allows the CF-bat to detect both the velocity of a
target, and the fluttering of a target's wings as Doppler shifted frequencies. A Doppler shift is an
alteration in sound wave frequency, and is produced in two relevant situations: when the bat and its
target are moving relative to each other, and when the target's wings are oscillating back and forth.
CF-bats must compensate for Doppler shifts, lowering the frequency of their call in response to
echoes of elevated frequency; this ensures that the returning echo remains at the frequency to
which the ears of the bat are most finely tuned. The oscillation of a target's wings also produces
amplitude shifts, which gives a CF-bat additional help in distinguishing a flying target from a
stationary one. (Schnitzler and Flieger 1983; Zupanc 2004; Simmons and Stein 1980; Grinnell
1995; Neuweiler 2003; Jones and Teeling 2006)
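The compensation step can be sketched numerically. This is a simplified two-way Doppler approximation; the reference frequency, flight speed, and sound speed are illustrative assumptions:

```python
# Sketch of Doppler-shift compensation in a CF bat. A bat closing on a
# target at speed v hears its echo raised by roughly a factor (1 + 2v/c),
# so it lowers its emitted frequency to keep the echo at the reference
# frequency its ears are most finely tuned to. All constants are assumed.

C = 340.0  # speed of sound in air, m/s

def echo_frequency(f_emit_hz, v_ms, c=C):
    """Approximate echo frequency for a bat closing on a target at v_ms."""
    return f_emit_hz * (1.0 + 2.0 * v_ms / c)  # two-way Doppler shift

def compensated_call(f_ref_hz, v_ms, c=C):
    """Emitted frequency whose echo returns at the reference frequency."""
    return f_ref_hz / (1.0 + 2.0 * v_ms / c)

f_ref = 61_000.0  # e.g. a ~61 kHz acoustic fovea (illustrative)
v = 5.0           # assumed flight speed toward the target, m/s
f_call = compensated_call(f_ref, v)  # lowered call, below 61 kHz
```

The lowered call's echo comes back at the reference frequency, which is the point of the compensation.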
Additionally, because the signal energy of a CF call is concentrated into a narrow frequency band,
the operational range of the call is much greater than that of an FM signal. This relies on the fact
that echoes returning within the narrow frequency band can be summed over the entire length of the
call, which maintains a constant frequency for up to 100 milliseconds.[32][33]
Acoustic environments of FM and CF signals
FM: An FM component is excellent for hunting prey while flying in close, cluttered
environments. Two aspects of the FM signal account for this fact: the precise target
localization conferred by the broadband signal, and the short duration of the call. The first of
these is essential because in a cluttered environment, the bats must be able to resolve their
prey from large amounts of background noise. The 3D localization abilities of the broadband
signal enable the bat to do exactly that, providing it with what Simmons and Stein (1980)
call a "clutter rejection strategy." This strategy is further improved by the use of harmonics,
which, as previously stated, enhance the localization properties of the call. The short
duration of the FM call is also best in close, cluttered environments because it enables the
bat to emit many calls extremely rapidly without overlap. This means that the bat can get an
almost continuous stream of information, essential when objects are close because they will
pass by quickly, without confusing which echo corresponds to which call. (Neuweiler
2003; Simmons and Stein 1980; Jones and Teeling 2006; Fenton 1995)
CF: A CF component is often used by bats hunting for prey while flying in open, clutter-free
environments, or by bats that wait on perches for their prey to appear. The success of the
former strategy is due to two aspects of the CF call, both of which confer excellent
prey-detection abilities. First, the greater working range of the call allows bats to detect targets
present at great distances, a common situation in open environments. Second, the length of
the call is also suited for targets at great distances: in this case, there is a decreased chance
that the long call will overlap with the returning echo. The latter strategy is made possible by
the fact that the long, narrowband call allows the bat to detect Doppler shifts, which would
be produced by an insect moving either towards or away from a perched bat. (Neuweiler
2003; Simmons and Stein 1980; Jones and Teeling 2006; Fenton 1995)
Inferior colliculus
In the Inferior colliculus, a structure in the bat's midbrain, information from lower in the auditory
processing pathway is integrated and sent on to the auditory cortex. As George Pollak and others
showed in a series of papers in 1977, the interneurons in this region have a very high level of
sensitivity to time differences, since the time delay between a call and the returning echo tells the
bat its distance from the target object. Especially interesting is that while most neurons respond
more quickly to stronger stimuli, collicular neurons maintain their timing accuracy even as signal
intensity changes.
These interneurons are specialized for time sensitivity in several ways. First, when activated, they
generally respond with only one or two action potentials. This short duration of response allows
their action potentials to give a very specific indication of the exact moment at which the
stimulus arrived, and to respond accurately to stimuli that occur close in time to one another. In
addition, the neurons have a very low threshold of activation: they respond quickly even to weak
stimuli. Finally, for FM signals, each interneuron is tuned to a specific frequency within the sweep,
as well as to that same frequency in the following echo. There is specialization for the CF
component of the call at this level as well. The high proportion of neurons responding to the
frequency of the acoustic fovea actually increases at this level.[37][38][39]
Auditory cortex
The auditory cortex in bats is quite large in comparison with other mammals.[40] Various
characteristics of sound are processed by different regions of the cortex, each providing different
information about the location or movement of a target object. Most of the existing studies on
information processing in the auditory cortex of the bat have been done by Nobuo Suga on the
mustached bat, Pteronotus parnellii. This bat's call has both CF tone and FM sweep components.
Suga and his colleagues have shown that the cortex contains a series of "maps" of auditory
information, each of which is organized systematically based on characteristics of sound such as
frequency and amplitude. The neurons in these areas respond only to a specific combination of
frequency and timing (sound-echo delay), and are known as combination-sensitive neurons.
The systematically organized maps in the auditory cortex respond to various aspects of the echo
signal, such as its delay and its velocity. These regions are composed of "combination sensitive"
neurons that require at least two specific stimuli to elicit a response. The neurons vary
systematically across the maps, which are organized by acoustic features of the sound and can be
two dimensional. The different features of the call and its echo are used by the bat to determine
important characteristics of their prey. The maps include:
FM-FM area: This region of the cortex contains FM-FM combination-sensitive neurons. These
cells respond only to the combination of two FM sweeps: a call and its echo. The neurons in the
FM-FM region are often referred to as "delay-tuned," since each responds to a specific time delay
between the original call and the echo, in order to find the distance from the target object (the
range). Each neuron also shows specificity for one harmonic in the original call and a different
harmonic in the echo. The neurons within the FM-FM area of the cortex of Pteronotus are
organized into columns, in which the delay time is constant vertically but increases across the
horizontal plane. The result is that range is encoded by location on the cortex, and increases
systematically across the FM-FM area.[41][42][43][44]
CF-CF area: Another kind of combination-sensitive neuron is the CF-CF neuron. These
respond best to the combination of a CF call containing two given frequencies, a 30 kHz call
(CF1) plus one of its additional harmonics around 60 or 90 kHz (CF2 or CF3), together with
the corresponding echoes. Thus, within the CF-CF region, the changes in echo frequency
caused by the Doppler shift can be compared to the frequency of the original call to calculate
the bat's velocity relative to its target object. As in the FM-FM area, information is encoded
by its location within the map-like organization of the region. The CF-CF area is first split
into the distinct CF1-CF2 and CF1-CF3 areas. Within each area, the CF1 frequency is
organized on an axis, perpendicular to the CF2 or CF3 frequency axis. In the resulting grid,
each neuron codes for a certain combination of frequencies that is indicative of a specific
velocity.[45][46][47]
DSCF area: This large section of the cortex is a map of the acoustic fovea, organized by
frequency and by amplitude. Neurons in this region respond to CF signals that have been
Doppler shifted (in other words, echoes only) and are within the same narrow frequency
range to which the acoustic fovea responds. For Pteronotus, this is around 61 kHz. This area
is organized into columns, which are arranged radially based on frequency. Within a column,
each neuron responds to a specific combination of frequency and amplitude. Suga's studies
have indicated that this brain region is necessary for frequency discrimination.[48][49][50]
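The velocity computation attributed to the CF-CF area above can be sketched as follows. The small-velocity two-way Doppler approximation and the example frequencies are illustrative assumptions:

```python
# Sketch of the CF-CF comparison: the difference between the emitted call
# frequency and the Doppler-shifted echo frequency implies the bat's
# closing speed on the target. Speed of sound in air assumed ~340 m/s.

C = 340.0  # m/s

def relative_velocity(f_call_hz, f_echo_hz, c=C):
    """Closing speed implied by a two-way Doppler shift (positive = approaching)."""
    return c * (f_echo_hz - f_call_hz) / (2.0 * f_call_hz)

# A 30 kHz CF1 call whose echo returns at 30.9 kHz implies ~5.1 m/s.
v = relative_velocity(30_000.0, 30_900.0)
```

An unshifted echo gives zero velocity, and an echo below the call frequency gives a negative (receding) velocity.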
Toothed whales
Biosonar is valuable to toothed whales (suborder Odontoceti), including dolphins, porpoises, river
dolphins, killer whales and sperm whales, because they live in an underwater habitat that has
favourable acoustic characteristics and where vision is extremely limited in range due to absorption
or turbidity.
Cetacean evolution consisted of three main radiations. Throughout the middle and late Eocene
(49-31.5 million years ago), archaeocetes, primitive toothed Cetacea that arose from terrestrial
mammals as they acquired aquatic adaptations, were the only known archaic Cetacea.
[51] These primitive aquatic mammals did not possess the ability to echolocate, although they did
have slightly adapted underwater hearing.[52] The morphology of acoustically isolated ear bones in
basilosaurid archaeocetes indicates that this order had directional hearing underwater at low to mid
frequencies by the late middle Eocene.[53] However, with the extinction of the archaeocetes at the
onset of the Oligocene, two new lineages in the early Oligocene (31.5-28 million years ago)
comprised a second radiation. These early mysticetes (baleen whales) and odontocetes can be
dated back to the middle Oligocene in New Zealand.[51] Based on past phylogenies, it has been
found that the evolution of odontocetes is monophyletic, suggesting that echolocation evolved only
once, 36 to 34 million years ago.[53] Dispersal routes of early odontocetes included
transoceanic travel to new adaptive zones. The third radiation occurred later in the Neogene, when
present dolphins and their relatives evolved to be the most common species in the modern sea.[52]
Several hypotheses have been proposed for the evolution of echolocation. There are two proposed
drives for cetacean radiation, one biotic and the other abiotic in nature. The first,
adaptive radiation, is the result of a rapid divergence into new adaptive zones. This results in
diverse, ecologically different clades that are incomparable.[54] Clade Neocete (crown cetacean)
has been characterized by an evolution from archaeocetes and a dispersion across the world's
oceans, and even estuaries and rivers. These ecological opportunities were the result of abundant
dietary resources with low competition for hunting.[55] This hypothesis of lineage diversification,
however, can be unconvincing due to a lack of support for rapid speciation early in cetacean history.
A second, more abiotic drive is better supported. Physical restructuring of the oceans has played a
role in echolocation radiation. This was a result of global climate change at the Eocene-Oligocene
boundary, from a greenhouse to an icehouse world. Tectonic openings created the emergence of the
Southern Ocean with a free-flowing Antarctic Circumpolar Current.[56][57][58][59] These events
allowed for a selection regime characterized by the ability to locate and capture prey in turbid river
waters, or allowed odontocetes to invade and feed at depths below the photic zone. Further studies
have found that echolocation below the photic zone could have been a predation adaptation to diel
migrating cephalopods.[53][60] Since its advent, there has been adaptive radiation especially in the
Delphinidae family (dolphins) in which echolocation has become extremely derived.[61]
One specific type of echolocation, narrow-band high frequency (NBHF) clicks, evolved at least four
times in groups of odontocetes, including pygmy sperm whale (Kogiidae) and porpoise
(Phocoenidae) families, Pontoporia blainvillei, the genus Cephalorhynchus, and part of the genus
Lagenorhynchus.[62][63] These high-frequency clicks likely evolved as an adaptation for predator
avoidance: these species inhabit areas with many killer whales, and the signals are inaudible to killer
whales due to the absence of energy below 100 kHz. Another reason for variation in echolocation
frequencies is habitat. Shallow waters, where many of these species live, tend to have more debris;
a more directional transmission reduces clutter in reception.[63]
Toothed whales emit a focused beam of high-frequency clicks in the direction that their head is
pointing. Sounds are generated by passing air from the bony nares through the phonic lips.[64]
These sounds are reflected by the dense concave bone of the cranium and an air sac at its base. The
focused beam is modulated by a large fatty organ known as the 'melon'. This acts like an acoustic
lens because it is composed of lipids of differing densities. Most toothed whales use clicks in a
series, or click train, for echolocation, while the sperm whale may produce clicks individually.
Toothed whale whistles do not appear to be used in echolocation. Different rates of click production
in a click train give rise to the familiar barks, squeals and growls of the bottlenose dolphin. A click
train with a repetition rate over 600 per second is called a burst pulse. In bottlenose dolphins, the
auditory brain response resolves individual clicks up to 600 per second, but yields a graded
response for higher repetition rates.
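As an illustrative aside (not a claim from the text), the 600 clicks-per-second figure implies a click spacing and, by the same out-and-back logic used for bat pulse intervals, a maximum unambiguous echo range in water. The seawater sound speed is an assumption:

```python
# What a 600 clicks/s burst pulse implies for click spacing and for the
# farthest target whose echo returns before the next click is emitted.
# Speed of sound in seawater assumed ~1500 m/s.

SPEED_OF_SOUND_WATER = 1500.0  # m/s

def interclick_interval_s(clicks_per_s):
    """Time between successive clicks in a click train."""
    return 1.0 / clicks_per_s

def unambiguous_range_m(clicks_per_s, c=SPEED_OF_SOUND_WATER):
    """Farthest target whose echo returns before the next click is emitted."""
    return c * interclick_interval_s(clicks_per_s) / 2.0

interval = interclick_interval_s(600.0)  # ~1.7 ms between clicks
range_m = unambiguous_range_m(600.0)     # ~1.25 m
```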
It has been suggested that some smaller toothed whales may have their tooth arrangement suited to
aid in echolocation. The placement of teeth in the jaw of a bottlenose dolphin, for example, is
not symmetrical when seen from a vertical plane, and this asymmetry could possibly help the
dolphin sense whether echoes from its biosonar are coming from one side or the other.[65][66]
However, this idea lacks experimental support.
Echoes are received using complex fatty structures around the lower jaw as the primary reception
path, from where they are transmitted to the middle ear via a continuous fat body.[67][68] Lateral
sound may be received through fatty lobes surrounding the ears, which have a density similar to that of water.
Some researchers believe that when toothed whales approach the object of interest, they protect
themselves against the louder echo by quietening the emitted sound. This is known to happen in
bats, but there the hearing sensitivity is also reduced close to a target.
Before the echolocation abilities of "porpoises" were officially discovered, Jacques-Yves Cousteau
suggested that they might exist. In his first book, The Silent World (1953, pp. 206-207), he reported
that his research vessel, the Élie Monnier, was heading to the Straits of Gibraltar and noticed a group
of porpoises following them. Cousteau changed course a few degrees off the optimal course to the
center of the strait, and the porpoises followed for a few minutes, then diverged toward mid-channel
again. It was obvious that they knew where the optimal course lay, even if the humans didn't.
Cousteau concluded that the cetaceans had something like sonar, which was a relatively new feature
on submarines.
Human echolocation
Human echolocation is an ability of humans to detect objects in their environment by sensing
echoes from those objects. By actively creating sounds (for example, by tapping their canes, lightly
stomping their feet, or making clicking noises with their mouths), people trained to orient with
echolocation can interpret the sound waves reflected by nearby objects, accurately identifying their
location and size. This ability is used by some blind people for acoustic wayfinding, or navigating
within their environment using auditory rather than visual cues. It is similar in principle to active
sonar and to the echolocation employed by animals such as bats, dolphins and toothed whales.
Background
Human echolocation has been known and formally studied since at least the 1950s.[1] In earlier
times, human echolocation was sometimes described as "facial vision".[2][3][4] The field of human
and animal echolocation was surveyed in book form as early as 1959.[5] See also White et al.
(1970).[6]
Mechanics
Vision and hearing are closely related in that both can process reflected waves of energy. Vision
processes light waves as they travel from their source, bounce off surfaces throughout the
environment and enter the eyes. Similarly, the auditory system processes sound waves as they travel
from their source, bounce off surfaces and enter the ears. Both systems can extract a great deal of
information about the environment by interpreting the complex patterns of reflected energy that
they receive. In the case of sound, these waves of reflected energy are called "echoes".
Echoes and other sounds can convey spatial information that is comparable in many respects to that
conveyed by light.[7] With echoes, a blind traveler can perceive very complex, detailed, and
specific information from distances far beyond the reach of the longest cane or arm. Echoes make
information available about the nature and arrangement of objects and environmental features such
as overhangs, walls, doorways and recesses, poles, ascending curbs and steps, planter boxes,
pedestrians, fire hydrants, parked or moving vehicles, trees and other foliage, and much more.
Echoes can give detailed information about location (where objects are), dimension (how big they
are and their general shape), and density (how solid they are). Location is generally broken down
into distance from the observer and direction (left/right, front/back, high/low). Dimension refers to
the object's height (tall or short) and breadth (wide or narrow).
By understanding the interrelationships of these qualities, much can be perceived about the nature
of an object or multiple objects. For example, an object that is tall and narrow may be recognized
quickly as a pole. An object that is tall and narrow near the bottom while broad near the top would
be a tree. Something that is tall and very broad registers as a wall or building. Something that is
broad and tall in the middle, while being shorter at either end may be identified as a parked car. An
object that is low and broad may be a planter, retaining wall, or curb. And finally, something that
starts out close and very low but recedes into the distance as it gets higher is a set of steps. Density
refers to the solidity of the object (solid/sparse, hard/soft). Awareness of density adds richness and
complexity to one's available information. For instance, an object that is low and solid may be
recognized as a table, while something low and sparse sounds like a bush; but an object that is tall
and broad and very sparse is probably a fence.[8] As was shown in the documentary series
"Extraordinary People", echolocation can be used to play videogames, apparently by sensing the
depth variation in on/off pixels.[9]
In one study, researchers made recordings of the clicks and their very faint echoes using tiny
microphones placed in the ears of blind echolocators as they stood outside and tried to identify
different objects such as a car, a
flag pole, and a tree. The researchers then played the recorded sounds back to the echolocators
while their brain activity was being measured using functional magnetic resonance imaging.
Remarkably, when the echolocation recordings were played back to the blind experts, not only did
they perceive the objects based on the echoes, but they also showed activity in those areas of their
brain that normally process visual information in sighted people, primarily primary visual cortex or
V1. Most interestingly, the brain areas that process auditory information were no more activated by
sound recordings of outdoor scenes containing echoes than they were by sound recordings of
outdoor scenes with the echoes removed. Importantly, when the same experiment was carried out
with sighted people who did not echolocate, these individuals could not perceive the objects and
there was no echo-related activity anywhere in the brain.
Echo-related activity in the brain of an early-blind echolocator is shown on the left. There is no
activity evident in the brain of a sighted person (shown on the right) listening to the same echoes.
Daniel Kish now trains other blind people in the use of echolocation and in what he calls "Perceptual
Mobility".[14] Though at first resistant to using a cane for mobility, seeing it as a "handicapped"
device, and considering himself "not handicapped at all", Kish developed a technique using his
white cane combined with echolocation to further expand his mobility.[14]
Kish reports that "The sense of imagery is very rich for an experienced user. One can get a sense of
beauty or starkness or whatever - from sound as well as echo".[12] He is able to distinguish a metal
fence from a wooden one by the information returned by the echoes on the arrangement of the fence
structures; in extremely quiet conditions, he can also hear the warmer and duller quality of the
echoes from wood compared to metal.[12]
Ben Underwood
Diagnosed with retinal cancer at the age of two, American Ben Underwood had his eyes removed at
the age of three.[15]
He discovered echolocation at the age of five. He was able to detect the location of objects by
making frequent clicking noises with his tongue. His case was covered in 20/20: Medical
Mysteries.[16] He used echolocation to accomplish such feats as running, playing basketball, riding a bicycle,
rollerblading, playing foosball, and skateboarding.[17][18] Underwood's childhood eye doctor claimed
that Underwood was one of the most proficient human echolocators.
Underwood died on January 19, 2009, at the age of 16, from the same cancer that took his vision.
[19]
Tom De Witte
Tom De Witte was born in 1979 in Belgium with bilateral congenital glaucoma. It had seemed that
De Witte would become a successful flautist until he had to give up playing music in 2005. De
Witte has been completely blind since 2009 due to additional problems with his eyes. He
was taught echolocation by Daniel Kish and was given the nickname "Batman from Belgium" by
the press.[citation needed]
Lucas Murray
Lucas Murray, from Poole, Dorset, was born blind. He is believed to be one of the first British
people to learn to visualise his surroundings using echolocation, and was taught by Daniel Kish.
Kevin Warwick
The scientist Kevin Warwick experimented with feeding ultrasonic pulses into the brain (via
electrical stimulation from a neural implant) as an additional sensory input. In tests he was able to
accurately discern distance to objects and to detect small movements of those objects.[21]
Juan Ruiz
Juan Ruiz appeared in the first episode of Stan Lee's Superhumans, titled "Electro Man". He lives
in Los Angeles, California, and has been blind since birth. In the episode, he was shown to be
capable of riding a bicycle, avoiding parked cars and other obstacles, and identifying nearby
objects. He was also able to go into and out of a cave, where he determined its length and other
features.[citation
needed]
Long-term potentiation
In neuroscience, long-term potentiation (LTP) is a long-lasting enhancement in signal
transmission between two neurons that results from stimulating them synchronously.[2] It is one of
several phenomena underlying synaptic plasticity, the ability of chemical synapses to change their
strength. As memories are thought to be encoded by modification of synaptic strength,[3] LTP is
widely considered one of the major cellular mechanisms that underlies learning and memory.[2][3]
LTP was discovered in the rabbit hippocampus by Terje Lømo in 1966 and has remained a popular
subject of research since. Many modern LTP studies seek to better understand its basic biology,
while others aim to draw a causal link between LTP and behavioral learning. Still others try to
develop methods, pharmacologic or otherwise, of enhancing LTP to improve learning and memory.
LTP is also a subject of clinical research, for example, in the areas of Alzheimer's disease and
addiction medicine.
LTP is often studied in the hippocampus, an important organ for learning and memory. In such studies, electrical recordings are
made from cells and plotted in a graph. Such a graph compares the response to stimuli
in synapses that have undergone LTP versus synapses that have not undergone LTP. Synapses that
have undergone LTP tend to have stronger electrical responses to stimuli than other synapses. The
term long-term potentiation comes from the fact that this increase in synaptic strength, or
potentiation, lasts a very long time compared to other processes that affect synaptic strength.[1]
The 19th century neuroanatomist Santiago Ramón y Cajal proposed that memories might be stored
across synapses, the junctions between neurons that allow for their communication.
At the end of the 19th century, scientists generally recognized that the number of neurons in the
adult brain (roughly 100 billion[4]) did not increase significantly with age, giving neurobiologists
good reason to believe that memories were generally not the result of new neuron production.[5]
With this realization came the need to explain how memories could form in the absence of new
neurons.
The Spanish neuroanatomist Santiago Ramón y Cajal was among the first to suggest a mechanism
of learning that did not require the formation of new neurons. In his 1894 Croonian Lecture, he
proposed that memories might instead be formed by strengthening the connections between existing
neurons to improve the effectiveness of their communication.[5] Hebbian theory, introduced by
Donald Hebb in 1949, echoed Ramón y Cajal's ideas, further proposing that cells may grow new
connections or undergo metabolic changes that enhance their ability to communicate:
Let us assume that the persistence or repetition of a reverberatory activity (or "trace")
tends to induce lasting cellular changes that add to its stability.... When an axon of cell A
is near enough to excite a cell B and repeatedly or persistently takes part in firing it,
some growth process or metabolic change takes place in one or both cells such that A's
efficiency, as one of the cells firing B, is increased.[6]
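Hebb's postulate is often formalized as a correlational weight-update rule. A minimal sketch follows; the function name, learning rate, and weight values are illustrative assumptions, not anything specified by Hebb.

```python
# Minimal sketch of Hebbian learning: the weight between cells A and B grows
# whenever their activity coincides ("A's efficiency ... is increased"), and is
# left unchanged when their firing is uncorrelated.
def hebbian_update(weight, pre_active, post_active, rate=0.1):
    """Increase the synaptic weight only when pre- and postsynaptic cells fire together."""
    if pre_active and post_active:
        weight += rate
    return weight

w = 0.5
# Repeated coincident firing strengthens the synapse...
for _ in range(3):
    w = hebbian_update(w, pre_active=True, post_active=True)
# ...while presynaptic activity alone leaves it unchanged.
w = hebbian_update(w, pre_active=True, post_active=False)
print(round(w, 2))  # 0.8
```

Note that this bare rule only ever increases weights; the BCM and LTD mechanisms discussed later in the document supply the depression needed to keep such a rule stable.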
Though these theories of memory formation are now well established, they were farsighted for their
time: late 19th and early 20th century neuroscientists and psychologists were not equipped with the
neurophysiological techniques necessary for elucidating the biological underpinnings of learning in
animals. These skills would not come until the latter half of the 20th century, at about the same time
as the discovery of long-term potentiation.
Discovery
LTP was first discovered in the rabbit hippocampus. In humans, the hippocampus is located in the
medial temporal lobe. This illustration of the underside of the human brain shows the hippocampus
highlighted in red. The frontal lobe is at the top of the illustration and the occipital lobe is at the
bottom.
LTP was first observed by Terje Lømo in 1966 in the Oslo, Norway, laboratory of Per Andersen.[7]
[8] There, Lømo conducted a series of neurophysiological experiments on anesthetized rabbits to
explore the role of the hippocampus in short-term memory.
Lømo's experiments focused on connections, or synapses, from the perforant pathway to the dentate
gyrus. These experiments were carried out by stimulating presynaptic fibers of the perforant
pathway and recording responses from a collection of postsynaptic cells of the dentate gyrus. As
expected, a single pulse of electrical stimulation to fibers of the perforant pathway caused excitatory
postsynaptic potentials (EPSPs) in cells of the dentate gyrus. What Lømo unexpectedly observed
was that the postsynaptic cells' response to these single-pulse stimuli could be enhanced for a long
period of time if he first delivered a high-frequency train of stimuli to the presynaptic fibers. When
such a train of stimuli was applied, subsequent single-pulse stimuli elicited stronger, prolonged
EPSPs in the postsynaptic cell population. This phenomenon, whereby a high-frequency stimulus
could produce a long-lived enhancement in the postsynaptic cells' response to subsequent single-pulse
stimuli, was initially called "long-lasting potentiation".[9][10]
Timothy Bliss, who joined the Andersen laboratory in 1968,[7] collaborated with Lømo and in 1973
the two published the first characterization of long-lasting potentiation in the rabbit hippocampus.
[9] Bliss and Tony Gardner-Medwin published a similar report of long-lasting potentiation in the
awake animal which appeared in the same issue as the Bliss and Lømo report.[10] In 1975, Douglas
and Goddard proposed "long-term potentiation" as a new name for the phenomenon of long-lasting
potentiation.[11][12] Andersen suggested that the authors chose "long-term potentiation" perhaps
because of its easily pronounced acronym, "LTP".[13]
Newer models, whose induction
depends also on intracellular calcium in relation to NMDA receptor voltage gates, have been
developed since the 1980s and modify the traditional a priori Hebbian learning model with both
biological and experimental justification. Still others have proposed re-arranging or synchronizing
the relationship between receptor regulation, LTP, and synaptic strength.[14]
Types
Since its original discovery in the rabbit hippocampus, LTP has been observed in a variety of other
neural structures, including the cerebral cortex, cerebellum, amygdala,[15] and many others. Robert
Malenka, a prominent LTP researcher, has suggested that LTP may even occur at all excitatory
synapses in the mammalian brain.[16]
Different areas of the brain exhibit different forms of LTP. The specific type of LTP exhibited
between neurons depends on a number of factors. One such factor is the age of the organism when
LTP is observed. For example, the molecular mechanisms of LTP in the immature hippocampus
differ from those mechanisms that underlie LTP of the adult hippocampus.[17] The signalling
pathways used by a particular cell also contribute to the specific type of LTP present. For example,
some types of hippocampal LTP depend on the NMDA receptor, others may depend upon the
metabotropic glutamate receptor (mGluR), while still others depend upon another molecule
altogether.[16] The variety of signaling pathways that contribute to LTP and the wide distribution of
these various pathways in the brain are reasons that the type of LTP exhibited between neurons
depends in part upon the anatomic location in which LTP is observed. For example, LTP in the
Schaffer collateral pathway of the hippocampus is NMDA receptor-dependent, whereas LTP in the
mossy fiber pathway is NMDA receptor-independent.[18]
The pre- and postsynaptic activity required to induce LTP is another criterion by which LTP is
classified. Broadly, this allows classification of LTP into Hebbian, non-Hebbian, and anti-Hebbian
mechanisms. Borrowing its name from Hebb's postulate, summarized by the maxim that "cells that
fire together wire together," Hebbian LTP requires simultaneous pre- and postsynaptic
depolarization for its induction.[19] Non-Hebbian LTP is a type of LTP that does not require such
simultaneous depolarization of pre- and postsynaptic cells; an example of this occurs in the mossy
fiber hippocampal pathway.[20] A special case of non-Hebbian LTP, anti-Hebbian LTP explicitly
requires simultaneous presynaptic depolarization and relative postsynaptic hyperpolarization for its
induction.[21]
Owing to its predictable organization and readily inducible LTP, the CA1 hippocampus has become
the prototypical site of mammalian LTP study. In particular, NMDA receptor-dependent LTP in the
adult CA1 hippocampus is the most widely studied type of LTP,[16] and is therefore the focus of
this article.
Properties
NMDA receptor-dependent LTP exhibits several properties, including input specificity,
associativity, cooperativity, and persistence.
Input specificity
Once induced, LTP at one synapse does not spread to other synapses; rather, LTP is input-
specific. Long-term potentiation is propagated only to those synapses permitted by the rules of
associativity and cooperativity. However, the input specificity of LTP may be incomplete at
short distances.[citation needed] One model to explain the input specificity of LTP was
presented by Frey and Morris in 1997 and is called the synaptic tagging and capture
hypothesis.[citation needed]
Associativity
Associativity refers to the observation that when weak stimulation of a single pathway is
insufficient for the induction of LTP, simultaneous strong stimulation of another pathway will
induce LTP at both pathways.[citation needed]
Cooperativity
LTP can be induced either by strong tetanic stimulation of a single pathway to a synapse, or
cooperatively via the weaker stimulation of many. When one pathway into a synapse is
stimulated weakly, it produces insufficient postsynaptic depolarization to induce LTP. In
contrast, when weak stimuli are applied to many pathways that converge on a single patch of
postsynaptic membrane, the individual postsynaptic depolarizations generated may
collectively depolarize the postsynaptic cell enough to induce LTP cooperatively. Synaptic
tagging, discussed later, may be a common mechanism underlying associativity and
cooperativity. Bruce McNaughton argues that any difference between associativity and
cooperativity is strictly semantic.[22]
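The cooperative summation described above can be sketched as a toy threshold model in which simultaneous EPSPs add until a depolarization threshold for LTP induction is crossed. The threshold value and EPSP amplitudes below are made-up illustrative numbers, not measured quantities.

```python
# Toy sketch of cooperativity: weak inputs that individually fail to reach the
# LTP-induction threshold can succeed when they depolarize the cell together.
LTP_THRESHOLD_MV = 30.0  # illustrative depolarization threshold, not a measured value

def induces_ltp(epsp_amplitudes_mv):
    """Sum simultaneous EPSPs and compare the total against the induction threshold."""
    return sum(epsp_amplitudes_mv) >= LTP_THRESHOLD_MV

print(induces_ltp([5.0]))      # one weakly stimulated pathway: False
print(induces_ltp([5.0] * 8))  # eight converging weak pathways summing to 40 mV: True
print(induces_ltp([35.0]))     # one strongly tetanized pathway: True
```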
Persistence
LTP is persistent, lasting from several minutes to many months, and it is this persistence that
separates LTP from other forms of synaptic plasticity.[23]
Early phase
The early phase of LTP (E-LTP) is independent of protein synthesis.[24]
Ca2+/calmodulin-dependent protein kinase II (CaMKII) appears to be an important mediator of the
early, protein synthesis-independent phase of LTP.
Maintenance
While induction entails the transient activation of CaMKII and PKC, maintenance of E-LTP is
characterized by their persistent activation. During this stage, PKMζ (protein kinase Mζ), which does
not depend on calcium, becomes autonomously active. Consequently, these kinases are able to carry
out the phosphorylation events that underlie E-LTP expression.[25]
Expression
Phosphorylation is a chemical reaction in which a small phosphate group is added to another
molecule to change that molecule's activity. Autonomously active CaMKII and PKC use
phosphorylation to carry out the two major mechanisms underlying the expression of E-LTP. First,
and most importantly, they phosphorylate existing AMPA receptors to increase their activity.[16]
Second, they mediate or modulate the insertion of additional AMPA receptors into the postsynaptic
membrane.[16] Importantly, the delivery of AMPA receptors to the synapse during E-LTP is
independent of protein synthesis. This is achieved by having a nonsynaptic pool of AMPA receptors
adjacent to the postsynaptic membrane. When the appropriate LTP-inducing stimulus arrives,
nonsynaptic AMPA receptors are rapidly trafficked into the postsynaptic membrane under the
influence of protein kinases.[26] As mentioned previously, AMPA receptors are the brain's most
abundant glutamate receptors and mediate the majority of its excitatory activity. By increasing the
efficiency and number of AMPA receptors at the synapse, future excitatory stimuli generate larger
postsynaptic responses.
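The two expression mechanisms above can be condensed into a toy calculation in which the postsynaptic response scales with both the number of synaptic AMPA receptors and their per-receptor efficacy. The receptor counts and conductance values below are illustrative assumptions only.

```python
# Toy sketch of E-LTP expression: the response to glutamate scales with both
# the number of AMPA receptors at the synapse and each receptor's activity.
def postsynaptic_response(n_ampa_receptors, conductance_per_receptor):
    return n_ampa_receptors * conductance_per_receptor

baseline = postsynaptic_response(n_ampa_receptors=100, conductance_per_receptor=1.0)
# E-LTP: phosphorylation boosts each receptor's activity, and extra receptors
# are trafficked in from the nonsynaptic pool (no new protein synthesis needed).
potentiated = postsynaptic_response(n_ampa_receptors=130, conductance_per_receptor=1.5)
print(baseline, potentiated)  # 100.0 195.0
```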
While the above model of E-LTP describes entirely postsynaptic mechanisms for induction,
maintenance, and expression, an additional component of expression may occur presynaptically.
[27] One hypothesis of this presynaptic facilitation is that persistent CaMKII activity in the
postsynaptic cell during E-LTP may lead to the synthesis of a "retrograde messenger", discussed
later. According to this hypothesis, the newly synthesized messenger travels across the synaptic
cleft from the postsynaptic to the presynaptic cell, leading to a chain of events that facilitate the
presynaptic response to subsequent stimuli. Such events may include an increase in
neurotransmitter vesicle number, probability of vesicle release, or both. In addition to the retrograde
messenger underlying presynaptic expression in early LTP, the retrograde messenger may also play
a role in the expression of late LTP.
Late phase
The early and late phases of LTP are thought to communicate via the extracellular signal-regulated
kinase (ERK).[24]
Late LTP is the natural extension of E-LTP. Unlike E-LTP, which is independent of protein
synthesis, L-LTP requires gene transcription[28] and protein synthesis[29] in the postsynaptic cell.
Two phases of L-LTP exist: the first depends upon protein synthesis, while the second depends upon
both gene transcription and protein synthesis.[24] These phases are occasionally called LTP2 and
LTP3, respectively, with E-LTP referred to as LTP1 under this nomenclature.
Induction
Late LTP is induced by changes in gene expression and protein synthesis brought about by the
persistent activation of protein kinases activated during E-LTP, such as MAPK.[25][24][30] In fact,
MAPK, specifically the extracellular signal-regulated kinase (ERK) subfamily of MAPKs, may
be the molecular link between E-LTP and L-LTP, since many signaling cascades involved in E-LTP,
including CaMKII and PKC, can converge on ERK.[30] Recent research has shown that the
induction of L-LTP can depend on coincident molecular events, namely PKA activation and calcium
influx, that converge on CRTC1 (TORC1), a potent transcriptional coactivator for cAMP response
element binding protein (CREB).[31] This requirement for a molecular coincidence accounts
perfectly for the associative nature of LTP, and, presumably, for that of learning.
Maintenance
Upon activation, ERK may phosphorylate a number of cytoplasmic and nuclear molecules that
ultimately result in the protein synthesis and morphological changes observed in L-LTP.[24] These
cytoplasmic and nuclear molecules may include transcription factors such as CREB.[25] ERK-mediated
changes in transcription factor activity may trigger the synthesis of proteins that underlie
the maintenance of L-LTP. One such molecule may be protein kinase Mζ (PKMζ), a persistently
active kinase whose synthesis increases following LTP induction.[32][33] PKMζ is an atypical
isoform of PKC that lacks a regulatory subunit and thus remains constitutively active.[32] Unlike
other kinases that mediate LTP, PKMζ is active not just in the first 30 minutes following LTP
induction; rather, PKMζ becomes a requirement for LTP maintenance only during the late phase of
LTP.[32] PKMζ thus appears important for the persistence of memory and would be expected to be
important in the maintenance of long-term memory. Indeed, administration of a PKMζ inhibitor into
the hippocampus of the rat results in retrograde amnesia with intact short-term memory; PKMζ does
not play a role in the establishment of short-term memory.[33] PKMζ has recently been shown to
underlie L-LTP maintenance[32][33] by directing the trafficking and reorganization of proteins in
the synaptic scaffolding that underlie the expression of L-LTP.[32]
Expression
Aside from PKMζ, the identities of only a few proteins synthesized during L-LTP are known.
Regardless of their identities, it is thought that they contribute to the increase in dendritic spine
number, surface area, and postsynaptic sensitivity to neurotransmitter associated with L-LTP
expression.[24] The latter may be brought about in part by the enhanced synthesis of AMPA
receptors during L-LTP.[24] Late LTP is also associated with the presynaptic synthesis of
synaptotagmin and an increase in synaptic vesicle number, suggesting that L-LTP induces protein
synthesis not only in postsynaptic cells, but in presynaptic cells as well.[24] As mentioned
previously, for postsynaptic LTP induction to result in presynaptic protein synthesis, there must be
communication from the postsynaptic to the presynaptic cell. This may occur via the synthesis of a
retrograde messenger, discussed later.
Even in studies restricted to postsynaptic events, investigators have not determined the location of
the protein synthesis that underlies L-LTP. Specifically, it is unclear whether protein synthesis takes
place in the postsynaptic cell body or in its dendrites.[30] Despite having observed ribosomes (the
major components of the protein synthesis machinery) in dendrites as early as the 1960s, prevailing
wisdom was that the cell body was the predominant site of protein synthesis in neurons.[30] This
reasoning was not seriously challenged until the 1980s, when investigators reported observing
protein synthesis in dendrites whose connection to their cell body had been severed.[30] More
recently, investigators have demonstrated that this type of local protein synthesis is necessary for
some types of LTP.[34][35]
One reason for the popularity of the local protein synthesis hypothesis is that it provides a possible
mechanism for the specificity associated with LTP.[30] Specifically, if indeed local protein
synthesis underlies L-LTP, only dendritic spines receiving LTP-inducing stimuli will undergo LTP;
the potentiation will not be propagated to adjacent synapses. By contrast, global protein synthesis
that occurs in the cell body requires that proteins be shipped out to every area of the cell, including
synapses that have not received LTP-inducing stimuli. Whereas local protein synthesis provides a
mechanism for specificity, global protein synthesis would seem to directly compromise it. However,
as discussed later, the synaptic tagging hypothesis successfully reconciles global protein synthesis,
synapse specificity, and associativity.
Retrograde signaling
Main article: Retrograde signaling in LTP
Retrograde signaling is a hypothesis that attempts to explain that, while LTP is induced and
expressed postsynaptically, some evidence suggests that it is expressed presynaptically as well.[16]
[27][36] The hypothesis gets its name because normal synaptic transmission is directional and
proceeds from the presynaptic to the postsynaptic cell. For induction to occur postsynaptically and
be partially expressed presynaptically, a message must travel from the postsynaptic cell to the
presynaptic cell in a retrograde (reverse) direction. Once there, the message presumably initiates a
cascade of events that leads to a presynaptic component of expression, such as the increased
probability of neurotransmitter vesicle release.[37]
Retrograde signaling is currently a contentious subject as some investigators do not believe the
presynaptic cell contributes at all to the expression of LTP.[16] Even among proponents of the
hypothesis there is controversy over the identity of the messenger. Early thoughts focused on nitric
oxide, while most recent evidence points to cell adhesion proteins.[16]
Synaptic tagging
Before the local protein synthesis hypothesis gained significant support, there was general
agreement that the protein synthesis underlying L-LTP occurred in the cell body. Further, it was
thought that the products of this synthesis were shipped cell-wide in a nonspecific manner. It thus
became necessary to explain how protein synthesis could occur in the cell body without
compromising LTP's input specificity. The synaptic tagging hypothesis attempts to solve the cell's
difficult problem of synthesizing proteins in the cell body but ensuring they only reach synapses
that have received LTP-inducing stimuli.
The synaptic tagging hypothesis proposes that a "synaptic tag" is synthesized at synapses that have
received LTP-inducing stimuli, and that this synaptic tag may serve to capture plasticity-related
proteins shipped cell-wide from the cell body.[38] Studies of LTP in the marine snail Aplysia
californica have implicated synaptic tagging as a mechanism for the input-specificity of LTP.[39]
[40] There is some evidence that given two widely separated synapses, an LTP-inducing stimulus at
one synapse drives several signaling cascades (described previously) that initiates gene expression
in the cell nucleus. At the same synapse (but not the unstimulated synapse), local protein synthesis
creates a short-lived (less than three hours) synaptic tag. The products of gene expression are
shipped globally throughout the cell, but are only captured by synapses that express the synaptic
tag. Thus only the synapse receiving LTP-inducing stimuli is potentiated, demonstrating LTP's input
specificity.
The synaptic tag hypothesis may also account for LTP's associativity and cooperativity.
Associativity (see Properties) is observed when one synapse is excited with LTP-inducing
stimulation while a separate synapse is only weakly stimulated. Whereas one might expect only the
strongly stimulated synapse to undergo LTP (since weak stimulation alone is insufficient to induce
LTP at either synapse), both synapses will in fact undergo LTP. While weak stimuli are unable to
induce protein synthesis in the cell body, they may prompt the synthesis of a synaptic tag.
Simultaneous strong stimulation of a separate pathway, capable of inducing cell body protein
synthesis, then may prompt the production of plasticity-related proteins, which are shipped cell-wide.
With both synapses expressing the synaptic tag, both would capture the protein products,
resulting in the expression of LTP in both the strongly stimulated and weakly stimulated pathways.
Cooperativity is observed when two synapses are activated by weak stimuli incapable of inducing
LTP when stimulated individually. But upon simultaneous weak stimulation, both synapses undergo
LTP in a cooperative fashion. Synaptic tagging does not explain how multiple weak stimuli can
result in a collective stimulus sufficient to induce LTP (this is explained by the postsynaptic
summation of EPSPs described previously). Rather, synaptic tagging explains the ability of weakly
stimulated synapses, none of which are capable of independently generating LTP, to receive the
products of protein synthesis initiated collectively. As before, this may be accomplished through the
synthesis of a local synaptic tag following weak synaptic stimulation.
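The tag-and-capture logic described in this section can be expressed as a toy simulation. The function, synapse names, and stimulation categories below are illustrative assumptions used to make the logic concrete, not quantities or APIs from the literature.

```python
# Toy sketch of synaptic tagging and capture: any stimulated synapse sets a
# local tag, but only strong stimulation triggers cell-body protein synthesis;
# the proteins are shipped cell-wide yet captured solely by tagged synapses,
# preserving input specificity while allowing associativity.
def tag_and_capture(stimulation):
    """stimulation maps synapse name -> 'strong', 'weak', or 'none'.
    Returns which synapses end up potentiated (tagged AND proteins available)."""
    tagged = {name for name, s in stimulation.items() if s in ("strong", "weak")}
    proteins_made = any(s == "strong" for s in stimulation.values())
    return {name: (name in tagged and proteins_made) for name in stimulation}

# Associativity: the weakly stimulated synapse s2 is potentiated because the
# strongly stimulated pathway s1 supplies the plasticity-related proteins;
# the unstimulated synapse s3 has no tag and captures nothing.
print(tag_and_capture({"s1": "strong", "s2": "weak", "s3": "none"}))
# Weak stimulation alone sets a tag but triggers no protein synthesis: no LTP.
print(tag_and_capture({"s1": "weak", "s2": "none"}))
```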
Modulation
Proposed modulators of LTP[25]

Modulator                         Target
β-Adrenergic receptor             cAMP, MAPK amplification
Nitric oxide synthase             Guanylyl cyclase, PKG, NMDAR
Dopamine receptor                 cAMP, MAPK amplification
Metabotropic glutamate receptor   PKC, MAPK amplification
As described previously, the molecules that underlie LTP can be classified as mediators or
modulators. A mediator of LTP is a molecule, such as the NMDA receptor or calcium, whose
presence and activity is necessary for generating LTP under nearly all conditions. By contrast, a
modulator is a molecule that can alter LTP but is not essential for its generation or expression.[16]
In addition to the signaling pathways described above, hippocampal LTP may be altered by a
variety of modulators. For example, the steroid hormone estradiol may enhance LTP by driving
CREB phosphorylation and subsequent dendritic spine growth.[41] Additionally, β-adrenergic
receptor agonists such as norepinephrine may alter the protein synthesis-dependent late phase of
LTP.[42] Nitric oxide synthase activity may also result in the subsequent activation of guanylyl
cyclase and PKG.[43] Similarly, activation of dopamine receptors may enhance LTP through the
cAMP/PKA signaling pathway.[44][45]
Spatial memory
The Morris water maze task has been used to demonstrate the necessity of NMDA receptors in
establishing spatial memories.
In 1986, Richard Morris provided some of the first evidence that LTP was indeed required for the
formation of memories in vivo.
Inhibitory avoidance
In 2006, Jonathan Whitlock and colleagues reported on a series of experiments that provided
perhaps the strongest evidence of LTP's role in behavioral memory, arguing that to conclude that
LTP underlies behavioral learning, the two processes must both mimic and occlude one another.[48]
Employing an inhibitory avoidance learning paradigm, researchers trained rats in a two-chambered
apparatus with light and dark chambers, the latter being fitted with a device that delivered a foot
shock to the rat upon entry. An analysis of CA1 hippocampal synapses revealed that inhibitory
avoidance training induced in vivo AMPA receptor phosphorylation of the same type as that seen in
LTP in vitro; that is, inhibitory avoidance training mimicked LTP. In addition, synapses potentiated
during training could not be further potentiated by experimental manipulations that would have
otherwise induced LTP; that is, inhibitory avoidance training occluded LTP. In a response to the
article, Timothy Bliss and colleagues remarked that these and related experiments "substantially
advance the case for LTP as a neural mechanism for memory."[49]
Clinical significance
The role of LTP in disease is less clear than its role in basic mechanisms of synaptic plasticity.
However, alterations in LTP may contribute to a number of neurological diseases, including
depression, Parkinson's disease, epilepsy, and neuropathic pain.[50] Impaired LTP may also have a
role in Alzheimer's disease and drug addiction.
Alzheimer's disease
Misprocessing of amyloid precursor protein (APP) in Alzheimer's disease disrupts LTP and is
thought to lead to early cognitive decline in individuals with the disease.[51]
LTP has received much attention among those who study Alzheimer's disease (AD), a
neurodegenerative disease that causes marked cognitive decline and dementia. Much of this
deterioration occurs in association with degenerative changes in the hippocampus and other medial
temporal lobe structures. Because of the hippocampus' well established role in LTP, some have
suggested that the cognitive decline seen in individuals with AD may result from impaired LTP.
In a 2003 review of the literature, Rowan et al. proposed one model for how LTP might be affected
in AD.[51] AD appears to result, at least in part, from misprocessing of amyloid precursor protein
(APP). The result of this abnormal processing is the accumulation of fragments of this protein,
called amyloid beta (Aβ). Aβ exists in both soluble and fibrillar forms. Misprocessing of APP results in
the accumulation of soluble Aβ that, according to Rowan's hypothesis, impairs hippocampal LTP
and may lead to the cognitive decline seen early in AD.
AD may also impair LTP through mechanisms distinct from Aβ. For example, one study
demonstrated that the enzyme PKMζ accumulates in neurofibrillary tangles, which are a pathologic
marker of AD. PKMζ is an enzyme with critical importance in the maintenance of late LTP.[52]
Drug addiction
Research in the field of addiction medicine has also recently turned its focus to LTP, owing to the
hypothesis that drug addiction represents a powerful form of learning and memory.[53] Addiction is
a complex neurobehavioral phenomenon involving various parts of the brain, such as the ventral
tegmental area (VTA) and nucleus accumbens (NAc). Studies have demonstrated that VTA and
NAc synapses are capable of undergoing LTP[53] and that this LTP may be responsible for the
behaviors that characterize addiction.[54]
Long-term depression
Not to be confused with chronic depression, a mental disorder, or the Long Depression, an
economic recession.
Long-term depression (LTD), in neurophysiology, is an activity-dependent reduction in the
efficacy of neuronal synapses lasting hours or longer following a long patterned stimulus. LTD
occurs in many areas of the CNS with varying mechanisms depending upon brain region and
developmental progress.[1] LTD in the hippocampus and cerebellum have been the best
characterized, but there are other brain areas in which mechanisms of LTD are understood.[1] LTD
has also been found to occur in different types of neurons that release various neurotransmitters;
however, the most common neurotransmitter involved in LTD is L-glutamate. L-glutamate acts on
the N-methyl-D-aspartate receptors (NMDARs), α-amino-3-hydroxy-5-methylisoxazole-4-propionic
acid receptors (AMPARs), kainate receptors (KARs), and metabotropic glutamate receptors
(mGluRs) during LTD. It can result from strong synaptic stimulation (as occurs in the cerebellar
Purkinje cells) or from persistent weak synaptic stimulation (as in the hippocampus). Long-term
potentiation (LTP) is the opposing process to LTD; it is the long-lasting increase of synaptic
strength. In conjunction, LTD and LTP are factors affecting neuronal synaptic plasticity. LTD is
thought to result mainly from a decrease in postsynaptic receptor density, although a decrease in
presynaptic neurotransmitter release may also play a role. Cerebellar LTD has been hypothesized to
be important for motor learning. However, it is likely that other plasticity mechanisms play a role as
well. Hippocampal LTD may be important for the clearing of old memory traces.[2][3]
Hippocampal/cortical LTD can be dependent on NMDA receptors, metabotropic glutamate
receptors (mGluR), or endocannabinoids.[4] The result of the molecular mechanism underlying
LTD is the phosphorylation of AMPA glutamate receptors and their elimination from the
surface of the parallel fiber-Purkinje cell (PF-PC) synapse.[5]
LTD is one of several processes that serves to selectively weaken specific synapses in order to make
constructive use of synaptic strengthening caused by LTP. This is necessary because, if allowed to
continue increasing in strength, synapses would ultimately reach a ceiling level of efficiency, which
would inhibit the encoding of new information.[6]
Neural homeostasis
It is highly important for neurons to maintain a variable range of neuronal output. If synapses were
only reinforced by positive feedback, they would eventually be driven to complete
inactivity or excessive activity. To prevent neurons from becoming static, there are two regulatory
forms of plasticity that provide negative feedback: metaplasticity and scaling.[7] Metaplasticity is
expressed as a change in the capacity to provoke subsequent synaptic plasticity, including LTD and
LTP.[8] The Bienenstock, Cooper and Munro model (BCM model) proposes that a certain threshold
exists such that a level of postsynaptic response below the threshold leads to LTD and above it leads
to LTP. BCM theory further proposes that the level of this threshold depends upon the average
amount of postsynaptic activity.[9] Scaling has been found to occur when the strength of all of a
neuron's excitatory inputs is scaled up or down.[10] LTD and LTP coincide with metaplasticity
and synaptic scaling to maintain proper neuronal network function.
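The BCM threshold idea can be sketched numerically. The sliding-threshold rule below (with the modification threshold taken to grow as the square of the average postsynaptic activity, as in simple statements of BCM theory) is a simplified illustration with made-up activity values, not a fit to data.

```python
# Sketch of the BCM model: postsynaptic activity below the modification
# threshold yields LTD, activity above it yields LTP, and the threshold itself
# slides upward as the recent average of postsynaptic activity rises.
def bcm_plasticity(post_activity, avg_activity):
    """Return 'LTP', 'LTD', or 'none' for a given level of postsynaptic response."""
    theta = avg_activity ** 2  # sliding modification threshold
    if post_activity == 0 or post_activity == theta:
        return "none"
    return "LTP" if post_activity > theta else "LTD"

# The same response can potentiate or depress, depending on recent history:
print(bcm_plasticity(post_activity=2.0, avg_activity=1.0))  # threshold 1.0 -> LTP
print(bcm_plasticity(post_activity=2.0, avg_activity=2.0))  # threshold 4.0 -> LTD
```

The sliding threshold is what provides the negative feedback described above: a persistently active neuron raises its own bar for further potentiation, preventing runaway strengthening.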
Cerebellum
LTD occurs at synapses in cerebellar Purkinje neurons, which receive two forms of excitatory input,
one from a singleclimbing fiber and one from hundreds of thousands of parallel fibers. LTD
decreases the efficacy of parallel fiber synapse transmission, though, according to recent findings, it
also impairs climbing fiber synapse transmission.[6] Both parallel fibers and climbing fibers must
be simultaneously activated for LTD to occur. Calcium release is greatest, however, when the
parallel fibers are activated a few hundred milliseconds before the climbing fibers. In one pathway,
parallel fiber terminals release glutamate to activate AMPA and metabotropic glutamate receptors in
the postsynaptic Purkinje cell. When glutamate binds to the AMPA receptor, the membrane
depolarizes. Glutamate binding to the metabotropic receptor activates phospholipase C (PLC) and
produces diacylglycerol (DAG) and inositol triphosphate (IP3) second messengers. In the pathway
initiated by activation of climbing fibers, calcium enters the postsynaptic cell through voltage-gated
ion channels, raising intracellular calcium levels. Together, DAG and IP3 augment the calcium
concentration rise: IP3 binds IP3-sensitive receptors, triggering release of calcium from
intracellular stores, while calcium and DAG jointly activate protein kinase C (PKC).
PKC phosphorylates AMPA receptors, causing receptor internalization as is seen in hippocampal
LTD. With the loss of AMPA receptors, the postsynaptic Purkinje cell response to glutamate release
from parallel fibers is depressed.[6] Calcium signaling in the cerebellum is a critical mechanism
in long-term depression: parallel fiber terminals and climbing fibers work together in a
positive feedback loop to evoke a large calcium release.[15]
Ca2+ involvement
Further research has clarified calcium's role in long-term depression induction. While other
mechanisms of long-term depression are still being investigated, calcium's role in LTD is a defined
and well-understood mechanism. A high calcium concentration in the postsynaptic Purkinje cell is
necessary for the induction of long-term depression. Several sources of calcium signaling elicit
LTD: the climbing fibers and parallel fibers that converge onto Purkinje cells.
Calcium signaling in the postsynaptic cell involves both spatial and temporal overlap of climbing
fiber-induced calcium release into the dendrites and parallel fiber-induced, mGluR- and IP3-
mediated calcium release. In the climbing fiber pathway, AMPAR-mediated depolarization induces
a regenerative action potential, generated by voltage-gated calcium channels, that spreads to the
dendrites. Paired with PF-mediated mGluR1 activation, this results in LTD induction.[16] In the
parallel fiber pathway, glutamate receptors are activated by sustained parallel fiber activity, which
indirectly induces IP3 to bind to its receptor (IP3R) and trigger calcium release from intracellular
stores. In calcium induction, a positive feedback loop regenerates calcium for long-term
depression. Climbing and parallel fibers must be activated together to depolarize the Purkinje cells
while activating mGluR1s.[17] Timing of CF and PF activity is also critical: calcium release is
greater when PF activation precedes CF activity by a few hundred milliseconds.[15]
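The coincidence and timing requirements above can be caricatured as a toy predicate. This is illustrative only: the 0-300 ms window is an assumption standing in for "a few hundred milliseconds", not a measured value:

```python
# Toy coincidence detector for cerebellar LTD induction (illustrative only).
# LTD requires PF and CF co-activation, with maximal calcium release when
# the PF input leads the CF input by up to a few hundred milliseconds.

def cerebellar_ltd(pf_time_ms, cf_time_ms, window_ms=300):
    """Return True if the PF/CF timing falls in the assumed LTD window."""
    lead = cf_time_ms - pf_time_ms      # how long PF activity precedes CF activity
    return 0 <= lead <= window_ms

assert cerebellar_ltd(0, 150)           # PF leads CF by 150 ms: LTD induced
assert not cerebellar_ltd(150, 0)       # CF fires first: no LTD
assert not cerebellar_ltd(0, 900)       # inputs too far apart: no coincidence
```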
AMPAR phosphorylation
A series of signaling cascades, the MAPK cascade, plays a critical role in cerebellar LTD. The
MAPK cascade is important in information processing within neurons and various other cell types.
The cascade comprises MAPKKK, MAPKK, and MAPK, phosphorylated sequentially: MAPKKK
dually phosphorylates MAPKK, which in turn dually phosphorylates MAPK. Simultaneous input of
signals from PFs and CFs increases DAG and Ca2+ in Purkinje dendritic spines. Calcium and DAG
activate conventional PKC (cPKC), which then activates MAPKKK and the rest of the MAPK
cascade. Activated MAPK and Ca2+ activate PLA2, AA, and cPKC, creating a positive feedback
loop. The induced cPKC phosphorylates AMPA receptors, which are eventually removed from the
postsynaptic membrane via endocytosis. The timescale for this process is approximately 40
minutes. Overall, the magnitude of the LTD correlates with AMPAR phosphorylation.[5]
Striatum
The mechanisms of LTD differ in the two subregions of the striatum.[1] LTD is induced at
corticostriatal medium spiny neuron synapses in the dorsal striatum by a high frequency stimulus
coupled with postsynaptic depolarization, coactivation of dopamine D1 and D2 receptors and group
I mGlu receptors, lack of NMDA receptor activation, and endocannabinoid activation.[1]
In the prelimbic cortex of the striatum, three forms of LTD have been established.[1] The
mechanism of the first is similar to CA1-LTD: a low frequency stimulus induces LTD by activation
of NMDA receptors, with postsynaptic depolarization and increased postsynaptic calcium influx.[1]
The second is initiated by a high frequency stimulus and is mediated by presynaptic mGlu receptor
2 or 3, resulting in a long term reduction in the involvement of P/Q-type calcium channels in
glutamate release.[1] The third form of LTD requires endocannabinoids, activation of mGlu
receptors, and repetitive stimulation of glutamatergic fibers (13 Hz for ten minutes) and results in a
long term decrease in presynaptic glutamate release.[1] It is proposed that LTD in GABAergic
striatal neurons leads to a long term decrease in inhibitory effects on the basal ganglia, influencing
the storage of motor skills.[1]
Visual cortex
Long-term depression has also been observed in the visual cortex, and it is proposed to be involved
in ocular dominance.[1] Recurring low-frequency stimulation of layer IV of the visual cortex or the
white matter of the visual cortex causes LTD in layer III.[18] In this form of LTD, low-frequency
stimulation of one pathway results in LTD only for that input, making it homosynaptic.[18] This
type of LTD is similar to that found in the hippocampus, because it is triggered by a small elevation
in postsynaptic calcium ions and activation of phosphatases.[18] LTD has also been found to occur
in this fashion in layer II.[1] A different mechanism is at work in the LTD that occurs in layer V. In
layer V, LTD requires low frequency stimulation, endocannabinoid signaling, and activation of
presynaptic NR2B-containing NMDA receptors.[1]
It has been found that paired-pulse stimulation (PPS) induces a form of homosynaptic LTD in the
superficial layers of the visual cortex when the synapse is exposed to carbachol (CCh) and
norepinephrine (NE).[19]
The magnitude of this LTD is comparable to that which results from low frequency stimulation, but
with fewer stimulation pulses (40 PPS for 900 low frequency stimulations).[19] It is suggested that
the effect of NE is to control the gain of NMDA receptor-dependent homosynaptic LTD.[19] Like
norepinephrine, acetylcholine is proposed to control the gain of NMDA receptor-dependent
homosynaptic LTD, but it is likely to be a promoter of additional LTD mechanisms as well.[19]
Prefrontal cortex
The neurotransmitter serotonin is involved in LTD induction in the prefrontal cortex (PFC). The
serotonin system in the PFC plays an important role in regulating cognition and emotion. Serotonin,
in cooperation with a group I metabotropic glutamate receptor (mGluR) agonist, facilitates LTD
induction through augmentation of AMPA receptor internalization. This mechanism possibly
underlies serotonin's role in the control of cognitive and emotional processes that synaptic plasticity
in PFC neurons mediates.[20]
Perirhinal cortex
Computational models predict that LTD creates a gain in recognition memory storage capacity over
that of LTP in the perirhinal cortex, and this prediction is confirmed by neurotransmitter receptor
blocking experiments.[1] It is proposed that there are multiple memory mechanisms in the
perirhinal cortex.[1] The exact mechanisms are not completely understood; however, pieces of
them have been deciphered. Studies suggest that one perirhinal cortex LTD mechanism
involves NMDA receptors and group I and II mGlu receptors 24 hours after the stimulus.[1] The other
LTD mechanism involves acetylcholine receptors and kainate receptors at a much earlier time,
about 20 to 30 minutes after stimulus.[1]
Role of endocannabinoids
Endocannabinoids affect long-lasting plasticity processes in various parts of the brain, serving both
as regulators of pathways and necessary retrograde messengers in specific forms of LTD. In regard
to retrograde signaling, endocannabinoid receptors (CB1) function widely throughout the brain in
presynaptic inhibition. Endocannabinoid retrograde signaling has been shown to mediate LTD at
corticostriatal synapses and glutamatergic synapses in the prelimbic cortex of the nucleus
accumbens (NAc), and it is also involved in spike-timing-dependent LTD in the visual cortex.
Endocannabinoids are implicated in LTD of inhibitory inputs (LTDi) within the basolateral nucleus
of the amygdala (BLA) as well as in the stratum radiatum of the hippocampus. Additionally,
endocannabinoids play an important role in regulating various forms of synaptic plasticity. They are
involved in inhibition of LTD at parallel fiber Purkinje neuron synapses in the cerebellum and
NMDA receptor-dependent LTD in the hippocampus.[21]
Spike-timing-dependent plasticity
Spike-timing-dependent plasticity (STDP) refers to the dependence of synaptic change on the
relative timing of presynaptic and postsynaptic spikes, which can produce either LTP or LTD.
LTD occurs when postsynaptic spikes precede presynaptic spikes by up to 20-50
ms.[22] Whole-cell patch clamp experiments "in vivo" indicate that post-leading-pre spike delays
elicit synaptic depression.[22] LTP is induced when neurotransmitter release occurs 5-15 ms before
a back-propagating action potential, whereas LTD is induced when the stimulus occurs 5-15 ms
after the back-propagating action potential.[23] There is a plasticity window: if the presynaptic and
postsynaptic spikes are too far apart (i.e., more than 15 ms apart), there is little chance of plasticity.
[24] The possible window for LTD is wider than that for LTP[25] although it is important to note
that this threshold depends on synaptic history.
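The asymmetric timing rule described above can be sketched as a simple window function. The exponential form, amplitudes, and time constants here are illustrative assumptions chosen to reflect the timings in the text, not measured constants:

```python
import math

# Illustrative asymmetric STDP window.
A_LTP, TAU_LTP = 1.0, 15.0   # pre-before-post: potentiation, ~5-15 ms window
A_LTD, TAU_LTD = 0.5, 30.0   # post-before-pre: depression, wider ~20-50 ms window

def stdp(dt_ms):
    """Weight change for dt = t_post - t_pre (milliseconds)."""
    if dt_ms > 0:                                # pre leads post -> LTP
        return A_LTP * math.exp(-dt_ms / TAU_LTP)
    else:                                        # post leads pre -> LTD
        return -A_LTD * math.exp(dt_ms / TAU_LTD)

assert stdp(10) > 0        # presynaptic spike 10 ms before the bAP: LTP
assert stdp(-10) < 0       # postsynaptic spike 10 ms before the EPSP: LTD
# The depression window is wider: at |dt| = 40 ms, depression is still
# appreciable while potentiation has decayed almost completely.
assert abs(stdp(-40)) > abs(stdp(40))
```

The longer time constant on the depression side captures the observation that the window for LTD is wider than that for LTP; in reality, as the text notes, the threshold also depends on synaptic history.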
When postsynaptic action potential firing occurs prior to presynaptic afferent firing, both
presynaptic endocannabinoid (CB1) receptors and NMDA receptors are stimulated at the same time.
Postsynaptic spiking alleviates the Mg2+ block on NMDA receptors. The postsynaptic
depolarization will subside by the time an EPSP occurs, enabling Mg2+ to return to its inhibitory
binding site. Thus, the influx of Ca2+ in the postsynaptic cell is reduced. CB1 receptors detect
postsynaptic activity levels via retrograde endocannabinoid release.[26]
STDP selectively enhances and consolidates specific synaptic modifications (signals), while
depressing global ones (noise). This results in a sharpened signal-to-noise ratio in human cortical
networks that facilitates the detection of relevant signals during information processing in humans.
[27]
LTP appears to prioritize the encoding of new space, while information about orientation in space
could be encoded by LTD in the dentate gyrus and the finer details of space by LTD in
CA1.[33]
Current research
Research on the role of LTD in neurological disorders such as Alzheimer's disease (AD) is ongoing.
It has been suggested that a reduction in NMDAR-dependent LTD may be due to changes not only
in postsynaptic AMPARs but also in NMDARs, and these changes are perhaps present in early and
mild forms of Alzheimer-type dementia.[36]
Additionally, researchers have recently discovered a new mechanism (which involves LTD) linking
soluble amyloid beta protein (Aβ) with the synaptic injury and memory loss related to AD. While
Aβ's role in LTD regulation is not clearly understood, it has been found that soluble Aβ
facilitates hippocampal LTD through a decrease in glutamate recycling at hippocampal
synapses. Excess glutamate is a proposed contributor to the progressive neuronal loss involved in
AD. Evidence that soluble Aβ enhances LTD through a mechanism involving altered glutamate
uptake at hippocampal synapses has important implications for the initiation of synaptic failure in
AD and in types of age-related Aβ accumulation. This research provides a novel understanding of
the development of AD and proposes potential therapeutic targets for the disease. Further research
is needed to understand how soluble amyloid beta protein specifically interferes with glutamate
transporters.[37]
The mechanism of long-term depression has been well characterized in limited parts of the brain.
However, the way in which LTD affects motor learning and memory is still not well understood.
Determining this relationship is presently one of the major focuses of LTD research.
Neurodegeneration
Research on neurodegenerative diseases remains inconclusive about the precise mechanisms that
trigger degeneration in the brain. New evidence demonstrates similarities between the apoptotic
pathway and LTD, which involves the phosphorylation/activation of GSK3. NMDAR-LTD(A)
contributes to the elimination of excess synapses during development. This process is
downregulated after synapses have stabilized, and is regulated by GSK3. During
neurodegeneration, there is the possibility that GSK3 becomes dysregulated, resulting in excessive
synaptic pruning. Excess removal of synapses would constitute an early sign of neurodegeneration
and a link between apoptosis and neurodegenerative diseases.[38]