Our perception of musical timing is closely linked to the quality of the sound

The brain does not necessarily perceive the sounds in music simultaneously as they are being played. New research sheds light on musicians’ implicit knowledge of sound and timing.

“It is very important for our overall impression of music that the details are right,” says musicologist Anne Danielsen at the RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion at the University of Oslo.

Together with her research colleague Guilherme Schmidt Câmara, she is looking for answers to what these details are. They know there are some basic rules relating to sound and timing which most creators of music comply with.

Few, however, are aware of what they actually do in order to make it sound right.

“When we talk to musicians and producers, it becomes clear that they simply adjust sounds automatically in order to get the right timing – it’s a form of implicit knowledge,” says Câmara.

In order to make this knowledge more explicit, the researchers have studied the factors that influence when we perceive a sound happening.

They have found a pattern: our perception of timing is closely related to the quality of the sound – whether it is soft or sharp, short or long, wobbly or steady.

When does a sound happen?

Timing the sounds of all the instruments so that the music sounds good is essential, but the different notes are not necessarily played at the moment you hear them.

“Scientists have previously assumed that we perceive the timing at the beginning of a sound but have not reflected critically on what happens when the sounds have different shapes,” says Danielsen.

A sound has a rhythmic center. If you imagine a sound wave, this center lies near the peak of the wave, and it is there – rather than where the sound begins – that we perceive the sound to be located in time.

“If the sound is sharp, the beginning coincides with this rhythmic center. With a longer, more wobbly sound, we perceive the center as coming long after the sound has actually begun.”

In order to hit a beat, or to play together in a band, musicians have to tune in to each other to get it right.

“If you have a soft sound and you want it to be heard exactly on the beat, then you need to place it a little early so that it can be experienced like that,” says Danielsen.
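The rule Danielsen describes can be sketched computationally. The toy Python/NumPy example below builds two synthetic sounds and uses the peak of the amplitude envelope as a crude stand-in for the rhythmic center; the waveforms and attack times are invented for illustration and are not taken from the study.

```python
import numpy as np

SR = 44100  # sample rate (Hz)

def envelope_peak_ms(signal, sr=SR):
    """Milliseconds from signal start to its amplitude-envelope peak,
    a crude stand-in for the perceived 'rhythmic center'."""
    return 1000.0 * np.argmax(np.abs(signal)) / sr

t = np.linspace(0, 0.2, int(SR * 0.2), endpoint=False)

# Sharp sound: instant attack, fast decay -> envelope peak near the onset.
sharp = np.exp(-t / 0.01) * np.sin(2 * np.pi * 440 * t)

# Soft sound: slow 80 ms attack ramp -> envelope peak well after the onset.
soft = np.minimum(t / 0.08, 1.0) * np.exp(-t / 0.2) * np.sin(2 * np.pi * 440 * t)

# To make the soft sound land "on the beat", start it earlier by the
# difference between the two rhythmic centers (here roughly 80 ms).
shift_ms = envelope_peak_ms(soft) - envelope_peak_ms(sharp)
print(f"place the soft sound ~{shift_ms:.0f} ms early")
```

The sharp sound's envelope peaks essentially at its onset, while the soft sound's peak arrives tens of milliseconds later, which is why it must be placed early to be heard "on" the beat.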

Experiments reveal musicians’ strategies

In order to investigate this, Câmara has conducted controlled experiments with skilled guitarists, bassists and drummers.

“They were all given a rhythmic reference, a simple groove pattern that can be found in many genres. Then they were asked to play along with it in three different ways: either right on the beat, a little behind, or a little ahead,” he explains.

This way, he could test their perception of the timing of different sounds, and how they play in order to time the sounds to a beat. After the experiments, they asked the musicians what they had been trying to do.

“They use their own words, saying they are playing slower or more heavily when they are aiming for a point after the beat. This accords well with the pattern we observe of influencing the sound itself rather than just its location.”

Danielsen points out that timing one’s own playing to a beat is something that all musicians practice, so it is something that everyone thinks about.

“However, they are much less aware of how they use sound to communicate timing differences,” she says.

Musicians manipulate sound and time

The researchers believe our perception of sound in time rests on fundamental psychoacoustic rules, that is, on how the brain perceives sound signals. All musicians take these largely unstated rules into account, but how they do so depends on the genre they play.

“Each genre has a characteristic microrhythmic profile. Samba has its own, EDM has its own, hip-hop has another,” says Danielsen.

In music production, the producer sees the sound in front of her on the screen and can twist and turn the music by adjusting how the sounds relate to each other in time.

“Producers who create a groove on a computer know this. They move sounds back and forth on the beat and think: ‘if I put it there it works, and if I put it there it doesn’t.’ So, they learn through experience, and if something needs to sound precise, they need to juggle the sounds around a bit.”

AI strives to give music human qualities

The researchers believe that our knowledge about how different types of sound affect timing could be used to develop software that uses artificial intelligence to create music.

“We can already make a sequence groovier and more human so that it doesn’t sound completely mechanical. If we start with a programmed beat, then the algorithms can move the sounds slightly to affect the style.

If the algorithm also takes the shape of the sound into account, we can obtain an even broader palette of rhythmic conditions that can shape the music in a more esthetically pleasing way,” says Câmara.
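The kind of algorithmic microtiming described here can be sketched in a few lines. The hypothetical Python snippet below nudges a rigid onset grid by a systematic "feel" offset plus a small random spread; the function name, parameters, and values are illustrative assumptions, not the researchers' actual software.

```python
import random

def humanize(onsets_ms, feel_ms=0.0, jitter_ms=4.0, seed=None):
    """Nudge programmed note onsets so a beat sounds less mechanical.

    feel_ms   -- systematic shift: positive plays "behind the beat",
                 negative plays "ahead of the beat".
    jitter_ms -- random spread, kept near the 10-20 ms scale that
                 listeners can reportedly feel.
    """
    rng = random.Random(seed)
    return [t + feel_ms + rng.gauss(0.0, jitter_ms) for t in onsets_ms]

# A rigid 120 BPM grid (one onset every 500 ms)...
grid = [i * 500.0 for i in range(8)]

# ...played slightly behind the beat, with human-like variability.
laid_back = humanize(grid, feel_ms=10.0, jitter_ms=4.0, seed=1)
```

Taking the shape of each sound into account, as Câmara suggests, would mean making `feel_ms` depend on where each sound's rhythmic center falls rather than treating all onsets alike.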

When you listen to music, it doesn’t take much before something sounds wrong. It’s about context, and about the type of music involved.

“When we play live, we want a margin of error, we’re not machines. There is always a certain amount of asynchronicity,” says Câmara, who is a musician himself.

Although we are talking about tiny shifts, humans have a trained ear for placing something in time by using sound.

“In some contexts, 10 to 20 milliseconds may be enough for hearing a difference. We don’t need to be completely aware of this, but we can feel it.”

Anne Danielsen points out that this does not just apply to people who work with music.

“Compared to what we perceive with our eyes, our hearing is extremely precise in time. This makes us very sensitive to spatial sound differences. But also, when listening to differences in voices – whether someone is angry, sad, happy or annoyed – we use finely meshed audio information to interpret what that voice is actually communicating,” she says.

“It may seem incredibly small and insignificant, but it’s actually very important information for us.”

Music challenges our sensory boundaries

Danielsen believes that the very fact that music research has uncovered psychoacoustic rules about how the human brain perceives sound says something about the importance of conducting research on music.

“We do extreme things in music. By testing out the boundaries of what we may find esthetically pleasing, we are also testing our perception apparatus,” she says.

“You could say that music is constantly experimenting with our senses. That’s why music is a good research topic for finding out how we perceive sound, how we listen and how we structure it in time.”


Music exists in every human culture, and every culture has some form of music with a beat: a perceived periodic pulse that listeners use to guide their movements and performers use to coordinate their actions (Nettl, 2000; Brown and Jordania, 2013). What brain mechanisms support beat perception, and how did these mechanisms evolve?

One possibility is that the relevant neural mechanisms are very ancient. This is an intuitively appealing view, as rhythm is often considered the most basic aspect of music, and is increasingly thought to be a fundamental organizing principle of brain function (Buzsáki, 2006).

The view is also consonant with Darwin’s ideas about the evolution of human musicality. Darwin believed that our capacity for music had deep evolutionary roots and argued that “The perception, if not the enjoyment, of musical cadences and of rhythm is probably common to all animals, and no doubt depends on the common physiological nature of their nervous systems” (Darwin, 1871).

This view has been echoed by several modern researchers. For example, Hulse et al. (1995) argue that “There is increasing evidence that some of the principles governing human music perception and cognition may also hold for non-human animals, such as the perception of tempo and rhythm.”

More recently, Large and colleagues (e.g., Large, 2008; Large and Snyder, 2009) have proposed a theory of musical beat perception based on very general neural mechanisms, building on the dynamic attending theory of Jones (e.g., Jones and Boltz, 1989; Large and Jones, 1999).

According to this “neural resonance” theory, beat perception arises when non-linear oscillations in the nervous system entrain to (oscillate in synchrony with) external rhythmic stimuli.

As stated by Large and Snyder (2009), “Non-linear oscillations are ubiquitous in brain dynamics and the theory asserts that some neural oscillations – perhaps in distributed cortical and subcortical areas – entrain to the rhythms of auditory sequences.”

Large’s ideas are in line with Darwin’s views because neural resonance theory “holds that listeners experience dynamic temporal patterns (i.e., pulse and meter) … because they are intrinsic to the physics of the neural systems involved in perceiving, attending, and responding to auditory stimuli.”

Neural resonance theory is interesting in light of other mechanistic proposals for the interaction of attention, neural oscillators, and the temporal dynamics of sensory signals in the brain (Schroeder and Lakatos, 2009).

There are, however, reasons to suggest that entrainment of auditory neural activity to external rhythms is not sufficient to explain beat perception. One such reason is that “pure perception” of a musical beat (i.e., listening in the absence of overt movement) strongly engages the motor system, including regions such as premotor cortex, basal ganglia, and supplementary motor regions (Chen et al., 2008a; Grahn and Rowe, 2009; Kung et al., 2013).

In other words, there is an intimate connection between beat perception and motor functions of the brain, and any theory of beat perception needs to account for this coupling. Second, recent EEG work on rhesus monkeys (Macaca mulatta) suggests that they do not perceive a beat in rhythmic auditory patterns (Honing et al., 2012).

This EEG study followed earlier work showing that monkeys could not learn to tap in synchrony with an auditory (or a visual) metronome, a task which is trivially easy for humans, even for those with no musical training (Zarco et al., 2009). This was the first study to train monkeys (or for that matter, any animal) to move in synchrony with a metronome, a task that has been extensively studied in human cognitive science (Repp and Su, 2013).

The study produced several surprising results. While the monkeys could successfully listen to two metronome clicks and then reproduce the same interval by tapping twice on a key, they had great difficulty learning to tap in synchrony with a metronome of several beats.

Specifically, each monkey took over a year of training to learn the metronome task, and when tested, their taps were always a few hundred milliseconds after each metronome click rather than aligned with it. This is quite unlike humans: when humans are asked to tap with a metronome, they spontaneously align their taps closely in time with metronome clicks (i.e., within a few tens of ms).

This human tendency for “phase alignment” between taps and beats indicates that humans accurately predict the timing of upcoming beats. In contrast, monkey rhythmic tapping did not show this sort of predictive behavior. To be sure, the monkeys did show shorter tapping latencies to metronomic vs. irregularly-timed clicks, suggesting they had some predictive capacities.
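The asynchrony measure behind these comparisons can be illustrated with a short Python sketch: each tap is paired with its nearest metronome click and the signed differences are averaged. The toy data below mimic the reported pattern (human taps within tens of ms of the click, monkey taps a few hundred ms late); the numbers are illustrative, not data from the studies cited.

```python
def mean_asynchrony_ms(taps_ms, clicks_ms):
    """Mean signed tap-click asynchrony: pair each tap with its nearest
    metronome click; negative values mean the taps lead the clicks."""
    diffs = [tap - min(clicks_ms, key=lambda c: abs(tap - c)) for tap in taps_ms]
    return sum(diffs) / len(diffs)

clicks = [i * 600 for i in range(8)]       # a 100 BPM metronome (600 ms period)
human  = [c - 25 for c in clicks]          # taps slightly *ahead* of each click
monkey = [c + 280 for c in clicks]         # taps a few hundred ms *after* each click

print(mean_asynchrony_ms(human, clicks))   # -25.0
print(mean_asynchrony_ms(monkey, clicks))  # 280.0
```

A small negative mean asynchrony is the classic human signature of predictive phase alignment; a large positive one looks instead like a reaction to each click after it has sounded.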

Furthermore, monkey and human tapping to a metronome both showed the scalar property of temporal processing, whereby temporal variability between taps scaled with interval duration. What was striking, however, was the lack of phase alignment between taps and metronome events in monkeys.

This inability to accurately align movement with discrete, periodic events is particularly surprising given that monkey motor cortex can represent time-to-contact in a predictive manner when doing an interception task involving a continuously-moving visual object (Merchant et al., 2004; Merchant and Georgopoulos, 2006).

Recently, based on the results of Zarco et al. (2009) and subsequent studies, including studies which characterize the neurophysiological properties of cells in medial premotor areas and the putamen during synchronization-continuation tapping tasks (e.g., Merchant et al., 2011, 2013a,b; Bartolo et al., 2014), Merchant and Honing (2014) have proposed that monkeys and humans share neural mechanisms for interval-based timing (i.e., timing of single intervals), but may differ in the mechanisms involved in beat-based timing.

The above research with humans (showing extensive activation of the motor system in pure beat perception) and with monkeys (suggesting that they may lack human-like beat perception) suggests that entrainment of auditory cortical activity to external rhythms is not a sufficient explanation of beat perception.

Here we advance a view of musical beat perception which can account for auditory-motor interactions in pure perception of a beat, and which can also account for species-restrictedness in the capacity for beat perception. In terms of auditory-motor interactions, we argue that musical beat perception (even in the absence of overt movement) relies on a simulation of periodic action in motor planning regions of the brain, and on bidirectional signaling between these regions and auditory regions. In terms of species-restrictedness, we suggest that only some species may have the requisite neural connections to support these specific auditory-motor interactions.

The paper is organized into three sections. The first section discusses some key aspects of musical beat perception, including the predictive and flexible nature of beat perception. The second section focuses on the brain’s ability to predict the timing of beats, introduces the “action simulation for auditory prediction” (ASAP) hypothesis, and discusses three testable predictions made by this hypothesis.

The third section discusses possible neural substrates for auditory-motor interactions in beat perception, and suggests why the relevant neural pathways may be restricted to certain species. It should be emphasized at the outset that the ASAP hypothesis and the species-restrictedness of beat perception are conceptually distinct ideas.

That is, the ASAP hypothesis does not require the assumption that beat perception is species-restricted, although this paper links these ideas together. It is also worth noting that the ASAP hypothesis, while involving the idea of motor simulation, does not involve the mirror neuron system (a point further discussed in the section on possible neural substrates).

Some key aspects of human musical beat perception
Beat perception is predictive

Musical beat perception involves perceiving a periodic pulse in spectrotemporally complex sound sequences. Listeners often express their perception of the pulse by moving rhythmically in synchrony with it, e.g., via head bobbing, foot tapping, or dance. (Informally, the beat is what we tap our foot to when listening to music.

In the laboratory, this rhythmic response to music can easily be studied by asking people to tap a finger to the perceived beat, e.g., Iversen and Patel, 2008). The manner in which people synchronize to the beat reveals that musical beat perception is a predictive process. Specifically, taps fall very close to beats in time (i.e., within a few tens of ms of beats) showing that the brain makes highly accurate temporal predictions about the timing of upcoming beats (Rankin et al., 2009; for further evidence of the anticipatory nature of movement to a beat see Van der Steen and Keller, 2013).

Accurate temporal prediction of beat times has consequences for perception even in the absence of movement. Several studies have shown facilitated perceptual processing of auditory events which occur on (vs. off) the beat (Escoffier et al., 2010; Geiser et al., 2012). This body of findings is consistent with Jones’s “Dynamic Attending Theory” (Jones and Boltz, 1989), which posits an increase of “attentional energy” at expected times of the beat and focuses perceptual processing resources on those times.

This temporal facilitation even extends to the processing of non-auditory events. For example, Escoffier et al. (2010) showed facilitation of visual image processing when images occurred on (vs. off) the beat of an accompanying auditory pattern. More generally, it appears that the prediction of auditory beats has broader cognitive consequences, including facilitating the learning and recall of strongly beat-inducing rhythmic patterns (Povel and Essens, 1985).

Beat perception is flexible across a wide range of tempi

Humans can perceive musical beats across a wide range of tempi. We perceive beats when the intervals between them fall in a range of about 250 ms to 2 s, though intervals between about 400 and 1200 ms give rise to the strongest sense of beat, and humans show a preference for beat periods around 600 ms (London, 2012).

In dance music (i.e., music designed to convey a clear sense of a beat), pieces tend to have tempi between 94 and 176 beats per minute (BPM) (van Noorden and Moelants, 1999). Within this range, van Noorden and Moelants (1999) found a preponderance of pieces between 120 and 150 BPM, and a median tempo of 133 BPM, corresponding to one beat every 451 ms. Given this median tempo, it appears that humans can easily synchronize to beats which are about 30% slower than this tempo (i.e., 94 BPM) or about 30% faster than this tempo (i.e., 176 BPM).
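The tempo figures above follow from a simple conversion between beats per minute and the inter-beat interval, sketched here in Python:

```python
def bpm_to_ibi_ms(bpm):
    """Inter-beat interval in milliseconds for a tempo in beats per minute."""
    return 60000.0 / bpm

# The dance-music tempo figures quoted above:
print(round(bpm_to_ibi_ms(133)))  # 451 -> the median tempo, one beat every ~451 ms
print(round(bpm_to_ibi_ms(94)))   # 638 -> slow end of the typical range
print(round(bpm_to_ibi_ms(176)))  # 341 -> fast end of the typical range
```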

This tempo flexibility of beat perception and synchronization can be contrasted with many other examples of synchrony in nature, such as the synchronous chirping of certain cricket species or the synchronous flashing of certain firefly species, which is limited to a rather narrow tempo range (e.g., for fireflies, ±10% relative to the spontaneous flash rate, cf. Figure 2 of Hanson et al., 1971).

Beat perception is constructive

Behavioral evidence suggests that beat perception involves more than the passive entrainment of neural responses to sound. This evidence concerns the fact that the beat imposed on a given sound can be consciously altered by the listener, and this manipulation can radically reshape how that sound is heard.

Thus, beat perception is not merely the “discovery” of periodicity in complex sounds, but is more active and under voluntary control, and provides an internal temporal reference that shapes rhythm perception.

For example, the beat guides attention in time, influences accent perception, and determines grouping boundaries between rhythmic patterns (Repp, 2007; Locke, 2009). While much popular music is composed in such a way as to guide the listeners’ beat perception (e.g., by physically accenting the beats or emphasizing them with grouping boundaries, instrumentation, or melodic contours), music with weaker cues may be more ambiguous and can lead to multiple interpretations of the beat. These can include interpretations with little support from the stimulus (e.g., as marked by the coincidence of notes with the beat). Such multiplicity of beat interpretations is demonstrated in Figure 1, which shows how much listeners’ responses can differ when instructed to “tap to the beat you hear” in an excerpt of jazz as part of the “Beat Alignment Test” (BAT) for the assessment of beat production and perception (Iversen and Patel, 2008). The data emphasize that the acoustic signal does not determine the beat: individuals picked different phases for their taps, corresponding to taps on the downbeat with the bass note (Phase 1), or on the upbeat with the snare drum (Phase 2). Listeners can also shift their beat phase midstream (S8 and S9).

Figure 1
Top: Spectrogram of an excerpt of jazz music (“Stompin at the Savoy,” by Benny Goodman; for corresponding audio, see supplementary sound file 1). Inverted arrows above the spectrogram show times of double bass and snare drum onsets, respectively. Bottom: times at which nine human subjects (S1–S9) tapped when instructed to “tap to the beat you hear.” Each tap is indicated by a vertical red bar. See text for details.

Such phase flexibility was studied by Repp et al. (2008), who showed that listeners could successfully synchronize with rhythmic sequences not only at the beat phase most strongly supported by the stimulus, but also at other phases that had little acoustic support and which corresponded to highly syncopated rhythms. The ability to maintain a beat that conflicts with the acoustic signal is strong evidence for the constructed nature of the beat, and the ability to voluntarily shift the phase of the internal beat relative to the stimulus has been exploited by neuroscientific experiments discussed below (Iversen et al., 2009).

Importantly, a listener’s placement of the beat has a profound influence on their perception of temporal patterns (Repp et al., 2008). That is, identical temporal patterns of notes heard with different beat interpretations can sound like completely different rhythms to listeners (Repp, 2007; Iversen et al., 2009), indicating the influence of beat perception on rhythm perception more generally. Thus, the beat seems to serve as a temporal scaffold for the encoding of patterns of time, and rhythm perception depends not only on the stimulus but on the timing of the endogenous sense of beat.

Beat perception is hierarchical

Beats are often arranged in patterns that create higher-level periodicities, for example a “strong” beat every 2 beats (which creates a march-like pattern) or every three beats (which creates a waltz-like pattern). This hierarchical patterning of beats is referred to as meter.

When asked to “tap to the beat of music,” an individual listener can often choose which metrical level to synchronize with, and switch between levels. Audio examples are provided in supplementary sound files 2 and 3: sound file 2 presents a simple Western melody, while sound file 3 presents this melody twice, with “tapping” at different metrical levels (taps are indicated by percussive sounds).

The notation of this melody and a metrical grid showing the different hierarchical levels of beats can be found in Chapter 3 of Patel (2008). Numerous studies have found that listeners tend to pick the level of the hierarchy closest to the human preferred tempo range of about 600 ms between beats (see above), but there is considerable individual variation, with some listeners picking metrical levels either faster or slower than this (Drake et al., 2000; Toiviainen and Snyder, 2003; McKinney and Moelants, 2006; Martens, 2011).

Beat perception is modality-biased

Rhythmic information can be transmitted to the brain via different modalities, e.g., via auditory vs. visual signals. Yet in humans the same rhythmic patterns can give rise to a clear sense of a beat when presented as sequences of tones but not when presented as sequences of flashing lights (Patel et al., 2005; McAuley and Henry, 2010; Grahn et al., 2011, but see Iversen et al., in press; Grahn, 2012, for evidence that moving visual stimuli may give rise to a sense of beat).

This may be one reason why humans synchronize so much better with auditory vs. visual metronomes, even when they have identical timing characteristics (e.g., Chen et al., 2002; Repp and Penel, 2002; Hove et al., 2010; Iversen et al., in press). Interestingly, when monkeys tap with a metronome, they do not synchronize any better with auditory than with visual metronomes, and in fact find it easier to learn to tap with a visual metronome (Zarco et al., 2009; Merchant and Honing, 2014).

Beat perception engages the motor system

An important finding in the neuroscience of beat perception is that pure perception of a beat (i.e., in the absence of any overt movement) engages motor areas of the brain, including premotor cortex (PMC), the basal ganglia (putamen), and supplementary motor area (SMA) (e.g., Grahn and Brett, 2007; Chen et al., 2008a; Grahn and Rowe, 2009; Geiser et al., 2012; Teki et al., 2012; Kung et al., 2013).

Beat perception in auditory rhythms is also associated with enhanced functional coupling between auditory and motor regions (Kung et al., 2013), and this coupling appears to be stronger in musicians than in non-musicians (Grahn and Rowe, 2009). Grahn and Rowe (2009) have suggested that a cortico-subcortical network including the putamen, SMA, and PMC is engaged in the analysis of temporal sequences and prediction or generation of putative beats (cf. Teki et al., 2012).

Zatorre et al. (2007) have suggested that auditory-premotor interactions in particular underlie the temporal predictions involved in rhythm perception. More generally, a role for the motor system in prediction of events in structured sequences has been proposed by Schubotz (2007).

Going even further, Rauschecker and Scott (2009) have suggested that the premotor cortex (and associated structures of the dorsal auditory stream) have evolved primarily for the purpose of timing in sequences, a function used both by the motor system in programming motor sequences and by the auditory system in predicting the structure of acoustic sequences (cf. Leaver et al., 2009 for relevant fMRI data).

These ideas provide a foundation for the current work, which seeks to explain why and how the motor system is involved in predicting the timing of auditory beats, and why this ability may be restricted to certain species.

Reference link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4026735/


Source: University of Oslo
