Do We Judge Music More on Sight Than on Sound?

A new study says yes, but many psychologists disagree.

Yulianna Avdeeva of Russia, first prize winner, performs during the 2010 Chopin Piano Competition in Warsaw. (Reuters)

Spend any time performing classical music and you are told that appearance matters. A choir can enter a hall and, through its demeanor alone, receive applause. A band strolling onto a darkened stage gets cheers. We live with YouTube music videos as much as we live with invisible MP3s, and what we see prepares us, excites us, primes us, for what we’re about to hear.

So the results of a University College London study, published last week, seemed intriguing. The psychologist Chia-Jung Tsay showed novice musicians and professional adjudicators different types of media from an international piano competition. Each participant got just one of three formats: audio alone, silent video, or video with sound. Then she asked them to identify, from that one format, which of the contestants had won the competition.

And, as NPR recently reported:

"What was surprising was that even though most people will say sound matters the most, it turned out that it was only in the silent videos, the videos without any sound, that participants were able to identify the actual winners," Tsay says.

Incredibly, the volunteers were better able to identify the winners when they couldn't hear the music at all, compared with when they could only hear the music. In fact, it was even worse than that: When the volunteers could see the musicians and hear the music, they became less accurate in picking the winners compared with when they could only see the performers. The music was actually a distraction.

At first glance, it looks like a staggering result. The study’s abstract declares: “[T]he findings demonstrate that people actually depend primarily on visual information when making judgments about music performance.”

But looks can be deceiving. Tom Stafford, a cognitive scientist and author of the book Mind Hacks, found the study, and its coverage, problematic*. The finalists at an international piano competition are some of the finest musicians in the world, and, he writes on his website:

In all probability there is a minute difference between their performances on any scale of quality. The paper itself admits that the judges themselves often disagree about who the winner is in these competitions.

The experimental participants were not scored according to some abstract ability to measure playing quality, but according to how well they were able to match real-world competition outcome.

The experiments show that matching the judges in these competitions can be done based on sight but not on sound. This isn’t because sight reveals playing quality, but because sight gives the experimental participants similar biases to the real judges. The real expert judges are biased by how the performers look – and why not, since there is probably so little to choose between them in terms of how they sound?

The science blog Arcsecond agrees. If all the players were the best in the world, their auditory performance would be among the best in the world and thus very close, but:

the variation in how the musicians move and express themselves physically could potentially be large – 50, 70, 90, for example. So even if judges base their scores mostly on the quality of playing, the visual aspect can still dominate the final rankings.
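Arcsecond's arithmetic is easy to check with a toy example. The sketch below is only an illustration: the three finalists, their sound scores, and the 80/20 weighting are made-up assumptions, not figures from Tsay's paper. Only the visual scores (50, 70, 90) borrow Arcsecond's example values.

```python
# Hypothetical scores for three finalists. None of these numbers come from the study.
sound = {"A": 96, "B": 95, "C": 94}    # assumed: near-identical playing quality
visual = {"A": 50, "B": 70, "C": 90}   # Arcsecond's example spread for stage presence

SOUND_WEIGHT = 0.8  # assumed: judges weight sound four times as heavily as sight


def final_score(name):
    """Weighted blend of one finalist's sound and visual scores."""
    return SOUND_WEIGHT * sound[name] + (1 - SOUND_WEIGHT) * visual[name]


# Rank the finalists three ways, best first.
ranking_by_sound = sorted(sound, key=sound.get, reverse=True)
ranking_by_visual = sorted(visual, key=visual.get, reverse=True)
ranking_final = sorted(sound, key=final_score, reverse=True)

print("by sound: ", ranking_by_sound)
print("by visual:", ranking_by_visual)
print("final:    ", ranking_final)
```

In this toy setup the sound-only ranking runs A, B, C, but the weighted final ranking runs C, B, A, the same order as the visual scores. Even with sound counting for 80 percent of the score, the one-point gaps in playing are swamped by the twenty-point gaps in presentation.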

But how to explain the silent-video viewers outperforming the audio-and-video watchers? Perhaps we don’t need to. “[I]t’s not like people with visual information did very well,” writes Arcsecond. “They got to roughly 50% accurate.”

And on top of all this: None of the clips that subjects watched or heard (silent or sound, video or audio) were longer than six seconds*. Subjects were attempting to reproduce professional results with almost no information. Contra Arcsecond, great performances of the same work can sound very different from each other, but, regardless, six seconds is not long enough to judge a musician’s holistic interpretation. And as musicians sometimes lose competitions by fudging a single note, a six-second sample could easily deceive a listener.

So are those staggering results entirely moot? Can we retire to Spotify, safe in the very auditoryness of listening?

Perhaps. Arcsecond guesses that “the conclusions of the paper are probably true,” but says that’s based on an understanding of how human beings work (i.e., we’re biased toward the visual), not on the study’s evidence.

Classical music organizations know this, too. Many high-profile instrumental auditions are blind: The judges don't know the name, gender or age of the musician auditioning on the other side of the curtain. Orchestras take measures like those, though, to counteract more than visual bias: They’re meant to hinder sexism or favoritism toward a former student.

Even if we’re not adjudicating an international competition, we’d be wise as listeners to heed what most musicians -- and every elementary school music teacher -- already know: that sight shapes and casts what we hear, and that appearance can separate a tremendous performer from a merely pleasant one.

Via Ed Yong.

Update, 6pm: The plot thickens! Tsay has emailed me about the study to clarify two things:

First, sound clips alone did allow for differentiation between the performers. It was possible to distinguish among performances based on their sound alone: “Participants who were randomly assigned to receive sound-only recordings were able to choose one performer over the other two performers in each trial,” writes Tsay.

Second, the clips weren’t only six seconds long! This assertion, repeated among science bloggers, was incorrect. The length of the clips -- audio and visual -- ranged from 1 second to 1 minute, and, in them, says Tsay, “the pattern held.”

So the clips were neither indistinguishable on an auditory basis, nor too short to allow for some kind of artistic adjudication. This, needless to say, strengthens the study's original findings.