Grading Creativity

Can a standardized exam save arts education?


It sounded like an ordinary assignment for a visual-arts class: Karen Ladd, a teacher, asked her freshmen at New Hampshire’s Sanborn Regional High School to research an artist, create a piece of art inspired by the artist’s work, and then write a reflection about the experience.

Dressed in tank tops and shorts that heralded the arrival of summer weather, some students studied the assignment while others listened to headphones as they browsed for artists online. One girl begged to be allowed to use Bob Ross as her inspiration; another searched determinedly for paintings of bowling to use.

But this was no ordinary class project: It was a test.

This spring, with a six-district pilot, New Hampshire joined a small but growing list of at least a half-dozen states experimenting with large-scale arts testing. Educators prefer to call the new exams “assessments,” because they’re so different in form and format from traditional standardized tests. The goal, though, is to create a common “test”—often in the form of a project—that can be given to students in different classrooms across the state and used to help compare the performance of schools and districts.

But coming up with a uniform and efficient way to measure a subject that’s all about creativity is difficult. In its arts tests, Florida has incorporated multiple-choice and short-answer questions that are easy to score efficiently. New Hampshire and Michigan are trying something more ambitious: devising tasks that require a student to submit a finished piece of artwork or perform a piece of music. These tests are time-intensive to administer and grade, however, and the results are difficult to translate into a single numeric score.

In many instances, the push to find the best way to test the arts is coming from arts educators themselves. They hope to foster not only student improvement, but also a sense that the arts are as valuable to the curriculum and to society as such long-tested subjects as math and reading.

“It’s very important for arts to be seen as a subject that can be and should be tested,” said Frank Philip, an arts-assessment consultant. “It’s a parity thing.”

Research has shown that arts education can improve student achievement in reading and math, as well as increase critical-thinking skills and engage students in school. One study by the National Endowment for the Arts found that low-income students who take multiple arts classes are significantly more likely to enroll in a four-year college.

Yet access to arts education remains unequal. A federal survey released in 2012, for instance, revealed that roughly 95 percent of the highest-income high schools offered visual-arts courses, while only 80 percent of the lowest-income high schools did. The same was true of music courses.

Karen Ladd, a high-school art teacher, and her colleagues meet in Concord, New Hampshire, to compare artwork created by their students. (Sarah Butrymowicz)

Scott Shuler, a consultant for Solutions Music Group and former president of the National Association for Music Education, hopes that including the arts in state-testing systems will highlight these inequities. Disparities in opportunities to learn math and English “pale in comparison to the disparities in the opportunity to learn the arts,” he said.

While arts educators want the arts to be given equal weight in the curriculum, they understand that the arts can’t be treated the same way when it comes to testing. Multiple-choice questions might demonstrate whether a student knows the quadratic formula or the timeline of World War II, but they can’t measure whether a student knows how to draw with perspective or keep a steady rhythm.

That’s why, when the National Assessment of Educational Progress, or NAEP, included an arts test in 1997, it required students to create real works of art in addition to answering standard multiple-choice questions. (That year’s test famously led to semitractor-trailers full of student-created clay bunnies; since then, efforts have been made to digitize work in photos or videos.) NAEP gave arts tests again in 2008 and 2016, but some experts expressed concern that NAEP had standardized the tests too much by, for example, over-relying on the multiple-choice questions and requiring students to respond to music rather than perform it.

New Hampshire has eschewed multiple-choice questions on art tests altogether in favor of open-ended tasks that require students to make or perform something. “You want to create a task that allows kids to demonstrate in their own way what they know and can do,” said Marcia McCaffrey, an arts-education consultant for the New Hampshire Department of Education. “An assessment can also allow kids creativity if it’s designed in the right way.”

The arts are just one of many subjects for which the state is developing so-called “performance assessments”—tests that are really multi-step assignments that require students to solve a problem or produce something. English, math, and science performance tests may someday be a mandatory part of the state’s accountability system; arts assessments will likely remain voluntary.

McCaffrey and a group of 11 teachers from around the state spent months developing the music and visual arts tests, which evaluate students on a scale of one to four. Shortly after school let out in June, the teachers met in Concord to share their students’ test work and begin the complicated process of reducing subjective impressions to a single number. As they discovered, that’s not so simple.

Sarah Boudreau and Justina Austin, elementary-school art teachers, commandeered a separate room and laid out about two dozen self-portraits drawn by their fourth-grade students. They needed to agree on a score of one, two, three, or four for each piece, based on predetermined grading criteria, such as drawing skills and oil-pastel blending technique.

“This one you thought could be a one,” Boudreau said. “I thought two.”

“I just thought the control was lacking,” Austin replied.

They paused over another piece that they had both awarded a three, scoring guidelines in hand, and justified why they hadn’t marked the girl down despite the unrealistic placement of her eyes in the middle of her forehead. The scoring system calls for students to use “deliberate placement”—but who were Boudreau and Austin to say the choice was not an artistically deliberate one?

As the elementary-school art teachers discussed whether they needed to tweak the grading system, music teachers in another classroom struggled to distill improvised student performances on the recorder into one of the four numbers. The guidelines called for rating the students on pitch, tone, and rhythm.

They debated how much minor imperfections mattered. Could a student receive a four if they made any errors in tone, for instance? But the bigger issue for one teacher, Virginia Avery, was how to combine those three separate measurements into an overall score. She worried not everyone would do that the same way.

“You’re just coming up with a number to fill a box, and that angers me,” she told her colleagues. “I don’t feel comfortable saying, ‘This kid is a two.’”

This kind of resistance is understandable, said Timothy Brophy, the director of institutional assessment and a professor of music education at the University of Florida. That’s because even the best predetermined scoring systems might not capture everything they need to capture.

A possible solution to this problem, he says, is “consensus moderation,” in which a group of experts, such as practicing artists, get together to engage with a work of art. They will discuss it and come to an agreement on a final grade. This process is more labor-intensive than scoring off a checklist, Brophy said, but more closely aligns with how artists really operate. “We’re all pretty glad that Monet and Da Vinci didn’t go to a school that said, ‘You need to [paint] in this way to meet a rubric,’” he said.

Several teachers at the Concord meeting recognized this tension, but still felt enthusiastic about moving forward with the project. They’ll offer the revised assessments again this fall and once more meet to refine the scoring process before expanding the testing program to encompass more schools and students.

Justina Austin, an elementary-school art teacher, holds up a student self-portrait that earned a 3. (Sarah Butrymowicz)

Ladd was one of four high-school teachers who gave the assignment to create a work of art inspired by an artist. The four teachers easily agreed on small changes to the assignment: Students’ statements should be typed instead of handwritten, and students should be required to submit their “inspiration” pieces so that scorers can tell how derivative their work is.

But the teachers debated whether to make more substantive changes. Ladd’s freshman students generally struggled more with the project than the older students taught by her peers did. High-school seniors got stronger results overall, although plenty of students still received ones and twos for lack of creativity, poor craftsmanship, or weak written statements.

Sarah Kiley, a teacher at Epping High School, argued that it was okay—even expected—for freshmen to score lower and for all students to be docked points if they fell short. Altering the task to get more passing scores wasn’t the answer. “We need that high level,” she said. “It’s going to be a challenge and they have to work for it.”

This post appears courtesy of The Hechinger Report.