>by Mark Kleiman
A school superintendent allowing his staff to doctor students' answers on a set of high-stakes standardized exams has something in common with a corporate CEO holding a bundle of stock options who practices "earnings management" via bogus asset sales. Each is responding to an intense incentive system by faking success rather than producing it.
One could formulate this as a general principle: any incentive to create a result also creates an incentive to simulate the same result. The corollary is obvious: the greater the incentive, the greater the temptation. Or, as W. C. Fields put it in You Can't Cheat an Honest Man, "If a thing is worth winning, it's worth cheating for." Borrowing Fields's real name, I propose to call this generalization Dukenfield's Law of Incentive Management. Designers of control systems ignore Dukenfield's Law at their peril, and ours.
A second corollary follows directly from the first: holding the level of audit effort constant and other things equal, the reliability of a measure will decline as the importance attached to it grows. To put the same thing another way: to maintain a given level of reliability, the resources invested in verifying any performance measure need to rise roughly in proportion to the stakes involved.
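One way to make the second corollary concrete is a toy model. Nothing in it comes from real data: it simply assumes that the effort a cheater puts into concealment scales with the stakes (the hypothetical `concealment_rate`), and that the chance of catching a simulated result is the auditor's share of total effort. Under those assumptions, a fixed audit budget catches less and less as the stakes rise, and the budget needed to hold detection steady grows in direct proportion to the stakes.

```python
# Toy model of the second corollary. All names and numbers are
# illustrative assumptions, not measurements of any real system.

def detection_probability(audit_effort: float, stakes: float,
                          concealment_rate: float = 0.1) -> float:
    """Chance that a simulated result is caught.

    Assumes concealment effort is proportional to the stakes and that
    detection is the auditor's share of the total effort on both sides.
    """
    concealment_effort = concealment_rate * stakes
    return audit_effort / (audit_effort + concealment_effort)


def audit_effort_needed(target_detection: float, stakes: float,
                        concealment_rate: float = 0.1) -> float:
    """Audit effort required to hold detection at target_detection (0 < target < 1)."""
    return target_detection / (1.0 - target_detection) * concealment_rate * stakes


if __name__ == "__main__":
    for stakes in (10, 100, 1000):
        fixed = detection_probability(audit_effort=5, stakes=stakes)
        needed = audit_effort_needed(target_detection=0.8, stakes=stakes)
        print(f"stakes={stakes:5d}  detection with fixed audit={fixed:.2f}  "
              f"audit needed for 0.80 detection={needed:.1f}")
```

The proportionality is built into the assumed functional form rather than derived from anything; a different model of concealment would give a different curve, though any model in which cheaters work harder as the prize grows points the same way.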
Yet audit and other counter-simulation systems are typically treated as afterthoughts in the design of incentive management systems. The school accountability movement is a good example here. There are many ways of cheating on standardized tests other than doctoring the answer keys or even using questions from the test in class exercises. Simulation strategies come in a wide range of subtleties, and no doubt all of them are being used.
Unless we're literally training children to answer examinations, all school tests are merely proxies for things we really care about. It isn't hard to find ways of producing proxy results instead of real ones, for example by drilling students in four-term verbal analogies [Apple is to pear as catfish is to: 1) cat 2) salmon 3) fish 4) seafood 5) none of the above]. The ability to solve such puzzles quickly (and not too quirkily) isn't a bad proxy measure for a certain kind of reasoning and interpretive skill, but it's hardly valuable enough to rate the hour a week it took out of my 11th-grade English class. The goal back then was to fool the SAT to get students into good colleges, rather than fooling the state to get raises for teachers, but the strategy was the same.
Test results at the level of the school can also be influenced by managing the population tested; if all the worst students transfer to other schools, the average score will surely go up. For better or worse, expulsion has been made difficult, but there are other ways - incentive-based ways, many of them - to induce the weaker players to leave the game.
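The arithmetic behind population management is easy to check. The sketch below uses a made-up list of ten scores (purely hypothetical) and shows the school average rising once the two weakest students are no longer in the tested population, even though no remaining student has learned anything new.

```python
from statistics import mean

# Hypothetical scores for ten students at one school; invented for illustration.
scores = [35, 48, 52, 61, 67, 70, 74, 79, 83, 91]


def average_after_dropping_lowest(all_scores, n_removed):
    """Average score once the n_removed weakest students have left the tested group."""
    kept = sorted(all_scores)[n_removed:]
    return mean(kept)


if __name__ == "__main__":
    before = average_after_dropping_lowest(scores, 0)
    after = average_after_dropping_lowest(scores, 2)
    print(f"average with everyone tested: {before}")
    print(f"average after the two weakest leave: {after}")
```

Every individual score is unchanged between the two lines of output; only the composition of the group differs.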
At the other extreme of subtlety, dropping art and music (or even reducing hours spent on science and history) and substituting more hours of reading can be thought of either as cheating (by degrading a set of valued characteristics that the tests don't happen to measure) or as simply responding as intended to an incentive system designed to produce literacy at virtually all costs. In such cases, discussion of the simulation risks and counter-simulation strategies will require a discussion of just what it is that the incentive system is trying to produce, and therefore what it is that the tests are intended to measure.