SAT scores were released on Tuesday, more than two months after the first administration of the revamped exam and years after the media started to speculate about the new test’s difficulty. And it turns out that, after years of stagnation, the scores have gone up—rather significantly.

Maybe students are smarter and more prepared for the new exam than their predecessors. But chances are the upward score drift is a product of the test, not the test takers. As in the past, each section of the SAT is scored from 200 to 800. The average combined score on the old, three-section test was right around 1500; that would indicate that the average score would be about 1000 on the new, two-section test. But the new average is actually close to 1090, according to data released by the College Board, which administers the SAT.
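The arithmetic behind that expectation can be sketched in a few lines. This is a back-of-the-envelope illustration only: the function name is hypothetical, the figures are the approximate averages quoted above, and the sketch assumes each section contributes equally to the combined score.

```python
def expected_combined(old_combined, old_sections, new_sections):
    """Scale a combined average to a test with a different number of sections.

    Assumes every section contributes equally (a simplifying assumption).
    """
    per_section = old_combined / old_sections
    return per_section * new_sections


expected = expected_combined(1500, 3, 2)  # approximate old average, three sections
observed = 1090                           # approximate new average
gap = observed - expected                 # how far the new average sits above expectation

print(expected, gap)
```

Under those assumptions, the new average sits roughly 90 points above what a simple rescaling of the old average would predict.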

So has the new test been dumbed down? Or has the revamped exam, which is supposed to be more closely aligned with real school work, created greater opportunities for students to shine?

Some testing experts suggest that these questions overlook the design and intent of standardized tests and the role difficulty plays in them. It is not just that difficulty lies in the eye of the beholder. It’s also that tests like the SAT will always seem hard to most students: If they were generally easy, standardized assessments would lose much of their value to highly selective colleges that use them to distinguish among applicants.

According to Andrew Ho, a professor at the Harvard Graduate School of Education, most standardized tests are designed so that the number of examinees who answer a given question correctly averages around 60 percent. “If 90 percent of students get a question right [on the SAT], then nine out of 10 of them are indistinguishable,” Ho said. “Such a question does little to distinguish their skills.” Tests need to be difficult, but there is no value, either, in making them so difficult that nine out of 10 students are getting a question wrong.
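One rough way to see Ho's point is to ask how often a single question separates two randomly chosen students, meaning one gets it right and the other gets it wrong. The sketch below is an illustration of that idea, not a method attributed to Ho or to any test-maker; the function name is hypothetical, and it assumes the two students answer independently.

```python
def separated_pair_share(p_correct):
    """Share of random student pairs that a question tells apart.

    A pair is 'separated' when exactly one of the two students answers
    correctly, which happens with probability 2 * p * (1 - p).
    """
    return 2 * p_correct * (1 - p_correct)


# Compare a very hard question, a moderately hard one, and a very easy one.
for p in (0.1, 0.6, 0.9):
    print(p, round(separated_pair_share(p), 2))
```

By this simple measure, a question that 90 percent of students get right separates exactly as few pairs as one that 90 percent get wrong. The measure peaks at 50 percent rather than at the 60 percent Ho cites, so it should be read only as a rough picture of why very easy and very hard questions both tell test-makers little.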

Even when the content on a particular test is harder, the job of a test-maker, Ho explained, is to make sure that the ultimate average score is the same regardless of the test. Testing companies, like the College Board and ACT, have entire departments devoted to “equating,” the process of ensuring that it doesn’t matter whether you take the test in one month or another. “Say, on the old test, if you got a 90 percent, that might get you a 1500. There are dozens of people whose job it is to make sure that, if you got an 80 percent on the new test, you still get a 1500,” Ho said, adding that it’s impossible to do this perfectly. “This process assumes that you’re measuring the same [domain of content and skills], and the entire premise of the new test is that it’s measuring something related but ultimately different, ideally more relevant.”
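Ho's 90-percent-versus-80-percent example can be illustrated with linear equating, one of the simplest textbook approaches, which lines up two test forms by matching their means and standard deviations. The sketch below is not the College Board's or ACT's procedure, which is far more elaborate; the function name and the percent-correct data are hypothetical and were chosen so that the harder form runs 10 points lower across the board.

```python
from statistics import mean, pstdev


def linear_equate(new_score, new_form_scores, old_form_scores):
    """Map a score on the new form onto the old form's scale.

    Matches the two forms' means and standard deviations, assuming the
    groups that took each form are comparable in ability.
    """
    slope = pstdev(old_form_scores) / pstdev(new_form_scores)
    return mean(old_form_scores) + slope * (new_score - mean(new_form_scores))


# Hypothetical percent-correct results for two comparable groups of students.
old_form = [50, 60, 70, 80, 90]  # easier form
new_form = [40, 50, 60, 70, 80]  # harder form: every score is 10 points lower

print(linear_equate(80, new_form, old_form))
```

With these made-up numbers, an 80 percent on the harder form lands where a 90 percent did on the easier one, so both would earn the same scaled score. As Ho notes, the whole exercise rests on the two forms measuring the same thing.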

The new SAT is different in many ways from the old model. To name just a few, the questions have four rather than five answer choices, there are fewer math concepts covered, and hard vocabulary is no longer directly tested. Comparing the two tests is like comparing apples to oranges. To bridge that gap, the College Board has come up with calculations that allow colleges to compare scores on the new SAT to those on the old one. Its research has found, for instance, that a 730 on the new test’s math section is equivalent to a 700 on the old. The College Board is strongly encouraging admissions officers to use these formulas to compare applicants who took different tests, rather than look at percentiles.

The higher average scores and the overall rise in the performance percentiles on both the math and reading sections of the new SAT have led some critics, such as Dan Edmonds of Noodle Education, to speculate that the College Board may be intentionally inflating scores to attract more students. In 2012, the ACT became the most popular college-admissions test in the country. Many of the changes the College Board has made to the test appear to be designed to make the SAT more attractive to students, states, and school districts, which are increasingly paying for students to take the exam during the school day.

There are, however, likely valid reasons to explain why the percentiles have floated upward. Students are no longer penalized for picking a wrong answer, for example; they also have more time to answer each question on the test. These factors led to fewer people getting lower scores, thus pushing the average up. As Adam Ingersoll of Compass Education Group explained in an email, “College Board has decided to accept this ‘natural’ lift resulting from changes to the test,” a move he described as “perfectly reasonable.”
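The effect of dropping the wrong-answer penalty can be seen in the expected value of a blind guess. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the quarter-point deduction is the old test's well-known penalty for wrong multiple-choice answers, a figure not stated in this article.

```python
from fractions import Fraction


def expected_guess_points(choices, wrong_penalty):
    """Expected raw points from one blind guess among `choices` options.

    A right answer earns 1 point; a wrong answer costs `wrong_penalty`.
    """
    p_right = Fraction(1, choices)
    return p_right * 1 - (1 - p_right) * wrong_penalty


# Old SAT (assumed): five choices, quarter-point penalty for a wrong answer.
old = expected_guess_points(5, Fraction(1, 4))
# New SAT: four choices, no penalty.
new = expected_guess_points(4, Fraction(0))

print(old, new)
```

On the old format a blind guess was worth nothing on average, while on the new format it is worth a quarter of a raw point, which is one way the bottom of the score distribution can rise without students knowing any more.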

The question of difficulty on the SAT matters a great deal to the College Board. In a recent appearance in Boston, its president and CEO, David Coleman, cited the preponderance of questions that looked nothing like the work students do in school as a key reason the SAT was so difficult. Think “logic puzzles” on the math section and arcane vocabulary on the reading one, all of which lent themselves to test-prep techniques. (I work for, but do not speak for, The Princeton Review, which offers test-prep services.) “The new SAT,” Coleman declared, “is utterly unsurprising. It is again and again the work you see in class.”

The exam has a ways to go before it becomes “utterly unsurprising.” This past weekend’s exam included a surprise section. On the current SAT, students can choose to write the optional essay at the end or to simply complete the four mandatory sections. Last Saturday, many or perhaps even all students who opted out of the essay likely had to complete an extra, 20-minute section, presumably what is known as an experimental section, which tests new questions and has no impact on scores. The College Board’s only mention of this section appears in a high-school counselor guide that few students see. ACT, for its part, tells students when it uses an experimental section.

Another surprise on the May SAT exam was more to the benefit of students, or at least to some students who had used the free practice tests that the College Board had released on its site and through Khan Academy. One of the math questions on the exam was almost identical to a question that had appeared in one of those tests. The College Board’s partnership with Khan was meant to level the playing field, but surely not by providing test questions ahead of time. Ultimately, what’s most important is whether the SAT is well-made and fair. Mistakes like this throw both of those aspects into doubt because, as Scott Marion of the Center for Assessment pointed out, reusing a question from a published practice test turns it into an issue of whether a student used Khan Academy or not, rather than a measure of college readiness.

Perhaps the best question for high-school students is not whether the new SAT is more difficult than its predecessor or the ACT. Nor is it how competitive they are according to the percentile reports. Rather, maybe it’s whether, even with these difficulties, anything has really changed in the way students should think about the exam.