How To Get States To Improve School Testing

February 12, 2010

By Marcus A. Winters

What percentage of Texas’ fourth-graders are good readers? It seems to depend on whom you ask.

The state will tell you that 83 percent of them met or exceeded the proficiency benchmark on its 2007 test. Not too shabby. On the other hand, only 30 percent of fourth-graders in Texas scored high enough to be considered proficient on the National Assessment of Educational Progress, an exam administered by the U.S. Department of Education that is usually regarded as the gold standard in education testing.

The big difference results from where the two tests set their proficiency bars. Texas sets its bar pretty low — so low that barely literate students can score high enough to be deemed proficient. The NAEP has a much higher standard; in fact, a student labeled “proficient” by Texas could fail to score above “basic” on the federal test.

Unfortunately, Texas’ low standard is not anomalous. Eight states have even bigger gaps in fourth-grade reading. Worse, standards are declining, even as graduates need more and more knowledge and skills to get ahead in the global economy. A recent federal study noted that 15 states lowered at least one of their proficiency standards in math and reading between 2005 and 2007.

States’ low standards have spurred a bipartisan campaign to create worthwhile national ones. Conservative groups like the Fordham Foundation have pushed for national standards for years; more recently, President Barack Obama and Secretary of Education Arne Duncan, as well as local leaders like New York City schools Chancellor Joel Klein, have embraced the idea. But the road to national standards would be extremely tough to navigate politically. A more feasible approach would give all states an incentive to set objectively high standards themselves, and the looming reauthorization of the federal No Child Left Behind Act, or NCLB, gives us a perfect opportunity to do it.

The perverse incentives of NCLB clearly have something to do with the downward movement of state standards. NCLB punishes a school when too few of its students make progress toward math and reading proficiency — beginning with extending to students the choice of other public schools to attend and culminating with completely restructuring the school. But the law has a gaping loophole: States get to define proficiency however they want. A state can thus meet NCLB targets by defining proficiency down. Toughening its standards, by contrast, handicaps its ability to meet the federal requirements.

Of course, low standards have their own appeal. The lower the standard, the more students pass it, and lots of supposedly proficient students make an education system look as though it’s working well, even when it’s not. State governments love to tell constituents that students are doing great on standardized exams; the public usually just assumes that the criteria used on those exams are meaningful.

States can get away with setting low standards in part because proficiency remains an inexact notion. In determining the difficulty of exam questions, states typically rely on the judgment of expert panels, whose members ask themselves: “Would a ‘proficient’ student answer this question correctly, or is it so difficult that a student who gets it right should be considered ‘advanced’?” The percentage of students who answered the question correctly also gets taken into account. Add to all this the fact that what constitutes an expert varies from state to state (and that the experts only make suggestions, which the state may ignore), and it’s clear that the process has an unavoidably subjective dimension.

Given such confusion, the case for high, uniform, enforceable national standards seems strong. States with unchallenging standards do a disservice to their students, who wind up thinking that they have mastered the skills to succeed when they may in fact be way, way behind peers in more demanding states.

But national standards run up against serious objections. First, many worry that they threaten America’s federalist system, which historically has made schooling a state matter. Back in 2002, some conservatives argued against enacting NCLB because they saw it as an unwarranted federal intrusion into state business. A congressionally set national standard would only expand Washington’s role.

Even more troubling, though, is that we could wind up with a single low standard. Proponents of national benchmarks seem to think that they’ll be the ones writing them. Bureaucrats in the U.S. Department of Education will have other ideas. So will congressmen from lower-achieving states, which won’t want to be embarrassed by a national proficiency standard that their students can’t reach. Since any system of setting a common standard — either by federal mandate or voluntary state agreement — depends on the cooperation of lousy performers, it’s hard to see how a demanding national standard would survive the political process. Similarly, if the NAEP became an enforceable national benchmark, pressure would grow to make it easier.

Duncan has acknowledged that in a system of high national standards, “test scores are going to drop in some places precipitously. And ... we have to give those politicians cover for doing the right thing. So there is a tricky balance that we have to work on here.” But no clear consensus has yet formed on how to strike that “tricky balance.”

We could make better progress toward an effective testing regime if we changed our goal from uniform national standards to high state standards, which two simple amendments to NCLB could help bring about.

First, we would remove the law’s disincentives for states to adopt higher standards. Recall that NCLB holds schools accountable based solely on the percentage of students who score above a certain proficiency benchmark, however it is established. Not only does this encourage states to lower their standards; it makes little sense. After all, you can boost your odds of finding a school with high test scores just by looking in wealthy, white neighborhoods, where students benefit from the involvement of their educated parents. These students would meet the (often very low) proficiency benchmark even if their school wasn’t very good. In some excellent urban, minority-dominated schools, by contrast, students might be making tremendous progress but fail to meet NCLB requirements because of their disadvantaged backgrounds.

Instead of focusing solely on who counts as “proficient,” NCLB’s measure of school quality should also include the gains that students make from year to year on state tests. This would stop rewarding states that push down proficiency standards. (To work, a revised system would have to include rules ensuring that tests were scaled properly.) Such a “value-added” measure would also make more sense, since a truly effective school is one that makes a meaningful difference in how much its students learn.
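To see why the two measures can disagree, consider a rough sketch in Python. Everything in it is hypothetical: the scores, the cutoff of 70, and the function names are invented for illustration, and it assumes both years' tests are reported on the same scale (the scaling caveat noted above).

```python
from statistics import mean

def proficiency_rate(scores, cutoff):
    """Share of students scoring at or above the proficiency cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)

def value_added(last_year, this_year):
    """Average year-over-year gain for the same students.

    Assumes both years are reported on the same scale.
    """
    return mean(now - prev for prev, now in zip(last_year, this_year))

# Hypothetical scale scores for the same five students in consecutive years.
suburban_prev = [72, 75, 78, 80, 85]
suburban_now = [73, 75, 79, 80, 86]
urban_prev = [40, 45, 50, 55, 60]
urban_now = [50, 56, 59, 66, 69]

CUTOFF = 70  # hypothetical state proficiency benchmark

for name, prev, now in [
    ("suburban", suburban_prev, suburban_now),
    ("urban", urban_prev, urban_now),
]:
    print(name, proficiency_rate(now, CUTOFF), value_added(prev, now))
```

In this made-up case every suburban student clears the bar while gaining less than a point on average, and no urban student clears it despite an average gain of ten points. A proficiency-only rule would reward the first school and sanction the second.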

An entirely value-added system isn’t desirable, however, since it’s important that we continue to provide schools with goals for student learning. The best system would combine value-added measures with a minimum proficiency standard; existing state and district accountability plans (New York City’s, for instance) provide useful examples of such a hybrid approach.

But even with NCLB’s perverse incentives removed, some states will still prefer lower standards because they inflate records of achievement, satisfying citizens who just want to see higher scores. We need to give states an incentive to set higher standards.

Here — a second tweak to NCLB — is how to do it. States would still develop their own tests in math and reading and set their own proficiency benchmarks. Every few years, though, the federal government would collect and administer each of these tests to a small but nationally representative sample of students. The scores that these students earned on the exams would provide a uniform and objective measure that could be used to compare the relative difficulty of standards across states. For example, a test that identified 80 percent of the nation’s students as proficient would have a lower proficiency standard than one that labeled 40 percent as proficient.
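The comparison can be sketched in a few lines of Python. The state names, scores, and cutoffs below are invented, and the sample here is only ten students; a real national sample would be far larger and would need proper statistical weighting, which this sketch ignores.

```python
def sample_pass_rate(sample_scores, cutoff):
    """Share of the national sample that a state's test labels proficient."""
    return sum(s >= cutoff for s in sample_scores) / len(sample_scores)

def rank_by_difficulty(results):
    """Order states from toughest to easiest standard.

    results maps a state name to (national-sample scores on that state's
    test, that state's proficiency cutoff). A lower pass rate among the
    same national sample implies a tougher standard.
    """
    rates = {
        state: sample_pass_rate(scores, cutoff)
        for state, (scores, cutoff) in results.items()
    }
    return sorted(rates, key=rates.get), rates

# Hypothetical: the same ten sampled students sit both states' tests.
results = {
    "State A": ([35, 48, 52, 55, 60, 61, 66, 70, 74, 90], 50),
    "State B": ([30, 41, 45, 50, 58, 59, 63, 68, 77, 85], 60),
}

hardest_first, rates = rank_by_difficulty(results)
print(hardest_first, rates)
```

Here State A's test labels 80 percent of the sample proficient and State B's labels 40 percent, so State B has the tougher standard, whatever either state reports about its own students.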

With an objective measure of each state’s standards in hand, a revised NCLB could then link some portion of a state’s federal per-pupil funding to the standards’ difficulty, relative to those of other states. The greatest bonus per pupil would go to the state with the highest standards, the next greatest bonus to the state with the next highest standards, and so on. Under such a system, states would have an incentive to raise standards, but not to set them unreasonably high, since some of their funding would still depend on the percentage of their students meeting the proficiency benchmark.
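How the two pulls might net out can be illustrated with a small Python sketch. The proposal does not specify a formula, so every number here is an assumption made up for illustration: the dollar amounts, the fixed step between ranks, and the rule of one dollar per percentage point of in-state proficiency.

```python
def rank_bonuses(sample_pass_pct, top_bonus, step):
    """Assign a declining per-pupil bonus, toughest standard first.

    sample_pass_pct maps a state to the percent of the national sample
    that its test labels proficient (lower means a tougher standard).
    The linear step between ranks is an assumption, not part of the proposal.
    """
    ranked = sorted(sample_pass_pct, key=sample_pass_pct.get)
    return {s: max(top_bonus - i * step, 0) for i, s in enumerate(ranked)}

def per_pupil_total(bonus, in_state_pass_pct, dollars_per_point):
    """Bonus for tough standards plus funding tied to in-state proficiency."""
    return bonus + in_state_pass_pct * dollars_per_point

# Hypothetical states: lenient (A), demanding (B), unreasonably strict (C).
sample_pass_pct = {"A": 80, "B": 40, "C": 5}
in_state_pass_pct = {"A": 85, "B": 50, "C": 5}

bonuses = rank_bonuses(sample_pass_pct, top_bonus=120, step=40)
totals = {
    s: per_pupil_total(bonuses[s], in_state_pass_pct[s], dollars_per_point=1)
    for s in sample_pass_pct
}
print(bonuses, totals)
```

With these invented weights, the demanding state B comes out slightly ahead of both the lenient state and the one whose bar almost nobody clears. Different weights would tip the balance, so the real work would lie in choosing them.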

This system would still allow the definition of proficiency to vary, and that’s sure to offend our sensibilities. But the system would at least encourage the proficiency benchmark to increase over time in each state, so that standards generally improve nationwide. In this system, the answer to the question “What should students know?” would always be “More!” That would help ensure that America’s students acquire the skills they need to flourish in a 21st-century economy.

