Economics, Education Finance, Pre K-12
February 1, 2003 | Report by Marcus A. Winters, Jay P. Greene, Greg Forster

Testing High Stakes Tests: Can We Believe the Results of Accountability Tests?

Do standardized tests that are used to reward or sanction schools for their academic performance, known as “high stakes” tests, effectively measure student proficiency? Opponents of high stakes testing argue that it encourages schools to “teach to the test,” thereby improving results on high stakes tests without improving real learning. Since many states have implemented high stakes testing and it is also central to President Bush’s No Child Left Behind Act, this is a crucial question to answer.

This report tackles that important policy issue by comparing schools’ results on high stakes tests with their results on other standardized tests that are not used for accountability purposes, and thus are “low stakes” tests. Schools have no incentive to manipulate scores on these nationally respected tests, which are administered around the same time as the high stakes tests. If high stakes tests and low stakes tests produce similar results, we can have confidence that the stakes attached to high stakes tests are not distorting test outcomes, and that high stakes test results accurately reflect student achievement.

The report finds that score levels on high stakes tests closely track score levels on other tests, suggesting that high stakes tests provide reliable information on student performance. When a state’s high stakes test scores go up, we should have confidence that this represents real improvements in student learning. If schools are “teaching to the test,” they are doing so in a way that conveys useful general knowledge as measured by nationally respected low stakes tests. Test score levels are heavily influenced by factors that are outside schools’ control, such as student demographics, so some states use year-to-year score gains rather than score levels for accountability purposes. The report’s analysis of year-to-year score gains finds that some high stakes tests are less effective than others in measuring schools’ effects on student performance.

The report also finds that Florida, which has the nation’s most aggressive high stakes testing program, has a very strong correlation between high and low stakes test results on both score levels and year-to-year score gains. This justifies a high level of confidence that Florida’s high stakes test is an accurate measure of both student performance and schools’ effects on that performance. The case of Florida shows that a properly designed high stakes accountability program can provide schools with an incentive to improve real learning rather than artificially improving test scores.

The report’s specific findings are as follows:

  • On average in the two states and seven school districts studied, representing 9% of the nation’s total public school enrollment, there was a very strong population-adjusted average correlation (0.88) between high and low stakes test score levels, and a moderate average correlation (0.45) between the year-to-year score gains on high and low stakes tests. (If the high and low stakes tests produced identical results, the correlation would be 1.00.)
  • The state of Florida had by far the strongest correlations, with a 0.96 correlation between high and low stakes test score levels, and a 0.71 correlation between the year-to-year gains on high and low stakes tests.
  • The other state studied, Virginia, had a strong 0.77 correlation between test score levels, and a weak correlation of 0.17 between year-to-year score gains.
  • The Chicago school district had a strong correlation of 0.88 between test score levels, and no correlation (-0.02) between year-to-year score gains.
  • The Boston school district had a strong correlation of 0.75 between test score levels, and a moderate correlation of 0.27 between year-to-year score gains.
  • The Toledo school district had a strong correlation of 0.79 between test score levels, and a weak correlation of 0.14 between year-to-year score gains.
  • The Fairfield, Ohio, school district had a moderate correlation of 0.49 between test score levels, and a moderate negative correlation of -0.56 between year-to-year score gains.
  • The Blue Valley, Kansas, school district had a moderate correlation of 0.53 between test score levels, and a weak correlation of 0.12 between year-to-year score gains.
  • The Columbia, Missouri, school district had a strong correlation of 0.82 between test score levels, and a weak negative correlation of -0.14 between year-to-year score gains.
  • The Fountain Fort Carson, Colorado, school district had a moderate correlation of 0.35 between test score levels, and a weak correlation of 0.15 between year-to-year score gains.
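The correlations above are standard Pearson correlation coefficients, which equal 1.00 when two tests rank and space schools identically and 0 when their results are unrelated. A minimal sketch of the computation is below; the school-level scores are hypothetical illustrations, not data from the report.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical average scores for five schools on a high stakes test
# and on a low stakes test given around the same time.
high_stakes = [62, 71, 55, 80, 67]
low_stakes = [60, 74, 53, 78, 70]

print(round(pearson(high_stakes, low_stakes), 2))

# Identical results on the two tests would yield a correlation of 1.00.
print(round(pearson(high_stakes, high_stakes), 2))  # 1.0
```

A correlation near 1 on score levels, as in Florida (0.96), means schools that score well on one test score well on the other; a correlation near 0 on year-to-year gains, as in Chicago (-0.02), means improvement on one test tells us little about improvement on the other.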

