First came high stakes tests, the educational equivalent of trying to improve children's physical fitness by measuring their body mass index, strength and stamina, then measuring them again next year. And the next year. And the year after that.
High stakes tests yield terabytes of data, but no measurable student improvement. All we learn from the time consuming, curriculum distorting exercise is, test scores correlate with family income. Actually, we don't even learn that. We knew it already.
Then came A-F school grades issued by the state based on students' scores on the high stakes tests. In their original form, they were just a different way of presenting schools' test scores. The only added value was, they made intuitive sense to people who want a simple way of rating schools. We all know what letter grades on report cards mean, so the system was easy to understand. Schools with an "A" or "B" grade were likely to have mostly middle-to-high income students and high academic achievement. The "C," "D" and "F" schools were likely to have lower income students and lower academic achievement.
Lots of people complained about the grades, with good reason. They echoed the class bias of test scores, but the grades made the results even more judgmental. They lavished praise on schools with high income students ("You get an A! You get a B!") while they labeled schools with low income students anywhere from average to failing. No matter how talented the teachers and administrators at the schools teaching low income students were, no matter how hard they worked, it was nearly impossible for them to get the top grades schools with higher income students received as a matter of course.
People at the Department of Education heard the complaints, so they decided to try and make the grading system more nuanced. Educators, statisticians and computer techies set to work to create a weighting system that would make the grades more equitable.
The changes were at least a partial success. The current state grades reflect more than the students' family income. That's a step in the right direction, isn't it?
Well, maybe. But the changes create a new problem. If the new, improved grading system doesn't tell us which schools have the highest test scores, what does it tell us?
To try and figure that out, let's take a look at four elementary schools in TUSD that all earned a "B" this year: Ochoa, Holladay Magnet, Gale and Sam Hughes.
From 2018 to 2019, Ochoa Elementary accomplished the almost-unheard-of feat of jumping three letter grades, from an "F" to a "B." Holladay Magnet Elementary made a major leap as well, from a "D" to a "B" in 2019.
Meanwhile, Sam Hughes Elementary and Gale Elementary, which were "A" schools in 2018, dropped a grade to a "B" in 2019.
So, all four TUSD schools earned an identical "B" grade from the state in 2019. When most people see those grades, it would be reasonable for them to assume the schools are in the same ballpark when it comes to student achievement. But that assumption would be wrong.
Look at the passing rates in the Language Arts and Math portions of the 2019 AZMerit test at each school.
Ochoa Elementary. Language Arts: 22%. Math: 30%.
Holladay Magnet Elementary. Language Arts: 34%. Math: 37%.
Gale Elementary. Language Arts: 60%. Math: 55%.
Sam Hughes Elementary. Language Arts: 75%. Math: 65%.
In Language Arts, there is a 53 percentage point spread in the schools' passing rates, from 22 percent to 75 percent. In Math it's a 35 point spread, from 30 percent to 65 percent.
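For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The passing rates are the ones listed above; the dictionary layout and the `spread` helper are just illustrative names, not anything the state uses.

```python
# 2019 AZMerit passing rates (percent) for the four "B" schools, as quoted above.
passing_rates = {
    "Ochoa": {"Language Arts": 22, "Math": 30},
    "Holladay Magnet": {"Language Arts": 34, "Math": 37},
    "Gale": {"Language Arts": 60, "Math": 55},
    "Sam Hughes": {"Language Arts": 75, "Math": 65},
}


def spread(subject):
    """Return (lowest rate, highest rate, gap in percentage points) for a subject."""
    rates = [school[subject] for school in passing_rates.values()]
    return min(rates), max(rates), max(rates) - min(rates)


for subject in ("Language Arts", "Math"):
    low, high, gap = spread(subject)
    print(f"{subject}: {low}% to {high}%, a {gap} point spread")
```

Four schools, one letter grade, and the gap between the lowest and highest passing rates is wider than the lowest rate itself.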
So what do we learn about the academic achievement of students at the four schools from their state grades? Somewhere between little and nothing.
There is a good explanation for the four schools receiving "B" grades despite the wide difference in their students' passing rates on the AZMerit. It's because in the current system, the single most important factor in determining school grades is student growth. The amount students' test scores rise or fall from year to year accounts for half of a school's grade. The other half is divided among three categories: student proficiency, English Language Learners' growth and proficiency, and a variety of "acceleration and readiness measures."
So let's look at the change in passing rates at the four schools. Ochoa had an 8 point increase in Language Arts and a 13 point increase in Math. Holladay had an 11 point increase in Language Arts and a 15 point increase in Math. Those unusually high increases in the students' passing rates are the reason the schools jumped from an "F" and "D" in 2018 to a "B."
The passing rates of both schools with an "A" in 2018 dropped in 2019. Sam Hughes students had the same passing rate in Language Arts, but they dropped 8 points in Math. Gale students gained 1 point in Language Arts, but they dropped 5 points in Math. That's the primary reason each school fell a notch from an "A" to a "B."
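The year-to-year changes also let us back out roughly where each school stood in 2018. The sketch below is only arithmetic on the figures quoted in this piece: it subtracts each change from the 2019 passing rate. It assumes every change is measured in percentage points (including Ochoa's Language Arts gain), and the names `rates_2019`, `changes` and `implied_2018` are made up for illustration. The results are implied numbers, not official 2018 data.

```python
# 2019 AZMerit passing rates (percent), as quoted in this piece.
rates_2019 = {
    "Ochoa": {"Language Arts": 22, "Math": 30},
    "Holladay Magnet": {"Language Arts": 34, "Math": 37},
    "Gale": {"Language Arts": 60, "Math": 55},
    "Sam Hughes": {"Language Arts": 75, "Math": 65},
}

# Change from 2018 to 2019, assumed to be in percentage points (negative = drop).
changes = {
    "Ochoa": {"Language Arts": 8, "Math": 13},
    "Holladay Magnet": {"Language Arts": 11, "Math": 15},
    "Gale": {"Language Arts": 1, "Math": -5},
    "Sam Hughes": {"Language Arts": 0, "Math": -8},
}


def implied_2018(school, subject):
    """Back out the 2018 passing rate implied by the 2019 rate and the change."""
    return rates_2019[school][subject] - changes[school][subject]


for school in rates_2019:
    for subject in ("Language Arts", "Math"):
        before = implied_2018(school, subject)
        after = rates_2019[school][subject]
        print(f"{school}, {subject}: about {before}% in 2018, {after}% in 2019")
```

Run it and the pattern is plain: the two schools that climbed to a "B" started from passing rates in the teens and low twenties, while the two that slipped to a "B" still pass more than half their students.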
With the new scoring system, it's possible for lower income schools to shine because of student growth on state tests, even if their post-growth passing rates are still low, and higher income schools with high test scores can drop a grade or two if their passing rates slip. In the vast majority of cases, state grades still correlate with parental income, meaning they retain their socioeconomic bias, but the link between grades and income is no longer as certain as it once was.
It's ironic that the state grading system took a step forward in terms of equity but took a step backward in terms of clarity, which defeats the purpose of the A-F grades. To "improve" the state grading system, the state had to render it meaningless.
That's where things stand. We're stuck with a high stakes testing regimen that drives teachers and students crazy, eats into quality school time (No time for recess, art or music, just drill and test, drill and test, drill and test), and distorts the Language Arts and Math curricula into test performance delivery systems. And we have a state grading system that's more equitable than the raw test scores, but there's no way of knowing what a school's grade means without digging into the data.
Which leaves us with two options. We can try to fix the unfixable by tweaking the high stakes tests and the state grading system. Or we can stop the madness and admit our 17 year, No Child Left Behind experiment with evaluating students and schools by assigning them a number or a grade is a failure, then do the right thing and throw the whole goddam NCLB mess onto the trash heap of educational history.