States are going to give standardized tests for this school year. But what’s the best way to report scores from those tests in a way that’s useful, highlights students with the greatest needs, and isn’t fundamentally misleading?
An assessment expert thinks he can help, and he’s shared his proposal with the nation’s state schools chiefs. A big part of his approach involves thinking about the tests less like traditional exams and more like a census. Data systems that have tracked individual students over the last five years are another major piece of it.
But the plan from Andrew Ho, a professor at the Harvard Graduate School of Education, doesn’t have neat solutions for all the potential obstacles involved in testing this year. And of course, it might not win over many doubters whose concerns about the tests extend beyond his pitch.
The U.S. Department of Education announced Feb. 22 that it is not entertaining requests from states to cancel standardized tests for this school year mandated by the Every Student Succeeds Act, despite a push from some states to do so. Not all states have given up hope of nixing these tests or replacing them in some way. But with the pandemic’s disruptions, many states will still confront significant challenges for testing.
One of the biggest will be how to report scores in a way that’s meaningful, but accounts for remote learning, students who’ve largely or entirely vanished from school, testing opt-outs, and other issues. In short: There’s a lot of doubt out there about putting much if any stock in these scores.
With these issues in mind, Ho has pitched three test-score metrics for states to report and how to get them from the standardized exams. Late last month, after the Biden Education Department’s announcement, he presented his plan to a collaborative on testing at the Council of Chief State School Officers.
“We know the problems. The problems are substantial. They’re massive,” Ho said in an interview. “There is no way to interpret scores this year like it’s business as usual without massive misinterpretations and technical flaws.”
At the same time, Ho stressed, the problems shouldn’t seem so daunting that nobody tries “to give technical solutions and to minimize those flaws.”
Equity, trends, and matches are key
There are three main elements of Ho’s proposal.
1. The first part is to report the percentage of students from this year’s state testing who have comparable previous test scores. Indeed, Ho stressed that this percentage, not the scores themselves, should be the first thing states focus on when reporting testing data. This would mean looking at which students took the tests two years ago, and seeing whether they took the tests for this academic year. Remember: Last year states canceled their tests en masse, so there aren’t test scores from that time.
So, for example, states would report the percentage of students who, two years ago, took the tests in the 3rd grade and also took the tests in the 5th grade this year. This would require state data systems to track individual students.
Ho says that this amounts to conducting an educational census that would help states sort students into two important groups: one for which the state has comparable test score data, and the other for which there isn’t test score data.
“It instantly divides your attention into two deserving groups,” he said, calling this piece of his idea the “match rate.”
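In practice, this “match rate” amounts to an intersection over student IDs tracked by a state’s longitudinal data system. A minimal sketch, using hypothetical IDs rather than any real state data:

```python
# Minimal sketch of the "match rate" idea: the share of a prior testing
# cohort that also has a comparable score this year. Student IDs and
# cohorts here are hypothetical, not drawn from Ho's memo.

def match_rate(tested_2019_ids, tested_2021_ids):
    """Fraction of the 2019 cohort that also tested in 2021."""
    cohort = set(tested_2019_ids)
    if not cohort:
        return 0.0
    matched = cohort & set(tested_2021_ids)
    return len(matched) / len(cohort)

# Example: 3 of 4 students from the 2019 cohort tested again this year.
print(match_rate({"s1", "s2", "s3", "s4"}, {"s2", "s3", "s4", "s9"}))  # 0.75
```

The complement of the matched set is equally important: those are the students for whom the state has no comparable score and who may need the most attention.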
2. The second part would focus on the students for whom there is comparable test score data from two school years ago.
Ho proposes that for those students, states find their previous “academic peers.” In other words, states would identify students from 2017 and 2019 who performed at similar levels on the exams. Then they would study how the 2017 group performed on 2019 tests, and how the 2019 group performed on 2021 tests. (This is why Ho’s plan relies on states having longitudinal data systems and “stable” testing systems dating back to the 2016-17 school year.)
This method would help people determine the extent to which the pandemic has affected students’ academic progress, compared to similar students from before COVID-19. Ho calls this a “fair trend” approach.
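The “fair trend” comparison could be sketched roughly as below. The coarse score-binning used here to define “academic peers,” and all of the data, are invented for illustration; they are not Ho’s actual methodology.

```python
# Rough sketch of a "fair trend" style comparison: group students into
# bins of "academic peers" by baseline score, then compare how each bin
# fared two years later. Bin width and data are illustrative assumptions.
from statistics import mean

def peer_growth(baseline, followup, bin_width=10):
    """Average follow-up score for each bin of baseline scores."""
    bins = {}
    for sid, score in baseline.items():
        if sid in followup:  # only students tested in both years
            bins.setdefault(score // bin_width, []).append(followup[sid])
    return {b: mean(v) for b, v in bins.items()}

# Pre-pandemic trend: how 2017 students fared on the 2019 tests.
pre = peer_growth({"a": 52, "b": 55, "c": 71}, {"a": 60, "b": 64, "c": 80})
# Pandemic-era trend: how 2019 students fared on the 2021 tests.
post = peer_growth({"x": 53, "y": 57, "z": 70}, {"x": 55, "y": 57, "z": 74})
# Comparing pre and post bin by bin shows whether similar students
# progressed less (or more) during the pandemic.
```

Comparing the two results bin by bin is what lets analysts say how students fared relative to academically similar peers from before COVID-19.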
3. The third part focuses on students who don’t take the tests this year and who the system has lost track of, or what Ho calls the “equity check.”
For those students, Ho proposes starting with their 2019 scores, identifying their academic peers from 2017, and then examining the test scores that peer group earned in 2019.
Ho admitted that this third piece of his plan “requires the most guesswork,” but could still tell a meaningful, descriptive story. Yet he also said this “equity check” would probably paint a best-case picture of where these missing students stand. Why? Because it “assumes academic learning rates for those who went missing from 2019 to 2021 are the same as those in 2017 to 2019,” Ho wrote.
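The “equity check” could be sketched as follows, assuming a simple tolerance band defines “academic peers.” The function, the tolerance, and the data are all hypothetical stand-ins, not details from Ho’s memo.

```python
# Sketch of the "equity check" for students who tested in 2019 but are
# missing in 2021: assume they progressed the way their 2017 academic
# peers did from 2017 to 2019. Tolerance and data are illustrative.
from statistics import mean

def equity_check(missing_2019_score, scores_2017, scores_2019, tol=2):
    """Best-case estimate of where a missing student stands now: the
    average 2019 score of 2017 students who scored similarly."""
    peers = [scores_2019[sid]
             for sid, score in scores_2017.items()
             if sid in scores_2019
             and abs(score - missing_2019_score) <= tol]
    return mean(peers) if peers else None

# A student who scored 50 in 2019 is matched to 2017 students "a" and
# "b" (scores 49 and 51); their average 2019 score is the estimate.
print(equity_check(50, {"a": 49, "b": 51, "c": 70},
                   {"a": 55, "b": 57, "c": 80}))  # 56
```

As Ho himself cautions, this is a best-case picture: it assumes the missing students kept learning at the same rate as their pre-pandemic peers.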
Concerns about the tests will persist
Ho’s memo, which includes technical details about things like students who leave schools between academic years, does not directly address state tests for high schools. States are required to give exams in certain subjects once during the high school grades, although federal law doesn’t mandate a particular grade.
In his memo, Ho advises states to “keep it simple” lest states “risk the public trust on what appears to be a black box.” In the interview, he acknowledged that when it comes to thinking about the data, “It is better to be complicated than wrong.”
The goal for this year, Ho stressed, should be to assess not just student needs, but where the pandemic has had the most dramatic effects: “What we don’t know is who needs disproportionate support this year compared to previous years.”
What about states that give tests in the fall? The Education Department told states they could administer 2020-21 tests outside the typical testing window, such as in the fall of this year. Yet Ho says he doesn’t think his approach would work for such tests because there aren’t past fall tests to use as an appropriate baseline. In addition, he said, things like extended learning opportunities (think summer school) that states might provide over the next several months, or, conversely, any sort of “summer slide” make fall testing a dubious proposition in general.
What about parent opt-out? The prospect of large numbers of parents keeping their kids home (or otherwise away) from the state tests has raised a host of concerns about who will be tested and how results from those tests could be misinterpreted and misleading. What if many high-achieving students take the tests, for example, but their counterparts don’t?
Ho said that’s the sort of distortion his first measure, the “match rate,” is meant to counteract, because it emphasizes that this is an atypical year and would clearly demonstrate that in some cases large shares of students didn’t take this year’s test.
What about tests administered remotely? Ho acknowledged that remote administration of exams presents a “serious risk” to the comparability of scores. But he leaves it to ongoing or “post hoc” research to decide whether comparing in-person and remote test scores can be done fairly. And he says that in either case, the metrics he’s proposing can still be useful.
Not surprisingly, opponents of testing for this year have a list of concerns not addressed by Ho’s plan.
In an Education Week opinion piece from December, Lorrie Shepard, a distinguished professor at the University of Colorado’s school of education, said that if officials considered all the downsides, from parent sentiment to logistical problems, they would cancel most if not all of the state tests.
In an interview, Shepard said that Ho’s key suggestions for things like creating a “fair trend” are good ones from a statistician’s point of view. But she said that’s no guarantee states will handle the scores fairly.
“We are all aware that when things are posted publicly, it is usually the simplistic interpretation that most people rely on,” Shepard said. “There will be misinterpretations and there will be misuse.”
High opt-out rates still pose problems for Ho’s approach, she noted, even though it tries to account for them.
Shepard also stressed that pre-pandemic data about other things, such as access to internet-connected devices, could help direct resources more efficiently than test scores.
Ideally, test scores will “serve a tertiary purpose,” Ho said. Measures of students’ physical and mental well-being, as well as things like local assessments, should also be important considerations for state leaders.
Still, “screaming from the rooftops with bad data” about tests would be not just unhelpful but damaging, Ho stressed.
“When would anybody trust you again?” Ho asked.