A decade ago, Harvard’s Jon Fullerton and I penned “The Numbers We Need,” arguing that educational data systems focus too much on metrics useful to policymakers and too little on the numbers that are useful to educators and families. We observed that the data most useful to policymakers “are often simple, straightforward data on assessment results and graduation rates, whereas the key data for district officials shed light inside the black box of the school and district—illuminating why those results look like they do and what might be done about them.”
Those same concerns are still with us today. And the disruptions of the pandemic have only made the need for useful, actionable data more pressing. Well, in a recent report, “Bridging the Gaps in Education Data,” Fullerton, the executive director of Harvard’s Center for Education Policy Research, tries to make sense of why dissatisfaction with data remains so high and what we can do about it. Fullerton has worked intimately with state and district data over the years, as part of CEPR’s Strategic Data Project, and he’s a wealth of wisdom on this topic. (Full disclosure: The analysis was published by AEI Education, which I direct.)
Fullerton argues that we never seem to have enough data for two reasons: technical constraints and normative disagreements about what schools should do and how to measure this effectively. To make sense of all this, he walks through the “data gaps” that bedevil early childhood, school spending, postsecondary outcomes, and tutoring interventions.
For instance, when it comes to determining whether tutoring is effective, he notes that we frequently don’t know how much is spent on particular interventions or even which students receive which interventions. Fullerton writes that, so long as this is the case, neither school systems nor evaluators will “be able to determine whether tutoring worked,” “the cost of tutoring relative to student growth,” or “whether tutoring is more or less cost-effective than other interventions.”
These kinds of problems can be solved. If we want to understand whether and when tutoring (or any other intervention or program) actually works, Fullerton says schools need to get consistent about how they collect and report the essential data. Districts, he explains, do not currently standardize or even collect basic information: Is a student being tutored in math or history? During or after the school day? Does the student even attend each session? Fullerton suggests that schools track program and intervention participation and integrate that into student-information systems. So, for instance, “A student receiving high-dosage tutoring might get tagged with ‘high-dosage tutoring, [provider name], math, two hours per week, in-school, in-person.’”
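To make that concrete, here is a minimal sketch of what such a tag could look like as a record in a student-information system. It is an illustration under stated assumptions, not anything taken from Fullerton's report: the class name, every field name, and the sample student are hypothetical, and the provider is left as the same placeholder the report uses.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InterventionTag:
    """One student's participation in one intervention.

    Every field name here is a hypothetical illustration, not a real
    student-information-system schema.
    """
    student_id: str
    program: str          # e.g. "high-dosage tutoring"
    provider: str         # placeholder; the report does not name a provider
    subject: str          # e.g. "math" or "history"
    hours_per_week: float
    setting: str          # "in-school" or "after-school"
    modality: str         # "in-person" or "virtual"
    sessions_attended: List[bool] = field(default_factory=list)

    def label(self) -> str:
        """Build a human-readable tag in the style quoted above."""
        hours = f"{self.hours_per_week:g} hours per week"
        return ", ".join(
            [self.program, self.provider, self.subject,
             hours, self.setting, self.modality]
        )

    def attendance_rate(self) -> Optional[float]:
        """Share of scheduled sessions attended, or None if none are logged."""
        if not self.sessions_attended:
            return None
        return sum(self.sessions_attended) / len(self.sessions_attended)


# Hypothetical student record, for illustration only.
tag = InterventionTag(
    student_id="S-001",
    program="high-dosage tutoring",
    provider="[provider name]",
    subject="math",
    hours_per_week=2,
    setting="in-school",
    modality="in-person",
    sessions_attended=[True, True, False, True],
)

print(tag.label())
print(tag.attendance_rate())
```

The point of structuring the tag this way is that each of the questions above (which subject, during or after the school day, whether the student showed up) becomes a field that can be filtered and counted, rather than a note buried in a spreadsheet.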
Even if we get such things right, though, data dissatisfaction will persist because of broader technical shortfalls, especially with how we measure outcomes. For instance, Fullerton notes that “grades are not particularly reliable in demonstrating students’ actual skills,” while standardized tests omit important skills and competencies. As he puts it, “Measures, particularly of educational outcomes, almost never capture the richness of what we want to measure efficiently and in time to be useful.”
And then, of course, Fullerton points to the role of normative disagreement about what data we should be collecting and why. As he writes, technical debates about how and when to collect data often skip past the fact “that we don’t agree, at least in any deep way, on the specifics about what education is for and what we are even trying to measure.”
In the end, Fullerton wisely cautions, “We must also maintain an appropriate sense of humility and realize that data will not answer all our questions or make our core disagreements go away.” We shouldn’t ask data to do what they can’t. But we should also do what we can to make sure the data do what they can to help us understand the impact of our dizzying array of programs, practices, and interventions.