| How preliminary research on the Cleveland voucher program was ‘reanalyzed’ to fit a preconception. |
Firm conviction often leads people to go to great lengths to convince themselves and others of the value of a cause they believe to be obvious. At the extreme, science and research are invoked and manipulated to support the cause at hand. “See,” the believers tell us, “we have research which proves our point,” even when the facts of the matter contradict this. When greed is involved, when the issue is extremely contentious, or when passion about the issue is high, manipulation of the facts and bad science are even more likely. The stories we are now hearing about the misuse and manipulation of science by the tobacco companies are illustrative of the extent to which this can be taken. The recent manipulation of science and distortion of facts surrounding a much-publicized study of school vouchers in Cleveland are another example.
In the spring, much media attention was given to the preliminary findings of a study on the effects of the private-school-voucher program in Cleveland. The study, which was conducted by colleagues and me at the Indiana Center for Evaluation at Indiana University’s school of education, found no significant differences in academic achievement between voucher students and Cleveland public school children after slightly less than one year in the program. In the first-year report, we pointed out that it was much too early to draw conclusions about the effects of the voucher program, and we cautioned against making too much of the results. Education is, we noted, a cumulative process, and one year of the program is not a sufficient basis on which to judge its effects. We strongly urged that those on both sides of the school choice issue wait until evidence is available over several years.
Unfortunately, though perhaps not surprisingly, a small though vocal group of voucher advocates has attempted to cloud the issue by manipulating the data in a “reanalysis” of their own. Though all of this may seem to be nothing more than an abstract, academic “cat fight,” the outcome of this debate is likely to affect thousands of children and their families. It is absolutely critical that families and policymakers be provided with legitimate, timely, factual information on which to base their decisions about school choice.
As members of the research team, we have, until now, elected to focus our efforts on ensuring the quality of our ongoing work with the public and private school educators in Cleveland. However, the importance of the issue and misrepresentations of our recent work and findings have reached a point at which we feel we must respond. Thus, though we would much prefer to invest all of our energy and time in the continuing research in Cleveland, we feel it necessary to explain what we did in the study, why we did it, and why applying bad science in order to create the findings desired by one side or the other is unfair to the children whose educational futures hang in the balance.
| School choice and publicly funded voucher programs undoubtedly represent the most contentious issue facing American public education today. |
Let me make one fact clear: We are neither advocates nor opponents of school voucher programs. In fact, the faculty and graduate students who make up our team represent a cross section of backgrounds, many of us experienced with both public and private schools as students, teachers, parents, and administrators. We have been contracted by the Ohio Department of Education and maintain independent responsibility for our work. Thus, we have no point to prove. We do not wish to change policy or to make policy. Our only goal is to conduct unbiased, objective research that can inform policy and parents who wish to do the very best for their children.
The voucher advocates have criticized our study for what they believe to be weaknesses and then reanalyzed our data using what they suggest are different and more appropriate statistical techniques. With a surprising air of vehemence and confidence, they have made much of their new findings that vouchers improved students’ academic performance. Leaving aside for a moment the issue of bad science, no matter how confidently presented, I would like to explain why my colleagues and I continue to stand behind our work and findings.
Our study examined the academic achievement of 125 3rd grade voucher students and the achievement of approximately 450 Cleveland public school 3rd graders. We used 3rd grade students, even though their numbers were relatively small (an approach criticized by the advocates), because they were the only voucher students for whom background data were available. These data, particularly achievement-test scores from 2nd grade, allowed us to consider differences between the two groups (voucher and nonvoucher students) before they entered the program. In the present study, children who participated in the voucher program were already achieving at significantly higher levels than their public school counterparts in 2nd grade, even before they entered the scholarship program. Without taking these beginning differences into consideration, it is impossible to tell whether 3rd grade differences were the result of the voucher program. It would be a little like trying to determine who won a basketball game by looking only at the points scored in the second half of the game.
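The logic of taking “beginning differences” into account can be sketched as a simple covariate adjustment, in which 3rd grade outcomes are compared between groups only after accounting for each child’s 2nd grade score. The sketch below is purely illustrative, with invented data; it is not the study’s actual model or results.

```python
import numpy as np

# Hypothetical illustration of adjusting for pre-program achievement
# (an ANCOVA-style covariate adjustment). All numbers are invented.
rng = np.random.default_rng(0)

n = 100
pretest = rng.normal(50, 10, n)       # 2nd grade scores, before the program
group = rng.integers(0, 2, n)         # 1 = voucher student, 0 = public school
# In this fabricated data, the posttest depends on the pretest but NOT
# on group membership -- i.e., no true program effect.
posttest = 5 + 0.9 * pretest + rng.normal(0, 5, n)

# Design matrix: intercept, group indicator, pretest covariate.
X = np.column_stack([np.ones(n), group, pretest])
coef, *_ = np.linalg.lstsq(X, posttest, rcond=None)

# coef[1] is the group difference AFTER adjusting for pretest scores.
# Comparing raw posttest means instead would confound any pre-existing
# gap between the groups with the program's effect -- the "second half
# of the basketball game" problem described above.
print("adjusted group effect:", coef[1])
```

Without the pretest column in the design matrix, any head start the voucher group had in 2nd grade would be misread as an effect of the program itself.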
The critics also suggest that we did not include in our study the 31 3rd grade voucher students who attended the two private HOPE Academies sponsored by Akron, Ohio, industrialist and voucher advocate David Brennan. This is blatantly untrue. These 31 students were included in our study. However, their test scores were examined in a separate analysis from those of the 94 other voucher children, because they took a different test, one that assessed different information and was given under very different conditions. HOPE school representatives refused to allow us to test their students. As a result, the only test data available for HOPE students were from a test being used in a separate self-evaluation of the HOPE schools, which was given in short sessions over several days.
In contrast, the 94 students in the larger group were tested using the most current available achievement test given in a single, three-hour session on one day and conducted by proctors who were trained and supervised by our evaluation staff. Both sets of test scores may provide useful information, but they are considerably and obviously different. The test scores from the one test mean something very different from those of the other. To illustrate, it would be much like lumping together the overall score of athletes who completed 10 decathlon events over 10 days with the average time of athletes who completed a marathon in a single day. What would this bundle of numbers mean? Nothing.
The advocates have argued that combining these two sets of scores is not problematic and, in their reanalysis, they do combine them. They justify this by noting that the test manufacturer provides tables with which to convert scores on one test to those on the other. However, this issue was discussed at length with the test manufacturer’s vice president for research several months before the release of our report. In written correspondence with us, this individual, who was responsible for overseeing the development and use of the tables cited by the critics, cautioned against combining scores across the two tests.
Two additional issues raised by the critics seem further to suggest that their motives may not be completely pure. These relate to the statistical methods used to analyze the data and to the interpretation of results. In their reanalysis of the data we collected, the advocates indicate that they applied more appropriate statistical techniques than we had used in the original study. They imply that we intentionally applied techniques to reduce the chances of finding significant voucher effects, tantamount to “lying with statistics.” However, the facts again contradict their argument. First, the analyses we conducted produce exactly the same statistical results as theirs when applied to the same data. The complex technical reasons for this need not be discussed here, but the outcomes of the two approaches are mathematically and statistically equivalent. Thus, in order to achieve their results, the advocates had to do at least one of two things which have already been shown to be inappropriate. They had to: (a) combine the differing HOPE students’ scores with those of the 94 other students; and/or (b) fail to control for differences in students’ achievement levels before they entered the voucher program.
| Applying bad science to create the findings desired by one side or the other is unfair to the children whose educational futures hang in the balance. |
A second issue concerns how statistical results come to be judged significant or meaningful. Scientific research functions within limits or standards agreed upon by members of the research community. One of these is the standard a research result must meet in order to be considered significant or “real.” In two of the five areas we measured, the differences found between the voucher and nonvoucher students were near this standard, but not quite large enough to reach it. In order to claim significant results (that the program made a difference), the advocates applied standards that quadrupled the likelihood that they were wrong. In other words, when they claim that their results indicate that the voucher program impacted students’ achievement, they are four times more likely to be wrong than is commonly accepted!
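The arithmetic behind “four times more likely to be wrong” can be made concrete. The article does not name the exact thresholds involved, so the .05 and .20 levels below are assumptions chosen to illustrate a quadrupling of the conventional standard, and the p-value is invented.

```python
# Illustration of how relaxing a significance threshold inflates the
# risk of a false positive. The .05 and .20 levels are assumed here to
# illustrate "quadrupling"; the article does not state the exact levels.
conventional_alpha = 0.05   # widely accepted standard for "significant"
relaxed_alpha = 0.20        # a standard four times as permissive

# A hypothetical result that is "near but not quite" significant
# under the conventional standard:
p_value = 0.09

significant_conventional = p_value < conventional_alpha   # not significant
significant_relaxed = p_value < relaxed_alpha             # "significant"

print("conventional standard:", significant_conventional)
print("relaxed standard:     ", significant_relaxed)
print("false-positive risk ratio:", relaxed_alpha / conventional_alpha)
```

Under the relaxed threshold, the same result flips from “not significant” to “significant” while the chance of declaring a nonexistent effect real grows fourfold.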
Did the advocates intentionally manipulate the data to get the results they wished? Or, did they unknowingly apply inappropriate research techniques? The members of the team of advocates, though some of them represent prominent institutions, are strong supporters of vouchers and have done much to promote the implementation of voucher programs throughout the country. So it is possible that they are engaged in a deliberate effort to misrepresent the Cleveland data in order to influence educational policy. We would prefer to believe that scholars would not do such a thing, no matter how strongly they believed in their cause. On the other hand, it is perhaps equally troubling to believe that the advocates were either so uninformed about acceptable standards of educational research that they unknowingly used inappropriate techniques, or they were so arrogant that they believed it unnecessary to follow accepted standards.
Only the advocates and those who fund them can know with certainty why they are acting as they are. Much of their earlier work on the voucher issue has been debunked by the research community, and their current work may be as well. But regardless of their intentions, their actions cloud rather than clarify the issues and serve to reduce the already limited public belief in research and researchers. The application of bad science, whether to educational policy or tobacco, through negligence, arrogance, indifference, greed, or ideology, does a disservice not just to the research community, but to the families whose lives depend on the results.
School choice and publicly funded voucher programs undoubtedly represent the most contentious issue facing American public education today. Billions of dollars and millions of children’s lives are at stake. It is absolutely critical that policymakers and parents know as much as possible about the effects of these programs. Do they increase students’ learning? Do they enable teachers and schools to be more creative and responsive? What is it about schools, public or private, chosen or assigned, that helps children learn more and feel better about school, and their parents become more involved and more satisfied with their children’s schools? Clear, objective, unbiased answers to these questions are necessary and well-applied science can provide many of them.
The ongoing research we and others are conducting of the Cleveland voucher program and other voucher programs throughout the country will continue to seek information that can be used by policymakers, teachers, and parents to make decisions about schools. Our goal is to inform policy, not to influence it. And with the cooperation of literally hundreds of education professionals in the public and private schools where we are working, we can do so.
What will be the answers to our questions about school choice? Only time and good research will tell. Will our work stand the tests of time and of scrutiny by the research community? We believe it will. Unfortunately, however, there are no clear answers yet. Until there are, it is important to separate research activity from political activity, information from opinion, and good science from bad.
If we allow bad science to guide us in establishing educational policy and hundreds of millions of children suffer as a result, will our grandchildren witness multibillion-dollar lawsuits over the negative effects, as we have witnessed in the tobacco litigation? Sadly, probably not.
Kim K. Metcalf is the director of the Indiana Center for Evaluation at Indiana University’s school of education in Bloomington.