Benchmark Assessments Offer Regular Checkups On Student Achievement

By Lynn Olson — November 29, 2005 10 min read

School districts worried about how students will perform on end-of-the-year state tests are increasingly administering “benchmark assessments” throughout the year to measure students’ progress and provide teachers with data about how to adjust instruction.

Nearly seven in 10 superintendents surveyed for Education Week this past summer said they periodically give districtwide tests, and another 10 percent said they planned to do so this school year. Such tests typically are aligned to state or district standards for academic content and given three to five times during the year. Some are given as often as monthly.

Victoria Todd, a 3rd grader at London Towne Elementary, finishes one of 38 problems on a Benchmark Assessment Resource Tool test.

Most benchmark assessments take about one hour each for reading and mathematics, though some also cover other subjects. Extensive reporting systems break down test results by the same student categories required under the federal No Child Left Behind Act, such as by race, income, disability, and English proficiency, in addition to providing individual progress reports at the district, school, classroom, and student levels.

“I do believe that three years from now, certainly five years from now, no one will remember a time when there weren’t benchmarks,” said Robert E. Slavin, the director of the Center for Data-Driven Reform in Education, at Johns Hopkins University.

Burgeoning Market

That’s certainly what test vendors hope. Last year, Eduventures Inc., a market-research firm based in Boston, identified benchmark assessments as one of two high-growth areas in the assessment industry, alongside state exams, with a compound annual growth rate of greater than 15 percent. The company predicted that by 2006, what it called “the formative-assessment market”—using a term sometimes treated as a synonym for benchmark assessment—would generate $323 million in annual revenues for vendors.

But while many assessment experts agree that the idea of frequent testing of students to monitor their learning and adjust instruction is sound, some also warn that districts should take a close look at what they’re getting for their money and how they are using such exams.

“You might say that the message here is, ‘Get a second opinion,’ ” said Grant Wiggins, the president of Authentic Education, a Hopewell, N.J.-based consulting service that works with districts.

It’s no secret why districts are turning to benchmark tests. The No Child Left Behind Act, signed into law by President Bush in January 2002, and states’ own accountability systems have created a high-stakes environment in which both districts and schools can face penalties for failing to meet performance targets.

View a complete collection of stories in this Education Week special report, Testing Takes Off.

In this standards-based environment, the feeling is that the sooner and more often schools have information about how they’re doing against the standards, the better.

“The reason that there is a boom in benchmark assessments is that most states and school systems are providing nothing more than autopsy reports right now,” said Douglas B. Reeves, the founder of the Center for Performance Assessment, a private consulting organization based in Denver that works with districts to design fair and rigorous assessments and classroom activities. “They tell you why the patient died at the end of the year, and then marveled that the patient didn’t get any better.”

Studies by the Washington-based Council of the Great City Schools, the Austin, Texas-based National Center for Educational Accountability, and others have found that one feature of high-achieving districts is their use of periodic, benchmark assessments to track student achievement and make adjustments.

“Good formative assessments, good benchmark assessments,” Mr. Reeves said, “provide feedback throughout the year, and that is far more fair to principals and teachers, provided they are used wisely.”

Vendors Vary

In the past few years, according to Eduventures’ 2004 report, “Testing in Flux,” new competitors have flooded the formative-assessment market, including:

• Major test publishers, such as the New York City-based CTB/McGraw-Hill and the San Antonio-based Harcourt Assessment;

• Test-preparation companies, including the New York City-based Princeton Review;

• For-profit providers that specialize in linking assessment results with prescribed remediation plans and curricula, such as the San Diego-based Compass Learning and the New York City-based Kaplan K-12 Learning Services;

• Nonprofit organizations, such as the Portland, Ore.-based Northwest Evaluation Association; and

• Suppliers of “whole-school-reform models,” such as the New York City-based Edison Schools Inc. and Mr. Slavin’s Baltimore-based Success for All Foundation, which designed the 4Sight assessment series.

The products of such suppliers range from formatted tests linked to the standards in individual states, to item banks that districts and schools can use to develop their own assessments, to online testing, scoring, and reporting systems.

Skimming the Surface?

Lorrie A. Shepard, the dean of the school of education at the University of Colorado at Boulder, voices caution about the trend.

Benchmark-Test Market Foresees Growth

A 2004 report predicted that the market for benchmark or formative assessments would expand by a compound annual growth rate of more than 15 percent from 2003 to 2006.

TEST MARKET
New competitors have emerged in recent years to supply school districts with benchmark assessments. They include:

MAJOR TEST PUBLISHERS, such as CTB/McGraw-Hill, based in New York City, and the San Antonio-based Harcourt Assessment;

TEST-PREPARATION COMPANIES, including the Princeton Review, based in New York City;

SUPPLIERS of whole-school-reform models, such as Edison Schools Inc., of New York, and the Success for All Foundation, of Baltimore;

FOR-PROFIT PROVIDERS that specialize in linking assessment results with prescribed remediation plans and curricula, such as the San Diego-based Compass Learning and the New York City-based Kaplan K-12 Learning Services;

NONPROFIT ORGANIZATIONS, such as the Northwest Evaluation Association, in Portland, Ore.

SOURCE: Eduventures Inc., Education Week

While “not all formal benchmarking systems are bad,” she said, she worries about the effects of using 15- or 20-item multiple-choice tests that mirror the format of state exams to drive classroom instruction.

Previous research by Ms. Shepard and others has found that students who do well on one set of standardized tests do not perform as well on other measures of the same content, suggesting that they have not acquired a deep understanding.

“The data-driven-instruction fad means earlier and earlier versions of external tests being administered at quarterly or monthly intervals,” Ms. Shepard said. “The result is a long list of discrete skill deficiencies requiring inexperienced teachers to give 1,000 mini-lessons.”

Good benchmark assessments, she suggested, should include rich representations of the content students are expected to master, be connected to specific teaching units, provide clear and specific feedback to teachers so that they know how to help students improve, and discourage narrow test-preparation strategies.

Rather than trying to assess everything, added Mr. Reeves, the best benchmark tests focus on the most important state or district content standards. And they provide results almost immediately, in simple, easy-to-use formats, he said.

The National Center for Educational Accountability stresses that good benchmark assessments measure performance “on the entire curriculum at a deep level of understanding.” They also begin before grade 3 in both reading and math and provide a process to ensure that data on student performance are reviewed and acted upon by both districts and schools, the center says. In addition to such tests, it adds, districts may provide unit or weekly assessments that principals and teachers can use to monitor student progress.

Approaches Differ

But in talking about benchmark assessments, not everyone means the same thing.

According to Mr. Slavin, some benchmark tests, like 4Sight, are designed primarily to predict students’ performance on end-of-the-year state exams. They measure the same set of knowledge and skills at several points during the school year to see if students are making progress and to provide an early warning of potential problems.

Other benchmarks are tied more closely to the curriculum, and to the knowledge and skills students are supposed to have learned by a particular time. For example, a skill-by-skill benchmark series in math might focus on fractions in November, decimals in January, geometry in March, and problem-solving in May, rather than testing all skills at the same time, Mr. Slavin said.

Such benchmarks serve as pacing guides for teachers and schools, providing information on whether students have learned the curriculum they’ve just been taught. Some companies claim their tests serve both purposes, predicting students’ ultimate success on state tests and gauging how they’re progressing through the curriculum.

Historically, vendors would design one set of benchmark tests for the entire country. Now they craft tests for each state, starting with the larger ones.

What Are Benchmark Assessments?

While not everyone means the same thing by the term, benchmark assessments typically:

• Are given periodically, from three times a year to as often as once a month;

• Focus on reading and mathematics skills, taking about an hour per subject;

• Reflect state or district academic-content standards; and

• Measure students’ progress through the curriculum and/or on material in state exams.

SOURCE: Education Week

Many companies also work with districts to design the districts’ own assessments, tied to state and district standards, or permit districts and schools to modify previously formatted exams. Some vendors provide large, computerized item banks that teachers and schools can use to create their own classroom tests and check students’ progress on state standards.

Stuart R. Kahl, the president of Measured Progress, a Dover, N.H.-based testing company, says that while item banks hold great promise, because they permit teachers to design tests that can be used during the ongoing flow of instruction, one issue is whether teachers are prepared to use them appropriately.

“Now we’re putting individual items in the hands of teachers,” he said, “saying, ‘You construct the test; make it as long or as short as you want.’ Do we think they have the understanding to know how much stock they can put in the generalizations they make from such exams?”

Some also worry that as vendors have rushed in, quality has not kept pace. The Eduventures report noted that many vendors have marketed formative assessments “on the basis of the quantity of exam items, as opposed to those items’ quality.” For example, companies may tout having tens of thousands of exam items, it said, although many of the items have not been extensively field-tested or undergone a rigorous psychometric review.

“I think vendors in our space have found it challenging,” said Marissa A. Larsen, the senior product manager for assessment at the Bloomington, Minn.-based Plato Learning Inc., whose eduTest online assessment system is now used in more than 3,000 schools.

While districts sometimes apply the same psychometric standards to benchmark tests that are applied to high-stakes state exams, she said, “in many cases, that’s not what vendors in this space are trying to do. If we did that, it would be well beyond what districts could afford to buy for formative systems.”

Critics also say that even the best benchmark assessments are more accurately described as “early warning” or “mini-summative” tests, rather than as true “formative” assessments, which are meant to help adjust teaching and learning as it’s occurring. In contrast, summative tests are designed to measure what students have learned after instruction on a subject is completed.

“Formative assessments are while you’re still teaching the topic, providing on-the-spot corrections,” said Mr. Kahl. “With benchmark assessments, you’re finished. You’ve moved on. Not that you don’t get individual student information, but at that stage, it’s remediation.”

What Is ‘Formative’?

Yet Eric Bassett, the research director for Eduventures, said the terms formative and benchmark assessments are often used interchangeably in the commercial education market.

And that, some critics say, is precisely the problem.

“I recognize that I’ve lost the battle over the meaning of the term ‘formative assessment,’ ” said Dylan Wiliam, a senior researcher at the Educational Testing Service, based in Princeton, N.J.

In the 1990s, he wrote an influential review that found that improving the formative assessments teachers used dramatically boosted student achievement and motivation. Now that same evidence, he fears, is being used to support claims about the long-term benefits of benchmark assessments that have yet to be proven. “There’s a lack of intellectual honesty there,” Mr. Wiliam said. “We just don’t know if this stuff works.”

He and others say the money, time, and energy invested in benchmark assessments could divert attention from the more potent lever of changing what teachers do in classrooms each day, such as the types of questions they ask students and how they comment on students’ papers.

“If you’re looking, as you should be, at the full range of development that you want kids to engage in, you’re going to have to look at their work products, their compositions, their math problem-solving, their science and social-studies performance,” said Mr. Slavin of Johns Hopkins.

Mr. Wiggins of Authentic Education said that while some commercially produced benchmark assessments are far from ideal, they’re better than nothing. “I would rather see a district mobilizing people to analyze results more frequently,” he said. “That’s all to the good.”

The key point, he and others stress, is what use is made of the data.

“It’s only a diagnosis,” Mr. Slavin said. “If you don’t do anything about it, it’s like going to the doctor and getting all the lab tests, and not taking the drug.”
