Special Report
Artificial Intelligence From Our Research Center

Will AI Transform Standardized Testing?

By Alyson Klein — December 09, 2024
Custom illustration by Stuart Briers

A previous version of this article misstated the percentage of educators who believe artificial intelligence will make standardized testing worse, according to a survey conducted by the EdWeek Research Center earlier this year.

Here’s a multiple-choice question: Which of the following have educators said is a problem with current state standardized tests?

  • a. Teachers don’t get the test data back quickly enough.
  • b. The exams are not personalized for students’ interests or learning needs.
  • c. The exams don’t measure what students really need to know.
  • d. All of the above

The correct response, d., points to the big, long-standing problems with today’s standardized tests. That raises another question that has been circulating in education circles more recently: Can artificial intelligence mitigate those problems and significantly improve standardized testing?

For now, there’s no hard-and-fast answer. While AI has the potential to help usher in a new, deeper breed of state standardized tests, there are plenty of reasons for caution.

On the one hand, testing has long been due for a facelift, many experts argue.

The tests students now take—particularly the state standardized assessments that carry significant stakes for schools and districts—were developed for a time when the “dominant testing model was a lot of students sitting in a gym, taking a pencil and paper test,” said Ikkyu Choi, a senior research scientist in the research and development division of ETS, a nonprofit testing organization.

AI may be able to “provide much more engaging and relevant types of scenarios, conversations, interactions that can help us measure the things that we want to measure,” Choi said, including students’ ability to think critically and communicate. “We’re quite interested and excited, with the caveat that there are a lot of things that we need to be aware of and be careful about.”

AI’s greatest potential at this moment seems to be in helping with the nuts and bolts of assessments—including generating test items and scoring them more efficiently, as well as providing more actionable feedback to educators on their students’ strengths and weaknesses.

Technologies like natural language processing—AI’s ability to interpret and respond to human language in real time—may make it possible to gauge skills that educators say most traditional tests simply cannot capture, such as creativity and problem-solving.

But the technology comes with its own problems, experts add. For one thing, AI often produces inaccurate information without clearly indicating where it came from.

Plus, because AI is trained on data created by humans, it reflects human biases. In one controlled experiment, AI tools gave a lower grade to an essay that mentioned listening to rap music to enhance focus, compared with an otherwise identical essay that cited classical music for the same purpose.

Educators aren’t especially enthusiastic about the potential of AI to make testing better. In fact, more than a third of district and school leaders and teachers—36 percent—believe that because of AI, standardized testing will actually be worse five years from now.

Fewer than 1 in 5—19 percent—believe the technology might improve the assessments. The EdWeek Research Center surveyed 1,135 educators from Sept. 26 through Oct. 8 of this year.

How AI might help capture more sophisticated thinking skills

One of the most-cited problems with the current breed of state standardized tests: Teachers often don’t see the results of tests their students take in the spring until the following school year, when it is typically too late to adjust instruction in ways that could help those students.

Multiple-choice tests are relatively easy and inexpensive to score, and much of that work can be automated, even without AI. But those exams can only capture a limited portion of students’ knowledge.

For instance, Matt Johnson, a principal research director in the foundational psychometrics and statistics research center at ETS, would love to be able to give students credit on an assessment for successfully working out multiple steps of a problem even if they ultimately arrive at the wrong answer because of a simple calculation error. That is essentially the approach many teachers use now.

Analyzing students’ work in that way would take significant muscle and manpower for human scorers. But it might be a simpler proposition if AI tools—which can recognize and process human writing—were employed. The technology, however, hasn’t reached the point where it can assess students’ thinking process reliably enough to be used in high-stakes testing, Johnson said.

Even so, AI may help speed up scoring on richer tests, which ask students to write a constructed response or short essay in answer to a problem. Typically, grading those questions requires a team of teachers all working with the same scoring guidelines and reviewers to check the fairness of their assessments—though that process can already be partially automated.

That, however, is where questions about bias surface. Parents have also expressed concerns about relying on machines to score student essays, on the assumption that machines would be less effective at understanding students’ writing.

For the foreseeable future, human beings will still play an integral role in scoring high-stakes tests, said Lindsay Dworkin, the senior vice president of policy and government affairs at NWEA, an assessment organization.

“I don’t think we’re ready to take things that have historically been deeply human activities, like scoring of, you know, constructed-response items, and just hand it over to the robots,” she said. “I think there will be a phased-in period where we see how it goes but we make sure it’s passing through teachers’ hands.”

Even with that gradual approach, AI may be able to give teachers more actionable feedback about their practice, helping them improve their instruction, Dworkin said.

For instance, a language arts teacher with a class of 30 kids could ask an AI tool: “Tell me what all of my students collectively did well. Tell me what they didn’t do well. Tell me the skill gaps that are missing?” Dworkin said. “Is everybody failing to give me strong topic sentences? Is everybody failing to write a conclusion?”
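To make that scenario concrete, here is a minimal, hypothetical sketch of how such a request might be wired up in code. It assumes an OpenAI-style chat API; the model name, prompt wording, and function name are placeholders for illustration, not a description of any product mentioned in this article.

```python
# Hypothetical sketch of the class-level feedback Dworkin describes:
# summarizing strengths and skill gaps across a set of student essays.
# Assumes the OpenAI Python SDK; model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def summarize_class_writing(essays: list[str]) -> str:
    """Ask a language model what a class did well and where the skill gaps are."""
    combined = "\n\n---\n\n".join(essays)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You are helping a language arts teacher review a class set of essays."},
            {"role": "user",
             "content": (
                 "Across these essays, tell me what students collectively did well, "
                 "what they did not do well, and which skills are missing, "
                 "such as strong topic sentences or conclusions.\n\n" + combined
             )},
        ],
    )
    return response.choices[0].message.content


# Example usage, with essays drawn from the class's submitted work:
# print(summarize_class_writing(["Essay one...", "Essay two..."]))
```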

Big experiment on AI and testing about to begin

One high-profile experiment in using AI for standardized assessment is about to get underway. The 2025 edition of the Program for International Student Assessment, or PISA, is slated to include performance tasks probing how students approach learning and solve problems.

Students may be able to use an AI-powered chatbot to complete their work. They could ask it basic questions about a topic so that the test could focus on their thinking capability, not whether they possess background knowledge of a particular subject.

That prospect—announced at a meeting of the Council of Chief State School Officers earlier this year—got an excited reaction from some state education leaders.

Their enthusiasm may reflect concerns about whether the current batch of state standardized tests captures the kinds of skills students will need in postsecondary education and the workplace.

More than half of educators—57 percent—don’t believe that state standardized tests—which generally focus on math and language arts—measure what students need to know and be able to do, according to the EdWeek Research Center survey.

States are increasingly focused on creating “portraits of a graduate” that consider the kinds of skills students will need when they enter postsecondary education or the workforce. But right now, state standardized tests emphasize language arts and math skills, and that can carry big consequences, said Lillian Pace, the vice president of policy and advocacy for KnowledgeWorks, a nonprofit organization that works to personalize learning for students.

“We are missing the picture entirely on whether we’re preparing students for success” by ignoring kids’ ability to work across disciplines to solve more complex problems, Pace said. “What might it look like if AI opens the door for us to be able to design integrated assessments that are determining how well students are using knowledge to demonstrate mastery” of skills such as critical thinking and communication?

That prospect—though intriguing—will take significant work, even with AI’s help, said Joanna Gorin, now the vice president of the design and digital science unit at ACT, a public benefit assessment corporation.

In a previous role, Gorin helped teams design a virtual task that asked students to decide whether a particular historical artifact belonged in their town’s museum. The simulation required students to interview local experts and visit a library to conduct research.

The task was designed to give insight into students’ communication skills and ability to evaluate information. That’s the kind of test many educators would like to move toward, she said.

“States want to move [toward richer assessments] because there’s incredible promise from AI, and it can potentially get them the kind of information they really want,” Gorin said.

But that could come with complications, even with AI’s help, she added. “At what point are [states] willing to make the trade-offs that would come along with it, in terms of cost, in terms of technology requirements, in terms of other possible effects on how they teach?”

For instance, creating and reliably scoring performance tasks with AI would require significant data, meaning a lot of students would have to participate in experimental testing, Gorin said.

Given all that, “I do not foresee full-blown performance assessment, simulation-based AI-driven assessments in K-12, high-stakes, large-scale assessment” for quite some time, Gorin said.

AI could help generate better test questions, faster

Instead, Gorin expects that AI will help inform testing in other ways, such as helping to generate test questions.

Say an educator—or a testing company—has a passage they want to use on an exam, Gorin said. “Can I use AI to say ‘what would be the best types of items to build based on this [passage], or the reverse, what passages would work best based on the types of questions that I need to generate?’” she said.

AI could also write the initial draft of an item, and a human could “come in and take it from there,” Gorin said. That would allow test-makers to be “more efficient and more creative,” she said. Being able to create test items faster could be a key to personalizing tests to reflect students’ interests and learning needs.

If a goal of an assessment were to figure out whether students understood, say, fractions, it could offer a baking enthusiast a set of questions based on a chocolate chip cookie recipe and a sports-loving student another set based on the dimensions of a football field.
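One way to picture that kind of personalization is to hold the target skill constant and vary only the context. The brief sketch below builds two hypothetical item-generation prompts, one for each student interest; the template wording and function name are illustrative assumptions, not an actual testing product.

```python
# Hypothetical sketch of interest-based personalization: the skill stays fixed
# (here, fractions) while only the context changes to match a student's interests.
# The prompt template and examples are illustrative placeholders.

ITEM_PROMPT = (
    "Write {n} multiple-choice questions that assess {skill}. "
    "Set every question in the context of {context}. "
    "Each question must require the same fraction operations regardless of context, "
    "and include four answer choices with the correct one marked."
)


def build_item_prompt(skill: str, context: str, n: int = 3) -> str:
    """Return a prompt a test developer could send to a language model."""
    return ITEM_PROMPT.format(n=n, skill=skill, context=context)


# Same skill, two contexts: the cookie-recipe vs. football-field example above.
baking_prompt = build_item_prompt("adding and scaling fractions",
                                  "a chocolate chip cookie recipe")
sports_prompt = build_item_prompt("adding and scaling fractions",
                                  "the dimensions of a football field")
```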

It could be possible to train AI to craft questions on different topics that measure the same skill, experts say. But it would be difficult—and pricey—to “field test” them, a process that entails having real students try out the questions to ensure they are fair.

That means change will likely come first and most dramatically to teacher-created exams for classrooms, which may determine student grades, as opposed to state standardized tests, which evaluate how teachers and schools are performing.

In fact, teachers are already experimenting with the technology to create their own tests. One in six teachers has used AI to develop classroom exams, according to the EdWeek Research Center survey.

ChatGPT, which could spit out remarkably human-sounding writing in minutes, seemed to come out of nowhere when it was released in late 2022. Even so, AI is unlikely to transform standardized testing overnight.

“I think it’s going to come slowly,” said Johnson of ETS. “My opinion is that there will be a slow creep of new stuff. Scenario-based tasks. Maybe some personalization will come in. As we get more comfortable with the various [use] cases, you’ll start seeing more and more of them.”


Data analysis for this article was provided by the EdWeek Research Center. Learn more about the center’s work.

Coverage of education technology is supported in part by a grant from the Chan Zuckerberg Initiative, at www.chanzuckerberg.com. Education Week retains sole editorial control over the content of this coverage.
A version of this article appeared in the December 18, 2024 edition of Education Week as Will AI Transform Standardized Testing?
