It’s been a year since ChatGPT stormed into classrooms. Its most common users have been students looking for homework help (or shortcuts) and teachers using it to create tailored, on-the-spot lesson plans.
A group of researchers, though, is asking a new question: Can generative AI help teachers teach better?
New research released in November shows that when teachers engage with frequent, personalized, and on-demand feedback about their teaching practice, they ask richer, more analytical questions in their mathematics or science classes. The study, conducted by researchers from the University of Maryland, Harvard University, and Stanford University, also found that teachers engaged with the feedback when it was emailed directly to them, suggesting that feedback needs to be delivered in a succinct and accessible manner.
The feedback in question comes from an AI-powered tool called TeachFX, which runs recorded classroom audio through a large language model trained to identify when a teacher uses a particular instructional strategy, such as focusing questions. Such questions allow students to reason out an answer instead of just recalling it. The tool generates a personalized report detailing how often the teacher asked questions, how much students talked in class, and how often the teacher pressed students for explanations.
“Changing teacher practice in math and science is an uphill battle. I’m excited about the results of this study because the intervention is low-cost, it’s private to teachers, and it’s voluntary. And the study shows that there are improvements in a particular instructional practice,” said Heather Hill, one of the researchers and a professor in teacher learning and practice at Harvard University.
One of the big challenges of providing teacher feedback at scale has been instructional coaches’ caseloads. While AI can’t replace that human contact, it can tag and process classroom transcripts much faster than humans can. Coaches could then use data from AI tools to tailor their feedback to teachers, even when they can’t observe classes as frequently themselves.
The research is among the first to test AI feedback in in-person K-12 classrooms, rather than college or online classrooms.
Questions got better. But what about the teaching?
The study, released in November as a working paper by EdWorkingPapers, a project of the Annenberg Institute at Brown University, has not yet been peer reviewed.
Researchers randomly assigned 532 teachers in Utah to two groups. All teachers had access to the basic version of TeachFX, which let them record and generate audio snapshots of their classrooms. But only teachers in the treatment group were sent a weekly email about their use of “focusing questions,” along with tips on how to structure them.
The study found that over the course of five weeks, the treatment group asked 20 percent more focusing questions than the control group. Treatment group teachers opened the weekly emails 55 to 61 percent of the time, and they viewed their TeachFX class reports at a greater rate than control group teachers did.
Hill said the researchers picked focusing questions as the intervention strategy because the practice helps teachers structure instruction for students.
“A focusing question might begin with this problem and say to students, ‘why don’t you look at this and tell me how you’re thinking about the problem? Does anyone have any ideas about how you might solve the problem?’ Focusing questions could open up space for students to start thinking and communicating about their mathematical ideas,” Hill added.
Hill and her co-authors were puzzled, though, that the increase in focusing questions didn’t lead to any noticeable difference in how much students spoke in class, in student reasoning, or in teachers’ uptake of student ideas. (Uptake here refers to a teacher revoicing a student’s contribution, elaborating on it, or asking a follow-up question.)
“We had hypothesized that would happen as a result of more feedback that increased the use of focusing questions,” said Hill.
The feedback is private. That could complicate change
TeachFX keeps teacher feedback private; no principal or coach can access the data unless teachers share it with them. But teachers surveyed at the end of the study had concerns about how their data might be used, hinting at potential challenges to widespread adoption of these teacher-feedback tools.
In 13 interviews conducted with a subset of the participating teachers, teachers worried about scrutiny if the data were accessed by their superiors; noted that the transcripts were “imprecise” about how much student voice was recorded; and said they didn’t have time to sift through all the feedback and data being sent to them.
The last observation isn’t surprising to Julie York, a teacher at South Portland High School in Maine.
York said she’s occasionally used TeachFX in her classes over the last two years. “I love that the tool gives me a breakdown on how much I talked vs. how much my students talked. It also gives you word clouds on the most used words in class. I can see if students are using the words I teach. I can also see if ‘what’ or ‘help’ are the most common words used by them,” York said.
York found TeachFX through her own research. Teachers might be less inclined to use these AI platforms for feedback if their district or school required it.
Hill understands the concern but believes there are ways to aggregate data at the school or district level that would mask personal information. Teams of teachers and coaches could then use the instructional data, alongside test scores and other information, to plan instruction.