pith. machine review for the scientific record.

arxiv: 2510.06965 · v2 · submitted 2025-10-08 · 💻 cs.CL · cs.AI

Recognition: unknown

EDUMATH: Generating Standards-aligned Educational Math Word Problems

keywords: MWPs · math · models · open · customized · educational · students · closed

Math word problems (MWPs) are critical K-12 educational tools, and customizing them to students' interests and ability levels can enhance learning. However, teachers struggle to find time to customize MWPs for students given large class sizes and increasing burnout. We propose that LLMs can support math education by generating MWPs customized to student interests and math education standards. We use a joint human expert-LLM judge approach to evaluate over 11,000 MWPs generated by open and closed LLMs and develop the first teacher-annotated dataset for standards-aligned educational MWP generation. We show the value of our data by using it to train a 12B open model that matches the performance of larger and more capable open models. We also use our teacher-annotated data to train a text classifier that enables a 30B open LLM to outperform existing closed baselines without any training. Next, we show our models' MWPs are more similar to human-written MWPs than those from existing models. We conclude by conducting the first study of customized LLM-generated MWPs with grade school students, finding they perform similarly on our models' MWPs relative to human-written MWPs but consistently prefer our customized MWPs.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Multi-Agent Approach to Validate and Refine LLM-Generated Personalized Math Problems

    cs.CY · 2026-04 · unverdicted · novelty 5.0

    A multi-agent generate-validate-revise framework reduces failures in realism and authenticity for LLM-personalized math problems, with one iteration helping and different strategies varying by criterion.

  2. Mathematics Teachers Interactions with a Multi-Agent System for Personalized Problem Generation

    cs.AI · 2026-04 · unverdicted · novelty 4.0

    Eight teachers used a four-agent LLM system to create 212 personalized middle-school math problems; final versions had few realism or hallucination problems noted by users even though agents flagged realism issues dur...