Million Tutoring Moves (MTM): An Open Multimodal Dataset for the Science of Tutoring
Pith reviewed 2026-05-13 18:04 UTC · model grok-4.3
The pith
MTM v1 releases 4,654 math tutoring transcripts to make authentic interactions observable at scale.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MTM v1 consists of 4,654 math tutoring transcripts collected from a U.S.-based nonprofit online platform and is released as the initial component of the National Tutoring Observatory to advance the science of tutoring through large-scale, reusable interaction data.
What carries the argument
The MTM v1 dataset of tutoring transcripts, which renders previously private instructional exchanges systematically observable and analyzable.
If this is right
- Researchers gain a reusable resource for studying instructional processes at scale.
- Tutoring organizations can examine concrete session data to refine their methods.
- Developers of AI educational tools obtain real interaction examples to ground model training.
- Findings from the data can be translated into practical recommendations for educators.
Where Pith is reading between the lines
- Expansion to video or audio would allow study of timing and nonverbal elements in tutoring exchanges.
- Linkage with other open education datasets could enable cross-context comparisons of tutoring styles.
- Long-term growth of the repository might support longitudinal tracking of tutoring effectiveness over multiple sessions.
Load-bearing premise
The collected transcripts accurately represent authentic tutoring interactions and their open release will yield actionable research insights without significant privacy or bias problems.
What would settle it
A controlled comparison showing that patterns derived from the released transcripts fail to predict tutoring outcomes observed in independent, non-released sessions would falsify the utility claim.
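One way such a comparison could be set up is sketched below. It is an illustration only, not a procedure from the paper. It assumes each session has been reduced to a single numeric feature (for example, a hypothetical rate of some tutor move) and a numeric outcome; the paper specifies neither. A one-feature least-squares line is fitted on the released sessions, and its error on independent sessions is compared with a baseline that always predicts the released-set mean outcome. The function names and the toy numbers are made up for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance; cannot fit a slope")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mse(preds, ys):
    """Mean squared error between predictions and observed outcomes."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


def transfer_check(released, independent):
    """Compare a released-data model against a mean baseline.

    Both arguments are lists of (feature, outcome) pairs. The model is
    fitted only on `released`; both errors are measured on `independent`.
    Returns (model_mse, baseline_mse).
    """
    rx, ry = zip(*released)
    ix, iy = zip(*independent)
    slope, intercept = fit_line(rx, ry)
    model_preds = [slope * x + intercept for x in ix]
    baseline_preds = [mean(ry)] * len(iy)
    return mse(model_preds, iy), mse(baseline_preds, iy)


# Synthetic toy numbers, NOT drawn from MTM v1 or any real sessions.
released = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
independent = [(4.0, 8.0), (5.0, 10.0)]
model_mse, baseline_mse = transfer_check(released, independent)
print(f"model MSE: {model_mse:.2f}, baseline MSE: {baseline_mse:.2f}")
```

Under these assumptions, the utility claim would be in trouble if the model's error on independent sessions were no better than the baseline's. A real study would also need a defensible outcome measure, many more sessions, and uncertainty estimates.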
Original abstract
We introduce the Million Tutoring Moves (MTM) project, an open dataset initiative aimed at advancing the science of tutoring through large-scale, reusable, and multimodal interaction data. MTM is developed within the National Tutoring Observatory (NTO), a research infrastructure designed to study authentic tutoring interactions and translate them into actionable insights for research, practice, and AI-powered educational technology development. In this paper, we present the vision behind MTM and describe MTM v1, an initial release consisting of 4,654 math tutoring transcripts from a U.S.-based nonprofit online tutoring platform. MTM v1 serves as a first step toward a broader repository that is safe, open, large-scale, broad-coverage, and multimodal. By making tutoring interactions systematically observable and analyzable, MTM aims to support research on instructional processes, improve tutoring practice, and enable the development of AI systems grounded in real educational interactions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Million Tutoring Moves (MTM) project under the National Tutoring Observatory and describes its initial release, MTM v1, consisting of 4,654 math tutoring transcripts collected from a U.S.-based nonprofit online tutoring platform. It frames MTM v1 as a first step toward a broader repository that is safe, open, large-scale, broad-coverage, and multimodal, with the goal of enabling research on instructional processes, improving tutoring practice, and supporting AI development in education.
Significance. If the data collection and documentation issues are addressed, MTM v1 could provide a valuable open resource for the science of tutoring, where large-scale authentic interaction data remains scarce. The open release supports reproducibility and community-driven analysis, which aligns with the paper's stated aims for actionable insights in research and educational technology.
major comments (2)
- [MTM v1 description] The description of MTM v1 states that the transcripts come from a U.S. nonprofit online platform and positions the release as 'safe' and representative of 'authentic tutoring interactions,' but supplies no information on IRB approval, participant consent, de-identification pipeline, quality checks, or the sampling method that yielded the 4,654 sessions from the full platform population. These details are load-bearing for the central claim that MTM v1 is a reliable first step toward a safe repository.
- [Abstract and vision] The abstract and vision section claim the dataset supports broad-coverage and multimodal research, yet MTM v1 is described only as text transcripts; no details are given on whether audio, video, or other modalities are included in v1 or how they will be added in future releases.
minor comments (1)
- [Abstract] The abstract refers to 'multimodal interaction data' while MTM v1 is limited to transcripts; clarify this distinction to avoid reader confusion.