From Heuristics to Analytics: Forecasting Effort and Progress in Online Learning

Boyuan Guo; Conrad Borchers; Danielle R. Thomas; Eric S. Qiu; Vincent Aleven

arxiv: 2605.12788 · v1 · pith:AC2J2EMPnew · submitted 2026-05-12 · 💻 cs.LG · cs.CY


Pith reviewed 2026-05-14 20:27 UTC · model grok-4.3

classification 💻 cs.LG cs.CY

keywords engagement forecasting · intelligent tutoring systems · effort prediction · progress forecasting · student modeling · educational data mining · weekly forecasting


The pith

Feature-based models forecast weekly student effort and progress in tutoring systems with 22-33 percent lower error than heuristic rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces engagement forecasting as a supervised task that predicts two weekly outcomes from tutoring logs: minutes practiced and new skills mastered. Feature-based models drawn from regressions, decision trees, and neural networks cut mean absolute error by 22-33 percent relative to fixed-percentile heuristics on logs from 425 middle-school students. Recent activity features dominate effort forecasts while learner-state and content-difficulty signals matter more for progress. A small tutor interview study shows that tutors reason about effort and progress goals in patterns that match the models' feature importances. The work supplies a reproducible benchmark that makes weekly effort and progress visible for goal setting and instructional decisions.
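To make the two weekly targets concrete, the following is a minimal sketch of how raw tutoring-log rows could be rolled up into minutes practiced and new skills mastered per student-week. The log schema, the skill names, and the `weekly_targets` helper are all hypothetical; the paper's actual log format and aggregation rules are not described on this page.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log rows (not the paper's schema):
# (student_id, day, minutes_active, skill reaching mastery that day or None)
log = [
    ("s1", date(2025, 9, 1), 12.0, None),
    ("s1", date(2025, 9, 3), 18.5, "fractions-add"),
    ("s1", date(2025, 9, 9), 7.0, "fractions-add"),   # repeat mastery event
    ("s1", date(2025, 9, 10), 20.0, "ratios-intro"),
    ("s2", date(2025, 9, 2), 30.0, "ratios-intro"),
]

def weekly_targets(rows):
    """Return {(student, (iso_year, iso_week)): (minutes, new_skills)}."""
    minutes = defaultdict(float)
    new_skills = defaultdict(set)
    mastered_so_far = defaultdict(set)
    # Process each student's rows in date order so "new" is well defined.
    for student, day, mins, skill in sorted(rows, key=lambda r: (r[0], r[1])):
        iso = day.isocalendar()
        key = (student, (iso[0], iso[1]))
        minutes[key] += mins
        # Count a skill only the first time the student masters it.
        if skill is not None and skill not in mastered_so_far[student]:
            mastered_so_far[student].add(skill)
            new_skills[key].add(skill)
    return {key: (minutes[key], len(new_skills[key])) for key in minutes}

targets = weekly_targets(log)
```

Each value in `targets` is one supervised example's pair of labels; a forecaster would predict it from features computed on earlier weeks only.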

Core claim

Using interaction logs from 425 middle-school students across a school year, feature-based predictors reduce mean absolute error by 22-33 percent compared with percentile-based heuristic baselines when forecasting weekly minutes practiced and new skills mastered. The models track individual practice trajectories more closely than fixed rules, with effort forecasts driven chiefly by recent activity features and progress forecasts depending more on learner-state and content difficulty signals. In a case study, eight college tutors reasoned about effort versus progress goals in ways that aligned with these target-specific feature patterns.

What carries the argument

Supervised machine learning models that use interaction-log features to predict two weekly targets, benchmarked against fixed-percentile heuristic rules adapted from prior behavioral work.
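As an illustration of what a fixed-percentile heuristic baseline might look like, the sketch below forecasts each week's minutes as a fixed percentile of the student's earlier weeks and scores the forecasts with mean absolute error. The 60th percentile, the use of the full history as the window, and the sample data are assumptions for illustration only; the paper's exact rule is not given on this page.

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def percentile_forecast(history, q=60.0):
    """Forecast the next week as the q-th percentile of past weeks (assumed rule)."""
    return percentile(history, q)

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical weekly minutes for one student (not data from the paper).
weekly_minutes = [30, 45, 20, 50, 10, 40]

# Walk forward: the forecast for week t uses weeks 0..t-1 only.
actual, predicted = [], []
for t in range(1, len(weekly_minutes)):
    predicted.append(percentile_forecast(weekly_minutes[:t]))
    actual.append(weekly_minutes[t])

heuristic_mae = mean_absolute_error(actual, predicted)
```

A feature-based model would replace `percentile_forecast` with a learned function of log features while keeping the same walk-forward scoring, so the two MAE values are directly comparable.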

Load-bearing premise

The 425-student log dataset and selected features capture patterns representative enough for the models to generalize to new students and new weeks without large distribution shifts.

What would settle it

A fresh cohort of student logs in which the feature-based models show no reduction in mean absolute error, or a reduction below 15 percent, relative to the same percentile heuristics.
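The settling criterion above reduces to a small piece of arithmetic: the relative MAE reduction is (baseline MAE minus model MAE) divided by baseline MAE, compared against the 15 percent floor. A minimal sketch follows; the function names and the example MAE values are hypothetical.

```python
def relative_mae_reduction(baseline_mae, model_mae):
    """Fraction by which the model lowers MAE relative to the baseline."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - model_mae) / baseline_mae

def replication_holds(baseline_mae, model_mae, min_reduction=0.15):
    """True if the reduction on a fresh cohort meets the 15 percent floor."""
    return relative_mae_reduction(baseline_mae, model_mae) >= min_reduction

# Hypothetical MAE values in minutes per week (not results from the paper).
baseline_mae = 20.0
model_mae = 14.0
reduction = relative_mae_reduction(baseline_mae, model_mae)
```

With these illustrative numbers the reduction is 30 percent, inside the reported 22-33 percent band; a model MAE of 18.0 against the same baseline would give 10 percent and count against the claim.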

Figures

Figures reproduced from arXiv: 2605.12788 by Boyuan Guo, Conrad Borchers, Danielle R. Thomas, Eric S. Qiu, Vincent Aleven.

**Figure 1.** (a) Intuitive goal-setting scenario (minutes practiced): an intuitive case that raises the goal because the student exceeded the prior week’s goal. (b) Counter-intuitive goal-setting scenario (minutes practiced): a counter-intuitive case that lowers the goal even though the student exceeded the prior week’s goal, citing the consistency feature as the reason.

**Figure 2.** Average minutes and skills per week across all students plotted against the predictions of different models, also…

**Figure 3.** Feature-importance rankings. Red: learner…

Original abstract

Sustained effort is essential for realizing the benefits of intelligent tutoring systems (ITS), yet many learners disengage or underuse available practice time. We introduce engagement forecasting as a supervised prediction task based on ITS logs, targeting two outcomes central to effort and learning progress: minutes practiced per week and new skills mastered per week. Using interaction log data from 425 middle-school students over a school year, we benchmark fifteen predictors including regressions, decision trees, and neural networks. We show that these feature-based models reduce mean absolute error (MAE) by 22-33% relative to heuristic baselines, including fixed-percentile rules adapted from prior work in other behavioral domains. We find that percentile heuristics systematically overpredict, whereas feature-based models better track student practice trajectories across weeks. To support explainability, we analyze feature importance and ablations, revealing target-specific patterns: effort forecasting is driven mainly by recent activity features, while progress forecasting depends more on learner-state and content difficulty signals. Finally, in a semi-structured user interview case study with eight college tutors, we examine how tutors reasoned about system-generated predictive features when setting goals with students. We find that tutors reasoned differently about effort versus progress goals in ways that mirror our pattern analysis. Together, these results establish a reproducible benchmark for forecasting weekly effort and learning progress in ITS. By making patterns of sustained effort and progress visible at a weekly timescale, engagement forecasting offers a foundation for supporting tutor-learner goal setting and timely instructional decisions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper sets up weekly effort and progress forecasting in ITS logs and shows feature models cut MAE 22-33% versus heuristics, but the train-test split is not described clearly enough to confirm the gains are out-of-sample.


The main thing to know is that this work defines a concrete supervised task for predicting minutes practiced and new skills mastered each week from tutoring system logs, then shows that regressions, trees, and neural nets reduce mean absolute error by 22-33% compared with adapted percentile rules on 425 middle-school students. Feature importance and ablation results line up with intuition: recent activity dominates effort forecasts while learner state and content difficulty matter more for progress. A small interview with eight tutors illustrates how the predictions might actually be used in goal-setting conversations.

Referee Report

2 major / 2 minor

Summary. The paper introduces engagement forecasting as a supervised prediction task on ITS interaction logs from 425 middle-school students, targeting weekly minutes practiced and new skills mastered. It benchmarks fifteen feature-based models (regressions, decision trees, neural networks) against heuristic baselines such as fixed-percentile rules, reporting 22-33% MAE reductions, provides feature-importance and ablation analyses showing recent activity driving effort forecasts while learner-state and content difficulty drive progress forecasts, and includes a semi-structured interview case study with eight tutors on using the predictions for goal setting.

Significance. If the central MAE reductions hold under proper out-of-sample validation, the work supplies a reproducible benchmark for weekly effort and progress forecasting in intelligent tutoring systems, moving beyond ad-hoc heuristics toward analytics-driven support for tutor-learner goal setting. The target-specific feature patterns and the tutor interview results add explanatory and practical value; the public benchmark framing is a clear strength.

major comments (2)

[Benchmarking / Experimental Setup] The train-test partitioning procedure is not stated to be student-stratified (or week-stratified). Because the data consist of repeated weekly observations per student, any split that allows the same student to appear in both training and test sets risks temporal autocorrelation leakage, which would inflate the reported 22-33% MAE gains relative to the percentile heuristics and undermine the out-of-sample forecasting claim.
[Results] No statistical significance tests, confidence intervals, or cross-validation variance estimates are provided for the MAE differences across the fifteen models and two targets. Without these, it is impossible to determine whether the observed improvements are reliable or could be explained by sampling variability in the 425-student corpus.
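The leakage concern in the first major comment can be made concrete with a split that assigns whole students, not individual student-weeks, to train or test. This is a minimal sketch under assumed names (`split_by_student`, the row fields, and the 30 percent test fraction are illustrative, not taken from the paper); scikit-learn's `GroupShuffleSplit` provides comparable behavior.

```python
import random

def split_by_student(rows, test_fraction=0.3, seed=0):
    """Split rows so that every student's weeks land on one side only."""
    student_ids = sorted({row["student_id"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(student_ids)
    n_test = max(1, round(len(student_ids) * test_fraction))
    held_out = set(student_ids[:n_test])
    train = [r for r in rows if r["student_id"] not in held_out]
    test = [r for r in rows if r["student_id"] in held_out]
    return train, test

# Hypothetical student-week rows: 10 students, 4 weeks each.
rows = [
    {"student_id": f"s{i}", "week": w, "minutes": 10 * w + i}
    for i in range(10)
    for w in range(1, 5)
]

train_rows, test_rows = split_by_student(rows)
train_ids = {r["student_id"] for r in train_rows}
test_ids = {r["student_id"] for r in test_rows}
```

A row-level random split would instead scatter one student's adjacent weeks across both sets, which is the situation the comment warns about.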

minor comments (2)

[Methods] The abstract states that fifteen predictors were benchmarked, yet the methods section would benefit from an explicit enumerated list of all models together with their hyperparameter ranges or selection procedure.
[Feature Engineering] Feature definitions (especially the 'recent activity' and 'learner-state' groups) are described at a high level; a table listing each feature, its computation, and any normalization would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's insightful comments on the experimental validation and statistical reporting. We have revised the manuscript to clarify the data partitioning procedure and to include statistical significance tests and confidence intervals for the reported MAE improvements. Our responses to the major comments are detailed below.


Referee: [Benchmarking / Experimental Setup] The train-test partitioning procedure is not stated to be student-stratified (or week-stratified). Because the data consist of repeated weekly observations per student, any split that allows the same student to appear in both training and test sets risks temporal autocorrelation leakage, which would inflate the reported 22-33% MAE gains relative to the percentile heuristics and undermine the out-of-sample forecasting claim.

Authors: We thank the referee for highlighting this critical aspect of the experimental design. Upon review, the train-test split in our study was indeed performed in a student-stratified manner, with all weekly observations for a given student assigned entirely to either the training or test set (70/30 split). This prevents any leakage from temporal autocorrelation within students. We have updated the manuscript's Experimental Setup section to explicitly state this partitioning strategy and its rationale for ensuring valid out-of-sample forecasting.

Revision: yes

Referee: [Results] No statistical significance tests, confidence intervals, or cross-validation variance estimates are provided for the MAE differences across the fifteen models and two targets. Without these, it is impossible to determine whether the observed improvements are reliable or could be explained by sampling variability in the 425-student corpus.

Authors: We agree that providing measures of statistical reliability strengthens the results. In the revised manuscript, we have added bootstrap-derived 95% confidence intervals for all MAE values and conducted paired statistical tests (Wilcoxon signed-rank tests due to non-normality of errors) comparing each model's per-student MAE against the heuristic baselines. The improvements remain significant (p < 0.001) across targets, with the confidence intervals confirming the 22-33% reductions are not attributable to sampling variability alone. These additions are incorporated into the Results section and a new supplementary table.

Revision: yes
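For readers unfamiliar with the kind of interval the response refers to, here is a minimal sketch of a paired percentile bootstrap over students for the relative MAE reduction. The resampling unit, the ratio-of-means statistic, the 2000 resamples, and the synthetic per-student MAEs are all assumptions for illustration; the rebuttal does not specify its procedure, and the Wilcoxon test it mentions is not shown here.

```python
import random

def bootstrap_reduction_ci(baseline_mae, model_mae, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for relative MAE reduction, resampling students."""
    if len(baseline_mae) != len(model_mae):
        raise ValueError("per-student MAE lists must be paired")
    n = len(baseline_mae)
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample students with replacement, keeping each pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        base = sum(baseline_mae[i] for i in idx) / n
        model = sum(model_mae[i] for i in idx) / n
        stats.append((base - model) / base)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic per-student MAEs (not results from the paper): each model MAE is
# the baseline MAE scaled by a factor between 0.60 and 0.85.
data_rng = random.Random(1)
baseline = [data_rng.uniform(15, 25) for _ in range(50)]
model = [b * data_rng.uniform(0.60, 0.85) for b in baseline]

ci_low, ci_high = bootstrap_reduction_ci(baseline, model)
```

Resampling whole students keeps each student's baseline and model errors paired, which matches the per-student comparison described in the response.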

Circularity Check

0 steps flagged

No significant circularity: supervised forecasting trained on historical logs to predict future weeks


The paper trains feature-based models (regressions, trees, neural nets) on 425-student ITS logs to forecast weekly minutes practiced and skills mastered. Heuristic baselines are percentile rules adapted from external prior work in other domains. No equation or fitting step reduces the target variable to a parameter of itself by construction; the reported 22-33% MAE reduction is an empirical out-of-sample comparison on future weeks. The derivation chain is self-contained against external benchmarks and does not rely on self-citation for its central claim.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unstated assumption that weekly aggregates of log data contain sufficient signal for supervised prediction.

pith-pipeline@v0.9.0 · 5578 in / 1040 out tokens · 36722 ms · 2026-05-14T20:27:28.547087+00:00 · methodology



[1] [1]

M. A. Adams, J. C. Hurley, M. Todd, N. Bhuiyan, C. L. Jarrett, W. J. Tucker, K. E. Hollingshead, and S. S. Angadi. Adaptive goal setting and financial incentives: a 2×2 factorial randomized controlled trial to increase adults’ physical activity.BMC Public Health, 17(1):1– 16, 2017

work page 2017

[2] [2]

Albreiki, N

B. Albreiki, N. Zaki, and H. Alashwal. A systematic literature review of student’ performance prediction us- ing machine learning techniques.Education Sciences, 11(9):1–27, 2021

work page 2021

[3] [3]

Aleven, C

V. Aleven, C. Borchers, Y. Huang, T. Nagashima, B. McLaren, P. Carvalho, O. Popescu, J. Sewall, and K. Koedinger. An integrated platform for studying learning with intelligent tutoring systems: Ctat+tutorshop, 2025. arXiv:2502.10395

work page arXiv 2025

[4] [4]

Arroyo, H

I. Arroyo, H. Meheranian, and B. P. Woolf. Effort-based tutoring: An empirical approach to intelligent tutoring. InProceedings of the 3rd International Conference on Educational Data Mining (EDM), pages 1–10, 2010

work page 2010

[5] [5]

R. S. Baker, A. T. Corbett, and K. R. Koedinger. Detecting student misuse of intelligent tutoring sys- tems. InProceedings of the 7th International Confer- ence on Intelligent Tutoring Systems (ITS), pages 531– 540, 2004

work page 2004

[6] [6]

R. S. J. d. Baker. Modeling and understanding students’ off-task behavior in intelligent tutoring systems. InPro- ceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pages 1059–1068, 2007

work page 2007

[7] [7]

R. S. J. d. Baker, S. M. Gowda, M. Wixon, J. Kalka, A. Z. Wagner, A. Salvi, V. Aleven, G. W. Kusbit, J. Ocumpaugh, and L. M. Rossi. Towards sensor-free affect detection in cognitive tutor algebra. InPro- ceedings of the 5th International Conference on Educa- tional Data Mining (EDM 2012), pages 126–133, Cha- nia, Greece, 2012. International Educational Da...

work page 2012

[8] [8]

C. R. Beal, I. M. Arroyo, P. R. Cohen, and B. P. Woolf. Evaluation of animalwatch: An intelligent tutoring sys- tem for arithmetic and fractions.Journal of Interactive Online Learning, 9(1):64–77, 2010

work page 2010

[9] [9]

Bembenutty

H. Bembenutty. Meaningful and maladaptive home- work practices: The role of self-efficacy and self- regulation.Journal of Advanced Academics, 22(3):448– 473, 2011

work page 2011

[10] [10]

M. L. Bernacki, T. J. Nokes-Malach, and V. Aleven. Examining self-efficacy during learning: Variability and relations to behavior, performance, and learning. Metacognition and Learning, 10:99–117, 2015

work page 2015

[11] [11]

Borchers, A

C. Borchers, A. Houk, V. Aleven, and K. R. Koedinger. Engagement and learning benefits of goal setting with rewards in human-ai tutoring. In A. I. Cristea, E. Walker, Y. Lu, O. C. Santos, and S. Isotani, editors, Artificial Intelligence in Education. AIED 2025. Lec- ture Notes in Computer Science, volume 15880, pages 46–59. Springer, Cham, 2025

work page 2025

[12] [12]

Borchers, J

C. Borchers, J. Ooge, C. Peng, and V. Aleven. How learner control and explainable learning analytics about skill mastery shape student desires to finish and avoid loss in tutored practice. InProceedings of the 15th In- ternational Learning Analytics and Knowledge Confer- ence, LAK 2025, page 810–816. ACM, Mar. 2025

work page 2025

[13] [13]

Borchers, C

C. Borchers, C. Peng, Q. Lyu, P. F. Carvalho, K. R. Koedinger, and V. Aleven. Student perceptions of adap- tive goal setting recommendations: A design prototyp- ing study. In A. I. Cristea, E. Walker, Y. Lu, O. C. Santos, and S. Isotani, editors,Artificial Intelligence in Education, pages 244–251, Cham, 2025. Springer Na- ture Switzerland

work page 2025

[14] [14]

Bull and J

S. Bull and J. Kay. Student models that invite the learner in: The smili open learner modelling framework technical report 580.I. J. Artificial Intelligence in Ed- ucation, 17, 01 2007

work page 2007

[15] [15]

H. Cen, K. Koedinger, and B. Junker. Learning factors analysis - a general method for cognitive model evalua- tion and improvement. InInternational Conference on Intelligent Tutoring Systems, pages 164–175, 2006

work page 2006

[16] [16]

H. Cen, K. Koedinger, and B. Junker. Comparing two irt models for conjunctive skills. InInternational Con- ference on Intelligent Tutoring Systems, pages 796–798, 2008

work page 2008

[17] [17]

Chen and C

T. Chen and C. Guestrin. Xgboost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Dis- covery and Data Mining, pages 785–794, 2016

work page 2016

[18] [18]

A. T. Corbett and J. R. Anderson. Knowledge tracing: Modeling the acquisition of procedural knowledge.User Modeling and User-Adapted Interaction, 4(4):253–278, Dec. 1994

[19]

A. T. Corbett, K. R. Koedinger, and J. R. Anderson. Intelligent tutoring systems. In M. G. Helander, T. K. Landauer, and P. V. Prabhu, editors, Handbook of Human-Computer Interaction, chapter 37, pages 849–

[20]

Elsevier Science B.V., Amsterdam, The Netherlands, 2 edition, 1997

[21]

S. C. Dang. Exploring Behavioral Measurement Models of Learner Motivation. PhD thesis, Carnegie Mellon University, School of Computer Science, Pittsburgh, PA, USA, Feb. 26 2022. CMU–HCII–21–109

[22]

G. W. Dekker, M. Pechenizkiy, and J. M. Vleeshouwers. Predicting students drop out: A case study. In Proceedings of the 2nd International Conference on Educational Data Mining, pages 41–50, Cordoba, Spain,

[23]

International Working Group on Educational Data Mining

[24]

T. Eames, E. Brunskill, B. Yamkovenko, K. Weatherholtz, and P. Oreopoulos. Computer-assisted learning in the real world: How khan academy influences student math learning. Proceedings of the National Academy of Sciences, 123(1):e2507708123–e2507708123, 2026

[25]

J. Gardner, Y. Yang, R. S. Baker, and C. Brooks. Modeling and experimental design for mooc dropout prediction: A replication perspective. In Proceedings of the 12th International Conference on Educational Data Mining (EDM 2019), pages 49–58, 2019

[26]

P. Grimaldi, K. Weatherholtz, and K. M. Hill. Estimating the causal effects of Khan Academy MAP Accelerator across demographic subgroups. In Proceedings of the 15th International Conference on Educational Data Mining, pages 839–846, Durham, United Kingdom, 2022. International Educational Data Mining Society

[27]

A. Gurung, J. Lin, Z. Huang, C. Borchers, R. Baker, V. Aleven, and K. Koedinger. Starting seatwork earlier as a valid measure of student engagement. In C. Mills, G. Alexandron, D. Taibi, G. L. Bosco, and L. Paquette, editors, Proceedings of the 18th International Conference on Educational Data Mining, pages 303–316, Palermo, Italy, July 2025. Internati...

[28]

J. Hattie. Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement. Routledge, London, UK, 2009

[29]

T. K. Ho. Random decision forests. Proceedings of the 3rd International Conference on Document Analysis and Recognition, 1:278–282, 1995

[30]

A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, 1970

[31]

L. Holt. The 5 percent problem: Online mathematics programs may benefit most the kids who need it least. Education Next, 24(4):26–31, Apr. 2024

[32]

D. Hooshyar, M. Pedaste, K. Saks, Äli Leijen, E. Bardone, and M. Wang. Open learner models in supporting self-regulated learning in higher education: A systematic literature review. Computers & Education, 154:103878–103878, 2020

[33]

K. R. Koedinger, J. Kim, J. Z. Jia, E. A. McLaughlin, and N. L. Bier. Learning is not a spectator sport: Doing is better than watching for learning from a mooc. In Proceedings of the Second (2015) ACM Conference on Learning @ Scale, L@S ’15, pages 111–120, New York, NY, USA, 2015. Association for Computing Machinery

[34]

V. Kovanovic, D. Gašević, S. Dawson, S. Joksimovic, and R. Baker. Does time-on-task estimation matter? Implications on validity of learning analytics findings. Journal of Learning Analytics, 2(3):81–110, Feb. 2016

[35]

J. A. Kulik and J. D. Fletcher. Effectiveness of intelligent tutoring systems: A meta-analytic review. Review of Educational Research, 86(1):42–78, 2016

[36]

E. A. Locke and G. P. Latham. Building a practically useful theory of goal setting and task motivation: A 35-year Odyssey. American Psychologist, 57(9):705–717, Sept. 2002. PMID: 12237980

[37]

E. A. Locke and G. P. Latham. The development of goal setting theory: A half century retrospective. Motivation Science, 5(2):93–105, 2019

[38]

W. Ma, O. O. Adesope, J. C. Nesbit, and Q. Liu. Intelligent tutoring systems and learning outcomes: A meta-analysis. Journal of Educational Psychology, 106(4):901–918, 2014

[39]

T. Mu, A. Jetten, and E. Brunskill. Towards suggesting actionable interventions for wheel-spinning students. In Proceedings of the 13th International Conference on Educational Data Mining, pages 183–193, Online, 2020. International Educational Data Mining Society

[40]

C. Peng, C. Borchers, and V. Aleven. Designing homework support tools for middle school mathematics using intelligent tutoring systems. In Proceedings of the 18th International Conference of the Learning Sciences (ICLS 2024), pages 1730–1733, Buffalo, NY, USA,

[41]

International Society of the Learning Sciences

[42]

S. Ritter, A. Joshi, S. Fancsali, and T. Nixon. Predicting standardized test scores from cognitive tutor interactions. In S. K. D’Mello, R. A. Calvo, and A. Olney, editors, Proceedings of the 6th International Conference on Educational Data Mining, Memphis, Tennessee, USA, July 6-9, 2013, pages 169–176. International Educational Data Mining Soc...

[43]

R. M. Ryan and E. L. Deci. Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1):68–78, 2000

[44]

P. Schaldenbrand, N. G. Lobczowski, J. E. Richey, S. Gupta, E. A. McLaughlin, A. Adeniran, and K. R. Koedinger. Computer-supported human mentoring for personalized and equitable math learning. In Artificial Intelligence in Education: 22nd International Conference, AIED 2021, Utrecht, The Netherlands, June 14–18, 2021, Proceedings, Part II, pages 308–313, ...

[45]

D. H. Schunk. Goal setting and self-efficacy during self-regulated learning. Educational Psychologist, 25(1):71–86, 1990

[46]

G. A. Seber and A. J. Lee. Linear Regression Analysis. Wiley, 2003

[47]

J. Stamper, K. Koedinger, R. S. J. d. Baker, A. Skogsholm, B. Leber, J. Rankin, and S. Demi. Pslc datashop: A data analysis service for the learning science community. In V. Aleven, J. Kay, and J. Mostow, editors, Intelligent Tutoring Systems, pages 455–455, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg

[48]

R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288, 1996

[49]

K. VanLehn. The behavior of tutoring systems. International Journal of Artificial Intelligence in Education, 16(3):227–265, 2006

[50]

K. VanLehn. The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist, 46(4):197–221, 2011

[51]

L. S. Vygotsky. Interaction between learning and development. In M. Cole, V. John-Steiner, S. Scribner, and E. Souberman, editors, Mind in Society: Development of Higher Psychological Processes, pages 79–91. Harvard University Press, Cambridge, MA, 1978

[52]

H. Wan and J. B. Beck. Considering the influence of prerequisite performance on wheel spinning. In Proceedings of the 8th International Conference on Educational Data Mining, pages 129–135, Madrid, Spain, 2015. International Educational Data Mining Society

[53]

H. Wan, J. Ding, X. Gao, and D. E. Pritchard. Dropout prediction in MOOCs using learners’ study habits features. In Proceedings of the 10th International Conference on Educational Data Mining, pages 408–409, 2017

[54]

K. Wäschle, A. Allgaier, A. Lachner, S. Fink, and M. Nückles. Procrastination and self-efficacy: Tracing vicious and virtuous circles in self-regulated learning. Learning and Instruction, 29:103–114, 02 2014

[55]

M. Xia, R. Schmucker, C. Borchers, and V. Aleven. Optimizing mastery learning by fast-forwarding over-practice steps. In Two Decades of TEL. From Lessons Learnt to Challenges Ahead: 20th European Conference on Technology Enhanced Learning, EC-TEL 2025, Newcastle upon Tyne and Durham, UK, September 15–19, 2025, Proceedings, Part I, pages 549–563, Berlin, ...

[56]

A. F. Zambrano, R. S. Baker, S. Baral, N. T. Heffernan, and A. Lan. From reaction to anticipation: Predicting future affect. In Proceedings of the 17th International Conference on Educational Data Mining, pages 566–574, Atlanta, Georgia, USA, July 2024. International Educational Data Mining Society

[57]

C. Zhang, Y. Huang, J. Wang, D. Lu, W. Fang, J. C. Stamper, S. E. Fancsali, K. Holstein, and V. Aleven. Early detection of wheel spinning: Comparison across tutors, models, features, and operationalizations. In Proceedings of the 12th International Conference on Educational Data Mining, pages 468–473, Montréal, QC, Canada, 2019. International Educatio...
