The Effortless Trap: Productive Struggle, AI, and the Illusion of Learning

Mario Brcic; Stjepan Frljic

arxiv: 2606.26181 · v1 · pith:HIYPZ4TVnew · submitted 2026-06-24 · 💻 cs.CY

Pith reviewed 2026-06-26 00:57 UTC · model grok-4.3

classification 💻 cs.CY

keywords: AI in education, productive struggle, learning design, effortless trap, six-move model, placement rule, illusion of learning, high-school experiments

The pith

AI harms learning when it replaces struggle but doubles gains when placed inside a six-step sequence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims the allow-or-ban debate misses the point: AI's effect on learning depends on where it is placed in the process of acquiring a new idea. The cited causal evidence shows that an unguarded AI helper left high-school students about 17 percent worse on later unaided tests than no tool at all, that the same model rebuilt to withhold answers removed the drop, and that a well-engineered tutor roughly doubled gains. The proposed frame is a fixed sequence of six ordered moves (Prime, Probe, Point, Attach, Strengthen, and Test), with the practical rule that any AI intervention that makes the task feel effortless is in the wrong step. Educators can map existing teaching moves and AI features onto the middle steps while keeping the first hard attempt and the final unaided check in the student's hands alone.

Core claim

A new idea is learned through six moves in order: Prime, Probe, Point, Attach, Strengthen, and Test. Secure the first hard attempt and the final unaided check, scaffold with guarded AI in between, and one diagnostic carries the frame: if letting AI in makes the task feel effortless, it is in the wrong place. The same model rebuilt to withhold answers erased the harm shown by the unguarded version, and a well-engineered tutor roughly doubled learning.

What carries the argument

The six-move sequence (Prime, Probe, Point, Attach, Strengthen, Test) that orders learning steps and restricts AI scaffolding to the middle four while protecting the initial attempt and final test.
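
The placement rule lends itself to a small checker. The following is a minimal Python sketch, not something the paper provides: it follows this summary's reading that the first and last moves stay unaided, and the function name check_placement, its parameters, and the returned strings are all hypothetical. The feels_effortless flag stands in for a teacher's or student's own judgment, which the paper does not quantify.

```python
# Illustrative sketch only; all names and return strings are assumptions,
# not taken from the paper.
MOVES = ("Prime", "Probe", "Point", "Attach", "Strengthen", "Test")

# Per the summary above, the first and last moves are kept unaided.
PROTECTED = {MOVES[0], MOVES[-1]}


def check_placement(move: str, uses_ai: bool, feels_effortless: bool = False) -> str:
    """Return "ok" or a short reason why an AI intervention looks misplaced."""
    if move not in MOVES:
        raise ValueError(f"unknown move: {move!r}")
    if not uses_ai:
        return "ok"
    if move in PROTECTED:
        return "misplaced: this move should stay unaided"
    if feels_effortless:
        return "misplaced: AI makes the task feel effortless"
    return "ok"


if __name__ == "__main__":
    # A hypothetical lesson plan: (move, uses_ai, feels_effortless).
    lesson_plan = [
        ("Prime", False, False),
        ("Point", True, False),
        ("Strengthen", True, True),
        ("Test", True, False),
    ]
    for move, uses_ai, effortless in lesson_plan:
        print(f"{move}: {check_placement(move, uses_ai, effortless)}")
```

The check is deliberately coarse: it flags a plan for a second look and says nothing about how well a permitted intervention actually works.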

If this is right

Lesson redesign can map classical teaching moves and AI features onto the middle steps of the sequence while leaving the first and last steps unaided.
The same underlying model can be adjusted from unguarded helper to answer-withholding version to eliminate the measured performance drop.
A well-engineered tutor version can produce roughly double the learning gains of no tool at all.
The effortless test provides a simple classroom diagnostic that flags misplaced AI without needing new data collection.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same placement logic could apply to non-AI scaffolds such as worked examples or peer hints if they remove the initial hard attempt.
Course-level redesign might require mapping entire units onto repeated cycles of the six moves rather than single lessons.
Policy on AI access could shift from blanket rules to requirements that tools include the withholding option by default.

Load-bearing premise

The six moves describe the necessary order of steps for learning any new idea and the effortless diagnostic works across different subjects and age groups.

What would settle it

A controlled experiment in which students using AI that makes tasks feel effortless still score as well or better on unaided tests than a matched no-AI group.

Figures

Figures reproduced from arXiv: 2606.26181 by Mario Brcic, Stjepan Frljic.

**Figure 2.** Survivorship bias as a student’s knowledge map across the six moves, growing inside a ...

Original abstract

With AI advancing fast, educators face a dilemma: allow the tool or ban it. Conflicting evidence that it both helps and hurts learning only deepens the confusion. The allow-or-ban framing is a false dichotomy; the relevant design question is placement. Used well, AI can scale feedback, examples, practice, and individualized support. Used poorly, it replaces the cognitive work that learning requires and leaves an illusion of learning: a confident sense of mastery that collapses on the unaided task. The strongest causal evidence shows the outcome flips on design: an unguarded AI helper left high-school students about 17% worse on an unaided exam than peers with no tool at all, while the same model rebuilt to withhold answers erased the harm, and a well-engineered tutor roughly doubled learning. We give educators one graspable frame for placing the tool. A new idea is learned through six moves, in order: Prime, Probe, Point, Attach, Strengthen, and Test. Secure the first hard attempt and the final unaided check, scaffold with guarded AI in between, and one diagnostic carries the frame: if letting AI in makes the task feel effortless, it is in the wrong place. To make it usable, we map classical teaching moves and AI-supported interventions to each step. Together, the six-move model, the placement rule, and the intervention menu provide a practical foundation for lesson and course redesign in the age of AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Six-move model and effortless diagnostic give a usable placement rule for AI in lessons, but the sequence itself lacks derivation or tests showing it is required.

The paper's main offering is a six-step sequence—Prime, Probe, Point, Attach, Strengthen, Test—plus the rule that AI belongs only where it does not make the task feel effortless. That rule and the mapping of AI features onto each step are the new synthesis.

It does a clear job of showing why the allow-or-ban debate misses the point. The cited studies illustrate the design dependence: an unguarded AI helper left high-school students about 17% worse on an unaided exam, a version that withheld answers removed the harm, and a well-built tutor roughly doubled learning. The paper then turns those outcomes into a short menu of interventions teachers can actually use.

The soft spot is that the sequence is presented as the order in which a new idea is learned without derivation from prior theory or any check on whether reordering or dropping steps still produces the claimed unaided performance. The external studies are used to show that design matters, not to confirm that these exact six moves are the mechanism. If the sequence is one workable path rather than the necessary one, the placement rule has narrower reach than stated.

This is aimed at educators and instructional designers who need a concrete heuristic for lesson redesign. A practitioner who wants something they can apply next week will find it useful.

It deserves peer review. The framework is straightforward, the problem is current, and the ideas are testable even if the justification for the exact sequence needs more work.

Referee Report

3 major / 2 minor

Summary. The paper claims that the allow-or-ban framing for AI in education is a false dichotomy and that the key issue is placement of the tool within the learning process. It presents a six-move sequence for learning a new idea (Prime, Probe, Point, Attach, Strengthen, Test), derives a placement rule that secures the initial hard attempt and final unaided check while allowing guarded AI only in between, and supplies a diagnostic that effortless AI use signals incorrect placement. The framework is illustrated by cited causal evidence showing design-dependent outcomes (unguarded AI causing ~17% worse unaided exam performance, modified withholding erasing the harm, and engineered tutors doubling learning) and maps classical teaching moves plus AI interventions onto each step.

Significance. If the six-move ordering is shown to be necessary rather than merely illustrative and the placement rule generalizes, the manuscript supplies educators with a compact, actionable frame for lesson redesign that distinguishes productive from illusory learning. The explicit mapping of interventions to steps and the single diagnostic criterion add practical value. The paper does not itself supply new empirical tests or a first-principles derivation, so its significance remains conditional on external validation of the core sequence.

major comments (3)

[Abstract / six-move model] Abstract and the section presenting the six-move model: the statement that "a new idea is learned through six moves, in order" is introduced as the basis for the placement rule without derivation from prior theory, minimality argument, or experiment showing that reordering or omitting steps fails to produce equivalent unaided performance. This assumption is load-bearing for the claimed generality of the rule and diagnostic.
[Abstract / evidence citations] Abstract and evidence discussion: the 17% harm, harm-erasure, and doubling results are invoked to demonstrate that outcome depends on design, yet the manuscript provides no description of the underlying studies' methods, sample sizes, controls, or effect-size calculations, preventing readers from evaluating how strongly they support the six-move ordering as the operative mechanism.
[Placement rule / diagnostic] Placement rule and diagnostic paragraph: the rule to "secure the first hard attempt and the final unaided check" and the test "if letting AI in makes the task feel effortless, it is in the wrong place" are derived directly from the asserted sequence; if the sequence is only one sufficient path rather than required, the rule loses the generality asserted in the abstract.

minor comments (2)

[Terminology] The six capitalized move names are used as technical labels but never defined operationally or contrasted with standard terminology (e.g., priming, retrieval practice); a short glossary or explicit mapping table would improve clarity.
[References] Ensure all cited causal studies receive complete bibliographic entries with DOIs or stable links so readers can retrieve the methods omitted from the abstract.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. The feedback correctly identifies that the six-move framework is presented without a formal derivation or new experiments, which we will address by clarifying its status as a synthesized practical model. We will revise the manuscript to temper claims of generality and provide more context on the cited evidence.

Referee: [Abstract / six-move model] Abstract and the section presenting the six-move model: the statement that "a new idea is learned through six moves, in order" is introduced as the basis for the placement rule without derivation from prior theory, minimality argument, or experiment showing that reordering or omitting steps fails to produce equivalent unaided performance. This assumption is load-bearing for the claimed generality of the rule and diagnostic.

Authors: We agree that the phrasing in the abstract and model section could be interpreted as asserting a necessary sequence. The six-move model is intended as a compact synthesis of cognitive principles (e.g., productive struggle and retrieval practice) drawn from the literature, not as a minimal or uniquely required path. In the revision, we will change the language to 'one effective sequence for learning a new idea is Prime, Probe, Point, Attach, Strengthen, and Test' and explicitly state that the placement rule is a heuristic derived from this model. We will also note that while the cited evidence shows design matters, it does not prove this exact ordering is required. This revision will be made.
revision: yes

Referee: [Abstract / evidence citations] Abstract and evidence discussion: the 17% harm, harm-erasure, and doubling results are invoked to demonstrate that outcome depends on design, yet the manuscript provides no description of the underlying studies' methods, sample sizes, controls, or effect-size calculations, preventing readers from evaluating how strongly they support the six-move ordering as the operative mechanism.

Authors: The manuscript cites these results to illustrate that outcomes are design-dependent rather than to claim they directly validate the six-move sequence as the mechanism. To address the concern, we will add a short paragraph or footnote in the revised version summarizing the key methodological details of the referenced studies (e.g., sample sizes and basic design), while continuing to direct readers to the original papers for full details. This will help readers assess the evidence without expanding the paper beyond its scope as a framework proposal.
revision: yes

Referee: [Placement rule / diagnostic] Placement rule and diagnostic paragraph: the rule to "secure the first hard attempt and the final unaided check" and the test "if letting AI in makes the task feel effortless, it is in the wrong place" are derived directly from the asserted sequence; if the sequence is only one sufficient path rather than required, the rule loses the generality asserted in the abstract.

Authors: We accept this point. The abstract will be revised to describe the framework as providing 'a practical foundation' rather than implying broad generality without qualification. The diagnostic is presented as a useful rule of thumb within the proposed model, and we will add language indicating that educators should adapt and validate it in their own settings. No new data is available to strengthen the claim, but the revision will align the wording with the illustrative nature of the model.
revision: partial

Circularity Check

0 steps flagged

No significant circularity; the six-move sequence is positioned as an externally derived frame

The paper asserts the six-move sequence (Prime, Probe, Point, Attach, Strengthen, Test) as the basis for the placement rule and diagnostic without any equations, fitted parameters, or self-referential reductions shown in the provided text. It explicitly ties the sequence and rule to cited external causal studies on AI outcomes (17% harm, erasure of harm, doubled learning) rather than deriving the sequence from the rule or from self-citations. No load-bearing step reduces by construction to its own inputs; the framework is presented as a practical synthesis for educators. This is the normal self-contained case with no circularity patterns exhibited.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper introduces a new conceptual model whose central elements are postulated rather than derived from prior data or theorems; no free parameters are fitted in the abstract.

axioms (2)

[ad hoc to paper] A new idea is learned through exactly six ordered moves: Prime, Probe, Point, Attach, Strengthen, Test.
This sequence is presented as the foundational structure for the placement rule.
[domain assumption] Learning requires cognitive effort that cannot be fully replaced by AI without creating an illusion of mastery.
This is the background premise that makes the effortless diagnostic meaningful.

invented entities (1)

[no independent evidence] The Effortless Trap
purpose: Diagnostic rule that flags incorrect AI placement when the task feels too easy.
New concept introduced to operationalize the six-move model.

pith-pipeline@v0.9.1-grok · 5793 in / 1466 out tokens · 25676 ms · 2026-06-26T00:57:26.547194+00:00 · methodology

Reference graph

Works this paper leans on

45 extracted references · 26 canonical work pages

[1] Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., and Mariman, R. (2025). Generative AI without guardrails can harm learning: Evidence from high school mathematics. Proceedings of the National Academy of Sciences, 122(26):e2422633122. https://doi.org/10.1073/pnas.2422633122
[2] Bearman, M., Tai, J., Dawson, P., Boud, D., and Ajjawi, R. (2024). Developing evaluative judgement for a time of generative artificial intelligence. Assessment & Evaluation in Higher Education, 49(6):893–905. https://doi.org/10.1080/02602938.2024.2335321
[3] Biggs, J. (1996). Enhancing teaching through constructive alignment. Higher Education, 32:347–364. https://doi.org/10.1007/BF00138871
[4] Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6):4–16. https://doi.org/10.3102/0013189X013006004
[5] Brcic, M. (2025). The memory wars: AI memory, network effects, and the geopolitics of cognitive sovereignty. Preprint. Companion piece; source of the term “cognitive sovereignty”. https://arxiv.org/abs/2508.05867
[6] Bridgeman, A. and Liu, D. (2024). Frequently asked questions about the two-lane approach to assessment in the age of AI. Teaching@Sydney, University of Sydney. https://educational-innovation.sydney.edu.au/teaching@sydney/frequently-asked-questions-about-the-two-lane-approach-to-assessment-in-the-age-of-ai/
[8] Clark, A. and Chalmers, D. (1998). The extended mind. Analysis, 58(1):7–19. https://doi.org/10.1093/analys/58.1.7
[9] Cohen, P. A., Kulik, J. A., and Kulik, C.-L. C. (1982). Educational outcomes of tutoring: A meta-analysis of findings. American Educational Research Journal, 19(2):237–248. https://doi.org/10.3102/00028312019002237
[10] Collins, A., Brown, J. S., and Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In Resnick, L. B., editor, Knowing, Learning, and Instruction: Essays in Honor of Robert Glaser, pages 453–494. Lawrence Erlbaum.
[11] Crouch, C. H. and Mazur, E. (2001). Peer instruction: Ten years of experience and results. American Journal of Physics, 69(9):970–977. https://doi.org/10.1119/1.1374249
[12] Dawson, P. (2021). Defending Assessment Security in a Digital World: Preventing E-Cheating and Supporting Academic Integrity in Higher Education. Routledge. Crossref registers 2020 (online); 2021 paperback. https://doi.org/10.4324/9780429324178
[13] Deslauriers, L., McCarty, L. S., Miller, K., Callaghan, K., and Kestin, G. (2019). Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom. Proceedings of the National Academy of Sciences (PNAS), 116(39):19251–19257. https://doi.org/10.1073/pnas.1821936116
[14] Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., and Willingham, D. T. (2013). Improving students’ learning with effective learning techniques. Psychological Science in the Public Interest, 14(1):4–58. https://doi.org/10.1177/1529100612453266
[15] Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., and Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2):489–530. https://doi.org/10.1111/bjet.13544
[16] Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., and Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences (PNAS), 111(23):8410–8415. https://doi.org/10.1073/pnas.1319030111
[17] Furze, L., Perkins, M., Roe, J., and MacVaugh, J. (2024). The AI assessment scale (AIAS) in action: A pilot implementation of GenAI-supported assessment. Australasian Journal of Educational Technology, 40(4). https://doi.org/10.14742/ajet.9434
[18] Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1):6. https://doi.org/10.3390/soc15010006
[19] Gick, M. L. and Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15(1):1–38. https://doi.org/10.1016/0010-0285(83)90002-6
[20] Kalyuga, S. (2007). Expertise reversal effect and its implications for learner-tailored instruction. Educational Psychology Review, 19(4):509–539. https://doi.org/10.1007/s10648-007-9054-3
[21] Kapur, M. (2008). Productive failure. Cognition and Instruction, 26(3):379–424. https://doi.org/10.1080/07370000802212669
[22] Kawecki, M. (2025). Mistrz. Documentary on Ryszard Szubartowski (III LO Gdynia), YouTube, 1 July 2025, https://www.youtube.com/watch?v=w20lk3OyLMI. A separate dramatized feature film was announced by Netflix in 2026, https://www.whats-on-netflix.com/news/netflix-to-produce-polish-movie-about-the-teacher-who-mentored-the-minds-behind-openai/. Press figur...
[23] Kestin, G., Miller, K., Klales, A., Milbourne, T., and Ponti, G. (2025). AI tutoring outperforms in-class active learning: an RCT introducing a novel research-based design in an authentic educational setting. Scientific Reports, 15(1):17458. https://doi.org/10.1038/s41598-025-97652-6
[24] Kirschner, P. A., Sweller, J., and Clark, R. E. (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41(2):75–86. https://doi.org/10.1207/s15326985ep4102_1
[25] Klein, C. R. and Klein, R. (2025). The extended hollowed mind: why foundational knowledge is indispensable in the age of AI. Frontiers in Artificial Intelligence, 8:1719019. https://doi.org/10.3389/frai.2025.1719019
[27] Lee, H.-P. H., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., and Wilson, N. (2025). The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–22. https://d...
[28] Lodge, J. M., Howard, S., Bearman, M., Dawson, P., and Associates (2023). Assessment reform for the age of artificial intelligence. Discussion paper, Tertiary Education Quality and Standards Agency (TEQSA). https://www.teqsa.gov.au/guides-resources/resources/corporate-publications/assessment-reform-age-artificial-intelligence
[29] Martin, A. J., Collie, R. J., Kennett, R., Liu, D., Ginns, P., Sudimantara, L. B., Dewi, E. W., and Rüschenpöhler, L. G. (2025). Integrating generative AI and load reduction instruction to individualize and optimize students’ learning. Learning and Individual Differences, 121:102723. https://doi.org/10.1016/j.lindif.2025.102723
[30] Perkins, M., Roe, J., and Furze, L. (2025). Reimagining the artificial intelligence assessment scale: A refined framework for educational assessment. Journal of University Teaching & Learning Practice, 22(7). https://doi.org/10.53761/rrm4y757
[31] Risko, E. F. and Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9):676–688. https://doi.org/10.1016/j.tics.2016.07.002
[32] Roediger, H. L. and Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17(3):249–255. https://doi.org/10.1111/j.1467-9280.2006.01693.x
[33] Roorda, D. L., Koomen, H. M. Y., Spilt, J. L., and Oort, F. J. (2011). The influence of affective teacher-student relationships on students’ school engagement and achievement: A meta-analytic approach. Review of Educational Research, 81(4):493–529. https://doi.org/10.3102/0034654311421793
[34] Roscoe, R. D. and Chi, M. T. H. (2007). Understanding tutor learning: Knowledge-building and knowledge-telling in peer tutors’ explanations and questions. Review of Educational Research, 77(4):534–574. https://doi.org/10.3102/0034654307309920
[35] Rotter, J., Benazet i Montobbio, P., and Hernández-Leo, D. (2026). Access timing as scaffolding: A reinforcement learning approach to GenAI in education. Preprint; single lab study, N=105. https://arxiv.org/abs/2605.15850
[36] Shen, J. H. and Tamkin, A. (2026). How AI impacts skill formation. Preprint; randomized coding task, N=52. https://arxiv.org/abs/2601.20245
[37] Sinha, T. and Kapur, M. (2021). When problem solving followed by instruction works: Evidence for productive failure. Review of Educational Research, 91(5):761–798. https://doi.org/10.3102/00346543211019105
[38] Stankovic, M., Hirche, E., Kollatzsch, S., and Doetsch, J. N. (2025). Comment on: Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing tasks. Preprint; published critique of Kosmyna et al. (2025). https://arxiv.org/abs/2601.00856
[39] Sweller, J., Ayres, P., and Kalyuga, S. (2011). Cognitive Load Theory. Springer. https://doi.org/10.1007/978-1-4419-8126-4
[40] Sweller, J. and Cooper, G. A. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2(1):59–89. https://doi.org/10.1207/s1532690xci0201_3
[41] Tao, S. (2025). Aligning technology with cognitive development: a five-tiered framework to generative AI in K-12 education. AI, Brain and Child, 1(1):20. https://doi.org/10.1007/s44436-025-00024-0
[42] VanLehn, K. (2011). The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist, 46(4):197–221. https://doi.org/10.1080/00461520.2011.611369
[43] Vendrell, M. and Johnston, S.-K. (2026). Scaffolding critical thinking with generative AI: Design principles for integrating large language models in higher education. Computers and Education: Artificial Intelligence, 10:100572. https://doi.org/10.1016/j.caeai.2026.100572
[44] Walton, G. M. and Cohen, G. L. (2011). A brief social-belonging intervention improves academic and health outcomes of minority students. Science, 331(6023):1447–1451. https://doi.org/10.1126/science.1198364
[45] Wang, R. E., Ribeiro, A. T., Robinson, C. D., Loeb, S., and Demszky, D. (2024). Tutor CoPilot: A human-AI approach for scaling real-time expertise. Preprint. https://arxiv.org/abs/2410.03017
[46] Yuan, B. and Hu, J. (2025). Bridging MOOCs, smart teaching, and AI: A decade of evolution toward a unified pedagogy. Preprint. https://arxiv.org/abs/2507.14266
[47] Zhang, L., Lin, J., Kuang, Z., Xu, S., and Hu, X. (2024). SPL: A socratic playground for learning powered by large language model. Preprint. https://arxiv.org/abs/2406.13919

[1] [1]

Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., and Mariman, R. (2025). Generative AI without guardrails can harm learning: Evidence from high school mathematics.Proceedings of the National Academy of Sciences, 122(26):e2422633122. https://doi.org/10.1073/pnas. 2422633122

work page doi:10.1073/pnas 2025

[2] [2]

Bearman, M., Tai, J., Dawson, P., Boud, D., and Ajjawi, R. (2024). Developing evaluative judgement for a time of generative artificial intelligence.Assessment & Evaluation in Higher Education, 49(6):893–905.https://doi.org/10.1080/02602938.2024.2335321

work page doi:10.1080/02602938.2024.2335321 2024

[3] [3]

Biggs, J. (1996). Enhancing teaching through constructive alignment.Higher Education, 32:347–364. https://doi.org/10.1007/BF00138871

work page doi:10.1007/bf00138871 1996

[4] [4]

Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring.Educational Researcher, 13(6):4–16. https://doi.org/10.3102/ 0013189X013006004

1984

[5] [5]

cognitive sovereignty

Brcic, M. (2025). The memory wars: AI memory, network effects, and the geopolitics of cognitive sovereignty. Preprint. Companion piece; source of the term “cognitive sovereignty”. https: //arxiv.org/abs/2508.05867

arXiv 2025

[6] [6]

and Liu, D

Bridgeman, A. and Liu, D. (2024). Frequently asked questions about the two-lane approach to assessment in the age of AI. Teaching@Sydney, University of Syd- ney. https://educational-innovation.sydney.edu.au/teaching@sydney/ frequently-asked-questions-about-the-two-lane-approach-to-assessment-in-the-age-of-ai/

2024

[7] [8]

Clark, D

Clark, A. and Chalmers, D. (1998). The extended mind.Analysis, 58(1):7–19. https://doi.org/ 10.1093/analys/58.1.7

work page doi:10.1093/analys/58.1.7 1998

[8] [9]

A., Kulik, J

Cohen, P. A., Kulik, J. A., and Kulik, C.-L. C. (1982). Educational outcomes of tutoring: A meta-analysis of findings.American Educational Research Journal, 19(2):237–248. https: //doi.org/10.3102/00028312019002237

work page doi:10.3102/00028312019002237 1982

[9] [10]

S., and Newman, S

Collins, A., Brown, J. S., and Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In Resnick, L. B., editor,Knowing, Learning, and Instruction: Essays in Honor of Robert Glaser, pages 453–494. Lawrence Erlbaum

1989

[10] [11]

Crouch, C. H. and Mazur, E. (2001). Peer instruction: Ten years of experience and results.American Journal of Physics, 69(9):970–977.https://doi.org/10.1119/1.1374249. 13

work page doi:10.1119/1.1374249 2001

[11] [12]

(2021).Defending Assessment Security in a Digital World: Preventing E-Cheating and Supporting Academic Integrity in Higher Education

Dawson, P. (2021).Defending Assessment Security in a Digital World: Preventing E-Cheating and Supporting Academic Integrity in Higher Education. Routledge. Crossref registers 2020 (online); 2021 paperback.https://doi.org/10.4324/9780429324178

work page doi:10.4324/9780429324178 2021

[12] [13]

S., Miller, K., Callaghan, K., and Kestin, G

Deslauriers, L., McCarty, L. S., Miller, K., Callaghan, K., and Kestin, G. (2019). Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom. Proceedings of the National Academy of Sciences (PNAS), 116(39):19251–19257. https://doi. org/10.1073/pnas.1821936116

work page doi:10.1073/pnas.1821936116 2019

[13] [14]

A., Marsh, E

Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., and Willingham, D. T. (2013). Improving students’ learning with effective learning techniques.Psychological Science in the Public Interest, 14(1):4–58.https://doi.org/10.1177/1529100612453266

Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., and Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2):489–530. https://doi.org/10.1111/bjet.13544

Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., and Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences (PNAS), 111(23):8410–8415. https://doi.org/10.1073/pnas.1319030111

Furze, L., Perkins, M., Roe, J., and MacVaugh, J. (2024). The AI assessment scale (AIAS) in action: A pilot implementation of GenAI-supported assessment. Australasian Journal of Educational Technology, 40(4). https://doi.org/10.14742/ajet.9434

Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1):6. https://doi.org/10.3390/soc15010006

Gick, M. L. and Holyoak, K. J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15(1):1–38. https://doi.org/10.1016/0010-0285(83)90002-6

Kalyuga, S. (2007). Expertise reversal effect and its implications for learner-tailored instruction. Educational Psychology Review, 19(4):509–539. https://doi.org/10.1007/s10648-007-9054-3

Kapur, M. (2008). Productive failure. Cognition and Instruction, 26(3):379–424. https://doi.org/10.1080/07370000802212669

Kawecki, M. (2025). Mistrz. Documentary on Ryszard Szubartowski (III LO Gdynia), YouTube, 1 July 2025, https://www.youtube.com/watch?v=w20lk3OyLMI. A separate dramatized feature film was announced by Netflix in 2026, https://www.whats-on-netflix.com/news/netflix-to-produce-polish-movie-about-the-teacher-who-mentored-the-minds-behind-openai/. Press figur...

Kestin, G., Miller, K., Klales, A., Milbourne, T., and Ponti, G. (2025). AI tutoring outperforms in-class active learning: an RCT introducing a novel research-based design in an authentic educational setting. Scientific Reports, 15(1):17458. https://doi.org/10.1038/s41598-025-97652-6

Kirschner, P. A., Sweller, J., and Clark, R. E. (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41(2):75–86. https://doi.org/10.1207/s15326985ep4102_1

Klein, C. R. and Klein, R. (2025). The extended hollowed mind: why foundational knowledge is indispensable in the age of AI. Frontiers in Artificial Intelligence, 8:1719019. https://doi.org/10.3389/frai.2025.1719019

Lee, H.-P. H., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., and Wilson, N. (2025). The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–22. https://d...

Lodge, J. M., Howard, S., Bearman, M., Dawson, P., and Associates (2023). Assessment reform for the age of artificial intelligence. Discussion paper, Tertiary Education Quality and Standards Agency (TEQSA). https://www.teqsa.gov.au/guides-resources/resources/corporate-publications/assessment-reform-age-artificial-intelligence

Martin, A. J., Collie, R. J., Kennett, R., Liu, D., Ginns, P., Sudimantara, L. B., Dewi, E. W., and Rüschenpöhler, L. G. (2025). Integrating generative AI and load reduction instruction to individualize and optimize students’ learning. Learning and Individual Differences, 121:102723. https://doi.org/10.1016/j.lindif.2025.102723

Perkins, M., Roe, J., and Furze, L. (2025). Reimagining the artificial intelligence assessment scale: A refined framework for educational assessment. Journal of University Teaching & Learning Practice, 22(7). https://doi.org/10.53761/rrm4y757

Risko, E. F. and Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9):676–688. https://doi.org/10.1016/j.tics.2016.07.002

Roediger, H. L. and Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17(3):249–255. https://doi.org/10.1111/j.1467-9280.2006.01693.x

Roorda, D. L., Koomen, H. M. Y., Spilt, J. L., and Oort, F. J. (2011). The influence of affective teacher-student relationships on students’ school engagement and achievement: A meta-analytic approach. Review of Educational Research, 81(4):493–529. https://doi.org/10.3102/0034654311421793

Roscoe, R. D. and Chi, M. T. H. (2007). Understanding tutor learning: Knowledge-building and knowledge-telling in peer tutors’ explanations and questions. Review of Educational Research, 77(4):534–574. https://doi.org/10.3102/0034654307309920

Rotter, J., Benazet i Montobbio, P., and Hernández-Leo, D. (2026). Access timing as scaffolding: A reinforcement learning approach to GenAI in education. Preprint; single lab study, N=105. https://arxiv.org/abs/2605.15850

Shen, J. H. and Tamkin, A. (2026). How AI impacts skill formation. Preprint; randomized coding task, N=52. https://arxiv.org/abs/2601.20245

Sinha, T. and Kapur, M. (2021). When problem solving followed by instruction works: Evidence for productive failure. Review of Educational Research, 91(5):761–798. https://doi.org/10.3102/00346543211019105

Stankovic, M., Hirche, E., Kollatzsch, S., and Doetsch, J. N. (2025). Comment on: Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing tasks. Preprint; published critique of Kosmyna et al. (2025). https://arxiv.org/abs/2601.00856

Sweller, J., Ayres, P., and Kalyuga, S. (2011). Cognitive Load Theory. Springer. https://doi.org/10.1007/978-1-4419-8126-4

Sweller, J. and Cooper, G. A. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2(1):59–89. https://doi.org/10.1207/s1532690xci0201_3

Tao, S. (2025). Aligning technology with cognitive development: a five-tiered framework to generative AI in K-12 education. AI, Brain and Child, 1(1):20. https://doi.org/10.1007/s44436-025-00024-0

VanLehn, K. (2011). The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist, 46(4):197–221. https://doi.org/10.1080/00461520.2011.611369

Vendrell, M. and Johnston, S.-K. (2026). Scaffolding critical thinking with generative AI: Design principles for integrating large language models in higher education. Computers and Education: Artificial Intelligence, 10:100572. https://doi.org/10.1016/j.caeai.2026.100572

Walton, G. M. and Cohen, G. L. (2011). A brief social-belonging intervention improves academic and health outcomes of minority students. Science, 331(6023):1447–1451. https://doi.org/10.1126/science.1198364

Wang, R. E., Ribeiro, A. T., Robinson, C. D., Loeb, S., and Demszky, D. (2024). Tutor CoPilot: A human-AI approach for scaling real-time expertise. Preprint. https://arxiv.org/abs/2410.03017

Yuan, B. and Hu, J. (2025). Bridging MOOCs, smart teaching, and AI: A decade of evolution toward a unified pedagogy. Preprint. https://arxiv.org/abs/2507.14266

Zhang, L., Lin, J., Kuang, Z., Xu, S., and Hu, X. (2024). SPL: A socratic playground for learning powered by large language model. Preprint. https://arxiv.org/abs/2406.13919
