Where's the Structure? A Systematic Literature Review of Empirical Research on Human-AI Collaboration and Hybrid Intelligence for Learning

Juan I. Asensio-Pérez; Luis P. Prieto; María Jesús Rodríguez-Triana; Mohamed Saban; Yannis Dimitriadis

arxiv: 2606.05222 · v1 · pith:67NQNIO6new · submitted 2026-05-30 · 💻 cs.CY · cs.AI · cs.HC

Pith reviewed 2026-06-28 18:22 UTC · model grok-4.3

classification 💻 cs.CY · cs.AI · cs.HC

keywords human-AI collaboration · hybrid intelligence · systematic literature review · learning support · collaboration structures · educational AI · design knowledge · research gaps

The pith

A review of 62 empirical studies maps how human-AI collaboration for learning is currently structured, argues that unstructured interaction does not by itself produce effective learning, and identifies research gaps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper conducts a systematic literature review of 62 empirical studies on human-AI collaboration and hybrid intelligence for learning. It characterizes the collaboration processes, their structures, and the contexts in which they are applied. The review also extracts emerging design knowledge and highlights research gaps in the field. A sympathetic reader would care because, as with human collaboration, unstructured interactions may not yield effective learning outcomes. The findings provide a starting point for designing more effective AI-enhanced educational technologies.

Core claim

The review of 62 empirical studies characterizes collaboration processes, their structures, and contexts of application in human-AI collaboration for learning, while extracting emerging design knowledge and research gaps.

What carries the argument

The systematic literature review of empirical studies on human-AI collaboration, used to identify and categorize collaboration structures and processes.

Load-bearing premise

The assumption that the 62 selected studies adequately represent the entire field of human-AI collaboration for learning and that the categorization of structures and gaps is accurate.

What would settle it

Finding a significant number of additional empirical studies with effective unstructured human-AI collaboration, or a replication review reaching substantially different conclusions on structures and gaps.

Original abstract

Artificial intelligence (AI) has been applied across educational contexts to support learning. One approach to such support is "human-AI collaboration" (also termed "hybrid intelligence"), where human(s) and AI components interact to promote human learning. However, as in human-to-human computer-supported collaborative learning (CSCL), unstructured interaction does not necessarily produce an effective learning experience. This paper reports a systematic literature review of empirical studies (N=62) on human-AI collaboration and hybrid intelligence for learning support. The review characterizes collaboration processes, their structures, and contexts of application. It also extracts emerging design knowledge and research gaps. Researchers and technology designers can use these findings as a starting point for structuring more effective AI-enhanced technologies for collaboration, in educational practice and future research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

This is a literature review of 62 studies on human-AI collaboration in learning that flags the need for structure and lists some gaps, but the abstract gives no search methods, inclusion criteria, or synthesis details, which makes the findings hard to trust on the abstract alone.

The main takeaway is that this paper reviews 62 empirical studies on human-AI collaboration and hybrid intelligence for learning. It characterizes the processes, structures, and contexts involved, then pulls out emerging design knowledge and research gaps. The authors correctly note that unstructured interaction is unlikely to produce strong learning outcomes, drawing the parallel to CSCL.

What the paper does is organize existing work in one place and give designers a high-level map of where the field stands. That kind of synthesis can save time for people entering the area or looking for patterns across studies.

The soft spot is straightforward and central: the abstract states the sample size and goals but supplies no information at all on databases searched, search strings, inclusion or exclusion rules, screening process, quality assessment, data extraction, or how the thematic synthesis was done. Without those steps, there is no way to judge whether the 62 studies are representative or whether the reported structures and gaps reflect the literature rather than selection choices. For a systematic review, that is a load-bearing problem.

This work is aimed at researchers and designers in AI for education who want a quick overview of empirical human-AI collaboration studies. A reader already familiar with the subfield might skim it for the gap list, but anyone planning to cite the specific characterizations would need the full methods first.

It deserves peer review because the topic matters and the intent to structure the literature is reasonable, but the methods section would have to be added and scrutinized before any acceptance.

Referee Report

2 major / 1 minor

Summary. The manuscript reports a systematic literature review of N=62 empirical studies on human-AI collaboration and hybrid intelligence for learning. It aims to characterize collaboration processes, their structures, and contexts of application, extract emerging design knowledge, and identify research gaps to inform the design of AI-enhanced collaborative technologies.

Significance. If the synthesis rests on a rigorous, transparent, and representative sample with clearly documented methods, the extracted structures, design knowledge, and gaps could provide a useful foundation for researchers and designers working at the intersection of AI and computer-supported collaborative learning.

major comments (2)

[Abstract] Abstract: The abstract states N=62 and high-level goals but supplies no information on search strategy, inclusion criteria, quality assessment, or inter-rater reliability, so it is impossible to judge whether the synthesis supports the stated claims about collaboration processes and research gaps.
[Methods (or equivalent section describing the review process)] Review process / Methods section: No description of databases searched, search strings, screening process, data extraction protocol, or thematic synthesis method is provided. This information is load-bearing for determining whether the 62 studies constitute a representative sample and whether the reported structures and gaps accurately reflect the literature.

minor comments (1)

Add a PRISMA flow diagram or equivalent table documenting the study identification, screening, and inclusion steps.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the transparency of our systematic review. We agree that the current manuscript version does not adequately document the review process and will revise accordingly to strengthen the paper.

Referee: [Abstract] Abstract: The abstract states N=62 and high-level goals but supplies no information on search strategy, inclusion criteria, quality assessment, or inter-rater reliability, so it is impossible to judge whether the synthesis supports the stated claims about collaboration processes and research gaps.

Authors: We agree that the abstract should convey more information about the review's methodological rigor. In the revised version we will expand the abstract (within length constraints) to include a concise statement on the search strategy, inclusion/exclusion criteria, quality assessment, and synthesis approach. revision: yes

Referee: [Methods (or equivalent section describing the review process)] Review process / Methods section: No description of databases searched, search strings, screening process, data extraction protocol, or thematic synthesis method is provided. This information is load-bearing for determining whether the 62 studies constitute a representative sample and whether the reported structures and gaps accurately reflect the literature.

Authors: We acknowledge that the submitted manuscript lacks a sufficiently detailed Methods section. We will add a complete Methods section that reports the databases searched, exact search strings, PRISMA screening process, data extraction protocol, inter-rater reliability statistics, and the thematic synthesis procedure. This addition will directly address the concern about representativeness and allow readers to evaluate the validity of the extracted structures and gaps. revision: yes
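One common way to report the inter-rater reliability statistics mentioned in the response above is Cohen's kappa for two coders. The following is a minimal sketch only: the function name `cohens_kappa` and the example labels are hypothetical illustrations, not data or procedures from the reviewed paper, and the "assistant"/"teammate" categories are borrowed loosely from the paper's AI-mode coding purely as an example.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)

    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: product of each coder's marginal label proportions,
    # summed over labels (labels unused by coder B count as zero).
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is
        # undefined, so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four studies (illustration only, not real data).
coder_a = ["assistant", "assistant", "teammate", "assistant"]
coder_b = ["assistant", "teammate", "teammate", "assistant"]
kappa = cohens_kappa(coder_a, coder_b)
```

Kappa corrects raw agreement for agreement expected by chance, which matters when one category dominates the sample. A review that resolves disagreements by discussion, rather than by independent double coding of the full set, may reasonably report that consensus procedure instead.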

Circularity Check

0 steps flagged

No circularity: literature review with no derivations or self-referential claims

This is a systematic literature review synthesizing findings from 62 external empirical studies. No models, equations, parameters, predictions, or uniqueness theorems are derived. The central claims rest on external literature rather than any reduction to the authors' own inputs, fits, or prior self-citations. No load-bearing steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The review's claims rest on the assumption that the literature search was exhaustive and unbiased and that the selected studies can be meaningfully compared; these are standard but unverified domain assumptions for any SLR.

axioms (1)

domain assumption: The 62 studies identified through the search strategy are representative of empirical research on human-AI collaboration for learning.
Required for the characterization of processes, structures, and gaps to be generalizable.

pith-pipeline@v0.9.1-grok · 5697 in / 1036 out tokens · 28539 ms · 2026-06-28T18:22:13.357630+00:00 · methodology

Reference graph

Works this paper leans on

42 extracted references · 11 canonical work pages

[1]

The form combined closed categories (e.g., educational level in IQ1.2; micro-/macro-type of collaboration structure as part of IQ3.1) with open responses (e.g., how is the collaboration process between learner and AI? IQ2.4). All team members independently coded a purposefully selected subset of four sources; disagreements and doubts were discussed until ...

2006
[2]

Regarding educational settings (IQ1.2, Figure 3, top-right), most studies were conducted in higher education (57/62 studies)

largely reflects our aim of identifying more established work (and the databases selected). Regarding educational settings (IQ1.2, Figure 3, top-right), most studies were conducted in higher education (57/62 studies). Only a few addressed other levels, such as C.-H. Lin et al.’s (2025) exploration of GenAI for second-language writing in K-12, or were leve...

2025
[3]

with versus without

and project-based learning (14 papers), which is consistent with our focus on HAIC/HI and with the fact that projects are often developed through teamwork. Figure 3: Descriptive characteristics of the N=62 reviewed papers. The reviewed papers’ research designs (IQ1.6) also reflect an early-stage, technology-driven field. Most were...

2024
[4]

hybrid adaptivity

define HI as the "combination of human and machine intelligence, augmenting human intellect and capabilities instead of replacing them and achieving goals that were unreachable by either humans or machines" (p. 491). Notably, none of the reviewed studies reconceptualized these general definitions for learning-specific contexts. The work of Dellermann and ...

2021
[5]

AI assistant

further triangulates researchers' conceptualizations. A large majority (52/62 papers) used an "AI assistant" mode, in which humans offload tasks or expect answers from the AI (e.g., a ChatGPT chatbot). Despite the HAIC/HI framing, only 10 studies portrayed a "teammate AI" — in which the AI operates as a roughly equal team member (e.g., a system promoting ...

2023
[6]

collaboration

or as a teammate (Darban, 2024), reflecting the terminological fuzziness typical of early-stage research areas (cf. Borup et al., 2006; Rip & Voß, 2013). An inductive analysis of collaboration setups (IQ2.4) showed that most studies gave learners a vanilla, non-customized AI chatbot for student-initiated "collaboration" (34 studies), with some providing a...

2024
[7]

macro scripts/structures

or process-specific (e.g., designated AI use phases, as in Min et al., 2025). Technological scaffolding integrated into system interfaces was less common (6 papers). Another recurring setup inserted the AI as an actor in human-to-human conversations (6 papers; e.g., Gutiérrez-Ferré et al., 2024). While learner initiative dominated, notable exceptions exis...

2025
[8]

micro scripts/structures

– and an overlapping set of "micro scripts/structures" (25/38), which scaffold the interaction itself through devices such as sentence starters or question prompts (cf. Kobbe et al., 2007). As noted under IQ2.4, these structures were often delivered socially (15/38, e.g., teacher instructions as in Garro Mena,

2007
[9]

In 21 of those 38 studies, the constraints were additionally driven by LLM prompting (as in CodeTutor, Lyu et al.,

but were sometimes also embedded in technology (15/38, e.g., user interface phases/transitions, as in Weber et al., 2025). In 21 of those 38 studies, the constraints were additionally driven by LLM prompting (as in CodeTutor, Lyu et al.,

2025
[10]

AI creates/answers, humans refine

"AI creates/answers, humans refine" (46/62 papers): Students delegate tasks or pose questions to the AI, then assess the AI-generated artefacts or answers, deciding whether further refinement is needed and engaging in multiple cycles of human feedback and AI refinement, either by modifying outputs themselves or by issuing additional requests to the AI. In...

2025
[11]

Humans create, AI assesses

"Humans create, AI assesses" (18/62 papers): Students carry out a learning task, typically producing an artefact such as a document or drawing, and at various points may request AI assessment and feedback. That feedback may be structured by a pedagogical framework (akin to a micro structure) or constrained loosely, or not at all, e.g., through LLM prompti...

2025
[12]

AI participation in human collaboration

"AI participation in human collaboration" (7/62 papers): Students collaborate through computer support (e.g., a chat), and the AI monitors that collaboration, intervening in three ways: (a) providing an assessment of the ongoing human collaboration (Cai et al., 2024; Sankaranarayanan et al.,

2024
[13]

coach" or

– for example, a chatbot detecting unequal participation and suggesting more opportunities for less-active students (Cai et al., 2024); (b) answering a question raised by a participating student (Cai et al., 2024), potentially triggering a shift toward structure #2; or (c) playing the role of another team member (5 papers) – for instance, an LLM-based cha...

2024
[14]

Beyond these collaboration structures, a key concern in recent literature is the tension among human control vs

AI participation in human collaboration (bottom). Beyond these collaboration structures, a key concern in recent literature is the tension among human control vs. AI automation, and agency (IQ3.2) which has implications for the reliability, safety, and trustworthiness of human-centered AI (HCAI) approaches (Shneiderman, 2020). These tensions are well-reco...

2020
[15]

metacognitive laziness

or "metacognitive laziness" (Y. Fan et al., 2025). Regarding human vs. AI control, only in Mlynář (2024) do students exercise full control over the AI tool, as they are the ones actually building the machine learning model. Gyasi (2025) presents another notable case, allowing students to change the AI's "mode of contribution" (see IQ2.3) by switching betw...

2025
[16]

we encourage graduate students to reassess their SMART goals and action plans with their human mentors for personalized support

and find advantages in AI agents assuming various roles/personas, thus being perceived as realistic, competent, and dependable partners (Edwards et al., 2025). The reviewed papers offer a wide variety of more concrete design guidelines. These include using gamification to increase motivation (Aslan et al., 2024), having humans double-check AI outputs (“we...

2025
[17]

and integrating interventions in wider contexts (e.g., going beyond coding to computational modeling, in Chen et al., 2024). Other proposals include combining AI agents with self-monitoring SRL (Self-Regulated Learning) checklists to reduce machine dominance and increase humanization of feedback (Darvishi et al.,

2024
[18]

are particularly good at recognizing errors or misunderstandings in students' tasks

and leveraging AI misrecognition as a learning opportunity rather than merely an error (Song et al., 2022). Since AI systems "are particularly good at recognizing errors or misunderstandings in students' tasks" (Weber et al., 2025, p. 669), instructional and technology designers can use them to spot specific errors and provide targeted formative feedback ...

2022
[19]

Authors also stress the need for AI literacy, including prompting skills (6 papers) and broader understanding of AI capabilities and limitations (12 papers)

(10 papers) and using misrecognitions or failures as learning opportunities (Song et al., 2022). Authors also stress the need for AI literacy, including prompting skills (6 papers) and broader understanding of AI capabilities and limitations (12 papers). Other works call for aligning HAIC/HI interventions with overall curriculum (e.g., Hwang et al.,

2022
[20]

co-learning

(9 papers), or for creating a community of all related stakeholders, e.g., educators, researchers, and developers (Alier et al., 2025). Worth highlighting on its own is Bosch et al.'s (2025) work, noted above as a rare case of genuine "co-learning", in which both human and AI learn through interaction with each other and the environment. The authors provi...

2025
[21]

co-creation

explicitly identify HAIC/HI structures and their associated interactions as a primary target for future research, reinforcing the conclusion from IQ2.4 that the collaborative dimension of HAIC/HI remains peripheral to most reviewed works. Discussion Implications of the Review Our analysis of 62 empirical studies on HAIC/HI for student learning reflects an...

2025
[22]

Hybrid intelligence

– remains debatable, given the typical disparity in human and AI goals and the fact that, in the overwhelming majority of cases, AIs do not learn (see Bosch et al., 2025 for an exception). "Hybrid intelligence" is therefore more literally precise, focusing less on the interaction and relationships between humans and AIs, and more...

2025
[23]

collaboration

More detailed descriptions of human-AI interactions: as established in (CS)CL research (e.g., Dillenbourg et al., 1996; Stahl, 2006), granular accounts of "collaboration" are essential for understanding learning phenomena in these new settings

1996
[24]

Exploration of more complex collaboration structures: the structures surfaced in this review are relatively primitive (see IQ3.1). Emerging theoretical and empirical work is starting to identify more sophisticated micro- and macro-level HAIC/HI structures (e.g., Maya, 2024; Prieto et al., 2023), but empirical testing and comparison (against each other or ...

2024
[25]

novelty effects

More longitudinal studies: moving beyond one-shot interventions is necessary to understand HAIC/HI as a potentially distinct form of learning, to rule out "novelty effects", and to examine both its benefits and unwanted side-effects (such as “metacognitive laziness” in Y. Fan et al., 2025)

2025
[26]

Decision Support

More rigorous evaluation of learning gains and learner skills, given that assessing the effects of interventions on learning (not merely artifact quality or task efficiency), despite being the ostensible goal of the field, remains comparatively rare (see IQ1.9 above; and Yan, Greiff, et al., 2025 for a broader discussion in the c...

2025
[27]

peer/equal

– such as assistant, coach, or teammate – the rarely-encountered "peer/equal" role, in which AI mimics a fellow student with a comparable knowledge level, merits deeper exploration. Adjacent literature reviews identify overlapping recurrent roles: learning partner (Deng & Yu, 2023), motivator, peer/co-learner (Han et al.,

2023
[28]

– all of which can be repurposed in a way that is pedagogically sound. Sharples (2023) proposes more specific AI functions for learning conversations: possibility engine, Socratic opponent, collaboration coach… categories that partly overlap our empirically-sourced ones. Other emerging sources describe pedagogical roles for GenAI in WHERE’S THE STRUCTURE?...

2023
[29]

where is the structure?

can be directly incorporated into prompts in LLM-based HAIC/HI systems. The paper's title question ("where is the structure?") also points to how structures are implemented. Our synthesis (IQ2.4, IQ3.1) reveals a majority of studies relying on socially-implemented structures, such as teacher instructions. This approach is flexible and low-cost but unrelia...

2013
[30]

We therefore acknowledge that, by publication, some findings may already be outdated

made for a slower review and writeup. We therefore acknowledge that, by publication, some findings may already be outdated. That said, our unsystematic reading of more recent sources appears to confirm the general trends identified (definitional fuzziness, lack of interaction detail, failure to measure learning – see, e.g., Ong et al., 2026; F. Zhang et a...

2026
[31]

may follow. Regarding future research, a priority should be convening groups of HAIC/HI (for learning) experts to engage in pattern synthesis, a practice well established in collaborative learning and learning design research (Baggetun et al., 2004; Goodyear,

2004
[32]

energy crisis

with the goal of deriving formal design patterns in the Alexandrian sense. Given the volume of recent work and likely tacit knowledge accumulating among researchers not yet codified in publications, formats such as Mor and colleagues’ "participatory pattern workshops" could be especially productive for building the design knowledge needed to support more ...

work page doi:10.13039/501100011033 2012
[33]

https://doi.org/10.48550/ARXIV.2503.16307 Akata, Z., Balliet, D., De Rijke, M., Dignum, F., Dignum, V., Eiben, G., Fokkens, A., Grossi, D., Hindriks, K., Hoos, H., Hung, H., Jonker, C., Monz, C., Neerincx, M., Oliehoek, F., Prakken, H., Schlobach, S., Van Der Gaag, L., Van Harmelen, F., … Welling, M. (2020). A Research Agenda for Hybrid Intelligence: Augm...

work page doi:10.48550/arxiv.2503.16307 2020
[34]

F., & Angel, S

https://doi.org/10.1007/s44163-024-00203-7 Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., King, I. F., & Angel, S. (1977). A pattern language: Towns, buildings, construction. Oxford University Press. Alier, M., Pereira, J., García-Peñalvo, F. J., Casañ, M. J., & Cabré, J. (2025). LAMB: An open-source software framework to create artificial in...

work page doi:10.1007/s44163-024-00203-7 1977
[35]

Collaborative Learning

https://doi.org/10.3390/su15042940 Dillenbourg, P. (1999). What do you mean by “Collaborative Learning”? In P. Dillenbourg (Ed.), Collaborative Learning. Cognitive and Computational Approaches (pp. 1–19). Elsevier Science. Dillenbourg, P. (2013). Design for classroom orchestration. Computers and Education, 69, 485–492. Dillenbour...

work page doi:10.3390/su15042940 1999
[36]

Let’s Ask the Robot!

https://doi.org/10.1186/s40594-025-00537-3 Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bje...

work page doi:10.1186/s40594-025-00537-3 2025
[37]

https://doi.org/10.1186/s41239-023-00426-1 Lee, G.-G., Mun, S., Shin, M.-K., & Zhai, X. (2025). Collaborative Learning with Artificial Intelligence Speakers: Pre-service Elementary Science Teachers’ Responses to the Prototype. Science & Education, 34(2), 847–875. https://doi.org/10.1007/s11191-024-00526-y Lin, C.-H., Zhou, K., Li...

work page doi:10.1186/s41239-023-00426-1 2025
[38]

https://doi.org/10.5334/2008-13 Nguyen, A., Hong, Y., Dang, B., & Huang, X. (2024). Human-AI collaboration patterns in AI-assisted academic writing. Studies in Higher Education, 49(5), 847–864. https://doi.org/10.1080/03075079.2024.2323593 Nguyen, A., Ilesanmi, F., Dang, B., Vuorenmaa, E., & Järvelä, S. (2024). Hybrid Intelligenc...

work page doi:10.5334/2008-13 2008
[39]

Human-AI Collaboration

https://doi.org/10.3390/fi16080268 Prieto, L. P., Asensio-Perez, J. I., Munoz-Cristobal, J. A., Dimitriadis, Y. A., Jorrin-Abellan, I. M., & Gomez-Sanchez, E. (2013). Enabling Teachers to Deploy CSCL Designs across Distributed Learning Environments. IEEE Transactions on Learning Technologies, 6(4), 324–336. https://doi.org/10.110...

work page doi:10.3390/fi16080268 2013
[40]

https://doi.org/10.1613/jair.1.12360 Wiethof, C., & Bittner, E. A. C. (2021). Hybrid Intelligence – Combining the Human in the Loop with the Computer in the Loop: A Systematic Literature Review. International Conference on Information Systems (ICIS). Wollny, S., Schneider, J., Di Mitri, D., Weidlich, J., Rittberger, M., & Drachsler, H. (2021). Are We Ther...

work page doi:10.1613/jair.1.12360 2021
[41]

https://doi.org/10.1186/s41239-019-0171-0 Zhai, C., Wibowo, S., & Li, L. D. (2024). The effects of over-reliance on AI dialogue systems on students’ cognitive abilities: A systematic review. Smart Learning Environments, 11(1),

work page doi:10.1186/s41239-019-0171-0 2024
[42]

All content was reviewed and verified by the author

https://doi.org/10.1186/s40561-024-00316-7 Zhang, F., Gou, J., Shen, K. N., Camarinha-Matos, L. M., & Wang, Z. (2025). Effects of AI teammates on learning behavior in Human-AI collaboration environments: A perspective on self-regulated learning. Education and Information Technologies, 30(18), 26801–26825. https://doi.org/10.1007/s10639-025-13717-z WHERE’S...

work page doi:10.1186/s40561-024-00316-7 2025

[1] [1]

The form combined closed categories (e.g., educational level in IQ1.2; micro-/macro-type of collaboration structure as part of IQ3.1) with open responses (e.g., how is the collaboration process between learner and AI? IQ2.4). All team members independently coded a purposefully selected subset of four sources; disagreements and doubts were discussed until ...

2006

[2] [2]

Regarding educational settings (IQ1.2, Figure 3, top-right), most studies were conducted in higher education (57/62 studies)

largely reflects our aim of identifying more established work (and the databases selected). Regarding educational settings (IQ1.2, Figure 3, top-right), most studies were conducted in higher education (57/62 studies). Only a few addressed other levels, such as C.-H. Lin et al.’s (2025) exploration of GenAI for second-language writing in K-12, or were leve...

2025

[3] [3]

with versus without

and project-based learning (14 papers), which is consistent with our focus on HAIC/HI and with the fact that projects are often developed through teamwork. Figure 3 WHERE’S THE STRUCTURE? 16 Descriptive characteristics of the N=62 reviewed papers. The reviewed papers’ research designs (IQ1.6) also reflect an early-stage, technology-driven field. Most were...

2024

[4] [4]

hybrid adaptivity

define HI as the "combination of human and machine intelligence, augmenting human intellect and capabilities instead of replacing them and achieving goals that were unreachable by either humans or machines" (p. 491). Notably, none of the reviewed studies reconceptualized these general definitions for learning-specific contexts. The work of Dellermann and ...

2021

[5] [5]

AI assistant

further triangulates researchers' conceptualizations. A large majority (52/62 papers) used an "AI assistant" mode, in which humans offload tasks or expect answers from the AI (e.g., a ChatGPT chatbot). Despite the HAIC/HI framing, only 10 studies portrayed a "teammate AI" — in which the AI operates as a roughly equal team member (e.g., a system promoting ...

2023

[6] [6]

collaboration

or as a teammate (Darban, 2024), reflecting the terminological fuzziness typical of early-stage research areas (cf. Borup et al., 2006; Rip & Voß, 2013). An inductive analysis of collaboration setups (IQ2.4) showed that most studies gave learners a vanilla, non-customized AI chatbot for student-initiated "collaboration" (34 studies), with some providing a...

2024

[7] [7]

macro scripts/structures

or process-specific (e.g., designated AI use phases, as in Min et al., 2025). Technological scaffolding integrated into system interfaces was less common (6 papers). Another recurring setup inserted the AI as an actor in human-to-human conversations (6 papers; e.g., Gutiérrez-Ferré et al., 2024). While learner initiative dominated, notable exceptions exis...

2025

[8] [8]

micro scripts/structures

– and an overlapping set of "micro scripts/structures" (25/38), which scaffold the interaction itself through devices such as sentence starters or question prompts (cf. Kobbe et al., 2007). As noted under IQ2.4, these structures were often delivered socially (15/38, e.g., teacher instructions as in Garro Mena,

2007

[9] [9]

In 21 of those 38 studies, the constraints were additionally driven by LLM prompting (as in CodeTutor, Lyu et al.,

but were sometimes also embedded in technology (15/38, e.g., user interface phases/transitions, as in Weber et al., 2025). In 21 of those 38 studies, the constraints were additionally driven by LLM prompting (as in CodeTutor, Lyu et al.,

2025

[10] [10]

AI creates/answers, humans refine

"AI creates/answers, humans refine" (46/62 papers): Students delegate tasks or pose questions to the AI, then assess the AI-generated artefacts or answers, deciding whether further refinement is needed and engaging in multiple cycles of human feedback and AI refinement, either by modifying outputs themselves or by issuing additional requests to the AI. In...

2025

[11] [11]

Humans create, AI assesses

"Humans create, AI assesses" (18/62 papers): Students carry out a learning task, typically producing an artefact such as a document or drawing, and at various points may request AI assessment and feedback. That feedback may be structured by a pedagogical framework (akin to a micro structure) or constrained loosely, or not at all, e.g., through LLM prompti...

2025

[12] [12]

AI participation in human collaboration

"AI participation in human collaboration" (7/62 papers): Students collaborate through computer support (e.g., a chat), and the AI monitors that collaboration, intervening in three ways: (a) providing an assessment of the ongoing human collaboration (Cai et al., 2024; Sankaranarayanan et al.,

2024

[13] [13]

coach" or

– for example, a chatbot detecting unequal participation and suggesting more opportunities for less-active students (Cai et al., 2024); (b) answering a question raised by a participating student (Cai et al., 2024), potentially triggering a shift toward structure #2; or (c) playing the role of another team member (5 papers) – for instance, an LLM-based cha...

2024

[14] [14]

Beyond these collaboration structures, a key concern in recent literature is the tension among human control vs

AI participation in human collaboration (bottom). Beyond these collaboration structures, a key concern in recent literature is the tension among human control vs. AI automation, and agency (IQ3.2), which has implications for the reliability, safety, and trustworthiness of human-centered AI (HCAI) approaches (Shneiderman, 2020). These tensions are well-reco...

2020

[15] [15]

metacognitive laziness

or "metacognitive laziness" (Y. Fan et al., 2025). Regarding human vs. AI control, only in Mlynář (2024) do students exercise full control over the AI tool, as they are the ones actually building the machine learning model. Gyasi (2025) presents another notable case, allowing students to change the AI's "mode of contribution" (see IQ2.3) by switching betw...

2025

[16] [16]

we encourage graduate students to reassess their SMART goals and action plans with their human mentors for personalized support

and find advantages in AI agents assuming various roles/personas, thus being perceived as realistic, competent, and dependable partners (Edwards et al., 2025). The reviewed papers offer a wide variety of more concrete design guidelines. These include using gamification to increase motivation (Aslan et al., 2024), having humans double-check AI outputs (“we...

2025

[17] [17]

and integrating interventions in wider contexts (e.g., going beyond coding to computational modeling, in Chen et al., 2024). Other proposals include combining AI agents with self-monitoring SRL (Self-Regulated Learning) checklists to reduce machine dominance and increase humanization of feedback (Darvishi et al.,

2024

[18] [18]

are particularly good at recognizing errors or misunderstandings in students' tasks

and leveraging AI misrecognition as a learning opportunity rather than merely an error (Song et al., 2022). Since AI systems "are particularly good at recognizing errors or misunderstandings in students' tasks" (Weber et al., 2025, p. 669), instructional and technology designers can use them to spot specific errors and provide targeted formative feedback ...

2022

[19] [19]

Authors also stress the need for AI literacy, including prompting skills (6 papers) and broader understanding of AI capabilities and limitations (12 papers)

(10 papers) and using misrecognitions or failures as learning opportunities (Song et al., 2022). Authors also stress the need for AI literacy, including prompting skills (6 papers) and broader understanding of AI capabilities and limitations (12 papers). Other works call for aligning HAIC/HI interventions with overall curriculum (e.g., Hwang et al.,

2022

[20] [20]

co-learning

(9 papers), or for creating a community of all related stakeholders, e.g., educators, researchers, and developers (Alier et al., 2025). Worth highlighting on its own is Bosch et al.'s (2025) work, noted above as a rare case of genuine "co-learning", in which both human and AI learn through interaction with each other and the environment. The authors provi...

2025

[21] [21]

co-creation

explicitly identify HAIC/HI structures and their associated interactions as a primary target for future research, reinforcing the conclusion from IQ2.4 that the collaborative dimension of HAIC/HI remains peripheral to most reviewed works. Discussion Implications of the Review Our analysis of 62 empirical studies on HAIC/HI for student learning reflects an...

2025

[22] [22]

Hybrid intelligence

– remains debatable, given the typical disparity in human and AI goals and the fact that, in the overwhelming majority of cases, AIs do not learn (see Bosch et al., 2025, for an exception). "Hybrid intelligence" is therefore more literally precise, focusing less on the interaction and relationships between humans and AIs, and more...

2025

[23] [23]

collaboration

More detailed descriptions of human-AI interactions: as established in (CS)CL research (e.g., Dillenbourg et al., 1996; Stahl, 2006), granular accounts of "collaboration" are essential for understanding learning phenomena in these new settings

1996

[24] [24]

Exploration of more complex collaboration structures: the structures surfaced in this review are relatively primitive (see IQ3.1). Emerging theoretical and empirical work is starting to identify more sophisticated micro- and macro-level HAIC/HI structures (e.g., Maya, 2024; Prieto et al., 2023), but empirical testing and comparison (against each other or ...

2024

[25] [25]

novelty effects

More longitudinal studies: moving beyond one-shot interventions is necessary to understand HAIC/HI as a potentially distinct form of learning, to rule out "novelty effects", and to examine both its benefits and unwanted side-effects (such as "metacognitive laziness" in Y. Fan et al., 2025)

2025

[26] [26]

Decision Support

More rigorous evaluation of learning gains and learner skills, given that assessing the effects of interventions on learning (not merely artifact quality or task efficiency), despite being the ostensible goal of the field, remains comparatively rare (see IQ1.9 above; and Yan, Greiff, et al., 2025 for a broader discussion in the c...

2025

[27] [27]

peer/equal

– such as assistant, coach, or teammate – the rarely-encountered "peer/equal" role, in which AI mimics a fellow student with a comparable knowledge level, merits deeper exploration. Adjacent literature reviews identify overlapping recurrent roles: learning partner (Deng & Yu, 2023), motivator, peer/co-learner (Han et al.,

2023

[28] [28]

– all of which can be repurposed in a way that is pedagogically sound. Sharples (2023) proposes more specific AI functions for learning conversations: possibility engine, Socratic opponent, collaboration coach… categories that partly overlap our empirically sourced ones. Other emerging sources describe pedagogical roles for GenAI in ...

2023

[29] [29]

where is the structure?

can be directly incorporated into prompts in LLM-based HAIC/HI systems. The paper's title question ("where is the structure?") also points to how structures are implemented. Our synthesis (IQ2.4, IQ3.1) reveals a majority of studies relying on socially-implemented structures, such as teacher instructions. This approach is flexible and low-cost but unrelia...

2013

[30] [30]

We therefore acknowledge that, by publication, some findings may already be outdated

made for a slower review and writeup. We therefore acknowledge that, by publication, some findings may already be outdated. That said, our unsystematic reading of more recent sources appears to confirm the general trends identified (definitional fuzziness, lack of interaction detail, failure to measure learning – see, e.g., Ong et al., 2026; F. Zhang et a...

2026

[31] [31]

may follow. Regarding future research, a priority should be convening groups of HAIC/HI (for learning) experts to engage in pattern synthesis, a practice well established in collaborative learning and learning design research (Baggetun et al., 2004; Goodyear,

2004

[32] [32]

energy crisis

with the goal of deriving formal design patterns in the Alexandrian sense. Given the volume of recent work and likely tacit knowledge accumulating among researchers not yet codified in publications, formats such as Mor and colleagues’ "participatory pattern workshops" could be especially productive for building the design knowledge needed to support more ...

work page doi:10.13039/501100011033 2012

[33] [33]

https://doi.org/10.48550/ARXIV.2503.16307 Akata, Z., Balliet, D., De Rijke, M., Dignum, F., Dignum, V., Eiben, G., Fokkens, A., Grossi, D., Hindriks, K., Hoos, H., Hung, H., Jonker, C., Monz, C., Neerincx, M., Oliehoek, F., Prakken, H., Schlobach, S., Van Der Gaag, L., Van Harmelen, F., … Welling, M. (2020). A Research Agenda for Hybrid Intelligence: Augm...

work page doi:10.48550/arxiv.2503.16307 2020

[34] [34]

F., & Angel, S

https://doi.org/10.1007/s44163-024-00203-7 Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., King, I. F., & Angel, S. (1977). A pattern language: Towns, buildings, construction. Oxford University Press. Alier, M., Pereira, J., García-Peñalvo, F. J., Casañ, M. J., & Cabré, J. (2025). LAMB: An open-source software framework to create artificial in...

work page doi:10.1007/s44163-024-00203-7 1977

[35] [35]

Collaborative Learning

https://doi.org/10.3390/su15042940 Dillenbourg, P. (1999). What do you mean by “Collaborative Learning”? In P. Dillenbourg (Ed.), Collaborative Learning. Cognitive and Computational Approaches (pp. 1–19). Elsevier Science. Dillenbourg, P. (2013). Design for classroom orchestration. Computers and Education, 69, 485–492. Dillenbour...

work page doi:10.3390/su15042940 1999

[36] [36]

Let’s Ask the Robot!

https://doi.org/10.1186/s40594-025-00537-3 Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bje...

work page doi:10.1186/s40594-025-00537-3 2025

[37] [37]

https://doi.org/10.1186/s41239-023-00426-1 Lee, G.-G., Mun, S., Shin, M.-K., & Zhai, X. (2025). Collaborative Learning with Artificial Intelligence Speakers: Pre-service Elementary Science Teachers’ Responses to the Prototype. Science & Education, 34(2), 847–875. https://doi.org/10.1007/s11191-024-00526-y Lin, C.-H., Zhou, K., Li...

work page doi:10.1186/s41239-023-00426-1 2025

[38] [38]

https://doi.org/10.5334/2008-13 Nguyen, A., Hong, Y., Dang, B., & Huang, X. (2024). Human-AI collaboration patterns in AI-assisted academic writing. Studies in Higher Education, 49(5), 847–864. https://doi.org/10.1080/03075079.2024.2323593 Nguyen, A., Ilesanmi, F., Dang, B., Vuorenmaa, E., & Järvelä, S. (2024). Hybrid Intelligenc...

work page doi:10.5334/2008-13 2008

[39] [39]

Human-AI Collaboration

https://doi.org/10.3390/fi16080268 Prieto, L. P., Asensio-Perez, J. I., Munoz-Cristobal, J. A., Dimitriadis, Y. A., Jorrin-Abellan, I. M., & Gomez-Sanchez, E. (2013). Enabling Teachers to Deploy CSCL Designs across Distributed Learning Environments. IEEE Transactions on Learning Technologies, 6(4), 324–336. https://doi.org/10.110...

work page doi:10.3390/fi16080268 2013

[40] [40]

https://doi.org/10.1613/jair.1.12360 Wiethof, C., & Bittner, E. A. C. (2021). Hybrid Intelligence – Combining the Human in the Loop with the Computer in the Loop: A Systematic Literature Review. International Conference on Information Systems (ICIS). Wollny, S., Schneider, J., Di Mitri, D., Weidlich, J., Rittberger, M., & Drachsler, H. (2021). Are We Ther...

work page doi:10.1613/jair.1.12360 2021

[41] [41]

https://doi.org/10.1186/s41239-019-0171-0 Zhai, C., Wibowo, S., & Li, L. D. (2024). The effects of over-reliance on AI dialogue systems on students’ cognitive abilities: A systematic review. Smart Learning Environments, 11(1),

work page doi:10.1186/s41239-019-0171-0 2024

[42] [42]

All content was reviewed and verified by the author

https://doi.org/10.1186/s40561-024-00316-7 Zhang, F., Gou, J., Shen, K. N., Camarinha-Matos, L. M., & Wang, Z. (2025). Effects of AI teammates on learning behavior in Human-AI collaboration environments: A perspective on self-regulated learning. Education and Information Technologies, 30(18), 26801–26825. https://doi.org/10.1007/s10639-025-13717-z ...

work page doi:10.1186/s40561-024-00316-7 2025