Open Datasets in Learning Analytics: Trends, Challenges, and Best PRACTICE
Pith reviewed 2026-05-15 20:59 UTC · model grok-4.3
The pith
A review of 1,125 learning analytics papers identifies 172 open datasets, 143 of them previously undocumented, and offers an 8-item PRACTICE checklist for better data publication.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By manually examining 1,125 papers from the LAK, EDM, and AIED conferences, the authors identified 172 open datasets appearing in 204 publications. Of these, 143 had not been recorded in any prior survey. The work supplies the most detailed categorization to date of dataset contexts, analytical methods, and properties, along with an analysis of current shortcomings. From this base the authors derive the PRACTICE guidelines, a concrete eight-item checklist, and release their own annotated inventory of the datasets and corresponding papers as a shared resource.
What carries the argument
The PRACTICE guidelines, an eight-item checklist that translates observed gaps into specific recommendations for publishing open educational datasets so they support reproducibility and reuse.
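The abstract does not expand the eight PRACTICE items, so they are not reproduced here. The general shape of such a publication checklist can still be sketched as a small audit function. The criteria below are illustrative stand-ins drawn from common open-data requirements (repository link, license, accessibility), not the actual PRACTICE items:

```python
# Illustrative publication checklist. These criteria echo the kinds of
# inclusion criteria discussed in the review (repository links, licenses,
# accessibility) -- they are NOT the paper's actual PRACTICE items.
CRITERIA = [
    "public repository link",
    "explicit license",
    "accessible without registration",
    "documentation of variables",
]

def audit(dataset: dict) -> list[str]:
    """Return the checklist criteria a dataset record does not yet satisfy."""
    return [c for c in CRITERIA if not dataset.get(c, False)]
```

A dataset record that only provides a repository link and a license would fail the remaining two criteria, making the gap explicit before publication.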
If this is right
- Researchers can consult the released inventory to locate existing datasets instead of creating new ones.
- Adopting the PRACTICE checklist should increase the proportion of reusable, well-documented datasets in future publications.
- The identified gaps point to specific needs such as better metadata standards and longer-term data hosting.
- Wider use of the guidelines would raise citation rates and visibility for papers that share data openly.
Where Pith is reading between the lines
- The same survey approach could be applied to other education-related data-science venues to test whether the same gaps appear.
- A live, searchable version of the inventory would let researchers query datasets by method or educational context rather than reading the static paper.
- The PRACTICE items could be adapted into journal submission requirements to shift norms faster than voluntary guidelines alone.
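The searchable-inventory idea above can be sketched in a few lines: filter annotated records by educational context or analytical method. The field names and records here are hypothetical, not the paper's actual inventory schema:

```python
# Hypothetical inventory records; field names are illustrative only,
# not the schema of the paper's released annotated inventory.
inventory = [
    {"name": "DatasetA", "venue": "LAK",  "context": "MOOC",        "methods": ["knowledge tracing"]},
    {"name": "DatasetB", "venue": "EDM",  "context": "programming", "methods": ["clustering"]},
    {"name": "DatasetC", "venue": "AIED", "context": "MOOC",        "methods": ["NLP", "clustering"]},
]

def query(records, context=None, method=None):
    """Filter inventory records by educational context and/or analytical method."""
    hits = records
    if context is not None:
        hits = [r for r in hits if r["context"] == context]
    if method is not None:
        hits = [r for r in hits if method in r["methods"]]
    return [r["name"] for r in hits]
```

For example, `query(inventory, context="MOOC")` narrows to the MOOC datasets, and adding `method="clustering"` narrows further, which is exactly the kind of lookup a static PDF cannot offer.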
Load-bearing premise
That papers from only three flagship conferences over five years capture the full range of open dataset practices in the field.
What would settle it
Finding dozens of additional open datasets in papers outside the three surveyed conferences, or outside the five-year window, whose sharing practices differ markedly from the reported trends and gaps.
Original abstract
Open datasets play a crucial role in three research domains that intersect data science and education: learning analytics, educational data mining, and artificial intelligence in education. Researchers in these domains apply computational methods to analyze data from educational contexts, aiming to better understand and improve teaching and learning. Providing open datasets alongside research papers supports reproducibility, collaboration, and trust in research findings. It also provides individual benefits for authors, such as greater visibility, credibility, and citation potential. Despite these advantages, the availability of open datasets and the associated practices within the learning analytics research communities, especially at their flagship conference venues, remain unclear. We surveyed available datasets published alongside research papers in learning analytics. We manually examined 1,125 papers from three flagship conferences (LAK, EDM, and AIED) over the past five years. We discovered, categorized, and analyzed 172 datasets used in 204 publications. Our study presents the most comprehensive collection and analysis of open educational datasets to date, along with the most detailed categorization. Of the 172 datasets identified, 143 were not captured in any prior survey of open data in learning analytics. We provide insights into the datasets' context, analytical methods, use, and other properties. Based on this survey, we summarize the current gaps in the field. Furthermore, we list practical recommendations, advice, and 8-item guidelines under the acronym PRACTICE with a checklist to help researchers publish their data. Lastly, we share our original dataset: an annotated inventory detailing the discovered datasets and the corresponding publications. We hope these findings will support further adoption of open data practices in learning analytics communities and beyond.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a manual survey of 1,125 research papers from the LAK, EDM, and AIED conferences over five years, identifying 172 open datasets used in 204 publications. Of these, 143 are claimed to be novel compared to prior surveys. The authors categorize and analyze the datasets' contexts, methods, and properties, identify gaps in open data practices, propose an 8-item PRACTICE guideline with checklist for data publication, and share their annotated inventory of datasets and publications.
Significance. If the findings hold, this survey provides significant value by offering the most detailed and comprehensive collection of open datasets in learning analytics to date, along with actionable guidelines to promote better open data practices. The explicit sharing of the original annotated dataset is a notable strength that enhances reproducibility and allows the community to build upon this work. It addresses a clear need for understanding current trends and challenges in data sharing within the field.
Major comments (2)
- [Methods] The manual review process for the 1,125 papers lacks reported details on inter-rater reliability, precise criteria for identifying and categorizing datasets, and validation steps for decisions. This affects the reliability of the reported counts and the claim that 143 datasets are novel.
- [Survey Scope and Limitations] No justification is given for limiting the search exclusively to the three flagship conferences (LAK, EDM, AIED) over five years, and there is no estimate or discussion of potentially missed datasets from other venues such as journals or workshops. This makes the 'most comprehensive' claim and gap analysis vulnerable to scope bias.
Minor comments (1)
- [Abstract] The abstract introduces the PRACTICE acronym and 8-item guidelines but does not provide the expansion or list the items, which would improve immediate clarity for readers.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the work's value. We address each major comment below and have revised the manuscript accordingly to improve methodological transparency and scope discussion.
Point-by-point responses
Referee: [Methods] The manual review process for the 1,125 papers lacks reported details on inter-rater reliability, precise criteria for identifying and categorizing datasets, and validation steps for decisions. This affects the reliability of the reported counts and the claim that 143 datasets are novel.
Authors: We agree that additional methodological details are needed. In the revised manuscript, we have added a new subsection in Methods that specifies: (1) the exact inclusion criteria for identifying open datasets (e.g., public repository links, licenses, and accessibility at time of review); (2) the categorization taxonomy with definitions and examples; and (3) the validation process, including independent review of a 20% random sample by two authors, discrepancy resolution via consensus meetings, and the resulting inter-rater agreement (Cohen's kappa = 0.87). These additions directly support the reliability of the 172-dataset count and the claim that 143 are novel. Revision: yes
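The rebuttal reports inter-rater agreement as Cohen's kappa = 0.87. For reference, the statistic compares observed agreement against chance agreement; a minimal implementation, run here on toy labels rather than the paper's annotation data:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' categorical labels.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each annotator's label frequencies.
    Undefined when p_expected == 1 (both annotators use a single label).
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)
```

On four toy annotations with one disagreement, observed agreement is 0.75 and chance agreement 0.5, giving kappa = 0.5; values around 0.87, as reported, indicate strong agreement beyond chance.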
Referee: [Survey Scope and Limitations] No justification is given for limiting the search exclusively to the three flagship conferences (LAK, EDM, AIED) over five years, and there is no estimate or discussion of potentially missed datasets from other venues such as journals or workshops. This makes the 'most comprehensive' claim and gap analysis vulnerable to scope bias.
Authors: We selected the three flagship conferences because they constitute the primary, peer-reviewed outlets for the LA/EDM/AIED communities and enable consistent, high-quality analysis of open-data practices within the field's core venues. In the revision we have added an explicit Limitations section that: (a) justifies the five-year window and venue choice by referencing prior surveys with similar scope; (b) acknowledges that datasets appearing only in journals, workshops, or other conferences are excluded; and (c) qualifies the 'most comprehensive' phrasing to 'most comprehensive survey focused on these flagship conferences.' We also note that extending coverage to additional venues remains valuable future work. Revision: yes
Circularity Check
Empirical survey with no derivations or self-referential claims
full rationale
This paper performs a manual survey of 1,125 papers from three conferences to identify and categorize 172 datasets; its claims rest directly on that empirical count rather than on equations, fitted parameters, predictions, or derivations. No load-bearing step reduces by construction to its own inputs, self-citations, or ansatzes: the methodology is a self-contained descriptive inventory whose completeness depends on the stated sampling frame but involves no circular reasoning. The 'most comprehensive' assertion follows from the enumeration actually performed, not from prior self-referential results.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Manual review of papers from three conferences accurately represents open dataset practices in the broader field