pith. machine review for the scientific record.

arxiv: 2604.17569 · v1 · submitted 2026-04-19 · 💻 cs.CL

Recognition: unknown

MAPLE: A Meta-learning Framework for Cross-Prompt Essay Scoring

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 05:53 UTC · model grok-4.3

classification 💻 cs.CL
keywords automated essay scoring · meta-learning · prototypical networks · cross-prompt evaluation · transferable representations · quadratic weighted kappa · ELLIPSE · LAILA

The pith

MAPLE meta-learning framework improves cross-prompt automated essay scoring

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MAPLE as a way to handle the problem of scoring essays written for prompts the model has not seen before. Automated essay scoring models typically struggle when the writing prompt changes because the style, topic, and grading standards differ. MAPLE addresses this by using meta-learning with prototypical networks to build representations that transfer across prompts. This leads to better performance on new prompts, as shown by strong results on two of the three tested datasets.

Core claim

MAPLE is a meta-learning framework that leverages prototypical networks to learn transferable representations across different writing prompts. On the ELLIPSE and LAILA datasets it achieves state-of-the-art performance, outperforming strong baselines by 8.5 and 3 QWK points, respectively. On the ASAP dataset, where prompts have heterogeneous score ranges, it yields improvements on several traits, demonstrating its utility in unified scoring settings.
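The claim is stated in QWK points, so it is worth being precise about the metric. Quadratic weighted kappa penalizes disagreements between two raters by the squared distance between their ratings; a standard implementation (not the paper's code) looks like this:

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic penalty weights: a disagreement of k
    score levels costs k^2 / (n_classes - 1)^2."""
    O = np.zeros((n_classes, n_classes))               # observed rating matrix
    for t, p in zip(y_true, y_pred):
        O[t, p] += 1
    i, j = np.indices((n_classes, n_classes))
    W = (i - j) ** 2 / (n_classes - 1) ** 2            # quadratic weights
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()  # chance agreement
    return 1 - (W * O).sum() / (W * E).sum()

# one off-by-one error out of five essays on a 0-3 scale
y_true = [0, 1, 2, 3, 3]
y_pred = [0, 1, 2, 3, 2]
print(round(quadratic_weighted_kappa(y_true, y_pred, 4), 3))  # → 0.918
```

Because the penalty is quadratic, a model that misses by one level retains most of its kappa, while large misses are punished heavily; an 8.5-point gain therefore reflects substantially closer agreement with human raters, not just more exact hits.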

What carries the argument

Prototypical networks within the MAPLE meta-learning framework, which learn prompt-agnostic essay representations for generalization to unseen prompts.
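The abstract does not spell out the prototype computation; as a rough sketch of how prototypical networks in the style of Snell et al. typically classify (assuming essays are already embedded by some encoder, which is our assumption here, not a detail from the paper):

```python
import numpy as np

def prototypes(support_emb, support_labels):
    """Mean support embedding per score class (the C_i of Figure 1)."""
    classes = np.unique(support_labels)
    return classes, np.stack([support_emb[support_labels == c].mean(axis=0)
                              for c in classes])

def predict(query_emb, classes, protos):
    """Assign each query essay the class of its nearest prototype
    (squared Euclidean distance, as in standard prototypical networks)."""
    d = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]

# toy episode: two score classes in a 4-d embedding space
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0, 0.1, (5, 4)), rng.normal(1, 0.1, (5, 4))])
labels = np.array([0] * 5 + [1] * 5)
classes, protos = prototypes(support, labels)
query = rng.normal(1, 0.1, (3, 4))        # drawn near class 1
print(predict(query, classes, protos))    # → [1 1 1]
```

The cross-prompt bet is that the encoder, meta-trained across many such episodes, places essays of the same quality near each other even when the prompt changes.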

Load-bearing premise

That the meta-learned representations from prototypical networks will generalize across prompts without major changes in writing style or scoring criteria.

What would settle it

A test on prompts with markedly different topics and rubrics where MAPLE fails to outperform conventional fine-tuned models in QWK score.

Figures

Figures reproduced from arXiv: 2604.17569 by May Bashendy, Salam Albatarni, Sohaila Eltanbouly, Tamer Elsayed.

Figure 1. MAPLE task generation. In meta-training, we explore two settings: binary and multiclass classification. Each sampled task includes a support set (for C_i computation) and a query set (for evaluation/learner update). In meta-testing, the task is multiclass, where the support set includes all training data and the query set corresponds to an unseen prompt. Finally, the model is updated based on its performa…
Figure 2. Overview of MAPLE showing (a) the meta-learner architecture and (b) the prediction process.
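Figure 1's episodic task generation can be sketched in a few lines: sample a prompt, then draw disjoint support and query sets from its essays. This is a minimal illustration of the episode structure, not the paper's sampling code:

```python
import random

def sample_episode(essays_by_prompt, n_support, n_query, seed=None):
    """Sample one meta-training task in the spirit of Figure 1: pick a
    prompt, then draw disjoint support and query sets from its essays."""
    rng = random.Random(seed)
    prompt = rng.choice(sorted(essays_by_prompt))
    pool = list(essays_by_prompt[prompt])
    rng.shuffle(pool)
    support = pool[:n_support]                   # used to compute prototypes C_i
    query = pool[n_support:n_support + n_query]  # used for the learner update
    return prompt, support, query

essays = {"p1": list(range(10)), "p2": list(range(10, 20))}
prompt, support, query = sample_episode(essays, n_support=4, n_query=3, seed=0)
assert not set(support) & set(query)             # disjoint by construction
```

At meta-test time the structure inverts, per the caption: the support set is all training data and the query set is the unseen prompt.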
read the original abstract

Automated Essay Scoring (AES) faces significant challenges in cross-prompt settings, where models must generalize to unseen writing prompts. To address this limitation, we propose MAPLE, a meta-learning framework that leverages prototypical networks to learn transferable representations across different writing prompts. Across three diverse datasets (ELLIPSE and ASAP (English), and LAILA (Arabic)), MAPLE achieves state-of-the-art performance on ELLIPSE and LAILA, outperforming strong baselines by 8.5 and 3 points in QWK, respectively. On ASAP, where prompts exhibit heterogeneous score ranges, MAPLE yields improvements on several traits, highlighting the strengths of our approach in unified scoring settings. Overall, our results demonstrate the potential of meta-learning for building robust cross-prompt AES systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes MAPLE, a meta-learning framework that uses prototypical networks to learn transferable representations for cross-prompt automated essay scoring. It reports state-of-the-art results on the ELLIPSE and LAILA datasets, outperforming baselines by 8.5 and 3 QWK points respectively, with additional improvements noted on the ASAP dataset for certain traits.

Significance. If the central claims hold, MAPLE could represent a meaningful advance in handling prompt variability in AES by leveraging meta-learning for prompt-agnostic representations. This has potential implications for building more robust scoring systems across diverse writing tasks and languages. However, the lack of detailed experimental validation in the abstract and the noted mismatch between prototypical networks and regression tasks raise questions about the generalizability of the approach.

major comments (2)
  1. Abstract: The abstract reports empirical gains of 8.5 and 3 QWK points on ELLIPSE and LAILA but provides no details on experimental setup, baselines, statistical significance, error bars, or data splits. This absence makes the central performance claims impossible to verify from the given text.
  2. Abstract / Methods: Prototypical networks are designed for classification tasks using class centroids and distance-based prediction. AES is a regression/ordinal task with prompt-specific score ranges and distributions (as noted for ASAP). The paper does not detail the adaptation (e.g., how distance to prototypes is mapped to numeric scores) or demonstrate that this preserves prompt-invariance under score-range shifts, which is load-bearing for the cross-prompt generalization claim.
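The second objection can be made concrete. One common way to turn prototype distances into a numeric score, and a plausible but unconfirmed guess at the adaptation the paper would need, is a softmax-weighted expectation over score levels:

```python
import numpy as np

def expected_score(query_emb, protos, score_levels, temperature=1.0):
    """Soft assignment: softmax over negative squared distances to the
    per-level prototypes, then the expectation over score levels.
    Yields a continuous score, unlike hard nearest-prototype prediction."""
    d = ((query_emb[None, :] - protos) ** 2).sum(axis=-1)
    logits = -d / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float((p * np.asarray(score_levels)).sum())

# prototypes for score levels 1..4 placed along one embedding axis
protos = np.array([[0.0], [1.0], [2.0], [3.0]])
score = expected_score(np.array([1.4]), protos, [1, 2, 3, 4])
# a continuous value between the 2nd and 3rd score levels
```

Note that even this mapping inherits the prototypes' dependence on the support set's score range, so it does not by itself resolve the ASAP score-range-shift concern; that is exactly what the referee asks the authors to demonstrate.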

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments on our manuscript. We address each major comment below and have revised the manuscript to incorporate the suggested improvements.

read point-by-point responses
  1. Referee: Abstract: The abstract reports empirical gains of 8.5 and 3 QWK points on ELLIPSE and LAILA but provides no details on experimental setup, baselines, statistical significance, error bars, or data splits. This absence makes the central performance claims impossible to verify from the given text.

    Authors: We agree that the abstract is necessarily concise and omits granular experimental details. The full experimental setup, including data splits, baselines, evaluation protocol, and reporting of standard deviations across multiple runs, is provided in Sections 3 and 4. To improve verifiability from the abstract alone, we will revise it to briefly note the cross-prompt evaluation setting, the datasets used, and that results include standard deviations from repeated runs with statistical significance tests reported in the main text. revision: yes

  2. Referee: Abstract / Methods: Prototypical networks are designed for classification tasks using class centroids and distance-based prediction. AES is a regression/ordinal task with prompt-specific score ranges and distributions (as noted for ASAP). The paper does not detail the adaptation (e.g., how distance to prototypes is mapped to numeric scores) or demonstrate that this preserves prompt-invariance under score-range shifts, which is load-bearing for the cross-prompt generalization claim.

    Authors: This is a fair observation on the adaptation required for a regression task. While the Methods section outlines the meta-learning framework and use of prototypes derived from support-set representations, we acknowledge that the precise mapping from prototype distances/similarities to numeric scores and the handling of heterogeneous score ranges could be clarified further. In the revised version, we will expand the Methods section with an explicit description of the regression adaptation (including any learned projection or interpolation step) and add targeted analysis or supplementary experiments on the ASAP dataset to demonstrate that the learned representations remain effective under score-range shifts. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical meta-learning results only.

full rationale

The paper introduces MAPLE as a meta-learning framework adapting prototypical networks for cross-prompt AES, with performance evaluated empirically on ELLIPSE, ASAP, and LAILA datasets. No derivation chain, equations, or first-principles predictions exist that could reduce to inputs by construction. Claims rest on standard empirical comparisons to baselines (QWK improvements reported), without fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations. The central premise (transferable representations via meta-training) is tested via held-out prompt experiments and does not collapse to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are identifiable from the abstract alone. The approach implicitly assumes standard meta-learning transferability without additional postulates.

pith-pipeline@v0.9.0 · 5435 in / 1047 out tokens · 48464 ms · 2026-05-10T05:53:35.757144+00:00 · methodology

discussion (0)

