RPO-PDT: Demonstrating Role-Play-Based Knowledge Adaptation for Student Support Dialogue (Demonstration System)

Ewa Olton; Filip Janik; Harris Spratt; Md Zia Ullah; Robert Smales; Shea Tait; Yanchao Yu

arxiv: 2606.09255 · v1 · pith:IJ7YDQQPnew · submitted 2026-06-08 · 💻 cs.RO

Pith reviewed 2026-06-27 16:43 UTC · model grok-4.3

classification 💻 cs.RO

keywords role-play dialogue · student support · knowledge adaptation · personal development tutor · safety policies · embodied interaction · retrieval-grounded system

The pith

RPO-PDT demonstrates a dialogue system that adapts tutor strategies by replaying unresolved student interactions from the student perspective to build reusable memory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents RPO-PDT as a retrieval-grounded system that delivers institution-specific Personal Development Tutor guidance while remaining constrained by explicit persona, boundary, confidentiality, and safety policies. It centers on a reverse-roleplay loop that turns unresolved exchanges into training material by re-enacting them from the student's viewpoint, then stores the resulting alternative strategies for later use. A reader would care because the approach offers one concrete way to keep AI support both grounded in institutional knowledge and adaptive without relaxing safety rules. The system is shown working in both text and embodied Furhat formats.

Core claim

RPO-PDT is a retrieval-grounded, role-play-based dialogue system for adaptive student support in higher education that provides institution-specific Personal Development Tutor guidance using structured knowledge sources, remains constrained by explicit persona, boundary, confidentiality, and safety policies, and is built around a reverse-roleplay loop in which unresolved interactions are replayed from the student perspective so that alternative tutor strategies can be generated and stored as reusable strategy memory.

What carries the argument

The reverse-roleplay loop, which replays unresolved tutor-student interactions from the student perspective to generate and store alternative tutor strategies as reusable memory.
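The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no code, so `Interaction`, `StrategyMemory`, `reverse_roleplay`, `toy_propose`, and `toy_allowed` are all hypothetical names. `propose` stands in for whatever component re-enacts the exchange from the student side, and `allowed` stands in for the persona, boundary, confidentiality, and safety policies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Interaction:
    """One logged tutor-student exchange (hypothetical schema)."""
    topic: str
    student_turns: List[str]
    tutor_strategy: str
    resolved: bool


@dataclass
class StrategyMemory:
    """Maps a topic to alternative tutor strategies produced by replays."""
    entries: Dict[str, List[str]] = field(default_factory=dict)

    def store(self, topic: str, strategy: str) -> None:
        bucket = self.entries.setdefault(topic, [])
        if strategy not in bucket:  # skip exact duplicates
            bucket.append(strategy)

    def lookup(self, topic: str) -> List[str]:
        return list(self.entries.get(topic, []))


def reverse_roleplay(
    log: List[Interaction],
    propose: Callable[[Interaction], str],
    allowed: Callable[[str], bool],
    memory: StrategyMemory,
) -> int:
    """Replay unresolved interactions; store new, policy-compliant strategies.

    Returns the number of strategies newly added to memory.
    """
    stored = 0
    for interaction in log:
        if interaction.resolved:
            continue  # only unresolved exchanges are replayed
        candidate = propose(interaction)
        if candidate != interaction.tutor_strategy and allowed(candidate):
            before = len(memory.lookup(interaction.topic))
            memory.store(interaction.topic, candidate)
            stored += len(memory.lookup(interaction.topic)) - before
    return stored


# Toy stand-ins (assumptions): a real system would call a model and a policy layer.
def toy_propose(interaction: Interaction) -> str:
    return f"signpost:{interaction.topic}"


def toy_allowed(strategy: str) -> bool:
    return "diagnose" not in strategy


log = [
    Interaction("extension request", ["I missed the deadline"], "quote policy", False),
    Interaction("timetable", ["Where is my lab?"], "give room", True),
    Interaction("extension request", ["Still stuck"], "quote policy", False),
]
memory = StrategyMemory()
count = reverse_roleplay(log, toy_propose, toy_allowed, memory)
```

In this sketch the policy check runs before anything is stored, so the memory can only ever contain strategies that passed it. Whether RPO-PDT orders the steps this way is not stated in the abstract.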

If this is right

The system generates reusable strategy memory from role-play replays without external retraining.
Explicit persona, boundary, confidentiality, and safety policies remain active constraints during both text and embodied interactions.
Structured knowledge sources allow institution-specific guidance while the role-play mechanism supplies adaptation.
The same architecture supports both text-based and Furhat-based embodied student interactions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The stored strategy memory could be inspected or edited by human staff before reuse.
The approach might extend to other constrained professional dialogues if the replay step generalizes beyond student support.
Long-term use would require tracking whether the growing memory begins to duplicate or conflict with the original policy constraints.
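The tracking concern in the last point could be approached with a periodic audit of the stored strategies. The sketch below is an assumption-laden illustration: `audit_memory` and `toy_allowed` are hypothetical names, near-duplicates are detected only by case and whitespace normalization, and the policy check is a toy phrase filter rather than anything the paper describes.

```python
from typing import Callable, Dict, List, Tuple


def audit_memory(
    entries: Dict[str, List[str]],
    allowed: Callable[[str], bool],
) -> Dict[str, List[Tuple[str, str]]]:
    """Flag stored strategies that repeat an earlier one or fail the policy check."""
    duplicates: List[Tuple[str, str]] = []
    violations: List[Tuple[str, str]] = []
    for topic, strategies in entries.items():
        seen = set()
        for strategy in strategies:
            # Normalize case and whitespace so trivial variants count as repeats.
            key = " ".join(strategy.lower().split())
            if key in seen:
                duplicates.append((topic, strategy))
            seen.add(key)
            if not allowed(strategy):
                violations.append((topic, strategy))
    return {"duplicates": duplicates, "violations": violations}


# Toy policy (assumption): reject strategies containing banned phrases.
def toy_allowed(strategy: str) -> bool:
    banned = ("diagnose", "share record")
    return not any(phrase in strategy.lower() for phrase in banned)


entries = {
    "wellbeing": [
        "Signpost to counselling",
        "signpost  to counselling",
        "Diagnose anxiety",
    ],
    "extension": ["Explain the form"],
}
audit = audit_memory(entries, toy_allowed)
```

A report like this could also feed the human inspection step suggested in the first point, since it gives staff a short list of entries to review rather than the whole memory.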

Load-bearing premise

Replaying interactions from the student perspective will reliably produce effective and reusable alternative tutor strategies that improve the system's performance over time.

What would settle it

Run a sequence of unresolved student-tutor exchanges through the loop, then measure whether the newly generated strategies produce higher resolution rates on matched follow-up cases compared with the original strategies.
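As a rough sketch of how that comparison might be scored, assuming each matched follow-up case is handled once with the original strategy and once with the replay-generated one, and that each outcome is recorded as resolved or not. The function names and the outcome data below are made up for illustration; the paper reports no such experiment.

```python
from typing import Dict, List, Tuple


def resolution_rate(outcomes: List[bool]) -> float:
    """Fraction of cases marked as resolved."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(outcomes) / len(outcomes)


def compare_matched(pairs: List[Tuple[bool, bool]]) -> Dict[str, float]:
    """Compare paired outcomes.

    Each pair is (resolved_with_original, resolved_with_replayed)
    for one matched follow-up case.
    """
    original = [o for o, _ in pairs]
    replayed = [r for _, r in pairs]
    return {
        "original_rate": resolution_rate(original),
        "replayed_rate": resolution_rate(replayed),
        # Discordant pairs: cases where the two strategies disagree.
        "improved": sum(1 for o, r in pairs if r and not o),
        "worsened": sum(1 for o, r in pairs if o and not r),
    }


# Illustrative, fabricated-for-demo outcomes; not results from the paper.
pairs = [(False, True), (False, False), (True, True), (False, True)]
report = compare_matched(pairs)
```

The `improved` and `worsened` counts are the discordant pairs, which are the inputs a paired test such as McNemar's would use to judge whether a difference in rates is more than noise.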

Figures

Figures reproduced from arXiv: 2606.09255 by Ewa Olton, Filip Janik, Harris Spratt, Md Zia Ullah, Robert Smales, Shea Tait, Yanchao Yu.

**Figure 1.** RPO-PDT system architecture. The system integrates text- and Furhat-based interaction, Rasa dialogue orchestration, ...

**Figure 2.** RPO-PDT dialogue demonstration. The interface shows two controlled role configurations: in the PDT role, the ...

Original abstract

We present RPO-PDT: a retrieval-grounded, role-play-based dialogue system for adaptive student support in higher education. RPO-PDT is: (1) able to provide institution-specific Personal Development Tutor (PDT) guidance using structured knowledge sources; (2) constrained by explicit persona, boundary, confidentiality, and safety policies; and (3) designed around a reverse-roleplay loop where unresolved interactions are replayed from the student perspective, enabling alternative tutor strategies to be generated and stored as reusable strategy memory. RPO-PDT supports both text-based and Furhat-based embodied interaction for demonstrating grounded, safe, and adaptive student-support dialogue.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

RPO-PDT describes a demo system for safe student-support dialogue with a reverse-roleplay loop, but offers no evaluation or examples to show the loop actually improves anything.

The paper is a system description of RPO-PDT, a retrieval-grounded dialogue setup for higher-ed personal development tutoring. It adds explicit persona, boundary, confidentiality, and safety policies on top of structured knowledge sources, and it includes a reverse-roleplay loop that replays stuck interactions from the student side to generate and store alternative tutor strategies.

The design choices around safety constraints and institution-specific retrieval are sensible and clearly motivated for this use case. Supporting both text and Furhat embodied interaction also makes sense for a demonstration. Those parts read as practical engineering for an educational chatbot.

The soft spot is the adaptive claim. The paper presents the loop as the mechanism that produces reusable strategy memory and better performance over time, but it supplies no dialogues, no qualitative traces, no ablation, and no user feedback to show the generated strategies are better than a plain retrieval baseline or that reuse actually helps. The stress-test note is accurate on this point: the benefit is asserted rather than demonstrated.

This is the sort of work that might interest people building constrained dialogue systems for education or who want to see a Furhat demo. A reader looking for new methods, measured gains, or reproducible findings will not find them here. The citation pattern is light because there are no new results to cite.

I would not bring it to a reading group. I would not cite it. For peer review in a research track I would recommend against sending it out; the lack of any grounding for the central adaptive mechanism makes it better suited to a demo or workshop slot if the venue has one.

Referee Report

2 major / 0 minor

Summary. The manuscript presents RPO-PDT, a retrieval-grounded, role-play-based dialogue system for adaptive student support in higher education. It claims to (1) deliver institution-specific Personal Development Tutor (PDT) guidance from structured knowledge sources, (2) enforce explicit persona, boundary, confidentiality, and safety policies, and (3) incorporate a reverse-roleplay loop in which unresolved interactions are replayed from the student perspective to generate alternative tutor strategies that are stored as reusable strategy memory. The system is shown in both text-based and Furhat embodied interaction modes.

Significance. If the reverse-roleplay loop reliably produces effective, reusable strategies, the work could advance safe, policy-constrained dialogue systems for educational support and contribute to self-adaptive tutoring architectures. The explicit emphasis on grounding and safety policies is a constructive element for responsible deployment in higher education. However, the absence of any evaluation, examples, or ablation data means the practical significance remains that of an architectural description rather than a validated adaptive mechanism.

major comments (2)

[Abstract] The central claim that the reverse-roleplay loop 'enables alternative tutor strategies to be generated and stored as reusable strategy memory' is presented as a core design feature, yet the manuscript supplies no qualitative examples of generated strategies, no comparison against non-roleplay baselines, and no indication of whether memory reuse produces measurable improvements in dialogue quality or support outcomes.
[Abstract, system description] The assertion that the system achieves 'adaptive' knowledge adaptation rests on the untested assumption that replaying interactions from the student perspective will reliably yield effective and reusable tutor strategies; this assumption is load-bearing for the title's claim of demonstrating role-play-based knowledge adaptation but receives no supporting demonstration or validation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. This is a demonstration paper describing the RPO-PDT system architecture rather than an empirical evaluation study. We address each major comment below.

Referee: [Abstract] The central claim that the reverse-roleplay loop 'enables alternative tutor strategies to be generated and stored as reusable strategy memory' is presented as a core design feature, yet the manuscript supplies no qualitative examples of generated strategies, no comparison against non-roleplay baselines, and no indication of whether memory reuse produces measurable improvements in dialogue quality or support outcomes.

Authors: We agree the manuscript provides no examples or comparisons, consistent with its scope as a system demonstration. The reverse-roleplay loop is presented as an architectural mechanism for generating and storing strategies. In a revision we will add qualitative examples of the replay process and resulting strategy memory entries to illustrate the design. Revision: yes.

Referee: [Abstract, system description] The assertion that the system achieves 'adaptive' knowledge adaptation rests on the untested assumption that replaying interactions from the student perspective will reliably yield effective and reusable tutor strategies; this assumption is load-bearing for the title's claim of demonstrating role-play-based knowledge adaptation but receives no supporting demonstration or validation.

Authors: The manuscript describes the reverse-roleplay loop as the mechanism intended to support adaptation but does not claim or demonstrate empirical effectiveness. We will revise the abstract to more precisely frame the contribution as a design for role-play-based adaptation in a demonstration system, removing any implication of validated outcomes. Revision: yes.

Circularity Check

0 steps flagged

No circularity: system description contains no derivations, fits, or self-referential claims

full rationale

The paper is a demonstration-system description of an architecture (retrieval-grounded role-play dialogue with explicit policies and a reverse-roleplay loop). No equations, parameters, predictions, or uniqueness theorems appear. The reverse-roleplay loop is presented as a design feature whose effectiveness is an untested assumption, but this is a validation gap rather than circularity. No step reduces to its own inputs by construction, self-citation, or renaming. The derivation chain is empty; the work is self-contained as an engineering demonstration.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no technical derivations, parameters, or new entities to audit.


[1] [1]

Optimising strategies for learning visually grounded word meanings through interaction , Year =

Yanchao Yu , School =. Optimising strategies for learning visually grounded word meanings through interaction , Year =

[2] [2]

arXiv preprint arXiv:2511.11881 , year=

Better LLM Reasoning via Dual-Play , author=. arXiv preprint arXiv:2511.11881 , year=

arXiv

[3] [3]

Alexa Prize Proceedings , title =

Papaioannou, Ioannis and Curry, Amanda Cercas and Part, Jose L and Shalyminov, Igor and Xu, Xinnuo and Yu, Yanchao and Du. Alexa Prize Proceedings , title =. 2017 , url =

2017

[4] [4]

18th Workshop on the Semantics and Pragmatics of Dialogue (SemDial/DialWatt) , month =

Rieser, Verena and Janarthanam, Srinivasan and Taylor, Andy and Yu, Yanchao and Lemon, Oliver , title =. 18th Workshop on the Semantics and Pragmatics of Dialogue (SemDial/DialWatt) , month =. 2014 , address =

2014

[5] [5]

and Yu, Yanchao and Siei

Gunson, Nancie and Garcia, Daniel Hernandez and Part, Jose L. and Yu, Yanchao and Siei. Combining Visual and Social Dialogue for Human-Robot Interaction , booktitle =. 2021 , address =

2021

[6] [6]

arXiv preprint arXiv:2310.10683 , year=

Large Language Model Unlearning , author=. arXiv preprint arXiv:2310.10683 , year=

arXiv

[7] [7]

arXiv preprint arXiv:2401.06121 , year=

TOFU: A Task of Fictitious Unlearning for LLMs , author=. arXiv preprint arXiv:2401.06121 , year=

Pith/arXiv arXiv

[8] [8]

arXiv preprint arXiv:2310.07579 , year=

In-context unlearning: Language models as few shot unlearners , author=. arXiv preprint arXiv:2310.07579 , year=

arXiv

[9] [9]

Proceedings of the 58th annual meeting of the association for computational linguistics , pages=

Unsupervised cross-lingual representation learning at scale , author=. Proceedings of the 58th annual meeting of the association for computational linguistics , pages=

[10] [10]

2021 IEEE Symposium on Security and Privacy (SP) , pages=

Machine unlearning , author=. 2021 IEEE Symposium on Security and Privacy (SP) , pages=. 2021 , organization=

2021

[11] [11]

Deep Learning for Micro-Expression Recognition: A Survey , year=

Li, Yante and Wei, Jinsheng and Liu, Yang and Kauttonen, Janne and Zhao, Guoying , journal=. Deep Learning for Micro-Expression Recognition: A Survey , year=

[12] [12]

IEEE Transactions on Affective Computing , year=

An overview of facial micro-expression analysis: Data, methodology and challenge , author=. IEEE Transactions on Affective Computing , year=

[13] [13]

Proceedings of the 22nd Annual ACM Interaction Design and Children Conference , pages=

Designing Parent-child-robot Interactions to Facilitate In-Home Parental Math Talk with Young Children , author=. Proceedings of the 22nd Annual ACM Interaction Design and Children Conference , pages=

[14] [14]

Electronic Markets , volume=

Microexpressions in digital humans: perceived affect, sincerity, and trustworthiness , author=. Electronic Markets , volume=. 2022 , publisher=

2022

[15] [15]

Neural Processing Letters , pages=

A Survey of Micro-expression Recognition Methods Based on LBP, Optical Flow and Deep Learning , author=. Neural Processing Letters , pages=. 2023 , publisher=

2023

[16] [16]

Frontiers in psychology , volume=

Automatic micro-expression analysis: open challenges , author=. Frontiers in psychology , volume=. 2019 , publisher=

2019

[17] [17]

Machine Learning and Knowledge Extraction , volume=

Review of automatic microexpression recognition in the past decade , author=. Machine Learning and Knowledge Extraction , volume=. 2021 , publisher=

2021

[18] [18]

arXiv preprint arXiv:2307.09288 , year=

Llama 2: Open foundation and fine-tuned chat models , author=. arXiv preprint arXiv:2307.09288 , year=

Pith/arXiv arXiv

[19] [19]

Transactions of the Association for Computational Linguistics , volume=

State of what art? a call for multi-prompt llm evaluation , author=. Transactions of the Association for Computational Linguistics , volume=. 2024 , publisher=

2024

[20] [20]

arXiv preprint arXiv:2404.09971 , year=

Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs , author=. arXiv preprint arXiv:2404.09971 , year=

arXiv

[21] [21]

arXiv preprint arXiv:2401.10019 , year=

R-judge: Benchmarking safety risk awareness for llm agents , author=. arXiv preprint arXiv:2401.10019 , year=

arXiv

[22] [22]

arXiv preprint arXiv:2407.21783 , year=

The llama 3 herd of models , author=. arXiv preprint arXiv:2407.21783 , year=

Pith/arXiv arXiv

[23] [23]

arXiv preprint arXiv:2310.10501 , year=

Nemo guardrails: A toolkit for controllable and safe llm applications with programmable rails , author=. arXiv preprint arXiv:2310.10501 , year=

arXiv

[24] [24]

2024 , url=

Malte Ostendorff and Pedro Ortiz Suarez and Lucas Fonseca Lage and Georg Rehm , booktitle=. 2024 , url=

2024

[25] [25]

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts , pages=

Unsupervised cross-lingual representation learning , author=. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts , pages=

[26] [26]

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model , author=

[27] [27]

In: Proceedings of the 2021 Conference of the North American Chap- ter of the Association for Computational Linguistics: Human Language Tech- nologies, pp

Xue, Linting and Constant, Noah and Roberts, Adam and Kale, Mihir and Al-Rfou, Rami and Siddhant, Aditya and Barua, Aditya and Raffel, Colin. m T 5: A Massively Multilingual Pre-trained Text-to-Text Transformer. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2...

work page doi:10.18653/v1/2021.naacl-main.41 2021

[28] [28]

ACM Computing Surveys , year=

Continual learning of large language models: A comprehensive survey , author=. ACM Computing Surveys , year=

[29] [29]

arXiv preprint arXiv:2310.06762 , year=

Trace: A comprehensive benchmark for continual learning in large language models , author=. arXiv preprint arXiv:2310.06762 , year=

arXiv

[30] [30]

Proceedings of the AAAI Conference on Artificial Intelligence , series=

Memorybank: Enhancing large language models with long-term memory , author=. Proceedings of the AAAI Conference on Artificial Intelligence , series=

[31] [31]

, author=

Lora: Low-rank adaptation of large language models. , author=. ICLR , volume=

[32] [32]

Advances in Neural Information Processing Systems , volume=

Reflexion: Language agents with verbal reinforcement learning , author=. Advances in Neural Information Processing Systems , volume=

[33] [33]

Advances in Neural Information Processing Systems , volume=

Self-refine: Iterative refinement with self-feedback , author=. Advances in Neural Information Processing Systems , volume=

[34] Role play with large language models. Nature, 2023.

[35] TopKG: Target-oriented dialog via global planning on knowledge graph. Proceedings of the 29th International Conference on Computational Linguistics.

[36] Simulating life: The application of generative agents in virtual environments. 2024 IEEE AITU: Digital Generation, 2024.

[37] RoleLLM: Benchmarking, eliciting, and enhancing role-playing abilities of large language models. Findings of the Association for Computational Linguistics: ACL 2024.

[38] Controllable dialogue simulation with in-context learning. arXiv preprint arXiv:2210.04185.

[39] Generating personas using LLMs and assessing their viability. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems.

[40] BanglaBERT: Language model pretraining and benchmarks for low-resource language understanding evaluation in Bangla. Findings of the Association for Computational Linguistics: NAACL 2022.

[41] AmericasNLI: Evaluating zero-shot natural language understanding of pretrained multilingual models in truly low-resource languages. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

[42] Cross-Lingual Transfer Learning for Low-Resource NLP Tasks: Leveraging Multilingual Pretrained Models.

[43] UniBridge: A unified approach to cross-lingual transfer learning for low-resource languages. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

[44] Large language models are few-shot health learners. arXiv preprint arXiv:2305.15525.

[45] Can LLMs Learn New Concepts Incrementally without Forgetting? arXiv preprint arXiv:2402.08526.

[46] Dongyeop Kang, Anusha Balakrishnan, Pararth Shah, Paul Crook, Y-Lan Boureau, and Jason Weston. Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Pr..., 2019. doi:10.18653/v1/d19-1203

[47] Generative agents: Interactive simulacra of human behavior. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology.

[48] LLM Alignment via Reinforcement Learning from Multi-role Debates as Feedback.

[49] Continual Learning Using Only Large Language Model Prompting. Proceedings of the 31st International Conference on Computational Linguistics.

[50] Improving language models by retrieving from trillions of tokens. International Conference on Machine Learning, 2022.

[51] Relational memory-augmented language models. Transactions of the Association for Computational Linguistics, 2022.

[52] Retrieval-based language models and applications. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts).

[53] Direct Preference Optimization: Your Language Model is Secretly a Reward Model. 2024.

[54] Online DPO: Online direct preference optimization with fast-slow chasing. arXiv preprint arXiv:2406.05534.

[55] Jailbreaking large language models against moderation guardrails via cipher characters. Advances in Neural Information Processing Systems.

[56] DeepSeek-V3 technical report. arXiv preprint arXiv:2412.19437.

[57] Rasa: Open source language understanding and dialogue management. arXiv preprint arXiv:1712.05181.

[58] Furhat: a back-projected human-like robot head for multiparty human-machine interaction. Cognitive Behavioural Systems: COST 2102 International Training School, Dresden, Germany, February 21-26, 2011, Revised Selected Papers, 2012.