pith. machine review for the scientific record.

arxiv: 2605.10339 · v1 · submitted 2026-05-11 · 💻 cs.CL

Recognition: 2 theorem links


An Annotation Scheme and Classifier for Personal Facts in Dialogue

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:49 UTC · model grok-4.3

classification 💻 cs.CL
keywords personal facts · dialogue systems · annotation scheme · fact classification · transformer classifier · multi-head model · personalization · few-shot comparison

The pith

Extended annotation scheme for personal facts lets a small classifier outperform few-shot LLMs at lower cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a more detailed way to label personal information shared in conversations by adding categories for a speaker's background and belongings plus attributes that track how long a fact holds, whether it remains valid, and whether it invites further questions. This addresses shortcomings in earlier schemes by supporting organized storage of user details and spotting which facts fit naturally into ongoing dialogue. The authors labeled 2,779 facts drawn from existing multi-turn chat data and trained a multi-head classifier on transformer encoders. When paired with a 300-million-parameter encoder, the model reaches 81.6 percent macro F1 and beats the strongest few-shot large-language-model baseline by nearly nine points while using far less computation. The approach is positioned for practical use in systems that need to maintain consistent, high-quality personal memory across sessions.

Core claim

We present an extended annotation scheme for personal fact classification that addresses limitations in existing approaches, particularly PeaCoK. Our scheme introduces new categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) that enable structured storage, quality filtering, and identification of facts suitable for dialogue continuation. We manually annotated 2,779 facts from Multi-Session Chat and trained a multi-head classifier based on transformer encoders. Combined with the Gemma-300M encoder, the classifier achieves 81.6 ± 2.6% macro F1, outperforming all few-shot LLM baselines (best: GPT-5.4-mini, 72.92%) by nearly 9 percentage points while requiring substantially fewer computational resources.
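The headline numbers are macro-averaged F1 scores. A hedged sketch of what that metric does and where "nearly 9 points" comes from; the per-class precision/recall values below are invented for illustration, not the paper's actual results:

```python
# Macro F1 averages per-class F1 equally, so rare personal-fact
# classes weigh as much as common ones. Toy per-class scores only.
def f1(precision, recall):
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

per_class = [f1(0.90, 0.85), f1(0.70, 0.80), f1(0.88, 0.90)]
macro_f1 = sum(per_class) / len(per_class)
print(round(macro_f1, 3))

# The reported gap between the classifier (81.6) and the best
# few-shot baseline (72.92) is what "nearly 9 points" refers to:
print(round(81.6 - 72.92, 2))  # 8.68
```

Macro averaging matters here because personal-fact categories are typically imbalanced; a micro-averaged score would let frequent categories dominate.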

What carries the argument

Multi-head classifier built on transformer encoders and trained on the extended personal-fact annotation scheme with added categories and attributes.
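A multi-head design of this kind can be sketched as one shared encoder representation feeding independent classification heads, one per annotation dimension. The head names, label counts, random weights, and stand-in embedding below are assumptions for illustration, not the paper's actual configuration or the Gemma-300M encoder:

```python
import numpy as np

# One shared sentence embedding (random here, standing in for a
# transformer encoder's output) feeds several independent linear
# heads, one per annotation dimension of the scheme.
HEADS = {
    "category": 5,   # e.g. Demographics, Possessions, ...
    "duration": 3,   # how long the fact holds (sizes illustrative)
    "validity": 2,   # still valid vs. stale
    "followup": 2,   # invites a follow-up question or not
}
HIDDEN = 64
rng = np.random.default_rng(0)
weights = {h: rng.normal(0.0, 0.1, (HIDDEN, n)) for h, n in HEADS.items()}

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify(embedding):
    """Return one probability distribution per head."""
    return {h: softmax(embedding @ w) for h, w in weights.items()}

emb = rng.normal(size=HIDDEN)  # stand-in for the encoder output
preds = classify(emb)
for head, probs in preds.items():
    print(head, int(probs.argmax()))
```

The design choice the review highlights is that all heads share one encoder forward pass, so adding an attribute costs only a small linear layer rather than another model.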

If this is right

  • Personal facts extracted from dialogue can be stored in a more organized, filterable form.
  • Quality control becomes possible by checking the new validity and duration attributes.
  • Dialogue systems gain a clearer signal for which facts to bring up again in later turns.
  • The same classification task can be performed with substantially lower compute than prompting large models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The scheme could be layered on top of existing memory modules in chatbots to reduce contradictory or outdated responses over long conversations.
  • Error patterns around temporal and pragmatic interpretation suggest the annotation could be combined with separate temporal-reasoning modules for further gains.
  • The public dataset release allows direct testing of whether downstream personalization metrics improve when the new attributes are used for filtering.

Load-bearing premise

The new categories for demographics and possessions together with the duration, validity, and followup attributes truly improve structured storage, quality filtering, and selection of facts worth continuing in real personalized dialogue systems.
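That premise can be made concrete with a small sketch of a memory record carrying the scheme's attributes and the two operations they are claimed to enable (quality filtering and continuation selection). Field names mirror the scheme's attribute names; the value sets and example facts are assumptions, not the paper's label inventory:

```python
from dataclasses import dataclass

@dataclass
class PersonalFact:
    text: str
    category: str    # e.g. "Demographics", "Possessions"
    duration: str    # assumed values: "temporary" | "long-term" | "permanent"
    valid: bool      # the Validity attribute
    followup: bool   # the Followup attribute

memory = [
    PersonalFact("I live in Berlin", "Demographics", "long-term", True, True),
    PersonalFact("I had a cold last week", "Health", "temporary", False, False),
    PersonalFact("I own a vintage bike", "Possessions", "permanent", True, True),
]

# Quality filtering: drop stale or merely temporary facts before storage.
stored = [f for f in memory if f.valid and f.duration != "temporary"]

# Continuation selection: surface facts flagged as inviting a follow-up.
ask_about = [f.text for f in stored if f.followup]
print(len(stored), ask_about)
```

Whether filters like these actually improve a live system is exactly the untested part of the premise; the sketch only shows that the attributes make such filters expressible.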

What would settle it

Integrate the classifier into a live multi-session dialogue system, run controlled comparisons against the prior scheme, and check whether fact consistency and user satisfaction scores show measurable gains.

Figures

Figures reproduced from arXiv: 2605.10339 by Konstantin Zaitsev.

Figure 1. Multi-Head Classification Architecture (see also Appendix §D, Typical Errors).
read the original abstract

The advancement of Large Language Models (LLMs) has enabled their application in personalized dialogue systems. We present an extended annotation scheme for personal fact classification that addresses limitations in existing approaches, particularly PeaCoK. Our scheme introduces new categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) that enable structured storage, quality filtering, and identification of facts suitable for dialogue continuation. We manually annotated 2,779 facts from Multi-Session Chat and trained a multi-head classifier based on transformer encoders. Combined with the Gemma-300M encoder, the classifier achieves 81.6 ± 2.6% macro F1, outperforming all few-shot LLM baselines (best: GPT-5.4-mini, 72.92%) by nearly 9 percentage points while requiring substantially fewer computational resources. Error analysis reveals persistent challenges in semantic boundary disambiguation, temporal aspect interpretation, and pragmatic reasoning for followup assessment. The dataset and classifier are publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes an extended annotation scheme for personal facts in dialogue that adds new categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) to prior work such as PeaCoK. It manually annotates 2,779 facts from the Multi-Session Chat corpus, trains a multi-head classifier on transformer encoders, and reports that the Gemma-300M variant reaches 81.6 ± 2.6% macro F1, outperforming few-shot LLM baselines (best GPT-5.4-mini at 72.92%) while using fewer resources. The dataset and classifier are released publicly, accompanied by error analysis on semantic, temporal, and pragmatic classification difficulties.

Significance. If the classification results hold, the work supplies a stronger, lower-cost baseline for personal-fact extraction together with a publicly available dataset and model. The concrete F1 scores, standard-deviation reporting, direct baseline comparisons, and open release constitute clear strengths. The claimed utility of the new categories and attributes for structured storage, quality filtering, and dialogue-continuation suitability, however, remains untested.

major comments (1)
  1. [Abstract / Introduction] The central motivation that the added categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) 'enable structured storage, quality filtering, and identification of facts suitable for dialogue continuation' is stated without any ablation, downstream task (e.g., fact retention across turns or filtering precision), or user-study evidence showing improvement over PeaCoK or other existing schemes.
minor comments (2)
  1. [Evaluation] Inter-annotator agreement statistics for the full annotation scheme (including the new attributes) are not reported in sufficient detail, limiting assessment of label reliability for the 2,779-fact dataset.
  2. [Baselines] The exact few-shot prompting templates, temperature settings, and output-parsing procedures used for the LLM baselines (including GPT-5.4-mini) should be provided in an appendix or supplementary material to support reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback and for highlighting both the strengths of our classification results and the open release of the dataset and model. We address the major comment regarding the motivation for the new categories and attributes below.

read point-by-point responses
  1. Referee: [Abstract / Introduction] The central motivation that the added categories (Demographics, Possessions) and attributes (Duration, Validity, Followup) 'enable structured storage, quality filtering, and identification of facts suitable for dialogue continuation' is stated without any ablation, downstream task (e.g., fact retention across turns or filtering precision), or user-study evidence showing improvement over PeaCoK or other existing schemes.

    Authors: We agree that the paper would be strengthened by explicit evidence linking the new categories and attributes to downstream benefits. Our primary contribution is the extended annotation scheme, the manually annotated dataset of 2,779 facts, and the multi-head classifier achieving 81.6% macro F1. The stated motivations follow directly from documented limitations in PeaCoK (e.g., absence of temporal validity leading to stale facts and lack of followup flags for dialogue continuation). In the revised manuscript we will (1) expand the Introduction with concrete examples from our annotations illustrating how Duration/Validity support quality filtering and how Followup flags identify continuation-suitable facts, and (2) add a short 'Potential Applications' subsection that outlines plausible uses for structured storage and dialogue systems without claiming empirical gains. We will not add new ablation or user studies, as those fall outside the current scope focused on scheme design and classification performance. revision: yes

Circularity Check

0 steps flagged

No circularity: standard empirical pipeline on new annotation

full rationale

The paper defines a new annotation scheme with added categories and attributes, manually annotates 2,779 facts from an external corpus (Multi-Session Chat), trains a multi-head transformer classifier, and reports macro F1 against independent few-shot LLM baselines. All performance numbers arise from conventional train/test splits and cross-validation on the authors' own labeled data; no equations, parameters, or predictions are defined in terms of the target metrics, and no self-citations serve as load-bearing premises for the classifier results or scheme utility. The downstream-utility claim is simply untested rather than circular.
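The "81.6 ± 2.6" style of reporting mentioned throughout reads as mean ± standard deviation of macro F1 over repeated splits or folds. A minimal sketch of that convention; the fold scores below are invented to illustrate the reporting only, and the paper's exact split protocol is not reproduced here:

```python
import statistics

# Illustrative macro F1 per fold (made-up numbers).
fold_scores = [83.1, 79.4, 84.0, 80.2, 81.3]

mean = statistics.mean(fold_scores)
std = statistics.stdev(fold_scores)  # sample standard deviation
print(f"{mean:.1f} ± {std:.1f}")
```

Reporting the spread alongside the mean is what lets a reader judge whether a ~9-point gap over a baseline is comfortably outside run-to-run noise.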

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the reliability of the new annotation scheme for capturing dialogue-useful facts and on standard supervised learning assumptions for transformer-based classification.

axioms (2)
  • domain assumption Human annotations using the extended scheme provide consistent and useful ground truth labels for personal facts.
    Invoked when training the classifier on the 2,779 annotated facts.
  • standard math Transformer encoder models can learn multi-head classification of dialogue facts from labeled text.
    Standard assumption underlying the Gemma-300M based classifier.

pith-pipeline@v0.9.0 · 5472 in / 1445 out tokens · 45278 ms · 2026-05-12T04:49:43.514362+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 9 internal anchors

  1. [1] I. Chalkidis, E. Fergadiotis, P. Malakasiotis et al. Large-Scale Multi-Label Text Classification on EU Legislation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6314–6322, 2019. https://doi.org/10.18653/v1/P19-1636

  2. [2] J. Chen, H. Lin, X. Han et al. Benchmarking Large Language Models in Retrieval-Augmented Generation. arXiv:2309.01431, 2023. https://arxiv.org/abs/2309.01431

  3. [3] J. Chen, S. Xiao, P. Zhang et al. M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation. In: Findings of the Association for Computational Linguistics: ACL 2024, pp. 2318–2335, 2024. https://doi.org/10.18653/v1/2024.findings-acl.137

  4. [4] Y. Deng, C. Ye, Z. Huang et al. GraphVis: Boosting LLMs with Visual Knowledge Graph Integration. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. https://openreview.net/forum?id=haVPmN8UGi

  5. [5] J. Devlin, M.-W. Chang, K. Lee et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, 2019. https://doi.org/10.18653/v1/N19-1423

  6. [7] K. Enevoldsen, I. Chung, I. Kerboua et al. MMTEB: Massive Multilingual Text Embedding Benchmark. arXiv:2502.13595, 2025. https://doi.org/10.48550/arXiv.2502.13595

  7. [9] B. Fatemi, J. Halcrow, B. Perozzi. Talk like a Graph: Encoding Graphs for Large Language Models. In: The Twelfth International Conference on Learning Representations, 2024. https://openreview.net/forum?id=IuXR1CCrSi

  8. [10] S. Gao, B. Borges, S. Oh et al. PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 6569–6591, 2023. https://doi.org/10.18653/v1/2023.acl-long.362

  9. [11] Gemma Team, A. Kamath, J. Ferret et al. Gemma 3 Technical Report. arXiv:2503.19786, 2025. https://arxiv.org/abs/2503.19786

  10. [12] F. Gilardi, M. Alizadeh, M. Kubli. ChatGPT outperforms crowd workers for text-annotation tasks. In: Proceedings of the National Academy of Sciences, pp. e2305016120, 2023. https://doi.org/10.1073/pnas.2305016120

  11. [13] X. He, Z. Lin, Y. Gong et al. AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. In: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track), pp. 165–190, 2024. https://doi.org/10.18653/v1/2024.naacl-industry.15

  12. [14] C.-P. Hsieh, S. Sun, S. Kriman et al. RULER: What's the Real Context Size of Your Long-Context Language Models?. In: First Conference on Language Modeling, 2024. https://openreview.net/forum?id=kIoBbc76Sy

  13. [15] Q. Huang, S. Fu, X. Liu et al. Learning Retrieval Augmentation for Personalized Dialogue Generation. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pp. 2523–2540, 2023. https://doi.org/10.18653/v1/2023.emnlp-main.154

  14. [16] Q. Huang, X. Liu, T. Ko et al. Selective Prompting Tuning for Personalized Conversations with LLMs. In: Findings of the Association for Computational Linguistics: ACL 2024, pp. 16212–16226, 2024. https://doi.org/10.18653/v1/2024.findings-acl.959

  15. [17] B. Jin, J. Yoon, J. Han et al. Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG. arXiv:2410.05983, 2024. https://arxiv.org/abs/2410.05983

  16. [18] Y. Kuratov, A. Bulatov, P. Anokhin et al. BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack. In: The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024. https://openreview.net/forum?id=u7m2CG84BQ

  17. [19] J. R. Landis, G. G. Koch. The Measurement of Observer Agreement for Categorical Data. In: Biometrics, pp. 159–174, 1977. https://doi.org/10.2307/2529310

  18. [20] P. Lewis, E. Perez, A. Piktus et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In: Advances in Neural Information Processing Systems, 2020. https://arxiv.org/abs/2005.11401

  19. [21] H. Li, C. Yang, A. Zhang et al. Hello Again! LLM-powered Personalized Agent for Long-term Dialogue. In: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pp. 5259–5276, 2025. https://aclanthology.org/2025.naacl-long.272/

  20. [22] J. Liu, Z. Qiu, Z. Li et al. A Survey of Personalized Large Language Models: Progress and Future Directions. arXiv:2502.11528, 2025. https://arxiv.org/abs/2502.11528

  21. [23] N. F. Liu, K. Lin, J. Hewitt et al. Lost in the Middle: How Language Models Use Long Contexts. In: Transactions of the Association for Computational Linguistics, pp. 157–173, 2024. https://doi.org/10.1162/tacl_a_00638

  22. [24] S. Liu, H. Cho, M. Freedman et al. RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 8404–8419, 2023. https://doi.org/10.18653/v1/2023.acl-long.468

  23. [25] Y. Liu, M. Ott, N. Goyal et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907.11692, 2019. https://arxiv.org/abs/1907.11692

  24. [26] C. Packer, S. Wooders, K. Lin et al. MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560, 2024. https://arxiv.org/abs/2310.08560

  25. [27] S. Pan, L. Luo, Y. Wang et al. Unifying Large Language Models and Knowledge Graphs: A Roadmap. In: IEEE Transactions on Knowledge and Data Engineering, pp. 3580–3599, 2024. https://doi.org/10.1109/tkde.2024.3352100

  26. [28] J. Read, B. Pfahringer, G. Holmes et al. Classifier Chains for Multi-label Classification. In: Machine Learning and Knowledge Discovery in Databases, pp. 254–269, 2009. https://doi.org/10.1007/978-3-642-04174-7_17

  27. [29] A. Rios, R. Kavuluru. Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3132–3142, 2018. https://doi.org/10.18653/v1/D18-1352

  28. [30] A. Singh, A. Fry, A. Perelman et al. OpenAI GPT-5 System Card. arXiv:2601.03267, 2025. https://arxiv.org/abs/2601.03267

  29. [31] Y. Tang, B. Wang, M. Fang et al. Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona. arXiv:2305.11482, 2023. https://arxiv.org/abs/2305.11482

  30. [32] Y.-M. Tseng, Y.-C. Huang, T.-Y. Hsiao et al. Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization. In: Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 16612–16631, 2024. https://doi.org/10.18653/v1/2024.findings-emnlp.969

  31. [33] G. Tsoumakas, I. Katakis. Multi-Label Classification: An Overview. In: Int. J. Data Warehous. Min., pp. 1–13, 2007. https://doi.org/10.4018/jdwm.2007070101

  32. [34] A. Vaswani, N. Shazeer, N. Parmar et al. Attention Is All You Need. In: Advances in Neural Information Processing Systems, 2017. https://arxiv.org/abs/1706.03762

  33. [35] H. S. Vera, S. Dua, B. Zhang et al. EmbeddingGemma: Powerful and Lightweight Text Representations. arXiv:2509.20354, 2025. https://arxiv.org/abs/2509.20354

  34. [36] L. Wang, N. Yang, X. Huang et al. Multilingual E5 Text Embeddings: A Technical Report. arXiv:2402.05672, 2024. https://arxiv.org/abs/2402.05672

  35. [37] S. Xiao, Z. Liu, P. Zhang et al. C-Pack: Packed Resources For General Chinese Embeddings. arXiv:2309.07597, 2023. https://arxiv.org/abs/2309.07597

  36. [38] J. Xu, A. Szlam, J. Weston. Beyond Goldfish Memory: Long-Term Open-Domain Conversation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 5180–5197, 2022. https://doi.org/10.18653/v1/2022.acl-long.356

  37. [39] A. Yang, A. Li, B. Yang et al. Qwen3 Technical Report. arXiv:2505.09388, 2025. https://arxiv.org/abs/2505.09388

  38. [40] P. Yang, X. Sun, W. Li et al. SGM: Sequence Generation Model for Multi-label Classification. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 3915–3926, 2018. https://aclanthology.org/C18-1330/

  39. [41] Z. Yi, J. Ouyang, Z. Xu et al. A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems. In: ACM Comput. Surv., vol. 58, no. 6, pp. 1–38, 2025. https://doi.org/10.1145/3771090

  40. [42] R. You, Z. Zhang, Z. Wang et al. AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification. arXiv:1811.01727, 2019. https://arxiv.org/abs/1811.01727

  41. [43] S. Zhang, E. Dinan, J. Urbanek et al. Personalizing Dialogue Agents: I have a dog, do you have pets too?. arXiv:1801.07243, 2018. https://arxiv.org/abs/1801.07243

  42. [44] W. Zhong, L. Guo, Q. Gao et al. MemoryBank: Enhancing Large Language Models with Long-Term Memory. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 19724–19731, 2024. https://doi.org/10.1609/aaai.v38i17.29946

  43. [45] J. Zhou, C. Ma, D. Long et al. Hierarchy-Aware Global Model for Hierarchical Text Classification. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1106–1117, 2020. https://doi.org/10.18653/v1/2020.acl-main.104

  44. [46] Y. Zhu, P. Zhang, E.-U. Haq et al. Can ChatGPT Reproduce Human-Generated Labels? A Study of Social Computing Tasks. arXiv:2304.10145, 2023. https://arxiv.org/abs/2304.10145