arxiv: 2605.21269 · v1 · pith:KHJA7XGKnew · submitted 2026-05-20 · 💻 cs.SE · cs.MA

Transforming Privacy Artifacts into Accessible Reports for Non-Technical Stakeholders

Pith reviewed 2026-05-21 03:20 UTC · model grok-4.3

classification 💻 cs.SE cs.MA
keywords Privacy by Design · Requirements Engineering · Large Language Models · Human-centric systems · Industry 5.0 · Privacy transparency · Non-technical stakeholders · Monitoring systems

The pith

A conceptual framework uses large language models to turn technical privacy artifacts into accessible reports for non-technical stakeholders such as workers and unions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tackles privacy concerns in Industry 5.0 systems that monitor human workers, where lack of transparency often causes rejection of human-machine collaboration by affected parties and unions. Current requirements engineering practices offer little help in making privacy threats and mitigations understandable to people without technical backgrounds. The authors propose a conceptual framework rooted in Privacy by Design principles that employs large language models to convert technical artifacts into clear privacy reports. This framework guides the process from monitoring use cases and requirements all the way to informed decision-making guidance. Initial insights come from two industry use cases along with an evaluation of the generated reports.

Core claim

The paper establishes a conceptual framework that starts with human monitoring-related use cases and requirements and leverages large language models to transform technical privacy artifacts into accessible reports, thereby supporting informed decision-making by non-technical stakeholders in human-centric industrial systems.

What carries the argument

The conceptual framework that applies large language models to convert technical privacy artifacts into accessible reports while following Privacy by Design principles.
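
As a rough illustration of the transformation step, the sketch below assembles technical artifacts into a single prompt. The `PrivacyArtifact` fields, the template wording, and the `build_prompt` helper are all hypothetical and are not the authors' template (theirs is shown in the paper's Figure 2). No LLM is called here; the sketch only shows how inputs might be laid out.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyArtifact:
    """Hypothetical container for the technical inputs of one use case."""
    use_case: str
    requirements: list[str] = field(default_factory=list)
    threats: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)


# Assumed template wording; the paper's actual prompt is not reproduced here.
PROMPT_TEMPLATE = """You are writing a privacy report for workers and union representatives.
Use plain language and avoid technical jargon.

Use case:
{use_case}

Requirements:
{requirements}

Privacy threats:
{threats}

Mitigations:
{mitigations}

First reason step by step inside <scratchpad> tags, then write the report inside <report> tags.
Only state facts that appear in the inputs above."""


def _bullets(items: list[str]) -> str:
    """Render a list as plain-text bullets, marking empty sections explicitly."""
    return "\n".join(f"- {item}" for item in items) if items else "- (none provided)"


def build_prompt(artifact: PrivacyArtifact) -> str:
    """Fill the assumed template with the artifact's fields."""
    return PROMPT_TEMPLATE.format(
        use_case=artifact.use_case,
        requirements=_bullets(artifact.requirements),
        threats=_bullets(artifact.threats),
        mitigations=_bullets(artifact.mitigations),
    )
```

Marking empty sections explicitly, rather than omitting them, is one way to discourage a model from filling a missing mitigation list with invented content.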

If this is right

  • Privacy threats and mitigation strategies become communicable early in the design process to workers and unions.
  • Human-machine collaboration features gain greater acceptance through improved transparency.
  • Non-technical stakeholders can participate in informed decisions about privacy implications.
  • Privacy transparency integrates more directly into requirements engineering for industrial systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same transformation approach could apply to privacy communication in other monitoring-heavy fields such as healthcare or logistics.
  • Longer-term adoption would require measuring whether the reports actually change stakeholder behavior or acceptance rates.
  • Combining the framework with existing privacy impact assessment tools could create end-to-end traceable documentation.

Load-bearing premise

Large language models can accurately translate technical privacy artifacts into accessible reports that keep all critical information intact and genuinely support better decision-making without errors or harmful simplifications.

What would settle it

A controlled evaluation in which non-technical stakeholders read the generated reports and then fail to correctly identify key privacy threats, or fail to make decisions that reflect an understanding of the mitigations.
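
One way such an evaluation could be scored is sketched below. The recall measure, the function names, and the 0.8 threshold are assumptions made for illustration; the paper does not specify a scoring scheme.

```python
def threat_recall(expected: set[str], identified: set[str]) -> float:
    """Fraction of the expected threats that one reader correctly identified."""
    if not expected:
        raise ValueError("expected must not be empty")
    return len(expected & identified) / len(expected)


def falsifies_claim(recalls: list[float], threshold: float = 0.8) -> bool:
    """True if mean recall across readers falls below the assumed threshold."""
    if not recalls:
        raise ValueError("recalls must not be empty")
    return sum(recalls) / len(recalls) < threshold
```

For example, a reader who names one of two expected threats scores 0.5; a group averaging below the chosen threshold would count as evidence against the load-bearing premise.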

Figures

Figures reproduced from arXiv: 2605.21269 by Benedikt Dornauer, Christian Wolf, Clemens Sauerwein, Michael Vierhauser, Ruth Breu, Tina Mersch, Zoe Pfister.

Figure 1: Overview of our proposed framework, transforming human monitoring requirements and privacy analysis artifacts into stakeholder …
Figure 2: Single-Shot Prompt Template with Chain-of-Thought Scratch …
Figure 3: Example of Inputs and Generated Outputs of the Workflow based on the example (UC1) introduced in Section …
Original abstract

The transition toward Industry 5.0 is reshaping industrial work environments with an emphasis on human-centricity, enabling close collaboration between humans and machines to enhance productivity and flexibility. However, such systems typically require monitoring of human workers and operators, often involving sensitive data, raising significant privacy concerns. As a result, affected workers and unions frequently reject human-machine collaboration features due to a lack of transparency regarding privacy threats and implemented mitigation strategies. To enable early stakeholder involvement, establish trust, and support informed decision-making, privacy implications must be communicated in a way understandable to non-technical stakeholders. Yet, current Requirements Engineering (RE) practices provide limited methodological support for making privacy threats and mitigations accessible to non-technical stakeholders (e.g., individual workers or their representative unions). In this RE@Next paper, we propose a conceptual framework that guides software design from human monitoring-related use cases and requirements to informed decision-making guidance focusing on non-technical stakeholders. Building on principles such as Privacy by Design, the framework leverages Large Language Models (LLMs) to transform technical artifacts into accessible privacy reports. We share initial insights from two industry use cases, evaluate the quality of the generated reports, and outline future research directions toward integrating privacy transparency into RE processes for human-centric industrial systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a conceptual framework that guides software design from human monitoring-related use cases and requirements to informed decision-making guidance for non-technical stakeholders. Building on Privacy by Design, it leverages LLMs to transform technical privacy artifacts into accessible reports, shares initial insights from two industry use cases, evaluates the quality of the generated reports, and outlines future research directions for integrating privacy transparency into RE processes for human-centric industrial systems.

Significance. If the central claim holds, the work could help close a methodological gap in Requirements Engineering by enabling early involvement of workers and unions in privacy-sensitive human-machine collaboration systems. This has potential to build trust and support informed decisions in Industry 5.0 contexts where monitoring raises privacy concerns.

major comments (2)
  1. [Use cases and evaluation] The evaluation of report quality from the two industry use cases is described only at a high level in the abstract and use-case section; no explicit methods, metrics (e.g., fidelity, completeness, or accessibility scores), data, or limitations are provided. This makes it impossible to verify whether the LLM outputs preserve all critical privacy threats and mitigations without oversimplification or hallucination.
  2. [Conceptual framework description] The framework's core step—LLM-based transformation of technical artifacts—lacks any described safeguards such as grounding, fact-checking, retrieval-augmented generation, or post-generation validation. Without these controls, the claim that the resulting reports enable reliable informed decision-making rests on an untested assumption.
minor comments (1)
  1. [Abstract] The abstract states that the framework 'leverages Large Language Models (LLMs)' but does not specify which models, prompting strategies, or input artifact formats are used; adding a brief example would improve clarity.
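
The accessibility scores mentioned in major comment 1 could take many forms. The sketch below shows one crude proxy: average sentence length plus a count of jargon terms. The `JARGON` list and the `accessibility_proxy` helper are hypothetical, and this is not a metric used in the paper.

```python
import re

# Hypothetical jargon list; a real study would derive this with stakeholders.
JARGON = {"pseudonymization", "linkability", "data controller", "threat model"}


def accessibility_proxy(report: str, jargon: set[str] = JARGON) -> dict[str, object]:
    """Return average words per sentence and the jargon terms found in the report."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+\s*", report) if s.strip()]
    words = re.findall(r"[A-Za-z']+", report)
    lowered = report.lower()
    return {
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "jargon_terms": sorted(t for t in jargon if t in lowered),
    }
```

A proxy like this says nothing about fidelity or completeness; it only flags reports that are likely hard to read.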

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the paper.

  1. Referee: [Use cases and evaluation] The evaluation of report quality from the two industry use cases is described only at a high level in the abstract and use-case section; no explicit methods, metrics (e.g., fidelity, completeness, or accessibility scores), data, or limitations are provided. This makes it impossible to verify whether the LLM outputs preserve all critical privacy threats and mitigations without oversimplification or hallucination.

    Authors: We agree that the evaluation is presented at a high level and lacks explicit methods, metrics, or a discussion of limitations. As an RE@Next paper, our primary contribution is the conceptual framework, with the use cases intended as illustrative examples rather than a comprehensive empirical study. To address this, we will revise the manuscript to add a dedicated evaluation subsection. This will describe the qualitative assessment process used (including how fidelity to source artifacts, completeness of privacy information, and accessibility were considered), note the absence of quantitative scores in the current version, and explicitly list limitations such as the preliminary nature of the insights and potential for LLM hallucination. These additions will make the claims more verifiable without altering the paper's conceptual focus.
    Revision: yes

  2. Referee: [Conceptual framework description] The framework's core step—LLM-based transformation of technical artifacts—lacks any described safeguards such as grounding, fact-checking, retrieval-augmented generation, or post-generation validation. Without these controls, the claim that the resulting reports enable reliable informed decision-making rests on an untested assumption.

    Authors: We acknowledge that the framework description does not currently specify safeguards for the LLM transformation step, which is a substantive limitation given the known risks of oversimplification or hallucination. This leaves the reliability of the generated reports as an assumption. In the revised manuscript, we will expand the framework section to incorporate proposed safeguards, including retrieval-augmented generation grounded in the original technical artifacts, post-generation expert validation, and iterative human-in-the-loop checks. These will be framed as recommended practices within the framework and highlighted as key areas for future empirical validation, thereby addressing the concern directly.
    Revision: yes
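
A post-generation validation step of the kind discussed in this exchange might begin with a simple coverage check, sketched below. The item IDs, key phrases, and function names are hypothetical. This is plain substring matching, not retrieval-augmented generation; it can flag omissions but cannot detect hallucinated content.

```python
def missing_items(source_items: dict[str, list[str]], report: str) -> list[str]:
    """Return IDs of source threats or mitigations with no key phrase in the report."""
    text = report.lower()
    return [
        item_id
        for item_id, phrases in source_items.items()
        if not any(phrase.lower() in text for phrase in phrases)
    ]


def needs_human_review(source_items: dict[str, list[str]], report: str) -> bool:
    """Route the report to an expert reviewer if any source item is uncovered."""
    return bool(missing_items(source_items, report))
```

For instance, a report that mentions location tracking but never says when the data is deleted would be returned with the corresponding mitigation ID, prompting a human-in-the-loop check before the report reaches workers or unions.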

Circularity Check

0 steps flagged

No significant circularity in the conceptual framework proposal

The paper presents a conceptual framework that guides software design from human monitoring use cases to accessible privacy reports via LLMs, building on Privacy by Design. No equations, fitted parameters, or quantitative predictions are present that could reduce claims to inputs by construction. The framework is described as a new contribution with initial insights from two industry use cases, and no load-bearing self-citations, uniqueness theorems, or ansatzes are invoked in a way that creates circularity. The derivation chain remains independent and self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper introduces a new conceptual framework as its core addition. It relies on domain assumptions about the effectiveness of Privacy by Design and LLM capabilities for accurate summarization, with no free parameters or new physical entities postulated.

axioms (2)
  • [domain assumption] Privacy by Design principles provide a suitable foundation for addressing privacy in human-centric industrial monitoring systems.
    Invoked in the abstract as the basis for the framework.
  • [ad hoc to paper] Large Language Models can transform technical privacy artifacts into accessible reports without critical loss of meaning or introduction of inaccuracies.
    Central to the proposed method but not independently verified in the abstract.
invented entities (1)
  • Conceptual framework for LLM-based privacy report generation (no independent evidence)
    Purpose: To guide transformation of technical artifacts into accessible reports for non-technical stakeholders.
    The main novel contribution introduced in the paper.

pith-pipeline@v0.9.0 · 5769 in / 1458 out tokens · 29452 ms · 2026-05-21T03:20:02.031229+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · relation to the paper passage: unclear

    We propose a conceptual framework that guides software design from human monitoring-related use cases and requirements to informed decision-making guidance focusing on non-technical stakeholders. Building on principles such as Privacy by Design, the framework leverages Large Language Models (LLMs) to transform technical artifacts into accessible privacy reports.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
