A multi-task evaluation of LLMs’ processing of academic text input

Tianyi Li, Yu Qin, Olivia R · 2025 · arXiv 2508.11779

3 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges

cs.AI · 2026-06-03 · unverdicted · novelty 7.0

LLM judges exhibit high stability under neutral re-evaluation but substantial reversibility under targeted post-decision challenges, quantified via a new Evaluation Robustness Score (ERS).

Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation

cs.IR · 2026-05-27 · unverdicted · novelty 6.0

Audits of 43 LLMs show that varying persona prompts (language, location, role-and-task) and context affects technical quality and social representativeness of scholar recommendations, with location impacting diversity and factuality.

LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources

cs.CL · 2026-04-08 · unverdicted · novelty 5.0

The Guardian Parser Pack pipeline extracts structured intelligence from heterogeneous missing-person documents using schema-guided LLM assistance, achieving F1 of 0.866 on 75 cases versus 0.258 for a deterministic baseline.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges
cs.AI · 2026-06-03 · unverdicted · none · ref 51
Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation
cs.IR · 2026-05-27 · unverdicted · none · ref 26
LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources
cs.CL · 2026-04-08 · unverdicted · none · ref 15
