Evaluation of Large Language Model Performance and Reliability for Citations and References in Scholarly Writing: Cross-Disciplinary Study

Mugaanyi, J · 2024 · DOI 10.2196/52935

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

cs.IR · 2026-06-06 · unverdicted · novelty 6.0

GIScholarBench shows LLMs exhibit consistent overconfidence across three scholarly tasks in GIS, with different manifestations in factual retrieval, citation expansion, and idea generation.

AI-Augmented Bibliometric Framework: A Paradigm Shift with Agentic AI for Dynamic, Snippet-Based Research Analysis

cs.DL · 2025-11-22 · conditional · novelty 5.0

A multi-agent framework uses natural language to generate and execute Python code for dynamic bibliometric analysis including networks, clustering, and automated reports.

Acceptance-Test-Driven Evaluation Protocols for Business-Centric LLM Systems

cs.SE · 2026-06-01 · unverdicted · novelty 4.0

The paper introduces a red-train-green lifecycle and governance metric stack that adapts acceptance testing to LLM systems for business use.

citing papers explorer

Showing 3 of 3 citing papers.

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

cs.IR · 2026-06-06 · unverdicted · none · ref 21

AI-Augmented Bibliometric Framework: A Paradigm Shift with Agentic AI for Dynamic, Snippet-Based Research Analysis

cs.DL · 2025-11-22 · conditional · none · ref 30

Acceptance-Test-Driven Evaluation Protocols for Business-Centric LLM Systems

cs.SE · 2026-06-01 · unverdicted · none · ref 11
