Mixed citations

Title resolution pending

Rohit Khoja, Devanshu Gupta, Yanjie Fu, Dan Roth, Vivek Gupta · 2025 · DOI 10.18653/v1/2025.emnlp-

Mixed citation behavior. The most common role is background (40% of classified citations).

14 Pith papers citing it


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 2 · method 2 · baseline 1

citation-polarity summary

background 2 · use method 2 · baseline 1

representative citing papers

Cost-Aware Optimization for Agentic Query Execution

cs.DB · 2026-06-02 · unverdicted · novelty 7.0

EnumGRPO is a self-improving optimizer for agentic query execution that reduces LLM-operator costs by ~317x while improving accuracy by 18% over a hybrid baseline across four databases.

C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment

cs.CL · 2026-04-17 · unverdicted · novelty 7.0

C-Mining automatically mines high-fidelity Culture Points from raw multilingual text by treating cross-lingual geometric isolation in embeddings as a quantifiable signal for cultural specificity, then uses them to synthesize better instruction data.

What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

Empirical analysis of 4707 MoltBook posts shows AI-only technical discourse focuses on security, trust, and abstract topics while lacking concrete runtime and project details found in human GitHub discussions.

LLM-PRISM: Characterizing Silent Data Corruption from Permanent GPU Faults in LLM Training

cs.AR · 2026-04-12 · unverdicted · novelty 6.0

LLMs resist low-frequency permanent GPU faults but certain datapaths and precision formats trigger catastrophic training divergence even at moderate fault rates.

Entropy-Gradient Grounding: Training-Free Evidence Retrieval in Vision-Language Models

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

Entropy-gradient grounding uses model uncertainty to retrieve evidence regions in VLMs, improving performance on detail-critical and compositional tasks across multiple architectures.

From Retinal Evidence to Safe Decisions: RETINA-SAFE and ECRT for Hallucination Risk Triage in Medical LLMs

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

RETINA-SAFE benchmark and ECRT two-stage triage improve hallucination risk detection in medical LLMs for retinal decisions by 0.15-0.19 balanced accuracy over baselines using internal representations and logit shifts.

Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight

cs.AI · 2025-12-22 · conditional · novelty 6.0

Physician oversight reveals high error rates in LLM-generated labels for a clinical benchmark and demonstrates that corrected labels improve both evaluation accuracy and downstream model training.

Can Large Language Models Really Recognize Your Name?

cs.CR · 2025-05-20 · unverdicted · novelty 6.0

LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.

Starve to Perceive: Taming Lazy Perception in VLMs with Constrained Visual Bandwidth

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Constraining visual token budget per observation during VLM training forces genuine active perception and delivers 5% average relative improvement without auxiliary losses or architecture changes.

CroSearch-R1: Better Leveraging Cross-lingual Knowledge for Retrieval-Augmented Generation

cs.CL · 2026-04-28 · unverdicted · novelty 5.0

CroSearch-R1 applies search-augmented RL with cross-lingual integration and multilingual rollouts to improve RAG effectiveness on multilingual collections.

Do Language Models Pass the Bechdel Test? Auditing Gender Biases in LLM-Generated Screenplays

cs.HC · 2026-06-23 · unverdicted · novelty 4.0

Human-written screenplays pass the Bechdel test more often than those generated by GPT-5, Gemini 3 Pro, and Claude Sonnet 4.5, though network analyses show mixed bias patterns across all script types.

Evaluation of LLM-Based Software Engineering Tools: Practices, Challenges, and Future Directions

cs.SE · 2026-04-27 · unverdicted · novelty 4.0

LLM-based SE tools lack stable ground truth and deterministic outputs, making standard evaluation assumptions invalid and requiring new approaches for reliable assessment.

GraphMind: From Operational Traces to Self-Evolving Workflow Automation

cs.AI · 2026-05-17

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization

cs.CV · 2026-05-11

citing papers explorer

C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment

cs.CL · 2026-04-17 · unverdicted · none · ref 40

What Software Engineering Looks Like to AI Agents? -- An Empirical Study of AI-Only Technical Discourse on MoltBook

cs.SE · 2026-05-08 · unverdicted · none · ref 42
