Title resolution pending

2023 · arXiv 2301.10140

14 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 1 · dataset: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1 · use dataset: 1

representative citing papers

MemGym: a Long-Horizon Memory Environment for LLM Agents

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.

The Shrinking Lifespan of LLMs in Science

cs.DL · 2026-04-08 · unverdicted · novelty 7.0

LLM adoption in science follows a compressing inverted-U trajectory where release year predicts time-to-peak and lifespan better than model attributes.

Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models across Modalities

cs.CL · 2025-10-08 · accept · novelty 7.0

A comprehensive survey of code-switched NLP research with LLMs across modalities, covering 327 studies, 15+ tasks, 30+ datasets, and 80+ languages while outlining challenges and a future roadmap.

Human-LLM Compound System for Scientific Ideation through Facet Recombination and Novelty Evaluation

cs.HC · 2024-09-23 · unverdicted · novelty 7.0

Scideator enables facet-based scientific ideation through LLM-driven extraction, human-guided recombination, analogous retrieval, and facet-grounded novelty verification, showing significantly higher creativity support than a baseline LLM in a user study with CS researchers.

ChartFI: Benchmarking Faithfulness and Insightfulness of Chart Descriptions from Multimodal Large Language Models

cs.CL · 2026-05-22 · unverdicted · novelty 6.0

ChartFI-Bench supplies 896 chart-description pairs and four metrics (Faithfulness, Coverage, Informativeness, Acuity) to evaluate MLLM-generated chart descriptions on faithfulness and insightfulness.

Jobs' AI Exposure Should Be Measured from Evidence, Not Model Priors

cs.IR · 2026-05-14 · conditional · novelty 6.0

The authors propose a retrieval-augmented framework that grounds AI exposure labels for 18,796 O*NET occupation-task pairs in retrieved news and academic abstracts, outperforming zero-shot prompting in 72% of disagreements and aligning better with observed real-world usage.

Unlocking LLM Creativity in Science through Analogical Reasoning

cs.AI · 2026-05-11 · conditional · novelty 6.0

Analogical reasoning increases LLM solution diversity by 90-173% and novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.

How Adversarial Environments Mislead Agentic AI?

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

Adversarial compromise of tool outputs misleads agentic AI via breadth and depth attacks, revealing that epistemic and navigational robustness are distinct and often trade off against each other.

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

cs.CR · 2026-03-03 · conditional · novelty 6.0

Only 39% of LLM safety benchmark repositories run without modification, 6% include ethical warnings, and adoption tracks author prominence and runnability rather than code quality metrics.

Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers

cs.HC · 2025-10-01 · unverdicted · novelty 6.0

Attribution gradients consolidate citation evidence and enable incremental unfolding of secondary sources, leading to deeper engagement in a lab study of critical reading tasks for AI answers.

Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models

cs.IR · 2026-04-25 · unverdicted · novelty 5.0

Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.

Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining

cs.DB · 2026-04-14 · unverdicted · novelty 5.0

Lit2Vec delivers a documented, reproducible pipeline that extracts and annotates a large licensed chemistry paper corpus from S2ORC with paragraph embeddings and subfield labels.

Omakase: proactive assistance with actionable suggestions for evolving scientific research projects

cs.HC · 2026-04-10 · unverdicted · novelty 4.0

Omakase monitors project documents to infer timely queries and distills research reports into actionable suggestions that users rated significantly more useful than raw reports.

Looking Beyond the Obvious: A Survey on Abstract Concept Recognition for Video Understanding

cs.CV · 2025-08-28 · unverdicted · novelty 3.0

A literature survey on abstract concept recognition in videos that catalogs prior tasks and datasets while advocating for foundation models and reuse of decades of community experience.

citing papers explorer

Showing 14 of 14 citing papers.

MemGym: a Long-Horizon Memory Environment for LLM Agents
cs.CL · 2026-05-20 · unverdicted · none · ref 22
MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.

The Shrinking Lifespan of LLMs in Science
cs.DL · 2026-04-08 · unverdicted · none · ref 11
LLM adoption in science follows a compressing inverted-U trajectory where release year predicts time-to-peak and lifespan better than model attributes.

Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models across Modalities
cs.CL · 2025-10-08 · accept · none · ref 16
A comprehensive survey of code-switched NLP research with LLMs across modalities, covering 327 studies, 15+ tasks, 30+ datasets, and 80+ languages while outlining challenges and a future roadmap.

Human-LLM Compound System for Scientific Ideation through Facet Recombination and Novelty Evaluation
cs.HC · 2024-09-23 · unverdicted · none · ref 37
Scideator enables facet-based scientific ideation through LLM-driven extraction, human-guided recombination, analogous retrieval, and facet-grounded novelty verification, showing significantly higher creativity support than a baseline LLM in a user study with CS researchers.

ChartFI: Benchmarking Faithfulness and Insightfulness of Chart Descriptions from Multimodal Large Language Models
cs.CL · 2026-05-22 · unverdicted · none · ref 27
ChartFI-Bench supplies 896 chart-description pairs and four metrics (Faithfulness, Coverage, Informativeness, Acuity) to evaluate MLLM-generated chart descriptions on faithfulness and insightfulness.

Jobs' AI Exposure Should Be Measured from Evidence, Not Model Priors
cs.IR · 2026-05-14 · conditional · none · ref 23
The authors propose a retrieval-augmented framework that grounds AI exposure labels for 18,796 O*NET occupation-task pairs in retrieved news and academic abstracts, outperforming zero-shot prompting in 72% of disagreements and aligning better with observed real-world usage.

Unlocking LLM Creativity in Science through Analogical Reasoning
cs.AI · 2026-05-11 · conditional · none · ref 23
Analogical reasoning increases LLM solution diversity by 90-173% and novelty rate to over 50%, delivering up to 13-fold gains on biomedical tasks including perturbation prediction and cell communication.

How Adversarial Environments Mislead Agentic AI?
cs.AI · 2026-04-20 · unverdicted · none · ref 51
Adversarial compromise of tool outputs misleads agentic AI via breadth and depth attacks, revealing that epistemic and navigational robustness are distinct and often trade off against each other.

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks
cs.CR · 2026-03-03 · conditional · none · ref 50
Only 39% of LLM safety benchmark repositories run without modification, 6% include ethical warnings, and adoption tracks author prominence and runnability rather than code quality metrics.

Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers
cs.HC · 2025-10-01 · unverdicted · none · ref 32
Attribution gradients consolidate citation evidence and enable incremental unfolding of secondary sources, leading to deeper engagement in a lab study of critical reading tasks for AI answers.

Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models
cs.IR · 2026-04-25 · unverdicted · none · ref 23
Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.

Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining
cs.DB · 2026-04-14 · unverdicted · none · ref 19
Lit2Vec delivers a documented, reproducible pipeline that extracts and annotates a large licensed chemistry paper corpus from S2ORC with paragraph embeddings and subfield labels.

Omakase: proactive assistance with actionable suggestions for evolving scientific research projects
cs.HC · 2026-04-10 · unverdicted · none · ref 24
Omakase monitors project documents to infer timely queries and distills research reports into actionable suggestions that users rated significantly more useful than raw reports.

Looking Beyond the Obvious: A Survey on Abstract Concept Recognition for Video Understanding
cs.CV · 2025-08-28 · unverdicted · none · ref 42
A literature survey on abstract concept recognition in videos that catalogs prior tasks and datasets while advocating for foundation models and reuse of decades of community experience.
