arXiv preprint arXiv:2212.13138

Karan Singhal, Shekoofeh Azizi, Tao Tu, S · 2022 · arXiv 2212.13138

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models

cs.CL · 2026-04-11 · unverdicted · novelty 7.0

Supervised fine-tuning of LLMs often fails to fully internalize all training instances due to five recurring causes including missing prerequisites and data conflicts, as diagnosed via a new framework across multiple models.

Vision-Language Foundation Models for Comprehensive Automated Pavement Condition Assessment

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

Instruction-tuned vision-language model PaveGPT, trained on a large unified pavement dataset, achieves substantial gains over general models in comprehensive, standard-compliant pavement condition assessment.

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

cs.CV · 2023-10-09 · unverdicted · novelty 7.0

A new shared video-image tokenizer enables large language models to surpass diffusion models on standard visual generation benchmarks.

Atomic Fact-Checking Increases Clinician Trust in Large Language Model Recommendations for Oncology Decision Support: A Randomized Controlled Trial

cs.CL · 2026-05-05 · conditional · novelty 6.0

Atomic fact-checking of LLM oncology recommendations increased clinician trust from 26.9% to 66.5% (Cohen's d=0.94) in a trial of 356 doctors.

Compared to What? Baselines and Metrics for Counterfactual Prompting

cs.CL · 2026-05-01 · conditional · novelty 6.0

Counterfactual prompting effects on LLMs are often indistinguishable from those caused by meaning-preserving paraphrases, causing most previously reported demographic sensitivities to disappear under proper statistical comparison.

Towards an AI co-scientist

cs.AI · 2025-02-26 · unverdicted · novelty 6.0

A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.

BloombergGPT: A Large Language Model for Finance

cs.LG · 2023-03-30 · conditional · novelty 6.0

BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.

Automated Auditing of Hospital Discharge Summaries for Care Transitions

cs.AI · 2026-04-07 · unverdicted · novelty 4.0

An LLM-based framework automates auditing of discharge summaries using a DISCHARGED-derived checklist on MIMIC-IV data to detect missing or ambiguous documentation elements.

Large Language Models: A Survey

cs.CL · 2024-02-09 · accept · novelty 3.0

The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.

citing papers explorer

Showing 9 of 9 citing papers.

Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models · cs.CL · 2026-04-11 · unverdicted · none · ref 9
Vision-Language Foundation Models for Comprehensive Automated Pavement Condition Assessment · cs.CV · 2026-04-09 · unverdicted · none · ref 12
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation · cs.CV · 2023-10-09 · unverdicted · none · ref 254
Atomic Fact-Checking Increases Clinician Trust in Large Language Model Recommendations for Oncology Decision Support: A Randomized Controlled Trial · cs.CL · 2026-05-05 · conditional · none · ref 1
Compared to What? Baselines and Metrics for Counterfactual Prompting · cs.CL · 2026-05-01 · conditional · none · ref 120
Towards an AI co-scientist · cs.AI · 2025-02-26 · unverdicted · none · ref 191
BloombergGPT: A Large Language Model for Finance · cs.LG · 2023-03-30 · conditional · none · ref 103
Automated Auditing of Hospital Discharge Summaries for Care Transitions · cs.AI · 2026-04-07 · unverdicted · none · ref 10
Large Language Models: A Survey · cs.CL · 2024-02-09 · accept · none · ref 76
