pith. machine review for the scientific record.

arxiv: 2401.01313 · v3 · submitted 2024-01-02 · 💻 cs.CL

Recognition: 1 theorem link

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 19:10 UTC · model grok-4.3

classification 💻 cs.CL
keywords hallucination mitigation · large language models · survey · taxonomy · retrieval augmented generation · feedback mechanisms · knowledge retrieval · LLM reliability

The pith

A survey reviews more than thirty-two techniques for mitigating hallucinations in large language models and organizes them into a taxonomy based on dataset use, tasks, feedback, and retrievers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to compile and structure the existing methods that aim to stop large language models from producing content that looks factual yet lacks grounding in reliable information. Hallucinations block safe use of LLMs in high-stakes settings such as medical summaries or financial reports, because models trained on broad internet text can invent details or twist facts to fit prompts. The authors examine more than thirty-two concrete techniques, among them retrieval-augmented generation, knowledge retrieval, CoNLI, and chain-of-verification (CoVe). They sort these approaches according to how they draw on datasets, the tasks they target, the feedback loops they employ, and the kinds of retrievers they rely on. The survey also flags persistent challenges and limits, thereby marking open problems for later work.
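To make the four organizing axes concrete, here is a minimal sketch of the taxonomy as a data structure. The axis names follow the survey; the per-technique field values are illustrative guesses for this page, not labels taken from the paper's own tables.

```python
from dataclasses import dataclass

@dataclass
class Technique:
    """One surveyed method, positioned on the survey's four axes."""
    name: str
    dataset_use: str   # how the method draws on datasets
    task: str          # the task family it targets
    feedback: str      # feedback loop it employs, if any
    retriever: str     # retriever type it relies on, if any

# Example placements consistent with the survey's high-level split between
# retrieval-based methods (RAG, Knowledge Retrieval) and feedback-driven
# ones (CoNLI, CoVe); the exact field values here are assumptions.
TECHNIQUES = [
    Technique("RAG", "external corpus", "knowledge-intensive QA", "none", "dense retriever"),
    Technique("Knowledge Retrieval", "external knowledge source", "open-ended generation", "validation", "knowledge retriever"),
    Technique("CoNLI", "model outputs", "grounded generation", "NLI-based detection", "none"),
    Technique("CoVe", "model outputs", "long-form QA", "self-verification", "none"),
]

def by_axis(axis: str) -> dict[str, list[str]]:
    """Group technique names by their value on one taxonomy axis."""
    groups: dict[str, list[str]] = {}
    for t in TECHNIQUES:
        groups.setdefault(getattr(t, axis), []).append(t.name)
    return groups
```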

Core claim

The central claim is that a taxonomy built around dataset utilization, common tasks, feedback mechanisms, and retriever types brings order to more than thirty-two hallucination-mitigation techniques for large language models. The taxonomy distinguishes retrieval-based methods such as Retrieval Augmented Generation and Knowledge Retrieval from feedback-driven ones such as CoNLI and CoVe, while also cataloguing their respective limitations.

What carries the argument

A taxonomy that groups hallucination-mitigation techniques by dataset utilization, common tasks, feedback mechanisms, and retriever types.

If this is right

  • Practitioners can match mitigation methods to specific tasks and data resources using the taxonomy (see the selection sketch after this list).
  • The documented limitations point to concrete gaps that new techniques must address.
  • Feedback-based approaches can be combined with retriever-based ones to strengthen factual consistency.
  • The categorization supplies a baseline for measuring progress in LLM reliability.
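If the first bullet holds, method selection reduces to filtering on the taxonomy axes. A hedged continuation of the sketch above, reusing its `TECHNIQUES` list; the constraint values remain illustrative:

```python
def select(techniques, **constraints):
    """Return techniques whose taxonomy fields match every constraint."""
    return [
        t for t in techniques
        if all(getattr(t, axis) == value for axis, value in constraints.items())
    ]

# A team with no retrieval infrastructure might shortlist feedback-driven
# methods; one with a dense index might filter the other way.
feedback_only = select(TECHNIQUES, retriever="none")
retrieval_based = select(TECHNIQUES, retriever="dense retriever")
```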

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The taxonomy may need updating whenever new mitigation ideas appear after the survey cutoff.
  • If the categories prove stable, they could support standardized benchmarks that compare hallucination rates across methods (a minimal harness is sketched after this list).
  • Extending the same organizing principles to other reliability issues, such as bias amplification, might reveal shared mechanisms.
  • Real-world deployment trials could test whether the taxonomy actually speeds up selection of suitable techniques for medical or legal text generation.
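A standardized benchmark of the kind the second bullet imagines needs only a shared prompt set and a shared hallucination judge. The harness below is a hypothetical sketch: `methods`, `prompts`, and the `is_hallucinated` judge are all assumed inputs, and the hard part in practice is the judge itself.

```python
def hallucination_rate(generate, prompts, is_hallucinated) -> float:
    """Fraction of prompts on which a mitigation pipeline still hallucinates.

    `generate` wraps an LLM plus one mitigation technique; `is_hallucinated`
    is a shared judge (human labels or a verifier model). Both are assumed.
    """
    flagged = sum(bool(is_hallucinated(p, generate(p))) for p in prompts)
    return flagged / len(prompts)

def compare(methods: dict, prompts, is_hallucinated) -> dict:
    """Score every method on the same prompts so the rates are comparable."""
    return {
        name: hallucination_rate(generate, prompts, is_hallucinated)
        for name, generate in methods.items()
    }
```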

Load-bearing premise

The chosen set of more than thirty-two techniques and the four-part taxonomy together give a complete and useful picture of the field with no major omissions.

What would settle it

Publication of a later survey that identifies a substantial number of additional techniques that fall outside the four categories or shows that the taxonomy does not help practitioners select methods for concrete applications.

read the original abstract

As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to hallucinate: generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people's lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying the information to align superficially with the input. This becomes hugely alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records, financial analysis reports, etc. This paper presents a comprehensive survey of over 32 techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval Augmented Generation (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination issues in LLMs. Additionally, we analyze the challenges and limitations inherent in these techniques, providing a solid foundation for future research in addressing hallucinations and related phenomena within the realm of LLMs.
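Of the techniques the abstract names, Chain-of-Verification (CoVe) lends itself to a compact illustration: draft an answer, plan verification questions, answer them independently, then revise. The sketch below is a loose paraphrase of that loop, not the paper's implementation; the `llm` callable and the prompt wording are placeholder assumptions, and Dhuliawala et al. (2023) should be consulted for the actual prompting scheme.

```python
def chain_of_verification(llm, question: str) -> str:
    """CoVe-style draft/verify/revise loop; prompts are illustrative only."""
    # 1. Draft a baseline answer.
    draft = llm(f"Answer the question.\n\nQ: {question}")
    # 2. Plan verification questions that probe the draft's factual claims.
    plan = llm(
        "List short fact-checking questions that would verify this answer, "
        f"one per line.\n\nAnswer: {draft}"
    )
    # 3. Answer each check independently, without showing the draft,
    #    so verification is not biased toward repeating its mistakes.
    checks = [(q, llm(f"Q: {q}")) for q in plan.splitlines() if q.strip()]
    evidence = "\n".join(f"{q} -> {a}" for q, a in checks)
    # 4. Produce a revised answer consistent with the verification results.
    return llm(
        "Revise the draft so it is consistent with the checks.\n\n"
        f"Q: {question}\nDraft: {draft}\nChecks:\n{evidence}"
    )
```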

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper claims to provide a comprehensive survey of over 32 hallucination mitigation techniques in large language models (LLMs). It introduces a taxonomy categorizing these methods based on dataset utilization, tasks, feedback mechanisms, and retriever types, and discusses the challenges and limitations of these techniques.

Significance. Should the survey prove comprehensive and the taxonomy effective, the work would offer significant value by consolidating the literature on a critical issue for LLM reliability. This could guide future research in developing more robust mitigation strategies for applications in sensitive domains.

major comments (1)
  1. [Introduction] The paper asserts a 'comprehensive survey' of over 32 techniques without documenting the literature search protocol, including databases, search terms, inclusion criteria, or date cutoffs. This is a load-bearing omission for the central claim, as it prevents verification of whether the taxonomy covers the field adequately or omits important approaches like uncertainty estimation methods.
minor comments (2)
  1. [Abstract] Specific techniques are mentioned (e.g., RAG, CoNLI, CoVe) but the paper should include a table summarizing all surveyed techniques with their key characteristics to improve accessibility.
  2. [Taxonomy] The categorization parameters are listed but would benefit from explicit examples or a mapping table showing which techniques fall under each category.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback. We agree that explicitly documenting the literature search protocol is necessary to substantiate the claim of a comprehensive survey and to allow verification of the taxonomy's coverage. We will revise the manuscript to address this.

read point-by-point responses
  1. Referee: The paper asserts a 'comprehensive survey' of over 32 techniques without documenting the literature search protocol, including databases, search terms, inclusion criteria, or date cutoffs. This is a load-bearing omission for the central claim, as it prevents verification of whether the taxonomy covers the field adequately or omits important approaches like uncertainty estimation methods.

    Authors: We acknowledge this is a valid point and that the current manuscript lacks an explicit search protocol description. In the revised version, we will add a dedicated subsection in the Introduction (or a new 'Survey Methodology' section) detailing: databases searched (arXiv, Google Scholar, ACL Anthology, proceedings of NeurIPS/ICML/ICLR/ACL/EMNLP), search terms (e.g., 'LLM hallucination mitigation', 'reducing hallucinations in large language models', 'factuality in LLMs', 'hallucination detection and mitigation'), inclusion criteria (peer-reviewed or preprint papers from 2020 onward that propose or evaluate techniques specifically for mitigating hallucinations in generative LLMs, with empirical results), and cutoff date (December 2023). We will also review and incorporate uncertainty estimation approaches (e.g., methods using token-level confidence, ensemble variance, or self-consistency checks) into the taxonomy, likely under feedback mechanisms or as a related category, with discussion of how they complement or overlap with the 32+ techniques already covered. This revision will make the survey's scope verifiable. revision: yes
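A search protocol like the one promised here can be made mechanically reproducible. The following is a minimal sketch against the public arXiv API, using the rebuttal's own search terms and December 2023 cutoff; the final inclusion screen (does the paper propose or evaluate a mitigation technique, with empirical results) is left as a manual step.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
TERMS = [
    "LLM hallucination mitigation",
    "reducing hallucinations in large language models",
    "factuality in LLMs",
    "hallucination detection and mitigation",
]
CUTOFF = "2023-12-31"  # cutoff date stated in the rebuttal

def arxiv_search(term: str, max_results: int = 100):
    """Yield title/date pairs from the public arXiv API for one phrase."""
    query = urllib.parse.urlencode({
        "search_query": f'all:"{term}"',
        "start": 0,
        "max_results": max_results,
    })
    url = f"http://export.arxiv.org/api/query?{query}"
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    for entry in root.iter(f"{ATOM}entry"):
        yield {
            "title": (entry.findtext(f"{ATOM}title") or "").strip(),
            "published": entry.findtext(f"{ATOM}published") or "",
        }

# Union the hits across terms, dedupe by title, apply the date cutoff.
candidates = {
    e["title"]: e
    for term in TERMS
    for e in arxiv_search(term)
    if e["published"] and e["published"][:10] <= CUTOFF
}
# Inclusion criteria (proposes or evaluates a hallucination-mitigation
# technique for generative LLMs, with empirical results) still require a
# manual or model-assisted screening pass over `candidates`.
```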

Circularity Check

0 steps flagged

No circularity: survey compiles external literature

full rationale

This is a literature survey paper that reviews and organizes over 32 existing hallucination mitigation techniques from external sources (e.g., RAG, CoVe, CoNLI). It introduces a taxonomy based on dataset utilization, tasks, feedback mechanisms, and retriever types, but this is presented as an organizational framework drawn from the reviewed works rather than any derivation, prediction, or fitted parameter. No equations, self-definitional claims, load-bearing self-citations, or reductions of results to the paper's own inputs exist. The central claims rest on compilation of prior literature, making the derivation chain self-contained with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey paper, the central contribution is the compilation and taxonomy of prior work. No free parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5639 in / 1022 out tokens · 29658 ms · 2026-05-15T19:10:59.545675+00:00 · methodology

discussion (0)


Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

    cs.CL 2026-05 unverdicted novelty 8.0

    REALISTA optimizes continuous combinations of valid editing directions in latent space to produce realistic adversarial prompts that elicit hallucinations more effectively than prior methods, including on large reason...

  2. Source or It Didn't Happen: A Multi-Agent Framework for Citation Hallucination Detection

    cs.CL 2026-05 accept novelty 7.0

    CiteTracer detects citation hallucinations at 97.1% accuracy on synthetic and real-world benchmarks by combining structured extraction, multi-source retrieval, deterministic matching, and class-specialist agents.

  3. VideoNet: A Large-Scale Dataset for Domain-Specific Action Recognition

    cs.CV 2026-05 unverdicted novelty 7.0

    VideoNet is a new large-scale benchmark and training dataset for domain-specific action recognition that exposes limitations in VLMs and enables smaller fine-tuned models to surpass larger open-weight ones.

  4. HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs

    cs.AR 2026-04 unverdicted novelty 7.0

    HAVEN combines LLM agents for planning and gap analysis with protocol-specific templates and a custom DSL to generate correct UVM testbenches, achieving 100% compilation success, 90.6% code coverage, and 87.9% functio...

  5. Uncertainty Propagation in LLM-Based Systems

    cs.SE 2026-04 unverdicted novelty 7.0

    This paper introduces a systems-level conceptual framing and a three-level taxonomy (intra-model, system-level, socio-technical) for uncertainty propagation in compound LLM applications, along with engineering insight...

  6. Context-Fidelity Boosting: Enhancing Faithful Generation through Watermark-Inspired Decoding

    cs.CL 2026-04 unverdicted novelty 7.0

    Context-Fidelity Boosting reduces faithfulness hallucinations by applying context-based logit boosts to source-supported tokens during LLM decoding.

  7. GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning

    cs.AI 2026-03 unverdicted novelty 7.0

    GraphScout trains LLMs to autonomously synthesize structured training data from knowledge graphs via flexible exploration tools, enabling a 4B model to outperform larger LLMs by 16.7% on average with fewer inference t...

  8. Common-agency Games for Multi-Objective Test-Time Alignment

    cs.GT 2026-05 unverdicted novelty 6.0

    CAGE uses common-agency games and an EPEC algorithm to compute equilibrium policies that balance multiple conflicting objectives for test-time LLM alignment.

  9. Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

    cs.CL 2026-05 unverdicted novelty 6.0

    DisAAD trains a 1%-sized proxy model via adversarial distillation to quantify uncertainty in black-box LLMs by aligning with their output distributions.

  10. Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs

    cs.CL 2026-04 unverdicted novelty 6.0

    Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.

  11. PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

    cs.CL 2026-04 unverdicted novelty 6.0

    PRISM benchmark disentangles LLM hallucinations into knowledge missing, knowledge errors, reasoning errors, and instruction-following errors across three generation stages, revealing trade-offs when testing 24 models.

  12. From Clues to Generation: Language-Guided Conditional Diffusion for Cross-Domain Recommendation

    cs.IR 2026-04 unverdicted novelty 6.0

    LGCD creates pseudo-overlapping user data via LLM reasoning and uses conditional diffusion to generate target-domain user representations for inter-domain sequential recommendation without real overlapping users.

  13. Beyond Precision: Importance-Aware Recall for Factuality Evaluation in Long-Form LLM Generation

    cs.CL 2026-04 unverdicted novelty 6.0

    An importance-aware recall metric for LLM factuality evaluation reveals models are better at avoiding false claims than covering all relevant facts.

  14. CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

    cs.CL 2026-02 unverdicted novelty 6.0

    CiteAudit supplies a human-validated benchmark and multi-agent verification system that outperforms existing LLMs and commercial tools at detecting hallucinated scientific references.

  15. Corrective Retrieval Augmented Generation

    cs.CL 2024-01 unverdicted novelty 6.0

    CRAG improves RAG robustness via a retrieval quality evaluator that triggers web augmentation and a decompose-recompose filter to focus on relevant information, yielding better results on short- and long-form generati...

  16. The Dynamic Gist-Based Memory Model (DGMM): A Memory-Centric Architecture for Artificial Intelligence

    cs.AI 2026-05 unverdicted novelty 3.0

    DGMM is proposed as an explicit graph-structured memory architecture for AI that enables persistent episodic memory, cue-based recall, and context-dependent interpretation without retraining.

  17. LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

    cs.CL 2024-12 accept novelty 3.0

    A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

  18. A Survey on the Memory Mechanism of Large Language Model based Agents

    cs.AI 2024-04 accept novelty 3.0

    A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.

  19. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications

    cs.AI 2024-02 unverdicted novelty 3.0

    A systematic survey categorizes prompt engineering methods for LLMs and VLMs by application area, summarizing methodologies, applications, models, datasets, strengths, and limitations for each technique along with a t...

Reference graph

Works this paper leans on

300 extracted references · 300 canonical work pages · cited by 19 Pith papers · 5 internal anchors

  1. [1] Alfred V. Aho and Jeffrey D. Ullman. 1972.
  2. [2] Publications Manual. 1983.
  3. [3] Ashok K. Chandra, Dexter C. Kozen, and Larry J. Stockmeyer. 1981. doi:10.1145/322234.322243
  4. [4] Galen Andrew and Jianfeng Gao. Scalable training of …
  5. [5] Dan Gusfield. 1997.
  6. [6] Mohammad Sadegh Rasooli and Joel R. Tetreault. 2015. Computing Research Repository.
  7. [7] Rie Kubota Ando and Tong Zhang. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data. Journal of Machine Learning Research.
  8. [8] A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation. 2023.
  9. [9] "Why is this misleading?": Detecting News Headline Hallucinations with Explanations. 2023.
  10. [10] Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models. 2023.
  11. [11] Augmenting LLMs with Knowledge: A survey on hallucination prevention. 2023.
  12. [12] Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations. 2023.
  13. [13] Chain-of-Verification Reduces Hallucination in Large Language Models. 2023.
  14. [14] Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. 2023.
  15. [15] Detecting and Mitigating Hallucinations in Multilingual Summarisation. 2023.
  16. [16] DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models. 2023.
  17. [17] Hallucination Reduction in Long Input Text Summarization. 2023.
  18. [18] Halo: Estimation and Reduction of Hallucinations in Open-Source Weak Large Language Models. 2023.
  19. [19] Lyra: Orchestrating Dual Correction in Automated Theorem Proving. 2023.
  20. [20] Making Large Language Models Perform Better in Knowledge Graph Completion. 2023.
  21. [21] Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis. 2023.
  22. [22] Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation. 2023.
  23. [23] Teaching Language Models to Hallucinate Less with Synthetic Tasks. 2023.
  24. [24] The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large Language Models. 2023.
  25. [25] Towards Mitigating Hallucination in Large Language Models via Self-Reflection. 2023.
  26. [26] Trusting Your Evidence: Hallucinate Less with Context-aware Decoding. 2023.
  27. [27] UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation. 2023.
  28. [28] Potsawee Manakul, Adian Liusie, and Mark J. F. Gales. SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.
  29. [29] How language model hallucinations can snowball. arXiv preprint arXiv:2305.13534.
  30. [30] Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. arXiv preprint arXiv:2309.01219.
  31. [31] Survey of hallucination in natural language generation. ACM Computing Surveys. 2023.
  32. [32] A survey of hallucination in large foundation models. arXiv preprint arXiv:2309.05922.
  33. [33] Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv preprint arXiv:2005.11401v4.
  34. [34] FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation. 2023.
  35. [35] Hallucination Augmented Recitations for Language Models. 2023.
  36. [36] A Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language Models. 2023.
  37. [37] Fine-Tuning Language Models for Factuality. Submitted to the Twelfth International Conference on Learning Representations.
  38. [38] Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey. 2023.
  39. [39] On What Basis? Predicting Text Preference Via Structured Comparative Reasoning. 2023.
  40. [40] Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification. 2023.
  41. [41] Think While You Write: Hypothesis Verification Promotes Faithful Knowledge-to-Text Generation. arXiv preprint arXiv:2311.09467.
  42. [42] R-Tuning: Teaching Large Language Models to Refuse Unknown Questions. arXiv preprint arXiv:2311.09677.
  43. [43] DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback. arXiv preprint arXiv:2311.10081.
  44. [44] Evgeniia Razumovskaia, Ivan Vulić, Pavle Marković, Tomasz Cichy, Qian Zheng, Tsung-Hsien Wen, and Paweł Budzianowski. arXiv:2311.09800.
  45. [45] Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. arXiv preprint arXiv:2306.03341.
  46. [46] RARR: Researching and revising what language models say, using language models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  47. [47] Prompting GPT-3 to be reliable. arXiv preprint arXiv:2210.09150.
  48. [48] Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models. 2023.
  49. [49] Fine-tuning Language Models for Factuality. 2023.
  50. [50] The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations. 2023.
  51. [51] Partha Pratim Ray. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. 2023. doi:10.1016/j.iotcps.2023.04.003
  52. [52] FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge. 2023.
  53. [53] GPT-4 is here: what scientists think. Nature. 2023.
  54. [54] Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation. arXiv preprint arXiv:2305.15852.
  55. [55] A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT. 2023.
  56. [56] Trapping LLM Hallucinations Using Tagged Context Prompts. 2023.
  57. [57] Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models. 2023.
  58. [58] ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. International Conference on Learning Representations.
  59. [59] DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
  60. [60] Self-Refine: Iterative Refinement with Self-Feedback. 2023.
  61. [61] Mateusz Lango and Ondrej Dusek. Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. doi:10.18653/v1/2023.emnlp-main.172
  62. [62] Instruction Tuning for Large Language Models: A Survey. 2023.
  63. [63] Self-Instruct: Aligning Language Models with Self-Generated Instructions. 2023.
  64. [64] Scaling Instruction-Finetuned Language Models. 2022.
  65. [65] OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization. 2023.
  66. [66] WizardLM: Empowering Large Language Models to Follow Complex Instructions. 2023.
  67. [67] Llama 2: Open Foundation and Fine-Tuned Chat Models. 2023.
  68. [68] Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. 2023.
  69. [69] Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, and Jingjing Liu. Large-Scale Adversarial Training for Vision-and-Language Representation Learning.
  70. [70] Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? AKA Will LLMs Replace Knowledge Graphs? arXiv preprint arXiv:2308.10168.
  71. [71] Enjoy the salience: Towards better transformer-based faithful explanations with word salience. arXiv preprint arXiv:2108.13759.
  72. [72] BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.
  73. [73] Hallucination Mitigation. Distilled AI.
  74. [74] Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  75. [75] Amrith Krishna, Ashim Gupta, Deepak Garasangi, Jeevnesh Sandhan, Pavankumar Satuluri, and Pawan Goyal. Neural Approaches for Data Driven Dependency Parsing in Sanskrit. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  76. [76] Jivnesh Sandhan, Om Adideva Paranjay, Komal Digumarthi, Laxmidhar Behra, and Pawan Goyal. Evaluating Neural Word Embeddings for Sanskrit. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  77. [77] Krishnan Sriram, Amba Kulkarni, and Gérard Huet. Validation and Normalization of DCS corpus and Development of the Sanskrit Heritage Engine's Segmenter. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  78. [78] Sarkar Sujoy, Amrith Krishna, and Pawan Goyal. Pre-annotation Based Approach for Development of a Sanskrit Named Entity Recognition Dataset. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  79. [79] Malay Maity, Sanjeev Panchal, and Amba Kulkarni. Disambiguation of Instrumental, Dative and Ablative Case suffixes in Sanskrit. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
  80. [80] A V S D S Mahesh and Arnab Bhattacharya. Creation of a Digital Rig Vedic Index (Anukramani) for Computational Linguistic Tasks. Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference. 2023.
Showing first 80 references.