pith. sign in

Hallucination augmented contrastive learn- ing for multimodal large language model

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

citation-role summary

background 2

citation-polarity summary

fields

cs.CV 4

roles

background 2

polarities

background 2

representative citing papers

Deep Pre-Alignment for VLMs

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.

A Survey on Multimodal Large Language Models

cs.CV · 2023-06-23 · accept · novelty 3.0

This survey organizes the architectures, training strategies, data, evaluation methods, extensions, and challenges of Multimodal Large Language Models.

citing papers explorer

Showing 4 of 4 citing papers.

  • Deep Pre-Alignment for VLMs cs.CV · 2026-05-14 · unverdicted · none · ref 114

    Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.

  • Hallucination of Multimodal Large Language Models: A Survey cs.CV · 2024-04-29 · accept · none · ref 79

    The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.

  • A Survey on Hallucination in Large Vision-Language Models cs.CV · 2024-02-01 · unverdicted · none · ref 22

    This survey reviews the definition, symptoms, evaluation benchmarks, root causes, and mitigation methods for hallucinations in large vision-language models.

  • A Survey on Multimodal Large Language Models cs.CV · 2023-06-23 · accept · none · ref 168

    This survey organizes the architectures, training strategies, data, evaluation methods, extensions, and challenges of Multimodal Large Language Models.