pith.

hub Canonical reference

Imagebind-llm: Multi-modality instruction tuning

Canonical reference. 71% of citing Pith papers cite this work as background.

17 Pith papers citing it
Background 71% of classified citations

hub tools

citation-role summary

background 4 · method 2 · baseline 1

citation-polarity summary

fields

cs.CV 15 · cs.AI 2


representative citing papers

Agentic AI in Remote Sensing: Foundations, Taxonomy, and Emerging Systems

cs.CV · 2026-01-05 · unverdicted · novelty 7.0

The paper delivers the first comprehensive review and unified taxonomy of agentic AI in remote sensing, covering single-agent copilots, multi-agent systems, planning mechanisms, benchmarks, and a roadmap while noting limitations in grounding and safety.

Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM

cs.CV · 2026-03-29 · unverdicted · novelty 6.0

Chat-Scene++ improves 3D scene understanding in multimodal LLMs by representing scenes as context-rich object sequences with identifier tokens and grounded chain-of-thought reasoning, reaching state-of-the-art on five benchmarks using pre-trained encoders.

A Survey on Multimodal Large Language Models

cs.CV · 2023-06-23 · accept · novelty 3.0

This survey organizes the architectures, training strategies, data, evaluation methods, extensions, and challenges of Multimodal Large Language Models.

citing papers explorer

Showing 15 of 15 citing papers after filters.