hub Canonical reference

Imagebind-llm: Multi-modality instruction tuning

Canonical reference. 71% of citing Pith papers cite this work as background.

17 Pith papers citing it
Background 71% of classified citations

citation-role summary

background 4 · method 2 · baseline 1

fields

cs.CV 15 · cs.AI 2

representative citing papers

Agentic AI in Remote Sensing: Foundations, Taxonomy, and Emerging Systems

cs.CV · 2026-01-05 · unverdicted · novelty 7.0

The paper delivers the first comprehensive review and unified taxonomy of agentic AI in remote sensing, covering single-agent copilots, multi-agent systems, planning mechanisms, benchmarks, and a roadmap while noting limitations in grounding and safety.

Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM

cs.CV · 2026-03-29 · unverdicted · novelty 6.0

Chat-Scene++ improves 3D scene understanding in multimodal LLMs by representing scenes as context-rich object sequences with identifier tokens and grounded chain-of-thought reasoning, reaching state-of-the-art on five benchmarks using pre-trained encoders.

A Survey on Multimodal Large Language Models

cs.CV · 2023-06-23 · accept · novelty 3.0

This survey organizes the architectures, training strategies, data, evaluation methods, extensions, and challenges of Multimodal Large Language Models.

citing papers explorer

Showing 4 of 4 citing papers after filters.