Vision as LoRA

Haotian Wang, Yongjie Ye, Bingning Li, Yicheng Nie, Jinghui Lu, Jingqun Tang, Yaxing Wang, Can Huang · 2025 · arXiv 2503.20680

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

The Override Gap: A Magnitude Account of Knowledge Conflict Failure in Hypernetwork-Based Instant LLM Adaptation

cs.LG · 2026-04-26 · conditional · novelty 6.0 · 2 refs

Knowledge conflicts in hypernetwork LLM adaptation stem from constant adapter margins losing to frequency-dependent pretrained margins; selective layer boosting and conflict-aware triggering raise deep-conflict accuracy to 71-72.5% on Gemma-2B and Mistral-7B.

Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

cs.CV · 2026-04-20 · unverdicted · novelty 6.0 · 2 refs

OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

cs.CV · 2026-05-12 · unverdicted · novelty 5.0

SenseNova-U1 presents native unified multimodal models that match top understanding VLMs while delivering strong performance in image generation, infographics, and interleaved tasks via the NEO-unify architecture.

Multi-Branch Non-Homogeneous Image Dehazing via Concentration Partitioning and Image Fusion

cs.CV · 2026-04-27 · unverdicted · novelty 5.0

CPIFNet decomposes non-homogeneous dehazing into multiple homogeneous sub-problems via specialized IENet branches trained on different haze concentrations, then uses IFNet to fuse advantageous regions through deep feature merging.

Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction

cs.CV · 2026-04-03 · unverdicted · novelty 5.0

A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.

Image Classification via Random Dilated Convolution with Multi-Branch Feature Extraction and Context Excitation

cs.CV · 2026-04-28 · unverdicted · novelty 3.0

RDCNet reports state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof by combining random dilated convolutions with multi-branch and attention modules.

citing papers explorer

Showing 6 of 6 citing papers.

The Override Gap: A Magnitude Account of Knowledge Conflict Failure in Hypernetwork-Based Instant LLM Adaptation cs.LG · 2026-04-26 · conditional · none · ref 34 · 2 links
Knowledge conflicts in hypernetwork LLM adaptation stem from constant adapter margins losing to frequency-dependent pretrained margins; selective layer boosting and conflict-aware triggering raise deep-conflict accuracy to 71-72.5% on Gemma-2B and Mistral-7B.
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation cs.CV · 2026-04-20 · unverdicted · none · ref 99 · 2 links
OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture cs.CV · 2026-05-12 · unverdicted · none · ref 131
SenseNova-U1 presents native unified multimodal models that match top understanding VLMs while delivering strong performance in image generation, infographics, and interleaved tasks via the NEO-unify architecture.
Multi-Branch Non-Homogeneous Image Dehazing via Concentration Partitioning and Image Fusion cs.CV · 2026-04-27 · unverdicted · none · ref 59
CPIFNet decomposes non-homogeneous dehazing into multiple homogeneous sub-problems via specialized IENet branches trained on different haze concentrations, then uses IFNet to fuse advantageous regions through deep feature merging.
Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction cs.CV · 2026-04-03 · unverdicted · none · ref 57
A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.
Image Classification via Random Dilated Convolution with Multi-Branch Feature Extraction and Context Excitation cs.CV · 2026-04-28 · unverdicted · none · ref 51
RDCNet reports state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof by combining random dilated convolutions with multi-branch and attention modules.

Vision as LoRA

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer