DeRF: Decomposed Radiance Fields

Guang Feng, Lihe Zhang, Zhiwei Hu · 2021 · arXiv 6437.2021

Mixed citation behavior. Most common role is background (67%).

136 Pith papers citing it

Background 67% of classified citations

citation-role summary

background 23 · dataset 5 · method 5 · baseline 3

citation-polarity summary

background 24 · use dataset 5 · baseline 3 · use method 3 · unclear 1

authors

Guang Feng, Lihe Zhang, Zhiwei Hu, and Huchuan Lu

representative citing papers

WildBox: A Dataset and Benchmark for Aerial Monocular 3D Detection of African Savanna Wildlife

cs.CV · 2026-06-19 · unverdicted · novelty 8.0

WildBox provides over 237k 3D wildlife annotations from drone video. Its benchmarks show zero-shot 3D detection at 0 AP and fine-tuned performance of 8.68 AP-BEV and 13.17 AP3D, with depth estimation causing most errors.

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

cs.CL · 2024-09-04 · accept · novelty 8.0

MMMU-Pro is a stricter multimodal benchmark that removes text-only solvable questions, augments options, and requires reading text from images, yielding substantially lower model scores of 16.8-26.9%.

TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

cs.LG · 2022-07-05 · conditional · novelty 8.0

TabPFN is a Prior-Data Fitted Network that approximates Bayesian inference for small tabular classification. A Transformer is trained once on synthetic data drawn from a causal prior and then solves new tasks in a single forward pass without further updates.

4DVLT: Dynamic Scene Understanding with Worldline-Centered Vision-Language Tracking

cs.CV · 2026-06-21 · conditional · novelty 7.0 · 2 refs

The paper defines the 4DVLT task for worldline-centered 4D scene understanding, releases Instruct-4D with 129.4K QA pairs, and presents 4DTrack achieving 62.68 TGA_Top1, outperforming adapted baselines by 19.62 points.

TopoCap: Learning Topology-Agnostic Motion Priors for Monocular Video-to-Animation

cs.CV · 2026-06-10 · unverdicted · novelty 7.0

A two-stage generative model (Graph CVAE + flow matching) learns topology-agnostic motion codes from a new 5k-topology dataset and retargets video motion to arbitrary unseen skeletons.

SpikeTAD: Spiking Neural Networks for End-to-End Temporal Action Detection

cs.CV · 2026-06-10 · unverdicted · novelty 7.0

SpikeTAD proposes the first SNN-based end-to-end TAD model, reporting 67.2% mAP on THUMOS14 and 37.42% on ActivityNet-1.3 with extremely low power consumption.

Mind the Gap: Disentangling Performance Bottlenecks in Video Instance Segmentation

cs.CV · 2026-06-05 · unverdicted · novelty 7.0

An ILP-based oracle applied to seven VIS methods on YouTube-VIS and OVIS shows tracking instability as the dominant bottleneck, producing gaps exceeding 20 AP under occlusion while classification impact is secondary.

Attribution via Distributional Paths for Information Revelation

cs.LG · 2026-06-02 · unverdicted · novelty 7.0

Reveal-IG performs path attribution by integrating model output changes along trajectories in a space of probe distributions rather than input-space paths, retaining completeness and handling multiscale or uncertain features.

Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

cs.CV · 2026-06-01 · unverdicted · novelty 7.0

A new quality-guided approach for semi-supervised medical image segmentation that trains a predictor on synthetic errors to enhance pseudolabel handling.

Category-Level 3D Correspondence in Camera Space via Morphable Object Priors

cs.CV · 2026-05-27 · unverdicted · novelty 7.0

Morpheus learns morphable category-level shape priors to produce implicit 3D correspondences in camera space without explicit supervision and releases the HouseCorr3D benchmark with amodal and symmetry annotations.

ClothTransformer: Unified Latent-Space Transformers for Scalable Cloth Simulation

cs.GR · 2026-05-27 · unverdicted · novelty 7.0

ClothTransformer is a unified latent-space Transformer for cloth simulation that handles body-driven garments, robotic manipulation, and free-fall collisions in one model with 4-9x lower error than prior methods and mesh-resolution independence.

DARE-EEG: A Foundation Model for Mining Dual-Aligned Representation of EEG

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

DARE-EEG is a self-supervised EEG foundation model that enforces mask-invariance via contrastive mask alignment and momentum anchor alignment, plus conv-linear-probing for heterogeneous setups, achieving SOTA accuracy and cross-dataset portability.

Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

A rule-based strikingness measure is added to TKGR metrics to weight rare events higher, revealing that models weaken on striking events and ensemble gains come mostly from trivial fits.

Perception Without Engagement: Dissecting the Causal Discovery Deficit in LMMs

cs.CL · 2026-05-10 · unverdicted · novelty 7.0

LMMs perceive videos but underexploit visual content for causal reasoning due to textual shortcuts; ProCauEval diagnoses this and ADPO training reduces reliance on priors.

TripVVT: A Large-Scale Triplet Dataset and a Coarse-Mask Baseline for In-the-Wild Video Virtual Try-On

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

A new large-scale triplet dataset and diffusion transformer model using coarse human masks deliver improved video virtual try-on quality and generalization in challenging real-world conditions.

Linguistically Informed Multimodal Fusion for Vietnamese Scene-Text Image Captioning: Dataset, Graph Framework, and Phonological Attention

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

Introduces ViTextCaps dataset and PhonoSTFG phonological graph fusion framework for Vietnamese scene-text image captioning, showing cross-modal graph edges harm performance.

Too Sharp, Too Sure: When Calibration Follows Curvature

cs.LG · 2026-04-22 · unverdicted · novelty 7.0

Calibration error tracks curvature via shared margin-dependent exponential tails; a margin-aware objective improves out-of-sample calibration across optimizers.

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

cs.CV · 2026-04-21 · unverdicted · novelty 7.0

DHCNet improves ultra-fine-grained visual categorization by progressively building holistic cognition from local discrepancies using self-shuffling and refinement on limited data.

Towards Symmetry-sensitive Pose Estimation: A Rotation Representation for Symmetric Object Classes

cs.CV · 2026-04-20 · unverdicted · novelty 7.0 · 3 refs

SARR modifies trigonometric rotation encodings with object symmetry orders to produce unique continuous poses, enabling standard CNNs to outperform existing methods on symmetry-aware 6D pose estimation without custom losses or 3D models.

Orthogonal Transformations for Efficient Data-Driven Reachability Analysis

eess.SY · 2026-04-15 · unverdicted · novelty 7.0

Orthogonal transformations before order reduction in matrix zonotopes produce order-of-magnitude smaller reachable set volumes while keeping generator counts comparable.

FRTSearch: Unified Detection and Parameter Inference of Fast Radio Transients using Instance Segmentation

astro-ph.IM · 2026-04-14 · unverdicted · novelty 7.0

FRTSearch reframes fast radio transient detection as instance segmentation on dynamic spectra and uses the segmented shapes to infer dispersion measure and time of arrival, achieving 98% recall with over 99.9% fewer false positives than traditional methods.

Learning to Build Shapes by Extrusion

cs.GR · 2026-01-30 · unverdicted · novelty 7.0

Text Encoded Extrusions (TEE) lets LLMs generate and edit manifold 3D meshes by learning sequences of face extrusions from decomposed quadrilateral meshes.

Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models

cs.CV · 2025-12-26 · conditional · novelty 7.0

BadVSFM is the first effective backdoor attack on prompt-driven video segmentation foundation models, using a two-stage encoder-decoder strategy to achieve high attack success rates with limited clean performance loss.

Unmasking Puppeteers: Leveraging Biometric Leakage to Expose Impersonation in AI-Based Videoconferencing

cs.CV · 2025-10-03 · unverdicted · novelty 7.0

A pose-conditioned large-margin contrastive encoder isolates persistent biometric identity cues from transmitted latents in talking-head videoconferencing to flag impersonation attacks via cosine similarity without inspecting the output video.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Linguistically Informed Multimodal Fusion for Vietnamese Scene-Text Image Captioning: Dataset, Graph Framework, and Phonological Attention

cs.CV · 2026-04-30 · unverdicted · none · ref 39

Introduces ViTextCaps dataset and PhonoSTFG phonological graph fusion framework for Vietnamese scene-text image captioning, showing cross-modal graph edges harm performance.

Self-organized MT Direction Maps Emerge from Spatiotemporal Contrastive Optimization

q-bio.NC · 2026-05-12 · unverdicted · none · ref 29

Direction maps and pinwheel structures in MT emerge spontaneously when a spatiotemporal deep network is trained on videos with contrastive self-supervised learning and spatial regularization.

Introducing Environmental Constraints to Grasping Strategies for Paper-Like Flexible Materials Using a Soft Gripper

cs.RO · 2026-05-12 · unverdicted · none · ref 56

Systematic grasping strategies for paper-like materials are developed and tested with a soft gripper by exploiting environmental constraints to improve force control and success rates.

A Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation

cs.CV · 2026-05-16 · unreviewed · ref 61 · 2 links
