Canonical reference

Title resolution pending

Choy, C · 2019 · DOI 10.1109/cvpr

Canonical reference. 82% of citing Pith papers cite this work as background.

54 Pith papers citing it

Background: 82% of classified citations

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 8 · dataset 2 · baseline 1

citation-polarity summary

background 9 · baseline 1 · use dataset 1

representative citing papers

Rolling Shutter Relative Pose Estimation Made Practical

cs.CV · 2026-06-25 · conditional · novelty 8.0

A linearized solver estimates rolling-shutter relative pose and motion from 7 affine correspondences in 1.2 ms and reports best-in-benchmark accuracy plus usable translational velocity.

From Phase to Phenomenon: Self-Supervised Learning of Subsurface Scattering with Minimal Phase-shift Inputs

cs.CV · 2026-06-28 · unverdicted · novelty 7.0

A self-supervised method pretrains an encoder on eight PSP images per view to learn generalizable subsurface scattering representations that transfer to relighting and dense footprint reconstruction on unseen complex objects.

4DVLT: Dynamic Scene Understanding with Worldline-Centered Vision-Language Tracking

cs.CV · 2026-06-21 · conditional · novelty 7.0

The paper defines the 4DVLT task for worldline-centered 4D scene understanding, releases Instruct-4D with 129.4K QA pairs, and presents 4DTrack achieving 62.68 TGA_Top1, outperforming adapted baselines by 19.62 points.

WHU-Infra3D: A Full-stack Multi-modal Dataset and Benchmark for 3D Roadside Infrastructure Inventory

cs.CV · 2026-06-03 · unverdicted · novelty 7.0

WHU-Infra3D is a new large-scale multi-modal dataset and benchmark for 3D roadside infrastructure inventory, providing over 175k 2D boxes, thousands of 3D instances, and 181k annotations across five core tasks while exposing cross-city gaps and long-tailed defect vulnerabilities.

A Systematic Benchmark of Intraoperative Ultrasound-to-MR Synthesis for Brain Tumour Surgery

cs.CV · 2026-05-30 · conditional · novelty 7.0

On the public ReMIND dataset, a systematic benchmark of six synthesis models across 48 experiments finds LPIPS correlates with downstream segmentation utility while SSIM does not, with SynDiff-2.5D performing best.

Urban-ImageNet: A Large-Scale Multi-Modal Dataset and Evaluation Framework for Urban Space Perception

cs.CV · 2026-05-11 · unverdicted · novelty 7.0

Urban-ImageNet is a 2-million-image multi-modal dataset with the 10-class HUSIC taxonomy, enabling benchmarks for urban scene classification, cross-modal retrieval, and instance segmentation.

Projection-Free Transformers via Gaussian Kernel Attention

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

Gaussian Kernel Attention replaces learned QKV projections with a Gaussian RBF kernel on per-head token features, using 0.42x parameters and 0.49x FLOPs while showing competitive language modeling performance at depth 20.

Differentially Private Contrastive Learning via Bounding Group-level Contribution

cs.CR · 2026-04-29 · unverdicted · novelty 7.0

DP-GCL improves differentially private contrastive learning by bounding group-level contributions through batch partitioning and intra-group augmentation, delivering 5.6% higher image classification accuracy and 20.1% higher retrieval accuracy than existing approaches.

AttentionBender: Manipulating Cross-Attention in Video Diffusion Transformers as a Creative Probe

cs.MM · 2026-04-22 · unverdicted · novelty 7.0

AttentionBender applies 2D transforms to cross-attention maps in video diffusion transformers, producing distributed distortions and glitch aesthetics that reveal entangled attention mechanisms while serving as both an XAI probe and creative tool.

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

cs.CV · 2026-04-21 · unverdicted · novelty 7.0

DHCNet improves ultra-fine-grained visual categorization by progressively building holistic cognition from local discrepancies using self-shuffling and refinement on limited data.

Navig-AI-tion: Navigation by Contextual AI and Spatial Audio

cs.HC · 2026-03-13 · unverdicted · novelty 7.0

A system combining VLM landmark instructions with real-time corrective spatial audio reduces route deviations in a small user study compared to VLM-only and Google Maps audio baselines.

MobileMold: A Smartphone-Based Microscopy Dataset for Food Mold Detection

cs.CV · 2026-03-02 · unverdicted · novelty 7.0

MobileMold provides 4941 smartphone microscopy images and shows deep learning models reach 99.5% accuracy on mold detection and food classification tasks.

Accelerating Inference for Multilayer Neural Networks with Quantum Computers

quant-ph · 2025-10-08 · unverdicted · novelty 7.0

Quantum circuits for coherent multilayer neural network inference achieve quadratic to polylogarithmic speedups over classical methods depending on quantum data access models for inputs and weights.

MultiMat: Multimodal Program Synthesis for Procedural Materials using Large Multimodal Models

cs.CV · 2025-09-26 · unverdicted · novelty 7.0

MultiMat shows multimodal large models plus constrained search produce higher-quality procedural material graphs than text-only baselines on a new production dataset.

Anomaly Factory 3D: A Modular Framework for Diverse Pseudo-Anomaly Synthesis in Unsupervised 3D Anomaly Detection

cs.CV · 2026-06-28 · unverdicted · novelty 6.0

AF3AD is a modular synthesis framework using center-conditioned parametric deformations in local PCA frames to create diverse pseudo-anomalies, improving unsupervised 3D anomaly detection on AnomalyShapeNet and Real3D-AD.

Flowing With Purpose: Latent Action Guided Flow Matching Policies For Robotic Manipulation

cs.RO · 2026-06-22 · unverdicted · novelty 6.0

LAFM adapts the source distribution in flow matching policies via a latent action model to better match fragmented robotic action spaces, claiming 23.4% higher real-world success and a 10.4% gain on LIBERO-90 while beating larger pre-trained models.

Radial Basis Function Networks as Projection Heads in Self-Supervised Learning

cs.CV · 2026-06-19 · unverdicted · novelty 6.0

RBFN projection heads serve as competitive replacements for MLP heads in SSL and enable SNS, a label-free metric from RBF parameters that correlates strongly with logistic regression evaluation.

FATE: Pillar Encoding and Frequency-Aware Training for Event-Based Object Detection

cs.CV · 2026-06-15 · unverdicted · novelty 6.0

FATE combines pillar encoding via orthogonal polynomial basis with frequency-aware training to enable event-based object detection at up to 200 Hz without internal temporal sub-binning.

Jaguar: Fast Private CNN Inference with Power-of-Two Homomorphic Arithmetic

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

Jaguar replaces prime-modulus HE with power-of-two arithmetic to enable coefficient-domain convolution and local-shift truncation, reporting 2-3.7x lower latency than Cheetah and Rhombus on ResNet-18/50 and MobileNetV2.

Tetris: Tile-level Sampling for Efficient and High-Fidelity Video Object Tracking

cs.CV · 2026-05-25 · unverdicted · novelty 6.0

Tetris decomposes stationary videos into tile polyominoes and applies a classifier plus ILP pruning to cut detector calls, staying within 5% accuracy loss while delivering up to 17.4x throughput gains over prior methods.

Model Merging: Foundations and Algorithms

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.

Neighbor2Inverse: Self-Supervised Denoising for Low-Dose Region-of-Interest Phase Contrast CT

cs.CV · 2026-05-01 · unverdicted · novelty 6.0

Neighbor2Inverse adapts the Neighbor2Neighbor principle to train a denoising network directly in the image domain for low-dose PBI-CT by using independently noised subsampled projections.

Remote SAMsing: From Segment Anything to Segment Everything

cs.CV · 2026-04-30 · conditional · novelty 6.0

The Remote SAMsing pipeline boosts SAM2 coverage on remote sensing scenes from 30-68% to 91-98% via multi-pass masking and boundary-aware merging while preserving mask quality.

Threat-Oriented Digital Twinning for Security Evaluation of Autonomous Platforms

cs.CR · 2026-04-28 · unverdicted · novelty 6.0

A threat-oriented digital twinning methodology and an open-source modular twin are introduced for security evaluation of autonomous platforms, translating threat analysis into controllable tests for spoofing, replay, and adversarial ML attacks.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks

cs.AI · 2026-03-12 · unverdicted · none · ref 77

Introduces Explicit Logic Channel (ELC) with LLM, VFM and probabilistic inference for validating, selecting and enhancing MLLMs on zero-shot tasks using Consistency Rate and cross-channel integration.

A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting

cs.AI · 2026-06-02 · unverdicted · none · ref 10

A learned linear activation bridge achieves high alignment (cosine ~0.97) between Pythia-160M and Pythia-410M states but produces no improvement in downstream multi-hop answering when injected into the receiver.
