hub

author Tang, Y

Jiawei Chen, Chiu Man Ho · 2022 · arXiv 1458.2022

13 Pith papers cite this work. Polarity classification is still indexing.

13 Pith papers citing it

read on arXiv browse 13 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 3 dataset 1

citation-polarity summary

background 4

representative citing papers

Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET

eess.IV · 2024-06-18 · unverdicted · novelty 7.0

Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.

Evaluating Object Hallucination in Large Vision-Language Models

cs.CV · 2023-05-17 · accept · novelty 7.0

Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.

Bengal-HP_RU: A Dataset of Bengal People For Head Pose Estimation

cs.CV · 2026-06-23 · unverdicted · novelty 6.0 · 2 refs

Bengal-HP_RU is the first publicly available head pose dataset for Bengali subjects, with 12,894 images collected from Wikimedia Commons and partitioned by uploader identity.

Controllable Texture Tiling with Transformed RoPE-Enhanced Diffusion Models

cs.GR · 2026-06-22 · unverdicted · novelty 6.0

A Diffusion Transformer framework applies coordinate-transformed RoPE and disjoint attention masks to achieve controllable, high-fidelity texture tiling that preserves reference structure and scene lighting.

Enhancing Multilingual Reasoning via Steerable Model Merging

cs.CL · 2026-06-17 · unverdicted · novelty 6.0

ST-Merge uses gated cross-attention to adaptively weight source models during merging, outperforming baselines on multilingual reasoning tasks across 21 languages.

MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models

cs.CV · 2026-06-04 · unverdicted · novelty 6.0

MS-DKC is a dataset knowledge card framework that maps image, morphology, supervision, context, and risk descriptors to design priors and failure modes, shown to produce dataset-specific model adaptations with improved metrics on DRIVE, ISIC2018, and ACDC.

AdaCodec: A Predictive Visual Code for Video MLLMs

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

AdaCodec introduces a predictive visual code that cuts visual token use in video MLLMs by sending full frames only on high predictive cost and otherwise encoding inter-frame changes as P-tokens, yielding better benchmark scores at lower budgets.

Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators

cs.RO · 2026-04-14 · unverdicted · novelty 6.0

A framework trains keypoint detectors on inpainted markerless robot images and uses runtime inpainting plus UKF for robust vision-based control without models or calibration.

Rotary Masked Autoencoders are Versatile Learners

cs.LG · 2025-05-26 · unverdicted · novelty 6.0

RoMAE applies rotary positional embeddings to masked autoencoders to enable representation learning and interpolation on continuous positional data across irregular time-series, images, and audio without modality-specific modifications.

Out-of-Distribution Generalization in Time Series: A Survey

cs.LG · 2025-03-18 · unverdicted · novelty 5.0

This is the first comprehensive survey of OOD generalization methodologies for time series, organized across data distribution, representation learning, and OOD evaluation.

Autonomous Unmanned Aircraft Systems for Enhanced Search and Rescue of Drowning Swimmers: Image-Based Localization and Mission Simulation

cs.CV · 2026-04-20 · unverdicted · novelty 4.0

A UAS with YOLO-based swimmer detection and DES simulations reduces drowning rescue response time by a factor of five versus standard operations in tested lake areas.

PaliGemma: A versatile 3B VLM for transfer

cs.CV · 2024-07-10 · unverdicted · novelty 4.0

PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.

SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions

cs.LG · 2026-05-12 · accept · novelty 3.0

NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.

citing papers explorer

Showing 13 of 13 citing papers.

Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET eess.IV · 2024-06-18 · unverdicted · none · ref 27
Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.
Evaluating Object Hallucination in Large Vision-Language Models cs.CV · 2023-05-17 · accept · none · ref 7
Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.
Bengal-HP_RU: A Dataset of Bengal People For Head Pose Estimation cs.CV · 2026-06-23 · unverdicted · none · ref 52 · 2 links
Bengal-HP_RU is the first publicly available head pose dataset for Bengali subjects, with 12,894 images collected from Wikimedia Commons and partitioned by uploader identity.
Controllable Texture Tiling with Transformed RoPE-Enhanced Diffusion Models cs.GR · 2026-06-22 · unverdicted · none · ref 13
A Diffusion Transformer framework applies coordinate-transformed RoPE and disjoint attention masks to achieve controllable, high-fidelity texture tiling that preserves reference structure and scene lighting.
Enhancing Multilingual Reasoning via Steerable Model Merging cs.CL · 2026-06-17 · unverdicted · none · ref 31
ST-Merge uses gated cross-attention to adaptively weight source models during merging, outperforming baselines on multilingual reasoning tasks across 21 languages.
MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models cs.CV · 2026-06-04 · unverdicted · none · ref 7
MS-DKC is a dataset knowledge card framework that maps image, morphology, supervision, context, and risk descriptors to design priors and failure modes, shown to produce dataset-specific model adaptations with improved metrics on DRIVE, ISIC2018, and ACDC.
AdaCodec: A Predictive Visual Code for Video MLLMs cs.CV · 2026-06-01 · unverdicted · none · ref 44
AdaCodec introduces a predictive visual code that cuts visual token use in video MLLMs by sending full frames only on high predictive cost and otherwise encoding inter-frame changes as P-tokens, yielding better benchmark scores at lower budgets.
Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators cs.RO · 2026-04-14 · unverdicted · none · ref 11
A framework trains keypoint detectors on inpainted markerless robot images and uses runtime inpainting plus UKF for robust vision-based control without models or calibration.
Rotary Masked Autoencoders are Versatile Learners cs.LG · 2025-05-26 · unverdicted · none · ref 30
RoMAE applies rotary positional embeddings to masked autoencoders to enable representation learning and interpolation on continuous positional data across irregular time-series, images, and audio without modality-specific modifications.
Out-of-Distribution Generalization in Time Series: A Survey cs.LG · 2025-03-18 · unverdicted · none · ref 27
This is the first comprehensive survey of OOD generalization methodologies for time series, organized across data distribution, representation learning, and OOD evaluation.
Autonomous Unmanned Aircraft Systems for Enhanced Search and Rescue of Drowning Swimmers: Image-Based Localization and Mission Simulation cs.CV · 2026-04-20 · unverdicted · none · ref 11
A UAS with YOLO-based swimmer detection and DES simulations reduces drowning rescue response time by a factor of five versus standard operations in tested lake areas.
PaliGemma: A versatile 3B VLM for transfer cs.CV · 2024-07-10 · unverdicted · none · ref 95
PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.
SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions cs.LG · 2026-05-12 · accept · none · ref 110
NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.

author Tang, Y

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer