pith. machine review for the scientific record.

arxiv: 2602.18047 · v3 · submitted 2026-02-20 · 💻 cs.CV · cs.LG

Recognition: 2 theorem links · Lean Theorem

CityGuard: Graph-Aware Private Descriptors for Bias-Resilient Identity Search Across Urban Cameras

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 21:11 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords person re-identification · differential privacy · graph attention · urban surveillance · privacy-preserving retrieval · cross-view alignment · transformer · metric learning

The pith

CityGuard combines adaptive metrics, coarse-geometry graph attention, and differential privacy to create robust private descriptors for person re-identification across urban camera networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CityGuard as a topology-aware transformer that performs privacy-preserving identity retrieval without sharing raw imagery across distributed city cameras. It tackles viewpoint changes, occlusions, and domain shifts through a dispersion-adaptive metric learner that tightens intra-class clusters, a spatially conditioned attention layer that feeds rough location data such as GPS into graph self-attention for consistent cross-view alignment, and differentially private embedding maps paired with compact indexes. These elements together produce descriptors that remain effective under real-world appearance variation while allowing tunable privacy guarantees enforced by rigorous differential-privacy accounting. Experiments on Market-1501 and other benchmarks report higher retrieval precision and faster query throughput than prior methods, indicating the designs support practical deployment in privacy-regulated surveillance settings.

Core claim

CityGuard is a topology-aware transformer for privacy-preserving identity retrieval in decentralized surveillance that integrates three components: a dispersion-adaptive metric learner that adjusts instance-level margins according to feature spread to increase intra-class compactness; spatially conditioned attention that injects coarse geometric priors such as GPS or deployment floor plans into graph-based self-attention to enable projectively consistent cross-view alignment without survey-grade calibration; and differentially private embedding maps coupled with compact approximate indexes. Together these designs produce descriptors robust to viewpoint variation, occlusion, and domain shifts.

What carries the argument

Spatially conditioned attention that injects coarse geometric priors into graph-based self-attention to achieve projectively consistent cross-view alignment using only GPS or floor-plan data.
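The paper supplies no equations for this layer, but one plausible reading (consistent with the geometric bias term Bgeom in Figures 4 and 5) adds a distance-derived bias to the standard attention logits. The sketch below is an assumption about the mechanism, not the authors' implementation; the function name, the linear form of the bias, and the `alpha` scale are all illustrative.

```python
import numpy as np

def geometry_biased_attention(Q, K, V, positions, alpha=1.0):
    """One plausible form of spatially conditioned attention:
    scaled dot-product logits plus a bias B_geom that penalizes
    attention between physically distant cameras (illustrative)."""
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)                      # visual-similarity term
    # Pairwise distances between coarse camera positions (e.g. GPS).
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    B_geom = -alpha * dist                             # closer cameras -> larger bias
    logits = logits + B_geom
    # Numerically stable row-wise softmax.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    A = weights / weights.sum(axis=-1, keepdims=True)
    return A @ V, A

rng = np.random.default_rng(0)
n, d = 4, 8
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
positions = rng.uniform(0, 100, size=(n, 2))           # coarse GPS-like coordinates
out, A = geometry_biased_attention(Q, K, V, positions)
```

With `alpha = 0` this reduces to ordinary self-attention, which matches the paper's framing of the geometric term as an additive conditioning signal rather than a replacement for visual similarity.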

If this is right

  • Descriptors gain robustness to viewpoint variation, occlusion, and domain shifts.
  • Privacy and utility can be balanced in a tunable way under rigorous differential-privacy accounting.
  • Retrieval precision improves on Market-1501 and additional public benchmarks.
  • Query throughput rises through the use of compact approximate indexes.
  • Secure deployment becomes feasible for decentralized urban surveillance networks.
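The paper does not specify how its embedding maps are privatized. A common construction that fits the description is the Gaussian mechanism on norm-clipped embeddings, sketched below with the classical calibration σ = clip · √(2 ln(1.25/δ))/ε (valid for ε ≤ 1). Every name and parameter here is illustrative, not taken from the paper.

```python
import numpy as np

def privatize_embeddings(E, eps, delta, clip=1.0, rng=None):
    """Gaussian-mechanism sketch: clip each row to L2 norm <= clip
    (so one person's embedding has bounded sensitivity, an assumption),
    then add noise calibrated to (eps, delta)-differential privacy."""
    rng = rng or np.random.default_rng(0)
    norms = np.linalg.norm(E, axis=1, keepdims=True)
    E_clipped = E * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Classical analytic calibration for the Gaussian mechanism, eps <= 1.
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return E_clipped + rng.normal(scale=sigma, size=E.shape)

E = np.random.default_rng(1).normal(size=(5, 16))
E_priv = privatize_embeddings(E, eps=0.5, delta=1e-5)
```

The tunable privacy-utility balance the paper claims would then correspond to sweeping `eps`: smaller values give stronger guarantees and noisier, less discriminative descriptors.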

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework may allow rapid rollout of identity-matching systems in cities that already have basic GPS or floor-plan data but lack precise camera calibration.
  • Similar graph-attention designs with coarse geometry could apply to other multi-camera tasks such as vehicle tracking or crowd flow analysis.
  • If the coarse-prior approach generalizes, it reduces the cost barrier for adding new cameras to existing networks without recalibrating the entire system.
  • The combination of adaptive margins and private embeddings suggests a path toward descriptors that remain useful even when training data are heavily noised for stronger privacy.

Load-bearing premise

Coarse geometric priors such as GPS or deployment floor plans are sufficient to produce projectively consistent cross-view alignment inside graph-based self-attention without survey-grade calibration.

What would settle it

If retrieval accuracy fell below that of non-graph baselines whenever the supplied GPS or floor-plan priors contained errors larger than typical urban positioning noise, the coarse-prior assumption would be shown not to hold.
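The mechanism behind that test can be sketched as a toy sweep: inject increasing position error into the priors and check when geometry-based camera ranking stops matching the true topology. The scales, camera count, and trial count below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n_cam = 20
true_pos = rng.uniform(0, 500, size=(n_cam, 2))    # metres; scale is illustrative

def geo_score_accuracy(noise_m, trials=200):
    """Fraction of queries for which the nearest camera ranked from
    noisy positions matches the truly nearest camera."""
    hits = 0
    for _ in range(trials):
        noisy = true_pos + rng.normal(scale=noise_m, size=true_pos.shape)
        q = rng.integers(n_cam)
        d_true = np.linalg.norm(true_pos - true_pos[q], axis=1)
        d_true[q] = np.inf                          # exclude the query camera
        d_noisy = np.linalg.norm(noisy - noisy[q], axis=1)
        d_noisy[q] = np.inf
        hits += int(d_noisy.argmin() == d_true.argmin())
    return hits / trials

# Sweep prior error from well under to far beyond typical urban GPS noise (~5-10 m).
accuracy = {noise: geo_score_accuracy(noise) for noise in (0, 5, 50, 500)}
```

The settling criterion would then read off the noise level at which this geometric ranking signal, and with it the graph model's advantage over non-graph baselines, collapses.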

Figures

Figures reproduced from arXiv: 2602.18047 by Jiaxuan Lu, Jia Yee Tan, Jiekai Wu, Rong Fu, Rui Lu, Simon Fong, Yibo Meng, Zhaolu Kang.

Figure 1: Overview of the CityGuard framework for bias-resilient, privacy-preserving identity search. The process begins with Topology-Aware Geometry Encoding, where camera coordinates and rotations are mapped to a spatial adjacency graph. The Geometry-Conditioned Backbone then fuses multi-scale features and refines them through a Temporal Graph Network (TGN) to capture cross-camera motion cues. Centrally, the Dispe…
Figure 2: Camera topology (GPS only): top-down 2D layout of camera nodes with edge thickness encoding the …
Figure 3: Camera topology (GPS + Rotation): top-down 2D layout where …
Figure 4: Attention matrix A computed from visual similarity alone (no geometric bias). Rows correspond to source cameras and columns to target cameras; intensity indicates attention weight before incorporation of the geometric term Bgeom.
Figure 5: Attention matrix A after adding geometric bias Bgeom. Compared with …
Figure 6: ACT margin dynamics: evolution of adaptive margins …
Figure 7: UMAP visualization of feature distributions comparing baseline and CityGuard embeddings.
Figure 8: Baseline feature distribution exhibiting dispersed intra-class clusters and overlapping inter-class regions.
Figure 9: CityGuard feature distribution demonstrating compact intra-class clusters and clear inter-class separation.
Figure 10: Training convergence on Market-1501 using Swin Transformer with circle loss and domain generalization.
Figure 11: Performance analysis on Market-1501 under varying training configurations.
Figure 12: Privacy-utility trade-off curves showing Rank-1 accuracy and mAP degradation on Market-1501 under vary…
original abstract

City-scale person re-identification across distributed cameras must handle severe appearance changes from viewpoint, occlusion, and domain shift while complying with data protection rules that prevent sharing raw imagery. We introduce CityGuard, a topology-aware transformer for privacy-preserving identity retrieval in decentralized surveillance. The framework integrates three components. A dispersion-adaptive metric learner adjusts instance-level margins according to feature spread, increasing intra-class compactness. Spatially conditioned attention injects coarse geometry, such as GPS or deployment floor plans, into graph-based self-attention to enable projectively consistent cross-view alignment using only coarse geometric priors without requiring survey-grade calibration. Differentially private embedding maps are coupled with compact approximate indexes to support secure and cost-efficient deployment. Together these designs produce descriptors robust to viewpoint variation, occlusion, and domain shifts, and they enable a tunable balance between privacy and utility under rigorous differential-privacy accounting. Experiments on Market-1501 and additional public benchmarks, complemented by database-scale retrieval studies, show consistent gains in retrieval precision and query throughput over strong baselines, confirming the practicality of the framework for privacy-critical urban identity matching.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CityGuard, a topology-aware transformer for privacy-preserving person re-identification across distributed urban cameras. It integrates a dispersion-adaptive metric learner that adjusts instance-level margins according to feature spread, spatially conditioned attention that injects coarse geometric priors (GPS or floor plans) into graph-based self-attention for cross-view alignment, and differentially private embedding maps paired with compact indexes. The central claim is that these components yield descriptors robust to viewpoint variation, occlusion, and domain shifts while enabling a tunable privacy-utility trade-off under rigorous differential privacy, with consistent retrieval gains shown on Market-1501 and other benchmarks.

Significance. If the empirical claims hold, the framework would provide a practical route to decentralized, privacy-compliant identity search in city-scale surveillance, combining graph attention, geometric conditioning, and differential privacy in a single pipeline. The emphasis on coarse priors to avoid survey-grade calibration and the dispersion-adaptive margin mechanism are potentially useful contributions to bias-resilient re-ID.

major comments (2)
  1. [§3.2] §3.2 (Spatially conditioned attention): the claim that coarse GPS or floor-plan priors suffice for projectively consistent cross-view alignment inside graph self-attention lacks an explicit error bound or propagation analysis. Bounded but non-zero error in the conditioning signal can misalign attention weights across views; without showing that the dispersion-adaptive metric learner provably absorbs this error, the robustness claims to viewpoint variation and occlusion rest on an unverified assumption.
  2. [Experiments] Experiments section (and abstract): no quantitative results, error bars, ablation tables, or per-component breakdowns are supplied for the Market-1501 gains or database-scale studies. Without these, the central empirical claim of “consistent gains over strong baselines” cannot be evaluated and the weakest assumption about prior granularity remains untested.
minor comments (2)
  1. [Abstract] Abstract: the statement of “consistent gains” should be accompanied by at least the headline mAP or Rank-1 deltas to allow readers to gauge magnitude before reading further.
  2. [§3.1] Notation: the dispersion-adaptive margin update rule would benefit from an explicit equation (e.g., how the margin scales with measured feature dispersion) rather than a prose description.
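For concreteness, one plausible form of the rule the referee requests is sketched below. This is an illustration of what an explicit equation could look like, not the paper's actual formula; the symbols m_0 and λ are assumed hyperparameters.

```latex
% Illustrative dispersion-adaptive margin (not the paper's equation):
% the margin for class i grows with its measured feature dispersion.
\hat{\sigma}_i \;=\; \sqrt{\frac{1}{|C_i|} \sum_{x \in C_i} \bigl\lVert f(x) - \mu_i \bigr\rVert_2^2},
\qquad
m_i \;=\; m_0 + \lambda\,\hat{\sigma}_i
```

Here C_i is the set of instances of identity i, f the embedding network, and μ_i the class centroid; m_i would then replace the fixed margin in a triplet- or circle-style loss, widening margins exactly for the classes whose features are most dispersed.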

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and commit to revisions that strengthen the theoretical and empirical sections of the manuscript.

point-by-point responses
  1. Referee: [§3.2] §3.2 (Spatially conditioned attention): the claim that coarse GPS or floor-plan priors suffice for projectively consistent cross-view alignment inside graph self-attention lacks an explicit error bound or propagation analysis. Bounded but non-zero error in the conditioning signal can misalign attention weights across views; without showing that the dispersion-adaptive metric learner provably absorbs this error, the robustness claims to viewpoint variation and occlusion rest on an unverified assumption.

    Authors: We agree that an explicit error-propagation analysis would provide stronger theoretical support. In the revised manuscript we will add a dedicated subsection deriving bounds on attention misalignment induced by bounded errors in the coarse geometric priors. We will show that the dispersion-adaptive margin mechanism absorbs such errors by dynamically widening intra-class margins in proportion to observed feature dispersion, with the bound expressed in terms of the Lipschitz constant of the attention operator and the maximum prior error. The analysis will be accompanied by a controlled perturbation study on synthetic view shifts. revision: yes
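The promised bound might take the following shape, sketched here under stated assumptions: the map from priors g to the bias Bgeom is L_B-Lipschitz, and row-wise softmax is 1-Lipschitz in the ℓ2 norm.

```latex
% Sketch of error propagation from coarse priors to attention weights
% (assumptions: g -> B_geom is L_B-Lipschitz; softmax is 1-Lipschitz in l2).
\lVert \Delta g \rVert \le \varepsilon_g
\;\;\Longrightarrow\;\;
\lVert \Delta B_{\mathrm{geom}} \rVert \le L_B\,\varepsilon_g
\;\;\Longrightarrow\;\;
\lVert A' - A \rVert \le L_B\,\varepsilon_g
```

Under these assumptions the attention perturbation is at most linear in the prior error ε_g; the absorption claim would then amount to showing that the margin widening λσ̂_i dominates the score shift this perturbation induces.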

  2. Referee: Experiments section (and abstract): no quantitative results, error bars, ablation tables, or per-component breakdowns are supplied for the Market-1501 gains or database-scale studies. Without these, the central empirical claim of “consistent gains over strong baselines” cannot be evaluated and the weakest assumption about prior granularity remains untested.

    Authors: We apologize for the insufficient presentation of results in the reviewed copy. The original experiments section contains numerical results on Market-1501 and additional benchmarks, yet they were not rendered with sufficient detail. In the revision we will replace the current summary with full tables reporting mAP and rank-1 accuracy (mean ± std over five independent runs), per-component ablation tables that isolate the contribution of spatially conditioned attention and the dispersion-adaptive learner, and a dedicated study varying the granularity of the geometric priors (GPS noise levels and floor-plan resolution). Database-scale retrieval latency and throughput figures will also be included with error bars. revision: yes

Circularity Check

0 steps flagged

No significant circularity; the framework is presented as a forward design without self-referential reductions.

full rationale

The provided abstract and description introduce CityGuard via three explicit components (dispersion-adaptive metric learner, spatially conditioned attention using coarse priors, and differentially private maps) but contain no equations, derivations, or parameter-fitting steps that reduce claimed robustness or privacy-utility balance to quantities defined by the inputs themselves. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing manner. The text reads as a proposal of architectural choices rather than a closed loop where predictions equal fitted inputs by construction. Per the hard rules, absent any quotable reduction (e.g., Eq. X = Eq. Y), the score is 0 and steps remain empty.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on three newly introduced components whose performance benefits are asserted without independent prior evidence or shipped artifacts; differential privacy accounting is treated as given.

axioms (2)
  • domain assumption Coarse geometric priors suffice for projectively consistent alignment
    Invoked in the description of spatially conditioned attention
  • domain assumption Differential privacy accounting remains rigorous after coupling with compact indexes
    Stated as enabling tunable privacy-utility balance
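The second axiom can be partly grounded in a standard result: differential privacy is closed under post-processing, so any index built from already-noised embeddings inherits the same (ε, δ) guarantee at no extra budget. The toy quantizer below illustrates this; it is a stand-in assumption, since real systems would use PQ or ANN indexes.

```python
import numpy as np

# Post-processing property of differential privacy (standard result):
# any function of an (eps, delta)-DP release -- here, a quantized index
# built from already-noised embeddings -- is itself (eps, delta)-DP.
def build_compact_index(E_priv, n_bits=4):
    """Toy 'compact index': uniform scalar quantization of private
    embeddings to n_bits per value. Illustrative only."""
    lo, hi = E_priv.min(), E_priv.max()
    levels = 2 ** n_bits - 1
    codes = np.round((E_priv - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, (lo, hi, levels)

def decode(codes, params):
    """Reconstruct approximate embeddings from the compact codes."""
    lo, hi, levels = params
    return codes.astype(np.float64) / levels * (hi - lo) + lo

E_priv = np.random.default_rng(2).normal(size=(10, 8))  # stands in for DP-noised embeddings
codes, params = build_compact_index(E_priv)
E_approx = decode(codes, params)
```

What post-processing does not cover is any index construction that touches the raw, un-noised data, which is why the axiom's "after coupling" qualifier is doing real work.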
invented entities (2)
  • dispersion-adaptive metric learner no independent evidence
    purpose: Adjust instance-level margins according to feature spread
    New component introduced to increase intra-class compactness
  • spatially conditioned attention no independent evidence
    purpose: Inject coarse geometry into graph self-attention
    New mechanism for cross-view alignment without precise calibration

pith-pipeline@v0.9.0 · 5515 in / 1393 out tokens · 20812 ms · 2026-05-15T21:11:46.144343+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

103 extracted references · 103 canonical work pages · 3 internal anchors

  1. [1]

    Person re-identification by multi-camera networks for internet of things in smart cities.IEEE Access, 6:76111–76117, 2018

    Shilin Zhang and Hangbin Yu. Person re-identification by multi-camera networks for internet of things in smart cities.IEEE Access, 6:76111–76117, 2018

  2. [2]

    Deep learning for person re-identification: A survey and outlook.IEEE transactions on pattern analysis and machine intelligence, 44(6): 2872–2893, 2021

    Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, and Steven CH Hoi. Deep learning for person re-identification: A survey and outlook.IEEE transactions on pattern analysis and machine intelligence, 44(6): 2872–2893, 2021

  3. [3]

    Person Re-identification: Past, Present and Future

    Liang Zheng, Yi Yang, and Alexander G Hauptmann. Person re-identification: Past, present and future.arXiv preprint arXiv:1610.02984, 2016

  4. [4]

    Parameter-efficient person re-identification in the 3d space.IEEE Transactions on Neural Networks and Learning Systems, 35(6):7534–7547, 2022

    Zhedong Zheng, Xiaohan Wang, Nenggan Zheng, and Yi Yang. Parameter-efficient person re-identification in the 3d space.IEEE Transactions on Neural Networks and Learning Systems, 35(6):7534–7547, 2022

  5. [5]

    Transreid: Transformer-based object re- identification

    Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, and Wei Jiang. Transreid: Transformer-based object re- identification. InProceedings of the IEEE/CVF international conference on computer vision, pages 15013–15022, 2021

  6. [6]

    Transformer-based person re-identification: A comprehensive review.IEEE Transactions on Intelligent Vehicles, 9(7):5222–5239, 2024

    Prodip Kumar Sarker, Qingjie Zhao, and Md Kamal Uddin. Transformer-based person re-identification: A comprehensive review.IEEE Transactions on Intelligent Vehicles, 9(7):5222–5239, 2024

  7. [7]

    Personvit: large-scale self-supervised vision transformer for person re-identification.Machine Vision and Applications, 36(2):32, 2025

    Bin Hu, Xinggang Wang, and Wenyu Liu. Personvit: large-scale self-supervised vision transformer for person re-identification.Machine Vision and Applications, 36(2):32, 2025. 10 CityGuard

  8. [8]

    Deep ranking model by large adaptive margin learning for person re-identification.Pattern Recognition, 74:241–252, 2018

    Jiayun Wang, Sanping Zhou, Jinjun Wang, and Qiqi Hou. Deep ranking model by large adaptive margin learning for person re-identification.Pattern Recognition, 74:241–252, 2018

  9. [9]

    Margin-based modal adaptive learning for visible-infrared person re-identification.Sensors, 23(3):1426, 2023

    Qianqian Zhao, Hanxiao Wu, and Jianqing Zhu. Margin-based modal adaptive learning for visible-infrared person re-identification.Sensors, 23(3):1426, 2023

  10. [10]

    Adaptive intra-class variation contrastive learning for unsupervised person re-identification.arXiv preprint arXiv:2404.04665, 2024

    Lingzhi Liu, Haiyang Zhang, Chengwei Tang, and Tiantian Zhang. Adaptive intra-class variation contrastive learning for unsupervised person re-identification.arXiv preprint arXiv:2404.04665, 2024

  11. [11]

    Adversarial camera alignment network for unsupervised cross-camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 32(5):2921–2936, 2021

    Lei Qi, Lei Wang, Jing Huo, Yinghuan Shi, Xin Geng, and Yang Gao. Adversarial camera alignment network for unsupervised cross-camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 32(5):2921–2936, 2021

  12. [12]

    Occluded person re-identification via a universal framework with difference consistency guidance learning.IEEE Internet of Things Journal, 2024

    Yuxuan Liu, Hongwei Ge, Guozhi Tang, and Yong Luo. Occluded person re-identification via a universal framework with difference consistency guidance learning.IEEE Internet of Things Journal, 2024

  13. [13]

    Occlusion simulation and token-constrained feature coupling network for occluded person re-identification.IEEE Internet of Things Journal, 2025

    Li Wang, Shuli Cheng, Anyu Du, Liejun Wang, and Lun Zhang. Occlusion simulation and token-constrained feature coupling network for occluded person re-identification.IEEE Internet of Things Journal, 2025

  14. [14]

    Texture-aware transformer with pose-patch mapping for occluded person re-identification.Pattern Recognition, page 112341, 2025

    Dengwen Wang, Guanyu Xing, and Yanli Liu. Texture-aware transformer with pose-patch mapping for occluded person re-identification.Pattern Recognition, page 112341, 2025

  15. [15]

    Enhance heads in vision transformer for occluded person re-identification.IEEE Sensors Journal, 2025

    Shoudong Han, Ziwen Zhang, Xinpeng Yuan, and Delie Ming. Enhance heads in vision transformer for occluded person re-identification.IEEE Sensors Journal, 2025

  16. [16]

    Privacy-enhancing person re-identification framework-a dual-stage approach

    Kajal Kansal, Yongkang Wong, and Mohan Kankanhalli. Privacy-enhancing person re-identification framework-a dual-stage approach. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 8543–8552, 2024

  17. [17]

    Secureda: Privacy-preserving source-free domain adaptation for person re-identification.IEEE Transactions on Multimedia, 2025

    Xiaofeng Qu, Li Liu, Huaxiang Zhang, Lei Zhu, Liqiang Nie, Xiaojun Chang, and Fengling Li. Secureda: Privacy-preserving source-free domain adaptation for person re-identification.IEEE Transactions on Multimedia, 2025

  18. [18]

    A multi-scale graph attention-based transformer for occluded person re-identification.Applied Sciences, 14(18):8279, 2024

    Ming Ma, Jianming Wang, and Bohan Zhao. A multi-scale graph attention-based transformer for occluded person re-identification.Applied Sciences, 14(18):8279, 2024

  19. [19]

    Generative adversarial patches for physical attacks on cross-modal pedestrian re-identification.arXiv preprint arXiv:2410.20097, 2024

    Yue Su, Hao Li, and Maoguo Gong. Generative adversarial patches for physical attacks on cross-modal pedestrian re-identification.arXiv preprint arXiv:2410.20097, 2024

  20. [20]

    Joint discriminative and generative learning for person re-identification

    Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. Joint discriminative and generative learning for person re-identification. Inproceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2138–2147, 2019

  21. [21]

    Pedestrian re-identification based on swin transformer

    Zifei Qin, Peishun Liu, Yibei Liu, Haiping Duan, Feifei Li, and Han Wang. Pedestrian re-identification based on swin transformer. In2022 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), pages 123–129. IEEE, 2022

  22. [22]

    Edgevpr: Transformer-based real-time video person re-identification at the edge

    Meng Sun, Ju Ren, and Yaoxue Zhang. Edgevpr: Transformer-based real-time video person re-identification at the edge. In2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS), pages 13–24. IEEE, 2024

  23. [23]

    Cvrecon: Rethinking 3d geometric feature learning for neural reconstruction

    Ziyue Feng, Liang Yang, Pengsheng Guo, and Bing Li. Cvrecon: Rethinking 3d geometric feature learning for neural reconstruction. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 17750–17760, 2023

  24. [24]

    Group multi-view transformer for 3d shape analysis with spatial encoding.IEEE Transactions on Multimedia, 26:9450–9463, 2024

    Lixiang Xu, Qingzhe Cui, Richang Hong, Wei Xu, Enhong Chen, Xin Yuan, Chenglong Li, and Yuanyan Tang. Group multi-view transformer for 3d shape analysis with spatial encoding.IEEE Transactions on Multimedia, 26:9450–9463, 2024

  25. [25]

    Vsformer: Mining correlations in flexible view set for multi-view 3d shape understanding.IEEE Transactions on Visualization and Computer Graphics, 31(4):2127–2141, 2024

    Hongyu Sun, Yongcai Wang, Peng Wang, Haoran Deng, Xudong Cai, and Deying Li. Vsformer: Mining correlations in flexible view set for multi-view 3d shape understanding.IEEE Transactions on Visualization and Computer Graphics, 31(4):2127–2141, 2024

  26. [26]

    In Defense of the Triplet Loss for Person Re-Identification

    Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017. 11 CityGuard

  27. [27]

    Circle loss: A unified perspective of pair similarity optimization

    Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, and Yichen Wei. Circle loss: A unified perspective of pair similarity optimization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6398–6407, 2020

  28. [28]

    Learnable dynamic margin in deep metric learning.Pattern Recognition, 132:108961, 2022

    Yifan Wang, Pingping Liu, Yijun Lang, Qiuzhan Zhou, and Xue Shan. Learnable dynamic margin in deep metric learning.Pattern Recognition, 132:108961, 2022

  29. [29]

    Tig-cl: teacher-guided individual and group aware contrastive learning for unsupervised person re-identification in internet of things.IEEE Internet of Things Journal, 2024

    Xiao Teng, Chuan Li, Xueqiong Li, Xinwang Liu, and Long Lan. Tig-cl: teacher-guided individual and group aware contrastive learning for unsupervised person re-identification in internet of things.IEEE Internet of Things Journal, 2024

  30. [30]

    Dual-graph contrastive learning for unsupervised person reidentification.IEEE Transactions on Cognitive and Developmental Systems, 16(4):1352–1363, 2024

    Lin Zhang, Ran Song, Yifan Wang, Qian Zhang, and Wei Zhang. Dual-graph contrastive learning for unsupervised person reidentification.IEEE Transactions on Cognitive and Developmental Systems, 16(4):1352–1363, 2024

  31. [31]

    An image–text dual-channel union network for person re-identification.IEEE Transactions on Instrumentation and Measurement, 72:1–16, 2023

    Baoguang Qi, Yi Chen, Qiang Liu, Xiaohai He, Linbo Qing, Ray E Sheriff, and Honggang Chen. An image–text dual-channel union network for person re-identification.IEEE Transactions on Instrumentation and Measurement, 72:1–16, 2023

  32. [32]

    Image-text-image knowledge transfer for lifelong person re-identification with hybrid clothing states.IEEE Transactions on Image Processing, 2025

    Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, and Xiangyang Xue. Image-text-image knowledge transfer for lifelong person re-identification with hybrid clothing states.IEEE Transactions on Image Processing, 2025

  33. [33]

    Looking clearer with text: A hierarchical context blending network for occluded person re-identification.IEEE Transactions on Information Forensics and Security, 2025

    Changshuo Wang, Shuting He, Meiqing Wu, Siew-Kei Lam, Prayag Tiwari, and Xingyu Gao. Looking clearer with text: A hierarchical context blending network for occluded person re-identification.IEEE Transactions on Information Forensics and Security, 2025

  34. [34]

    Syrer: Synergistic relational reasoning for rgb-d cross-modal re-identification.IEEE Transactions on Multimedia, 26:5600–5614, 2023

    Hao Liu, Jingjing Wu, Feng Li, Jianguo Jiang, and Richang Hong. Syrer: Synergistic relational reasoning for rgb-d cross-modal re-identification.IEEE Transactions on Multimedia, 26:5600–5614, 2023

  35. [35]

    Rxnet: cross-modality person re-identification based on a dual-branch network: W

    Weiyang Zhang, Jiong Guo, Qiang Liu, Maoyang Zou, Honggang Chen, and Jing Peng. Rxnet: cross-modality person re-identification based on a dual-branch network: W. zhang et al.Applied Intelligence, 55(15):993, 2025

  36. [36]

    Reliable cross-camera learning in random camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 34(6):4556–4567, 2023

    Zhengqi Liu, Yutian Lin, Tianyang Liu, and Bo Du. Reliable cross-camera learning in random camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 34(6):4556–4567, 2023

  37. [37]

    Event-driven re-id: A new benchmark and method towards privacy-preserving person re-identification

    Shafiq Ahmad, Gianluca Scarpellini, Pietro Morerio, and Alessio Del Bue. Event-driven re-id: A new benchmark and method towards privacy-preserving person re-identification. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 459–468, 2022

  38. [38]

    Diffphysba: Diffusion-based physical backdoor attack against person re-identification in real-world.arXiv preprint arXiv:2405.19990, 2024

    Wenli Sun, Xinyang Jiang, Dongsheng Li, and Cairong Zhao. Diffphysba: Diffusion-based physical backdoor attack against person re-identification in real-world.arXiv preprint arXiv:2405.19990, 2024

  39. [39]

    Generative metric learning for adversarially robust open-world person re-identification.ACM Transactions on Multimedia Computing, Communications and Applications, 19(1):1–19, 2023

    Deyin Liu, Lin Wu, Richang Hong, Zongyuan Ge, Jialie Shen, Farid Boussaid, and Mohammed Bennamoun. Generative metric learning for adversarially robust open-world person re-identification.ACM Transactions on Multimedia Computing, Communications and Applications, 19(1):1–19, 2023

  40. [40]

    A two-stream dynamic pyramid representation model for video-based person re-identification.IEEE Transactions on Image Processing, 30:6266–6276, 2021

    Xi Yang, Liangchen Liu, Nannan Wang, and Xinbo Gao. A two-stream dynamic pyramid representation model for video-based person re-identification.IEEE Transactions on Image Processing, 30:6266–6276, 2021

  41. [41]

    Context-aided semantic-aware self-alignment for video-based person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 2025

    Zhidan Ran, Zhiyao Xiao, Xiaobo Lu, Xuan Wei, and Wei Liu. Context-aided semantic-aware self-alignment for video-based person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 2025

  42. [42]

    Similarity distribution based membership inference attack on person re-identification

    Junyao Gao, Xinyang Jiang, Huishuai Zhang, Yifan Yang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, and Cairong Zhao. Similarity distribution based membership inference attack on person re-identification. InProceedings of the AAAI conference on artificial intelligence, volume 37, pages 14820–14828, 2023

  43. [43]

Junyao Gao, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, and Cairong Zhao. Re-ID-Leak: Membership inference attacks against person re-identification. International Journal of Computer Vision, 132(10):4673–4687, 2024

  44. [44]

Mang Ye, Wei Shen, Junwu Zhang, Yao Yang, and Bo Du. SecureReID: Privacy-preserving anonymization for person re-identification. IEEE Transactions on Information Forensics and Security, 19:2840–2853, 2024

  45. [45]

Lamyanba Laishram, Muhammad Shaheryar, Jong Taek Lee, and Soon Ki Jung. Toward a privacy-preserving face recognition system: A survey of leakages and solutions. ACM Computing Surveys, 57(6):1–38, 2025

  46. [46]

Andrea Artioli, Luca Bedogni, and Mauro Leoncini. Re-identification attack based on few-hints dataset enrichment for ubiquitous applications. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT), pages 1–6. IEEE, 2022

  47. [47]

Lucas Maris, Yuki Matsuda, and Keiichi Yasumoto. Differential privacy and k-anonymity for pedestrian image data: Impact on cross-camera person re-identification and demographic predictions. ACM Transactions on Cyber-Physical Systems, 9(4):1–31, 2025

  48. [48]

Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In Proceedings of the IEEE international conference on computer vision, pages 1116–1124, 2015

  49. [49]

Liang Zheng, Zhi Bie, Yifan Sun, Jingdong Wang, Chi Su, Shengjin Wang, and Qi Tian. MARS: A video benchmark for large-scale person re-identification. In European conference on computer vision, pages 868–884. Springer, 2016

  50. [50]

Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer GAN to bridge domain gap for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 79–88, 2018

  51. [51]

Dat Tien Nguyen, Hyung Gil Hong, Ki Wan Kim, and Kang Ryoung Park. Person recognition system based on a combination of body images from visible light and thermal cameras. Sensors, 17(3):605, 2017

  52. [52]

Ancong Wu, Wei-Shi Zheng, Hong-Xing Yu, Shaogang Gong, and Jianhuang Lai. RGB-infrared cross-modality person re-identification. In Proceedings of the IEEE international conference on computer vision, pages 5380–5389, 2017

  53. [53]

Jiaxuan Zhuo, Zeyu Chen, Jianhuang Lai, and Guangcong Wang. Occluded person re-identification. In 2018 IEEE international conference on multimedia and expo (ICME), pages 1–6. IEEE, 2018

  54. [54]

Wei-Shi Zheng, Xiang Li, Tao Xiang, Shengcai Liao, Jianhuang Lai, and Shaogang Gong. Partial person re-identification. In Proceedings of the IEEE international conference on computer vision, pages 4678–4686, 2015

  55. [55]

Lingxiao He, Jian Liang, Haiqing Li, and Zhenan Sun. Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7073–7082, 2018

  56. [56]

Hehan Teng, Tao He, Yuchen Guo, and Guiguang Ding. A high-accuracy unsupervised person re-identification method using auxiliary information mined from datasets. arXiv preprint arXiv:2205.03124, 2022

  57. [57]

Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, and Shao-Yi Chien. Spatially and temporally efficient non-local attention network for video-based person re-identification. arXiv preprint arXiv:1908.01683, 2019

  58. [58]

Zhikang Wang, Lihuo He, Xiaoguang Tu, Jian Zhao, Xinbo Gao, Shengmei Shen, and Jiashi Feng. Robust video-based person re-identification by hierarchical mining. IEEE Transactions on Circuits and Systems for Video Technology, 32(12):8179–8191, 2021

  59. [59]

Yang Fu, Xiaoyang Wang, Yunchao Wei, and Thomas Huang. STA: Spatial-temporal attention for large-scale video-based person re-identification. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 8287–8294, 2019

  60. [60]

Arulkumar Subramaniam, Athira Nambiar, and Anurag Mittal. Co-segmentation inspired attention networks for video-based person re-identification. In Proceedings of the IEEE/CVF international conference on computer vision, pages 562–572, 2019

  61. [61]

Jianing Li, Jingdong Wang, Qi Tian, Wen Gao, and Shiliang Zhang. Global-local temporal representations for video person re-identification. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3958–3967, 2019

  62. [62]

Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, and Xilin Chen. VRSTC: Occlusion-free video person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7183–7192, 2019

  63. [63]

Zunwang Ke, Guozhi Sun, Run Guo, Minghua Du, and Yugui Zhang. Person re-identification method based on video spatial feature enhancement. In 2024 IEEE International Conference on Cognitive Computing and Complex Data (ICCD), pages 23–30. IEEE, 2024

  64. [64]

Xi Yang, Xian Wang, Liangchen Liu, Nannan Wang, and Xinbo Gao. STFE: A comprehensive video-based person re-identification network based on spatio-temporal feature enhancement. IEEE Transactions on Multimedia, 26:7237–7249, 2024

  65. [65]

Bo Sun, Yulong Zhang, Jianan Wang, and Chunmao Jiang. Dual-branch occlusion-aware semantic part-features extraction network for occluded person re-identification. Mathematics, 13(15):2432, 2025

  66. [66]

Lei Tan, Pingyang Dai, Jie Chen, Liujuan Cao, Yongjian Wu, and Rongrong Ji. PartFormer: Awakening latent diverse representation from vision transformer for object re-identification. arXiv preprint arXiv:2408.16684, 2024

  67. [67]

Yifan Sun, Liang Zheng, Weijian Deng, and Shengjin Wang. SVDNet for pedestrian retrieval. In Proceedings of the IEEE international conference on computer vision, pages 3800–3808, 2017

  68. [68]

Yanping Li, Yizhang Liu, Hongyun Zhang, Cairong Zhao, Zhihua Wei, and Duoqian Miao. Occlusion-aware transformer with second-order attention for person re-identification. IEEE Transactions on Image Processing, 2024

  69. [69]

Ke Niu, Haiyang Yu, Mengyang Zhao, Teng Fu, Siyang Yi, Wei Lu, Bin Li, Xuelin Qian, and Xiangyang Xue. ChatReID: Open-ended interactive person retrieval via hierarchical progressive tuning for vision language models. arXiv preprint arXiv:2502.19958, 2025

  70. [70]

Quang-Huy Che, Le-Chuong Nguyen, Gia-Nghia Tran, Dinh-Duy Phan, and Vinh-Tiep Nguyen. A re-ranking method using k-nearest weighted fusion for person re-identification. arXiv preprint arXiv:2509.04050, 2025

  71. [71]

Yangqi Zheng, Liang Zhang, and Jun Liang. FAA-Net: Enhancing person re-identification through local-global feature association attention. Soft Computing, pages 1–15, 2025

  72. [72]

Jingyi Wen. DTC-CINet: Dynamic token compensation and cross-layer interaction network for occluded person re-identification. In 2025 IEEE 3rd International Conference on Image Processing and Computer Applications (ICIPCA), pages 214–222. IEEE, 2025

  73. [73]

Pengfei Wu, Le Wang, Sanping Zhou, Gang Hua, and Changyin Sun. Temporal correlation vision transformer for video person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 6083–6091, 2024

  74. [74]

Zhikang Wang, Feng Zhu, Shixiang Tang, Rui Zhao, Lihuo He, and Jiangning Song. Feature erasing and diffusion network for occluded person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4754–4763, 2022

  75. [75]

David A McAllester. PAC-Bayesian model averaging. In Proceedings of the twelfth annual conference on Computational learning theory, pages 164–170, 1999

  76. [76]

Pascal Germain, Francis Bach, Alexandre Lacoste, and Simon Lacoste-Julien. PAC-Bayesian theory meets Bayesian inference. Advances in Neural Information Processing Systems, 29, 2016

  77. [77]

Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, and Shengjin Wang. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the European conference on computer vision (ECCV), pages 480–496, 2018

  78. [78]

Guan’an Wang, Shuo Yang, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Erjin Zhou, and Jian Sun. High-order information matters: Learning relation and topology for occluded person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6449–6458, 2020

  79. [79]

Pengfei Wang, Changxing Ding, Zhiyin Shao, Zhibin Hong, Shengli Zhang, and Dacheng Tao. Quality-aware part models for occluded person re-identification. IEEE Transactions on Multimedia, 25:3154–3165, 2022

  80. [80]

Meiyan Huang, Chunping Hou, Qingyuan Yang, and Zhipeng Wang. Reasoning and tuning: Graph attention network for occluded person re-identification. IEEE Transactions on Image Processing, 32:1568–1582, 2023
