pith. machine review for the scientific record.

arxiv: 2602.18047 · v3 · submitted 2026-02-20 · 💻 cs.CV · cs.LG

Recognition: 2 theorem links · Lean Theorem

CityGuard: Graph-Aware Private Descriptors for Bias-Resilient Identity Search Across Urban Cameras

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 21:11 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords person re-identification · differential privacy · graph attention · urban surveillance · privacy-preserving retrieval · cross-view alignment · transformer · metric learning

The pith

CityGuard combines adaptive metrics, coarse-geometry graph attention, and differential privacy to create robust private descriptors for person re-identification across urban camera networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CityGuard as a topology-aware transformer that performs privacy-preserving identity retrieval without sharing raw imagery across distributed city cameras. It tackles viewpoint changes, occlusions, and domain shifts through a dispersion-adaptive metric learner that tightens intra-class clusters, a spatially conditioned attention layer that feeds rough location data such as GPS into graph self-attention for consistent cross-view alignment, and differentially private embedding maps paired with compact indexes. These elements together produce descriptors that remain effective under real-world appearance variation while allowing tunable privacy guarantees enforced by rigorous differential-privacy accounting. Experiments on Market-1501 and other benchmarks report higher retrieval precision and faster query throughput than prior methods, indicating the designs support practical deployment in privacy-regulated surveillance settings.

Core claim

CityGuard is a topology-aware transformer for privacy-preserving identity retrieval in decentralized surveillance that integrates three components: a dispersion-adaptive metric learner that adjusts instance-level margins according to feature spread to increase intra-class compactness; spatially conditioned attention that injects coarse geometric priors such as GPS or deployment floor plans into graph-based self-attention to enable projectively consistent cross-view alignment without survey-grade calibration; and differentially private embedding maps coupled with compact approximate indexes. Together these designs produce descriptors robust to viewpoint variation, occlusion, and domain shifts.

What carries the argument

Spatially conditioned attention that injects coarse geometric priors into graph-based self-attention to achieve projectively consistent cross-view alignment using only GPS or floor-plan data.
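The paper supplies no equations for this layer, but one plausible reading (consistent with the geometric bias term Bgeom in Figures 4 and 5) adds a distance-derived bias to the standard attention logits. The sketch below is an assumption about the mechanism, not the authors' implementation; the function name, the linear form of the bias, and the `alpha` scale are all illustrative.

```python
import numpy as np

def geometry_biased_attention(Q, K, V, positions, alpha=1.0):
    """One plausible form of spatially conditioned attention:
    scaled dot-product logits plus a bias B_geom that penalizes
    attention between physically distant cameras (illustrative)."""
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)                      # visual-similarity term
    # Pairwise distances between coarse camera positions (e.g. GPS).
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    B_geom = -alpha * dist                             # closer cameras -> larger bias
    logits = logits + B_geom
    # Numerically stable row-wise softmax.
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    A = weights / weights.sum(axis=-1, keepdims=True)
    return A @ V, A

rng = np.random.default_rng(0)
n, d = 4, 8
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
positions = rng.uniform(0, 100, size=(n, 2))           # coarse GPS-like coordinates
out, A = geometry_biased_attention(Q, K, V, positions)
```

With `alpha = 0` this reduces to ordinary self-attention, which matches the paper's framing of the geometric term as an additive conditioning signal rather than a replacement for visual similarity.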

If this is right

  • Descriptors gain robustness to viewpoint variation, occlusion, and domain shifts.
  • Privacy and utility can be balanced in a tunable way under rigorous differential-privacy accounting.
  • Retrieval precision improves on Market-1501 and additional public benchmarks.
  • Query throughput rises through the use of compact approximate indexes.
  • Secure deployment becomes feasible for decentralized urban surveillance networks.
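The paper does not specify how its embedding maps are privatized. A common construction that fits the description is the Gaussian mechanism on norm-clipped embeddings, sketched below with the classical calibration σ = clip · √(2 ln(1.25/δ))/ε (valid for ε ≤ 1). Every name and parameter here is illustrative, not taken from the paper.

```python
import numpy as np

def privatize_embeddings(E, eps, delta, clip=1.0, rng=None):
    """Gaussian-mechanism sketch: clip each row to L2 norm <= clip
    (so one person's embedding has bounded sensitivity, an assumption),
    then add noise calibrated to (eps, delta)-differential privacy."""
    rng = rng or np.random.default_rng(0)
    norms = np.linalg.norm(E, axis=1, keepdims=True)
    E_clipped = E * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Classical analytic calibration for the Gaussian mechanism, eps <= 1.
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return E_clipped + rng.normal(scale=sigma, size=E.shape)

E = np.random.default_rng(1).normal(size=(5, 16))
E_priv = privatize_embeddings(E, eps=0.5, delta=1e-5)
```

The tunable privacy-utility balance the paper claims would then correspond to sweeping `eps`: smaller values give stronger guarantees and noisier, less discriminative descriptors.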

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework may allow rapid rollout of identity-matching systems in cities that already have basic GPS or floor-plan data but lack precise camera calibration.
  • Similar graph-attention designs with coarse geometry could apply to other multi-camera tasks such as vehicle tracking or crowd flow analysis.
  • If the coarse-prior approach generalizes, it reduces the cost barrier for adding new cameras to existing networks without recalibrating the entire system.
  • The combination of adaptive margins and private embeddings suggests a path toward descriptors that remain useful even when training data are heavily noised for stronger privacy.

Load-bearing premise

Coarse geometric priors such as GPS or deployment floor plans are sufficient to produce projectively consistent cross-view alignment inside graph-based self-attention without survey-grade calibration.

What would settle it

If retrieval accuracy fell below that of non-graph baselines whenever the supplied GPS or floor-plan priors contained errors larger than typical urban positioning noise, the coarse-prior assumption would be shown not to hold.
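The mechanism behind that test can be sketched as a toy sweep: inject increasing position error into the priors and check when geometry-based camera ranking stops matching the true topology. The scales, camera count, and trial count below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n_cam = 20
true_pos = rng.uniform(0, 500, size=(n_cam, 2))    # metres; scale is illustrative

def geo_score_accuracy(noise_m, trials=200):
    """Fraction of queries for which the nearest camera ranked from
    noisy positions matches the truly nearest camera."""
    hits = 0
    for _ in range(trials):
        noisy = true_pos + rng.normal(scale=noise_m, size=true_pos.shape)
        q = rng.integers(n_cam)
        d_true = np.linalg.norm(true_pos - true_pos[q], axis=1)
        d_true[q] = np.inf                          # exclude the query camera
        d_noisy = np.linalg.norm(noisy - noisy[q], axis=1)
        d_noisy[q] = np.inf
        hits += int(d_noisy.argmin() == d_true.argmin())
    return hits / trials

# Sweep prior error from well under to far beyond typical urban GPS noise (~5-10 m).
accuracy = {noise: geo_score_accuracy(noise) for noise in (0, 5, 50, 500)}
```

The settling criterion would then read off the noise level at which this geometric ranking signal, and with it the graph model's advantage over non-graph baselines, collapses.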

Figures

Figures reproduced from arXiv: 2602.18047 by Jiaxuan Lu, Jia Yee Tan, Jiekai Wu, Rong Fu, Rui Lu, Simon Fong, Yibo Meng, Zhaolu Kang.

Figure 1: Overview of the CityGuard framework for bias-resilient, privacy-preserving identity search. The process begins with Topology-Aware Geometry Encoding, where camera coordinates and rotations are mapped to a spatial adjacency graph. The Geometry-Conditioned Backbone then fuses multi-scale features and refines them through a Temporal Graph Network (TGN) to capture cross-camera motion cues. Centrally, the Dispe…
Figure 2: Camera topology (GPS only): top-down 2D layout of camera nodes with edge thickness encoding the …
Figure 3: Camera topology (GPS + Rotation): top-down 2D layout where …
Figure 4: Attention matrix A computed from visual similarity alone (no geometric bias). Rows correspond to source cameras and columns to target cameras; intensity indicates attention weight before incorporation of the geometric term Bgeom.
Figure 5: Attention matrix A after adding geometric bias Bgeom. Compared with …
Figure 6: ACT margin dynamics: evolution of adaptive margins …
Figure 7: UMAP visualization of feature distributions comparing baseline and CityGuard embeddings.
Figure 8: Baseline feature distribution exhibiting dispersed intra-class clusters and overlapping inter-class regions.
Figure 9: CityGuard feature distribution demonstrating compact intra-class clusters and clear inter-class separation.
Figure 10: Training convergence on Market-1501 using Swin Transformer with circle loss and domain generalization.
Figure 11: Performance analysis on Market-1501 under varying training configurations.
Figure 12: Privacy-utility trade-off curves showing Rank-1 accuracy and mAP degradation on Market-1501 under vary…
original abstract

City-scale person re-identification across distributed cameras must handle severe appearance changes from viewpoint, occlusion, and domain shift while complying with data protection rules that prevent sharing raw imagery. We introduce CityGuard, a topology-aware transformer for privacy-preserving identity retrieval in decentralized surveillance. The framework integrates three components. A dispersion-adaptive metric learner adjusts instance-level margins according to feature spread, increasing intra-class compactness. Spatially conditioned attention injects coarse geometry, such as GPS or deployment floor plans, into graph-based self-attention to enable projectively consistent cross-view alignment using only coarse geometric priors without requiring survey-grade calibration. Differentially private embedding maps are coupled with compact approximate indexes to support secure and cost-efficient deployment. Together these designs produce descriptors robust to viewpoint variation, occlusion, and domain shifts, and they enable a tunable balance between privacy and utility under rigorous differential-privacy accounting. Experiments on Market-1501 and additional public benchmarks, complemented by database-scale retrieval studies, show consistent gains in retrieval precision and query throughput over strong baselines, confirming the practicality of the framework for privacy-critical urban identity matching.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CityGuard, a topology-aware transformer for privacy-preserving person re-identification across distributed urban cameras. It integrates a dispersion-adaptive metric learner that adjusts instance-level margins according to feature spread, spatially conditioned attention that injects coarse geometric priors (GPS or floor plans) into graph-based self-attention for cross-view alignment, and differentially private embedding maps paired with compact indexes. The central claim is that these components yield descriptors robust to viewpoint variation, occlusion, and domain shifts while enabling a tunable privacy-utility trade-off under rigorous differential privacy, with consistent retrieval gains shown on Market-1501 and other benchmarks.

Significance. If the empirical claims hold, the framework would provide a practical route to decentralized, privacy-compliant identity search in city-scale surveillance, combining graph attention, geometric conditioning, and differential privacy in a single pipeline. The emphasis on coarse priors to avoid survey-grade calibration and the dispersion-adaptive margin mechanism are potentially useful contributions to bias-resilient re-ID.

major comments (2)
  1. [§3.2] §3.2 (Spatially conditioned attention): the claim that coarse GPS or floor-plan priors suffice for projectively consistent cross-view alignment inside graph self-attention lacks an explicit error bound or propagation analysis. Bounded but non-zero error in the conditioning signal can misalign attention weights across views; without showing that the dispersion-adaptive metric learner provably absorbs this error, the robustness claims to viewpoint variation and occlusion rest on an unverified assumption.
  2. [Experiments] Experiments section (and abstract): no quantitative results, error bars, ablation tables, or per-component breakdowns are supplied for the Market-1501 gains or database-scale studies. Without these, the central empirical claim of “consistent gains over strong baselines” cannot be evaluated and the weakest assumption about prior granularity remains untested.
minor comments (2)
  1. [Abstract] Abstract: the statement of “consistent gains” should be accompanied by at least the headline mAP or Rank-1 deltas to allow readers to gauge magnitude before reading further.
  2. [§3.1] Notation: the dispersion-adaptive margin update rule would benefit from an explicit equation (e.g., how the margin scales with measured feature dispersion) rather than a prose description.
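For concreteness, one plausible form of the rule the referee requests is sketched below. This is an illustration of what an explicit equation could look like, not the paper's actual formula; the symbols m_0 and λ are assumed hyperparameters.

```latex
% Illustrative dispersion-adaptive margin (not the paper's equation):
% the margin for class i grows with its measured feature dispersion.
\hat{\sigma}_i \;=\; \sqrt{\frac{1}{|C_i|} \sum_{x \in C_i} \bigl\lVert f(x) - \mu_i \bigr\rVert_2^2},
\qquad
m_i \;=\; m_0 + \lambda\,\hat{\sigma}_i
```

Here C_i is the set of instances of identity i, f the embedding network, and μ_i the class centroid; m_i would then replace the fixed margin in a triplet- or circle-style loss, widening margins exactly for the classes whose features are most dispersed.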

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and commit to revisions that strengthen the theoretical and empirical sections of the manuscript.

point-by-point responses
  1. Referee: [§3.2] §3.2 (Spatially conditioned attention): the claim that coarse GPS or floor-plan priors suffice for projectively consistent cross-view alignment inside graph self-attention lacks an explicit error bound or propagation analysis. Bounded but non-zero error in the conditioning signal can misalign attention weights across views; without showing that the dispersion-adaptive metric learner provably absorbs this error, the robustness claims to viewpoint variation and occlusion rest on an unverified assumption.

    Authors: We agree that an explicit error-propagation analysis would provide stronger theoretical support. In the revised manuscript we will add a dedicated subsection deriving bounds on attention misalignment induced by bounded errors in the coarse geometric priors. We will show that the dispersion-adaptive margin mechanism absorbs such errors by dynamically widening intra-class margins in proportion to observed feature dispersion, with the bound expressed in terms of the Lipschitz constant of the attention operator and the maximum prior error. The analysis will be accompanied by a controlled perturbation study on synthetic view shifts. revision: yes
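The promised bound might take the following shape, sketched here under stated assumptions: the map from priors g to the bias Bgeom is L_B-Lipschitz, and row-wise softmax is 1-Lipschitz in the ℓ2 norm.

```latex
% Sketch of error propagation from coarse priors to attention weights
% (assumptions: g -> B_geom is L_B-Lipschitz; softmax is 1-Lipschitz in l2).
\lVert \Delta g \rVert \le \varepsilon_g
\;\;\Longrightarrow\;\;
\lVert \Delta B_{\mathrm{geom}} \rVert \le L_B\,\varepsilon_g
\;\;\Longrightarrow\;\;
\lVert A' - A \rVert \le L_B\,\varepsilon_g
```

Under these assumptions the attention perturbation is at most linear in the prior error ε_g; the absorption claim would then amount to showing that the margin widening λσ̂_i dominates the score shift this perturbation induces.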

  2. Referee: Experiments section (and abstract): no quantitative results, error bars, ablation tables, or per-component breakdowns are supplied for the Market-1501 gains or database-scale studies. Without these, the central empirical claim of “consistent gains over strong baselines” cannot be evaluated and the weakest assumption about prior granularity remains untested.

    Authors: We apologize for the insufficient presentation of results in the reviewed copy. The original experiments section contains numerical results on Market-1501 and additional benchmarks, yet they were not rendered with sufficient detail. In the revision we will replace the current summary with full tables reporting mAP and rank-1 accuracy (mean ± std over five independent runs), per-component ablation tables that isolate the contribution of spatially conditioned attention and the dispersion-adaptive learner, and a dedicated study varying the granularity of the geometric priors (GPS noise levels and floor-plan resolution). Database-scale retrieval latency and throughput figures will also be included with error bars. revision: yes

Circularity Check

0 steps flagged

No significant circularity; the framework is presented as a forward design without self-referential reductions.

full rationale

The provided abstract and description introduce CityGuard via three explicit components (dispersion-adaptive metric learner, spatially conditioned attention using coarse priors, and differentially private maps) but contain no equations, derivations, or parameter-fitting steps that reduce claimed robustness or privacy-utility balance to quantities defined by the inputs themselves. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing manner. The text reads as a proposal of architectural choices rather than a closed loop where predictions equal fitted inputs by construction. Per the hard rules, absent any quotable reduction (e.g., Eq. X = Eq. Y), the score is 0 and steps remain empty.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on three newly introduced components whose performance benefits are asserted without independent prior evidence or shipped artifacts; differential privacy accounting is treated as given.

axioms (2)
  • domain assumption Coarse geometric priors suffice for projectively consistent alignment
    Invoked in the description of spatially conditioned attention
  • domain assumption Differential privacy accounting remains rigorous after coupling with compact indexes
    Stated as enabling tunable privacy-utility balance
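The second axiom can be partly grounded in a standard result: differential privacy is closed under post-processing, so any index built from already-noised embeddings inherits the same (ε, δ) guarantee at no extra budget. The toy quantizer below illustrates this; it is a stand-in assumption, since real systems would use PQ or ANN indexes.

```python
import numpy as np

# Post-processing property of differential privacy (standard result):
# any function of an (eps, delta)-DP release -- here, a quantized index
# built from already-noised embeddings -- is itself (eps, delta)-DP.
def build_compact_index(E_priv, n_bits=4):
    """Toy 'compact index': uniform scalar quantization of private
    embeddings to n_bits per value. Illustrative only."""
    lo, hi = E_priv.min(), E_priv.max()
    levels = 2 ** n_bits - 1
    codes = np.round((E_priv - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, (lo, hi, levels)

def decode(codes, params):
    """Reconstruct approximate embeddings from the compact codes."""
    lo, hi, levels = params
    return codes.astype(np.float64) / levels * (hi - lo) + lo

E_priv = np.random.default_rng(2).normal(size=(10, 8))  # stands in for DP-noised embeddings
codes, params = build_compact_index(E_priv)
E_approx = decode(codes, params)
```

What post-processing does not cover is any index construction that touches the raw, un-noised data, which is why the axiom's "after coupling" qualifier is doing real work.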
invented entities (2)
  • dispersion-adaptive metric learner no independent evidence
    purpose: Adjust instance-level margins according to feature spread
    New component introduced to increase intra-class compactness
  • spatially conditioned attention no independent evidence
    purpose: Inject coarse geometry into graph self-attention
    New mechanism for cross-view alignment without precise calibration

pith-pipeline@v0.9.0 · 5515 in / 1393 out tokens · 20812 ms · 2026-05-15T21:11:46.144343+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

103 extracted references · 103 canonical work pages · 3 internal anchors

  1. [1]

    Person re-identification by multi-camera networks for internet of things in smart cities.IEEE Access, 6:76111–76117, 2018

    Shilin Zhang and Hangbin Yu. Person re-identification by multi-camera networks for internet of things in smart cities.IEEE Access, 6:76111–76117, 2018

  2. [2]

    Deep learning for person re-identification: A survey and outlook.IEEE transactions on pattern analysis and machine intelligence, 44(6): 2872–2893, 2021

    Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, and Steven CH Hoi. Deep learning for person re-identification: A survey and outlook.IEEE transactions on pattern analysis and machine intelligence, 44(6): 2872–2893, 2021

  3. [3]

    Person Re-identification: Past, Present and Future

    Liang Zheng, Yi Yang, and Alexander G Hauptmann. Person re-identification: Past, present and future.arXiv preprint arXiv:1610.02984, 2016

  4. [4]

    Parameter-efficient person re-identification in the 3d space.IEEE Transactions on Neural Networks and Learning Systems, 35(6):7534–7547, 2022

    Zhedong Zheng, Xiaohan Wang, Nenggan Zheng, and Yi Yang. Parameter-efficient person re-identification in the 3d space.IEEE Transactions on Neural Networks and Learning Systems, 35(6):7534–7547, 2022

  5. [5]

    Transreid: Transformer-based object re- identification

    Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, and Wei Jiang. Transreid: Transformer-based object re- identification. InProceedings of the IEEE/CVF international conference on computer vision, pages 15013–15022, 2021

  6. [6]

    Transformer-based person re-identification: A comprehensive review.IEEE Transactions on Intelligent Vehicles, 9(7):5222–5239, 2024

    Prodip Kumar Sarker, Qingjie Zhao, and Md Kamal Uddin. Transformer-based person re-identification: A comprehensive review.IEEE Transactions on Intelligent Vehicles, 9(7):5222–5239, 2024

  7. [7]

    Personvit: large-scale self-supervised vision transformer for person re-identification.Machine Vision and Applications, 36(2):32, 2025

    Bin Hu, Xinggang Wang, and Wenyu Liu. Personvit: large-scale self-supervised vision transformer for person re-identification.Machine Vision and Applications, 36(2):32, 2025. 10 CityGuard

  8. [8]

    Deep ranking model by large adaptive margin learning for person re-identification.Pattern Recognition, 74:241–252, 2018

    Jiayun Wang, Sanping Zhou, Jinjun Wang, and Qiqi Hou. Deep ranking model by large adaptive margin learning for person re-identification.Pattern Recognition, 74:241–252, 2018

  9. [9]

    Margin-based modal adaptive learning for visible-infrared person re-identification.Sensors, 23(3):1426, 2023

    Qianqian Zhao, Hanxiao Wu, and Jianqing Zhu. Margin-based modal adaptive learning for visible-infrared person re-identification.Sensors, 23(3):1426, 2023

  10. [10]

    Adaptive intra-class variation contrastive learning for unsupervised person re-identification.arXiv preprint arXiv:2404.04665, 2024

    Lingzhi Liu, Haiyang Zhang, Chengwei Tang, and Tiantian Zhang. Adaptive intra-class variation contrastive learning for unsupervised person re-identification.arXiv preprint arXiv:2404.04665, 2024

  11. [11]

    Adversarial camera alignment network for unsupervised cross-camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 32(5):2921–2936, 2021

    Lei Qi, Lei Wang, Jing Huo, Yinghuan Shi, Xin Geng, and Yang Gao. Adversarial camera alignment network for unsupervised cross-camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 32(5):2921–2936, 2021

  12. [12]

    Occluded person re-identification via a universal framework with difference consistency guidance learning.IEEE Internet of Things Journal, 2024

    Yuxuan Liu, Hongwei Ge, Guozhi Tang, and Yong Luo. Occluded person re-identification via a universal framework with difference consistency guidance learning.IEEE Internet of Things Journal, 2024

  13. [13]

    Occlusion simulation and token-constrained feature coupling network for occluded person re-identification.IEEE Internet of Things Journal, 2025

    Li Wang, Shuli Cheng, Anyu Du, Liejun Wang, and Lun Zhang. Occlusion simulation and token-constrained feature coupling network for occluded person re-identification.IEEE Internet of Things Journal, 2025

  14. [14]

    Texture-aware transformer with pose-patch mapping for occluded person re-identification.Pattern Recognition, page 112341, 2025

    Dengwen Wang, Guanyu Xing, and Yanli Liu. Texture-aware transformer with pose-patch mapping for occluded person re-identification.Pattern Recognition, page 112341, 2025

  15. [15]

    Enhance heads in vision transformer for occluded person re-identification.IEEE Sensors Journal, 2025

    Shoudong Han, Ziwen Zhang, Xinpeng Yuan, and Delie Ming. Enhance heads in vision transformer for occluded person re-identification.IEEE Sensors Journal, 2025

  16. [16]

    Privacy-enhancing person re-identification framework-a dual-stage approach

    Kajal Kansal, Yongkang Wong, and Mohan Kankanhalli. Privacy-enhancing person re-identification framework-a dual-stage approach. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 8543–8552, 2024

  17. [17]

    Secureda: Privacy-preserving source-free domain adaptation for person re-identification.IEEE Transactions on Multimedia, 2025

    Xiaofeng Qu, Li Liu, Huaxiang Zhang, Lei Zhu, Liqiang Nie, Xiaojun Chang, and Fengling Li. Secureda: Privacy-preserving source-free domain adaptation for person re-identification.IEEE Transactions on Multimedia, 2025

  18. [18]

    A multi-scale graph attention-based transformer for occluded person re-identification.Applied Sciences, 14(18):8279, 2024

    Ming Ma, Jianming Wang, and Bohan Zhao. A multi-scale graph attention-based transformer for occluded person re-identification.Applied Sciences, 14(18):8279, 2024

  19. [19]

    Generative adversarial patches for physical attacks on cross-modal pedestrian re-identification.arXiv preprint arXiv:2410.20097, 2024

    Yue Su, Hao Li, and Maoguo Gong. Generative adversarial patches for physical attacks on cross-modal pedestrian re-identification.arXiv preprint arXiv:2410.20097, 2024

  20. [20]

    Joint discriminative and generative learning for person re-identification

    Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. Joint discriminative and generative learning for person re-identification. Inproceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2138–2147, 2019

  21. [21]

    Pedestrian re-identification based on swin transformer

    Zifei Qin, Peishun Liu, Yibei Liu, Haiping Duan, Feifei Li, and Han Wang. Pedestrian re-identification based on swin transformer. In2022 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), pages 123–129. IEEE, 2022

  22. [22]

    Edgevpr: Transformer-based real-time video person re-identification at the edge

    Meng Sun, Ju Ren, and Yaoxue Zhang. Edgevpr: Transformer-based real-time video person re-identification at the edge. In2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS), pages 13–24. IEEE, 2024

  23. [23]

    Cvrecon: Rethinking 3d geometric feature learning for neural reconstruction

    Ziyue Feng, Liang Yang, Pengsheng Guo, and Bing Li. Cvrecon: Rethinking 3d geometric feature learning for neural reconstruction. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 17750–17760, 2023

  24. [24]

    Group multi-view transformer for 3d shape analysis with spatial encoding.IEEE Transactions on Multimedia, 26:9450–9463, 2024

    Lixiang Xu, Qingzhe Cui, Richang Hong, Wei Xu, Enhong Chen, Xin Yuan, Chenglong Li, and Yuanyan Tang. Group multi-view transformer for 3d shape analysis with spatial encoding.IEEE Transactions on Multimedia, 26:9450–9463, 2024

  25. [25]

    Vsformer: Mining correlations in flexible view set for multi-view 3d shape understanding.IEEE Transactions on Visualization and Computer Graphics, 31(4):2127–2141, 2024

    Hongyu Sun, Yongcai Wang, Peng Wang, Haoran Deng, Xudong Cai, and Deying Li. Vsformer: Mining correlations in flexible view set for multi-view 3d shape understanding.IEEE Transactions on Visualization and Computer Graphics, 31(4):2127–2141, 2024

  26. [26]

    In Defense of the Triplet Loss for Person Re-Identification

    Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017. 11 CityGuard

  27. [27]

    Circle loss: A unified perspective of pair similarity optimization

    Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, and Yichen Wei. Circle loss: A unified perspective of pair similarity optimization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6398–6407, 2020

  28. [28]

    Learnable dynamic margin in deep metric learning.Pattern Recognition, 132:108961, 2022

    Yifan Wang, Pingping Liu, Yijun Lang, Qiuzhan Zhou, and Xue Shan. Learnable dynamic margin in deep metric learning.Pattern Recognition, 132:108961, 2022

  29. [29]

    Tig-cl: teacher-guided individual and group aware contrastive learning for unsupervised person re-identification in internet of things.IEEE Internet of Things Journal, 2024

    Xiao Teng, Chuan Li, Xueqiong Li, Xinwang Liu, and Long Lan. Tig-cl: teacher-guided individual and group aware contrastive learning for unsupervised person re-identification in internet of things.IEEE Internet of Things Journal, 2024

  30. [30]

    Dual-graph contrastive learning for unsupervised person reidentification.IEEE Transactions on Cognitive and Developmental Systems, 16(4):1352–1363, 2024

    Lin Zhang, Ran Song, Yifan Wang, Qian Zhang, and Wei Zhang. Dual-graph contrastive learning for unsupervised person reidentification.IEEE Transactions on Cognitive and Developmental Systems, 16(4):1352–1363, 2024

  31. [31]

    An image–text dual-channel union network for person re-identification.IEEE Transactions on Instrumentation and Measurement, 72:1–16, 2023

    Baoguang Qi, Yi Chen, Qiang Liu, Xiaohai He, Linbo Qing, Ray E Sheriff, and Honggang Chen. An image–text dual-channel union network for person re-identification.IEEE Transactions on Instrumentation and Measurement, 72:1–16, 2023

  32. [32]

    Image-text-image knowledge transfer for lifelong person re-identification with hybrid clothing states.IEEE Transactions on Image Processing, 2025

    Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, and Xiangyang Xue. Image-text-image knowledge transfer for lifelong person re-identification with hybrid clothing states.IEEE Transactions on Image Processing, 2025

  33. [33]

    Looking clearer with text: A hierarchical context blending network for occluded person re-identification.IEEE Transactions on Information Forensics and Security, 2025

    Changshuo Wang, Shuting He, Meiqing Wu, Siew-Kei Lam, Prayag Tiwari, and Xingyu Gao. Looking clearer with text: A hierarchical context blending network for occluded person re-identification.IEEE Transactions on Information Forensics and Security, 2025

  34. [34]

    Syrer: Synergistic relational reasoning for rgb-d cross-modal re-identification.IEEE Transactions on Multimedia, 26:5600–5614, 2023

    Hao Liu, Jingjing Wu, Feng Li, Jianguo Jiang, and Richang Hong. Syrer: Synergistic relational reasoning for rgb-d cross-modal re-identification.IEEE Transactions on Multimedia, 26:5600–5614, 2023

  35. [35]

    Rxnet: cross-modality person re-identification based on a dual-branch network: W

    Weiyang Zhang, Jiong Guo, Qiang Liu, Maoyang Zou, Honggang Chen, and Jing Peng. Rxnet: cross-modality person re-identification based on a dual-branch network: W. zhang et al.Applied Intelligence, 55(15):993, 2025

  36. [36]

    Reliable cross-camera learning in random camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 34(6):4556–4567, 2023

    Zhengqi Liu, Yutian Lin, Tianyang Liu, and Bo Du. Reliable cross-camera learning in random camera person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 34(6):4556–4567, 2023

  37. [37]

    Event-driven re-id: A new benchmark and method towards privacy-preserving person re-identification

    Shafiq Ahmad, Gianluca Scarpellini, Pietro Morerio, and Alessio Del Bue. Event-driven re-id: A new benchmark and method towards privacy-preserving person re-identification. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 459–468, 2022

  38. [38]

    Diffphysba: Diffusion-based physical backdoor attack against person re-identification in real-world.arXiv preprint arXiv:2405.19990, 2024

    Wenli Sun, Xinyang Jiang, Dongsheng Li, and Cairong Zhao. Diffphysba: Diffusion-based physical backdoor attack against person re-identification in real-world.arXiv preprint arXiv:2405.19990, 2024

  39. [39]

    Generative metric learning for adversarially robust open-world person re-identification.ACM Transactions on Multimedia Computing, Communications and Applications, 19(1):1–19, 2023

    Deyin Liu, Lin Wu, Richang Hong, Zongyuan Ge, Jialie Shen, Farid Boussaid, and Mohammed Bennamoun. Generative metric learning for adversarially robust open-world person re-identification.ACM Transactions on Multimedia Computing, Communications and Applications, 19(1):1–19, 2023

  40. [40]

    A two-stream dynamic pyramid representation model for video-based person re-identification.IEEE Transactions on Image Processing, 30:6266–6276, 2021

    Xi Yang, Liangchen Liu, Nannan Wang, and Xinbo Gao. A two-stream dynamic pyramid representation model for video-based person re-identification.IEEE Transactions on Image Processing, 30:6266–6276, 2021

  41. [41]

    Context-aided semantic-aware self-alignment for video-based person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 2025

    Zhidan Ran, Zhiyao Xiao, Xiaobo Lu, Xuan Wei, and Wei Liu. Context-aided semantic-aware self-alignment for video-based person re-identification.IEEE Transactions on Circuits and Systems for Video Technology, 2025

  42. [42]

    Similarity distribution based membership inference attack on person re-identification

    Junyao Gao, Xinyang Jiang, Huishuai Zhang, Yifan Yang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, and Cairong Zhao. Similarity distribution based membership inference attack on person re-identification. InProceedings of the AAAI conference on artificial intelligence, volume 37, pages 14820–14828, 2023

  43. [43]

Junyao Gao, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, and Cairong Zhao. Re-ID-Leak: Membership inference attacks against person re-identification. International Journal of Computer Vision, 132(10):4673–4687, 2024

  44. [44]

Mang Ye, Wei Shen, Junwu Zhang, Yao Yang, and Bo Du. SecureReID: Privacy-preserving anonymization for person re-identification. IEEE Transactions on Information Forensics and Security, 19:2840–2853, 2024

  45. [45]

Lamyanba Laishram, Muhammad Shaheryar, Jong Taek Lee, and Soon Ki Jung. Toward a privacy-preserving face recognition system: A survey of leakages and solutions. ACM Computing Surveys, 57(6):1–38, 2025

  46. [46]

Andrea Artioli, Luca Bedogni, and Mauro Leoncini. Re-identification attack based on few-hints dataset enrichment for ubiquitous applications. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT), pages 1–6. IEEE, 2022

  47. [47]

Lucas Maris, Yuki Matsuda, and Keiichi Yasumoto. Differential privacy and k-anonymity for pedestrian image data: Impact on cross-camera person re-identification and demographic predictions. ACM Transactions on Cyber-Physical Systems, 9(4):1–31, 2025

  48. [48]

Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In Proceedings of the IEEE international conference on computer vision, pages 1116–1124, 2015

  49. [49]

Liang Zheng, Zhi Bie, Yifan Sun, Jingdong Wang, Chi Su, Shengjin Wang, and Qi Tian. MARS: A video benchmark for large-scale person re-identification. In European conference on computer vision, pages 868–884. Springer, 2016

  50. [50]

Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer GAN to bridge domain gap for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 79–88, 2018

  51. [51]

Dat Tien Nguyen, Hyung Gil Hong, Ki Wan Kim, and Kang Ryoung Park. Person recognition system based on a combination of body images from visible light and thermal cameras. Sensors, 17(3):605, 2017

  52. [52]

Ancong Wu, Wei-Shi Zheng, Hong-Xing Yu, Shaogang Gong, and Jianhuang Lai. RGB-infrared cross-modality person re-identification. In Proceedings of the IEEE international conference on computer vision, pages 5380–5389, 2017

  53. [53]

Jiaxuan Zhuo, Zeyu Chen, Jianhuang Lai, and Guangcong Wang. Occluded person re-identification. In 2018 IEEE international conference on multimedia and expo (ICME), pages 1–6. IEEE, 2018

  54. [54]

Wei-Shi Zheng, Xiang Li, Tao Xiang, Shengcai Liao, Jianhuang Lai, and Shaogang Gong. Partial person re-identification. In Proceedings of the IEEE international conference on computer vision, pages 4678–4686, 2015

  55. [55]

Lingxiao He, Jian Liang, Haiqing Li, and Zhenan Sun. Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7073–7082, 2018

  56. [56]

Hehan Teng, Tao He, Yuchen Guo, and Guiguang Ding. A high-accuracy unsupervised person re-identification method using auxiliary information mined from datasets. arXiv preprint arXiv:2205.03124, 2022

  57. [57]

Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, and Shao-Yi Chien. Spatially and temporally efficient non-local attention network for video-based person re-identification. arXiv preprint arXiv:1908.01683, 2019

  58. [58]

Zhikang Wang, Lihuo He, Xiaoguang Tu, Jian Zhao, Xinbo Gao, Shengmei Shen, and Jiashi Feng. Robust video-based person re-identification by hierarchical mining. IEEE Transactions on Circuits and Systems for Video Technology, 32(12):8179–8191, 2021

  59. [59]

Yang Fu, Xiaoyang Wang, Yunchao Wei, and Thomas Huang. STA: Spatial-temporal attention for large-scale video-based person re-identification. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 8287–8294, 2019

  60. [60]

Arulkumar Subramaniam, Athira Nambiar, and Anurag Mittal. Co-segmentation inspired attention networks for video-based person re-identification. In Proceedings of the IEEE/CVF international conference on computer vision, pages 562–572, 2019

  61. [61]

Jianing Li, Jingdong Wang, Qi Tian, Wen Gao, and Shiliang Zhang. Global-local temporal representations for video person re-identification. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3958–3967, 2019

  62. [62]

Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, and Xilin Chen. VRSTC: Occlusion-free video person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7183–7192, 2019

  63. [63]

Zunwang Ke, Guozhi Sun, Run Guo, Minghua Du, and Yugui Zhang. Person re-identification method based on video spatial feature enhancement. In 2024 IEEE International Conference on Cognitive Computing and Complex Data (ICCD), pages 23–30. IEEE, 2024

  64. [64]

Xi Yang, Xian Wang, Liangchen Liu, Nannan Wang, and Xinbo Gao. STFE: A comprehensive video-based person re-identification network based on spatio-temporal feature enhancement. IEEE Transactions on Multimedia, 26:7237–7249, 2024

  65. [65]

Bo Sun, Yulong Zhang, Jianan Wang, and Chunmao Jiang. Dual-branch occlusion-aware semantic part-features extraction network for occluded person re-identification. Mathematics, 13(15):2432, 2025

  66. [66]

Lei Tan, Pingyang Dai, Jie Chen, Liujuan Cao, Yongjian Wu, and Rongrong Ji. PartFormer: Awakening latent diverse representation from vision transformer for object re-identification. arXiv preprint arXiv:2408.16684, 2024

  67. [67]

Yifan Sun, Liang Zheng, Weijian Deng, and Shengjin Wang. SVDNet for pedestrian retrieval. In Proceedings of the IEEE international conference on computer vision, pages 3800–3808, 2017

  68. [68]

Yanping Li, Yizhang Liu, Hongyun Zhang, Cairong Zhao, Zhihua Wei, and Duoqian Miao. Occlusion-aware transformer with second-order attention for person re-identification. IEEE Transactions on Image Processing, 2024

  69. [69]

Ke Niu, Haiyang Yu, Mengyang Zhao, Teng Fu, Siyang Yi, Wei Lu, Bin Li, Xuelin Qian, and Xiangyang Xue. ChatReID: Open-ended interactive person retrieval via hierarchical progressive tuning for vision language models. arXiv preprint arXiv:2502.19958, 2025

  70. [70]

Quang-Huy Che, Le-Chuong Nguyen, Gia-Nghia Tran, Dinh-Duy Phan, and Vinh-Tiep Nguyen. A re-ranking method using k-nearest weighted fusion for person re-identification. arXiv preprint arXiv:2509.04050, 2025

  71. [71]

Yangqi Zheng, Liang Zhang, and Jun Liang. FAA-Net: Enhancing person re-identification through local-global feature association attention. Soft Computing, pages 1–15, 2025

  72. [72]

Jingyi Wen. DTC-CINet: Dynamic token compensation and cross-layer interaction network for occluded person re-identification. In 2025 IEEE 3rd International Conference on Image Processing and Computer Applications (ICIPCA), pages 214–222. IEEE, 2025

  73. [73]

Pengfei Wu, Le Wang, Sanping Zhou, Gang Hua, and Changyin Sun. Temporal correlation vision transformer for video person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 6083–6091, 2024

  74. [74]

Zhikang Wang, Feng Zhu, Shixiang Tang, Rui Zhao, Lihuo He, and Jiangning Song. Feature erasing and diffusion network for occluded person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4754–4763, 2022

  75. [75]

David A McAllester. PAC-Bayesian model averaging. In Proceedings of the twelfth annual conference on Computational learning theory, pages 164–170, 1999

  76. [76]

Pascal Germain, Francis Bach, Alexandre Lacoste, and Simon Lacoste-Julien. PAC-Bayesian theory meets Bayesian inference. Advances in Neural Information Processing Systems, 29, 2016

  77. [77]

Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, and Shengjin Wang. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the European conference on computer vision (ECCV), pages 480–496, 2018

  78. [78]

Guan’an Wang, Shuo Yang, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Erjin Zhou, and Jian Sun. High-order information matters: Learning relation and topology for occluded person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6449–6458, 2020

  79. [79]

Pengfei Wang, Changxing Ding, Zhiyin Shao, Zhibin Hong, Shengli Zhang, and Dacheng Tao. Quality-aware part models for occluded person re-identification. IEEE Transactions on Multimedia, 25:3154–3165, 2022

  80. [80]

Meiyan Huang, Chunping Hou, Qingyuan Yang, and Zhipeng Wang. Reasoning and tuning: Graph attention network for occluded person re-identification. IEEE Transactions on Image Processing, 32:1568–1582, 2023
