Graph Neural Based End-to-end Data Association Framework for Online Multiple-Object Tracking

Peizhao Li; Xiantong Zhen; Xiaolong Jiang; Yanjing Li

arxiv: 1907.05315 · v1 · pith:N462GFOXnew · submitted 2019-07-11 · 💻 cs.CV

Pith reviewed 2026-05-24 23:05 UTC · model grok-4.3

classification 💻 cs.CV

keywords: multiple object tracking, data association, graph neural networks, bipartite matching, end-to-end learning, affinity learning, online tracking, motion cues

The pith

A graph neural network can solve maximum weighted bipartite matching for data association in online multiple object tracking directly from detections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an end-to-end neural framework that treats frame-by-frame data association in multiple object tracking as a maximum weighted bipartite matching problem. An affinity module computes similarities between objects using both appearance and motion features, which then become edge weights on a bipartite graph. A graph neural network optimization module solves this matching problem while adapting to different numbers of detections. Training uses a multi-level matrix loss with assembled supervision so all components learn together. The result is a tracker that requires less manual tuning and shows stronger performance on standard MOT benchmarks.
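For reference, maximum weighted bipartite matching is commonly written as the integer program below. The notation is an assumption introduced here and is not taken from the paper: $N$ existing tracks, $M$ new detections, $a_{ij}$ the affinity between track $i$ and detection $j$, and $x_{ij}$ the binary assignment indicator.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i=1}^{N}\sum_{j=1}^{M} a_{ij}\,x_{ij} \\
\text{s.t.}\quad & \sum_{j=1}^{M} x_{ij} \le 1 \quad \forall i, \qquad
\sum_{i=1}^{N} x_{ij} \le 1 \quad \forall j, \qquad
x_{ij} \in \{0,1\}.
\end{aligned}
```

The inequality constraints let a track or a detection stay unmatched, which is one standard way to accommodate $N \neq M$.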

Core claim

The central claim is that an end-to-end network with an affinity learning module and a graph neural network optimization module can resolve the data association problem in online MOT by learning to solve the maximum weighted bipartite matching task, allowing the entire system to co-adapt during training and handle varying object cardinalities with good scalability.

What carries the argument

The graph neural network optimization module that takes computed affinities as edge weights and solves the maximum weighted bipartite matching problem while adapting to varying numbers of detections.
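One way to see why a graph network can accept any number of detections is that its parameters are shared across nodes and edges, so nothing in the model depends on the matrix size. The toy update below is a minimal sketch of that idea only: the function name, the scalar weights `w_self` and `w_ctx` (stand-ins for learned parameters), and the mean aggregation are all assumptions made for illustration, not the paper's architecture.

```python
import math

def bipartite_message_pass(affinity, rounds=2, w_self=1.0, w_ctx=0.5):
    """Refine an N x M affinity matrix with a toy message-passing scheme.

    Hypothetical sketch: w_self and w_ctx stand in for learned parameters.
    They are shared by every edge, so the same function runs on any N and M.
    """
    n, m = len(affinity), len(affinity[0])
    scores = [row[:] for row in affinity]
    for _ in range(rounds):
        # Each track node (row) aggregates the mean score of its edges.
        row_ctx = [sum(scores[i]) / m for i in range(n)]
        # Each detection node (column) aggregates the mean score of its edges.
        col_ctx = [sum(scores[i][j] for i in range(n)) / n for j in range(m)]
        # Edge update: own score minus the competing context from both endpoints.
        scores = [
            [math.tanh(w_self * scores[i][j] - w_ctx * (row_ctx[i] + col_ctx[j]))
             for j in range(m)]
            for i in range(n)
        ]
    return scores

# The same parameters handle a 2 x 3 and a 4 x 2 problem.
small = bipartite_message_pass([[0.9, 0.1, 0.2], [0.3, 0.8, 0.1]])
tall = bipartite_message_pass([[0.9, 0.1], [0.2, 0.7], [0.4, 0.4], [0.1, 0.3]])
```

The output has the same shape as the input, but its entries are still real-valued scores rather than a discrete assignment.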

If this is right

All modules in the tracker co-adapt during joint training, improving overall model adaptiveness.
The system handles association problems with changing numbers of detections without fixed-size assumptions.
Parameter tuning effort decreases because the network learns the matching process directly.
The approach integrates appearance and motion cues into a single trainable pipeline for online tracking.
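To make "appearance and motion cues as edge weights" concrete, the sketch below builds an affinity matrix from two hand-written terms. It is only a stand-in: the paper learns its affinities with a Siamese network, whereas here the cosine similarity of appearance embeddings, the box IoU used as a crude motion cue, and the blending weight `alpha` are all hypothetical choices.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def affinity_matrix(tracks, detections, alpha=0.5):
    """Blend appearance and overlap into an N x M matrix of edge weights.

    tracks and detections are lists of (embedding, box) pairs; alpha is a
    hypothetical hand-set weight, not a learned quantity.
    """
    return [
        [alpha * cosine_similarity(t_emb, d_emb) + (1 - alpha) * iou(t_box, d_box)
         for d_emb, d_box in detections]
        for t_emb, t_box in tracks
    ]
```

In an end-to-end tracker the fixed blend would be replaced by learned features, which is exactly the hand-tuning the paper aims to remove.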

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph neural network approach to matching could apply to other vision tasks that reduce to bipartite assignment.
Replacing traditional solvers with learned optimization might lower computational overhead in real-time systems.
End-to-end training of association could allow trackers to adjust automatically to new camera setups or object types.

Load-bearing premise

The graph neural network can reliably approximate optimal solutions to the maximum weighted bipartite matching problem for different numbers of objects without post-processing or separate solvers.

What would settle it

Compare the assignments produced by the trained graph neural network against exact solutions from a standard bipartite matching solver on sequences with known ground-truth associations and varying object counts; systematic mismatches would falsify the claim.
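The check described above could be prototyped along the following lines. This is a sketch under stated assumptions: the exact solver is a brute-force search that is only practical for a handful of objects (a Hungarian-style solver would be used at realistic sizes), every row is assumed to be matched, and the function names are hypothetical.

```python
from itertools import permutations

def exact_matching(affinity):
    """Brute-force maximum-weight matching for a small N x M matrix, N <= M.

    Returns the column chosen for each row and the total weight.
    """
    n, m = len(affinity), len(affinity[0])
    best_cols, best_score = None, float("-inf")
    for cols in permutations(range(m), n):
        score = sum(affinity[i][cols[i]] for i in range(n))
        if score > best_score:
            best_cols, best_score = cols, score
    return list(best_cols), best_score

def agreement_rate(predicted_cols, affinity):
    """Fraction of rows whose predicted column equals the exact solution."""
    exact_cols, _ = exact_matching(affinity)
    hits = sum(p == e for p, e in zip(predicted_cols, exact_cols))
    return hits / len(exact_cols)
```

Running `agreement_rate` over many frames, grouped by object count, would show whether mismatches grow with cardinality.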

Figures

Figures reproduced from arXiv: 1907.05315 by Peizhao Li, Xiantong Zhen, Xiaolong Jiang, Yanjing Li.

**Figure 1.** The pipeline of the proposed framework. It consists of the Siamese Network for affinity computation and the Graph Neural …

**Figure 2.** The pipeline of the proposed optimization module based …

**Figure 3.** Visualization of tracking results on MOT17 benchmark …

Original abstract

In this work, we present an end-to-end framework to settle data association in online Multiple-Object Tracking (MOT). Given detection responses, we formulate the frame-by-frame data association as Maximum Weighted Bipartite Matching problem, whose solution is learned using a neural network. The network incorporates an affinity learning module, wherein both appearance and motion cues are investigated to encode object feature representation and compute pairwise affinities. Employing the computed affinities as edge weights, the following matching problem on a bipartite graph is resolved by the optimization module, which leverages a graph neural network to adapt with the varying cardinalities of the association problem and solve the combinatorial hardness with favorable scalability and compatibility. To facilitate effective training of the proposed tracking network, we design a multi-level matrix loss in conjunction with the assembled supervision methodology. Being trained end-to-end, all modules in the tracker can co-adapt and co-operate collaboratively, resulting in improved model adaptiveness and less parameter-tuning efforts. Experiment results on the MOT benchmarks demonstrate the efficacy of the proposed approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper proposes an end-to-end GNN for learning both affinities and bipartite matching in online MOT, but the abstract leaves the exact inference procedure for valid assignments unclear.

The main point here is a framework that folds affinity computation and the matching step into one trainable network for frame-by-frame data association in multiple-object tracking. The affinity module combines appearance and motion cues, then feeds the resulting edge weights into a graph neural network that is supposed to handle the maximum weighted bipartite matching while adjusting to different numbers of detections. A multi-level matrix loss is used to supervise the whole pipeline jointly so the parts can co-adapt during training.

This is a clear attempt to move away from separate affinity networks plus classical solvers like Hungarian, and the direction is reasonable for reducing hand-tuned components in online trackers. The positioning around scalability and compatibility with varying cardinalities is the central new angle relative to the usual two-stage pipelines in the MOT literature. The multi-level loss is a practical choice for end-to-end training and could help stability.

The soft spot is the optimization module itself. The abstract states that the GNN resolves the combinatorial problem, yet standard message-passing GNNs produce soft scores rather than guaranteed permutation matrices. Without seeing the precise inference steps, it is possible that argmax, Sinkhorn normalization, or an external solver is still applied at test time, which would undercut the claim of directly solving the hardness inside the network. The training loss alone does not ensure feasible outputs on new inputs with unseen detection counts. That concern from the stress-test note is worth checking in the methods section.

The paper is aimed at the MOT community and readers already exploring graph-based learned solvers for association. It could be useful to see the architecture and benchmark numbers, but only after the full implementation details are available.

I would send it for peer review because the problem is relevant and the overall framing is coherent, even if the GNN matching claim needs verification.

Referee Report

1 major / 0 minor

Summary. The paper proposes an end-to-end neural framework for online multiple-object tracking that formulates frame-by-frame data association as a maximum weighted bipartite matching problem. An affinity learning module encodes appearance and motion cues to produce edge weights; these are fed to a graph neural network optimization module that is claimed to adapt to varying detection cardinalities and solve the combinatorial problem directly. Training uses a multi-level matrix loss with assembled supervision, allowing all modules to co-adapt.

Significance. If the GNN optimization module produces valid, high-quality matchings for arbitrary cardinalities without external solvers or post-processing, the work would advance fully differentiable MOT pipelines and reduce reliance on hand-tuned components. The multi-level loss and end-to-end training are presented as enabling better adaptability on MOT benchmarks.

major comments (1)

[Abstract / Optimization Module] The central claim that the optimization module 'leverages a graph neural network to adapt with the varying cardinalities of the association problem and solve the combinatorial hardness' without post-hoc adjustments is load-bearing for the 'end-to-end' and 'no separate solvers' assertions. Standard message-passing GNNs on bipartite graphs output soft scores; converting them to feasible permutation matrices for unseen cardinalities typically requires argmax, Sinkhorn normalization, or an external solver such as Hungarian. The multi-level matrix loss supervises toward ground-truth assignments only during training and does not guarantee feasible or optimal outputs at inference. Concrete evidence (architecture diagram, inference procedure, or ablation removing any post-processing) is required to substantiate the claim.
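The feasibility concern can be illustrated with a small, made-up score matrix: taking the best column for each row independently may assign the same detection to two tracks. The matrix and helper names below are hypothetical and exist only to show the failure mode.

```python
def rowwise_argmax(scores):
    """Pick the best-scoring column for each row independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]

def is_one_to_one(cols):
    """True if no column is claimed by more than one row."""
    return len(set(cols)) == len(cols)

# Made-up soft scores: rows 0 and 1 both prefer column 0.
soft_scores = [
    [0.6, 0.3, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
]
picks = rowwise_argmax(soft_scores)   # [0, 0, 2]
feasible = is_one_to_one(picks)       # False: column 0 is used twice
```

Any decoding rule that resolves such collisions is itself a post-processing step, which is why the inference procedure matters for the end-to-end claim.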

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. Below we respond point-by-point to the major comment, offering clarification on the optimization module while committing to revisions that strengthen the presentation of our claims.

Referee: [Abstract / Optimization Module] The central claim that the optimization module 'leverages a graph neural network to adapt with the varying cardinalities of the association problem and solve the combinatorial hardness' without post-hoc adjustments is load-bearing for the 'end-to-end' and 'no separate solvers' assertions. Standard message-passing GNNs on bipartite graphs output soft scores; converting them to feasible permutation matrices for unseen cardinalities typically requires argmax, Sinkhorn normalization, or an external solver such as Hungarian. The multi-level matrix loss supervises toward ground-truth assignments only during training and does not guarantee feasible or optimal outputs at inference. Concrete evidence (architecture diagram, inference procedure, or ablation removing any post-processing) is required to substantiate the claim.

Authors: We appreciate the referee highlighting the need for precision on this central aspect of the framework. The optimization module constructs a bipartite graph whose nodes correspond to detections in the current and previous frames (thus naturally accommodating arbitrary cardinalities) and whose edges are initialized with affinities from the appearance-motion module. Successive GNN layers perform message passing that refines these affinities into an output matrix whose entries directly encode assignment decisions. The multi-level matrix loss, applied with assembled supervision, explicitly penalizes deviations from the ground-truth assignment matrix at multiple resolutions, encouraging the network to produce outputs that are already close to valid permutation matrices. At inference the GNN output is used to recover the matching by selecting the highest-scoring entries while enforcing the one-to-one constraint implicit in the learned representation; no external combinatorial solver is invoked. This design keeps the entire pipeline differentiable. We nevertheless recognize that the manuscript would benefit from greater transparency. In the revision we will add an architecture diagram of the optimization module, a step-by-step description of the inference procedure that converts the GNN output into a feasible matching for unseen cardinalities, and an ablation that isolates the contribution of any minimal post-processing steps. revision: yes
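One plausible reading of "selecting the highest-scoring entries while enforcing the one-to-one constraint" is a greedy decode. The sketch below is an assumption about what such a step might look like, not the authors' documented procedure; the function name and the `threshold` parameter are hypothetical.

```python
def greedy_one_to_one(scores, threshold=0.0):
    """Repeatedly take the highest remaining score whose row and column are free.

    Returns a dict mapping row index to column index. Entries scoring below
    threshold are never matched.
    """
    entries = sorted(
        ((s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)),
        key=lambda t: t[0],
        reverse=True,
    )
    used_rows, used_cols, matches = set(), set(), {}
    for s, i, j in entries:
        if s < threshold:
            break  # all remaining entries score lower still
        if i not in used_rows and j not in used_cols:
            matches[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return matches

# Greedy takes 0.9 first and is then forced into 0.1, for a total of 1.0.
# The other pairing, 0.8 + 0.8, has the higher total of 1.6.
result = greedy_one_to_one([[0.9, 0.8], [0.8, 0.1]])
```

The output is always a valid one-to-one matching, but as the example shows it is not guaranteed to be the maximum-weight one, so a decode of this kind would count as post-processing in the referee's sense.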

Circularity Check

0 steps flagged

No circularity: framework trained end-to-end on external data with no self-definitional reductions

The paper formulates data association as a maximum weighted bipartite matching problem and learns its solution via a neural network with affinity and optimization modules. All components are trained on labeled tracking data using a multi-level matrix loss; outputs are not equivalent to inputs by construction, nor are any predictions statistically forced from fitted subsets. No self-citations, uniqueness theorems, or ansatzes imported from prior author work appear in the provided text. The derivation chain is therefore self-contained against external benchmarks and does not reduce to renaming or tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review limits visibility into free parameters or invented entities; the central claim rests on the domain assumption that a GNN can learn to solve the combinatorial matching task.

axioms (1)

domain assumption: A graph neural network can adapt to varying cardinalities and solve the maximum weighted bipartite matching problem with favorable scalability.
Invoked in the abstract when describing the optimization module.

pith-pipeline@v0.9.0 · 5714 in / 1162 out tokens · 40728 ms · 2026-05-24T23:05:22.207549+00:00 · methodology

Reference graph

Works this paper leans on

95 extracted references · 95 canonical work pages · 20 internal anchors

[1]

Alahi, K

A. Alahi, K. Goel, V . Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese. Social lstm: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pages 961–971, 2016

work page 2016
[2]

S. Avidan. Ensemble tracking. IEEE transactions on pattern analysis and machine intelligence, 29(2), 2007

work page 2007
[3]

Bae and K.-J

S.-H. Bae and K.-J. Yoon. Conﬁdence-based data associa- tion and discriminative deep appearance learning for robust online multi-object tracking. IEEE transactions on pattern analysis and machine intelligence, 40(3):595–610, 2018

work page 2018
[4]

Balas and M

E. Balas and M. W. Padberg. Set partitioning: A survey. SIAM review, 18(4):710–760, 1976

work page 1976
[5]

P. W. Battaglia, J. B. Hamrick, V . Bapst, A. Sanchez- Gonzalez, V . Zambaldi, M. Malinowski, A. Tacchetti, D. Ra- poso, A. Santoro, R. Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

Beyer, S

L. Beyer, S. Breuers, V . Kurin, and B. Leibe. Towards a principled integration of multi-camera re-identiﬁcation and tracking through optimal bayes ﬁlters. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2017 IEEE Conference on, pages 1444–1453. IEEE, 2017

work page 2017
[7]

Bochinski, V

E. Bochinski, V . Eiselein, and T. Sikora. High-speed tracking-by-detection without using image information. In Advanced Video and Signal Based Surveillance (AVSS), 2017 14th IEEE International Conference on , pages 1–6. IEEE, 2017

work page 2017
[8]

Brendel, M

W. Brendel, M. Amer, and S. Todorovic. Multiobject track- ing as maximum weight independent set. InComputer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1273–1280. IEEE, 2011

work page 2011
[9]

Brendel and S

W. Brendel and S. Todorovic. Learning spatiotemporal graphs of human activities. InComputer vision (ICCV), 2011 IEEE international conference on , pages 778–785. IEEE, 2011

work page 2011
[10]

M. M. Bronstein, J. Bruna, Y . LeCun, A. Szlam, and P. Van- dergheynst. Geometric deep learning: going beyond eu- clidean data. IEEE Signal Processing Magazine, 34(4):18– 42, 2017

work page 2017
[11]

Cai and G

Y . Cai and G. Medioni. Exploring context information for inter-camera multiple target tracking. In Applications of Computer Vision (WACV), 2014 IEEE Winter Conference on, pages 761–768. IEEE, 2014

work page 2014
[12]

X. Cao, X. Jiang, X. Li, and P. Yan. Correlation-based track- ing of multiple targets with hierarchical layered structure. IEEE transactions on cybernetics, 48(1):90–102, 2018

work page 2018
[13]

J. Chen, H. Sheng, Y . Zhang, and Z. Xiong. Enhancing de- tection model for multiple hypothesis tracking. In Conf. on Computer Vision and Pattern Recognition Workshops, pages 2143–2152, 2017

work page 2017
[14]

W. Choi. Near-online multi-target tracking with aggregated local ﬂow descriptor. In Proceedings of the IEEE inter- national conference on computer vision , pages 3029–3037, 2015

work page 2015
[15]

Choi and S

W. Choi and S. Savarese. A uniﬁed framework for multi- target tracking and collective activity recognition. In Eu- ropean Conference on Computer Vision , pages 215–230. Springer, 2012

work page 2012
[16]

Chopra, R

S. Chopra, R. Hadsell, and Y . LeCun. Learning a similar- ity metric discriminatively, with application to face veriﬁca- tion. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on , vol- ume 1, pages 539–546. IEEE, 2005

work page 2005
[17]

Q. Chu, W. Ouyang, H. Li, X. Wang, B. Liu, and N. Yu. Online multi-object tracking using cnn-based single ob- ject tracker with spatial-temporal attention mechanism. In 2017 IEEE International Conference on Computer Vision (ICCV).(Oct 2017), pages 4846–4855, 2017

work page 2017
[19]

R. T. Collins. Multitarget data association with higher-order motion models. In Computer Vision and Pattern Recogni- tion (CVPR), 2012 IEEE Conference on , pages 1744–1751. IEEE, 2012

work page 2012
[20]

H. Dai, E. B. Khalil, Y . Zhang, B. Dilkina, and L. Song. Learning combinatorial optimization algorithms over graphs. arXiv preprint arXiv:1704.01665, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[21]

De Cao and T

N. De Cao and T. Kipf. Molgan: An implicit gener- ative model for small molecular graphs. arXiv preprint arXiv:1805.11973, 2018

work page arXiv 2018
[22]

Dehghan, S

A. Dehghan, S. Modiri Assari, and M. Shah. Gmmcp tracker: Globally optimal generalized maximum multi clique prob- lem for multiple object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition , pages 4091–4099, 2015

work page 2015
[23]

Dehghan, Y

A. Dehghan, Y . Tian, P. H. Torr, and M. Shah. Target identity-aware network ﬂow for online multiple target track- ing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1146–1154, 2015

work page 2015
[24]

M. Ding, J. Tang, and J. Zhang. Semi-supervised learning on graphs with generative adversarial nets. arXiv preprint arXiv:1809.00130, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[25]

Dong and J

X. Dong and J. Shen. Triplet loss in siamese network for object tracking. In Proceedings of the European Conference on Computer Vision (ECCV), pages 459–474, 2018

work page 2018
[26]

Eiselein, D

V . Eiselein, D. Arp, M. P ¨atzold, and T. Sikora. Real-time multi-human tracking using a probability hypothesis density ﬁlter and multiple detectors. In Advanced Video and Signal- Based Surveillance (AVSS), 2012 IEEE Ninth International Conference on, pages 325–330. IEEE, 2012

work page 2012
[27]

Few-Shot Learning with Graph Neural Networks

V . Garcia and J. Bruna. Few-shot learning with graph neural networks. arXiv preprint arXiv:1711.04043, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[29]

Neural Message Passing for Quantum Chemistry

J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl. Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[30]

Goodfellow, Y

I. Goodfellow, Y . Bengio, A. Courville, and Y . Bengio.Deep learning, volume 1. MIT press Cambridge, 2016

work page 2016
[31]

M. Gori, G. Monfardini, and F. Scarselli. A new model for learning in graph domains. In Neural Networks, 2005. IJCNN’05. Proceedings. 2005 IEEE International Joint Con- ference on, volume 2, pages 729–734. IEEE, 2005

work page 2005
[32]

A. He, C. Luo, X. Tian, and W. Zeng. A twofold siamese network for real-time object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recogni- tion, pages 4834–4843, 2018

work page 2018
[33]

Q. He, J. Wu, G. Yu, and C. Zhang. Sot for mot. arXiv preprint arXiv:1712.01059, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[34]

W. Hu, X. Li, W. Luo, X. Zhang, S. Maybank, and Z. Zhang. Single and multiple object tracking using log-euclidean rie- mannian subspace and block-division appearance model. IEEE Transactions on Pattern Analysis and Machine Intel- ligence, 34(12):2420–2440, 2012

work page 2012
[35]

Huang, B

C. Huang, B. Wu, and R. Nevatia. Robust object tracking by hierarchical association of detection responses. In European Conference on Computer Vision , pages 788–801. Springer, 2008

work page 2008
[36]

Javed, K

O. Javed, K. Shaﬁque, Z. Rasheed, and M. Shah. Mod- eling inter-camera space–time and appearance relationships for tracking across non-overlapping views. Computer Vision and Image Understanding, 109(2):146–162, 2008

work page 2008
[37]

Jiang, P

X. Jiang, P. Li, X. Zhen, and X. Cao. Model-free tracking with deep appearance and motion features integration. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 101–110. IEEE, 2019

work page 2019
[38]

C. Kim, F. Li, and J. M. Rehg. Multi-object tracking with neural gating using bilinear lstm. In Proceedings of the Eu- ropean Conference on Computer Vision (ECCV), pages 200– 215, 2018

work page 2018
[39]

D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[40]

T. N. Kipf and M. Welling. Semi-supervised classiﬁca- tion with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[41]

H. W. Kuhn. The hungarian method for the assignment prob- lem. Naval research logistics quarterly, 2(1-2):83–97, 1955

work page 1955
[42]

C.-H. Kuo, C. Huang, and R. Nevatia. Multi-target track- ing by on-line learned discriminative appearance models. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 685–692. IEEE, 2010

work page 2010
[43]

Kutschbach, E

T. Kutschbach, E. Bochinski, V . Eiselein, and T. Sikora. Sequential sensor fusion combining probability hypothesis density and kernelized correlation ﬁlters for multi-object tracking in video data. In2017 14th IEEE International Con- ference on Advanced Video and Signal Based Surveillance (AVSS), pages 1–5. IEEE, 2017

work page 2017
[44]

Leal-Taix ´e, C

L. Leal-Taix ´e, C. Canton-Ferrer, and K. Schindler. Learn- ing by tracking: Siamese cnn for robust target association. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 33–40, 2016

work page 2016
[45]

MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking

L. Leal-Taix ´e, A. Milan, I. Reid, S. Roth, and K. Schindler. Motchallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[46]

Leal-Taix ´e, G

L. Leal-Taix ´e, G. Pons-Moll, and B. Rosenhahn. Everybody needs somebody: Modeling social and grouping behavior on a linear programming multiple people tracker. In Computer Vision Workshops (ICCV Workshops), 2011 IEEE Interna- tional Conference on, pages 120–127. IEEE, 2011

work page 2011
[47]

H. Li, Y . Li, and F. Porikli. Deeptrack: Learning discrimina- tive feature representations online for robust visual tracking. IEEE Transactions on Image Processing, 25(4):1834–1848, 2016

work page 2016
[48]

Y . Li, C. Huang, and R. Nevatia. Learning to associate: Hy- bridboosted multi-target tracker for crowded scene. 2009

work page 2009
[49]

MOT16: A Benchmark for Multi-Object Tracking

A. Milan, L. Leal-Taix ´e, I. Reid, S. Roth, and K. Schindler. Mot16: A benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[50]

Milan, S

A. Milan, S. H. Rezatoﬁghi, A. R. Dick, I. D. Reid, and K. Schindler. Online multi-target tracking using recurrent neural networks. In AAAI, volume 2, page 4, 2017

work page 2017
[51]

Milan, S

A. Milan, S. H. Rezatoﬁghi, R. Garg, A. R. Dick, and I. D. Reid. Data-driven approximations to np-hard problems. In AAAI, pages 1453–1459, 2017

work page 2017
[52]

Milan, S

A. Milan, S. Roth, and K. Schindler. Continuous energy min- imization for multitarget tracking. IEEE Trans. Pattern Anal. Mach. Intell., 36(1):58–72, 2014

work page 2014
[53]

Mitzel, E

D. Mitzel, E. Horbert, A. Ess, and B. Leibe. Multi-person tracking with sparse detection and continuous segmentation. In European Conference on Computer Vision , pages 397–

work page
[54]

Nowak, S

A. Nowak, S. Villar, A. S. Bandeira, and J. Bruna. Revised note on learning quadratic assignment with graph neural net- works. In 2018 IEEE Data Science Workshop (DSW), pages 1–5. IEEE, 2018

work page 2018
[55]

End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks

P. Ondruska, J. Dequaire, D. Z. Wang, and I. Posner. End- to-end tracking and semantic segmentation using recurrent neural networks. arXiv preprint arXiv:1604.05091, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[56]

Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks

P. Ondruska and I. Posner. Deep tracking: Seeing be- yond seeing using recurrent neural networks. arXiv preprint arXiv:1602.00991, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[57]

Pellegrini, A

S. Pellegrini, A. Ess, and L. Van Gool. Improving data as- sociation by joint modeling of pedestrian trajectories and groupings. In European conference on computer vision , pages 452–465. Springer, 2010

work page 2010
[58]

Pirsiavash, D

H. Pirsiavash, D. Ramanan, and C. C. Fowlkes. Globally- optimal greedy algorithms for tracking a variable num- ber of objects. In Computer Vision and Pattern Recogni- tion (CVPR), 2011 IEEE Conference on , pages 1201–1208. IEEE, 2011

work page 2011
[59]

Possegger, T

H. Possegger, T. Mauthner, P. M. Roth, and H. Bischof. Oc- clusion geodesics for online multi-object tracking. In Pro- ceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1306–1313, 2014

work page 2014
[60]

Qin and C

Z. Qin and C. R. Shelton. Improving multi-target tracking via social grouping. In Computer Vision and Pattern Recog- nition (CVPR), 2012 IEEE Conference on, pages 1972–1978. IEEE, 2012

work page 2012
[61]

Discovering objects and their relations from entangled scene representations

D. Raposo, A. Santoro, D. Barrett, R. Pascanu, T. Lilli- crap, and P. Battaglia. Discovering objects and their rela- tions from entangled scene representations. arXiv preprint arXiv:1702.05068, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[62]

Features for Multi-Target Multi-Camera Tracking and Re-Identification

E. Ristani and C. Tomasi. Features for multi-target multi-camera tracking and re-identiﬁcation. arXiv preprint arXiv:1803.10859, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[63]

Robicquet, A

A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese. Learning social etiquette: Human trajectory understanding in crowded scenes. In European conference on computer vi- sion, pages 549–565. Springer, 2016

work page 2016
[64]

Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies

A. Sadeghian, A. Alahi, and S. Savarese. Tracking the un- trackable: Learning to track multiple cues with long-term de- pendencies. arXiv preprint arXiv:1701.01909, 4(5):6, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[65]

Sanchez-Matilla, F

R. Sanchez-Matilla, F. Poiesi, and A. Cavallaro. Online multi-target tracking with strong and weak detections. In European Conference on Computer Vision , pages 84–99. Springer, 2016

work page 2016
[66]

Scarselli, M

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini. Computational capabilities of graph neu- ral networks. IEEE Transactions on Neural Networks , 20(1):81–102, 2009

work page 2009
[68]

Scarselli, M

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini. The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009

work page 2009
[69]

Schulter, P

S. Schulter, P. Vernaza, W. Choi, and M. Chandraker. Deep network ﬂow for multi-object tracking. In Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pages 2730–2739. IEEE, 2017

work page 2017
[70]

Scovanner and M

P. Scovanner and M. F. Tappen. Learning pedestrian dynam- ics from the real world. In Computer Vision, 2009 IEEE 12th International Conference on, pages 381–388. IEEE, 2009

work page 2009
[71]

G. Shu, A. Dehghan, O. Oreifej, E. Hand, and M. Shah. Part- based multiple-person tracking with partial occlusion han- dling. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 1815–1821. IEEE, 2012

work page 2012
[72]

J. Son, M. Baek, M. Cho, and B. Han. Multi-object tracking with quadruplet convolutional neural networks. In Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5620–5629, 2017

work page 2017
[73]

Song, T.-Y

B. Song, T.-Y . Jeng, E. Staudt, and A. K. Roy-Chowdhury. A stochastic graph evolution framework for robust multi- target tracking. In European Conference on Computer Vi- sion, pages 605–619. Springer, 2010

work page 2010
[74]

PeerNets: Exploiting Peer Wisdom Against Adversarial Attacks

J. Svoboda, J. Masci, F. Monti, M. M. Bronstein, and L. Guibas. Peernets: Exploiting peer wisdom against ad- versarial attacks. arXiv preprint arXiv:1806.00088, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[75]

S. Tang, B. Andres, M. Andriluka, and B. Schiele. Sub- graph decomposition for multi-target tracking. In Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5033–5041, 2015

work page 2015
[76]

S. Tang, B. Andres, M. Andriluka, and B. Schiele. Multi- person tracking by multicut and deep matching. InEuropean Conference on Computer Vision , pages 100–111. Springer, 2016

work page 2016
[77]

S. Tang, M. Andriluka, B. Andres, and B. Schiele. Multiple people tracking by lifted multicut and person reidentiﬁca- tion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3539–3548, 2017

work page 2017
[78]

Graph Attention Networks

P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y . Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[79]

X. Wan, J. Wang, Z. Kong, Q. Zhao, and S. Deng. Multi- object tracking using online metric learning with long short- term memory. In 2018 25th IEEE International Conference on Image Processing (ICIP), pages 788–792. IEEE, 2018

work page 2018
[80]

N. Wang and D.-Y. Yeung. Learning a deep compact image representation for visual tracking. In Advances in neural information processing systems, pages 809–817, 2013.

[81]

Q. Wang, Z. Teng, J. Xing, J. Gao, W. Hu, and S. Maybank. Learning attentions: residual attentional siamese network for high performance online visual tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4854–4863, 2018.

[82]

X. Wang, E. Türetken, F. Fleuret, and P. Fua. Tracking interacting objects using intertwined flows. IEEE transactions on pattern analysis and machine intelligence, 38(EPFL-ARTICLE-210040):2312–2326, 2016.

[83]

B. Yang and R. Nevatia. Multi-target tracking by online learning of non-linear motion patterns and robust appearance models. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 1918–1925. IEEE, 2012.

