pith. machine review for the scientific record.

arxiv: 2605.07852 · v1 · submitted 2026-05-08 · 📊 stat.ME

Recognition: no theorem link

CHASM: Online Changepoint Detection in Temporal and Cross-Variable Dependence

Dean A. Bodenham, Edward A. K. Cohen, Niall M. Adams, Victor K. Khamesi

Pith reviewed 2026-05-11 02:25 UTC · model grok-4.3

classification 📊 stat.ME
keywords changepoint detection · dynamic mode decomposition · multivariate time series · online monitoring · nonparametric method · eigenvalue sequence · vector autoregressive model · dependence structure

The pith

CHASM detects changepoints in multivariate dependence online by tracking the truncated eigenvalues of a recursively estimated dynamic mode decomposition operator.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CHASM, an online nonparametric method for spotting changepoints in multivariate time series that arise from shifts in cross-variable or temporal dependence rather than simple changes in means or variances. It works by continuously updating a dynamic mode decomposition operator and monitoring its leading eigenvalues, which reflect the underlying dynamics. Practical challenges are handled by applying optimal linear assignment to resolve the arbitrary ordering of eigenvalues and by using an augmented scheme to manage complex-valued quantities. The design draws motivation from the theoretical behavior of the estimator under a vector autoregressive model. If successful, the approach supplies a deployable tool that needs few parameters, makes no strong distributional assumptions, and performs competitively on both synthetic tests and real data such as video and text streams.
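The recursion at the heart of this can be sketched in a few lines. The snippet below maintains an exponentially weighted least-squares estimate of a linear one-step operator (forgetting factor ρ), whose leading eigenvalues a CHASM-style monitor would then track. The function name, the ridge term, and the plain recursion are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def recursive_operator(X, rho=0.98):
    """Recursively estimate a linear one-step operator Theta, roughly
    Theta_n = argmin over Theta of sum_k rho^(n-k) ||x_k - Theta x_{k-1}||^2.

    X has shape (T, d); returns one estimate per time step. A sketch:
    the ridge term and plain matrix inverse are illustrative choices.
    """
    d = X.shape[1]
    A = np.zeros((d, d))      # rho-weighted sum of x_k x_{k-1}^T
    B = np.eye(d) * 1e-6      # rho-weighted sum of x_{k-1} x_{k-1}^T (ridge for invertibility)
    estimates = []
    for k in range(1, X.shape[0]):
        A = rho * A + np.outer(X[k], X[k - 1])
        B = rho * B + np.outer(X[k - 1], X[k - 1])
        estimates.append(A @ np.linalg.inv(B))
    return estimates
```

On a simulated VAR(1) stream the final estimate recovers the true transition matrix, and its eigenvalues are exactly the quantities the monitor would watch.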

Core claim

CHASM monitors the truncated eigenvalue sequence of the recursively estimated dynamic mode decomposition operator to detect changepoints in temporal and cross-variable dependence. The approach uses optimal linear assignment to handle permutation invariance of eigendecompositions and a novel augmented monitoring scheme for complex-valued series. Theoretical analysis under the vector autoregressive model supports the design, and empirical tests show competitive or better performance than existing methods on synthetic and real-world datasets.

What carries the argument

The truncated eigenvalue sequence of the recursively estimated dynamic mode decomposition (DMD) operator, with optimal linear assignment to resolve permutation invariance and an augmented scheme to accommodate complex values.
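The permutation issue is concrete: an eigensolver returns eigenvalues in arbitrary order, so consecutive spectra must be aligned before they can be compared. The paper resolves this with optimal linear assignment; the sketch below substitutes a brute-force search over permutations, which agrees with the assignment solution but is only practical for small truncation levels r, and is purely illustrative.

```python
import numpy as np
from itertools import permutations

def match_eigenvalues(prev, curr):
    """Re-order `curr` so that the total absolute distance to `prev`
    is minimised, pairing each new eigenvalue with its predecessor.

    Brute-force stand-in for the optimal linear assignment step; the
    cost is |prev_i - curr_{p(i)}| summed over the pairing p.
    """
    r = len(prev)
    best = min(permutations(range(r)),
               key=lambda p: sum(abs(prev[i] - curr[p[i]]) for i in range(r)))
    return np.asarray([curr[i] for i in best])
```

With matching in place, the i-th entry of the monitored sequence refers to the same underlying mode from one step to the next, which is what makes a distance between consecutive spectra meaningful.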

If this is right

  • The method runs online and unsupervised with only a small number of interpretable parameters.
  • It requires no distributional assumptions beyond finite moments.
  • Performance is competitive or superior to modern alternatives on synthetic data and on challenging real data including video and text.
  • The algorithmic choices are directly motivated by the estimator's properties under the canonical vector autoregressive model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Eigenvalue sequences from linear dynamical approximations may serve as compact statistics for detecting global dependence changes that scalar functionals miss.
  • The same recursive DMD monitoring idea could be tested on streaming data from domains such as sensor networks or biological signals where dependence shifts are diagnostically important.
  • Hybrid detectors that combine the eigenvalue monitor with reconstruction-error methods might reduce false alarms while retaining sensitivity to subtle dependence changes.

Load-bearing premise

Shifts in cross-variable and temporal dependence reliably appear as changes in the truncated eigenvalue sequence of the recursive DMD operator once permutation invariance and complex-valued issues are resolved by the assignment and augmentation steps.
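As a toy illustration of this premise (editorial, not from the paper): adding cross-variable coupling to a VAR(1) transition matrix, while leaving the diagonal autoregressive terms untouched, already moves the eigenvalues a spectral monitor would track.

```python
import numpy as np

# Hypothetical transition matrices: same marginal (diagonal) dynamics,
# different cross-variable coupling.
Theta_pre = np.array([[0.5, 0.0],
                      [0.0, 0.5]])   # independent components
Theta_post = np.array([[0.5, 0.3],
                       [0.3, 0.5]])  # coupled components

eigs_pre = np.sort(np.linalg.eigvals(Theta_pre).real)   # both 0.5
eigs_post = np.sort(np.linalg.eigvals(Theta_post).real) # split to 0.2 and 0.8
```

The coupling splits the repeated eigenvalue 0.5 into 0.2 and 0.8, so a dependence change with no marginal signature still registers in the spectrum.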

What would settle it

A controlled multivariate series containing a clear change in dependence structure but no corresponding shift in the monitored eigenvalue sequence, or a series with no dependence change that nevertheless triggers a detection.
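A hypothetical counter-example of the first kind is easy to construct in the linear setting: flipping the sign of a symmetric coupling reverses the cross-correlation structure yet leaves the spectrum unchanged, so an eigenvalue-only statistic would be blind to it. This is an editorial sketch about transition matrices, not a claim about CHASM's full pipeline.

```python
import numpy as np

# Two VAR(1) transition matrices with opposite cross-variable coupling:
# the dependence structure differs (positive vs negative correlation),
# yet both have eigenvalues {0.2, 0.8}. Hypothetical matrices.
A = np.array([[0.5, 0.3],
              [0.3, 0.5]])
B = np.array([[0.5, -0.3],
              [-0.3, 0.5]])

eigs_A = np.sort(np.linalg.eigvals(A).real)
eigs_B = np.sort(np.linalg.eigvals(B).real)
```

Here only the eigenvectors move, which is why a test like the one above would probe whether the eigenvalue sequence alone carries enough information.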

Figures

Figures reproduced from arXiv: 2605.07852 by Dean A. Bodenham, Edward A. K. Cohen, Niall M. Adams, Victor K. Khamesi.

Figure 1
Figure 1: Spectral velocity under stationary dynamics (a) and at a changepoint (b). Stationarity yields directionally incoherent velocities; a changepoint induces a coherent shift across consecutive eigenvalues.
Figure 2
Figure 2: Effective sample size and detection. Without forgetting, both bias and variance vanish asymptotically, yielding a concentrated estimator that may become insensitive to changepoints. With forgetting, the estimator can retain a non-degenerate variability driven by recent observations, thereby maintaining sensitivity to changes. The forgetting factor ρ controls this trade-off: larger values favour stability …
Figure 3
Figure 3: F1, ARL1, and ARL0 vs. dimension d for the sparse VAR data set. Higher is better for F1 and ARL0, and lower is better for ARL1. Shaded bands show mean ± std over the parameter grid. Methods with near-zero F1 (mSSA, mSSA-MW, BOCPDMS) are omitted from ARL1 for readability; the full figure is in Appendix C.5.
Figure 4
Figure 4: Synthetic VAR2(1) example with a single changepoint (shaded post-change region), along with one of the eigenvalues and CHASM's D²ₙ statistic.
Figure 6
Figure 6: Computational scaling of online changepoint detection methods across sequence length and dimension. Runtime (top) and peak memory usage (bottom) are reported for sequence lengths T ∈ {250, 500, 1000, 2000} and dimensions d ∈ {2, 4, 8, 16, 32}, on a VARd(1) process xt = Θ xt−1 + εt, εt ∼ N(0, Id), with transition entries Θij i.i.d. ∼ N(0, 0.09/d). Lines denote the mean and shaded bands denote ±1 standard deviation …
Figure 7
Figure 7: Sample time series from the CIFAR-100 benchmark dataset. Top: heatmap of the raw 512-dimensional CLIP ViT-B/32 embeddings φ(It) ∈ R^512 across T = 1,000 time steps. Middle: heatmap of the corresponding PCA-projected embeddings φpca(It) ∈ R^32, obtained by standardising the CLIP embeddings and projecting onto the leading 32 principal components estimated from the held-out test split. Bottom: representative …
Figure 8
Figure 8: Sample time series from the 20 Newsgroups benchmark dataset. Top: heatmap of the raw 512-dimensional CLIP ViT-B/32 embeddings φ(Dt) ∈ R^512 across T = 1,000 time steps. Middle: heatmap of the corresponding PCA-projected embeddings φpca(Dt) ∈ R^32, obtained by standardising the CLIP embeddings and projecting onto the leading 32 principal components estimated from the held-out test split. Bottom: representative …
Figure 9
Figure 9: Sample time series from the UCF-Crime benchmark dataset (Explosion category). Top: heatmap of the raw 512-dimensional CLIP ViT-B/32 frame embeddings φ(Ft) ∈ R^512 across T ≈ 3,200 frames, decoded at 30 fps. Middle: heatmap of the corresponding PCA-projected embeddings φpca(Ft) ∈ R^32, obtained by applying the fixed linear projection transferred from the CIFAR-100 experiment. Bottom: a strip of representative …
Figure 10
Figure 10: Six samples from the WikiSection benchmark dataset (English city subset). Each panel displays the heatmap of the 64-dimensional MRL-truncated token embeddings φmrl(xt) ∈ R^64 for a single Wikipedia article, with the article title, total token count, and number of changepoints indicated above, and the sequence of section labels shown below. Vertical white lines mark the ground-truth changepoint locations, …
Figure 11
Figure 11: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic bivariate VAR2(1) data set with Gaussian noise (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 12
Figure 12: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic bivariate VAR2(1) data set with Laplace noise (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 13
Figure 13: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic bivariate VAR2(1) data set with Student's tν noise (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 14
Figure 14: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic bivariate VAR2(1) data set with Huber-ϵ contaminated noise (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 15
Figure 15: F1, ARL1, and ARL0 vs. contamination ϵ for the VAR2(1) data set with Huber-ϵ contaminated noise. Shaded bands show mean ± std over the parameter grid.
Figure 16
Figure 16: F1, ARL1, and ARL0 vs. degrees of freedom ν for the VAR2(1) data set with Student's tν noise. Shaded bands show mean ± std over the parameter grid.
Figure 17
Figure 17: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic VARd(1) data set with sparse dynamics, d ∈ {2, …, 40} (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 18
Figure 18: Accuracy (top) and speed (bottom) metrics (see Appendix C.1) across all parameter configurations on the synthetic VARd(1) data set with full-rank dynamics, d ∈ {2, …, 40} (see Appendix C.3.3). Each dot corresponds to a single parameter set evaluated over the grid …
Figure 19
Figure 19: F1, ARL1, and ARL0 vs. dimension d for the sparse VARd(1) data set. Shaded bands show mean ± std over the parameter grid.
Figure 20
Figure 20: F1, ARL1, and ARL0 vs. dimension d for the full-rank VARd(1) data set. Shaded bands show mean ± std over the parameter grid.
Figure 21
Figure 21: Accuracy metrics (see Appendix C.1) across all parameter configurations on the HASC real-world data set. Each dot corresponds to a single parameter set evaluated over the grid …
Figure 22
Figure 22: Accuracy metrics (see Appendix C.1) across all parameter configurations on the CIFAR-100 real-world data set. Each dot corresponds to a single parameter set evaluated over the grid …
Figure 23
Figure 23: Accuracy metrics (see Appendix C.1) across all parameter configurations on the 20 Newsgroups real-world data set. Each dot corresponds to a single parameter set evaluated over the grid …
Figure 24
Figure 24: Accuracy metrics (see Appendix C.1) across all parameter configurations on the UCF-Crime real-world data set. Each dot corresponds to a single parameter set evaluated over the grid …
Figure 25
Figure 25: Accuracy metrics (see Appendix C.1) across all parameter configurations on the WikiSection real-world data set. Each dot corresponds to a single parameter set evaluated over the grid …
Figure 26
Figure 26: Three representative samples from the CIFAR-100 data set, visualised as heatmaps over the d = 32 feature dimensions across time. Vertical black lines indicate true changepoints and star markers (⋆) indicate CHASM alarms with parameters ρ = 0.98, r = 4, and (α, h) = (0.35, 28).
Figure 27
Figure 27: Three representative samples from the 20 Newsgroups data set, visualised as heatmaps over the d = 32 feature dimensions across time. Vertical black lines indicate true changepoints and star markers (⋆) indicate CHASM alarms with parameters ρ = 0.98, r = 4, and (α, h) = (0.35, 28).
Figure 28
Figure 28: Three representative samples from the UCF-Crime data set, visualised as heatmaps over the d = 32 feature dimensions across time. Vertical black lines indicate true changepoints and star markers (⋆) indicate CHASM alarms with parameters ρ = 1, r = 4, and (α, h) = (0.18, 15). The data exhibits an asymmetry between onset and offset transitions: onset changepoints, corresponding to the beginning of anomalous …
Figure 29
Figure 29: Three representative samples from the WikiSection data set, visualised as heatmaps over the d = 32 feature dimensions across time. Vertical black lines indicate true changepoints and star markers (⋆) indicate CHASM alarms with parameters ρ = 0.95, r = 8, and (α, h) = (0.25, 25). Changes in this data set are subtle, with little visible structure in the heatmaps, yet CHASM successfully detects a subset of changepoints …
Original abstract

Changepoint detection identifies times when the generative process of a time series changes, with applications in healthcare, cybersecurity, and finance. In multivariate settings, changes in cross-variable and temporal dependence are particularly challenging to detect, as they are often less pronounced than shifts in marginal statistics such as the mean or variance. Existing methods detect changes using reconstruction error, which provides only an indirect measure of dynamical change, or rely on scalar functionals that may be too coarse to capture global structure. We introduce CHASM, an online nonparametric method that monitors the truncated eigenvalue sequence of the recursively estimated dynamic mode decomposition operator. Designing such an approach raises two challenges: the permutation invariance of eigendecompositions, resolved via optimal linear assignment, and the lack of online changepoint methods for multivariate complex-valued time series, addressed through a novel augmented monitoring scheme. We study the theoretical properties of the dynamics estimator under the canonical vector autoregressive model, which directly motivates our algorithmic design. The proposed method achieves competitive or superior performance to modern competitors across synthetic and real-world data sets, including challenging settings in video and text data. It is unsupervised, depends on a small number of interpretable parameters, and requires no distributional assumptions beyond finite moments, making it readily deployable across scientific domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces CHASM, an online nonparametric changepoint detection method for multivariate time series that monitors the truncated eigenvalue sequence of a recursively estimated dynamic mode decomposition (DMD) operator. It resolves permutation invariance of eigendecompositions via optimal linear assignment and addresses online monitoring of complex-valued series via a novel augmented scheme. Theoretical properties of the estimator are derived under the canonical vector autoregressive (VAR) model, which motivates the design. The method is claimed to achieve competitive or superior performance to modern competitors on synthetic and real-world datasets (including video and text), while remaining unsupervised, depending on a small number of interpretable parameters, and requiring no distributional assumptions beyond finite moments.

Significance. If the empirical claims hold, the work provides a practical, assumption-light approach to detecting changes in cross-variable and temporal dependence structures rather than marginal moments, which addresses a recognized gap in online multivariate changepoint detection. Credit is due for grounding the algorithmic choices in VAR consistency results and for testing on challenging non-linear data types such as video and text. The unsupervised character and limited parameter count are genuine strengths for deployability.

major comments (2)
  1. [§3] §3 (Theoretical properties): Consistency of the recursively estimated DMD operator and its truncated eigenvalues is established only under the canonical VAR model. The central claim of reliable detection on video and text data (with no assumptions beyond finite moments) therefore rests on the unproven assertion that dependence shifts will produce detectable changes in the truncated eigenvalue sequence even when the linear approximation is poor; no robustness analysis or counter-example study under nonlinear dynamics is provided to support this extrapolation.
  2. [§5] §5 (Experiments, real-data results): The reported competitive performance on video and text datasets is presented without ablation isolating the contribution of the optimal linear assignment and augmented monitoring scheme. It is therefore unclear whether the gains are attributable to the DMD-eigenvalue statistic itself or to the permutation/complex-value corrections; this directly affects the load-bearing claim that the method reliably captures cross-variable and temporal dependence changes.
minor comments (2)
  1. [§2] Notation for the truncation level and detection threshold should be introduced with explicit symbols in the method section rather than only in the algorithm box, to improve readability.
  2. [Abstract] The abstract states 'no distributional assumptions beyond finite moments,' but the text should clarify whether this applies to the online recursion or only to the offline consistency result.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [§3] §3 (Theoretical properties): Consistency of the recursively estimated DMD operator and its truncated eigenvalues is established only under the canonical VAR model. The central claim of reliable detection on video and text data (with no assumptions beyond finite moments) therefore rests on the unproven assertion that dependence shifts will produce detectable changes in the truncated eigenvalue sequence even when the linear approximation is poor; no robustness analysis or counter-example study under nonlinear dynamics is provided to support this extrapolation.

    Authors: We acknowledge that the consistency results for the DMD estimator and truncated eigenvalues are derived specifically under the canonical VAR model, which provides a linear approximation that directly motivates the algorithmic design. The method itself is nonparametric and requires only finite moments, with no further distributional assumptions. While we do not provide a dedicated robustness analysis or counterexamples for strongly nonlinear regimes, the empirical results on inherently nonlinear data such as video and text demonstrate that shifts in the truncated eigenvalue sequence remain detectable in practice. In the revision we will clarify the scope of the theoretical results in §3, explicitly noting that the VAR analysis serves as a motivating case rather than a universal guarantee, and we will add a short discussion of the empirical support for broader applicability. revision: partial

  2. Referee: [§5] §5 (Experiments, real-data results): The reported competitive performance on video and text datasets is presented without ablation isolating the contribution of the optimal linear assignment and augmented monitoring scheme. It is therefore unclear whether the gains are attributable to the DMD-eigenvalue statistic itself or to the permutation/complex-value corrections; this directly affects the load-bearing claim that the method reliably captures cross-variable and temporal dependence changes.

    Authors: We agree that an ablation isolating the optimal linear assignment and augmented monitoring components would strengthen the presentation. The current experiments evaluate the full CHASM pipeline against competitors but do not decompose the contribution of these two elements. We will add an ablation study to the revised §5, reporting performance of CHASM variants with and without each component on the video and text datasets. This will clarify whether the observed gains stem primarily from the DMD-eigenvalue monitoring statistic or from the permutation and complex-value handling. revision: yes

Circularity Check

0 steps flagged

No significant circularity: the derivation is self-contained, motivated by the external VAR model and carried by the paper's own novel components.

full rationale

The paper motivates CHASM from the canonical VAR model (external to the present work) and introduces new algorithmic elements (optimal linear assignment for permutation invariance; augmented monitoring for complex-valued series) to monitor the truncated DMD eigenvalue sequence. Theoretical consistency is studied under VAR assumptions, but this does not reduce the online detection statistic or performance claims to a fit or self-referential equation. Empirical results on synthetic/real data (video/text) are presented as independent validation without renaming fitted quantities as predictions or relying on load-bearing self-citations for the core uniqueness or correctness. No quoted step equates a claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The method rests on a small number of interpretable parameters (truncation level and detection threshold) and standard nonparametric assumptions; no new entities are postulated. Theory is developed under an existing VAR model rather than new axioms.

free parameters (2)
  • eigenvalue truncation level
    Number of leading eigenvalues retained for monitoring; chosen to balance sensitivity and noise, appears as a tunable parameter in the method description.
  • changepoint detection threshold
    Threshold applied to the monitoring statistic; required for online decision-making and listed among the small number of interpretable parameters.
axioms (2)
  • domain assumption Time series possess finite moments
    Explicitly stated as the only distributional assumption required, enabling nonparametric operation without stronger conditions.
  • domain assumption Canonical vector autoregressive model governs the dynamics for theoretical analysis
    Used to derive properties of the DMD estimator that directly motivate the algorithmic design.
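The two free parameters can be seen working together in a toy monitoring loop: the truncation level r picks the leading eigenvalues, and the threshold h gates a distance statistic. The statistic below (Euclidean distance between consecutive truncated spectra, scaled by a running median) and the burn-in period are hypothetical stand-ins, not CHASM's actual D statistic.

```python
import numpy as np

def monitor(eig_seq, r=4, h=5.0, window=50, burn_in=20):
    """Flag times where the truncated eigenvalue sequence jumps.

    r keeps the r leading eigenvalues (largest real part); h thresholds
    a normalised distance between consecutive truncated spectra. The
    scaling by a running median is an illustrative design choice.
    """
    alarms, dists = [], []
    for t in range(1, len(eig_seq)):
        prev = np.sort_complex(np.asarray(eig_seq[t - 1]))[-r:]
        curr = np.sort_complex(np.asarray(eig_seq[t]))[-r:]
        d = float(np.linalg.norm(curr - prev))
        dists.append(d)
        if t < burn_in:
            continue  # let the running scale stabilise first
        scale = np.median(dists[-window:]) + 1e-12
        if d / scale > h:
            alarms.append(t)
    return alarms
```

On a synthetic eigenvalue stream whose spectrum shifts at a single time, the loop flags exactly that time, which is the interpretable role the ledger assigns to r and h.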

pith-pipeline@v0.9.0 · 5537 in / 1603 out tokens · 44784 ms · 2026-05-11T02:25:19.064349+00:00 · methodology


Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages · 1 internal anchor

  1. [1] R. P. Adams and D. J. C. MacKay. Bayesian Online Changepoint Detection. arXiv preprint arXiv:0710.3742, 2007.
  2. [2] A. Alanqary, A. Alomar, and D. Shah. Change Point Detection via Multivariate Singular Spectrum Analysis. In Advances in Neural Information Processing Systems, volume 34, pages 23218–23230, 2021.
  3. [3] S. Arnold, R. Schneider, P. Cudré-Mauroux, F. A. Gers, and A. Löser. SECTOR: A Neural Model for Coherent Topic Segmentation and Classification. Transactions of the Association for Computational Linguistics, 7:169–184, 2019.
  4. [4] A. Aue, S. Hörmann, L. Horváth, and M. Reimherr. Break detection in the covariance structure of multivariate time series models. The Annals of Statistics, 37(6B):4046–4087, 2009.
  5. [5] P. Bai, A. Safikhani, and G. Michailidis. Multiple Change Points Detection in Low Rank and Sparse High Dimensional Vector Autoregressive Models. IEEE Transactions on Signal Processing, 68:3074–3089, 2020.
  6. [6] P. Bai, A. Safikhani, and G. Michailidis. Multiple Change Point Detection in Reduced Rank High Dimensional Vector Autoregressive Models. Journal of the American Statistical Association, 118(544):2776–2792, 2023.
  7. [7] D. Bartz and K.-R. Müller. Covariance shrinkage for autocorrelated data. In Advances in Neural Information Processing Systems, volume 27, 2014.
  8. [8] S. Basu and G. Michailidis. Regularized Estimation in Sparse High-Dimensional Time Series Models. The Annals of Statistics, pages 1535–1567, 2015.
  9. [9] S. Basu, X. Li, and G. Michailidis. Low Rank and Structured Modeling of High-Dimensional Vector Autoregressions. IEEE Transactions on Signal Processing, 67(5):1207–1222, 2019.
  10. [10] T. Berens, G. N. F. Weiß, and D. Wied. Testing for structural breaks in correlations: Does it improve Value-at-Risk forecasting? Journal of Empirical Finance, 32:135–152, 2015.
  11. [11] D. Bertsimas. Introduction to Linear Optimization. Athena Scientific, 1997.
  12. [12] D. A. Bodenham and N. M. Adams. Continuous monitoring for changepoints in data streams using adaptive estimation. Statistics and Computing, 27(5):1257–1270, 2017.
  13. [13] R. A. Borsoi, C. Richard, A. Ferrari, J. Chen, and J. C. M. Bermudez. Online Graph-Based Change Point Detection in Multiband Image Sequences. In 28th European Signal Processing Conference, pages 850–854. IEEE, 2021.
  14. [14] S. L. Brunton and J. N. Kutz. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press, 2022.
  15. [15] S. L. Brunton, M. Budišić, E. Kaiser, and J. N. Kutz. Modern Koopman Theory for Dynamical Systems. SIAM Review, 64(2):229–340, 2022.
  16. [16] T. Cai, W. Liu, and Y. Xia. Two-Sample Covariance Matrix Testing and Support Recovery in High-Dimensional and Sparse Settings. Journal of the American Statistical Association, 108(501):265–277, 2013.
  17. [17] H. P. Chan. Optimal sequential detection in multi-stream data. The Annals of Statistics, 45(6):2736–2763, 2017.
  18. [18] Y. Chen, T. Wang, and R. J. Samworth. High-dimensional, multiscale online changepoint detection. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(1):234–266, 2022.
  19. [19] F. Enikeeva, O. Klopp, and M. Rousselot. Change-point detection in low-rank VAR processes. Bernoulli, 31(2):1058–1083, 2025.
  20. [20] P. Fearnhead and P. Fryzlewicz. Detecting A Single Change-Point. arXiv preprint arXiv:2210.07066, 2022.
  21. [21] A. Foi, M. Trimeche, V. Katkovnik, and K. Egiazarian. Practical Poissonian-Gaussian Noise Modeling and Fitting for Single-Image Raw-Data. IEEE Transactions on Image Processing, 17(10):1737–1754, 2008.
  22. [22] Y. Fridman, M. Rusanovsky, and G. Oren. ChangeChip: A Reference-Based Unsupervised Change Detection for PCB Defect Detection. In IEEE Physical Assurance and Inspection of Electronics, pages 1–8, 2021.
  23. [23] C. M. Garcia, R. Abilio, A. L. Koerich, A. S. Britto Jr, and J. P. Barddal. Concept Drift Adaptation in Text Stream Mining Settings: A Systematic Review. ACM Transactions on Intelligent Systems and Technology, 16(2):1–67, 2025.
  24. [24] M. Gavish and D. L. Donoho. The Optimal Hard Threshold for Singular Values is 4/√3. IEEE Transactions on Information Theory, 60(8):5040–5053, 2014.
  25. [25] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 2013.
  26. [26] C. Gorrostieta, H. Ombao, P. Bédard, and J. N. Sanes. Investigating brain connectivity using mixed effects vector autoregressive models. NeuroImage, 59(4):3347–3355, 2012.
  27. [27] J. D. Hamilton. Time Series Analysis. Princeton University Press, 1994.
  28. [28] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 2012.
  29. [29] M. Jia and J. Diaz-Rodriguez. Unsupervised Text Segmentation via Kernel Change-Point Detection on Sentence Embeddings. arXiv preprint arXiv:2601.18788, 2026.
  30. [30] Z. Jiang, J. Che, and L. Wang. Ultra-short-term wind speed forecasting based on EMD-VAR model and spatial correlation. Energy Conversion and Management, 250:114919, 2021.
  31. [31] R. Jonker and A. Volgenant. A Shortest Augmenting Path Algorithm for Dense and Sparse Linear Assignment Problems. Computing, 38(4):325–340, 1987.
  32. [32] N. Kawaguchi, N. Ogawa, Y. Iwasaki, K. Kaji, T. Terada, K. Murao, S. Inoue, Y. Kawahara, Y. Sumi, and N. Nishio. HASC Challenge: Gathering Large Scale Human Activity Corpus for the Real-World Activity Understandings. In Proceedings of the 2nd Augmented Human International Conference, pages 1–5, 2011.
  33. [33] Y. Kawahara. Dynamic Mode Decomposition with Reproducing Kernels for Koopman Spectral Analysis. In Advances in Neural Information Processing Systems, volume 29, 2016.
  34. [34] V. K. Khamesi, N. M. Adams, D. A. Bodenham, and E. A. K. Cohen. Online Changepoint Detection via Dynamic Mode Decomposition. arXiv preprint arXiv:2405.15576, 2024.
  35. [35] R. Killick, P. Fearnhead, and I. A. Eckley. Optimal Detection of Changepoints With a Linear Computational Cost. Journal of the American Statistical Association, 107(500):1590–1598, 2012.
  36. [36] J. Knoblauch and T. Damoulas. Spatio-temporal Bayesian On-line Changepoint Detection with Model Selection. In International Conference on Machine Learning, pages 2718–2727, 2018.
  37. [37] A. Krizhevsky and G. Hinton. Learning Multiple Layers of Features from Tiny Images. Technical report, University of Toronto, 2009.
  38. [38] A. Kusupati, G. Bhatt, A. Rege, M. Wallingford, A. Sinha, V. Ramanujan, W. Howard-Snyder, K. Chen, S. Kakade, P. Jain, and A. Farhadi. Matryoshka Representation Learning. In Advances in Neural Information Processing Systems, volume 35, pages 30233–30249, 2022.
  39. [39] K. Lang. NewsWeeder: Learning to Filter Netnews. In Machine Learning Proceedings, pages 331–339. Elsevier, 1995.
  40. [40] S. Langel, O. G. Crespillo, and M. Joerger. Frequency-Domain Modeling of Correlated Gaussian Noise in Kalman Filtering. IEEE Transactions on Aerospace and Electronic Systems, 60(6):8945–8959, 2024.
  41. [41] R. B. Lehoucq, D. C. Sorensen, and C. Yang. ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods. SIAM, 1998.
  42. [42] J. Li and S. X. Chen. Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940, 2012.
  43. [43] M. Li and Y. Yu. Adversarially Robust Change Point Detection. In Advances in Neural Information Processing Systems, volume 34, pages 22955–22967, 2021.
  44. [44] S. Liu, D. Marinelli, L. Bruzzone, and F. Bovolo. A Review of Change Detection in Multitemporal Hyperspectral Images: Current techniques, applications, and challenges. IEEE Geoscience and Remote Sensing Magazine, 7(2):140–158, 2019.

  45. [45]

    C. A. Lowry, W. H. Woodall, C. W. Champ, and S. E. Rigdon. A Multivariate Exponentially Weighted Moving Average Control Chart.Technometrics, 34(1):46–53, 1992

  46. [46]

    F. M. Megahed, L. J. Wells, J. A. Camelio, and W. H. Woodall. A Spatiotemporal Method for the Monitoring of Image Data.Quality and Reliability Engineering International, 28(8):967–980, 2012

  47. [47]

    Y. Mei. Efficient scalable schemes for monitoring a large number of data streams.Biometrika, 97(2): 419–433, 2010

  48. [48]

    Michailidis and F

    G. Michailidis and F. d’Alché Buc. Autoregressive models for gene regulatory network inference: Sparsity, stability and causality issues.Mathematical Biosciences, 246(2):326–334, 2013

  49. [49]

    C. M. Michel and M. M. Murray. Towards the utilization of EEG as a brain imaging tool.Neuroimage, 61(2):371–385, 2012

  50. [50]

    J. Munkres. Algorithms for the Assignment and Transportation Problems.Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38, 1957

  51. [51]

    E. S. Page. Continuous Inspection Schemes.Biometrika, 41(1/2):100–115, 1954

  52. [52]

    Picklands

    J. Picklands. Statistical inference using extreme order statistics.The Annals of Statistics, 3(1): 119–131, 1975

  53. [53]

    Radford, J

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever. Learning Transferable Visual Models From Natural Language Supervision. InInternational Conference on Machine Learning, pages 8748–8763, 2021

  54. [54]

    S. E. Rigdon. An integral equation for the in-control average run length of a multivariate exponentially weighted moving average control chart.Journal of Statistical Computation and Simulation, 52(4): 351–365, 1995

  55. [55]

    G. C. Runger and S. S. Prabhu. A Markov Chain Model for the Multivariate Exponentially Weighted Moving Averages Control Chart.Journal of the American Statistical Association, 91(436):1701–1706, 1996

  56. [56]

    Ryan and R

    S. Ryan and R. Killick. Detecting Changes in Covariance via Random Matrix Theory.Technometrics, 65(4):480–491, 2023

  57. [57]

    P. J. Schmid. Dynamic mode decomposition of numerical and experimental data.Journal of Fluid Mechanics, 656:5–28, 2010

  58. [58]

    P. J. Schmid. Dynamic Mode Decomposition and Its Variants.Annual Review of Fluid Mechanics, 54(1):225–254, 2022

  59. [59]

    P. J. Schreier and L. L. Scharf.Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals. Cambridge University Press, 2010. 14

  60. [60]

    X. Shi, C. Beaulieu, R. Killick, and R. Lund. Changepoint Detection: An Analysis of the Central England Temperature Series.Journal of Climate, 35(19):6329–6342, 2022

  61. [61]

    J. H. Stock and M. W. Watson. Vector Autoregressions.Journal of Economic Perspectives, 15(4): 101–115, 2001

  62. [62]

    Stoehr, J

    C. Stoehr, J. A. D. Aston, and C. Kirch. Detecting changes in the covariance structure of functional time series with application to fMRI data.Econometrics and Statistics, 18:44–62, 2021

  63. [63]

    Sultani, C

    W. Sultani, C. Chen, and M. Shah. Real-World Anomaly Detection in Surveillance Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6479–6488, 2018

  64. [64]

    Takeishi, Y

    N. Takeishi, Y. Kawahara, and T. Yairi. Learning koopman Invariant Subspaces for Dynamic Mode Decomposition. InAdvances in Neural Information Processing Systems, volume 30, 2017

  65. [65]

    Tian and A

    Y. Tian and A. Safikhani. Sequential Change Point Detection in High-Dimensional Vector Auto- regressive Models.Statistica Sinica, 2024

  66. [66]

    G. J. J. Van den Burg and C. K. I. Williams. An Evaluation of Change Point Detection Algorithms. arXiv preprint arXiv:2003.06222, 2020

  67. [67]

    Wang and C

    Y. Wang and C. Goutte. Real-time Change Point Detection using On-line Topic Models. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2505–2515, 2018

  68. [68]

    Zhang, C

    H. Zhang, C. W. Rowley, E. A. Deem, and L. N. Cattafesta. Online Dynamic Mode Decomposition for Time-Varying Systems.SIAM Journal on Applied Dynamical Systems, 18(3):1586–1609, 2019

  69. [69]

    Zhang, M

    Y. Zhang, M. J. Wainwright, and J. C. Duchi. Communication-Efficient Algorithms for Statistical Optimization.Journal of Machine Learning Research, 14(104):3321–3363, 2013

  70. [70]

    Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

    Y. Zhang, M. Li, D. Long, X. Zhang, H. Lin, B. Yang, P. Xie, A. Yang, D. Liu, J. Lin, F. Huang, and J. Zhou. Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models.arXiv preprint arXiv:2506.05176, 2025

[71] X. Zheng and S. Mak. BLAST: Bayesian online change-point detection with structured image data. arXiv preprint arXiv:2504.09783, 2025.

A Extended background

A.1 Vector autoregressive processes

Definition 1. A d-dimensional vector autoregressive process of order p, denoted VAR_d(p), d, p ∈ ℕ, is a discrete-time stochastic process {y_t}, where y_t ∈ ℝ^d, following...
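The definition is truncated in this extraction. The standard VAR_d(p) recursion it presumably continues with (whether an intercept term ν is included is an assumption here) reads:

```latex
% Presumed continuation of Definition 1: each observation is a linear
% combination of its p most recent predecessors plus i.i.d. noise.
y_t = \nu + \sum_{i=1}^{p} A_i \, y_{t-i} + \varepsilon_t,
\qquad A_i \in \mathbb{R}^{d \times d}, \qquad
\varepsilon_t \overset{\text{i.i.d.}}{\sim} (0, \Sigma),
```

with the coefficient matrices A_1, …, A_p governing the temporal and cross-variable dependence that CHASM monitors.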

  1. Compute the correlation matrix R as R_ij = Σ_ij / √(Σ_ii Σ_jj).

  2. Draw z ∼ N(0, R), a correlated standard normal vector.

  3. Transform to uniform marginals as u = Φ(z) element-wise.

  4. Apply the inverse Laplace CDF, x_i = F⁻¹_Lap(u_i | 0, s_i), with s_i = √Σ_ii / √2.

The result ε_t = x has marginal Laplace distributions with variance Σ_ii and cross-correlation structure induced by R. Note that the marginal variance of a Laplace(0, s) distribution is 2s² = Σ_ii, so the covariance matrix of ε_t is Σ by construction.

Bivariate VAR with Student’s t noise. Purpose. E...
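The four Gaussian-copula steps for generating Laplace noise can be sketched in NumPy. This is a minimal illustration, not the paper's implementation; the function name and the vectorised n-sample interface are choices of this sketch:

```python
import math
import numpy as np

def correlated_laplace_noise(Sigma, n, rng):
    """Draw n noise vectors with Laplace marginals of variance Sigma_ii
    and cross-correlation induced by the Gaussian copula of Sigma."""
    d = Sigma.shape[0]
    s2 = np.diag(Sigma)                                 # marginal variances
    # Step 1: correlation matrix R_ij = Sigma_ij / sqrt(Sigma_ii Sigma_jj)
    R = Sigma / np.sqrt(np.outer(s2, s2))
    # Step 2: correlated standard normal vectors z ~ N(0, R)
    Z = rng.multivariate_normal(np.zeros(d), R, size=n)
    # Step 3: uniform marginals u = Phi(z), element-wise
    U = 0.5 * (1.0 + np.vectorize(math.erf)(Z / math.sqrt(2.0)))
    # Step 4: inverse Laplace CDF with scale s_i = sqrt(Sigma_ii)/sqrt(2),
    # so each marginal variance is 2*s_i^2 = Sigma_ii
    scale = np.sqrt(s2) / math.sqrt(2.0)
    return -scale * np.sign(U - 0.5) * np.log1p(-2.0 * np.abs(U - 0.5))

Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
eps = correlated_laplace_noise(Sigma, n=5, rng=np.random.default_rng(0))
```

The closed-form inverse CDF in step 4, F⁻¹(u) = −s · sgn(u − 1/2) · ln(1 − 2|u − 1/2|), avoids any dependency beyond NumPy.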

The first and last transitions are discarded (start/end boundary artefacts). The other transitions come in pairs (labelled → unlabelled and unlabelled → labelled); the first element of each pair is retained, corresponding to the onset of each labelled activity region.

Output. Each recording yields one time series of dimension d = 3 and length T_i, where T_i varies by recording, along with a list of changepoint indices. We compute accurac...
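One reading of the transition-pruning rule above, as a sketch: the helper name and the per-sample boolean-mask input are assumptions of this illustration, not the paper's interface:

```python
import numpy as np

def activity_onsets(labelled):
    """Changepoint indices from a labelled/unlabelled mask: drop the
    first and last transitions (boundary artefacts), then keep only the
    interior transitions that enter a labelled region (onsets)."""
    lab = np.asarray(labelled, dtype=int)
    flips = np.flatnonzero(np.diff(lab) != 0) + 1   # all state changes
    inner = flips[1:-1]                              # discard first and last
    return inner[lab[inner] == 1]                    # unlabelled -> labelled

# three labelled activity regions separated by unlabelled gaps
mask = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0]
cps = activity_onsets(mask)                          # -> array([6, 10])
```

Here the onset of the first activity (index 2) is dropped as a boundary artefact, and the onsets of the remaining labelled regions survive.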

  1. Draw K + 1 = 5 distinct coarse classes uniformly without replacement.

  2. For each chosen superclass, randomly select one of its 5 fine classes.

  3. Compute nominal changepoint positions τ_k^nom = kT/(K + 1) for k = 1, …, K, and add i.i.d. jitter j_k ∼ Uniform{−30, …, +30}: τ_k = τ_k^nom + j_k.
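Step 3 above can be sketched as follows; using floor division for the nominal grid kT/(K + 1), and the function name itself, are assumptions of this sketch:

```python
import numpy as np

def jittered_changepoints(T, K, rng, jitter=30):
    """Nominal positions tau_k = k*T/(K+1), k = 1..K, plus i.i.d.
    integer jitter drawn from Uniform{-jitter, ..., +jitter}."""
    nominal = np.array([k * T // (K + 1) for k in range(1, K + 1)])
    j = rng.integers(-jitter, jitter + 1, size=K)   # inclusive bounds
    return nominal + j

rng = np.random.default_rng(0)
taus = jittered_changepoints(T=2000, K=4, rng=rng)
```

With T = 2000 and K = 4, the nominal positions are 400, 800, 1200, 1600, each perturbed by at most 30 samples, so the jittered changepoints remain well separated.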
