Cluster-Specific Localized Drift Detection for Efficient Batch Model Adaptation under Controlled Distribution Shift

Almas Baimagambetov; Ignacio Cabrera Martin; Marcello Trovati; Nikolaos Polatidis

arxiv: 2606.22026 · v1 · pith:V3DXNFDUnew · submitted 2026-06-20 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-26 11:47 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords distribution shift simulation · drift detection · model adaptation · tabular datasets · clustering · ADWIN · batch retraining · concept drift

The pith

A simulation framework converts static tabular datasets into controlled evolving data streams by perturbing clustered feature partitions to enable evaluation of drift adaptation strategies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Many machine learning applications encounter shifting data distributions over time, yet common tabular benchmarks have no built-in temporal evolution, making it hard to test adaptation techniques reproducibly. The paper introduces a framework that creates such streams from static data by first clustering to define feature space partitions and then applying structured perturbations to simulate distribution shifts. This setup is used to evaluate six adaptation strategies ranging from static models to localized drift detection methods across classification and regression tasks with various model families. If the framework holds, it offers a standardized way to compare how well different adaptation approaches handle controlled shifts without requiring inherently temporal datasets.

Core claim

The paper establishes a cluster-induced distribution shift simulation framework that transforms static tabular datasets into controlled evolving data streams through structured perturbations across feature space partitions, which then supports the systematic evaluation of six adaptation strategies including static learning, sliding-window retraining, global and cluster-local ADWIN retraining, random subspace drift detection, and feature-partitioned drift detection on five benchmark datasets.
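To make the first two strategies in that list concrete, here is a minimal sketch contrasting static learning with sliding-window retraining on a batch stream. The mean predictor, the function names, and the `window` parameter are illustrative assumptions, not the paper's configuration; any batch learner could stand in for `fit_mean`.

```python
from collections import deque


def fit_mean(targets):
    """Trivial stand-in 'model': predict the mean of the training targets."""
    return sum(targets) / len(targets)


def mean_abs_error(model, batch):
    """Mean absolute error of a constant prediction on one batch."""
    return sum(abs(y - model) for y in batch) / len(batch)


def run_static(batches):
    """Train once on the first batch, then only evaluate on later batches."""
    model = fit_mean(batches[0])
    return [mean_abs_error(model, b) for b in batches[1:]]


def run_sliding_window(batches, window=2):
    """Retrain before each new batch on the `window` most recent batches."""
    recent = deque([batches[0]], maxlen=window)
    errors = []
    for b in batches[1:]:
        model = fit_mean([y for past in recent for y in past])
        errors.append(mean_abs_error(model, b))
        recent.append(b)  # labels are assumed to arrive after evaluation
    return errors
```

On a stream whose targets jump from one level to another, the static model keeps the full error after the jump, while the sliding-window model recovers once the window contains only post-shift batches. The price is one retraining per batch, whether or not anything changed.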

What carries the argument

The cluster-induced distribution shift simulation framework that identifies feature space partitions via clustering and applies structured perturbations to simulate controlled distribution shifts in the generated data streams.
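One way such a generator could look, as a sketch under stated assumptions: k-means defines the partitions, and one chosen cluster receives an additive offset that ramps up linearly after a change point. The summary above does not specify the clustering algorithm or the form of the perturbation, so `kmeans`, `simulate_stream`, `drift_start`, and `shift` are all hypothetical names and choices.

```python
import random


def nearest(p, centroids):
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centroids) for p in points], centroids


def simulate_stream(points, labels, drift_cluster, n_batches, batch_size,
                    drift_start, shift, seed=0):
    """Yield (t, batch); each batch is a list of (x, cluster_id) pairs.

    Rows in `drift_cluster` get an additive offset that ramps linearly from
    0 to `shift` between batch `drift_start` and the final batch. Rows in
    other clusters are replayed unchanged. Requires drift_start < n_batches.
    """
    rng = random.Random(seed)
    indexed = list(zip(points, labels))
    for t in range(n_batches):
        ramp = 0.0 if t < drift_start else (t - drift_start + 1) / (n_batches - drift_start)
        batch = []
        for x, c in rng.choices(indexed, k=batch_size):
            if c == drift_cluster:
                x = tuple(v + ramp * shift for v in x)
            batch.append((x, c))
        yield t, batch
```

Because both functions take an explicit `seed`, two runs with the same arguments replay the same stream, which is the property a controlled benchmark needs.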

If this is right

Reproducible comparisons of adaptation strategies become possible on standard tabular benchmarks with known shift characteristics.
Cluster-local ADWIN retraining and feature-partitioned drift detection can be assessed for efficiency in batch model adaptation.
Performance of linear models, nearest neighbors, tree ensembles, boosting, and online learners can be tracked under the same simulated shifts.
The framework distinguishes between global and localized detection approaches in terms of their response to partition-specific changes.
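The global versus cluster-local distinction can be sketched as follows. `SimpleADWIN` is a deliberately naive adaptive window: it tests every split point against the ADWIN-style Hoeffding cut threshold and omits the bucket compression of the published algorithm. `ClusterLocalDetector` keeps one such detector per cluster id. Both class names and the `min_side` guard are assumptions made for illustration, not the paper's implementation.

```python
import math
from collections import defaultdict


class SimpleADWIN:
    """Naive adaptive window: checks every split, no bucket compression."""

    def __init__(self, delta=0.002, min_side=5):
        self.delta = delta
        self.min_side = min_side  # smallest sub-window size that is tested
        self.window = []

    def update(self, value):
        """Add a value in [0, 1]; return True if the window was cut."""
        self.window.append(value)
        drift = False
        shrinking = True
        while shrinking:
            shrinking = False
            n = len(self.window)
            total = sum(self.window)
            head = 0.0
            for i in range(1, n):
                head += self.window[i - 1]
                n0, n1 = i, n - i
                if n0 < self.min_side or n1 < self.min_side:
                    continue
                m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-mean term
                eps = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
                if abs(head / n0 - (total - head) / n1) > eps:
                    self.window = self.window[i:]  # drop the older part
                    drift = shrinking = True
                    break
        return drift


class ClusterLocalDetector:
    """One independent detector per cluster id."""

    def __init__(self, **kwargs):
        self.detectors = defaultdict(lambda: SimpleADWIN(**kwargs))

    def update(self, cluster_id, error):
        """Route one per-sample error to its cluster's detector."""
        return self.detectors[cluster_id].update(error)
```

A global variant would be a single `SimpleADWIN` fed every error regardless of cluster. With per-cluster detectors, a change confined to one partition resets only that partition's window, so only the affected region would need to trigger retraining.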

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Such controlled simulations could help identify which adaptation methods scale best to different shift types before real deployment.
Extending the perturbation approach to other dataset types might broaden its utility in streaming machine learning research.
The emphasis on cluster-specific localization suggests potential efficiency gains in detecting and responding to localized drifts.

Load-bearing premise

The structured perturbations across feature space partitions produce distribution shifts that are controlled enough and representative enough to allow meaningful comparisons between adaptation strategies.

What would settle it

If experiments on real-world streaming datasets with natural temporal structure show that the relative performance of the six strategies reverses compared to the simulated streams, the framework's validity for guiding adaptation choices would be undermined.
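A reversal of this kind can be quantified with a rank correlation. The sketch below computes Kendall's tau-a between two sets of per-strategy scores: a value near 1 means the simulated and real streams order the strategies alike, while a value near -1 indicates a reversal. The function name and the score dictionaries are hypothetical, and the numbers are placeholders rather than results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts that share the same keys."""
    keys = sorted(scores_a)
    concordant = discordant = 0
    for x, y in combinations(keys, 2):
        s = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(keys) * (len(keys) - 1) // 2
    return (concordant - discordant) / n_pairs


# Placeholder scores (higher is better); NOT results from the paper.
simulated = {"S1": 0.70, "S2": 0.74, "S3": 0.78, "S4": 0.80}
real = {"S1": 0.72, "S2": 0.71, "S3": 0.79, "S4": 0.83}
tau = kendall_tau(simulated, real)
```

Tau-a ignores ties in the denominator, which is adequate for a handful of strategies with distinct scores. A tie-corrected variant would be the safer choice if several strategies score identically.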

Figures

Figures reproduced from arXiv: 2606.22026 by Almas Baimagambetov, Ignacio Cabrera Martin, Marcello Trovati, Nikolaos Polatidis.

**Figure 1.** Performance and Adaptation Effort Analysis for All Classification Datasets across Strategies S1-S6. (Preprint submitted to Elsevier, page 28 of 65.)

**Figure 2.** Performance and Adaptation Effort Analysis for the Adult Dataset across Strategies S1-S6.

**Figure 3.** Performance and Adaptation Effort Analysis for the Wine Quality Dataset across Strategies S1-S6.

**Figure 4.** Performance and Adaptation Effort Analysis for the Breast Cancer Dataset across Strategies S1-S6.

**Figure 5.** Performance and Adaptation Analysis for the Airfoil Self-Noise Dataset across Strategies S1-S6.

**Figure 6.** Performance and Adaptation Effort Analysis for the Superconductivity dataset across Strategies S1-S6.

**Figure 7.** Localized ADWIN adaptive window dynamics for stable and high-drift clusters under different robustness configurations. Sudden window collapses correspond to detected drift resets, while steadily increasing windows indicate stable feature-space regions.

**Figure 8.** Cumulative retraining effort per cluster (Adult dataset). Bright regions denote clusters dominating adaptation activity; dark regions indicate stable feature-space partitions requiring minimal intervention.

**Figure 9.** Cumulative retraining effort per cluster (Superconductivity dataset). The localized concentration confirms that S4 selectively allocates resources to non-stationary regions, even in high-dimensional regression tasks.

**Figure 10.** Mean update training time across benchmarking strategies.

**Figure 11.** Heatmap of mean update training times for classification tasks across robustness settings and benchmarking strategies.

**Figure 12.** Heatmap of mean update training times for regression tasks across robustness settings and benchmarking strategies.

Original abstract

Machine learning systems deployed in dynamic environments frequently operate under nonstationary data distributions, where controlled distribution shift can progressively degrade predictive performance. However, many widely used tabular benchmark datasets lack explicit temporal structure, limiting reproducible evaluation of drift adaptation methods. This work proposes a cluster-induced distribution shift simulation framework that transforms static tabular datasets into controlled evolving data streams through structured perturbations across feature-space partitions. Using this framework, six adaptation strategies are systematically evaluated: static learning, sliding-window retraining, global ADWIN retraining, cluster-local ADWIN retraining, random subspace drift detection, and feature-partitioned drift detection. Experiments are conducted on five benchmark datasets covering both classification and regression tasks using diverse predictive model families, including linear models, k-Nearest Neighbours, tree ensembles, boosting methods, and adaptive online learners.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper introduces a cluster-based simulation framework to turn static tabular datasets into controlled drifting streams for benchmarking adaptation strategies.

The main contribution here is a framework that takes static tabular data and creates evolving streams with controlled shifts by perturbing features within clusters. This setup then supports a direct comparison of six adaptation strategies on five datasets across classification and regression tasks.

The new element is the cluster-induced simulation approach itself, along with the systematic evaluation that includes cluster-local ADWIN retraining alongside global ADWIN, sliding windows, random subspace detection, and feature-partitioned detection. It covers a range of models from linear and KNN to tree ensembles, boosting, and online learners.

This addresses a practical gap where many tabular benchmarks have no built-in temporal structure, so the simulation method could help make drift adaptation tests more reproducible. The structured use of clusters for perturbations gives the shifts a bit more organization than purely random changes.

The soft spots are mostly around missing details in the abstract. No quantitative results, error bars, or specifics on perturbation strength appear here, so claims about efficiency gains for batch adaptation cannot be checked yet. The framework's value rests on whether the generated shifts feel representative enough in the full experiments; if they turn out too artificial, the comparisons lose some force. That said, the construction itself shows no internal contradictions.

This is aimed at people working on concept drift and online learning for tabular data who need better evaluation tools. A reader focused on benchmarking practices would find the most direct use.

The work shows clear thinking on the evaluation problem. It deserves peer review to examine the implementation and results.

Referee Report

0 major / 2 minor

Summary. The paper proposes a cluster-induced distribution shift simulation framework that transforms static tabular datasets into controlled evolving data streams through structured perturbations across feature-space partitions. Using this framework, it systematically evaluates six adaptation strategies—static learning, sliding-window retraining, global ADWIN retraining, cluster-local ADWIN retraining, random subspace drift detection, and feature-partitioned drift detection—on five benchmark datasets covering classification and regression tasks with diverse model families including linear models, kNN, tree ensembles, boosting methods, and adaptive online learners.

Significance. If the proposed simulation framework generates sufficiently controlled and representative distribution shifts, the work provides a valuable contribution by enabling reproducible evaluation of drift adaptation methods on otherwise static tabular benchmarks. The systematic comparison of localized versus global detection approaches could yield practical insights for efficient batch model adaptation under nonstationary conditions. The framework itself is a strength as an independent construction for generating evolving streams without relying on fitted parameters or circular assumptions.

minor comments (2)

[Abstract] The description of the framework and evaluation plan is clear at a high level, but the absence of any quantitative results, error bars, or verification details in the provided abstract makes it impossible to assess the central claims of efficiency and effectiveness; the full methods and results sections are needed to substantiate the comparisons.
The weakest assumption noted—that the structured perturbations produce sufficiently controlled and representative shifts—is presented as the intended output of the framework rather than a hidden premise, which is appropriate, but the manuscript should explicitly state how reproducibility of the perturbation process is ensured (e.g., via fixed seeds or public code).
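As one possible way to meet this request, a perturbation schedule can be drawn from a locally seeded generator and logged together with a digest. The sketch below is an assumption about how that might be done, not the authors' procedure; `perturbation_schedule`, `fingerprint`, and the schedule fields (`onset`, `shift`) are hypothetical.

```python
import hashlib
import json
import random


def perturbation_schedule(n_clusters, n_batches, max_shift, seed):
    """Draw, per cluster, a drift onset batch and a shift magnitude."""
    rng = random.Random(seed)  # local RNG: independent of global random state
    return [
        {
            "cluster": c,
            "onset": rng.randrange(1, n_batches),
            "shift": round(rng.uniform(0.0, max_shift), 6),
        }
        for c in range(n_clusters)
    ]


def fingerprint(schedule):
    """SHA-256 digest of a schedule, suitable for logging next to results."""
    payload = json.dumps(schedule, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Reporting the seed and the digest alongside each experiment lets a reader regenerate the schedule and confirm that it matches before comparing numbers.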

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the cluster-induced distribution shift framework and its potential value for reproducible evaluation of adaptation strategies. The recommendation for minor revision is noted. No specific major comments were provided in the report, so we have no point-by-point responses to address at this time. We will incorporate any minor editorial or presentation improvements in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity

The paper proposes an explicit new framework for generating controlled distribution shifts via cluster-based perturbations on static tabular datasets, then applies it to evaluate six listed adaptation strategies on five benchmarks. No equations, fitted parameters, or derivations are described that reduce to self-definition or prior self-citations. The central construction (cluster-induced perturbations creating evolving streams) is presented as an independent methodological contribution rather than a result derived from its own outputs or unverified self-citations. The evaluation scope is external to the framework definition itself, satisfying the criteria for a self-contained proposal with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim depends on the simulation framework being a valid proxy for controlled drift; no free parameters or invented entities beyond the framework itself are mentioned.

axioms (1)

domain assumption: Structured perturbations across feature space partitions can create controlled and reproducible distribution shifts suitable for evaluating adaptation methods.
Invoked to justify the framework's utility for systematic evaluation.

invented entities (1)

Cluster-induced distribution shift simulation framework (no independent evidence)
purpose: To transform static tabular datasets into controlled evolving data streams
Newly proposed method for addressing lack of temporal structure in benchmarks.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 1 internal anchor

[1] [1]

Springer, pp

A framework for clustering evolving data streams, in: Data Streams: Models and Algorithms. Springer, pp. 81–92. doi:10.1016/B978-012722442-8/50016-1. Agrahari, S., Singh, A.K.,

work page doi:10.1016/b978-012722442-8/50016-1

[2] [2]

Journal of King Saud University – Computer and Information Sciences 34, 9523–9540

Concept drift detection in data stream mining: A literature review. Journal of King Saud University – Computer and Information Sciences 34, 9523–9540. doi:10.1016/j.jksuci.2021.11.006. Aguiar,G.J.,Cano,A.,2023. Acomprehensiveanalysisofconceptdriftlocalityindatastreams. URL:https://arxiv.org/abs/2311.06396, arXiv:2311.06396. Baena-García, M., del Campo-Ávi...

work page doi:10.1016/j.jksuci.2021.11.006 2021

[3] [3]

Knowledge-Based Systems 245, 108632

From concept drift to model degradation: An overview on performance-aware drift detectors. Knowledge-Based Systems 245, 108632. doi:10.1016/j.knosys.2022.108632. :Preprint submitted to Elsevier Page 63 of 65 Cluster-Specific Localized Drift Detection for Efficient Batch Model Adaptation under Controlled Distribution Shift Becker, B., Kohavi, R.,

work page doi:10.1016/j.knosys.2022.108632 2022

[4] [4]

UCI Machine Learning Repository

Adult. UCI Machine Learning Repository. doi:10.24432/C5XW20. Bifet, A., Gavaldà, R.,

work page doi:10.24432/c5xw20

[5] [5]

Learning from time-changing data with adaptive windowing, in: Proceedings of the 2007 SIAM International Conference on Data Mining, SIAM. pp. 443–448. doi:10.1137/1.9781611972771.42. Bifet, A., Gavaldà, R.,

work page doi:10.1137/1.9781611972771.42 2007

[6] [6]

Adaptive learning from evolving data streams, in: Advances in Intelligent Data Analysis VIII, Springer. pp. 249–260. doi:10.1007/978-3-642-03915-7\_22. Bifet, A., Holmes, G., Pfahringer, B., Kranen, P., Kremer, H., Jansen, T., Seidl, T.,

work page doi:10.1007/978-3-642-03915-7

[7] [7]

Breiman, Random forests, Mach

Random forests. Machine Learning 45, 5–32. doi:10.1023/A:1010933404324. Cabello-López, T., Cañizares-Juan, M., Carranza-García, M., Garcia-Gutiérrez, J., Riquelme, J.C.,

work page doi:10.1023/a:1010933404324

[8] [8]

Concept drift detection to improve time series forecasting of wind energy generation, in: Lecture Notes in Computer Science, pp. 133–140. doi:10.1007/978-3-031-15471-3\_12. Chen, T., Guestrin, C.,

work page doi:10.1007/978-3-031-15471-3

[9] [9]

Xgboost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. doi:10.1145/2939672.2939785. Cortez,P.,Cerdeira,A.,Almeida,F.,Matos,T.,Reis,J.,2009.Modelingwinepreferencesbydataminingfromphysicochemicalproperties.Decision Support Systems 47, 547–553. doi:10.1016...

work page doi:10.1145/2939672.2939785 2009

[10] [10]

2004.838346

Nearest neighbor pattern classification. IEEE Transactions on Information Theory 13, 21–27. doi:10.1109/TIT. 1967.1053964. Ditzler,G.,Roveri,M.,Alippi,C.,Polikar,R.,2015. Learninginnonstationaryenvironments:Asurvey. IEEEComputationalIntelligenceMagazine 10, 12–25. doi:10.1109/MCI.2015.2471196. Domingos, P., Hulten, G.,

work page doi:10.1109/tit 1967

[11] [11]

Mining high-speed data streams, in: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM. pp. 71–80. doi:10.1145/347090.347107. Gama,J.,2010. KnowledgeDiscoveryfromDataStreams. 1ed.,ChapmanandHall/CRC,BocaRaton,FL. URL:https://doi.org/10.1201/ EBK1439826119, doi:10.1201/EBK1439826119. Gama, J., Medas, P....

work page doi:10.1145/347090.347107 2010

[12] [12]

Learning with drift detection, in: Advances in Artificial Intelligence – SBIA 2004, Springer. pp. 286–295. doi:10.1007/978-3-540-28645-5\_29. Gama, J., Žliobait˙e, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.,

work page doi:10.1007/978-3-540-28645-5 2004

[13] [13]

García-Teodoro, P., Díaz-Verdejo, J., Maciá-Fernández, G., & Vázquez, E

A survey on concept drift adaptation. ACM Computing Surveys 46, 44:1–44:37. doi:10.1145/2523813. Géron, A.,

work page doi:10.1145/2523813

[14] [14]

Machine Learning 106, 1469–1495

Adaptive random forests for evolving data stream classification. Machine Learning 106, 1469–1495. doi:10.1007/s10994-017-5642-8. Hancock, J.T., Khoshgoftaar, T.M.,

work page doi:10.1007/s10994-017-5642-8

[15] [15]

Haque, A., Khan, L., Baron, M.,

URL:https://doi.org/ 10.1186/s40537-020-00305-w, doi:10.1186/s40537-020-00305-w. Haque, A., Khan, L., Baron, M.,

work page doi:10.1186/s40537-020-00305-w

[16] [16]

Hastie, T., Tibshirani, R., Friedman, J.,

doi:10.1609/aaai.v30i1.10283. Hastie, T., Tibshirani, R., Friedman, J.,

work page doi:10.1609/aaai.v30i1.10283

[17] [17]

2 ed., Springer

The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2 ed., Springer. doi:10.1007/978-0-387-21606-5. Hinder, F., Vaquet, V., Hammer, B.,

work page doi:10.1007/978-0-387-21606-5

[18] [18]

Ho, T.K.,

doi:10.3389/frai.2024.1330258. Ho, T.K.,

work page doi:10.3389/frai.2024.1330258 2024

[19] [19]

Applied Logistic Regression. Wiley. doi:10.1002/0471722146. James, G., Witten, D., Hastie, T., Tibshirani, R.,

work page doi:10.1002/0471722146

[20] [20]

An Introduction to Statistical Learning: with Applications in R. Springer, New York. doi:10.1007/978-1-4614-7138-7. Kam Hamidieh,

work page doi:10.1007/978-1-4614-7138-7

[21] [21]

On handling concept drift, calibration and explainability in non-stationary environments and resources limited contexts, in: Proceedings of the 16th International Conference on Agents and Artificial Intelligence, pp. 336–346. doi:10.5220/0012382200003636. Khannouz, M., Glatard, T., 2020. A benchmark of data stream classification for human activity recognition on connect...

work page doi:10.5220/0012382200003636 2020

[22] [22]

2 ed., Wiley

Statistical Analysis with Missing Data. 2 ed., Wiley. doi:10.1002/9781119013563. Liu, A., Lu, J., Zhang, G.,

work page doi:10.1002/9781119013563

[23] [23]

IEEE Transactions on Cybernetics 51, 3198–3211

Concept drift detection via equal intensity k-means space partitioning. IEEE Transactions on Cybernetics 51, 3198–3211. doi:10.1109/TCYB.2020.2983962. Liu, A., Song, Y., Zhang, G., Lu, J.,

work page doi:10.1109/tcyb.2020.2983962 2020

[24] [24]

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2280–2286. doi:10.24963/ijcai.2017/317

Regional concept drift detection and density synchronized drift adaptation. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2280–2286. doi:10.24963/ijcai.2017/317. Losing, V., Hammer, B., Wersing, H.,

work page doi:10.24963/ijcai.2017/317 2017

[25] [25]

Neurocomputing 275, 1261–1274

Incremental on-line learning: A review and comparison of the state of the art algorithms. Neurocomputing 275, 1261–1274. doi:10.1016/j.neucom.2017.06.084. Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., Zhang, G.,

work page doi:10.1016/j.neucom.2017.06.084 2017

[26] [26]

IEEE Transactions on Knowledge and Data Engineering 31, 2346–2363

Learning under concept drift: A review. IEEE Transactions on Knowledge and Data Engineering 31, 2346–2363. doi:10.1109/TKDE.2018.2876857. Mehmood, H., Kostakos, P., Cortes, M., Anagnostopoulos, T., Pirtt...

work page doi:10.1109/tkde.2018.2876857 2018

[27] [27]

2340–2345

Online bagging and boosting, in: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pp. 2340–2345. doi:10.1109/ICSMC.2005.1571498. Pesaranghader, A., Viktor, H.L., Paquet, E.,

work page doi:10.1109/icsmc.2005.1571498 2005

[28] [28]

McDiarmid drift detection methods for evolving data streams, in: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–9. doi:10.1109/IJCNN.2018.8489260. Read, J., Bifet, A., Pfahringer, B., Holmes, G.,

work page doi:10.1109/ijcnn.2018.8489260 2018

[29] [29]

Batch-incremental versus instance-incremental learning in dynamic and evolving data, in: Advances in Intelligent Data Analysis XI, Springer. pp. 313–323. doi:10.1007/978-3-642-34156-4_29. Ross, G.J., Adams, N.M., Tasoulis, D.K., Hand, D.J.,

work page doi:10.1007/978-3-642-34156-4

[30] [30]

Exponentially weighted moving average charts for detecting concept drift

Exponentially weighted moving average charts for detecting concept drift. Pattern Recognition Letters 33, 191–198. doi:10.1016/j.patrec.2011.08.019. Schlimmer, J.C., Granger, R.H.,

work page doi:10.1016/j.patrec.2011.08.019 2011

[31] [31]

Machine Learning 1, 317–354

Incremental learning from noisy data. Machine Learning 1, 317–354. doi:10.1007/BF00116895. Sethi, T.S., Kantardzic, M.,

work page doi:10.1007/bf00116895

[32] [32]

Expert Systems with Applications 82, 77–99

On the reliable detection of concept drift from streaming unlabeled data. Expert Systems with Applications 82, 77–99. doi:10.1016/j.eswa.2017.04.008. Souza, V.M.A., dos Reis, D.M., Maletzke, A.G., Batista, G.E.A.P.A., 2020. Challenges in benchmarking stream learning algorithms with real-world data. Data Mining and Knowledge Discovery 34, 1805–1858. doi:10.1007/s10618-020...

work page doi:10.1016/j.eswa.2017.04.008 2017

[33] [33]

OLINDDA: A cluster-based approach for detecting novelty and concept drift in data streams, in: Proceedings of the 2007 ACM Symposium on Applied Computing, ACM. pp. 448–452. doi:10.1145/1244002.1244107. Street, W.N., Kim, Y.,

work page doi:10.1145/1244002.1244107 2007

[34] [34]

A streaming ensemble algorithm (SEA) for large-scale classification, in: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM. pp. 377–382. doi:10.1145/502512.502568. Street, W.N., Wolberg, W.H., Mangasarian, O.L., 1993. Nuclear feature extraction for breast tumor diagnosis, in: Biomedical Image Processing and Biomedic...

work page doi:10.1145/502512.502568 1993

[35] [35]

URL:https://archive.ics.uci.edu/dataset/291, doi:10.24432/C5VW2C

Airfoil Self-Noise. URL:https://archive.ics.uci.edu/dataset/291, doi:10.24432/C5VW2C. Vinagre, J., Jorge, A.M., Gama, J.,

work page doi:10.24432/c5vw2c

[36] [36]

Evaluation of recommender systems in streaming environments, in: Proceedings of the 9th Workshop on Real-Time Business Intelligence and Analytics, pp. 1–8. doi:10.1145/2611286.2611299. Wang, H., Fan, W., Yu, P.S., Han, J.,

work page doi:10.1145/2611286.2611299

[37] [37]

Mining concept-drifting data streams using ensemble classifiers, in: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM. pp. 226–235. doi:10.1145/956750.956778. Webb, G.I., Hyde, R., Cao, H., Nguyen, H.L., Petitjean, F., 2016. Characterizing concept drift. Data Mining and Knowledge Discovery 30, 964–994. doi:10.1007/s10...

work page doi:10.1145/956750.956778 2016

[38] [38]

Learning under Concept Drift: an Overview

Learning under concept drift: An overview. arXiv preprint arXiv:1010.4784. doi:10.48550/arXiv.1010.4784. Žliobaitė, I., Bifet, A., Pfahringer, B., Holmes, G.,

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1010.4784

[39] [39]

IEEE Transactions on Neural Networks and Learning Systems 25, 27–39

Active learning with drifting streaming data. IEEE Transactions on Neural Networks and Learning Systems 25, 27–39. doi:10.1109/TNNLS.2012.2236570.

Figure 10: Mean update training time across bench...

work page doi:10.1109/tnnls.2012.2236570 2012