Evaluating and Combating the Impact of Concept Drift on the Performance of Machine Learning-Based Phishing Detection Systems
Pith reviewed 2026-06-27 12:16 UTC · model grok-4.3
The pith
Evolution in spam emails causes concept drift that degrades machine learning phishing detectors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The evolution within the spam email domain impacts the performance of machine learning-based detection systems, and strategies can be explored for mitigating associated performance degradation.
What carries the argument
Concept drift arising from the changing statistical properties of spam and phishing emails, which creates a mismatch between training data and new messages seen by deployed models.
If this is right
- Machine learning models trained on historical spam data will show declining accuracy on newer phishing attempts.
- Periodic model retraining or adaptive learning methods become necessary to restore performance.
- Email security pipelines must incorporate mechanisms to detect and respond to domain evolution.
Where Pith is reading between the lines
- The same drift mechanism is likely to affect machine learning detectors in related domains such as malware classification.
- Continuous performance monitoring against live email traffic would be a direct practical next step.
- Feature sets that are more invariant to attacker evolution could reduce the frequency of required updates.
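The continuous-monitoring idea in the list above can be sketched as a sliding-window accuracy check. This is a minimal illustration and not a method taken from the paper: the class name `AccuracyMonitor`, the window size, the baseline accuracy, and the tolerance are all hypothetical, and the sketch assumes that ground-truth labels for recent emails eventually become available (for example through user reports).

```python
from collections import deque


class AccuracyMonitor:
    """Flags possible drift when recent accuracy falls well below a baseline.

    All default values are illustrative assumptions, not values from the paper.
    """

    def __init__(self, window=200, baseline=0.95, tolerance=0.05):
        self.recent = deque(maxlen=window)  # 1 = correct prediction, 0 = wrong
        self.baseline = baseline
        self.tolerance = tolerance

    def update(self, was_correct):
        """Record one labelled outcome; return True if retraining looks warranted."""
        self.recent.append(1 if was_correct else 0)
        if len(self.recent) < self.recent.maxlen:
            return False  # window not full yet, so not enough evidence
        rolling_accuracy = sum(self.recent) / len(self.recent)
        return rolling_accuracy < self.baseline - self.tolerance
```

A deployed pipeline would call `update` once per labelled email and schedule retraining when it returns True. A fixed threshold is the simplest possible trigger; adaptive detectors such as ADWIN replace it with a statistical test over the window.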
Load-bearing premise
That measurable concept drift occurs in the phishing domain and produces clear performance degradation in standard machine learning models.
What would settle it
A time-series experiment that trains fixed machine learning models on older phishing data and measures their detection rates on progressively newer emails. If such an experiment showed no accuracy drop, the premise would fail.
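The shape of such an experiment can be sketched on synthetic data. Everything below is an assumption made for illustration: the single numeric feature, the Gaussian class distributions, the linear drift of the phishing class toward the legitimate class, and the nearest-centroid threshold classifier. None of it comes from the paper, which reports no datasets or models.

```python
import random
from statistics import mean


def make_period(rng, shift, n=500):
    """Synthetic one-feature 'emails' as (x, label) pairs; label 1 = phishing."""
    legit = [(rng.gauss(0.0, 1.0), 0) for _ in range(n)]
    # The phishing class moves toward the legitimate class as shift grows.
    phish = [(rng.gauss(4.0 - shift, 1.0), 1) for _ in range(n)]
    return legit + phish


def fit_threshold(data):
    """Nearest-centroid rule: threshold halfway between the two class means."""
    m0 = mean(x for x, y in data if y == 0)
    m1 = mean(x for x, y in data if y == 1)
    return (m0 + m1) / 2.0


def accuracy(threshold, data):
    """Fraction of samples where 'x above threshold' matches the phishing label."""
    hits = [(x > threshold) == (y == 1) for x, y in data]
    return sum(hits) / len(hits)


def temporal_evaluation(n_periods=5, drift_per_period=1.0, seed=0):
    """Train once on the oldest period, then test on each later period in order."""
    rng = random.Random(seed)
    threshold = fit_threshold(make_period(rng, 0.0))  # frozen after this line
    return [accuracy(threshold, make_period(rng, drift_per_period * t))
            for t in range(n_periods)]


for period, acc in enumerate(temporal_evaluation()):
    print(f"period {period}: accuracy {acc:.3f}")
```

Setting `drift_per_period=0.0` gives the control condition: if accuracy stays flat there but falls when drift is present, the degradation can be attributed to the distribution change rather than to sampling noise.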
Original abstract
The expansion of the digital domain has resulted in a substantial increase in digital communication, with email emerging as one of the most prominent channels. The proliferation of email communication is apparent in both professional and personal contexts, thereby creating numerous vulnerabilities for malicious actors to exploit. Spam emails, a form of unsolicited correspondence often bearing malicious intent towards recipients, have been an ongoing challenge for email users since the inception of email technology, and this problem has been exacerbated by the growth of the digital landscape. Email spam filters are integral components of email clients, engineered to identify potentially harmful messages and alert users to their malicious content. Phishing, frequently the initial phase of malware-based attacks, is evolving rapidly, with malware becoming increasingly sophisticated over time. A widely adopted approach for detecting malicious activity within malware and spam domains is the application of machine learning. Our aim is to assess the impact of the evolution within the spam email domain on these machine learning-based detection systems and to explore strategies for mitigating associated performance degradation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that the evolution within the spam email domain impacts the performance of machine learning-based detection systems and that strategies can be explored for mitigating associated performance degradation due to concept drift in phishing detection.
Significance. If the results hold, the work would address a practically important issue in cybersecurity, where phishing remains a leading attack vector and static ML detectors are known to degrade over time in adversarial settings. Quantifying drift effects and validating mitigations could inform more robust detection pipelines.
major comments (2)
- [Abstract / Manuscript body] Abstract and overall manuscript: The central claim requires empirical demonstration via time-ordered datasets, drift quantification (e.g., via Hellinger distance or ADWIN), statistically significant performance drops (accuracy/F1 on future vs. past test sets), and at least one evaluated mitigation, but no datasets, temporal splits, models, metrics, or results are provided anywhere in the text.
- [Methods / Experiments] Methods and evaluation (absent): No description of feature sets, ML algorithms, evaluation protocol, or mitigation approaches (e.g., online learning, drift detection triggers) is given, preventing assessment of whether observed degradation is due to concept drift rather than label shift or other confounds.
minor comments (1)
- [Abstract] Abstract: The opening sentences are somewhat repetitive regarding the growth of digital communication and email; tightening would improve readability.
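The Hellinger distance named in the first major comment can be computed between histograms of a feature taken from an older and a newer batch of emails. The sketch below is an illustration under assumptions: the helper names `histogram` and `hellinger`, the bin count, and the example feature values are hypothetical and do not come from the manuscript.

```python
import math


def histogram(values, lo, hi, bins):
    """Normalised histogram over [lo, hi]; out-of-range values go to the edge bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same bins.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical feature values (e.g. a normalised URL-length score) from two periods.
old_batch = [0.10, 0.15, 0.20, 0.22, 0.30, 0.35]
new_batch = [0.40, 0.55, 0.60, 0.62, 0.70, 0.85]
drift_score = hellinger(histogram(old_batch, 0.0, 1.0, 5),
                        histogram(new_batch, 0.0, 1.0, 5))
print(f"Hellinger distance between periods: {drift_score:.3f}")
```

Both histograms must share the same bin edges for the comparison to be meaningful. What counts as a large distance is an empirical choice that would need calibrating on batches known to be drift-free.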
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. We acknowledge that the submitted manuscript consists primarily of a high-level motivation and statement of aims without the empirical components, datasets, methods, or results needed to substantiate the central claims about concept drift in phishing detection. We will revise the manuscript to incorporate the required elements.
Point-by-point responses
- Referee: [Abstract / Manuscript body] Abstract and overall manuscript: The central claim requires empirical demonstration via time-ordered datasets, drift quantification (e.g., via Hellinger distance or ADWIN), statistically significant performance drops (accuracy/F1 on future vs. past test sets), and at least one evaluated mitigation, but no datasets, temporal splits, models, metrics, or results are provided anywhere in the text.
  Authors: We agree that the current manuscript does not contain the requested empirical demonstration. The provided text is limited to the abstract and motivation. In revision we will add time-ordered datasets with explicit temporal splits, drift quantification (including Hellinger distance and ADWIN), statistical tests on performance degradation (accuracy/F1 on future versus past sets), and evaluation of at least one mitigation strategy.
  Revision: yes
- Referee: [Methods / Experiments] Methods and evaluation (absent): No description of feature sets, ML algorithms, evaluation protocol, or mitigation approaches (e.g., online learning, drift detection triggers) is given, preventing assessment of whether observed degradation is due to concept drift rather than label shift or other confounds.
  Authors: We concur that the methods and evaluation sections are absent. The revised manuscript will include detailed descriptions of feature sets, ML algorithms, the temporal evaluation protocol, mitigation approaches (online learning, drift detection triggers), and explicit discussion of potential confounds such as label shift to isolate concept drift effects.
  Revision: yes
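One simple form the promised statistical test on performance degradation could take is a one-sided two-proportion z-test comparing accuracy on a past test set with accuracy on a future one. This is an assumption about what such a test might look like, not the authors' stated procedure; the function name and the example counts are hypothetical, and the test treats the two sets as independent samples.

```python
import math


def accuracy_drop_z_test(correct_past, n_past, correct_future, n_future):
    """One-sided two-proportion z-test for H1: past accuracy > future accuracy.

    Returns (z, p_value). Raises ValueError if the pooled accuracy is exactly
    0 or 1, because the standard error is then zero.
    """
    p_past = correct_past / n_past
    p_future = correct_future / n_future
    pooled = (correct_past + correct_future) / (n_past + n_future)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_past + 1.0 / n_future))
    if se == 0.0:
        raise ValueError("pooled accuracy is 0 or 1; z-test is undefined")
    z = (p_past - p_future) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 950/1000 correct on past data, 850/1000 on future data.
z, p = accuracy_drop_z_test(950, 1000, 850, 1000)
print(f"z = {z:.2f}, one-sided p = {p:.2e}")
```

A significant result shows only that accuracy fell. Attributing the fall to concept drift rather than label shift would still need the confound analysis the referee asks for, for example by also comparing class proportions between the two sets.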
Circularity Check
No derivation chain or equations present; empirical claim shows no circularity.
full rationale
The supplied abstract and description contain only a high-level research aim to assess concept drift impact on ML phishing detectors and explore mitigations, with no equations, fitted parameters, predictions, self-citations, or uniqueness theorems invoked. No load-bearing step reduces to its own inputs by construction, as there is no mathematical derivation or renamed empirical pattern at all. This is the common case of a non-circular empirical proposal whose validity depends on future experiments rather than internal self-reference.