arxiv: 2604.13465 · v1 · submitted 2026-04-15 · 💻 cs.LG · eess.SP

Adaptive Unknown Fault Detection and Few-Shot Continual Learning for Condition Monitoring in Ultrasonic Metal Welding

Ahmadreza Eslaminia, Kuan-Chieh Lu, Klara Nahrstedt, Chenhui Shao

Pith reviewed 2026-05-10 13:51 UTC · model grok-4.3

keywords: ultrasonic metal welding, unknown fault detection, continual learning, few-shot learning, condition monitoring, multilayer perceptron, fault classification, manufacturing process monitoring

The pith

Analyzing hidden layers of a multilayer perceptron with statistical thresholds detects unknown faults in ultrasonic metal welding and adds them via few-shot continual learning by updating only the final layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an adaptive monitoring system for ultrasonic metal welding that identifies process faults absent from the training data. It examines the hidden-layer activations of a multilayer perceptron and applies a statistical threshold to flag samples that deviate from both known fault patterns and normal operation. Flagged samples are then grouped by cosine similarity and clustering so that only a small number require manual labels. The model incorporates each new fault by retraining only its final layers, which is intended to preserve classification of all prior classes. On a multi-sensor UMW dataset, the method is reported to reach 96 percent accuracy in detecting unseen fault conditions and 98 percent test classification accuracy after adding one new fault type from five labeled samples.

Core claim

Unknown faults are detected by applying a statistical threshold to the hidden-layer representations of a multilayer perceptron trained on known conditions. Detected unknown samples are grouped using cosine similarity clustering to minimize labeling effort, after which the new fault type is added through continual learning that updates only the final network layers while preserving performance on existing classes and normal states.
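As a rough illustration of the detection step, the sketch below flags a sample as unknown when its hidden-layer vector lies farther from every known-class centroid than a threshold of the form mean + k·std. The centroid-distance score, the function names, and the default k = 3 are assumptions made for illustration; this page does not state which statistic the paper actually thresholds.

```python
import math
import statistics

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_detector(hidden_by_class, k=3.0):
    """Build per-class centroids and one global distance threshold.

    hidden_by_class: dict mapping class label -> list of hidden-layer vectors
    k: multiplier on the standard deviation (assumed default, not from the paper)
    """
    centroids = {c: centroid(rows) for c, rows in hidden_by_class.items()}
    dists = [euclidean(v, centroids[c])
             for c, rows in hidden_by_class.items() for v in rows]
    threshold = statistics.mean(dists) + k * statistics.pstdev(dists)
    return centroids, threshold

def is_unknown(hidden_vec, centroids, threshold):
    """True when the nearest known-class centroid is farther than the threshold."""
    nearest = min(euclidean(hidden_vec, c) for c in centroids.values())
    return nearest > threshold
```

In this form the only tunable quantity is k, which trades missed unknowns against false alarms on known conditions.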

What carries the argument

Hidden-layer representation monitoring via statistical thresholding for unknown detection, paired with selective final-layer updates in a continual learning step after cosine-similarity clustering of new samples.

Load-bearing premise

That the hidden-layer activations of the multilayer perceptron produce statistically separable deviations for unknown faults that do not overlap excessively with normal process variation or known faults.

What would settle it

Apply the detector to welding sensor recordings that contain a new fault type deliberately chosen so its hidden representations fall inside the statistical bounds of normal operation, then check whether unknown-fault detection accuracy falls below 90 percent or known-class accuracy drops.

Figures

Figures reproduced from arXiv: 2604.13465 by Ahmadreza Eslaminia, Chenhui Shao, Klara Nahrstedt, Kuan-Chieh Lu.

**Figure 2.** Schematic for the proposed unknown fault detection method

**Figure 3.** Model updating process of continual learning

**Figure 4.** Online monitoring system for UMW. The dataset covers mixed process conditions involving three tool conditions (new, worn, and damaged) and three surface conditions (clean, contaminated, and polished), yielding nine classes with 30 samples each (270 samples in total). Combinations of the new and worn tool conditions with all three surface types form six known fault classes used for initial model training, w…

**Figure 5.** Photo of the UMW machine with sensors [28].

**Figure 6.** Unknown fault detection results: (a) damaged tool and (b) new tool.

**Figure 7.** Confusion matrix after incorporating one new fault class using five labeled samples

**Figure 8.** Classification accuracy as a function of the number of unknown classes and the

**Figure 9.** BIRCH clustering results (axes: Similarity to New/Contaminated, Similarity to New/Polished; clusters 0 to 4)

**Figure 10.** Clustering result in the selected cosine similarity space

Original abstract

Ultrasonic metal welding (UMW) is widely used in industrial applications but is sensitive to tool wear, surface contamination, and material variability, which can lead to unexpected process faults and unsatisfactory weld quality. Conventional monitoring systems typically rely on supervised learning models that assume all fault types are known in advance, limiting their ability to handle previously unseen process faults. To address this challenge, this paper proposes an adaptive condition monitoring approach that enables unknown fault detection and few-shot continual learning for UMW. Unknown faults are detected by analyzing hidden-layer representations of a multilayer perceptron and leveraging a statistical thresholding strategy. Once detected, the samples from unknown fault types are incorporated into the existing model through a continual learning procedure that selectively updates only the final layers of the network, which enables the model to recognize new fault types while preserving knowledge of existing classes. To accelerate the labeling process, cosine similarity transformation combined with a clustering algorithm groups similar unknown samples, thereby reducing manual labeling effort. Experimental results using a multi-sensor UMW dataset demonstrate that the proposed method achieves 96% accuracy in detecting unseen fault conditions while maintaining reliable classification of known classes. After incorporating a new fault type using only five labeled samples, the updated model achieves 98% testing classification accuracy. These results demonstrate that the proposed approach enables adaptive monitoring with minimal retraining cost and time. The proposed approach provides a scalable solution for continual learning in condition monitoring where new process conditions may constantly emerge over time and is extensible to other manufacturing processes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper assembles a usable pipeline for unknown-fault detection via hidden-layer statistics plus selective few-shot updates in UMW monitoring, but the threshold's robustness against normal process drift is not convincingly shown.

The paper's main contribution is a concrete pipeline that flags unknown faults by applying a statistical threshold to hidden-layer activations from a standard MLP, then folds the new class into the model by updating only the final layers while using cosine-similarity clustering to cut down on labeling. On their multi-sensor ultrasonic metal welding dataset this yields the reported 96% unseen-fault detection accuracy and 98% post-update classification accuracy with five labeled samples. Those numbers and the low-retraining design are the practical points worth noting for anyone running condition monitoring on the factory floor.

Referee Report

2 major / 2 minor

Summary. The paper proposes an adaptive condition monitoring framework for ultrasonic metal welding (UMW) that detects previously unseen faults by applying a statistical threshold to hidden-layer activations of a multilayer perceptron, then incorporates the new fault type via few-shot continual learning that updates only the final network layers while using cosine-similarity clustering to reduce labeling effort. On a multi-sensor UMW dataset the method is reported to achieve 96% accuracy on unseen faults while preserving known-class performance, and 98% test accuracy after adding one new fault type from only five labeled samples.

Significance. If the empirical claims are robust, the work addresses a practically important gap in industrial condition monitoring: the need to handle emergent faults without catastrophic forgetting or full retraining. The combination of representation-based unknown detection with selective layer updates and clustering-based labeling offers a low-cost path to continual adaptation that could generalize to other manufacturing processes. The few-shot regime with only five samples is particularly attractive for real-world deployment where labeling is expensive.

major comments (2)

[Method (unknown fault detection)] Unknown-fault detection procedure (method section): the statistical thresholding applied to MLP hidden representations is not accompanied by any derivation, cross-validation procedure, or ablation that demonstrates separation from normal process variability (tool wear, surface contamination, material changes). Without such evidence the 96% unseen-fault detection accuracy cannot be considered supported, and downstream few-shot updates risk inheriting mislabeled samples.
[Experiments] Experimental evaluation: the reported 96% and 98% accuracies are presented without baselines, number of independent runs, error bars, dataset statistics, or explicit description of how the detection threshold and layer-update choices were selected or validated. These omissions make it impossible to judge whether the performance gains are statistically reliable or merely artifacts of a particular data split.

minor comments (2)

[Abstract] The abstract and method sections should explicitly state the number of sensors, total samples, and the concrete fault types present in the UMW dataset.
[Method] Notation for the statistical threshold (e.g., mean + k·std) and the clustering hyperparameters should be introduced with symbols and default values in the method section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate the suggested improvements, thereby strengthening the justification of our method and the rigor of the experimental evaluation.

Referee: [Method (unknown fault detection)] Unknown-fault detection procedure (method section): the statistical thresholding applied to MLP hidden representations is not accompanied by any derivation, cross-validation procedure, or ablation that demonstrates separation from normal process variability (tool wear, surface contamination, material changes). Without such evidence the 96% unseen-fault detection accuracy cannot be considered supported, and downstream few-shot updates risk inheriting mislabeled samples.

Authors: We agree that additional justification is required for the statistical thresholding procedure. In the revised manuscript we will add a formal derivation of the threshold (based on the empirical mean and standard deviation of hidden-layer activations computed on normal-process data) together with a cross-validation procedure for selecting the multiplier k. We will also include a new ablation study that explicitly compares activation distributions under normal variability (tool wear, surface contamination, material changes) versus the unseen fault conditions, demonstrating that the chosen threshold separates the two regimes with high reliability. These additions will directly support the reported 96% detection accuracy and reduce the risk of propagating mislabeled samples into the continual-learning stage. revision: yes
Referee: [Experiments] Experimental evaluation: the reported 96% and 98% accuracies are presented without baselines, number of independent runs, error bars, dataset statistics, or explicit description of how the detection threshold and layer-update choices were selected or validated. These omissions make it impossible to judge whether the performance gains are statistically reliable or merely artifacts of a particular data split.

Authors: We acknowledge that the current experimental section lacks several elements needed for full reproducibility and statistical assessment. In the revision we will (i) add baseline comparisons against standard anomaly-detection and continual-learning methods (e.g., one-class SVM, autoencoder-based reconstruction error, and EWC), (ii) report all accuracies as means and standard deviations over at least five independent random seeds with error bars, (iii) include a table of dataset statistics (sample counts per class, sensor modalities, and train/validation/test splits), and (iv) describe the validation protocol used to select both the detection threshold and the number of layers updated during few-shot adaptation. These changes will allow readers to evaluate the statistical reliability of the 96% and 98% figures and confirm that results are not artifacts of a single split. revision: yes
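One simple way to pick the multiplier k is sketched below: sweep candidate values and keep the smallest one whose threshold holds the false-alarm rate on held-out known-condition samples at or under a target. This uses a single hold-out split rather than full cross-validation, and the candidate grid, the 5% target, and the function name are hypothetical, not taken from the paper or the rebuttal.

```python
import statistics

def choose_k(train_dists, val_dists,
             candidates=(1.0, 1.5, 2.0, 2.5, 3.0), max_false_alarm=0.05):
    """Return (k, threshold) for the smallest acceptable k, or None.

    train_dists: detection scores of known-condition training samples
    val_dists:   detection scores of held-out known-condition samples
    """
    mu = statistics.mean(train_dists)
    sigma = statistics.pstdev(train_dists)
    for k in sorted(candidates):
        threshold = mu + k * sigma
        # Fraction of held-out known samples wrongly flagged as unknown.
        false_alarms = sum(d > threshold for d in val_dists) / len(val_dists)
        if false_alarms <= max_false_alarm:
            return k, threshold
    return None
```

Choosing the smallest acceptable k keeps the detector as sensitive as the false-alarm budget allows; repeating the sweep over several splits would give the cross-validated version the authors describe.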

Circularity Check

0 steps flagged

No significant circularity: empirical ML pipeline validated on dataset

The paper describes a practical pipeline: MLP hidden-layer representations + statistical thresholding for unknown fault detection, followed by selective final-layer updates for few-shot continual learning and cosine-similarity clustering to reduce labeling. No equations, derivations, or self-referential definitions are present that reduce any prediction to its inputs by construction. Claims rest on reported test accuracies (96% unseen detection, 98% post-update) from a multi-sensor UMW dataset rather than tautological fits or self-citation chains. No load-bearing uniqueness theorems, ansatzes, or renamings of known results are invoked. The claims are checked by explicit experimental evaluation rather than holding by construction.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The central claim depends on several unstated modeling choices and domain assumptions typical of applied neural-network work; without the full text these cannot be enumerated exhaustively.

free parameters (3)

statistical detection threshold
Value used to flag unknown faults from hidden representations; not specified in abstract.
number of final layers updated
Choice of which layers to adapt during continual learning; not detailed.
clustering hyperparameters
Parameters for grouping unknown samples via cosine similarity; not reported.
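
To make the clustering free parameters concrete, here is a minimal sketch of a cosine-similarity transform followed by grouping. The Figure 9 caption indicates the paper uses BIRCH; the greedy leader clustering below is a simpler stand-in swapped in to keep the example dependency-free, and `radius`, the reference vectors, and all function names are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similarity_features(sample, class_refs):
    """Represent a sample by its cosine similarity to each known-class reference."""
    return [cosine(sample, ref) for ref in class_refs]

def leader_cluster(points, radius):
    """Greedy leader clustering (stand-in for BIRCH).

    Each point joins the first cluster whose leader lies within `radius`,
    otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for p in points:
        for idx, lead in enumerate(leaders):
            if math.dist(p, lead) <= radius:
                labels.append(idx)
                break
        else:
            leaders.append(p)
            labels.append(len(leaders) - 1)
    return labels
```

An operator would then label one representative per cluster instead of every flagged sample, which is where the reduction in labeling effort comes from.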

axioms (2)

domain assumption Hidden-layer activations of the MLP form separable clusters for known versus unknown faults
Invoked by the statistical thresholding strategy for unknown-fault detection.
domain assumption Updating only the final layers is sufficient to incorporate new classes without catastrophic forgetting
Central to the continual-learning procedure described.
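
The second assumption can be illustrated with a small sketch in which the hidden layers are frozen and only the output layer changes. Here the update is narrowed further to a single appended output row trained by gradient descent on cross-entropy, with a few stored known-class samples used as negatives. Both choices, along with the function names, learning rate, and step count, are assumptions; the paper is only described as updating "the final layers".

```python
import math

def logits(W, b, h):
    """Output-layer scores for one frozen hidden-layer vector h."""
    return [sum(w * x for w, x in zip(row, h)) + bias for row, bias in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def add_class(W, b, feats, labels, lr=0.5, steps=2000):
    """Append one output row for a new class and train only that row.

    Samples of the new class must carry the label len(W). Rows for the
    existing classes are copied and never modified.
    """
    new = len(W)
    W = [row[:] for row in W] + [[0.0] * len(W[0])]
    b = list(b) + [0.0]
    n = len(feats)
    for _ in range(steps):
        grad_w = [0.0] * len(W[new])
        grad_b = 0.0
        for h, y in zip(feats, labels):
            p_new = softmax(logits(W, b, h))[new]
            err = p_new - (1.0 if y == new else 0.0)  # d(cross-entropy)/d(new logit)
            grad_w = [g + err * x for g, x in zip(grad_w, h)]
            grad_b += err
        W[new] = [w - lr * g / n for w, g in zip(W[new], grad_w)]
        b[new] -= lr * grad_b / n
    return W, b

def predict(W, b, h):
    """Index of the highest-scoring class."""
    z = logits(W, b, h)
    return z.index(max(z))
```

Because the old rows are frozen, the logits of existing classes are identical before and after the update. Predictions on old classes can still flip if the new row's score overtakes them, which is why known-class negatives are included in this sketch.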

pith-pipeline@v0.9.0 · 5582 in / 1467 out tokens · 44721 ms · 2026-05-10T13:51:45.498972+00:00 · methodology

Reference graph

Works this paper leans on

45 extracted references · 3 canonical work pages

[1]

K. Martinsen, S. Hu, B. Carlson, Joining of dissimilar materials, CIRP Annals 64 (2015) 679–699

2015
[2]

W. W. Cai, B. Kang, S. J. Hu, Ultrasonic welding of lithium-ion batteries, ASME press, 2017

2017
[3]

T. H. Kim, J. Yum, S. J. Hu, J. P. Spicer, J. A. Abell, Process robustness of single lap ultrasonic welding of thin, dissimilar materials, CIRP Annals 60 (2011) 17–20

2011
[4]

A. Banerjee, I. Manna, V. Sharma, A. Das, Quantifying the role of ultrasonic welding and press contacts on electrical resistance for developing battery interconnects, Journal of Energy Storage 132 (2025) 117905

2025
[5]

F. Haddadi, D. Tsivoulas, Grain structure, texture and mechanical property evolution of automotive aluminium sheet during high power ultrasonic welding, Materials Characterization 118 (2016) 340–351

2016
[6]

Z. L. Ni, F. X. Ye, Ultrasonic spot welding of aluminum alloys: A review, Journal of Manufacturing Processes 35 (2018) 580–594

2018
[7]

M. Becker, F. Balle, Multi-spot ultrasonic welding of aluminum to steel sheets: Process and fracture analysis, Metals 11 (2021) 779

2021
[8]

Y. Meng, D. Peng, Q. Nazir, G. Kuntumalla, M. C. Rajagopal, H. C. Chang, H. Zhao, S. Sundar, P. M. Ferreira, S. Sinha, et al., Ultrasonic welding of soft polymer and metal: a preliminary study, in: International Manufacturing Science and Engineering Conference, volume 58752, American Society of Mechanical Engineers, 2019, p. V002T03A083

2019
[9]

G. Kuntumalla, Y. Meng, M. Rajagopal, R. Toro, H. Zhao, H. C. Chang, S. Sundar, S. Salapaka, N. Miljkovic, C. Shao, et al., Joining techniques for novel metal polymer hybrid heat exchangers, in: ASME International Mechanical Engineering Congress and Exposition, volume 59384, American Society of Mechanical Engineers, 2019, p. V02BT02A018

2019
[10]

L. Nong, C. Shao, T. H. Kim, S. J. Hu, Improving process robustness in ultrasonic metal welding of lithium-ion batteries, Journal of Manufacturing Systems 48 (2018) 45–54

2018
[11]

K.-C. Lu, L.-W. Shih, C. Shao, An integrated learning, monitoring, and control system for ultrasonic metal welding, Journal of Manufacturing Processes 161 (2026) 267–276

2026
[12]

S. S. Lee, C. Shao, T. H. Kim, S. J. Hu, E. Kannatey-Asibu, W. W. Cai, J. P. Spicer, J. A. Abell, Characterization of ultrasonic metal welding by correlating online sensor signals with weld attributes, Journal of Manufacturing Science and Engineering 136 (2014)

2014
[13]

Q. Nazir, C. Shao, Online tool condition monitoring for ultrasonic metal welding via sensor fusion and machine learning, Journal of Manufacturing Processes 62 (2021) 806–816

2021
[14]

K.-C. Lu, Z. Dong, C. Shao, Sensor and feature selection for cost-and time-efficient online monitoring of ultrasonic metal welding, Journal of Manufacturing Processes 160 (2026) 498–508

2026
[15]

C. Shao, K. Paynabar, T. H. Kim, J. J. Jin, S. J. Hu, J. P. Spicer, H. Wang, J. A. Abell, Feature selection for manufacturing process monitoring using cross-validation, Journal of Manufacturing Systems 32 (2013) 550–555

2013
[16]

M. Feng, Z. Wang, D. Meng, C. Liu, J. Hu, P. Wang, A review of quality monitoring for ultrasonic metal welding, Materials Science and Technology 40 (2024) 3–25

2024
[17]

P. Geng, Y. Xia, Z. Dong, B. Men, B. Pan, C. Shao, Y. Li, J. Li, Machine learning applications in welding processes: progresses and opportunities, International Journal of Machine Tools and Manufacture (2025) 104344

2025
[18]

W. Guo, C. Shao, T. H. Kim, S. J. Hu, J. Jin, J. P. Spicer, H. Wang, Online process monitoring with near-zero misdetection for ultrasonic welding of lithium-ion batteries: An integration of univariate and multivariate methods, Journal of Manufacturing Systems 38 (2016) 141–150

2016
[19]

I. Balz, E. Abi Raad, E. Rosenthal, R. Lohoff, A. Schiebahn, U. Reisgen, M. Vorländer, Process monitoring of ultrasonic metal welding of battery tabs using external sensor data, Journal of Advanced Joining Processes 1 (2020) 100005

2020
[20]

E. B. Schwarz, F. Bleier, F. Guenter, R. Mikut, J. P. Bergmann, Improving process monitoring of ultrasonic metal welding using classical machine learning methods and process-informed time series evaluation, Journal of Manufacturing Processes 77 (2022) 54–62

2022
[21]

Y. Meng, C. Shao, Physics-informed ensemble learning for online joint strength prediction in ultrasonic metal welding, Mechanical Systems and Signal Processing 181 (2022)

2022
[22]

D. Zhao, D. Ren, K. Zhao, S. Pan, X. Guo, Effect of welding parameters on tensile strength of ultrasonic spot welded joints of aluminum to steel–by experimentation and artificial neural network, Journal of Manufacturing Processes 30 (2017) 63–74

2017
[23]

Y. Wu, Y. Meng, C. Shao, End-to-end online quality prediction for ultrasonic metal welding using sensor fusion and deep learning, Journal of Manufacturing Processes 83 (2022) 685–694

2022
[24]

C. Shao, T. Hyung Kim, S. Jack Hu, J. Jin, J. A. Abell, J. Patrick Spicer, Tool wear monitoring for ultrasonic metal welding of lithium-ion batteries, Journal of Manufacturing Science and Engineering 138 (2016) 051005

2016
[25]

Z. Dong, C. Shao, Fine-scale characterization and monitoring of tool surface degradation in ultrasonic metal welding using optical measurements and computer vision, Journal of Computing and Information Science in Engineering 25 (2025) 111001

2025
[26]

K.-C. Lu, Y. Meng, Z. Dong, C. Shao, Online cost-effective classification of mixed tool and material conditions in ultrasonic metal welding: Towards integrated monitoring and control, in: International Manufacturing Science and Engineering Conference, volume 87240, American Society of Mechanical Engineers, 2023, p. V002T07A007

2023
[27]

B. Tian, A. Eslaminia, K.-C. Lu, Y. Wang, C. Shao, K. Nahrstedt, Weldmon: A cost-effective ultrasonic welding machine condition monitoring system, in: 2023 IEEE 14th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), IEEE, 2023, pp. 0310–0319

2023
[28]

Y. Meng, Z. Dong, K.-C. Lu, S. Li, C. Shao, Meta-learning-based domain generalization for cost-effective tool condition monitoring in ultrasonic metal welding, IEEE Transactions on Industrial Informatics (2024)

2024
[29]

A. Eslaminia, Y. Meng, K. Nahrstedt, C. Shao, Federated domain generalization for condition monitoring in ultrasonic metal welding, Journal of Manufacturing Systems 77 (2024) 1–12

2024
[30]

Y. Meng, K. C. Lu, Z. Dong, S. Li, C. Shao, Explainable few-shot learning for online anomaly detection in ultrasonic metal welding with varying configurations, Journal of Manufacturing Processes 107 (2023) 345–355

2023
[31]

C. Geng, S.-j. Huang, S. Chen, Recent advances in open set recognition: A survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 43 (2021) 3614–3631. doi:10.1109/TPAMI.2020.2981604

work page doi:10.1109/tpami.2020.2981604 2021
[32]

S. Chen, Y. Meng, H. Tang, Y. Tian, N. He, C. Shao, Robust deep learning-based diagnosis of mixed faults in rotating machinery, IEEE/ASME Transactions on Mechatronics 25 (2020) 2167–2176

2020
[33]

K. Naghavi Khanghah, A. Patel, R. Malhotra, H. Xu, Large language models for extrapolative modeling of manufacturing processes, Journal of Intelligent Manufacturing (2025) 1–29

2025
[34]

J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, et al., Overcoming catastrophic forgetting in neural networks, Proceedings of the National Academy of Sciences 114 (2017) 3521–3526. doi:10.1073/pnas.1611835114

work page doi:10.1073/pnas.1611835114 2017
[35]

M. Boschini, L. Bonicelli, P. Buzzega, A. Porrello, S. Calderara, Class-incremental continual learning into the extended der-verse, IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (2022) 5497–5512

2022
[36]

M. Zhao, R. Fu, Q. Ren, Few-shot learning approaches for fault diagnosis using vibration data: a comprehensive review, Sustainability 15 (2023) 14975

2023
[37]

A. Theissler, Detecting known and unknown faults in automotive systems using ensemble-based anomaly detection, Knowledge-Based Systems 123 (2017) 163–173

2017
[38]

C. Wang, J. Nie, P. Yin, J. Xu, S. Yu, X. Ding, Unknown fault detection of rolling bearings guided by global–local feature coupling, Mechanical Systems and Signal Processing 213 (2024) 111331

2024
[39]

D.-W. Zhou, Q.-W. Wang, Z.-H. Qi, H.-J. Ye, D.-C. Zhan, Z. Liu, Class-incremental learning: A survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (2024) 9851–9873

2024
[40]

A. Eslaminia, A. Jackson, B. Tian, A. Stern, H. Gordon, R. Malhotra, K. Nahrstedt, C. Shao, Fdm-bench: a domain-specific benchmark for evaluating large language models in additive manufacturing, Manufacturing Letters 44 (2025) 1415–1424

2025
[41]

B. Tian, Y. Wang, A. Eslaminia, R. Gupta, R. B. Kaufman, L. Espenhahn, G. Pezzarossi, M. Sardela, J. Dallesasse, K. Nahrstedt, Machinestethoscope: A smart and cost-effective machine health monitoring system, in: 2025 IEEE 8th International Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, 2025, pp. 323–329

2025
[42]

P. Xia, L. Zhang, F. Li, Learning similarity with cosine similarity ensemble, Information sciences 307 (2015) 39–52

2015
[43]

M. Mehta, C. Shao, A greedy agglomerative framework for clustered federated learning, IEEE Transactions on Industrial Informatics 19 (2023) 11856–11867

2023
[44]

A. K. Jain, M. N. Murty, P. J. Flynn, Data clustering: a review, ACM computing surveys (CSUR) 31 (1999) 264–323

1999
[45]

T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: A new data clustering algorithm and its applications, Data Mining and Knowledge Discovery 1 (1997) 141–182. doi:10.1023/A:1009783824328

work page doi:10.1023/a:1009783824328 1997