
arxiv: 1907.01162 · v1 · pith:ZDVHC4ULnew · submitted 2019-07-02 · 💻 cs.LG · stat.ML

Sample Adaptive Multiple Kernel Learning for Failure Prediction of Railway Points

Pith reviewed 2026-05-25 11:19 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords railway points · failure prediction · multiple kernel learning · missing data · sample adaptive · predictive maintenance · multi-source data

The pith

A sample-adaptive multiple kernel learning algorithm predicts railway points failures by handling missing data patterns and variance across point groups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to forecast failures in railway points using multi-source data that is often incomplete, rather than relying on fixed schedules or special sensors. It formulates the task as a multiple kernel learning problem and introduces an algorithm that adapts to individual samples while accounting for missing kernels and differences between groups of points. A sympathetic reader would care because accurate predictions could allow proactive maintenance, improving rail reliability and reducing disruptions. The approach integrates readily available data with domain expert feature selection to make large-scale implementation feasible.

Core claim

We present a robust multiple kernel learning algorithm for predicting points failures. Our model takes into account the missing pattern of data as well as the inherent variance on different sets of railway points. Extensive experiments demonstrate the superiority of our algorithm compared with other state-of-the-art methods.

What carries the argument

Sample-adaptive multiple kernel learning model that handles missing kernels by incorporating missing patterns and group variance.
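
One way to picture "handling missing kernels" is to skip, for each pair of samples, every base kernel whose source data is absent for either sample, and then renormalise the remaining weights. The plain-Python sketch below illustrates that idea only. It is not the paper's algorithm, and the names `masked_kernel_combination`, `present`, and `eta`, as well as the toy values, are assumptions introduced here.

```python
def masked_kernel_combination(kernels, present, eta):
    """Combine base kernels, skipping those unavailable for a sample pair.

    kernels: list of s Gram matrices (each an n x n list of lists)
    present: n x s list of booleans; present[i][m] is True when source m
             was observed for sample i (hypothetical missing-pattern mask)
    eta:     list of s non-negative kernel weights
    """
    n = len(present)
    combined = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            num = 0.0   # weighted sum over kernels observed for both samples
            mass = 0.0  # total weight of those kernels
            for m, weight in enumerate(eta):
                if present[i][m] and present[j][m]:
                    num += weight * kernels[m][i][j]
                    mass += weight
            # Renormalise; leave 0.0 when the pair shares no observed source
            combined[i][j] = num / mass if mass > 0 else 0.0
    return combined


# Toy data (assumed): two sources, two samples
kernels = [
    [[1.0, 0.5], [0.5, 1.0]],  # source 0, observed for both samples
    [[1.0, 0.2], [0.2, 1.0]],  # source 1, missing for sample 1
]
present = [[True, True], [True, False]]
K = masked_kernel_combination(kernels, present, eta=[0.5, 0.5])
```

In this toy case the entry for the pair (0, 1) falls back to source 0 alone. Per-pair renormalisation does not in general keep the combined matrix positive semidefinite, which is one reason a dedicated formulation is needed in practice.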

If this is right

  • Superior performance on real-world Sydney Trains data compared to existing methods.
  • Enables use of multi-source data without needing a unified model across all point types.
  • Supports proactive maintenance to minimize impacts on train reliability and punctuality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may extend to predicting failures in other rail infrastructure components with similar data issues.
  • Integration with real-time monitoring could further improve prediction timeliness.
  • Domain expert involvement in feature selection suggests hybrid human-AI approaches for industrial applications.

Load-bearing premise

The incomplete multi-source data still holds sufficient predictive signal when missing patterns and point group variances are explicitly modeled.

What would settle it

Running the algorithm on the constructed Sydney Trains dataset and finding it does not outperform state-of-the-art methods in failure prediction metrics would falsify the superiority claim.

Figures

Figures reproduced from arXiv: 1907.01162 by Christina Kirsch, Jian Zhang, Jinfeng Yi, Qiang Wu, Yongshun Gong, Zhibin Li.

Figure 1
Figure 1: Workflow of our method. … with $y_i \in \{-1, +1\}$ the label for $x_i$, commonly used MKL can be formulated as the following convex optimisation problem [25]:

$$\min_{\{\omega_m\}_{m=1}^{s},\, b,\, \xi,\, \eta \in \Delta} \; \frac{1}{2} \sum_{m=1}^{s} \|\omega_m\|_2^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i \left( \sum_{m=1}^{s} \sqrt{\eta_m}\, \omega_m^{\top} \phi_m\big(x_i^{(m)}\big) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, 2, \ldots, n, \tag{2}$$

where $\|\cdot\|_2$ is the Euclidean norm for vectors. $\omega_m$ is the weight vectors for mapped features $\phi_m(x_i^{(m)})$. $\eta$ contains…
Figure 2
Figure 2: A sample of our data. 3.1.1 Infrastructure Failure Management System Database. Infrastructure Failure Management System (IFMS) Database stores failures of assets in Sydney Trains with timestamps. We extracted points’ failures as part of our ground truth. 3.1.2 Equipment Details. Equipment details data record the detailed parameters of every set of railway points, including Points ID, Manufacturer, Type …
Figure 3
Figure 3: To forecast failures in week i+1, we use data from …
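
The standard MKL problem quoted in the Figure 1 caption scales each source's mapped features by √η_m. With explicit feature maps, that is equivalent to training on the single kernel Σ_m η_m K_m. The plain-Python check below illustrates the identity for linear kernels, taking φ_m as the identity map purely to keep the example small. The helper names and the toy data are hypothetical.

```python
import math


def linear_gram(X):
    """Gram matrix of a linear kernel: K[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def combined_gram(sources, eta):
    """Weighted sum of per-source Gram matrices: sum_m eta_m * K_m."""
    n = len(sources[0])
    grams = [linear_gram(X) for X in sources]
    return [[sum(w * K[i][j] for w, K in zip(eta, grams)) for j in range(n)]
            for i in range(n)]


def scaled_concat(sources, eta):
    """Concatenate sqrt(eta_m) * x_i^(m) across sources for every sample i."""
    n = len(sources[0])
    return [[math.sqrt(w) * v for w, X in zip(eta, sources) for v in X[i]]
            for i in range(n)]


# Two hypothetical data sources describing the same three samples
sources = [
    [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
    [[3.0], [1.0], [0.0]],
]
eta = [0.25, 0.75]  # weights chosen on the simplex for illustration

K_sum = combined_gram(sources, eta)
K_concat = linear_gram(scaled_concat(sources, eta))
```

`K_sum` and `K_concat` agree up to floating-point rounding, which is why MKL solvers can work with a weighted sum of Gram matrices instead of explicit feature vectors.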
Original abstract

Railway points are among the key components of railway infrastructure. As a part of signal equipment, points control the routes of trains at railway junctions, having a significant impact on the reliability, capacity, and punctuality of rail transport. Traditionally, maintenance of points is based on a fixed time interval or raised after the equipment failures. Instead, it would be of great value if we could forecast points' failures and take action beforehand, minimising any negative effect. To date, most of the existing prediction methods are either lab-based or relying on specially installed sensors which makes them infeasible for large-scale implementation. Besides, they often use data from only one source. We, therefore, explore a new way that integrates multi-source data which are ready to hand to fulfil this task. We conducted our case study based on Sydney Trains rail network which is an extensive network of passenger and freight railways. Unfortunately, the real-world data are usually incomplete due to various reasons, e.g., faults in the database, operational errors or transmission faults. Besides, railway points differ in their locations, types and some other properties, which means it is hard to use a unified model to predict their failures. Aiming at this challenging task, we firstly constructed a dataset from multiple sources and selected key features with the help of domain experts. In this paper, we formulate our prediction task as a multiple kernel learning problem with missing kernels. We present a robust multiple kernel learning algorithm for predicting points failures. Our model takes into account the missing pattern of data as well as the inherent variance on different sets of railway points. Extensive experiments demonstrate the superiority of our algorithm compared with other state-of-the-art methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes a sample-adaptive multiple kernel learning (MKL) algorithm to predict failures of railway points. It integrates multi-source data from the Sydney Trains network, explicitly models missing data patterns and group-wise variance across point types, formulates the task as MKL with missing kernels, and reports that extensive experiments show superiority over state-of-the-art methods.

Significance. If the empirical claims hold with proper controls, the work offers a practical route to failure prediction that relies only on routinely collected operational data rather than lab setups or dedicated sensors. The explicit treatment of missingness and inter-group variance addresses two common obstacles in real-world railway maintenance data.

major comments (1)
  1. [Abstract] The central claim of experimental superiority is stated without any dataset size, number of points, train/test split, performance metrics, baseline methods, statistical significance tests, or error bars. Because these details are absent, the claim that the proposed model outperforms state-of-the-art methods cannot be evaluated from the supplied text.
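
One common way to produce the kind of evidence requested here is a paired bootstrap over test samples, which yields an interval for the difference in a metric between two models. The sketch below uses accuracy and made-up predictions purely for illustration. The function names, the choice of metric, and the data are assumptions, not taken from the paper.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def paired_bootstrap_diff(y_true, pred_a, pred_b, n_resamples=2000, seed=0):
    """Percentile 95% interval for accuracy(A) - accuracy(B).

    The same resampled indices are used for both models, so the comparison
    is paired on test samples.
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        diffs.append(accuracy(yt, [pred_a[i] for i in idx])
                     - accuracy(yt, [pred_b[i] for i in idx]))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper


# Illustrative labels in {-1, +1}; these are not real results
y_true = [1, -1, 1, 1, -1, -1, 1, -1, 1, -1]
pred_a = [1, -1, 1, 1, -1, -1, 1, -1, -1, -1]  # 9 of 10 correct
pred_b = [1, -1, -1, 1, 1, -1, 1, 1, -1, -1]   # 6 of 10 correct
low, high = paired_bootstrap_diff(y_true, pred_a, pred_b)
```

An interval that excludes zero would support the claim that model A is better on this test set, while one that straddles zero would not.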

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their comment. We address it below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] The central claim of experimental superiority is stated without any dataset size, number of points, train/test split, performance metrics, baseline methods, statistical significance tests, or error bars. Because these details are absent, the claim that the proposed model outperforms state-of-the-art methods cannot be evaluated from the supplied text.

    Authors: We agree that the abstract would be strengthened by including key experimental details to support the superiority claim. The full manuscript (Section 4) reports the Sydney Trains dataset size, number of points, train/test splits, metrics (e.g., AUC, F1-score), baselines, and results with statistical significance testing and error bars. To address the concern, we will revise the abstract to concisely state the dataset scale, split, primary metrics, main baselines, and note that superiority is supported by statistical tests.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The abstract formulates the task as a multiple kernel learning problem with missing kernels and presents a robust algorithm that accounts for missing patterns and group variance, but supplies no equations, derivation steps, or self-citations that could be inspected for reduction to inputs. Performance claims rest on empirical experiments on the Sydney Trains dataset. No load-bearing mathematical step is visible that reduces by construction to a fitted parameter or self-citation chain, so the derivation (if any) cannot be shown to be circular under the required criteria of explicit quotation and exhibited equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no extractable free parameters, axioms, or invented entities; the method relies on standard MKL assumptions and domain-expert feature selection.

pith-pipeline@v0.9.0 · 5847 in / 945 out tokens · 44664 ms · 2026-05-25T11:19:59.372116+00:00 · methodology


Reference graph

Works this paper leans on

  1. [1]

    Arash Afkanpour, András György, Csaba Szepesvári, and Michael Bowling. 2013. A randomized mirror descent algorithm for large scale multiple kernel learning. In Proc. 30th International Conference on Machine Learning. 374–382

  2. [2]

    Salah Althloothi, Mohammad H Mahoor, Xiao Zhang, and Richard M Voyles. 2014. Human activity recognition using multi-features and multiple kernel learning. Pattern Recognition 47, 5 (2014), 1800–1812

  3. [3]

    Serhat S Bucak, Rong Jin, and Anil K Jain. 2014. Multiple kernel learning for visual object recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 7 (2014), 1354–1369

  4. [4]

    Fatih Camci, Omer Faruk Eker, Saim Başkan, and Savas Konur. 2016. Comparison of sensors and methodologies for effective prognostics on railway turnout systems. Proc. Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit 230, 1 (2016), 24–42

  5. [5]

    Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: a library for support vector machines. ACM TIST 2, 3 (2011), 27

  6. [6]

    Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. JMLR 9 (2008), 1871–1874

  7. [7]

    Fausto Pedro García Márquez, Clive Roberts, and Andrew M Tobias. 2010. Railway point mechanisms: condition monitoring and fault detection. Proc. Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit 224, 1 (2010), 35–44

  8. [8]

    Mehmet Gönen and Ethem Alpaydin. 2008. Localized multiple kernel learning. In Proc. 25th International Conference on Machine Learning. ACM, 352–359

  9. [9]

    Mehmet Gönen and Ethem Alpaydın. 2011. Multiple kernel learning algorithms. JMLR 12, Jul (2011), 2211–2268

  10. [10]

    Yongshun Gong, Zhibin Li, Jian Zhang, Wei Liu, Yu Zheng, and Christina Kirsch. 2018. Network-wide Crowd Flow Prediction of Sydney Trains via Customized Online Non-negative Matrix Factorization. In Proc. 27th ACM International Conference on Information and Knowledge Management. ACM, 1243–1252

  11. [11]

    Seyedahmad Jalili Hassankiadeh. 2011. Failure analysis of railway switches and crossings for the purpose of preventive maintenance. Transport Science (2011)

  12. [12]

    Muhammad Fitri Ishak, Serdar Dindar, and Sakdirat Kaewunruen. 2016. Safety-based maintenance for geometry restoration of railway turnout systems in various operational environments. In Proc. 21st National Convention on Civil Engineering

  13. [13]

    Marius Kloft, Ulf Brefeld, Pavel Laskov, Klaus-Robert Müller, Alexander Zien, and Sören Sonnenburg. 2009. Efficient and accurate lp-norm multiple kernel learning. In Advances in Neural Information Processing Systems. 997–1005

  14. [14]

    Marius Kloft, Ulf Brefeld, Sören Sonnenburg, and Alexander Zien. 2011. Lp-norm multiple kernel learning. JMLR 12, Mar (2011), 953–997

  15. [15]

    Quoc Le, Tamás Sarlós, and Alex Smola. 2013. Fastfood-approximating kernel expansions in loglinear time. In Proc. 30th International Conference on Machine Learning, Vol. 85

  16. [16]

    Yunwen Lei, Alexander Binder, Urun Dogan, and Marius Kloft. 2016. Localized multiple kernel learning a convex approach. In Proc. 8th Asian Conference on Machine Learning. 81–96

  17. [17]

    Xiang Li, Bin Gu, Shuang Ao, Huaimin Wang, and Charles X Ling. 2017. Triply stochastic gradients on multiple kernel learning. In Proc. 33rd Conference on Uncertainty in Artificial Intelligence

  18. [18]

    Xiang Li, Huaimin Wang, Bin Gu, and Charles X Ling. 2015. Data sparseness in linear SVM. In Proc. 24th International Joint Conference on Artificial Intelligence. 3628–3634

  19. [19]

    Zhibin Li, Jian Zhang, Qiang Wu, and Christina Kirsch. 2018. Field-regularised factorization machines for mining the maintenance logs of equipment. In Australasian Joint Conference on Artificial Intelligence. Springer, 172–183

  20. [20]

    Fayao Liu, Luping Zhou, Chunhua Shen, and Jianping Yin. 2014. Multiple kernel learning in the primal for multimodal Alzheimer’s disease classification. IEEE Journal of Biomedical and Health Informatics 18, 3 (2014), 984–990

  21. [21]

    Xinwang Liu, Lei Wang, Jianping Yin, Yong Dou, and Jian Zhang. 2015. Absent multiple kernel learning. In Proc. 29th AAAI Conference on Artificial Intelligence. 2807–2813

  22. [22]

    Xinwang Liu, Lei Wang, Jian Zhang, and Jianping Yin. 2014. Sample-Adaptive Multiple Kernel Learning. In Proc. 28th AAAI Conference on Artificial Intelligence. 1975–1981

  23. [23]

    BO Oyebande and AC Renfrew. 2002. Condition monitoring of railway electric point machines. IEE Proc. Electric Power Applications 149, 6 (2002), 465–473

  24. [24]

    Ali Rahimi and Benjamin Recht. 2008. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems. 1177–1184

  25. [25]

    Alain Rakotomamonjy, Francis R Bach, Stéphane Canu, and Yves Grandvalet. 2008. SimpleMKL. JMLR 9, Nov (2008), 2491–2521

  26. [26]

    Alain Rakotomamonjy and Sukalpa Chanda. 2014. Lp-norm multiple kernel learning with low-rank kernels. Neurocomputing 143 (2014), 68–79

  27. [27]

    Doyen Sahoo, Steven CH Hoi, and Bin Li. 2014. Online multiple kernel regression. In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 293–302

  28. [28]

    Yanning Shen and Tianyi Chen. 2018. Online ensemble multi-kernel learning adaptive to non-stationary and adversarial environments. In Proc. 21st International Conference on Artificial Intelligence and Statistics, Vol. 84

  29. [29]

    Yanning Shen, Tianyi Chen, and Georgios B Giannakis. 2018. Online multi-kernel learning with orthogonal random features. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 6289–6293

  30. [30]

    Ruben Sipos, Dmitriy Fradkin, Fabian Moerchen, and Zhuang Wang. 2014. Log-based predictive maintenance. In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1867–1876

  31. [31]

    Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. 2006. Large scale multiple kernel learning. JMLR 7, Jul (2006), 1531–1565

  32. [32]

    Hanqing Tao and Yang Zhao. 2015. Intelligent fault prediction of railway switch based on improved least squares support vector machine. Metallurgical and Mining Industry 7, 10 (2015), 69–75

  33. [33]

    Guang Wang, Tianhua Xu, Tao Tang, Tangming Yuan, and Haifeng Wang. 2017. A Bayesian network model for prediction of weather-related failures in railway turnout systems. Expert Systems with Applications 69 (2017), 247–256

  34. [34]

    Chang Xu, Dacheng Tao, and Chao Xu. 2015. Multi-view learning with incomplete views. IEEE Transactions on Image Processing 24, 12 (2015), 5812–5825

  35. [35]

    Zenglin Xu, Rong Jin, Haiqin Yang, Irwin King, and Michael R Lyu. 2010. Simple and efficient multiple kernel learning by group lasso. In Proc. 27th International Conference on Machine Learning. Omnipress, 1175–1182

  36. [36]

    Jingjing Yang, Yonghong Tian, Ling-Yu Duan, Tiejun Huang, and Wen Gao. 2012. Group-sensitive multiple kernel learning for object recognition. IEEE Transactions on Image Processing 21, 5 (2012), 2838–2852

  37. [37]

    Chi-Yuan Yeh, Chi-Wei Huang, and Shie-Jue Lee. 2011. A multiple-kernel support vector regression approach for stock market price forecasting. Expert Systems with Applications 38, 3 (2011), 2177–2186

  38. [38]

    Halis Yilboga, Ömer Faruk Eker, Adem Güçlü, and Fatih Camci. 2010. Failure prediction on railway turnouts using time delay neural networks. In 2010 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications. IEEE, 134–137

  39. [39]

    Fuzhen Zhang. 2006. The Schur complement and its applications. Vol. 4. Springer Science & Business Media