Research article
First published online May 26, 2025

Research on multimodal techniques for arc detection in railway systems with limited data

Abstract

The pantograph–catenary system is a critical component of railway vehicles, and its performance directly affects the quality of current collection. Accurately measuring the arcing rate is essential for monitoring the system’s condition and ensuring safe operation. In complex railway environments, however, traditional arc detection methods suffer from higher false detection rates and lower measurement accuracy because of the diversity of arc sizes and shapes, environmental interference, unstable current collection, and power fluctuations. Deep learning-based methods can cope with environmental interference, but arc events occur infrequently, so sufficient labeled training data is hard to obtain. Moreover, the large number of unlabeled images of pantograph–catenary contacts cannot be used directly because they lack annotations. To address these issues, a novel arc detection method is proposed: a multimodal arc detection network based on denoising diffusion probabilistic models (DDPM-MILNet). First, a DDPM is pretrained on a large set of unlabeled images to acquire advanced image features. The pretrained model then serves as a feature extractor, and a hierarchical variation semantic decoder is fine-tuned on top of it, which improves performance under small-sample conditions and reduces dependence on extensive labeled datasets. Building on this, an audiovisual semantic decoder is designed that incorporates audio signals as semantic cues, supplying an additional modality to complement the visual features. This approach not only reduces the model’s reliance on visual information but also enables it to locate the visual target of the arc even when the object is not simultaneously seen and heard, further alleviating the challenges posed by limited sample sizes. Experimental results demonstrate that DDPM-MILNet achieves excellent detection performance with minimal data in complex railway environments, indicating significant application potential, particularly in the state monitoring and anomaly detection of railway systems.
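
The abstract does not state the pretraining objective, so the following is only a sketch of the standard DDPM formulation (Ho et al., 2020), given here as an assumption about what pretraining on unlabeled images involves. The symbols are the conventional ones and do not come from this article: x_0 is a clean image, β_t is the noise schedule, and ε_θ is the denoising network.

```latex
% Forward (noising) process, available in closed form for any step t
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,\mathbf{I}\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)

% Simplified training objective: predict the noise that was added
L_{\text{simple}} = \mathbb{E}_{x_0,\ \epsilon \sim \mathcal{N}(0,\mathbf{I}),\ t}
\left[\ \left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right) \right\rVert^2\ \right]
```

No label appears in this objective, which is consistent with the abstract's point that unannotated pantograph–catenary images can be used for pretraining before a decoder is fine-tuned on the small labeled set.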
