ACMAN: Adaptive Cross-Modal Anomaly Network
DOI: 10.23977/cpcs.2025.090114
Author(s)
Junwei Wang 1, Junting Liu 1, Yutian Jiao 1
Affiliation(s)
1 Shandong Jiaotong University, Jinan, Shandong, China
Corresponding Author
Junwei Wang
ABSTRACT
Anomaly detection underpins quality inspection, medical diagnosis, and safety monitoring, yet progress remains hindered by the scarcity of anomaly samples, limited semantic alignment, and unreliable uncertainty estimates. Here we present ACMAN-AD (Adaptive Cross-Modal Anomaly Network for Anomaly Detection), a unified framework that leverages vision-language pre-training to overcome these bottlenecks. ACMAN-AD integrates four complementary modules: a Cross-Modal Dynamic Adapter (CMDA) for image-guided prompt generation and adaptive alignment; a Self-Supervised Multi-Scale Feature Fusion (SSMFF) strategy for hierarchical representation learning; a Generative Adversarial Anomaly Synthesis (GAAS) module to enrich anomaly diversity; and a Knowledge Distillation and Uncertainty Quantification (KDUQ) scheme for lightweight inference with calibrated confidence. On MVTec AD and VisA, ACMAN-AD surpasses state-of-the-art methods in both detection and segmentation, improving AUROC and AUPRC by 3.2.
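For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how AUROC and AUPRC (approximated here by average precision) can be computed from per-sample anomaly scores. It is not the authors' evaluation code: the function names `auroc` and `average_precision` and the toy scores are assumptions made for illustration, and tied scores in `average_precision` are simply ranked in input order.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen anomalous sample (label 1) scores higher than a randomly chosen
    normal sample (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision, a common estimate of the area under the
    precision-recall curve: the mean of the precision values measured
    at the rank of each anomalous sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one anomalous sample")
    return precision_sum / hits


# Hypothetical toy data: higher score means "more anomalous".
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0]

print(f"AUROC = {auroc(toy_scores, toy_labels):.4f}")
print(f"AP    = {average_precision(toy_scores, toy_labels):.4f}")
```

Both metrics equal 1.0 when every anomalous sample outranks every normal one. AUPRC is generally the more sensitive of the two when anomalies are rare, which is the typical setting in industrial inspection.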
KEYWORDS
Anomaly detection, Contrastive Language-Image Pre-training, Vision-language pre-training
CITE THIS PAPER
Junwei Wang, Junting Liu, Yutian Jiao, ACMAN: Adaptive Cross-Modal Anomaly Network. Computing, Performance and Communication Systems (2025) Vol. 9: 106-114. DOI: http://dx.doi.org/10.23977/cpcs.2025.090114.