DEVELOPMENT OF A SYSTEM FOR MULTIMODAL TARGET RECOGNITION AND CLASSIFICATION BASED ON ARTIFICIAL INTELLIGENCE

Peter Nikolyuk; Veronika Sapozhnikova; Vladyslav Chemes

doi:10.30888/2663-5712.2025-34-01-113

Authors

Peter Nikolyuk, Vasil Stus Donetsk National University, https://orcid.org/0000-0002-0286-297X
Veronika Sapozhnikova, Vasil Stus Donetsk National University
Vladyslav Chemes, Vasil Stus Donetsk National University

Keywords:

multimodal recognition, target classification, artificial intelligence, RGB-T data, deep learning, edge-aware learning, semantic segmentation.

Abstract

The study addresses the problem of developing a multimodal system for target recognition and classification based on artificial intelligence. The relevance of the topic stems from the need to improve the reliability of automatic object detection under co
