Research on the application of end-to-end deep learning model-based multimodal target detection technique in point cloud data

By: Niya Dong 1, Yi Lin 1
1 College of Communication and Information Engineering, Chongqing College of Mobile Communication, Chongqing, 401520, China

Abstract

Computer vision is an important field in the digital era, and target detection plays a key role in it. Traditional methods are limited in accuracy and robustness in complex environments, and point cloud data has become a research hotspot because it carries 3D spatial information. Multimodal deep learning addresses the limitations of a single modality by fusing different data sources, and it significantly improves target detection performance. This paper proposes MANet, an end-to-end deep learning model based on a mutual attention mechanism that fuses point cloud and RGB image features for 3D target detection. The point cloud data is first preprocessed with statistical filtering and RANSAC ground segmentation. An end-to-end network is then built from four modules: point cloud feature learning, image feature learning, mutual attention feature fusion, and target detection. The mutual attention mechanism aligns and fuses the point cloud and image features, which improves 3D target detection performance. Experiments on the KITTI dataset show that MANet achieves 86.13% on the Car AP 3D metric at moderate difficulty, a 5.66% improvement over MAFF-Net, and 92.27% on the Car AP BEV metric. Ablation experiments on the Waymo Open dataset demonstrate the effectiveness of mutual attention feature fusion, which raises the LEVEL_1 3D mAP from 84.57% to 85.84%. The experimental results show that the proposed multimodal fusion method improves the accuracy and robustness of 3D target detection, which gives it considerable application value in fields such as autonomous driving and smart cities.
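The two preprocessing steps named in the abstract, statistical filtering and RANSAC ground segmentation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the parameters (`k`, `std_ratio`, `n_iters`, `dist_thresh`), their values, and the synthetic scene are all assumptions chosen for illustration, and the brute-force neighbour search is only suitable for small point sets.

```python
import numpy as np

def statistical_filter(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is unusually large."""
    # Brute-force pairwise distances: O(N^2) memory, acceptable only for a small demo.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # After sorting, column 0 is each point's zero distance to itself, so skip it.
    knn_mean = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    threshold = knn_mean.mean() + std_ratio * knn_mean.std()
    return points[knn_mean <= threshold]

def ransac_ground(points, n_iters=200, dist_thresh=0.1, seed=0):
    """Return a boolean mask of points within dist_thresh of the best-supported plane."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Fit a candidate plane through three randomly chosen points.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:  # collinear sample, no unique plane
            continue
        normal = normal / norm
        # Inliers are points whose perpendicular distance to the plane is small.
        mask = np.abs((points - sample[0]) @ normal) < dist_thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Synthetic scene (assumed for illustration): flat ground, one object, one stray point.
rng = np.random.default_rng(42)
ground = np.column_stack([
    rng.uniform(-10, 10, 300),
    rng.uniform(-10, 10, 300),
    rng.normal(0.0, 0.01, 300),
])
obj = np.column_stack([
    rng.uniform(1.5, 2.5, 60),
    rng.uniform(1.5, 2.5, 60),
    rng.uniform(0.5, 1.5, 60),
])
stray = np.array([[0.0, 0.0, 50.0]])
cloud = np.vstack([ground, obj, stray])

filtered = statistical_filter(cloud)      # removes the isolated stray point
ground_mask = ransac_ground(filtered)     # flags points on the dominant plane
non_ground = filtered[~ground_mask]       # candidates passed on to detection
```

In this sketch the points that survive both steps (`non_ground`) stand in for the input that a downstream feature-learning network would receive; a practical pipeline would typically replace the pairwise distance matrix with a k-d tree search.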