
Research on 3D Object Detection Based on Laser Point Cloud and Image Fusion

LIU Yonggang, YU Fengning, ZHANG Xinjie, CHEN Zheng, QIN Datong

Citation: LIU Yonggang, YU Fengning, ZHANG Xinjie, CHEN Zheng, QIN Datong. Research on 3D Object Detection Based on Laser Point Cloud and Image Fusion[J]. Journal of Mechanical Engineering, 2022, 58(24): 289-299. doi: 10.3901/JME.2022.24.289

doi: 10.3901/JME.2022.24.289
Funding: 

National Natural Science Foundation of China (51775063)

Open Fund of the State Key Laboratory of Automotive Simulation and Control (20201101)

Open-competition project of the Chongqing Collaborative Innovation Center for Independent-Brand Automobiles (2022CDJDX-004)

About the authors:

    YU Fengning, male, born in 1997, master candidate. His main research interest is lidar-based 3D object detection for intelligent vehicles. E-mail: yufengning@cqu.edu.cn

    ZHANG Xinjie, male, born in 1984, doctor, professor, doctoral supervisor. His main research interests include vehicle dynamics and control, intelligent vehicle testing and evaluation, and driver modeling. E-mail: x_jzhang@jlu.edu.cn

    CHEN Zheng, male, born in 1982, doctor, professor, doctoral supervisor. His main research interests include power battery management, intelligent vehicle control, and energy management of hybrid electric vehicles. E-mail: chen@kust.edu.cn

    QIN Datong, male, born in 1956, doctor, professor, doctoral supervisor. His main research interests include mechanical transmission systems, vehicle powertrains, and their intelligent control. E-mail: dtqin@cqu.edu.cn

    Corresponding author:

    LIU Yonggang, male, born in 1982, doctor, professor, doctoral supervisor. His main research interests include key technologies of decision-making and control for intelligent vehicles, optimization and control of new energy vehicle powertrains, and automatic transmission and integrated control of vehicles. E-mail: andyliuyg@cqu.edu.cn

  • CLC number: TG156

Research on 3D Object Detection Based on Laser Point Cloud and Image Fusion

  • Abstract: Object detection based on the fusion of lidar and camera data has attracted wide attention, but most fusion algorithms struggle to accurately detect smaller objects such as pedestrians and cyclists. A point cloud feature fusion network based on a self-attention mechanism is therefore proposed. First, the Faster-RCNN detection network is improved to generate candidate boxes, and the frustum point cloud inside each image bounding box is extracted using the projection relationship between the lidar and the camera, reducing the computational scale and spatial search range of the point cloud. Second, a Self-Attention PointNet structure based on a self-attention mechanism is proposed to perform instance segmentation on the raw point cloud within each frustum. Then, a bounding-box regression PointNet and a lightweight T-Net predict the 3D bounding-box parameters of the object point cloud, and a regularization term is added to the loss function to improve detection accuracy. Finally, the method is validated on the KITTI dataset. The results show that it clearly outperforms the widely used F-PointNet: detection accuracy for cars, pedestrians and cyclists improves substantially on the easy, moderate and hard tasks, with the largest gain for cyclists. The method also achieves higher accuracy than many mainstream 3D object detection networks, effectively improving 3D detection precision.
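The frustum extraction step described in the abstract (project each lidar point through the camera calibration and keep the points that fall inside a 2D detection box) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `frustum_points` is a hypothetical name, the points are assumed to already be in the camera frame, and the calibration matrix and box values below are toy numbers.

```python
import numpy as np

def frustum_points(points, P, box2d):
    """Project points into the image plane and keep those falling
    inside a 2D detection box (the frustum point cloud).

    points : (N, 3) points in the camera frame
    P      : (3, 4) camera projection matrix
    box2d  : (xmin, ymin, xmax, ymax) box from the 2D image detector
    """
    # Homogeneous coordinates, then pinhole projection
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    uvw = pts_h @ P.T                                           # (N, 3)
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]

    xmin, ymin, xmax, ymax = box2d
    in_front = uvw[:, 2] > 0  # drop points behind the camera
    in_box = (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    return points[in_front & in_box]

# Toy calibration (focal length 100, principal point at (50, 50))
P = np.array([[100., 0., 50., 0.],
              [0., 100., 50., 0.],
              [0.,   0.,  1., 0.]])
pts = np.array([[0., 0., 10.],    # projects to (50, 50): inside the box
                [10., 0., 10.],   # projects to (150, 50): outside
                [0., 0., -10.]])  # behind the camera: discarded
frustum = frustum_points(pts, P, (40, 40, 60, 60))
```

Keeping only the frustum points is what shrinks the point cloud's computational scale and search space before segmentation, as the abstract notes.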

  • Figure  Improved Faster-RCNN network structure

    Figure  Structure of the Self-Attention Block

    Figure  Self-Attention PointNet network structure

    Figure  Structure of the 3D point cloud object detection network based on feature fusion

    Figure  Calibration of the 2D detection image with the 3D point cloud data

    Figure  Frustum cropping and extraction of candidate frustum regions

    Figure  Frustum orientation adjustment

    Figure  Coordinate transformation of the object point cloud

    Figure  Structures of the T-Net and the bounding-box regression network

    Figure 10  Loss convergence and test accuracy curves during 3D object detection training

    Figure 11  Visualization of detection results on the KITTI dataset
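The figure list above names a Self-Attention Block whose internals are not spelled out in this excerpt. As a generic sketch of the underlying mechanism, scaled dot-product self-attention applied over per-point features in the spirit of ref. [25], with all names and shapes illustrative rather than taken from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def point_self_attention(feats, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a set of point features.

    feats      : (N, C) features, one row per lidar point
    Wq, Wk, Wv : (C, D) learned projections (random here for illustration)
    Returns    : (N, D) features in which each point has aggregated
                 information from every other point in the set.
    """
    Q, K, V = feats @ Wq, feats @ Wk, feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (N, N) weights
    return attn @ V

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
attended = point_self_attention(feats, Wq, Wk, Wv)
```

Because the attention weights couple every point with every other point in the frustum, each output feature carries global context, which is what makes such a block useful for segmenting sparse small objects like pedestrians and cyclists.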

    Table 1  Object detection accuracy under different regularization coefficients

    Regularization  Car                     Pedestrian              Cyclist
    coefficient     Easy   Moderate  Hard   Easy   Moderate  Hard   Easy   Moderate  Hard
    0               82.18  68.93     61.02  64.45  55.36     48.43  68.52  50.53     47.14
    0.01            83.75  69.17     62.38  64.54  55.71     48.75  71.11  53.36     50.29
    0.001           82.93  69.16     61.95  67.54  57.62     50.84  68.61  51.83     48.12
    0.0001          84.25  69.75     63.07  62.89  54.54     48.09  67.82  52.51     48.32
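Table 1 sweeps the coefficient of the regularization term that, per the abstract, is added to the loss function. The excerpt does not define the term itself; a common form, shown here purely as an illustrative sketch with hypothetical names, is an L2 penalty on the network weights scaled by the coefficient from the table's first column:

```python
import numpy as np

def l2_penalty(weights, lam):
    """Sum of squared entries of every weight tensor, scaled by lam."""
    return lam * sum(float(np.sum(w * w)) for w in weights)

def total_loss(seg_loss, box_loss, weights, lam=0.01):
    """Task losses plus the regularization term; lam plays the role of
    the coefficient swept over {0, 0.01, 0.001, 0.0001} in Table 1."""
    return seg_loss + box_loss + l2_penalty(weights, lam)
```

With lam = 0, the penalty vanishes and the table's first row becomes the unregularized baseline; the sweep then picks the coefficient that best trades off fit against overfitting for each object class.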

    Table 2  Effect of each processing component on the 3D detection AP

    Reg. loss  Self-Attention  Car                     Pedestrian              Cyclist
                               Easy   Moderate  Hard   Easy   Moderate  Hard   Easy   Moderate  Hard
                               82.18  68.93     61.02  64.45  55.36     48.43  68.52  50.53     47.14
                               82.93  69.16     61.95  67.54  57.62     50.84  68.61  51.83     48.12
                               84.79  71.50     63.54  67.18  58.15     51.25  77.04  57.91     54.16
                               84.10  71.01     63.39  67.07  58.08     51.21  80.77  58.61     54.37

    Table 3  3D detection AP of this model and other models (Car category only)

    Method          Easy   Moderate  Hard
    F-PointNet(v2)  83.76  70.92     63.56
    PointFusion     77.92  63.00     53.27
    RT3D            72.85  61.64     64.38
    MV3D            71.29  62.68     56.56
    VoxelNet        81.97  65.46     62.85
    Ours            84.79  71.50     63.54
  • [1] XUE Peilin, WU Yuan, YIN Guodong, et al. Real-time target recognition of urban autonomous vehicles based on information fusion[J]. Journal of Mechanical Engineering, 2020, 56(12): 165-173. doi: 10.3901/JME.2020.12.165
    [2] PENG Yuhui, ZHENG Weihong, ZHANG Jianfeng. Road obstacle detection method based on deep learning[J]. Journal of Computer Applications, 2020, 40(8): 2428-2433. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJY202008040.htm
    [3] WANG D L, POSNER I. Voting for voting in online point cloud object detection[C]//Robotics: Science and Systems XI, Rome: MIT Press, 2015: 13-22.
    [4] ZHOU Yin, TUZEL O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Salt Lake City UT: IEEE Comp Soc, 2018: 4490-4499.
    [5] YAN Yan, MAO Yuxing, LI Bo. SECOND: Sparsely embedded convolutional detection[J]. Sensors, 2018, 18(10): 3337-3354. doi: 10.3390/s18103337
    [6] KUANG Hongwu, WANG Bei, AN Jianping, et al. Voxel-FPN: Multi-scale voxel feature aggregation for 3D object detection from lidar point clouds[J]. Sensors, 2020, 20(3): 704-723. doi: 10.3390/s20030704
    [7] ENGELCKE M, RAO D, ZENG D, et al. Vote3Deep: fast object detection in 3D point clouds using efficient convolutional neural networks[C]//2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore: IEEE, 2017: 1355-1361.
    [8] LI Bo. 3D fully convolutional network for vehicle detection in point cloud[C]//2017 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS), Vancouver: IEEE, 2017: 1513-1518.
    [9] QI C R, SU Hao, MO Kaichun, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]//30th IEEE Conference on Computer Vision and Pattern Recognition(CVPR), Honolulu: IEEE, 2017: 77-85.
    [10] QI C R, YI Li, SU Hao, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space[C]//Proceedings of Advances in Neural Information Processing Systems 30, Long Beach CA: NIPS, 2017: 5099-5108.
    [11] LI Yangyan, BU Rui, SUN Mingchao, et al. PointCNN: Convolution on x-transformed points[C]//Proceedings of Advances in Neural Information Processing Systems 31, Montreal: NIPS, 2018: 820-830.
    [12] DENG Haowen, BIRDAL T, ILIC S, et al. PPFNet: Global context aware local features for robust 3D point matching[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Salt Lake City UT: IEEE, 2018: 195-205.
    [13] MEYER G P, LADDHA A, KEE E, et al. LaserNet: An efficient probabilistic 3D object detector for autonomous driving[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Long Beach CA: IEEE, 2019: 12669-12678.
    [14] YANG Zetong, SUN Yanan, LIU Shu, et al. 3DSSD: point-based 3D single stage object detector[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Seattle: IEEE, 2020.
    [15] LI Bo, ZHANG Tianlei, XIA Tian. Vehicle detection from 3D lidar using fully convolutional network[C]//Proceedings of Robotics: Science and Systems (RSS), Ann Arbor: MIT Press, 2016: 42-50.
    [16] CHEN Xiaozhi, MA Huimin, WAN Ji, et al. Multi-view 3D object detection network for autonomous driving[C]//30th IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Honolulu: IEEE, 2017: 6526-6534.
    [17] KU J, MOZIFIAN M, LEE J, et al. Joint 3d proposal generation and object detection from view aggregation[C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS), Madrid: IEEE, 2018: 5750-5757.
    [18] QI C R, LIU Wei, WU Chenxia, et al. Frustum PointNets for 3D object detection from RGB-D data[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), Salt Lake City UT: IEEE, 2018: 918-927.
    [19] WANG Zhixin, JIA Kui. Frustum ConvNet: sliding frustums to aggregate local point-wise features for amodal 3D object detection[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau: IEEE, 2019: 1742-1749.
    [20] LIANG Ming, YANG Bin, CHEN Yun, et al. Multi-task multi-sensor fusion for 3D object detection[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach CA: IEEE, 2019: 7337-7345.
    [21] LIANG Ming, YANG Bin, WANG Shenlong, et al. Deep continuous fusion for multi-sensor 3D object detection[C]//15th European Conference on Computer Vision (ECCV), Munich: Springer-Verlag Berlin, 2018: 663-678.
    [22] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence(TPAMI), 2017, 39(6): 1137-1149.
    [23] LIN T Y, DOLLAR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//30th IEEE Conference on Computer Vision and Pattern Recognition(CVPR), Honolulu: IEEE, 2017: 936-944.
    [24] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//15th European Conference on Computer Vision (ECCV), Munich: Springer-Verlag Berlin, 2018: 3-19.
    [25] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of Advances in Neural Information Processing Systems 30, Long Beach CA: NIPS, 2017: 1049-1064.
    [26] GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: The KITTI dataset[J]. International Journal of Robotics Research, 2013, 32(11): 1231-1237. doi: 10.1177/0278364913491297
    [27] JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[C]//Proceedings of Advances in Neural Information Processing Systems 28, Montreal: NIPS, 2015: 2017-2025.
    [28] XU Danfei, ANGUELOV D, JAIN A. PointFusion: deep sensor fusion for 3D bounding box estimation[C]//31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City UT: IEEE, 2018: 244-253.
    [29] ZENG Yiming, HU Yu, LIU Shice, et al. RT3D: Real-time 3D vehicle detection in lidar point cloud for autonomous driving[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 3434-3440. doi: 10.1109/LRA.2018.2852843
Figures(12) / Tables(3)
Publication history
  • Received:  2022-01-19
  • Revised:  2022-09-26
  • Available online:  2024-03-07
  • Published:  2022-12-20
