Abstract: To solve the problem of target loss when the target is occluded or moves too fast during fully automatic tracking, a target tracking algorithm based on YOLOv3 and ASMS is proposed. First, the YOLOv3 algorithm detects the target and determines the initial region to be tracked. The ASMS algorithm then tracks the target, and the tracking quality is detected and judged in real time; when the target is lost, it is relocated by quadratic-fitting localization combined with the YOLOv3 algorithm. Finally, to further improve runtime efficiency, an incremental pruning method is applied to compress the model. Experimental comparisons with mainstream algorithms show that the proposed algorithm effectively solves the target-loss problem under occlusion and improves the accuracy of target detection and tracking, while offering low computational complexity, short running time, and good real-time performance.
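The detect-track-judge-redetect loop summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `yolo_detect` and `asms_track` are hypothetical stand-ins for a YOLOv3 detector and an ASMS tracker, the hue-histogram template and the 0.5 loss threshold are illustrative choices, and the quadratic-fitting step the paper uses to predict the search region after a loss is omitted for brevity.

```python
# Minimal sketch (not the paper's code) of the detect-track-judge-redetect
# loop. `yolo_detect(frame) -> (x, y, w, h) or None` and
# `asms_track(frame, box) -> box` are hypothetical stand-ins.

import cv2

LOSS_THRESHOLD = 0.5  # illustrative Bhattacharyya-distance cutoff


def color_histogram(frame, box):
    """Normalized 32-bin hue histogram of the region inside box (x, y, w, h)."""
    x, y, w, h = box
    roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([roi], [0], None, [32], [0, 180])
    return cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)


def track(frames, yolo_detect, asms_track):
    """Yield one bounding box (or None) per frame of an iterable of BGR frames."""
    box, template = None, None
    for frame in frames:
        if box is None:
            # (Re)initialize the tracker from a fresh YOLOv3 detection.
            box = yolo_detect(frame)
            if box is not None:
                template = color_histogram(frame, box)
            yield box
            continue
        # Per-frame ASMS update, then judge tracking quality against the template.
        box = asms_track(frame, box)
        dist = cv2.compareHist(template, color_histogram(frame, box),
                               cv2.HISTCMP_BHATTACHARYYA)
        if dist > LOSS_THRESHOLD:  # large distance => target presumed lost
            box = None             # fall back to detection on the next frame
        yield box
```

Once the per-frame score degrades past the threshold, the box is cleared and the next frame falls back to YOLOv3 detection, which mirrors the relocation behavior described in the abstract.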
Key words:
- target tracking
- target loss
- you only look once v3
- model pruning
- robust scale-adaptive mean-shift
Table 1. Evaluation results of the comparison models and the pruned models

| Model | Precision | mAP | Speed (CPU)/(f/s) | Speed (GPU)/(f/s) | Parameters | Model size |
|---|---|---|---|---|---|---|
| YOLOv3-tiny | 32.7 | 24.1 | 48 | 120 | 8.9M | 33.1 MB |
| YOLOv3 | 55.8 | 57.9 | 13 | 27 | 60.6M | 231 MB |
| YOLOv3-50 | 57.6 | 56.6 | 22 | 48 | 19.8M | 91.7 MB |
| YOLOv3-80 | 51.7 | 52.4 | 23 | 50 | 12.3M | 46.6 MB |
| YOLOv3-95 | 49.4 | 46.5 | 27 | 57 | 4.8M | 18.7 MB |
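The YOLOv3-50/80/95 entries in Table 1 are, by their naming, models compressed at increasing pruning ratios (50%, 80%, 95%). A common way to realize such channel pruning, in the spirit of network slimming (Liu et al.), is to rank convolution channels by the magnitude of their batch-normalization scaling factors and cut at the target ratio. The PyTorch sketch below shows only that threshold-selection step; it is an assumption-laden illustration, not the paper's incremental-pruning code, which would additionally alternate pruning with fine-tuning at gradually increasing ratios.

```python
# Sketch of network-slimming-style threshold selection for channel pruning
# (an illustration; the paper's incremental pruning procedure is not shown).

import torch
import torch.nn as nn


def bn_prune_threshold(model: nn.Module, prune_ratio: float) -> float:
    """Global |gamma| cutoff below which BN-scaled channels would be pruned."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    k = min(int(gammas.numel() * prune_ratio), gammas.numel() - 1)
    return torch.sort(gammas).values[k].item()


def kept_channels(bn: nn.BatchNorm2d, threshold: float) -> torch.Tensor:
    """Boolean mask of the channels that survive pruning in one BN layer."""
    return bn.weight.detach().abs() > threshold


# Hypothetical usage with prune_ratio=0.95, mirroring "YOLOv3-95":
# thr = bn_prune_threshold(model, 0.95)
# masks = {name: kept_channels(m, thr) for name, m in model.named_modules()
#          if isinstance(m, nn.BatchNorm2d)}
```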
Table 2. Comparison among different algorithms
| Algorithm | Average Bhattacharyya distance | Average time per frame/s |
|---|---|---|
| Traditional ASMS | 0.786 | 0.0098 |
| KCF | 0.795 | 0.0073 |
| YOLOv3 + ASMS | 0.805 | 0.0631 |
| YOLOv3-95 + ASMS | 0.803 | 0.0463 |
Table 3. Comparison among different algorithms
| Algorithm | Avg. Bhattacharyya distance (pedestrian) | Avg. Bhattacharyya distance (animal) | Avg. Bhattacharyya distance (car) | Avg. time per frame/s (pedestrian) | Avg. time per frame/s (animal) | Avg. time per frame/s (car) |
|---|---|---|---|---|---|---|
| ASMS | 0.3128 | 0.2564 | 0.3397 | 0.0093 | 0.0101 | 0.0104 |
| KCF | 0.3275 | 0.2631 | 0.3463 | 0.0078 | 0.0073 | 0.0085 |
| YOLOv3 + ASMS | 0.6965 | 0.6700 | 0.7201 | 0.0626 | 0.0611 | 0.0607 |
| YOLOv3-95 + ASMS | 0.6733 | 0.6574 | 0.7196 | 0.0469 | 0.0460 | 0.0473 |
| VITAL | 0.7043 | 0.6852 | 0.7253 | 1.6667 | 1.6823 | 1.6295 |
| SANet | 0.6965 | 0.6700 | 0.7201 | 1.3333 | 1.3478 | 1.3256 |
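The first metric in Tables 2 and 3 compares the color histogram of the tracked region against that of a reference region. Note that larger values are better in these tables, which suggests the reported quantity is the Bhattacharyya coefficient (a similarity in [0, 1]) rather than the distance derived from it. A generic computation follows, assuming L1-normalized histograms; it is not tied to the paper's exact evaluation code.

```python
# Generic Bhattacharyya coefficient between two histograms (assumption:
# this matches the similarity reported in Tables 2 and 3; larger = better).

import numpy as np


def bhattacharyya_coefficient(p: np.ndarray, q: np.ndarray) -> float:
    """BC(p, q) = sum_i sqrt(p_i * q_i) over L1-normalized histograms."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))


# Identical histograms give 1.0; disjoint ones give 0.0.
print(bhattacharyya_coefficient(np.array([0.2, 0.3, 0.5]),
                                np.array([0.1, 0.4, 0.5])))  # ~0.988
```

The distance form used by, for example, OpenCV's HISTCMP_BHATTACHARYYA is sqrt(1 - BC), so the two conventions produce the same ranking of trackers.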