Deep Reinforcement Learning-based Integrated Control of Hybrid Electric Vehicles Driven by High Definition Map in Cloud Control System
Abstract: In the context of the development of intelligence, connectivity, and new energy, the automotive industry is converging with the computer, information and communication, and artificial intelligence (AI) fields. Based on the new generation of information and communication technology, the cloud control system (CCS) of intelligent and connected vehicles (ICVs), cloud-level automated driving of new energy vehicles is realized in a connected-data-driven manner, which provides innovative planning and control ideas for vehicle driving and power systems. First, based on the resource platform of the CCS, the latitude, longitude, altitude, and weather information of the target road are obtained, and a high-definition (HD) road model including slope, curvature, and steering angle is established. Second, a deep reinforcement learning (DRL)-based integrated control method for hybrid electric vehicles (HEVs), driven by the HD map, is proposed. Two DRL algorithms control the speed and steering at the vehicle level and the engine and transmission at the powertrain level, realizing the synchronous learning of four control strategies. Finally, processor-in-the-loop (PIL) tests are performed on the high-performance edge computing device NVIDIA Jetson AGX Xavier. The results show that, with a variable space of 14 states and 4 actions, the DRL-based integrated control strategy achieves precise speed and steering control at the vehicle level over a 172 km high-speed driving cycle and a fuel economy of 5.53 L/100 km, while consuming only 104.14 s of computation time in the PIL test, verifying the optimality and real-time performance of the learning-based multi-objective integrated control strategy.
Table 1 Main parameters of the vehicle

Component                 Parameter               Value
Vehicle                   Mass/kg                 1587
                          Wheel radius/m          0.30115
                          Wheelbase/m             2.67
Engine                    Maximum power/kW        197
                          Maximum torque/(N·m)    331
Motor/generator           Maximum power/kW        51
                          Maximum torque/(N·m)    390
Li-ion traction battery   Capacity/(A·h)          12
                          Rated voltage/V         324
CVT                       Ratio range             [0.529, 3.172]
Final drive               Ratio                   2.47

Table 2 Training pseudocode of the DRL-based integrated control strategy
DRL-based integrated control strategy for the hybrid electric vehicle
Input: training environment model
1:  Initialize the greedy coefficients ε of the four control strategies;
2:  Initialize the experience replay buffers D of the four control strategies;
3:  Randomly initialize the online-network and target-network parameters;
4:  For episode = 1 to 400 do
5:      Obtain the initial states SAcc(1), SSte(1), SCVT(1), SEng(1);
6:      For t = 1 to T do
7:          Output actions according to the current policy π: with probability 1 − ε select the action with the maximum action value, otherwise (with probability ε) select a random action;
8:          Execute the actions in the environment, transition to the next states, and obtain the rewards;
9:          Store the training samples {S(t), A(t), R(t), S(t+1)};
10:         At each update step, sample a minibatch from the replay buffer to compute the gradients for updating the neural networks;
11:         if episode ≤ 100
12:             Train and update only the DDPG-based vehicle acceleration and steering-angle control strategies;
13:             Decay the greedy coefficients εAcc and εSte gradually to 0.01;
14:         elif 100 < episode ≤ 200
15:             Additionally update the DQN-based CVT ratio control strategy;
16:             Decay the greedy coefficient εCVT gradually to 0.01;
17:         else (episode > 200)
18:             Additionally update the DQN-based engine power control strategy;
19:             Decay the greedy coefficient εEng gradually to 0.01;
20:     End for
21: End for
22: Save the four sets of neural-network parameter files fitting the DRL-based integrated control strategy for the subsequent processor-in-the-loop tests.
Output: DRL-based integrated control strategy

Table 3 Hyperparameters and neural network structure
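The staged exploration schedule in the pseudocode above can be sketched as follows. This is a minimal illustration, not the authors' code: the multiplicative decay form is an assumption, with the decay rate (0.01) and the initial/final exploration rates (1 and 0.01) taken from Table 3.

```python
# Sketch of the staged epsilon-greedy schedule of Table 2.
# Assumed: multiplicative decay per episode with the rate 0.01 from
# Table 3, floored at the final exploration rate 0.01.
EPS_INIT, EPS_MIN, DECAY = 1.0, 0.01, 0.01
MAX_EPISODES = 400

def decayed(eps):
    """One decay step, never below the final exploration rate."""
    return max(EPS_MIN, eps * (1.0 - DECAY))

# One greedy coefficient per control strategy.
eps = {"acc": EPS_INIT, "ste": EPS_INIT, "cvt": EPS_INIT, "eng": EPS_INIT}

for episode in range(1, MAX_EPISODES + 1):
    # Stage 1 (episodes 1-100): only the DDPG speed/steering policies train.
    eps["acc"] = decayed(eps["acc"])
    eps["ste"] = decayed(eps["ste"])
    if episode > 100:  # Stage 2: the DQN-based CVT ratio policy joins.
        eps["cvt"] = decayed(eps["cvt"])
    if episode > 200:  # Stage 3: the DQN-based engine power policy joins.
        eps["eng"] = decayed(eps["eng"])
```

Staggering the start of each decay keeps the later-trained powertrain policies exploratory while the vehicle-level policies are already converging.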
Parameter                  Value
Learning rate              0.001
Discount factor            0.9
Replay buffer capacity     10000
Minibatch size             128
Decay rate                 0.01
Initial exploration rate   1
Final exploration rate     0.01
Activation function        Sigmoid
Hidden-layer neurons       100 / 100 / 100 (three layers)

Table 4 Settings of the compared energy management strategies
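The network structure of Table 3 (three hidden layers of 100 sigmoid units) can be sketched as a forward pass in NumPy. The input size 14 matches the 14 states of the variable space; the output size and the weight initialization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(n_in=14, hidden=(100, 100, 100), n_out=11, seed=0):
    """Randomly initialize the 3-hidden-layer network of Table 3.
    n_in=14 matches the 14 states; n_out is an assumed number of
    discrete action values (e.g. an 11-grid action)."""
    rng = np.random.default_rng(seed)
    sizes = (n_in, *hidden, n_out)
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Sigmoid activations on the hidden layers, linear output layer
    (Q-values must be unbounded, so no activation on the output)."""
    for W, b in params[:-1]:
        x = sigmoid(x @ W + b)
    W, b = params[-1]
    return x @ W + b

params = init_mlp()
q = forward(params, np.zeros(14))  # Q-values for one 14-dimensional state
```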
Scheme                  Energy management strategy setting
DP(T)/DP-based EMS(a)   [2-D state space, 2-D action space] States: SOC and CVT ratio; actions: throttle opening and CVT shift command. The SOC range [0.5, 0.7] is discretized into 21 grids; the CVT ratio into 28 grids; the throttle opening range [0, 1] into 101 grids; and the CVT shift command [−5, 5], representing the instantaneous ratio change, into 11 grids.
DP(Δ)/DP-based EMS(b)   [3-D state space, 2-D action space] States: SOC, engine power, and CVT ratio; actions: engine power change and CVT shift command. The engine power range [0 kW, 197 kW] is discretized into 198 grids and the power change range [−5 kW, 5 kW] into 11 grids. The remaining settings are identical to DP(T)/DP-based EMS.
QL/DP-based EMS         [1-D state space, 1-D action space] State: SOC; action: engine power change. The discretization follows the two strategies above, while the CVT shift strategy directly adopts the ratio sequence of DP(Δ)/DP-based EMS.
Note: (a) T denotes that the controlled variable is the throttle opening; (b) Δ denotes that the controlled variable is the engine power change.
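The discretizations in Table 4 amount to uniform grids over each stated range. A sketch with NumPy, using the grid counts straight from the table (variable names are illustrative):

```python
import numpy as np

# Uniform grids from Table 4 (range and grid count as stated in the text).
soc_grid      = np.linspace(0.5, 0.7, 21)      # SOC, 21 grids
ratio_grid    = np.linspace(0.529, 3.172, 28)  # CVT ratio, 28 grids
throttle_grid = np.linspace(0.0, 1.0, 101)     # throttle opening, 101 grids
shift_grid    = np.linspace(-5.0, 5.0, 11)     # CVT shift command, 11 grids
power_grid    = np.linspace(0.0, 197.0, 198)   # engine power/kW, 198 grids
dpower_grid   = np.linspace(-5.0, 5.0, 11)     # power change/kW, 11 grids
```

With these counts, the SOC step is 0.01 and the engine power step is 1 kW, so the grid resolutions follow directly from the range widths.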
EMS & CVT shift strategy   Initial SOC   Final SOC   Computation time/s   Fuel consumption/g   Fuel economy/(L/100 km)   Fuel economy gap/%
DP(T)/DP-based EMS         0.600         0.5462      2171.80(a)           6738.37              5.39                      0.41
DP(Δ)/DP-based EMS         0.600         0.5463      73496.12(a)          6711.05              5.37                      —
QL/DP-based EMS            0.600         0.5494      7.32(a)              7275.30              5.82                      8.71
DRL/DP-based EMS           0.600         0.5491      104.14(b)            6907.69              5.53                      2.93
Note: (a) 16-core 3.80 GHz Intel i7-10700K CPU with 16 GB RAM (desktop computer); (b) NVIDIA 8-core ARM v8.2 64-bit CPU (edge computing device).
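The last column of Table 5 is the relative gap to the DP(Δ)/DP-based EMS benchmark; for the DP(T) and DRL rows, it reproduces from the fuel-mass figures as a simple relative difference:

```python
# Relative fuel-consumption gap to the DP(Δ)/DP-based EMS benchmark,
# computed from the fuel consumption column (in grams) of Table 5.
BASELINE = 6711.05  # g, DP(Δ)/DP-based EMS

def gap_percent(fuel_g):
    return round((fuel_g - BASELINE) / BASELINE * 100, 2)

dp_t = gap_percent(6738.37)  # DP(T) row  -> 0.41
drl  = gap_percent(6907.69)  # DRL row    -> 2.93
```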