Multi-modal Deep Learning and Its Applications in Ophthalmic Artificial Intelligence

LI Xirong

Citation: LI Xirong. Multi-modal Deep Learning and Its Applications in Ophthalmic Artificial Intelligence[J]. JOURNAL OF MECHANICAL ENGINEERING, 2021, 12(5): 602-607. doi: 10.12290/xhyxzz.2021-0500

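doi: 10.12290/xhyxzz.2021-0500
Funds: 

Beijing Natural Science Foundation (General Program) 4202033

Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund 19L2062

Pharmaceutical Collaborative Innovation Research Project of Beijing Science and Technology Commission Z191100007719002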

Details
    Corresponding author:

    LI Xirong  Tel: 010-82504345, E-mail: xirong@ruc.edu.cn

  • CLC number: R77; TP18

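  • Abstract: Owing to its strong learning capacity and ease of use, deep learning has become the mainstream machine learning approach and a core technology of medical artificial intelligence. Given the key role of medical imaging in tasks such as health screening, disease diagnosis, precision treatment, and prognosis assessment, deep learning for structural analysis and semantic understanding of medical images is becoming an important interdisciplinary research direction. In clinical practice, to reach a more accurate diagnosis, physicians often need to consult image samples of different types and modalities at the same time and form a combined judgment. This paper introduces the basic concepts and working principles of multi-modal deep learning for such scenarios, uses concrete cases to analyze its research progress, applications, and technical challenges in ophthalmology, and discusses the prospects for its application.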

     

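  • Figure  Examples of different types of ophthalmic images

    A. Color fundus photograph; B. Fundus fluorescein angiography; C. Ultra-widefield fundus image; D. Optical coherence tomography; E. Slit-lamp photograph (oblique illumination)

    Figure  Three paradigms of multi-modal deep learning (dashed boxes)

    A. Data-level fusion; B. Feature-level fusion; C. Task-level fusion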
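
    The three paradigms named in the caption above differ in where the two modalities meet: at the input, at the intermediate features, or at the final scores. The following is a minimal NumPy sketch under stated assumptions, not the article's implementation. The toy `encode` and `classify` functions, the random weights, and the choice of input concatenation for data-level fusion and score averaging for task-level fusion are all illustrative assumptions.

```python
import numpy as np

def encode(x, w):
    """Toy encoder (assumption): linear projection followed by ReLU."""
    return np.maximum(x @ w, 0.0)

def classify(f, w):
    """Toy classifier head (assumption): linear layer plus stable softmax."""
    z = f @ w
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def data_level_fusion(xa, xb, w_enc, w_cls):
    # Modalities are merged at the input; one encoder and one classifier follow.
    x = np.concatenate([xa, xb], axis=-1)
    return classify(encode(x, w_enc), w_cls)

def feature_level_fusion(xa, xb, w_enc_a, w_enc_b, w_cls):
    # Each modality has its own encoder; features are concatenated before
    # a single shared classifier.
    f = np.concatenate([encode(xa, w_enc_a), encode(xb, w_enc_b)], axis=-1)
    return classify(f, w_cls)

def task_level_fusion(xa, xb, w_enc_a, w_enc_b, w_cls_a, w_cls_b):
    # Each modality has a complete model; only the output scores are averaged.
    pa = classify(encode(xa, w_enc_a), w_cls_a)
    pb = classify(encode(xb, w_enc_b), w_cls_b)
    return (pa + pb) / 2.0

# Hypothetical sizes: 4 samples, two modalities with 6 and 5 input
# dimensions, 8 hidden units per encoder, 3 classes.
rng = np.random.default_rng(0)
n, da, db, h, c = 4, 6, 5, 8, 3
xa = rng.normal(size=(n, da))
xb = rng.normal(size=(n, db))

p_data = data_level_fusion(
    xa, xb, rng.normal(size=(da + db, h)), rng.normal(size=(h, c)))
p_feat = feature_level_fusion(
    xa, xb, rng.normal(size=(da, h)), rng.normal(size=(db, h)),
    rng.normal(size=(2 * h, c)))
p_task = task_level_fusion(
    xa, xb, rng.normal(size=(da, h)), rng.normal(size=(db, h)),
    rng.normal(size=(h, c)), rng.normal(size=(h, c)))
```

    All three functions return one row of class probabilities per sample. The practical difference is how much of the model is shared: data-level fusion needs inputs that can be combined directly, while task-level fusion lets each single-modal model be trained and replaced independently.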

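    Table  1  Examples of single-modal deep learning applications in ophthalmology

    Year | Authors | Task | Single-modal input
    2016 | Gulshan et al. [7] | Referable/non-referable DR classification | Single color fundus photograph
    2017 | Burlina et al. [8] | AMD grading | Single color fundus photograph
    2018 | Kermany et al. [9] | Multi-disease recognition | OCT image sequence
    2018 | Wei et al. [10] | Laser scar detection | Single color fundus photograph
    2019 | Lai et al. [11] | Left/right eye recognition | Single color fundus photograph
    2019 | Xu et al. [12] | Nuclear cataract grading | Single slit-lamp photograph
    2019 | Yang et al. [13] | Joint optic disc and macula localization | Single ultra-widefield fundus image
    2020 | Wu et al. [14] | Anomaly detection | Single OCT B-scan image
    2020 | Ding et al. [15] | Optic disc/optic cup segmentation | Single color fundus photograph
    2020 | Ding et al. [16] | RNFLD detection | Single color fundus photograph
    2020 | Wei et al. [17] | Fundus lesion segmentation, DR grading | Single color fundus photograph
    2020 | Li et al. [18] | ROP detection | Multiple color fundus photographs
    2021 | Li et al. [19] | Multi-disease recognition | Single color fundus photograph
    2021 | Zhang et al. [20] | Multi-disease recognition | Single ultra-widefield fundus image
    DR: diabetic retinopathy; AMD: age-related macular degeneration; OCT: optical coherence tomography; RNFLD: retinal nerve fiber layer defect; ROP: retinopathy of prematurity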

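    Table  2  Examples of multi-modal deep learning applications in medicine

    Year | Authors | Task | Multi-modal input | Fusion level | Fusion strategy
    2020 | Wang et al. [28] | Breast cancer classification | Conventional ultrasound, color Doppler ultrasound, shear-wave elastography, strain elastography | Feature level | Feature concatenation
    2020 | Zhou et al. [29] | Overall survival prediction for brain tumor patients | MR images of four modalities (T1, T1ce, T2, FLAIR) | Feature level | Feature concatenation
    2020 | Chen et al. [26] | Cancer diagnosis and prognosis prediction | Histopathology images, genomic features | Feature level | Tensor fusion
    2020 | Jiang et al. [30] | Pancreas segmentation | Venous-phase CT, arterial-phase CT | Feature level | Multi-level selective feature fusion
    2020 | Peng et al. [31] | Prediction of distant cancer metastasis | PET, CT | Feature level | Network architecture search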
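
    Feature concatenation and tensor fusion, both listed as feature-level strategies in the table above, combine the same two feature vectors in different ways. The sketch below is illustrative only. It assumes tensor fusion in its common outer-product form, where a constant 1 is appended to each vector so the single-modal terms are kept alongside the pairwise interactions. The cited works may add gating or other components that are not shown, and the function names and example values are hypothetical.

```python
import numpy as np

def concat_fusion(fa, fb):
    """Feature concatenation: output size is len(fa) + len(fb)."""
    return np.concatenate([fa, fb])

def tensor_fusion(fa, fb):
    """Outer-product fusion (assumed form): output size is
    (len(fa) + 1) * (len(fb) + 1)."""
    fa1 = np.append(fa, 1.0)  # the appended 1 preserves the single-modal terms
    fb1 = np.append(fb, 1.0)
    return np.outer(fa1, fb1).ravel()

# Hypothetical feature vectors from two modalities.
fa = np.array([1.0, 2.0])
fb = np.array([3.0, 4.0, 5.0])

fused_concat = concat_fusion(fa, fb)   # 5 values
fused_tensor = tensor_fusion(fa, fb)   # 12 values, including all products fa[i] * fb[j]
```

    Concatenation grows linearly with the feature sizes and leaves it to later layers to learn cross-modal interactions. The outer product makes every pairwise interaction explicit, at the cost of an output that grows with the product of the two sizes.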

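    Table  3  Examples of multi-modal deep learning applications in ophthalmology

    Year | Authors | Task | Multi-modal input | Fusion level | Fusion strategy
    2019 | Wang et al. [32] | AMD classification | Color fundus photograph, OCT image | Feature level | Feature concatenation
    2020 | Xu et al. [33] | AMD/PCV classification | Color fundus photograph, OCT image | Feature level | Feature concatenation
    2020 | Li et al. [24] | Recognition of specific fundus diseases | Color fundus photograph, algorithm-synthesized FFA | Data level | Sample mixing
    2021 | Yang et al. [27] | Recognition of multiple fundus diseases | Color fundus photograph, OCT image sequence | Task level | Score averaging
    AMD, OCT: as in Table 1; PCV: polypoidal choroidal vasculopathy; FFA: fundus fluorescein angiography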
  • [1] Etzioni O, Decario N. AI can help scientists find a COVID-19 vaccine[EB/OL]. [2021-06-16]. https://www.wired.com/story/opinion-ai-can-help-find-scientists-find-a-covid-19-vaccine.
    [2] Laguarta J, Hueto F, Subirana B. COVID-19 artificial intelligence diagnosis using only cough recordings[J]. IEEE Open J Eng Med Biol, 2020, 1: 275-281. doi: 10.1109/OJEMB.2020.3026928
    [3] Zeeberg A. D.I.Y. Artificial intelligence comes to a Japanese family farm[EB/OL]. [2021-06-16]. https://www.newyorker.com/tech/annals-of-technology/diy-artificial-intelligence-comes-to-a-japanese-family-farm.
    [4] Bengio Y. Learning deep architectures for AI[G]. Foundations and Trends® in Machine Learning, 2009, 2: 1-127.
    [5] Schmidhuber J. Deep learning in neural networks: An overview[J]. Neural Netw, 2015, 61: 85-117. http://www.onacademic.com/detail/journal_1000036789998910_729a.html
    [6] Zheng A, Casari A. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists[M]. New York: O'Reilly Media Inc., 2018.
    [7] Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs[J]. JAMA, 2016, 316: 2402-2410. doi: 10.1001/jama.2016.17216
    [8] Burlina PM, Joshi N, Pekala M, et al. Automated grading of age-related macular degeneration from color fundus images using deep convolutional neural networks[J]. JAMA Ophthalmol, 2017, 135: 1170-1176. doi: 10.1001/jamaophthalmol.2017.3782
    [9] Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning[J]. Cell, 2018, 172: 1122-1131.e9. doi: 10.1016/j.cell.2018.02.010
    [10] Wei Q, Li X, Wang H, et al. Laser scar detection in fundus images using convolutional neural network[C]. ACCV, 2018: 191-206.
    [11] Lai X, Li X, Qian R, et al. Four models for automatic recognition of left and right eye in fundus images[C]. MMM, 2019: 507-517.
    [12] Xu C, Zhu X, He W, et al. Fully deep learning for slit-lamp photo based nuclear cataract grading[C]. MICCAI, 2019: 513-521.
    [13] Yang Z, Li X, He X, et al. Joint localization of optic disc and fovea in ultra-widefield fundus images[C]. MLMI, 2019: 453-460.
    [14] Wu J, Zhang Y, Wang J, et al. AttenNet: Deep attention based retinal disease classification in OCT images[C]. MMM, 2020: 565-576.
    [15] Ding F, Yang G, Wu J, et al. High-order attention networks for medical image segmentation[C]. MICCAI, 2020: 253-262.
    [16] Ding F, Yang G, Ding D, et al. Retinal nerve fiber layer defect detection with position guidance[C]. MICCAI, 2020: 745-754.
    [17] Wei Q, Li X, Yu W, et al. Learn to segment retinal lesions and beyond[C]. ICPR, 2020: 7403-7410.
    [18] Li X, Wan W, Zhou Y, et al. Deep multiple instance learning with spatial attention for ROP case classification, instance selection and abnormality localization[C]. ICPR, 2020: 7293-7298.
    [19] Li B, Chen H, Zhang B, et al. Development and evaluation of a deep learning model for the detection of multiple fundus diseases based on colour fundus photography[J]. Br J Ophthalmol, 2021. doi: 10.1136/bjophthalmol-2020-316290.
    [20] Zhang C, He F, Li B, et al. Development of a deep-learning system for detection of lattice degeneration, retinal breaks, and retinal detachment in tessellated eyes using ultra-wide-field fundus images: A pilot study[J]. Graefes Arch Clin Exp Ophthalmol, 2021, 259: 2225-2234. doi: 10.1007/s00417-021-05105-3
    [21] Zhang C, Yang Z, He X, et al. Multimodal intelligence: Representation learning, information fusion, and applications[J]. IEEE J Sel Top Signal Process, 2020, 14: 478-493. doi: 10.1109/JSTSP.2020.2987728
    [22] Baltrušaitis T, Ahuja C, Morency LP. Multimodal machine learning: A survey and taxonomy[J]. IEEE Trans Pattern Anal Mach Intell, 2018, 41: 423-443. http://arxiv.org/pdf/1705.09406
    [23] Wang J, Tian K, Ding D, et al. Unsupervised domain expansion for visual categorization[J]. ACM Trans Multimedia Comput Commun Appl, 2021. https://arxiv.org/abs/2104.00233
    [24] Li X, Jia M, Islam MT, et al. Self-supervised feature learning via exploiting multi-modal data for retinal disease diagnosis[J]. IEEE Trans Med Imaging, 2020, 39: 4023-4033. doi: 10.1109/TMI.2020.3008871
    [25] Wang W, Xu Z, Yu W, et al. Two-stream CNN with loose pair training for multi-modal AMD categorization[C]. MICCAI, 2019: 156-164.
    [26] Chen RJ, Lu MY, Wang J, et al. Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis[J]. IEEE Trans Med Imaging, 2020. doi: 10.1109/TMI.2020.3021387.
    [27] Yang J, Yang Z, Mao Z, et al. Bi-modal deep learning for recognizing multiple retinal diseases based on color fundus photos and OCT images[C]. ARVO Annual Meeting, 2021.
    [28] Wang J, Miao J, Yang X, et al. Auto-weighting for breast cancer classification in multi-modal ultrasound[C]. MICCAI, 2020: 190-199.
    [29] Zhou T, Fu H, Zhang Y, et al. M2Net: Multi-modal multi-channel network for overall survival time prediction of brain tumor patients[C]. MICCAI, 2020: 221-231.
    [30] Jiang X, Luo Q, Wang Z, et al. Multiphase and multi-level selective feature fusion for automated pancreas segmentation from CT images[C]. MICCAI, 2020: 460-469.
    [31] Peng Y, Bi L, Fulham M, et al. Multi-modality information fusion for radiomics-based neural architecture search[C]. MICCAI, 2020: 763-771.
    [32] Wang W, Xu Z, Yu W, et al. Two-stream CNN with loose pair training for multi-modal AMD categorization[C]. MICCAI, 2019: 156-164.
    [33] Xu Z, Wang W, Yang J, et al. Automated diagnoses of age-related macular degeneration and polypoidal choroidal vasculopathy using bi-modal deep convolutional neural networks[J]. Br J Ophthalmol, 2021, 105: 561-566. doi: 10.1136/bjophthalmol-2020-315817
    [34] Zhu J, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]. ICCV, 2017: 2223-2232.
    [35] Li X, Zhou Y, Wang J, et al. Multi-modal multi-instance learning for retinal disease recognition[C]. ACMMM, 2021. doi: 10.1145/3474085.3475418.
Publication history
  • Received:  2021-06-28
  • Accepted:  2021-07-29
  • Available online:  2021-11-26
  • Release date:  2021-08-19
  • Issue date:  2021-09-30