Issue 5
Nov 2021
Citation: LI Xirong. Multi-modal Deep Learning and Its Applications in Ophthalmic Artificial Intelligence[J]. JOURNAL OF MECHANICAL ENGINEERING, 2021, 12(5): 602-607. doi: 10.12290/xhyxzz.2021-0500

Multi-modal Deep Learning and Its Applications in Ophthalmic Artificial Intelligence

doi: 10.12290/xhyxzz.2021-0500
Funds:

Beijing Natural Science Foundation 4202033

Beijing Natural Science Foundation Haidian Original Innovation Joint Fund 19L2062

Pharmaceutical Collaborative Innovation Research Project of Beijing Science and Technology Commission Z191100007719002

  • Corresponding author: LI Xirong  Tel: 86-10-82504345, E-mail: xirong@ruc.edu.cn
  • Received Date: 28 Jun 2021
  • Accepted Date: 29 Jul 2021
  • Available Online: 26 Nov 2021
  • Publish Date: 19 Aug 2021
  • Issue Publish Date: 30 Sep 2021
  • Deep learning, with its powerful learning capability and ease of use, has become a prevalent machine learning approach and a core technique for artificial intelligence (AI) in medicine and healthcare. Because medical imaging is important in many tasks, such as health screening, disease diagnosis, precise treatment, and prognosis prediction, deep learning for structural analysis and semantic understanding of medical images is becoming an important interdisciplinary research direction. In clinical scenarios, doctors need to refer to multiple modalities of medical imaging simultaneously to reach a more accurate diagnosis through comprehensive analysis and judgment. This article introduces the basic concepts and working principles of multi-modal deep learning in such scenarios, reviews recent research progress on applying multi-modal deep learning in both general medical fields and ophthalmology, discusses technical challenges, and envisions potential applications of multi-modal deep learning in AI-assisted ophthalmology.

     

