Early Diagnosis Model of Mycosis Fungoides Based on Intelligent Analysis of Dermoscopic Images
-
Abstract: Objective To evaluate the value of convolutional neural network (CNN) binary classification models based on dermoscopic images in the differential diagnosis of early mycosis fungoides (MF) and inflammatory dermatoses.

Methods Patients with confirmed early MF, or with inflammatory dermatoses of similar clinical presentation, who attended the dermatology clinic of Peking Union Medical College Hospital from January 2016 to December 2020 were retrospectively included and randomly divided into a training set and a test set at a ratio of 4∶1. Six classical network structures were trained by transfer learning on the dermoscopic images of the training-set patients to build CNN binary classification models. In addition, one dermoscopic image was randomly selected for each test-set patient and, together with the clinical images of the skin lesions, was interpreted by 13 dermatologists. The performance of the CNN binary classification models and of the dermatologists in differentiating early MF from inflammatory dermatoses in the test set was then compared. Results were expressed as area under the curve (AUC), sensitivity, specificity, Kappa coefficient, etc., and receiver operating characteristic (ROC) curves were used for visual analysis.

Results A total of 48 patients with early MF (402 dermoscopic images) and 96 patients with inflammatory dermatoses (557 dermoscopic images) were included, with 117 cases in the training set (772 dermoscopic images) and 27 cases in the test set (187 dermoscopic images). In the test set, the sensitivity and specificity of the dermatologists in differentiating early MF from inflammatory dermatoses were 70.19% (95% CI: 59.68%-80.70%) and 94.74% (95% CI: 91.77%-97.71%), respectively, and the Kappa coefficient was 0.677 (95% CI: 0.566-0.789). When classified by single image, the AUC of the CNN binary classification models was 0.87 (95% CI: 0.84-0.89), the sensitivity and specificity were 75.02% (95% CI: 70.19%-79.85%) and 82.02% (95% CI: 79.30%-84.87%), respectively, and the Kappa coefficient was 0.563 (95% CI: 0.507-0.620). When classified by case, the AUC of the CNN binary classification models was 0.97 (95% CI: 0.95-0.99), the sensitivity and specificity were 87.50% (95% CI: 78.55%-96.45%) and 93.85% (95% CI: 88.93%-98.77%), respectively, and the Kappa coefficient was 0.920 (95% CI: 0.884-0.954). The ROC curve showed that, when classified by case, the CNN binary classification model built on EfficientNet-B0 diagnosed early MF with an AUC of 0.99, a sensitivity of 88.9%, and a specificity of 100%, and that the point corresponding to the mean sensitivity and specificity of the 13 dermatologists lay to the lower right of the curve.

Conclusions CNN binary classification models based on the intelligent analysis of dermoscopic images can accurately classify early MF and inflammatory dermatoses, and their differential diagnostic performance exceeds the average level of dermatologists.
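The results are reported both per image and per case, but the text here does not state how image-level outputs were combined into a case-level decision. The following is a minimal sketch under one possible assumption: the per-image MF probabilities of each patient are averaged and the mean is thresholded at 0.5. The function name `aggregate_by_case`, the averaging rule, the threshold, and the patient identifiers are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_case(image_probs, threshold=0.5):
    """Combine per-image MF probabilities into one decision per case.

    image_probs: iterable of (case_id, probability_of_MF) pairs,
                 one pair per dermoscopic image.
    Returns {case_id: (mean_probability, predicted_MF)}.

    ASSUMPTION: mean-probability pooling with a fixed threshold; the
    aggregation rule actually used in the study is not given here.
    """
    per_case = defaultdict(list)
    for case_id, prob in image_probs:
        per_case[case_id].append(prob)

    decisions = {}
    for case_id, probs in per_case.items():
        case_prob = mean(probs)
        decisions[case_id] = (case_prob, case_prob >= threshold)
    return decisions


# Hypothetical image-level outputs for two patients.
example_probs = [
    ("P01", 0.9), ("P01", 0.4), ("P01", 0.8),
    ("P02", 0.2), ("P02", 0.6),
]
case_decisions = aggregate_by_case(example_probs)
for case_id, (prob, is_mf) in sorted(case_decisions.items()):
    print(case_id, round(prob, 2), "MF" if is_mf else "inflammatory")
```

Pooling over several images of the same patient tends to smooth out single-image errors, which is one plausible reason why case-level metrics can exceed image-level metrics; other rules such as majority voting would be equally reasonable choices.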
-
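Figure 4. Dermoscopic image and corresponding clinical image of one early MF case misdiagnosed by the CNN binary classification model
MF, CNN: as defined in Figure 2
Figure 5. Dermoscopic images and corresponding clinical images of early MF cases with low diagnostic accuracy among dermatologists
A. Diagnostic accuracy of the dermatologists was 0; B-D. Diagnostic accuracy of the dermatologists was 69.23% in each case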
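MF: as defined in Figure 2

Table 1. Cross-validation evaluation results of the CNN binary classification models (mean ± SD)

| Network structure | Positive likelihood ratio | Negative likelihood ratio (%) | Sensitivity (%) | Specificity (%) | Accuracy (%) | Positive predictive value (%) | Negative predictive value (%) |
|---|---|---|---|---|---|---|---|
| AlexNet | 4.59±2.62 | 40.51±5.84 | 66.72±3.62 | 82.60±6.52 | 75.82±4.59 | 75.25±6.69 | 76.42±4.19 |
| VGG16 | 3.59±1.06 | 38.55±8.77 | 69.58±6.12 | 79.36±5.74 | 75.02±4.60 | 72.26±4.51 | 77.30±5.78 |
| ResNet18 | 4.03±2.16 | 32.25±8.45 | 74.76±7.49 | 77.92±8.80 | 76.60±4.44 | 73.06±7.10 | 80.36±5.36 |
| SENet | 3.41±1.14 | 52.24±11.18 | 57.36±8.17 | 81.94±5.06 | 71.26±5.04 | 70.76±6.90 | 71.68±5.45 |
| DenseNet121 | 3.00±0.80 | 44.21±8.04 | 65.88±8.31 | 76.40±8.66 | 71.98±3.16 | 68.76±4.01 | 74.94±4.50 |
| EfficientNet-B0 | 4.48±2.40 | 33.47±5.84 | 72.82±5.54 | 80.96±6.96 | 77.48±3.57 | 75.18±6.54 | 79.68±3.97 |

CNN: as defined in Figure 2

Table 2. Comparison of diagnostic results between the CNN binary classification models and the dermatologists [mean (95% CI)]

| Indicator | Dermatologists (n=27) | CNN models, by image (n=187) | CNN models, by case (n=27) |
|---|---|---|---|
| Positive likelihood ratio | NA | 4.32 (3.61-5.02) | NA |
| Negative likelihood ratio (%) | 31.87 (20.46-43.28) | 30.52 (24.56-36.48) | 17.54 (8.34-26.77)# |
| AUC | / | 0.87 (0.84-0.89) | 0.97 (0.95-0.99) |
| Sensitivity (%) | 70.19 (59.68-80.70) | 75.02 (70.19-79.85) | 87.50 (78.55-96.45)# |
| Specificity (%) | 94.74 (91.77-97.71) | 82.02 (79.30-84.87)# | 93.85 (88.93-98.77) |
| Accuracy (%) | 87.46 (83.32-91.60) | 79.52 (76.87-82.16) | 91.98 (88.52-95.44) |
| Kappa coefficient | 0.677 (0.566-0.789) | 0.563 (0.507-0.620) | 0.920 (0.884-0.954)# |
| Positive predictive value (%) | 85.83 (77.57-94.09) | 70.60 (67.09-74.11)# | 87.43 (78.55-96.45) |
| Negative predictive value (%) | 88.68 (85.07-92.29) | 85.23 (82.77-87.70)# | 94.93 (91.43-98.44) |

CNN: as defined in Figure 2; AUC: area under the curve; NA: the positive likelihood ratio could not be estimated because specificity was 100% in some of the diagnostic results; /: no AUC; #: statistically significant difference compared with the dermatologists' diagnostic results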
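The indicators reported in the tables can all be derived from a 2×2 confusion matrix. The sketch below shows the standard formulas, including why the positive likelihood ratio is reported as NA when specificity is 100% (the denominator 1 − specificity is zero). The function name `diagnostic_metrics` is hypothetical, and the example counts are an assumption: they suppose the 27 test cases comprised 9 early MF and 18 inflammatory cases, a split the text does not state, although 8 of 9 correct MF calls with no false positives would match the 88.9% sensitivity and 100% specificity reported for EfficientNet-B0.

```python
from typing import Dict, Optional


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, Optional[float]]:
    """Standard binary diagnostic indicators from confusion-matrix counts.

    Positive class = early MF, negative class = inflammatory dermatosis.
    Ratios are returned as fractions; the tables report several as percentages.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    n = tp + fn + tn + fp
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / n

    # Predictive values are undefined if nothing was predicted in that class.
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None

    # LR+ = sens / (1 - spec): undefined when there are no false positives.
    lr_pos = sens / (1 - spec) if fp else None
    # LR- = (1 - sens) / spec: undefined when there are no true negatives.
    lr_neg = (1 - sens) / spec if tn else None

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_obs = acc
    p_exp = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else None

    return {
        "sensitivity": sens, "specificity": spec, "accuracy": acc,
        "ppv": ppv, "npv": npv, "lr_pos": lr_pos, "lr_neg": lr_neg,
        "kappa": kappa,
    }


# HYPOTHETICAL counts (9 MF + 18 inflammatory cases is an assumption).
example = diagnostic_metrics(tp=8, fn=1, tn=18, fp=0)
for name, value in example.items():
    print(name, "NA" if value is None else round(value, 3))
```

With these assumed counts the positive likelihood ratio comes out as NA, mirroring the table footnote, while the remaining indicators stay well defined.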