Abstract
Background & Aims: Liver cancer is the second leading cause of cancer death worldwide and has a high case-fatality rate. The Liver Imaging Reporting and Data System (LIRADS) categorizes liver observations on cross-sectional imaging by hepatocellular carcinoma (HCC) risk. Intermediate-risk observations require repeated scans, which makes early diagnosis of HCC difficult. It remains uncertain whether artificial intelligence can improve the diagnostic performance of computed tomography (CT) for HCC.
Method: We retrospectively collected archived thin-cut (<1.25 mm slice thickness) contrast triphasic CT images in raw DICOM format, together with relevant clinical information. CT observations were contoured and categorized via LIRADS. The ground truth diagnosis of HCC was established via American Association for the Study of Liver Diseases (AASLD) recommendations and validated by a clinical composite reference standard based on subsequent 12-month outcomes. We constructed several deep learning algorithms (NVIDIA Tesla V100 GPUs, Dell Technologies), including the multiscale three-dimensional convolutional network (MS3DCN, Figure 1) model, which uniquely considers the multi-phasic nature of CT, and followed the Checklist for AI in Medical Imaging (CLAIM) framework for algorithm training, validation and testing.
Results: Among 2,796 retrieved scans, 2,281 were included, with 3,620 liver observations contoured. The cohort's mean age was 58.4±14.2 years; 61.0% were male and 1,214 (53.2%) were at risk for HCC. Median observation size was 21.0 (IQR 12.6-41.3) mm; 793 (21.9%) observations had a ground truth diagnosis of HCC. After observations were randomly divided into training and validation sets in a 7:3 ratio, MS3DCN was the best-performing algorithm at observation level for diagnosing HCC, achieving an AUC of 96.9% (95%CI 95.3%-98.2%), sensitivity 95.9%, specificity 98.1%, positive predictive value (PPV) 93.7%, and negative predictive value (NPV) 98.8%, compared to AUC 85.3% (95%CI 82.4%-88.1%), sensitivity 71.3%, specificity 99.3%, PPV 96.7% and NPV 91.9% for LIRADS. In sensitivity analysis, MS3DCN's diagnostic performance remained robust, with an AUC of 96.3% (95%CI 95.8%-97.6%) at patient level and 97.1% (95%CI 95.3%-98.6%) in the at-risk cohort. In external testing on an independent cohort of 551 scans and 780 observations, MS3DCN achieved an AUC of 98.0% (95%CI 96.8%-99.0%) at observation level and 97.6% (95%CI 96.0%-98.9%) at patient level.
Conclusion: The MS3DCN deep learning model applied to triphasic CT was highly and robustly accurate in diagnosing HCC and superior to LIRADS, with performance confirmed by internal validation and external testing. Artificial intelligence-based technologies can facilitate precise diagnosis and improve clinical outcomes.
Supported by the Innovation and Technology Fund, the Government of the HKSAR; and United Ally Research Limited, a subsidiary of Hong Kong Sanatorium and Hospital Limited.
Copyright © 2022 International Liver Congress.
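The predictive values quoted in the Results depend on how common HCC is among the observations, not only on sensitivity and specificity. The following is a minimal sketch of that relationship using Bayes' rule. It is not the authors' analysis: the function name `predictive_values` is hypothetical, and the prevalence used is the cohort-wide 793/3,620, which is an assumption because the abstract does not report the HCC prevalence within the validation set. Under that assumption the recomputed PPV and NPV fall within about one percentage point of the reported figures.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ASSUMPTION: the validation-set prevalence equals the cohort-wide prevalence
# (793 HCC among 3,620 observations). The abstract does not state this.
ASSUMED_PREVALENCE = 793 / 3620

# Sensitivity and specificity as reported at observation level in the Results.
REPORTED = {
    "MS3DCN": (0.959, 0.981),
    "LIRADS": (0.713, 0.993),
}

for name, (sens, spec) in REPORTED.items():
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"{name}: PPV ~ {ppv:.1%}, NPV ~ {npv:.1%}")
```

The same function shows why PPV and NPV should not be carried over unchanged to a population with a different HCC prevalence, such as the at-risk subgroup, whereas sensitivity and specificity are less dependent on prevalence.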
Original language | English |
---|---|
Publication status | Published - Jun 2022 |
Event | International Liver Congress 2022 - London, United Kingdom; Duration: 22 Jun 2022 → 26 Jun 2022; https://easl.eu/event/international-liver-congress-2022/ |
Conference
Conference | International Liver Congress 2022 |
---|---|
Country/Territory | United Kingdom |
City | London |
Period | 22/06/22 → 26/06/22 |