Introduction
- The intricate characteristics of lung structures and the overlapping patterns of diseases might result in misinterpretations.
- Various imaging methods may lead to differences in the quality and consistency of data.
- The scarcity of labeled datasets impedes the training of accurate models, particularly for rare illnesses.
- The progressive characteristics of disorders such as COVID-19 pose difficulties for pre-existing models.

Several solutions can be adopted to deal with these impediments:

- Model generalization may be improved by supplementing datasets with diversified samples and ensuring uniform imaging techniques.
- Continuous model adaptation via real-time data updates is critical, particularly with changing features.
This review analyzes ML approaches for diagnosing lung diseases. The main contributions of the research are:

- It investigates and addresses prominent lung diseases such as pneumonia, lung cancer, and COVID-19.
- It investigates and addresses the publicly accessible imaging-modality datasets for each prominent lung disease.
- It explores and addresses existing challenges and issues in diagnosing prominent lung diseases using ML, along with the associated novel solutions.
- It examines ML and its subfield approaches for identifying prominent lung diseases based on radiographic images, and their significance.
- It qualitatively assesses ML approaches, emphasizing their efficiency in identifying, classifying, and forecasting prominent lung diseases while outlining essential considerations for enhancing diagnosis.
The investigation is distinctive in offering a conceptual context for these issues. Furthermore, the analysis emphasizes the techniques and primary methods used in the published findings.
Necessity
Ref. | Year | Type of Analysis | Focused Research | Lung Disease | Type | S | M | X-ray | CT | O | ML | DL | CNN | TL | EM | O
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
[13] | 2022 | Review | Detection | Pneumonia | √ | X | √ | X | X | X | X | √ | √ | X | X | ~ |
[14] | 2022 | Review | Diagnosis | Pneumonia | √ | X | √ | X | X | X | ~ | ~ | ~ | ~ | ~ | ~ |
[15] | 2022 | Review | Detection and Classification | COVID-19 | √ | X | √ | √ | X | X | X | √ | √ | X | √ | ~ |
[16] | 2022 | Review | Diagnosis | COVID-19 | √ | X | √ | √ | √ | X | √ | √ | √ | ~ | ~ | ~ |
[17] | 2022 | Survey | Detection and Diagnosis | COVID-19 | √ | X | √ | √ | X | ~ | X | √ | √ | ~ | ~ | ~ |
[18] | 2022 | Survey | Detection | COVID-19 | √ | X | ~ | ~ | X | X | ~ | ~ | ~ | ~ | ~ | ~ |
[19] | 2022 | Review | Classification | COVID-19 | √ | X | √ | √ | X | √ | X | √ | ~ | ~ | ~ | ~ |
[20] | 2022 | Review | Prognostication and Detection | Lung Cancer | √ | X | X | √ | X | X | √ | √ | ~ | X | X | X |
[21] | 2022 | Survey | Classification | Lung Cancer | √ | X | X | √ | X | ~ | X | X | √ | X | X | X |
[22] | 2023 | Review | Detection and Classification | Lung Cancer | √ | X | √ | √ | X | X | X | √ | √ | X | X | ~ |
This Review | 2023 | Review | Diagnosis, Detection, Classification, Prediction | Pneumonia COVID-19 Lung Cancer | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
Methodology
Identification
Major Considerations | Keywords |
---|---|
Lung Diseases | "Lung Disease", "Pneumonia", "Lung Cancer", "COVID-19", and "Coronavirus" |
Imaging Modality | "X-Ray", "CT scan", "PET", and "MRI" |
Machine Learning | "Machine Learning", "Deep Learning", "Convolutional Neural Network", "Transfer Learning", and "Ensemble Learning" |
Screening
Inclusion
Lung diseases
- Airway-Related Lung Diseases: The windpipe, or trachea, splits into bronchi, which branch into smaller tubes extending throughout the lungs. Conditions that might affect these airways include asthma, COPD, acute bronchitis, chronic bronchitis, emphysema, and cystic fibrosis.
- Air Sac-Related Lung Diseases: The bronchioles, narrow passageways inside the lungs, terminate in clusters of alveoli, also called air sacs, which make up most of the lung tissue. Pneumonia, TB, emphysema, pulmonary edema, COVID-19, and lung cancer are among the ailments affecting the air sacs.
- Interstitium-Related Lung Diseases: The thin, delicate membrane between the lung's alveoli is known as the interstitium. It is filled with tiny blood capillaries that facilitate the exchange of gases between the alveoli and the blood. Lung conditions that affect the interstitium include interstitial lung disease (ILD), pneumonia, and pulmonary edema.
- Blood Vessel-Related Lung Diseases: Low-oxygen blood returns through the veins to the right side of the heart, which pumps it into the lungs through the pulmonary arteries. These blood vessels can also acquire diseases; pulmonary embolism and pulmonary hypertension are two lung disorders that affect them.
- Pleura-Related Lung Diseases: The pleura is a thin membrane surrounding the lungs and lining the chest wall. A thin coating of fluid lets the pleura covering the lungs slide along the chest wall with each breath. Pleural effusion and pneumothorax are pleural lung disorders.
- Chest Wall-Related Lung Diseases: The chest wall is essential to the respiratory process. Muscles connect the ribs, enabling the chest to expand, and the diaphragm descends with each breath, allowing the lungs to enlarge. Neuromuscular problems, obesity, and hypoventilation syndrome are all conditions that impair the chest wall [28].

After reviewing these categories of lung diseases, explaining each one in depth is impractical given their number. Our review therefore focuses on the most debilitating and catastrophic prominent lung diseases.
Prominent lung diseases
Pneumonia
Lung cancer
COVID-19
Developmental analysis of prominent lung diseases over the internet
Challenges and issues
Imaging modalities
Conventional imaging modalities
X-ray
CT scan
Positron emission tomography
Magnetic resonance imaging
Sputum smear microscopy images
Molecular imaging
At-bedside imaging modalities
Machine learning
ML strategy | Virtues | Limitation | Preferred Diagnoses | Reference |
---|---|---|---|---|
Supervised Learning | - Assists in resolving issues with training data - Provides results with good performance measures - Task-driven approach - Classification and Regression | - Training data must be labeled - Input data must be of good quality and adequate quantity | Pneumonia | [91] |
Unsupervised Learning | - Works best with unprocessed or raw data - Data-driven approach - Clustering and Dimensionality Reduction | - Does not employ a feedback mechanism to evaluate the quality of results | Lung Cancer | [92] |
Semi-supervised Learning | - Can use both labeled and unlabeled data - Classification and Clustering | - Unable to handle unobserved data | COVID-19 | [11] |
Machine learning developmental analysis on the internet
Introductory steps for employing machine learning to diagnose lung diseases
Publicly accessible datasets
Preprocessing
Feature extraction and relevant feature selection
Training of the machine learning model
Machine learning and its algorithms
- Regression is a common technique for reducing model-based uncertainty by iteratively adjusting the model in response to the errors it produces. Some types are linear, logistic, stepwise, and multivariate adaptive regression splines (MARS).
- Decision-tree algorithms predict the target variable from the input variables. Examples include random forest and classification and regression trees (CART).
- Bayesian algorithms apply the Bayes theorem and make it easier to use subjective probability in model development. Significant algorithms for classification and regression problems are Naïve Bayes and the Bayesian Belief Network.
- Kernel approaches are based on pattern analysis and incorporate a wide range of mapping methods. Support vector machines (SVM) and linear discriminant analysis (LDA) are essential kernel approaches in ML modeling.
- Clustering, the most widely used unsupervised learning approach, groups data points according to their similarities. K-means, partitioning-based, hierarchical, and density-based clustering are a few of the ways clustering techniques may be classified.
- Ensemble methods train several models and combine them to obtain more accurate outcomes. Compared with relying on a single model, the results of ensemble techniques are often more reliable. Bagging, boosting, AdaBoost, gradient boosting machines, and random forest are prominent ensemble techniques.
- Artificial neural networks (ANNs) are biologically inspired computational models used for various purposes, including clustering and classification. Notable ANN variants include the perceptron, the Hopfield network, and backpropagation-trained networks.
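As a concrete illustration of several of these algorithm families, the following minimal sketch, assuming scikit-learn and a synthetic stand-in for extracted radiographic features (not any dataset from this review), trains a Bayesian, a kernel-based, and an ensemble classifier side by side:

```python
# Minimal sketch (assuming scikit-learn) comparing three of the
# algorithm families above on synthetic feature data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for extracted image features (binary labels).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Naive Bayes": GaussianNB(),                              # Bayesian family
    "SVM (RBF kernel)": SVC(kernel="rbf"),                    # kernel family
    "Random Forest": RandomForestClassifier(random_state=0),  # ensemble family
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, round(model.score(X_test, y_test), 3))
```

The same fit/score interface applies to the other families (regression, decision trees, clustering), which is why such comparisons appear frequently in the studies surveyed here.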
Performance metrics
- Accuracy: The accuracy of an ML model is measured as the proportion of correctly classified samples to the total samples. It is the most common metric used to measure the performance of an ML model and can be expressed as (Eq. 1): $$\mathrm{Accuracy}=\frac{\text{correctly classified samples}}{\text{total samples}}$$
- Sensitivity: This metric measures how many relevant samples an ML model can identify by calculating the proportion of true positives to all actual positives, as presented in Eq. 2. It is often called the "true positive rate" or the "recall."
- Precision: This metric measures how accurate a model's positive predictions are by calculating the ratio of true positives to all positive predictions made by the model, as presented in Eq. 3. It is often referred to as the "positive predictive value."
- Specificity: This metric measures how well a model can correctly identify negative samples. It is the proportion of actual negatives that are correctly identified, as presented in Eq. 4. An ML model with high specificity has a low false-positive rate, meaning it will rarely classify negative examples as positive.
- F1 Score: This amalgamation of the precision and recall scores provides an overall score for model evaluation, as presented in Eq. 5.
- AUC: AUC stands for Area Under the Receiver Operating Characteristic Curve. The ROC curve plots the true positive rate against the false positive rate at varied thresholds, and the area under it is used to evaluate a model's discriminative ability. The AUC represents the degree of discrimination between classes [115]. Some of the performance metrics are presented in Table 4.
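These metrics follow directly from the four confusion-matrix counts. A minimal sketch, using illustrative TP/TN/FP/FN values rather than results from any study in this review:

```python
# Computing the metrics of Eqs. 1-5 from confusion-matrix counts.
# TP, TN, FP, FN are illustrative values, not real study results.
TP, TN, FP, FN = 80, 90, 10, 20

accuracy    = (TP + TN) / (TP + FP + TN + FN)  # Eq. 1
sensitivity = TP / (TP + FN)                   # Eq. 2 (recall / true positive rate)
precision   = TP / (TP + FP)                   # Eq. 3 (positive predictive value)
specificity = TN / (TN + FP)                   # Eq. 4
f1_score    = 2 * (precision * sensitivity) / (precision + sensitivity)  # Eq. 5

print(f"Accuracy={accuracy:.3f}, Sensitivity={sensitivity:.3f}, "
      f"Precision={precision:.3f}, Specificity={specificity:.3f}, "
      f"F1={f1_score:.3f}")
```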
Classification of lung diseases
Metric | Equation | |
---|---|---|
Accuracy | \(Accuracy = \frac{TP+TN}{TP+FP+TN+FN}\) | (1) |
Sensitivity | \(Sensitivity = \frac{TP}{TP+FN}\) | (2) |
Precision | \(Precision = \frac{TP}{TP+FP}\) | (3) |
Specificity | \(Specificity = \frac{TN}{TN+FP}\) | (4) |
F1 Score | \(F1\,Score = 2\times\left(\frac{Precision\times Recall}{Precision+Recall}\right)\) | (5) |
ML sub-fields
Deep learning
Convolutional neural network
Convolutional layer
Activation functions
Ref | Activation Function | Output Range | Equation | |
---|---|---|---|---|
[121] | ReLU | \([0, +\infty)\) | \(f(x)_{ReLU} = \max(0, x)\) | (7) |
[126] | Sigmoid | \((0, 1)\) | \(\sigma(x) = \frac{1}{1+e^{-x}}\) | (8) |
[127] | Tanh | \((-1, 1)\) | \(Tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\) | (9) |
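The three activation functions can be sketched directly from Eqs. 7–9 in plain Python, with no deep learning framework assumed:

```python
# Scalar implementations of the activation functions in Eqs. 7-9.
import math

def relu(x):
    # Eq. 7: zero for negative inputs, identity for non-negative inputs.
    return max(0.0, x)

def sigmoid(x):
    # Eq. 8: squashes any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Eq. 9: squashes any real input into (-1, 1).
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

print(relu(-2.0), sigmoid(0.0), round(tanh(1.0), 4))
```

In CNN practice these are applied element-wise to the feature maps produced by each convolutional layer.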
Pooling layer
Optimizers
Fully connected layer
CNN architectures
Ensemble learning
Transfer learning
Detection of prominent lung diseases using machine learning and imaging
Publicly accessible datasets
Pneumonia
Dataset Name | Pneumonia Types | Modality | Number of Images | Reference |
---|---|---|---|---|
Large Dataset of Labeled Optical Coherence Tomography and Chest X-Ray Images (LDOCTCXR) | Viral Pneumonia Bacterial Pneumonia | X-Ray | Total – 5,232 3,883—Pneumonia 1,349 – Normal | [42] |
Radiological Society of North America (RSNA) | Pneumonia Normal | X-Ray | Total – 5,528 | [43] |
NIH Chest X-rays (ChestX-ray8) | Pneumonia And 7 others | X-Ray | Total – 108,948 1,062—Pneumonia 84,312 – No Findings | [44] |
NIH Chest X-rays (ChestX-ray14) | Pneumonia And 13 others | X-Ray | Total – 112,120 1,353—Pneumonia 60,412 – No Findings | [46] |
Chest computed tomography in COVID-19 pneumonia | COVID-19 Pneumonia | CT scan | 105 – Positive | [63] |
Curated Dataset for COVID-19 Posterior-Anterior Chest Radiography Images (X-Rays) | Bacterial Pneumonia Viral Pneumonia COVID-19 Normal | X-Ray | Total – 9,208 3,001- Bacterial Pneumonia 1,656—Viral Pneumonia 1,281—COVID-19 3,270—Normal | [47] |
Balanced Augmented Covid CXR Dataset | Viral Pneumonia Lung Opacity COVID-19 Normal | X-Ray | 1,345—Viral Pneumonia 6,012—Lung Opacity 3,616 – COVID-19 10,192 – Normal | [48] |
CheXpert | Pneumonia and 13 others | X-Ray | Total – 224,316 4,576 – Positive | [49] |
MIMIC-CXR | Pneumonia and 13 others | X-Ray | Total – 377,110 18,434 – Pneumonia | [50] |
Covid19-Pneumonia-Normal Chest X-Ray Images | Pneumonia COVID-19 Normal | X-Ray | Total – 5,228 1,800—Pneumonia 1,626—COVID-19 1,802 – Normal | [51] |
VinDr-CXR | Pneumonia and 27 others | X-Ray | Total—18,000 715—Pneumonia | [52] |
COVID-QU-Ex | Viral/Bacterial Pneumonia COVID-19 Normal | X-Ray | Total—33,920 11,263—Viral or Bacterial Pneumonia 11,956—COVID-19 10,701—Normal | [53] |
Covid19 Detection | Pneumonia COVID-19 Normal Fibrosis Tuberculosis | X-Ray | Total – 24,867 4,265—Pneumonia 3,616—COVID-19 11,800—Normal 1,686 – Fibrosis 3,500—Tuberculosis | [54] |
Chest X-ray (Covid-19 & Pneumonia) | Pneumonia COVID-19 Normal | X-Ray | Total – 6,432 4,273—Pneumonia 576—COVID-19 1,583—Normal | [55] |
Lung cancer
Dataset Name | Lung Cancer Types | Modality | Number of Images | Reference |
---|---|---|---|---|
NIH Chest X-rays (ChestX-ray8) | Lung Nodule And 7 others | X-Ray | Total – 108,948 1,971—Lung Nodule 84,312 – No Findings | [44] |
NIH Chest X-rays (ChestX-ray14) | Lung Nodule And 13 others | X-Ray | Total – 112,120 6,323—Lung Nodule 60,412 – No Findings | [46] |
VinDr-CXR | Lung nodule and 27 others | X-Ray | Total – 18,000 586—Lung Nodule | [52] |
SPIE-AAPM Lung CT Challenge | Benign and Malignant Lung Nodules | CT scan | Total – 22,489 37 – Benign Nodule 36—Malignant Nodule | [64] |
Development of a Digital Image Database for Chest Radiographs with and Without a Lung Nodule (JSRT) | Lung Nodule Normal | X-Ray | Total—247 100—Malignant Nodules 54—Benign Nodules 93—Without Nodule | [56] |
National Lung Screening Trial (NLST) | Lung Cancer | CT scan | Total—75,000 With and Without Nodule – 28,000 | [65] |
NSCLC-Radiomics | NSCLC | CT scan | Total—52,073 | [66] |
Cancer Moonshot Biobank—Lung Cancer Collection (CMB-LCA) | Lung Cancer | CT scan | Total—20,918 | [67] |
CT Ventilation as a functional imaging modality for lung cancer radiotherapy (CT-vs-PET-Ventilation-Imaging) | Lung Cancer | 4D CT scan & PET | Total—29,491 | [68] |
Lung-PET-CT-Dx | Lung Cancer | CT scan & PET | Total—251,135 | [69] |
QIN LUNG CT | NSCLC | CT scan | Total—3,954 | [70] |
4D-Lung | NSCLC | CT scan – 4D fan beam 4D cone beam | Total – 347,330 | [71] |
RIDER Lung CT | NSCLC | CT scan | Total—15,419 | [72] |
RIDER Lung PET-CT | Lung Cancer | CT scan & PET | Total – 269,511 | [73] |
COVID-19
Dataset Name | COVID-19 Types | Modality | Number of Images | Reference |
---|---|---|---|---|
Curated Dataset for COVID-19 Posterior-Anterior Chest Radiography Images (X-Rays) | Bacterial Pneumonia Viral Pneumonia COVID-19 Normal | X-Ray | Total – 9,208 3,001—Bacterial Pneumonia 1,656—Viral Pneumonia 1,281—COVID-19 3,270—Normal | [47] |
Balanced Augmented COVID CXR Dataset | Viral Pneumonia Lung Opacity COVID-19 Normal | X-Ray | Total – 21,165 1,345—Viral Pneumonia 6,012—Lung Opacity 3,616 – COVID-19 10,192—Normal | [48] |
COVID-QU-Ex | COVID-19 Viral or Bacterial Pneumonia Normal | X-Ray | Total—33,920 11,956—COVID-19 11,263—Non-COVID infections 10,701—Normal | [53] |
Covid19 Detection | COVID-19 Pneumonia Normal Fibrosis Tuberculosis | X-Ray | Total – 24,867 3,616—COVID-19 4,265—Pneumonia 11,800—Normal 1,686 – Fibrosis 3,500—Tuberculosis | [54] |
Chest X-ray (Covid-19 & Pneumonia) | COVID-19 Pneumonia Normal | X-Ray | Total – 6,432 576—COVID-19 4,273—Pneumonia 1,583—Normal | [55] |
COVID-19-NY-SBU | COVID-19 | CT & X-Ray | Total—562,376 | [57] |
CT Images in COVID-19 | COVID-19 | CT scan | Total—771 | [74] |
MIDRC-RICORD-1a | COVID-19 | CT scan | Total—31,856 | [75] |
MIDRC-RICORD-1b | COVID-19 | CT scan | Total—21,220 | [76] |
MIDRC-RICORD-1c | COVID-19 | X-Ray | Total—1,257 | [58] |
COVID-19-AR | COVID-19 | CT & X-Ray | Total—31,935 | [59] |
SARS-COV-2 Ct-Scan Dataset | COVID-19 | CT scan | Total – 2,482 1,252 – Positive | [77] |
COVID-XRay-5K Dataset | COVID-19 | X-Ray | Total—5,000 | [60] |
COVID-CT | COVID-19 | CT scan | Total—349 | [78] |
Machine learning in pneumonia detection
Author/Ref. | Imaging Modality | Image Dataset Samples (with Classified Diseases) | ML Method | Performance Metrics(%) |
---|---|---|---|---|
Szepesi et al. [140] | X-Ray | 4,273 – Pneumonia 1,583 – Normal 5,856 – Total Labeled Images | CNN + Modified Dropout | Accuracy—97.2 Recall – 97.30 Precision – 97.40 F1 Score – 97.40 AUC – 0.982 |
Avola et al. [141] | X-Ray | 2,780 – Bacterial Pneumonia 1,493 – Viral Pneumonia 474 – COVID-19 1,583 – Normal 6,330 – Total | AlexNet, MnasNet, MobileNetv2, MobileNet v3, DenseNet, GoogleNet, ResNet50, ResNeXt, SqueezeNet, Wide ResNet50, VGG16, and ShuffleNet | Average F1 Score – 84.46 |
Liu et al. [142] | X-Ray | Dataset 1: 2,777 – Bacterial Pneumonia 2,838 – Viral Pneumonia 3,674 – COVID-19 11,768 – Normal 21,057 – Total Dataset 2: 2,777 – Bacterial Pneumonia 2,838 – Viral Pneumonia 3,665 – COVID-19 3,251 – Normal 12,531 – Total | Multi-Branch Fusion Auxiliary Learning (MBFAL): Auxiliary Learning method, and Prior-Attention Residual Learning (PARL) Architecture | MBFAL Average: Accuracy – 95.61 |
Srivastava et al. [143] | X-Ray | 1,656—Viral Pneumonia 1,281—COVID-19 3,270—Normal 6,207 – Total | Ensemble Model: Ensemble DNN classifiers’ score based on Condorcet’s Jury Theorem (CJT) And Domain Extended Transfer Learning (DETL) | CJT - Accuracy – 98.22 Sensitivity – 98.37 Specificity – 99.79 DETL - Accuracy – 97.26 Sensitivity – 98.37 Specificity – 100 |
Qu et al. [144] | Infrared Thermal Images + RGB images | Number of Subjects: 30—Normal 28 – Pneumonia 58—Total | SVM KNN Decision Tree Gaussian Naïve Bayes classifier LDA, QDA | Binary Classification: Accuracy – 93.00 |
Singh et al. [145] | X-Ray | 1,345—Viral Pneumonia 371—COVID-19 1,341—Normal 3,057—Total | Hybrid Social Group Optimization algorithm + Support Vector Classifier | Accuracy—99.65 |
Chowdhury et al. [146] | X-Ray | 423—COVID-19 Pneumonia 1,485—Viral Pneumonia 1,579 – Normal 3,487—Total | Three Shallow Networks: MobileNetv2, SqueezeNet, and ResNet18 Five Deep Networks: Inceptionv3, ResNet101, CheXNet, VGG19, and DenseNet201 | Binary Classification (Normal, Pneumonia) - Accuracy—99.70 Sensitivity – 99.70 Precision – 99.70 Specificity – 99.55 Multi Classification – Accuracy—97.90 Sensitivity – 97.95 Precision – 97.90 Specificity – 98.80 |
Wong et al. [147] | CT Scan (2D/3D) | 4,017—Viral Pneumonia 7,766—Bacterial Pneumonia 3,443—Mycoplasma Pneumonia 10,687—COVID-19 11,666 – Normal 37,579—Total | CNN: Multi-Scale Attention Network (MSANet) | Accuracy—97.46 Recall – 96.18 Precision – 97.31 F1 Score – 96.71 Macro-Average AUC—0.9981 |
Ukwuoma et al. [148] | X-Ray | Binary Classification (Mendeley Dataset) – 4,290—Viral Pneumonia 3,834 – Normal 8,124 – Total Multi Classification (Chest X-ray Dataset) - 5,000—Viral Pneumonia 5,000—Bacterial Pneumonia 5,000 – Normal 15,000—Total | Ensembled CNN + Transformer Encoder Method Ensemble A (DenseNet201, VGG16, GoogleNet) Ensemble B (DenseNet201, InceptionResNetV2, Xception) | Binary Classification (Normal, Pneumonia) - Accuracy – 99.21 F1 Score – 99.21 Multi Classification Accuracy – 98.19 F1 Score – 97.29 Ensemble Binary Class Ensemble A - Accuracy – 97.22 F1 Score – 97.14 Ensemble B - Accuracy – 96.44 F1 Score – 96.44 Ensemble Multi-Class Ensemble A - Accuracy – 97.20 F1 Score – 95.80 Ensemble B - Accuracy – 96.40 F1 Score – 94.90 |
Kusk et al. [149] | X-Ray | 4,273—Viral and Bacterial Pneumonia 1,583 – Normal 5,856 – Total | CNN + Gaussian noise (Five Gaussian Noise Levels) | Accuracy – (96.80—97.60) Sensitivity – (96.90—98.20) Specificity – (94.40—98.70) |
Li & Li [150] | X-Ray | 2,530 – Bacterial Pneumonia 1,345 – Viral Pneumonia 797 – COVID-19 5,510—Healthy 10,182 – Total | 17 CNNs (AlexNet, GoogleNet, Vgg16, ResNet18, SqueezeNet, MobileNetv2, Inceptionv3, DenseNet201, Xception, Vgg19, Places365GoogleNet, InceptionResNetv2, ResNet50, ResNet101, NASNetMobile, NASNetLarge, ShuffleNet) | Distinguishing Covid-19 Pneumonia from Bacterial Pneumonia - (Accuracy – 99.85) Normal Lung Images (Accuracy – 100) Viral Covid-19 Pneumonia (Accuracy – 99.95) |
Bhandari et al. [151] | X-Ray | 4,273 – Pneumonia 576—COVID-19 700 – TB 1,583 – Normal 7,132 – Total | CNN + XAI + Grad-CAM, Local Interpretable Model-agnostic Explanations (LIME), and SHapley Additive exPlanations (SHAP) | Overall Accuracy – 95.94 Average - Specificity – 95.71 ± 1.55 Sensitivity – 95.50 ± 1.72 F1 Score – 96.53 ± 0.95 |
Khaniabadi et al. [152] | CT Scan | 100 – Pneumonia 100 – COVID-19 100—Healthy 300 – Total | ML Algorithms: SVM, KNN, Decision Tree, Naïve Bayes, Bagging, Random Forest, and Ensemble Meta voting | Random Forest and Ensemble Meta voting - Accuracy(RF) – 0.94 ± 0.031 Accuracy(EM) – 0.92 ± 0.034 Sensitivity(RF) – 0.90 ± 0.056 Sensitivity(EM)—0.90 ± 0.078 Specificity(RF) – 0.95 ± 0.020 Specificity(EM) – 0.95 ± 0.010 AUC(RF)—0.98 ± 0.010 AUC(EM)—0.92 ± 0.043 |
Ascencio-Cabral et al. [153] | CT Scan | 2,946—Community Acquired Pneumonia 7,593 – COVID-19 6,893 – Non-COVID-19 17,432 – Total | Transfer Learning: ResNet-50, ResNet-50r, DenseNet-121, MobileNet-v3, and CaiT-24-XXS-224 (CaiT) Transformer | ResNet-50’s – Accuracy – 98.00 Balanced Accuracy – 98.00 F1 Score – 98.00 F2 Score – 98.00 MCC – 98.00 Sensitivity – 98.00 Specificity – 98.00 |
Machine learning in lung cancer detection
Author/Ref. | Imaging Modality | Image Dataset Samples (with Classified Diseases) | ML Method | Performance Metrics(%) |
---|---|---|---|---|
Sekeroglu et al. [154] | CT Scan | LIDC/IDRI – 100—Annotated Nodules 604 – Total Nodules & non-nodules (diameter ≥ 3 mm) | Multi-Perspective Hierarchical Deep Fusion Learning Approach | Accuracy – 91.20 Specificity – 87.00 Sensitivity – 95.00 False Positive/scan—0.4 |
Donga et al. [155] | CT Scan | LIDC/IDRI – 1018—Total | Modified Gradient Boosting Algorithm | Accuracy – 95.67 Precision – 95.70 Recall – 91.00 F1 Score – 94.10 |
Khehrah et al. [156] | CT Scan | LIDC – (~ 250–350)—Nodule’s Images of 70 lung Scans | Otsu method + SVM | Accuracy—92.00 Sensitivity – 93.75 Specificity – 91.18 Precision – 85.19 FPI – 0.13 FPE – 0.22 MCC – 0.8385 |
Ausawalaithong et al. [157] | X-Ray | JSRT – 100 – Malignant ( +) 147 – Benign and Normal (-) 247—Total ChestX-ray14 - 6,282 – Positive ( +) 105,197 – Negative (-) 111,479—Total | Transfer Learning - Base Model – Densenet-121 Retrained Model – A (On ChestX-ray14) Retrained Model – B (On JSRT) Retrained Model – C (On ChestX-ray14 + JSRT) | Retrained Model—C Mean - Accuracy—74.43 ± 6.01 Specificity—74.96 ± 9.85 Sensitivity – 74.68 ± 15.33 |
Chen et al. [158] | CT Scan | 10,000—Total | Manual SegNet Deeplab v3 VGG 19 | Accuracy – 92.50 Sensitivity—98.33 Specificity – 86.67 Overlap Rate-95.11 |
Nanglia et al. [159] | Low-Dose CT Scan (LDCT) | 500—Total | Feature Extraction – SURF + Genetic Algorithm Classification -SVM + Feed Forward Back Propagation Neural Network | Overall Accuracy – 98.08 Precision—98.17 Recall—96.50 F-measure – 97.00 |
Alshmrani et al. [160] | X-ray | 20,000 – Lung Cancer 3,615 – COVID-19 5,856 – Pneumonia 6,012—Lung opacity 1,400 – Tuberculosis 10,192—Normal 80,000—Total | VGG19 + 3 Blocks of CNN | Accuracy – 96.48 Precision – 97.56 Recall – 93.75 F1 Score – 95.62 AUC – 99.82 |
Heuvelmans et al. [161] | CT Scan | NLST - 205—Malignant 2,106 – Total Lung Nodules | Lung Cancer Prediction CNN (LCP-CNN) | Sensitivity – 99.00 AUC—94.50 |
Rahouma et al. [162] | CT Scan | 30 – NSCLC 20 – Benign 50 – Total Lung Nodules | Polynomial Neural Network (PNN) | Accuracy—96.66 Sensitivity – 95.00 |
Bilal et al. [163] | X-ray | 250 – Normal 320 – Benign 320 – Malignant 910 – Total | VGGNet, ResNet, GoogLeNet, AlexNet, InceptionNet-V3 + Improved Gray Wolf Optimization and InceptionNet-V3 | Accuracy – 98.96 Sensitivity—100.00 Specificity – 94.74 |
Torres et al. [164] | CT Scan | 09—Benign 51—Malignant 60 – Total Lung Nodules | Nodule Extraction – Otsu thresholding and morphological operations + GLCM + t-test Classification—Feed-Forward Neural Network | Accuracy – 96.30 Sensitivity—100.00 Specificity – 83.00 F1 Score – 97.67 AUC – 94.00 |
Hussain et al. [165] | MRI | 377 – NSCLC 568 – SCLC 945 – Total Lung Nodules | (I) Texture features using SVM polynomial (II) Image Adjustment using SVM RBF and Polynomial (III) Contrast stretching at threshold of (0.02, 0.98) using SVM RBF and Polynomial (IV) Gamma Correction at gamma value 0.9 | (I) Sensitivity = 100 Specificity = 99.72 Accuracy = 99.89 (II), (III), and (IV) - Sensitivity = 100 Specificity = 100 Accuracy = 100 |
Kuo et al. [166] | CT Scan | 273 – GGO 120 – Part Solid 274 – Solid 667 – Total Lung Nodules | Preprocessing – Adaptive Wiener filter Lung Segmentation—Fast Otsu & Edge Search Method Nodule Enhancement—Gray Level Adjustment Candidate Detection- Fast Otsu Method Classification—SVM | Total Sensitivity—92.05 Small Nodules (5 mm–9 mm) - Sensitivity—93.73 GGO – Sensitivity—93.02 |
Singh et al. [167] | CT Scan | 6,910 – Benign 8,840 – Malignant 15,750—Total Lung Nodules | Feature Extraction – GLCM + Statistical Method Classification -KNN, SVM, DT, RF, MLP, Naïve Bayes, Gradient Descent | Accuracy—88.55 Sensitivity – 89.84 Precision – 86.59 F1 Score – 87.35 |
Machine learning in COVID-19 detection
Author/Ref. | Imaging Modality | Image Dataset Samples (with Classified Diseases) | ML Method | Performance Metrics(%) |
---|---|---|---|---|
Wang et al. [168] | X-Ray | COVIDx: 13,975 – COVID-19 + | COVIDNet: Machine Driven Design Exploration: Projection-Expansion-Projection-Extension (PEPX) Architecture | Accuracy—93.30 Sensitivity – 91.00 Positive Predictive Value – 98.90 |
Keles et al. [169] | X-Ray | 210—COVID-19 + 350—Viral Pneumonia 350—Normal 910—Total | COV19-CNNet: Feature Engineering—7 convolutional layers Classification—4 Dense Layer | Accuracy—94.28 Specificity—96.94 Sensitivity—94.33 F1-score—94.20 |
COV19-ResNet: (Based on ResNet) | Accuracy—97.61 Specificity – 98.72 Sensitivity – 97.61 F1-score – 97.62 | |||
Ohata et al. [170] | X-Ray | Dataset-A: 194—COVID-19 + 194 – Healthy 388—Total | Transfer Learning with MobileNet + Linear SVM | Accuracy—98.46 F1-score—98.46 FPR – 1.026 |
Dataset-B: 194—COVID-19 + 194—Healthy 388—Total | Densenet201 + MLP | Accuracy—95.64 F1-score—95.63 FPR – 4.103 | ||
Singh et al. [171] | X-Ray | Dataset-A: 573—COVID-19 + 573—Normal 573 – Pneumonia 1,719—Total Dataset-B: 1,519—COVID-19 + 1,519—Normal 1,519—Pneumonia 4,557—Total Dataset-C: 573—COVID-19 + 1,600—Normal 1,600 – Pneumonia 3,773—Total | COVIDScreen (Pruned Ensemble Learning framework): Base Learners – VGG-19, VGG-16, DenseNet-121, DenseNet-169, ResNet-50 Meta learner – Naïve Bayes + GAN | Accuracy—98.67 Precision – 100.00 Recall – 100.00 F1-score – 100.00 Kappa score—0.98 |
Iqbal et al. [172] | X-Ray | Dataset-1: 284—COVID-19 + 310—Normal 330—Pneumonia Bacterial 327—Pneumonia Viral 1,251—Total Dataset-2: 157—COVID-19 + 500—Normal, 500—Pneumonia, 1,157—Total | CoroNet: Xception (An Extreme Version of Inception Model – 71 Layer), Flatten, Dropout, Dense | CoroNet on Dataset-1: Average - Precision- 93.17 Recall—98.25 Specificity – 97.90 F1-Score—95.61 Accuracy 4 class—89.60 Accuracy 3 class – 95.00 Accuracy 2 class—99.00 CoroNet on Dataset-2: Overall Accuracy- 90.21 Precision – 97.00 Recall – 89.00 Specificity—99.6 F-measure – 93.00 Overall 3 and 4 Class CoroNet: Accuracy-89.60 |
Madaan et al. [173] | X-Ray (Frontal Postero- anterior) | Dataset-1: 196—COVID-19 + Dataset-2: 1,583—COVID-19- | XCOVNet: Convolution (First – 32, Second – 64, Third—128) + ReLu + Adam Optimizer | Accuracy—98.44 |
Das et al. [174] | X-Ray (Frontal) | Generated: 538—Class 0 (COVID-19 +) 468—Class 1 (COVID-19-) 1,006—Total | Ensemble method: Combination of InceptionV3, Resnet50V2 and Densenet201 | Accuracy- 91.62 Sensitivity– 95.09 Specificity—88.33 F1-score—91.71 AUC—91.71 |
Hussain et al. [175] | X-Ray | COVID-R: 2,843—COVID19 + 3,108—Normal 1,439 – Pneumonia (Viral + Bacterial) 7,390—Total | CoroDet model (22 layers): 9 Conv2d layers, 9 MaxPool2d layers, 1 Flatten, 2 Dense, 1 LeakyReLU | 2-class classification: Accuracy—99.12 3-class classification: Accuracy—94.20 4-class classification: Accuracy—91.20 |
Rahman et al. [176] | X-Ray | COVQU: 3,616—COVID19 + 8,851—Normal 6,012 – Non-COVID Total – 18,479 CXR | Lung segmentation: Modified U-net Classification: 7 Deep CNN model (ResNet18, ResNet50, ResNet101, InceptionV3, DenseNet201, and ChexNet and a shallow CNN model) | Lung segmentation: Accuracy—98.63 Dice Score – 96.94 Classification: Accuracy—96.29 Sensitivity- 97.28 F1-score—96.28 |
Narin et al. [177] | X-Ray | Dataset-1: 341—COVID-19 + 2,800—Normal 3,141—Total Dataset-2: 341—COVID-19 + 1,493—Viral pneumonia 1,834- Total Dataset-3: 341- COVID-19 + 2,772 – Bacterial pneumonia 3,113—Total | InceptionV3, ResNet50, ResNet101, ResNet152, Inception-ResNetV2 | Binary Classification: Accuracy: Dataset-1: COVID-19—96.10 Dataset-2: COVID-19—99.50 Dataset-3: COVID-19—99.70 |
Gaffari Celik [178] | CT scan & X-Ray | CT scan images: 1,601– COVID-19 + 1,693 – Normal 3,294 – Total X-Ray images: 3,616 – COVID-19 + 10,192 – Normal 6,012—Lung Opacity 1,345—Viral pneumonia 21,165 – Total | CovidDWNet: Feature Reuse Residual Block and Depth-wise Dilated Convolutions + Gradient Boosting Architecture | Binary Class: (CT Images) Accuracy – 100.00 (Application 1) Accuracy – 99.84 (Application 2) Multi-Class: (X-Rays) Accuracy – 96.81 (Application 3) Multi-Class (CT and X-Rays) Accuracy – 96.32 (Application 4) |
Gozes et al. [179] | CT scan | 829—COVID-19 + 1,036—COVID-19- 1,865—Total | Lung Segmentation: Proposed U-net with VGG-16 base encoder Classifier: ResNet-50 | AUC – 94.80 (95% CI: 0.912–0.985) |
Ahuja et al. [180] | CT scan | 349—COVID19 + 397 – NonCOVID19 746—Total | Augmentation: Rotation + Translation + Shearing + SWT Transfer Learning: SqueezeNet, ResNet18, ResNet50, ResNet101 | Binary Class: ResNet18 Accuracy—99.40 Sensitivity- 100.00 Specificity – 98.60 F1-score – 99.50 NPV – 100.00 |
Silva et al. [181] | CT scan | SARS-CoV-2 CT scan: 1,252—COVID19 + 1,230 – NonCOVID19 2,482—Total COVID-CT: 349—COVID19 + 463 – NonCOVID19 812—Total | EfficientCovidNet: Transfer Learning - Base Learner—EfficientNet B0 Architecture | Accuracy—98.99 Sensitivity – 98.80 Positive Prediction – 99.20 |
Methodical exploration
-
Image Dataset Availability: Since there is a need for imaging samples and datasets available, it might be challenging to acquire all of the information necessary to diagnose lung illness accurately.
-
Imbalanced Datasets: Imbalance in the dataset can lead to inaccurate diagnosis, as DL solutions may overfit the majority or minority classes and fail to classify accurately.
-
Quality of Images: Low-resolution or poor-quality images can yield inaccurate results when using ML solutions for lung disease diagnosis.
-
Unreliable data: ML models rely highly on high-quality, consistent data, which can be hard to come by. Poor quality, incomplete, or inconsistent data can lead to an incorrect diagnosis.
-
Bias in data: Healthcare providers must recognize that bias may exist in the data they provide to train the ML models, and they must ensure that these biases are corrected to prevent any false positives or misdiagnoses.
-
Uncontrolled data sources: The image dataset used for ML models may come from multiple sources, which may be difficult to control for quality and accuracy.
-
Limited flexibility: ML models have limited flexibility because they depend heavily on their training data. Performance may suffer when images from new contexts, such as different scanners, acquisition protocols, or patient populations, are introduced into the diagnostic process.
-
Overfitting: Overfitting occurs when an ML model is too complex and captures patterns that may not generalize, leading to inaccurate predictions on unseen data. It can lead to erroneous diagnoses when ML models are trained and tested on limited datasets.
-
Lack of Interpretability: Because many ML models behave as black boxes, it is difficult to explain why a particular prediction was made. This makes the results harder to trust and can raise ethical concerns.
-
Computational cost: Training an ML model is computationally expensive, requiring significant computing power and time depending on the model's complexity and the dataset used for training. These costs can be too high for systems that cannot afford or do not have access to the resources needed to train these models.
-
False positives or negatives: ML models can produce false negatives, in which a patient with lung disease is incorrectly classified as healthy, and false positives, in which a healthy person is incorrectly identified as having lung disease. This can happen because the training data does not accurately reflect the behavior of the disease or because of mislabeled samples in the dataset.
-
Unreliable model performance metrics: Due to the complexity and variability of features, it is hard to accurately assess or measure how well an ML model works when diagnosing a disease.
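Several of the concerns above, imbalanced datasets in particular, have simple first-line remedies. The sketch below, assuming a hypothetical label distribution, computes inverse-frequency class weights that can be passed to a weighted loss function so that minority-class errors count more during training.

```python
# A minimal sketch of one common remedy for imbalanced datasets:
# inverse-frequency class weights. The label list is a made-up placeholder.
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical dataset: 900 "normal" scans vs. 100 "pneumonia" scans.
labels = ["normal"] * 900 + ["pneumonia"] * 100
print(inverse_frequency_weights(labels))
# the minority class receives a proportionally larger weight
```

Oversampling the minority class or augmenting its images serves the same purpose; the right choice depends on the dataset and model.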
Observed concerns about imaging modalities
Pneumonia
Lung cancer
COVID-19
Observed concerns about datasets
Observed concerns about ML
Lung Disease | Imaging Modality | ML/Sub-domains | Article Investigated
---|---|---|---
Pneumonia | X-Ray | Conventional ML | [145]
 | | DL/CNN |
 | | Ensemble Methods |
 | | Transfer Learning |
 | CT scan | Conventional ML | [152]
 | | DL/CNN |
 | | Ensemble Methods | [152]
 | | Transfer Learning | [153]
 | Infrared Thermal | Conventional ML | [144]
Lung Cancer | X-Ray | Conventional ML | X
 | | DL/CNN |
 | | Ensemble Methods | X
 | | Transfer Learning | [157]
 | CT scan | Conventional ML |
 | | DL/CNN |
 | | Ensemble Methods | [155]
 | | Transfer Learning | X
 | MRI | Conventional ML | [165]
COVID-19 | X-Ray | Conventional ML |
 | | DL/CNN |
 | | Ensemble Methods |
 | | Transfer Learning |
 | CT scan | Conventional ML | [178]
 | | DL/CNN |
 | | Ensemble Methods | X
 | | Transfer Learning |
ML pathway
-
Image Acquisition: Researchers amassed large and varied collections of chest X-rays, CT scans, and other imaging modalities associated with particular lung diseases [6‐9]. These images have mostly been labeled for identification purposes. Most researchers preferred publicly accessible datasets over private datasets [42‐55, 63, 137, 138].
-
Image Preprocessing: Researchers preprocessed the image dataset to reduce noise and outliers and to normalize the data for better results. Significant preprocessing operations were carried out, such as attribute selection and modification, imputation of missing values, feature normalization, and noise elimination. The images were also preprocessed to reduce their dimensionality and converted into numerical data, pixel by pixel, for input into the ML model. Once preprocessing is complete, the dataset is generally split into training and test sets so that each portion adequately represents the relevant cases [19, 140‐167].
-
Training of the ML Model: In supervised learning, researchers trained the ML model on labeled datasets with known outcomes to detect patterns associated with the specified disease class. In unsupervised learning, the model can instead discover patterns and identify the disease from unlabeled data. Researchers chose an appropriate model and algorithm to learn from the input dataset. With CNNs, they trained the model on the processed data using different learning rates, weight initializations, and architectures to find the best performance [121‐125, 128].
-
Performance Metrics: Researchers evaluated the trained ML model on unseen data samples using metrics such as accuracy, recall, precision, and F1-score, which measure how well the model generalizes beyond the training data. In DL and CNN work, accuracy and other metrics such as sensitivity and specificity are monitored after each training epoch to ensure that all parameters are fine-tuned and that training ends with an acceptable performance score, attaining the desired precision and recall [140‐181].
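The four pathway stages above can be sketched end to end. To remain self-contained, the sketch below uses tiny synthetic pixel vectors in place of real radiographic images and a nearest-centroid classifier in place of a CNN; every name and number in it is illustrative only.

```python
# Illustrative acquisition -> preprocessing -> training -> evaluation pathway.
# Synthetic data and a nearest-centroid model stand in for real images and CNNs.
import random

random.seed(0)

def make_image(mean):
    """Synthetic 4-pixel 'image' drawn around a class-specific intensity."""
    return [random.gauss(mean, 0.5) for _ in range(4)]

# 1. Acquisition: labeled samples for two hypothetical classes.
data = [(make_image(0.0), "normal") for _ in range(50)] + \
       [(make_image(2.0), "disease") for _ in range(50)]

# 2. Preprocessing: shuffle, then split into training and test portions.
random.shuffle(data)
train, test = data[:80], data[80:]

# 3. Training: compute a per-class mean pixel vector (nearest-centroid model).
def fit(samples):
    sums, counts = {}, {}
    for x, y in samples:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, x):
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda y: dist(centroids[y], x))

centroids = fit(train)

# 4. Evaluation: accuracy on the held-out test portion.
accuracy = sum(predict(centroids, x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

In the studies surveyed, the same skeleton applies, with the synthetic vectors replaced by preprocessed X-ray or CT pixel arrays, the nearest-centroid model replaced by a CNN or transfer-learning architecture, and the single accuracy score supplemented by sensitivity, specificity, and F1.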