Lung Opacity Detection

Hashwanth Gogineni

Model Overview

Ground Glass Opacity

The hazy grey regions that can be seen in CT scans or X-rays of the lungs are known as ground-glass opacity (GGO). These grey areas indicate increased density inside the lungs. The term comes from a glassmaking technique in which the surface of the glass is blasted with sand, which leaves the glass looking hazy white or frosted. GGO can be caused by a variety of factors, including infections, inflammation, and growths. GGO was also the most common abnormality among people with COVID-19-related pneumonia, according to a 2020 analysis.

There are several types of GGO. These include:

Diffuse: Diffuse opacities appear in several lobes of one or both lungs. When the air in the lungs is replaced by fluid, inflammation, or damaged tissue, this pattern develops.

Nodular: This type can indicate both benign and malignant conditions. GGO that persists across several scans could suggest premalignant or malignant growths.

Centrilobular: This type of GGO appears within one or more lung lobules, the hexagonal divisions of the lung. The connective tissue between the lobules is not affected.

Mosaic: When small arteries or airways within the lungs become obstructed, this pattern emerges. The intensity of the opaque patches varies.

Crazy paving: Crazy paving appears as a linear pattern. It can occur when the spaces between the lobules widen.

Halo sign: This type of opacity fills the area surrounding nodules.

Reversed halo sign: A reversed halo sign is a region that is almost completely encircled by liquid-filled tissue.

Why Lung Opacity Detection?

The project can help healthcare organizations detect lung opacity in patients' chest X-rays and take the necessary measures.

Dataset

The dataset includes 12,024 chest X-ray images, split evenly between healthy patients and patients with lung opacity (6,012 images in each class).

VGG19

In layman's terms, VGG is a deep CNN that is used to classify images. The VGG19 model has the following layers:

The network takes a fixed-size 224 x 224 RGB image as input, so the input tensor has shape (224, 224, 3).

The only preprocessing was to subtract the mean RGB value from each pixel, which was computed throughout the entire training set.

The convolutional layers use 3 x 3 kernels with a stride of 1 pixel, so the filters cover every position of the image.

To keep the image's spatial resolution, spatial padding was applied.

Max pooling was done using stride 2 over 2 * 2 pixel windows.

Each convolution is followed by a rectified linear unit (ReLU), which introduces non-linearity. ReLU improves classification accuracy and shortens computation time compared with the tanh or sigmoid functions used in earlier models.

Three fully connected layers were implemented, the first two of which were of size 4096, followed by a layer with 1000 channels for 1000-way ILSVRC classification, and finally a softmax function.
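The layer rules above determine how the feature map shrinks as it passes through the network. The following is a small sketch of that arithmetic, assuming the standard VGG19 layout of five convolutional blocks (2, 2, 4, 4, and 4 convolutions with 64, 128, 256, 512, and 512 filters), each followed by one max-pooling layer. The function names are illustrative and not part of any library.

```python
# VGG19 convolutional layout: (number of 3x3 convolutions, filters) per block.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def conv_out(size, kernel=3, stride=1, padding=1):
    """Output width of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Output width of max pooling (no padding) along one spatial axis."""
    return (size - window) // stride + 1


def vgg19_feature_shape(size):
    """Return (height, width, channels) after the last pooling layer."""
    channels = 3
    for n_convs, filters in VGG19_BLOCKS:
        for _ in range(n_convs):
            size = conv_out(size)   # 3x3, stride 1, padding 1 keeps the size
        channels = filters
        size = pool_out(size)       # 2x2 window, stride 2 halves the size
    return size, size, channels


print(vgg19_feature_shape(224))  # (7, 7, 512)
```

The sixteen convolutional layers counted here, together with the three fully connected layers, give the 19 weight layers in the model's name.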

Understanding Code

First, let us import the required libraries for our project.

import os

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

from pathlib import Path

from tensorflow.keras.preprocessing import image_dataset_from_directory  

import tensorflow as tf

import cv2

# Import all Keras classes from tensorflow.keras so that layers and models come from the same implementation

from tensorflow.keras.layers import Input, Lambda, Dense, Flatten, GlobalAveragePooling2D, Dropout, Activation

from tensorflow.keras.models import Model, Sequential

from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.metrics import classification_report, log_loss, accuracy_score

Now, let us load the data into our system and convert the data into a dataframe.

image_dir = Path('/content/sample_data/data')



# Get filepaths and labels

filepaths = list(image_dir.glob(r'**/*.png'))

labels = list(map(lambda x: os.path.split(os.path.split(x)[0])[1], filepaths))



filepaths = pd.Series(filepaths, name='Filepaths').astype(str)

labels = pd.Series(labels, name='Labels')



# Concatenate filepaths and labels

image_df = pd.concat([filepaths, labels], axis=1)



# Shuffle the DataFrame and reset index

image_df = image_df.sample(frac=1).reset_index(drop = True)

As you can see, we collected the image file paths from the data directory, took each label from the name of the image's parent folder, and concatenated 'filepaths' and 'labels' into a dataframe.
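To make the label extraction concrete, here is a minimal sketch using only the standard library. The file paths and folder names ('Lung_Opacity', 'Normal') are hypothetical, since the actual folder names of the dataset are not shown above.

```python
import os
from pathlib import Path

# Hypothetical file paths; the real folder names depend on the dataset.
example_paths = [
    Path("data/Lung_Opacity/img_001.png"),
    Path("data/Normal/img_002.png"),
]

# Approach used in the article: strip the file name, then take the last
# component of what remains.
labels_os = [os.path.split(os.path.split(p)[0])[1] for p in example_paths]

# Equivalent pathlib form: the name of the parent directory.
labels_pathlib = [p.parent.name for p in example_paths]

print(labels_os)       # ['Lung_Opacity', 'Normal']
print(labels_pathlib)  # ['Lung_Opacity', 'Normal']
```

Both forms return the name of the folder that directly contains the image, so the approach relies on each class having its own folder.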

Let us also split the dataframe for testing and training purposes.

# Separating train and test data

from sklearn.model_selection import train_test_split



train_df, test_df = train_test_split(image_df, train_size=0.85, shuffle=True, random_state=1)

As you can see, I used the 'train_test_split' function to split the dataframe, keeping 85% of the rows for training and the remaining 15% for testing.

train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.15)



test_datagen = ImageDataGenerator(rescale=1./255)

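As you can see, I used 'ImageDataGenerator' to rescale the pixel values to the range [0, 1] and to hold out 15% of the training data for validation. No augmentation transforms are applied here.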
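The two splits can be combined to estimate how many images end up in each subset. This is only a rough sketch: it assumes all 12,024 images are found by the file search and that each fraction is rounded down, so the libraries' own rounding may differ by an image.

```python
import math

n_images = 12024          # dataset size stated above
train_fraction = 0.85     # train_size passed to train_test_split
val_fraction = 0.15       # validation_split passed to ImageDataGenerator

# Assumption: the training share is rounded down and the rest goes to test.
n_train_total = math.floor(train_fraction * n_images)
n_test = n_images - n_train_total

# Assumption: the validation share of the training rows is also rounded down.
n_val = math.floor(val_fraction * n_train_total)
n_train = n_train_total - n_val

print(n_train, n_val, n_test)
```

Under these assumptions, roughly 72% of the images are used for training, 13% for validation, and 15% for testing.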

train_images = train_datagen.flow_from_dataframe(

    dataframe=train_df,

    x_col='Filepaths',

    y_col='Labels',

    target_size=(250, 250),

    color_mode='rgb',

    class_mode='categorical',

    batch_size=64,

    shuffle=True,

    seed=42,

    subset='training'

)



val_images = train_datagen.flow_from_dataframe(

    dataframe=train_df,

    x_col='Filepaths',

    y_col='Labels',

    target_size=(250, 250),

    color_mode='rgb',

    class_mode='categorical',

    batch_size=64,

    shuffle=True,

    seed=42,

    subset='validation'

)



test_images = test_datagen.flow_from_dataframe(

    dataframe=test_df,

    x_col='Filepaths',

    y_col='Labels',

    target_size=(250, 250),

    color_mode='rgb',

    class_mode='categorical',

    batch_size=32,

    shuffle=False

)

I then used the 'flow_from_dataframe' function to create generators that load the training, validation, and test images in batches.
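With class_mode='categorical', each label is delivered to the model as a one-hot vector. The sketch below illustrates the idea in plain Python. The label names are hypothetical, and the assumption that class indices follow the sorted order of the label names should be checked against 'train_images.class_indices' in the actual pipeline.

```python
# Hypothetical labels; the real names come from the dataset's folder names.
labels = ["Normal", "Lung_Opacity", "Lung_Opacity", "Normal"]

# Assumption: class indices follow the sorted order of the unique labels.
class_indices = {name: i for i, name in enumerate(sorted(set(labels)))}


def one_hot(label):
    """Encode a label as a list with a single 1.0 at its class index."""
    vector = [0.0] * len(class_indices)
    vector[class_indices[label]] = 1.0
    return vector


print(class_indices)                  # {'Lung_Opacity': 0, 'Normal': 1}
print([one_hot(name) for name in labels])
```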

Next, let us get into the modelling part of the project.

from tensorflow.keras.applications import VGG19

# input_shape must match the target_size of the image generators (250 x 250)

vgg = VGG19(include_top=False, weights='imagenet', input_shape=(250, 250, 3))

vgg.summary()

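I chose the pretrained 'VGG19' model as the base and loaded its 'imagenet' weights. Setting 'include_top=False' leaves out the original fully connected layers so that a new classifier can be added on top.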

for layer in vgg.layers:

    layer.trainable=False



x = Flatten()(vgg.output)

x = Dense(512,activation='relu')(x)

x = Dense(512,activation='relu')(x)

prediction = Dense(2,activation='softmax')(x)

model = Model(inputs=vgg.input, outputs=prediction)

model.summary()

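I froze the pretrained VGG19 layers and added a new classifier on top: a 'Flatten' layer, two fully connected layers with 512 units each, and a 2-way softmax output layer.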
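Because the VGG19 layers are frozen, only the new classifier is trained. The following sketch counts its trainable parameters, assuming the convolutional base outputs a 7 x 7 x 512 feature map (which holds for both 224 x 224 and 250 x 250 inputs). The helper 'dense_params' is illustrative, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Assumption: the frozen VGG19 base outputs a 7 x 7 x 512 feature map.
flattened = 7 * 7 * 512

head = [
    dense_params(flattened, 512),  # first Dense(512)
    dense_params(512, 512),        # second Dense(512)
    dense_params(512, 2),          # Dense(2) softmax output
]

print(head)       # [12845568, 262656, 1026]
print(sum(head))  # 13109250
```

Almost all of the trainable parameters sit in the first dense layer, because 'Flatten' produces a vector of 25,088 values.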

Now, let us compile our model and fit the data.

model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])



callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=2)



# fit_generator is deprecated in TensorFlow 2; fit accepts generators directly

history = model.fit(train_images, validation_data=val_images, epochs=25, callbacks=[callback])

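As you can see, I used 'categorical_crossentropy' as the loss function and 'accuracy' as the metric. Training stops early if the training accuracy does not improve for two consecutive epochs.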
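To show what the loss measures, here is a minimal plain-Python sketch of a softmax output combined with categorical cross-entropy for a single example. It is a simplified illustration, not the Keras implementation, and the 'eps' clipping value is an assumption made for numerical safety.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_prob))


probs = softmax([0.0, 0.0])                      # equal scores
loss = categorical_crossentropy([1.0, 0.0], probs)
print(probs, round(loss, 4))                     # [0.5, 0.5] 0.6931
```

A model that cannot separate the two classes scores about 0.69 (ln 2), so a useful model should bring the loss well below that value.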

Now let us understand how our model performed.

get_acc = history.history['accuracy']

value_acc = history.history['val_accuracy']

get_loss = history.history['loss']

validation_loss = history.history['val_loss']



epochs = range(len(get_acc))

plt.plot(epochs, get_acc, 'r', label='Accuracy of Training data')

plt.plot(epochs, value_acc, 'b', label='Accuracy of Validation data')

plt.title('Training vs validation accuracy')

plt.legend(loc=0)

plt.show()

Finally, I used 'matplotlib.pyplot' to generate our model's performance graph.

Also, let us have a look at our model's classification report.

# Classification Report



from sklearn.metrics import classification_report



test_labels=test_images.classes

predictions = model.predict(test_images, verbose=1)

y_pred = np.argmax(predictions, axis=-1)

print(classification_report(test_labels, y_pred))

In the report, '0' represents 'Lung Opacity' and '1' represents 'Healthy'.
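The report lists precision, recall, and F1 score for each class. The sketch below shows how these numbers follow from the predicted probabilities, using made-up values for illustration only. The helper functions are illustrative and are not part of scikit-learn.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 score for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up softmax outputs and true labels, for illustration only.
probabilities = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]]
y_true = [0, 0, 1, 1]
y_pred = [argmax(row) for row in probabilities]

print(y_pred)                                  # [0, 1, 0, 1]
print(precision_recall_f1(y_true, y_pred, 0))  # (0.5, 0.5, 0.5)
```

Because the test generator is created with shuffle=False, the order of the predictions matches the order of 'test_images.classes', which is what makes this comparison valid.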

Thank you for your time
