Detection Of Tuberculosis Disease Using Tensorflow Technique

Model Overview


Tuberculosis (TB) is a potentially fatal infectious disease that mainly affects the lungs. The bacteria that cause it spread from person to person through tiny droplets released into the air by coughs and sneezes. In 1985, tuberculosis infections began to rise in affluent countries, mainly because of the emergence of HIV, the virus that causes AIDS. HIV weakens a person's immune system, leaving it unable to fight tuberculosis bacteria. In the United States, tuberculosis began to decline again in 1993 as a result of improved control strategies. However, it remains a source of concern. Many tuberculosis strains are resistant to the most commonly used antituberculosis medications. Patients with active tuberculosis must take several medications for months to clear the infection and avoid antibiotic resistance.

Signs and symptoms of active Tuberculosis include:

  • Coughing for three or more weeks

  • Coughing up blood or mucus

  • Chest pain, or pain with breathing or coughing

  • Unintentional weight loss

  • Fatigue

  • Fever

  • Night sweats

  • Chills

  • Loss of appetite

Why Tuberculosis Detection?

Healthcare companies can use this project to predict tuberculosis in patients from chest X-ray images using artificial intelligence.



The dataset consists of 1,400 chest X-ray images split evenly between two classes, 'Normal' and 'Tuberculosis', with 700 images each.


MobileNet is a CNN architecture designed for image classification and mobile vision. Other models exist, but MobileNet stands out because it needs relatively little computational power to run or to fine-tune through transfer learning. This makes it well suited to mobile devices, embedded systems, and computers that lack a GPU or have limited processing power, without a considerable loss of accuracy. It is also a good fit for web browsers, which have limited compute, graphics processing, and storage.

MobileNet Architecture

  • MobileNets are proposed for mobile and embedded vision applications. They are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks.

  • Two simple global hyper-parameters are introduced that efficiently trade off latency against accuracy.

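Depthwise separable convolutions are the foundation of MobileNet. Each one splits a standard convolution into a depthwise convolution, which filters every input channel separately, and a 1x1 pointwise convolution, which combines the results.

The network structure is another factor that affects performance.

Finally, the width and resolution multipliers, the two hyper-parameters mentioned above, can be adjusted to tune the latency/accuracy trade-off.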
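
To make the saving concrete, the sketch below compares the multiply-add cost of a standard convolution with that of a depthwise separable one, using the cost formulas from the MobileNet paper. The layer sizes and the 'alpha' and 'rho' values are illustrative assumptions, not values taken from this project.

```python
def standard_conv_cost(k, c_in, c_out, feat):
    """Multiply-adds for a k x k standard convolution on a feat x feat map."""
    return k * k * c_in * c_out * feat * feat


def separable_conv_cost(k, c_in, c_out, feat, alpha=1.0, rho=1.0):
    """Multiply-adds for a depthwise (k x k) plus pointwise (1 x 1) convolution.

    alpha is the width multiplier (scales the channel counts) and rho is the
    resolution multiplier (scales the feature-map size).
    """
    c_in, c_out = int(alpha * c_in), int(alpha * c_out)
    feat = int(rho * feat)
    depthwise = k * k * c_in * feat * feat
    pointwise = c_in * c_out * feat * feat
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
std = standard_conv_cost(3, 512, 512, 14)
sep = separable_conv_cost(3, 512, 512, 14)
thin = separable_conv_cost(3, 512, 512, 14, alpha=0.5)

print(f"standard:  {std:,}")
print(f"separable: {sep:,} ({std / sep:.1f}x fewer)")
print(f"alpha=0.5: {thin:,}")
```

Halving the width multiplier cuts the pointwise term to roughly a quarter, which is why a small change in 'alpha' has a large effect on latency.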

Understanding Code

First, let us import the required libraries for our project.

import os
from pathlib import Path

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.metrics import classification_report, log_loss, accuracy_score
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten, GlobalAveragePooling2D, Dropout, Activation
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Now, let us load the data into our system and convert the data into a dataframe.

image_dir = Path('/content/sample_data/data')

# Get filepaths and labels
filepaths = list(image_dir.glob(r'**/*.png'))
labels = list(map(lambda x: os.path.split(os.path.split(x)[0])[1], filepaths))

filepaths = pd.Series(filepaths, name='Filepaths').astype(str)
labels = pd.Series(labels, name='Labels')

# Concatenate filepaths and labels
image_df = pd.concat([filepaths, labels], axis=1)

# Shuffle the DataFrame and reset index
image_df = image_df.sample(frac=1).reset_index(drop = True)

# Show the result
image_df.head()

As you can see, we collected the image paths from the data directory, took each image's label from the name of its parent folder, and concatenated 'Filepaths' and 'Labels' into a dataframe.
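
The label extraction is the least obvious step, so here is a minimal sketch of the same idea on two paths. The file names are hypothetical, and the folder names 'Normal' and 'Tuberculosis' are assumed to match the two classes of the dataset.

```python
from pathlib import PurePosixPath


def label_from_path(filepath):
    """Return the name of the folder that directly contains the file."""
    return PurePosixPath(filepath).parent.name


# Hypothetical file names; only the folder layout matters here.
examples = [
    "/content/sample_data/data/Normal/img_001.png",
    "/content/sample_data/data/Tuberculosis/img_002.png",
]
labels = [label_from_path(p) for p in examples]
print(labels)
```

This gives the same result as the nested 'os.path.split' calls used above, just with less nesting.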

Let us also split the dataframe for testing and training purposes.

# Separating train and test data
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(image_df, train_size=0.85, shuffle=True, random_state=1)

As you can see, I used the 'train_test_split' function to keep 85% of the rows for training and hold out the remaining 15% for testing.

train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.15)

test_datagen = ImageDataGenerator(rescale=1./255)

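I used the 'ImageDataGenerator' class to rescale pixel values to the range 0 to 1 and, for the training generator, to reserve 15% of the training data for validation. No augmentation transforms are applied here.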
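
The two splits stack, so it helps to work out roughly how many images end up in each subset. The sketch below does the arithmetic for the 1,400 images in the dataset. The rounding is an assumption, and the exact counts may differ by one depending on how each library rounds.

```python
import math

total_images = 1400          # dataset size stated above
train_fraction = 0.85        # train_size passed to train_test_split
validation_fraction = 0.15   # validation_split passed to ImageDataGenerator

# First split: training dataframe vs. held-out test set.
n_train_df = math.floor(total_images * train_fraction)
n_test = total_images - n_train_df

# Second split: validation subset carved out of the training dataframe.
n_val = int(n_train_df * validation_fraction)
n_fit = n_train_df - n_val

print(f"test: {n_test}, validation: {n_val}, training: {n_fit}")
```

Note that the validation images come out of the 85% training portion, not out of the full dataset.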

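# Only target_size survives in the source; the other arguments are assumed from the surrounding code.
train_images = train_datagen.flow_from_dataframe(
    train_df, x_col='Filepaths', y_col='Labels',
    target_size=(224, 224), class_mode='categorical', subset='training')

val_images = train_datagen.flow_from_dataframe(
    train_df, x_col='Filepaths', y_col='Labels',
    target_size=(224, 224), class_mode='categorical', subset='validation')

# shuffle=False keeps predictions aligned with the generator's label order.
test_images = test_datagen.flow_from_dataframe(
    test_df, x_col='Filepaths', y_col='Labels',
    target_size=(224, 224), class_mode='categorical', shuffle=False)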

Also, I loaded the training, validation, and test images with the 'flow_from_dataframe' function, resizing each image to 224x224 pixels.

Next, let us get into the modelling part of the project.

from tensorflow.keras.applications import MobileNetV2
mblnet = MobileNetV2(include_top=False, weights='imagenet', input_shape=(224,224,3))

So, I chose the 'MobileNetV2' model as the base network and dropped its original classification layers (include_top=False).
I initialised it with weights pretrained on ImageNet.

for layer in mblnet.layers:
    layer.trainable = False  # assumed: the loop body is missing in the source

x = Flatten()(mblnet.output)
x = Dense(256,activation='relu')(x)
x = Dense(256,activation='relu')(x)
prediction = Dense(2,activation='softmax')(x)
model = Model(inputs=mblnet.input, outputs=prediction)

On top of the base model, I added two fully connected layers of 256 units each and a two-unit softmax output layer, one unit per class.
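
To get a feel for the size of this classification head, the sketch below counts its parameters by hand. It relies on the fact that MobileNetV2 without its top layers maps a 224x224x3 input to a 7x7x1280 feature map. The helper name 'dense_params' is made up for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for a fully connected layer."""
    return n_inputs * n_units + n_units


# MobileNetV2 without its top layers maps a 224x224x3 input to 7x7x1280.
flattened = 7 * 7 * 1280

head = [
    dense_params(flattened, 256),  # first Dense(256) after Flatten
    dense_params(256, 256),        # second Dense(256)
    dense_params(256, 2),          # softmax output layer
]

print(f"flattened features: {flattened:,}")
print(f"parameters per layer: {head}")
print(f"total head parameters: {sum(head):,}")
```

Almost all of the head's parameters sit in the first dense layer because 'Flatten' keeps every spatial position. Swapping in the already imported 'GlobalAveragePooling2D' would feed only 1,280 features into that layer, a possible variation that the code above does not use.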

Now, let us compile our model and fit the data.


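# compile() is missing from the source; the optimizer here is an assumption
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
callback = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=2)

# fit_generator is deprecated; fit() accepts generators directly
history = model.fit(train_images, validation_data=val_images, epochs=25, callbacks=[callback])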

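I used 'categorical_crossentropy' as the loss function and 'accuracy' as the metric. The 'EarlyStopping' callback halts training once the training accuracy has not improved for two consecutive epochs.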
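
The effect of 'patience=2' in the 'EarlyStopping' callback is easier to see in a small simulation. The function below is a simplified model of patience-based stopping, not the Keras implementation, and the accuracy values are made up for illustration.

```python
def early_stop_epoch(values, patience=2):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified model: higher is better, and any value that does not beat
    the best seen so far counts as an epoch without improvement.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(values):
        if value > best:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up training accuracies, for illustration only.
accuracies = [0.80, 0.85, 0.84, 0.85, 0.90]
stopped_at = early_stop_epoch(accuracies, patience=2)
print(stopped_at)
```

Here training stops before the final value is ever reached, which shows how a small patience can end a run early. Because the callback monitors 'accuracy', it tracks the training accuracy; monitoring 'val_accuracy' instead would track the validation set.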

Now let us understand how our model performed.

get_acc = history.history['accuracy']
value_acc = history.history['val_accuracy']
get_loss = history.history['loss']
validation_loss = history.history['val_loss']

epochs = range(len(get_acc))
plt.plot(epochs, get_acc, 'r', label='Accuracy of Training data')
plt.plot(epochs, value_acc, 'b', label='Accuracy of Validation data')
plt.title('Training vs validation accuracy')
plt.legend()
plt.show()

Also, let us have a look at our model's classification report.

# Classification Report

from sklearn.metrics import classification_report

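# model.predict replaces the deprecated predict_generator
predictions = model.predict(test_images, verbose=1)
y_pred = np.argmax(predictions, axis=-1)
# test_labels is not defined in the source; this assumes test_images was built with shuffle=False
test_labels = test_images.classes
print(classification_report(test_labels, y_pred))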

Here in the report '0' represents 'Normal' and '1' represents 'Tuberculosis'.
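
For readers new to the report, the sketch below shows how softmax outputs become class indices and how precision and recall for the 'Tuberculosis' class follow from them. The probabilities and labels are made up, and the helper functions are written for this illustration only; they are not part of scikit-learn.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up softmax outputs and labels (0 = Normal, 1 = Tuberculosis).
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
y_true = [0, 1, 1, 1]
y_pred = [argmax(row) for row in probabilities]

precision, recall = precision_recall(y_true, y_pred, positive=1)
print(y_pred, precision, recall)
```

In a screening setting, recall for the 'Tuberculosis' class is usually the number to watch, since a missed case costs more than a false alarm.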

Thank you for your time.