Dev Agrawal's other Models Reports


Breast Cancer Detection Using CNN


Statement


Abstract: Breast cancer is a common cancer in women and one of the leading causes of cancer death worldwide. Invasive ductal carcinoma (IDC) develops in a milk duct and invades the fibrous or fatty breast tissue outside the duct; it is the most common type of breast cancer, accounting for about 80% of all breast cancer diagnoses. Early, accurate diagnosis plays a crucial role in choosing the right treatment plan and improving patient survival. Few efforts have been made to predict and detect cancers by applying AI, and a relevant dataset is the first essential step toward that goal. Cancer is a severe public health issue worldwide and the second leading cause of death. According to the International Agency for Research on Cancer (IARC), about 18.1 million new cancer cases and 9.6 million cancer deaths (all cancer types combined) were estimated worldwide in 2018.


Usage


After processing and staining, tissue samples are examined under a microscope. Abnormal cells are located and marked with a special pen, and a pathologist, a doctor trained to examine tissue, then studies the marked regions, interprets the findings, and makes a diagnosis. The result is recorded in a pathology report, the written medical record of a tissue diagnosis. This model applies a deep learning approach to automatically detect invasive ductal carcinoma (IDC) tissue regions in whole slide images (WSI) of breast cancer (BCa).


Dataset


Dataset link: https://www.kaggle.com/paultimothymooney/breast-histopathology-images


The dataset contains 162 whole mount slide images of breast cancer (BCa) specimens scanned at 40x. From those images, 277,524 patches of size 50 x 50 were extracted. Each patch's file name has the format u_xX_yY_classC.png, for example 10253_idx5_x1351_y1101_class0.png, where u is the patient ID (10253_idx5), X is the x-coordinate in the slide image where the patch was cropped, Y is the y-coordinate where the patch was cropped, and C is the class of the patch: 0 is non-IDC and 1 is IDC.
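The naming scheme above can be decoded with a small parser. This is a minimal sketch, not part of the original pipeline: `PatchInfo` and `parse_patch_name` are hypothetical names, and the pattern treats the underscore separators as optional so that either spelling of the example file name is accepted.

```python
import re
from typing import NamedTuple


class PatchInfo(NamedTuple):
    patient_id: str
    x: int      # x-coordinate of the crop in the slide image
    y: int      # y-coordinate of the crop in the slide image
    label: int  # 0 = non-IDC, 1 = IDC


# Underscores are optional, so "10253_idx5_x1351_..." and
# "10253idx5x1351..." are both accepted (an assumption of this sketch).
_PATCH_RE = re.compile(
    r"^(?P<patient>\d+_?idx\d+)_?x(?P<x>\d+)_?y(?P<y>\d+)_?class(?P<label>[01])\.png$"
)


def parse_patch_name(filename: str) -> PatchInfo:
    """Split a patch file name into patient ID, coordinates, and class."""
    match = _PATCH_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected patch file name: {filename!r}")
    return PatchInfo(
        match["patient"], int(match["x"]), int(match["y"]), int(match["label"])
    )


info = parse_patch_name("10253_idx5_x1351_y1101_class0.png")
print(info.patient_id, info.x, info.y, info.label)
```

Keeping the patient ID separate from the coordinates makes it possible to group patches by patient when splitting the data.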


Sample dataset: the user provides 50 x 50 patches cropped from mounted slide images scanned at 40x.


Divide the downloaded dataset as follows:



Images are pre-processed and batched for the model as follows:


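# Assumes the TensorFlow-bundled Keras; with standalone Keras use `from keras.preprocessing import image`.
from tensorflow.keras.preprocessing import image

BS = 32        # batch size
TS = (50, 50)  # patch size expected by the model

# Training batches: pixel values rescaled to [0, 1], samples shuffled.
def generator(directory, gen=image.ImageDataGenerator(rescale=1/255.0), target_size=TS, class_mode='categorical'):
    return gen.flow_from_directory(directory, batch_size=BS, shuffle=True, color_mode='rgb', class_mode=class_mode, target_size=target_size)

# Validation batches: same pipeline without shuffling, so predictions stay aligned with valid_batch.classes.
def generator1(directory, gen=image.ImageDataGenerator(rescale=1/255.0), target_size=TS, class_mode='categorical'):
    return gen.flow_from_directory(directory, batch_size=BS, shuffle=False, color_mode='rgb', class_mode=class_mode, target_size=target_size)

train_batch = generator('..../IDC_final/Training', target_size=TS)
valid_batch = generator1('..../IDC_final/Validation', target_size=TS)
SPE = len(train_batch.classes) // BS  # steps per epoch
VS = len(valid_batch.classes) // BS   # validation steps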
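Note that the floor division used for SPE and VS drops any final partial batch. A minimal sketch of that arithmetic follows; `steps_for` is a hypothetical helper and the sample count of 1000 is made up for illustration.

```python
import math


def steps_for(num_samples: int, batch_size: int, keep_partial: bool = False) -> int:
    """Number of batches per pass over the data.

    With keep_partial=False this matches `len(classes) // BS`,
    so a last, smaller batch is skipped.
    """
    if keep_partial:
        return math.ceil(num_samples / batch_size)
    return num_samples // batch_size


print(steps_for(1000, 32))                     # 31 full batches, 8 samples left over
print(steps_for(1000, 32, keep_partial=True))  # 32 batches, the last one holds 8 samples
```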

Model


A convolutional neural network (CNN) is built for training. The network architecture is as follows:



  • Uses 3×3 CONV filters.

  • Stacks these filters on top of each other.

  • Performs max-pooling.

  • Uses depthwise separable convolution (more efficient, takes up less memory).

  • Uses adam as the optimizer and categorical_crossentropy as the loss function.


SeparableConv2D - A variation of the traditional convolution that was proposed to make it cheaper to compute. It performs a depthwise spatial convolution, which filters each input channel separately, followed by a pointwise (1×1) convolution that mixes the resulting output channels.
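To see where the savings come from, the weight counts of the two layer types can be compared directly. This is a rough sketch that ignores biases and assumes a depth multiplier of 1; the channel counts (32 in, 64 out) are illustrative and not taken from the model.

```python
def conv2d_weights(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard convolution: one k x k filter per (input, output) channel pair."""
    return kernel * kernel * in_channels * out_channels


def separable_conv2d_weights(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels     # 1 x 1 convolution mixing channels
    return depthwise + pointwise


standard = conv2d_weights(3, 32, 64)
separable = separable_conv2d_weights(3, 32, 64)
print(standard, separable)  # 18432 2336
```

For this configuration the separable layer needs roughly one eighth of the weights of the standard layer.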


BatchNormalization - Speeds up convergence of the loss by normalizing the activations of the hidden units to zero mean and unit variance over each mini-batch.
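The normalization step itself is simple arithmetic. The sketch below covers only that step for one feature across a batch; it leaves out the learned scale and shift parameters and the moving averages a real layer keeps for inference, and the `epsilon` value is an assumption.

```python
import math


def batch_normalize(values, epsilon=1e-3):
    """Shift and scale a batch of activations to roughly zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # epsilon guards against division by zero when the batch is constant
    return [(v - mean) / math.sqrt(variance + epsilon) for v in values]


normalized = batch_normalize([2.0, 4.0, 6.0, 8.0])
print(normalized)
```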


MaxPooling2D - Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size) for each channel of the input. Strides shift the window along each dimension.
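As an illustration of the windowing described above, here is a plain-Python sketch of max pooling on a single channel. It assumes no padding, so windows that would run past the edge are skipped; `max_pool_2d` is a hypothetical helper, not the Keras layer.

```python
def max_pool_2d(grid, pool_size=2, stride=2):
    """Max-pool one channel given as a list of rows, without padding."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for top in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for left in range(0, cols - pool_size + 1, stride):
            window = [
                grid[r][c]
                for r in range(top, top + pool_size)
                for c in range(left, left + pool_size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


channel = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [7, 2, 9, 8],
    [0, 5, 3, 4],
]
print(max_pool_2d(channel))  # [[6, 2], [7, 9]]
```

A 2 x 2 window with stride 2 halves both spatial dimensions, so a 4 x 4 input becomes 2 x 2.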


Dropout - Reduces overfitting by randomly dropping nodes in a layer with some probability during training, which prevents the nodes from depending too heavily on each other.
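A minimal sketch of the idea, assuming the common "inverted dropout" formulation in which surviving activations are scaled up so the expected sum stays the same. The helper name and the rate of 0.5 are illustrative only.

```python
import random


def dropout(values, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # seeded so the example is repeatable
activations = [1.0] * 10
print(dropout(activations, rate=0.5, rng=rng))
```

Dropout is applied only during training; at inference time the activations pass through unchanged.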


Dense - Used as the output layer, with 2 units (one per class) and softmax as the activation function.
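Softmax turns the two raw output values into class probabilities that sum to 1. A small sketch with made-up input values:

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([2.0, -1.0])  # illustrative scores for [non-IDC, IDC]
print(probabilities)
```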


Categorical Crossentropy: A loss function used in multi-class classification, where each example belongs to exactly one of several possible categories and the model must decide which one.
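For a single example with a one-hot label, the loss reduces to the negative log of the probability assigned to the true class. A minimal sketch, where the clipping constant `epsilon` is an assumption added to avoid log(0):

```python
import math


def categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    return -sum(
        t * math.log(min(max(p, epsilon), 1.0 - epsilon))
        for t, p in zip(y_true, y_pred)
    )


confident_right = categorical_crossentropy([0, 1], [0.1, 0.9])
confident_wrong = categorical_crossentropy([0, 1], [0.9, 0.1])
print(confident_right, confident_wrong)
```

The loss is small when the model puts high probability on the correct class and grows quickly when it is confidently wrong.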


Adam Optimizer: Adam (Adaptive Moment Estimation) is an optimization algorithm that can be used instead of the classical stochastic gradient descent procedure to update network weights iteratively based on training data. Adam keeps running estimates of the first and second moments of the gradient. The intuition behind Adam is that we don't want to roll so fast that we jump over the minimum; we want to decrease the velocity a little for a careful search.
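The update rule can be sketched for a single scalar weight. This is an illustration rather than the optimizer the framework runs internally; the hyperparameter defaults are commonly used values, and the toy objective f(w) = w² is an assumption chosen so the gradient is easy to write down.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Minimize f(w) = w**2, whose gradient is 2 * w.
w, m, v = 1.0, 0.0, 0.0
for step in range(1, 11):
    w, m, v = adam_step(w, 2 * w, m, v, step, lr=0.01)
print(w)
```

Because the step is divided by the root of the second moment, each update is roughly the size of the learning rate regardless of the raw gradient scale.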


model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])

Results


The classification report and confusion matrix for the constructed model are shown below.
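Both summaries are derived from the true and predicted labels. The sketch below shows how, using made-up toy labels that are not the model's results; the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes=2):
    """matrix[t][p] counts examples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def precision_recall(matrix, cls):
    """Precision and recall for one class, as listed in a classification report."""
    true_positive = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)  # column total
    actual = sum(matrix[cls])                    # row total
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall


# Toy labels for illustration only (0 = non-IDC, 1 = IDC).
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]
cm = confusion_matrix(y_true, y_pred)
print(cm)                       # [[3, 1], [1, 3]]
print(precision_recall(cm, 1))  # (0.75, 0.75)
```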


 


 

