Disease Prediction Using Random Forest Classifier

NEHA SINGH

Model Overview

People routinely guess the likely cause of a symptom. In this use case we do the same with a model: we predict the prognosis from the symptoms a patient reports. The result can act as a second pair of eyes for a doctor checking their own diagnosis. Various apps and websites already claim to offer doctor-like advice based on their models; we are trying to build something similar here.

About the DataSet

The complete dataset consists of 2 CSV files: one for training the model and one for testing it.

Each CSV file has 133 columns. 132 of these columns are symptoms that a person experiences, and the last column is the prognosis.

These symptoms map to the 41 diseases listed further down (41 classes × 120 rows = 4920 training rows), and the task is to classify a set of symptoms into one of them.

The model is trained on the training data and tested on the testing data.

Source: https://www.kaggle.com/kaushil268/disease-prediction-using-machine-learning

Shape of the training split used to fit the model: (4428, 132). The full training file has 4920 rows before the 90/10 split.

Both a training and a testing dataset are provided, and every symptom value is encoded as 0 or 1.

The classes are evenly distributed: each prognosis has 120 rows in the training file.
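The two properties described above (0/1 values and an equal number of rows per class) can be checked with a few lines of pandas. This is a minimal sketch on a tiny synthetic frame standing in for the training CSV; the helper name `summarize_dataset` and the toy values are assumptions made for illustration, not part of the original notebook.

```python
import pandas as pd


def summarize_dataset(df: pd.DataFrame, target: str = "prognosis") -> dict:
    """Report whether all feature values are 0/1 and how many rows each class has."""
    features = df.drop(columns=[target])
    # isin(...) gives a boolean frame; .all().all() collapses columns, then rows
    is_binary = bool(features.isin([0, 1]).all().all())
    class_counts = df[target].value_counts().to_dict()
    return {"is_binary": is_binary, "class_counts": class_counts}


# Synthetic stand-in for the training CSV (hypothetical values, not real data)
toy = pd.DataFrame(
    {
        "itching": [1, 0, 1, 0],
        "skin_rash": [0, 1, 0, 1],
        "prognosis": ["Acne", "Acne", "Allergy", "Allergy"],
    }
)
summary = summarize_dataset(toy)
print(summary)
```

On the real training file, `class_counts` would be expected to show 120 for every prognosis if the description above holds.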

Model:
Model Used: Random Forest Classifier
Accuracy: 93%
F1 Score: 0.91
Recall: 0.90
Precision: 0.92

The dataset has 132 symptom columns that serve as features for predicting the prognosis:

'itching', 'skin_rash', 'nodal_skin_eruptions', 'continuous_sneezing', 'shivering', 'chills', 'joint_pain', 'stomach_pain', 'acidity', 'ulcers_on_tongue', 'muscle_wasting', 'vomiting', 'burning_micturition', 'spotting_ urination' ,'fatigue',

'weight_gain', 'anxiety' ,'cold_hands_and_feets' ,'mood_swings', 'weight_loss' ,'restlessness', 'lethargy',

'patches_in_throat', 'irregular_sugar_level', 'cough', 'high_fever', 'sunken_eyes','breathlessness', 'sweating',

'dehydration' ,'indigestion', 'headache', 'yellowish_skin', 'dark_urine' ,'nausea' ,'loss_of_appetite',

'pain_behind_the_eyes', 'back_pain','constipation', 'abdominal_pain', 'diarrhoea', 'mild_fever', 'yellow_urine',

'yellowing_of_eyes', 'acute_liver_failure' ,'fluid_overload', 'swelling_of_stomach', 'swelled_lymph_nodes',

'malaise', 'blurred_and_distorted_vision', 'phlegm' ,'throat_irritation', 'redness_of_eyes', 'sinus_pressure',

'runny_nose', 'congestion', 'chest_pain', 'weakness_in_limbs', 'fast_heart_rate', 'pain_during_bowel_movements',

'pain_in_anal_region', 'bloody_stool', 'irritation_in_anus', 'neck_pain', 'dizziness', 'cramps', 'bruising',

'obesity', 'swollen_legs', 'swollen_blood_vessels', 'puffy_face_and_eyes', 'enlarged_thyroid', 'brittle_nails',

'swollen_extremeties', 'excessive_hunger', 'extra_marital_contacts' ,'drying_and_tingling_lips', 'slurred_speech',

'knee_pain', 'hip_joint_pain', 'muscle_weakness' ,'stiff_neck', 'swelling_joints', 'movement_stiffness', 'spinning_movements',

'loss_of_balance', 'unsteadiness', 'weakness_of_one_body_side', 'loss_of_smell', 'bladder_discomfort',

'foul_smell_of urine', 'continuous_feel_of_urine', 'passage_of_gases', 'internal_itching', 'toxic_look_(typhos)',

'depression', 'irritability', 'muscle_pain', 'altered_sensorium', 'red_spots_over_body', 'belly_pain',

'abnormal_menstruation', 'dischromic _patches', 'watering_from_eyes', 'increased_appetite', 'polyuria', 'family_history',

'mucoid_sputum', 'rusty_sputum', 'lack_of_concentration', 'visual_disturbances', 'receiving_blood_transfusion',

'receiving_unsterile_injections', 'coma', 'stomach_bleeding', 'distention_of_abdomen', 'history_of_alcohol_consumption',

'fluid_overload.1', 'blood_in_sputum', 'prominent_veins_on_calf', 'palpitations', 'painful_walking', 'pus_filled_pimples', 'blackheads', 'scurring', 'skin_peeling',

'silver_like_dusting', 'small_dents_in_nails', 'inflammatory_nails', 'blister', 'red_sore_around_nose', 'yellow_crust_ooze'
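Because every feature is a 0/1 flag, a patient's reported symptoms have to be turned into a row with one entry per column before the model can use them. The sketch below shows one way to do that; `encode_symptoms` is a hypothetical helper, and the four-name feature list is shortened for illustration (the real list has 132 names and must be in the same order as the training columns).

```python
import pandas as pd


def encode_symptoms(reported, feature_names):
    """Build a one-row 0/1 DataFrame with a 1 for each reported symptom."""
    unknown = [s for s in reported if s not in feature_names]
    if unknown:
        # Fail loudly rather than silently ignoring a misspelled symptom
        raise ValueError(f"Unknown symptom names: {unknown}")
    row = {name: int(name in reported) for name in feature_names}
    return pd.DataFrame([row], columns=list(feature_names))


# Shortened feature list for illustration only
feature_names = ["itching", "skin_rash", "nodal_skin_eruptions", "chills"]
encoded = encode_symptoms(["itching", "chills"], feature_names)
print(encoded.iloc[0].tolist())  # [1, 0, 0, 1]
```

Some column names in the dataset contain spaces (for example 'spotting_ urination'), so the reported names need to match the column names exactly.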

The prognosis column contains 41 diseases as possible results:

'(vertigo) Paroymsal  Positional Vertigo', 'AIDS', 'Acne', 'Alcoholic hepatitis', 'Allergy', 'Arthritis', 'Bronchial Asthma', 'Cervical spondylosis', 'Chicken pox', 'Chronic cholestasis', 'Common Cold', 'Dengue', 'Diabetes ', 'Dimorphic hemmorhoids(piles)', 'Drug Reaction', 'Fungal infection', 'GERD', 'Gastroenteritis', 'Heart attack', 'Hepatitis B', 'Hepatitis C', 'Hepatitis D', 'Hepatitis E', 'Hypertension ', 'Hyperthyroidism', 'Hypoglycemia', 'Hypothyroidism', 'Impetigo', 'Jaundice', 'Malaria', 'Migraine', 'Osteoarthristis', 'Paralysis (brain hemorrhage)', 'Peptic ulcer diseae', 'Pneumonia', 'Psoriasis', 'Tuberculosis', 'Typhoid', 'Urinary tract infection', 'Varicose veins','hepatitis A'

Importing Libraries

# Importing pandas for loading and handling the data

import pandas as pd

Reading Training Dataset

# Reading Training Dataset

df = pd.read_csv("training_data.csv")

# Checking shape of Dataset

df.shape

(4920, 134)

# Storing the prognosis (target column) in y_train as a pandas Series

y_train = df["prognosis"]



y_train.head(50)

# Deleting the target column, as it is now stored in y_train

del df["prognosis"]

# Deleting the unnamed trailing column, as it is of no use to us

del df["Unnamed: 133"]

df.head()

# Checking the NULL Values

df.isnull().sum()

As we can see, the dataset has no NULL values. There is also little scope for further data cleaning, because the dataset itself is not of high quality. We can either proceed with it as is or switch to a better dataset.
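The text does not say what makes the dataset weak. One check that is often run alongside `isnull()` is counting repeated rows, since repeated rows can end up on both sides of a train/test split. The sketch below is an illustration on synthetic data only; `duplicate_report` is a hypothetical helper and nothing here is a measurement of the actual dataset.

```python
import pandas as pd


def duplicate_report(features: pd.DataFrame) -> dict:
    """Count total rows, distinct rows, and rows that repeat an earlier row."""
    total = len(features)
    unique = len(features.drop_duplicates())
    return {"total_rows": total, "unique_rows": unique, "duplicate_rows": total - unique}


# Synthetic example: rows 0 and 2 are identical, as are rows 1 and 3
toy_features = pd.DataFrame(
    {"itching": [1, 0, 1, 0], "skin_rash": [0, 1, 0, 1]}
)
report = duplicate_report(toy_features)
print(report)
```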

# Storing training dataset in X

X = df



# Storing the target column in Y

Y = y_train

Now we will divide the dataset into train and test sets so that the model can be evaluated on rows it was not trained on.

Splitting the Data Set

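# Importing train_test_split from sklearn for the train/test split

from sklearn.model_selection import train_test_split

# Holding out 10% of the rows; stratify=Y keeps the class proportions equal in both parts

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, stratify=Y, random_state=2)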

# Checking the shapes of the full, train, and test sets

print(X.shape, X_train.shape, X_test.shape)

(4920, 132) (4428, 132) (492, 132)
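To illustrate where the 4428 and 492 come from and what `stratify` does, here is a small self-contained sketch. It uses synthetic data with the same class layout as described in the text (41 classes, 120 rows each); the five random 0/1 feature columns are an assumption made to keep the example small.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

n_classes, rows_per_class, n_features = 41, 120, 5

# Synthetic stand-in: random 0/1 features, integer class labels 0..40
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.integers(0, 2, size=(n_classes * rows_per_class, n_features)))
y_demo = pd.Series(np.repeat(np.arange(n_classes), rows_per_class))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.1, stratify=y_demo, random_state=2
)

# With stratification, every class contributes the same number of test rows
test_counts = y_te.value_counts()
print(X_tr.shape, X_te.shape)                # (4428, 5) (492, 5)
print(test_counts.min(), test_counts.max())  # 12 12
```

Ten percent of 4920 is 492, and 492 test rows spread evenly over 41 classes gives 12 per class, leaving 108 per class for training.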

Making Random Forest Classifier

from sklearn.ensemble import RandomForestClassifier

# 200 trees, entropy as the split criterion, fixed seed for reproducibility

classifier = RandomForestClassifier(n_estimators=200, criterion='entropy', random_state=0)

classifier.fit(X_train, Y_train)



prediction_rfc = classifier.predict(X_test)
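Since the model is meant as a second opinion, a ranked list of candidate diseases may be more useful than a single label. scikit-learn's `predict_proba` returns one probability per class, in the order given by `classes_`. The sketch below uses a toy dataset with three made-up symptom columns and reuses the hyperparameters from the text; the data and the ranking step are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data: each class is identified by exactly one symptom column being 1
X_toy = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 10)
y_toy = np.array(["Acne", "Allergy", "GERD"] * 10)

model = RandomForestClassifier(n_estimators=200, criterion="entropy", random_state=0)
model.fit(X_toy, y_toy)

# Probabilities for one patient, paired with class names and sorted high to low
proba = model.predict_proba([[0, 1, 0]])[0]
ranked = sorted(zip(model.classes_, proba), key=lambda pair: pair[1], reverse=True)
for disease, p in ranked:
    print(f"{disease}: {p:.2f}")
```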



import sklearn.metrics as metrics



print('Confusion Matrix: Random Forest Classifier')

print(metrics.confusion_matrix(Y_test, prediction_rfc))

print('\nClassification Report:')

print(metrics.classification_report(Y_test, prediction_rfc))

print('Accuracy:', metrics.accuracy_score(Y_test, prediction_rfc))

Accuracy: 1.0

The accuracy on the held-out split came out to be 100% (1.0).
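The model summary lists precision, recall, and F1 as single numbers, which for a multi-class problem requires averaging over classes. The text does not say which averaging was used; the sketch below assumes macro averaging (an unweighted mean over classes) and uses made-up labels purely to show the calls.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Made-up labels for illustration; one "Allergy" case is misclassified as "GERD"
y_true = ["Acne", "Acne", "Allergy", "Allergy", "GERD", "GERD"]
y_pred = ["Acne", "Acne", "Allergy", "GERD", "GERD", "GERD"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1 score:  {f1:.3f}")
```

With `average="weighted"` the per-class scores would instead be weighted by class size; on a perfectly balanced test set the two give the same result.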

# Reading the testing dataset

dft = pd.read_csv("test_data.csv")

# Viewing the dataset

dft.head()

# Storing the target column of the testing dataset in y_test

y_test = dft["prognosis"]

# Checking the first few labels

y_test.head(3)

# Deleting the prognosis column from testing dataset

del dft["prognosis"]

dft.head(3)

# Predicting on the testing dataset with the classifier trained above

prediction = classifier.predict(dft)



# Printing values of prediction

print(prediction)



# Checking the accuracy of the predictions (true labels first, then predictions)

print("Accuracy: ", metrics.accuracy_score(y_test, prediction))

As you can see, the accuracy on the testing dataset also comes out to 100%.

You can run the model yourself from the Deployment link for this listing.
