Prasad Chaskar's other Models Reports


Patient Health Severity Score Prediction


Model Overview


About Dataset : 
The dataset contains 79540 training instances and 4 classes.





Features :

1) TEMPF : Body temperature in degrees Fahrenheit.
2) PULSE : Pulse rate (beats/min).
3) RESPR : Respiration rate (breaths/min).
4) BPSYS : Systolic blood pressure (mmHg).
5) BPDIAS : Diastolic blood pressure (mmHg).
6) POPCT : Oxygen saturation.



Dataset Link : https://www.kaggle.com/hansaniuma/patient-health-scores-for-ehr-data

Code:

Import Libraries:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import classification_report,f1_score,confusion_matrix
from sklearn.ensemble import RandomForestClassifier

Some Useful Parameters :

fig_X = 10
fig_y = 8
bins = 25
title_size = 20
color = 'b'

Read Data :

score_df = pd.read_csv('Patient Severity Score.csv')
score_df.head()


Data Cleaning :

score_df = score_df.rename(columns={'SCORE ':'SCORE'})
score_df.columns


Check for Null values :


score_df.isnull().sum()



Inference : 
There are no null values present in our dataset.
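Besides null values, duplicate rows and class balance are also worth a quick look before modelling. The snippet below is only a sketch on a toy frame with made-up values (toy_df is a hypothetical stand-in for score_df, and the SCORE labels shown are assumptions); the same two calls could be applied to the real data.

```python
import pandas as pd

# Toy stand-in for score_df; every value here is made up for illustration
toy_df = pd.DataFrame({
    "TEMPF": [98.6, 99.1, 98.6, 101.2],
    "PULSE": [72, 88, 72, 110],
    "SCORE": [0, 1, 0, 3],
})

# Count rows that are exact copies of an earlier row
n_duplicates = int(toy_df.duplicated().sum())

# Fraction of rows belonging to each class of the target column
class_share = toy_df["SCORE"].value_counts(normalize=True).sort_index()

print("Duplicate rows:", n_duplicates)
print(class_share)
```

If one class dominates, accuracy alone can look good even when the rare classes are predicted poorly.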




Data Visualization :
1] Temperature


plt.figure(figsize=(fig_X,fig_y))
sns.histplot(x='TEMPF',data=score_df,color=color,bins=bins);
plt.title("Distribution of feature TEMPF",{'fontsize':title_size});





Some Facts about TEMPF Feature :
The ideal human body temperature is 98.6F (37C).

For adults, body temperature typically ranges from 97F to 99F; children and babies have a slightly higher range, from 97.9F to 100F.
In our dataset, most patients have a body temperature in the healthy range.
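The Fahrenheit and Celsius figures above are related by a simple conversion. Here is a minimal sketch of it, together with a range check that uses the adult range quoted above; both helper names are hypothetical and not part of the original notebook.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def is_typical_adult_temp(temp_f: float, low: float = 97.0, high: float = 99.0) -> bool:
    """Return True if temp_f falls inside the adult range quoted in the text."""
    return low <= temp_f <= high


print(round(fahrenheit_to_celsius(98.6), 1))
print(is_typical_adult_temp(100.4))
```

The same check could be applied to the TEMPF column to count how many readings fall outside the quoted range.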


2] Pulse :

plt.figure(figsize=(fig_X,fig_y))
sns.histplot(x='PULSE',data=score_df,bins=bins,color=color);
plt.title("Distribution of feature PULSE",{'fontsize':title_size})



Some Facts about PULSE feature 
Pulse rate, also called heart rate, is the number of times the heart beats per minute.

The pulse rate of a healthy person at rest is 60 to 100 beats/min.
Pulse rate is lowest at rest and increases during exercise, when the body needs more oxygen-rich blood.
The ideal pulse rate is 70 to 100 beats/min for children and 60 to 100 beats/min for adults.
The highest pulse rate is reached during maximal exercise.
A commonly used rule of thumb for estimating maximum pulse rate :
maximum heart rate = 220 - your age
In our dataset, most patients have a pulse rate in the healthy range.
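As a small illustration of the figures above, the sketch below estimates maximum heart rate with the commonly cited "220 minus age" rule of thumb and labels a resting pulse against the 60 to 100 beats/min range. The function names and category labels are assumptions made for this example, and the estimate is a rough approximation rather than a clinical measurement.

```python
def estimated_max_heart_rate(age_years: int) -> int:
    """Rule-of-thumb estimate of maximum heart rate: 220 minus age."""
    return 220 - age_years


def resting_pulse_category(pulse_bpm: float, low: float = 60, high: float = 100) -> str:
    """Label a resting pulse relative to the quoted 60-100 beats/min range."""
    if pulse_bpm < low:
        return "below range"
    if pulse_bpm > high:
        return "above range"
    return "within range"


print(estimated_max_heart_rate(40))
print(resting_pulse_category(72))
```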


3] Respiration


plt.figure(figsize=(fig_X,fig_y))
sns.histplot(x='RESPR',data=score_df,bins=bins+30,color=color);
plt.title("Distribution of feature RESPR(Respiration)",{'fontsize':title_size});



Some facts about RESPR feature 
Respiration rate is the number of breaths taken per minute.

It is measured while the person is at rest by counting how many times the chest rises in one minute.


4] Systolic BP and Diastolic BP


fig, axes = plt.subplots(1, 2, figsize=(fig_X, fig_y), sharey=True);
fig.suptitle('Distribution of features BPSYS and BPDIAS');
sns.histplot(ax=axes[0],x='BPSYS',data=score_df,bins=bins,color=color);
sns.histplot(ax=axes[1],x='BPDIAS',data=score_df,bins=bins,color=color);




Some facts about BPSYS(Systolic Blood Pressure) and BPDIAS(Diastolic Blood Pressure) 
Systolic blood pressure is the maximum pressure in the arteries, reached when the heart beats.
Diastolic blood pressure is the minimum pressure in the arteries, reached between heartbeats.
Normal BPSYS is 120 to 129 mmHg and normal BPDIAS is 80 to 84 mmHg.
High blood pressure is anything above 140/90 mmHg (here 140 is BPSYS and 90 is BPDIAS). A reading over 180/120 mmHg is dangerously high and is called a hypertensive crisis.
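The thresholds above can be turned into a simple rule. The sketch below is one possible reading of them: it treats a reading as elevated when either the systolic or the diastolic value exceeds its threshold, which is an assumption since the text does not say how mixed readings should be handled. The function name and labels are hypothetical.

```python
def classify_blood_pressure(systolic: float, diastolic: float) -> str:
    """Label a reading using the 140/90 and 180/120 mmHg thresholds quoted above."""
    # Assumption: exceeding either threshold is enough to trigger the label
    if systolic > 180 or diastolic > 120:
        return "hypertensive crisis"
    if systolic > 140 or diastolic > 90:
        return "high"
    return "not high"


for reading in [(125, 82), (150, 85), (185, 95)]:
    print(reading, classify_blood_pressure(*reading))
```

Applied row by row to BPSYS and BPDIAS, a rule like this gives a rough count of elevated readings in the dataset.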


5] Oxygen Saturation


plt.figure(figsize=(fig_X,fig_y))
sns.histplot(x='POPCT',data=score_df,color=color,bins=bins);
plt.title("Distribution of POPCT feature",{'fontsize':title_size});




Some facts about POPCT(Oxygen Saturation) 


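POPCT measures how much oxygen the haemoglobin in your blood is carrying, as a percentage of the maximum it could carry.
Normal oxygen saturation is about 95 to 100 percent. The 75 to 100 mmHg range refers to the partial pressure of oxygen in arterial blood, which is a different measurement.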
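To get a feel for how many readings are low, one could flag values under a cutoff. This is only a sketch: it assumes POPCT is recorded as a percentage, uses a commonly cited cutoff of 95 as an illustrative threshold, and runs on made-up readings rather than the real column.

```python
import pandas as pd

# Assumed cutoff in percent, chosen for illustration only
LOW_SPO2_THRESHOLD = 95

# Made-up readings standing in for score_df['POPCT']
popct = pd.Series([99, 97, 93, 88, 100], name="POPCT")

# Boolean mask marking readings below the assumed cutoff
low_mask = popct < LOW_SPO2_THRESHOLD

print(int(low_mask.sum()), "of", len(popct), "readings are below", LOW_SPO2_THRESHOLD)
```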


# Violin plot of every feature, one panel per feature on a 2x3 grid
features = ['TEMPF','PULSE','RESPR','BPSYS','BPDIAS','POPCT']
fig, axes = plt.subplots(2, 3, figsize=(fig_X, fig_y), sharey=True);
for ax, feature in zip(axes.ravel(), features):
    sns.violinplot(ax=ax,x=feature,data=score_df,color=color);
    ax.set_title(feature);


Split data into dependent and independent variable


X = score_df.drop('SCORE',axis=1)
y = score_df.SCORE




Split Data into Training and Testing


X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=11)
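With several classes of unequal size, it can help to keep the class proportions the same in both splits. The sketch below shows the stratify option of train_test_split on synthetic data; the arrays and class sizes are made up, and this is a possible variation rather than what the split above does.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 1000 rows, 6 features, 4 imbalanced classes (made up)
rng = np.random.default_rng(11)
X_demo = rng.normal(size=(1000, 6))
y_demo = np.repeat([0, 1, 2, 3], [700, 200, 70, 30])

# stratify=y_demo keeps each class's share the same in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=11, stratify=y_demo
)

train_share = np.bincount(y_tr) / len(y_tr)
test_share = np.bincount(y_te) / len(y_te)
print("train class shares:", train_share)
print("test class shares: ", test_share)
```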




Data Normalization


# Fit the scaler on the training data only, then apply it to both splits
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
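StandardScaler standardizes each feature by subtracting the training mean and dividing by the training standard deviation. The sketch below checks this by hand on a tiny made-up array, which also shows why the test data is transformed with statistics learned from the training data only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny made-up arrays with two features
train = np.array([[97.0, 60.0], [98.6, 80.0], [100.2, 100.0]])
test = np.array([[99.0, 90.0]])

# Learn the mean and standard deviation from the training rows only
demo_scaler = StandardScaler().fit(train)

# Manual standardization with the same training statistics
# (np.std defaults to the population standard deviation, as StandardScaler does)
manual = (test - train.mean(axis=0)) / train.std(axis=0)

print(demo_scaler.transform(test))
print(manual)
```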

Model Creation


model_name = list()
accuracy = list()
# Each estimator is a dictionary key mapped to its display name
models = {
    LogisticRegression(max_iter=500):'Logistic Regression',
    SVC():"Support Vector Machine",
    RandomForestClassifier():'Random Forest'
}
# Fit every model on the scaled training data
for m in models.keys():
    m.fit(X_train,y_train)
# Score each model once on the test set and record the result
for model,name in models.items():
    score = model.score(X_test,y_test)*100
    print(f"Accuracy Score for {name} is : ",score,"%")
    model_name.append(name)
    accuracy.append(score)

Classification Report




for model,name in models.items():
    y_pred = model.predict(X_test)
    print(f"Classification Report for {name}")
    print("----------------------------------------------------------")
    print(classification_report(y_test,y_pred))
    print("----------------------------------------------------------")
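The classification report lists per-class scores; a single summary number can be obtained with f1_score, which is imported above. The sketch below uses made-up labels to show the difference between macro averaging (every class counts equally) and weighted averaging (classes count in proportion to their size).

```python
from sklearn.metrics import f1_score

# Made-up true and predicted labels for four classes
y_true_demo = [0, 0, 0, 0, 1, 1, 2, 3]
y_pred_demo = [0, 0, 0, 0, 1, 0, 2, 2]

# zero_division=0 scores a class with no predictions as 0 instead of warning
macro_f1 = f1_score(y_true_demo, y_pred_demo, average="macro", zero_division=0)
weighted_f1 = f1_score(y_true_demo, y_pred_demo, average="weighted", zero_division=0)

print("macro F1:   ", round(macro_f1, 4))
print("weighted F1:", round(weighted_f1, 4))
```

Here the macro score is pulled down by the class that is never predicted, which is the kind of weakness plain accuracy can hide.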




Confusion Matrix


for model,name in models.items():
    y_pred = model.predict(X_test)
    # Use the labels the model was trained on (the dataset has 4 classes)
    class_names = model.classes_
    fig,ax = plt.subplots()
    cnf_matrix = confusion_matrix(y_test,y_pred,labels=class_names)
    sns.heatmap(pd.DataFrame(cnf_matrix,index=class_names,columns=class_names),
                annot = True, cmap = 'YlGnBu', fmt = 'g', ax = ax)
    ax.xaxis.set_label_position('top')
    plt.title(f'Confusion Matrix for {name}', {'fontsize':20})
    plt.ylabel('Actual label')
    plt.xlabel('Predicted label')
    plt.tight_layout()
    plt.show()
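Raw counts can be hard to compare when classes differ in size. As a possible variation, confusion_matrix accepts normalize="true", which divides each row by the number of true samples in that class so the diagonal shows per-class recall. The labels below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up true and predicted labels for four classes
y_true_demo = [0, 0, 0, 0, 1, 1, 2, 3]
y_pred_demo = [0, 0, 0, 0, 1, 0, 2, 2]

# Each row is divided by that class's true count, so every row sums to 1
cm_norm = confusion_matrix(
    y_true_demo, y_pred_demo, labels=[0, 1, 2, 3], normalize="true"
)

print(np.round(cm_norm, 2))
```

The resulting array can be passed to sns.heatmap in the same way as the count matrix.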


Accuracy of Each model


plt.figure(figsize=(fig_X,fig_y))
plt.bar(model_name,accuracy,color=color);
plt.title("Accuracy of each model",{'fontsize':title_size});










