Diabetes Prediction using Support Vector Machine

Tarun Reddy


Model Overview

Diabetes mellitus, commonly known as diabetes, is a metabolic disease that causes high blood glucose. The hormone insulin moves sugar from the blood into your cells to be stored or used for energy. With diabetes, your body either doesn't make enough insulin or can't effectively use the insulin it makes, so it cannot properly process the glucose from the food you eat. Untreated high blood glucose can damage your nerves, eyes, kidneys, and other organs. There are different types of diabetes, each with different causes, but they all share the common problem of too much glucose in the bloodstream. Treatments include medications and/or insulin. Some types of diabetes can be prevented by adopting a healthy lifestyle.

DATASET DESCRIPTION:

Pregnancies: Number of times pregnant
Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
BloodPressure: Diastolic blood pressure (mm Hg)
SkinThickness: Triceps skin fold thickness (mm)
Insulin: 2-Hour serum insulin (mu U/ml)
BMI: Body mass index (weight in kg/(height in m)^2)
DiabetesPedigreeFunction: Diabetes pedigree function
Age: Age (years)
Outcome: Class variable (0 or 1)

MODEL ACCURACY: 77%

Let's import the required libraries.

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pickle

Now load the dataset and check its first five rows.

diabetes = pd.read_csv('E:\\diabetes\\diabetes.csv')
diabetes.head()

Let us now check the dimensions of the dataset.

diabetes.shape

In the next step let us check the summary statistics of the data.

diabetes.describe()

Now let us see how many diabetic and non-diabetic examples there are.

diabetes['Outcome'].value_counts()

We can see there are 500 non-diabetic examples and 268 diabetic examples.
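To put the class counts in context, the short sketch below computes the share of diabetic examples and the accuracy a trivial model would reach by always predicting the majority class. It uses only the two counts reported above; the variable names are illustrative.

```python
# Class counts reported above for the Outcome column.
counts = {0: 500, 1: 268}

total = sum(counts.values())
# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(counts.values()) / total
diabetic_share = counts[1] / total

print(f"total examples: {total}")
print(f"diabetic share: {diabetic_share:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Any reported model accuracy is easier to judge against this baseline than against 50%.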
In the next step let us find the correlation between the features.

import seaborn as sns
corr_mat = diabetes.corr()
sns.heatmap(corr_mat, annot=True)

Now let us check for any null values present in the dataset.

diabetes.isna().sum()

There are no null values present in the dataset.
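A check with isna() only finds values stored as NaN. The column names suggest this is the widely used Pima Indians diabetes data, where missing measurements are commonly recorded as 0, so counting zeros may be worth doing as well. The sketch below is an assumption-laden illustration on a tiny hypothetical frame, not the real dataset; the choice of "suspect" columns is also an assumption.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame standing in for the real dataset.
sample = pd.DataFrame({
    "Glucose": [148, 0, 183, 89],
    "BloodPressure": [72, 66, 0, 0],
    "BMI": [33.6, 26.6, 23.3, 0.0],
    "Pregnancies": [6, 0, 8, 1],  # zero is a legitimate value here
})

# Columns where a zero is not a plausible measurement (an assumption of this sketch).
suspect_cols = ["Glucose", "BloodPressure", "BMI"]

# isna() reports nothing because the gaps are stored as zeros, not NaN.
print(sample.isna().sum().sum())

# Count zeros per suspect column.
zero_counts = (sample[suspect_cols] == 0).sum()
print(zero_counts)

# Optionally mark those zeros as missing so they can be imputed later.
cleaned = sample.copy()
cleaned[suspect_cols] = cleaned[suspect_cols].replace(0, np.nan)
print(cleaned.isna().sum())
```

On the real data the same two lines, applied to `diabetes`, would show whether zeros are standing in for missing values.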
Now let us separate the features from the target label and store them in different variables.

X = diabetes.drop(columns='Outcome')
Y = diabetes['Outcome']
print(X)
print(Y)

Now split the dataset into a training set and a testing set.

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, stratify=Y, random_state=2)
print(X.shape, X_train.shape, X_test.shape)
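The stratify argument asks train_test_split to keep the class proportions roughly equal in both splits. The sketch below illustrates this with synthetic labels that have the same class counts as the dataset (500 and 268); the placeholder features and the `_demo` style names are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels with the same class counts as the dataset (500 vs 268).
y = np.array([0] * 500 + [1] * 268)
X_demo = np.zeros((len(y), 8))  # placeholder features; values are irrelevant here

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y, test_size=0.2, stratify=y, random_state=2
)

print(X_tr.shape, X_te.shape)
print("positive share overall:", round(y.mean(), 3))
print("positive share train:  ", round(y_tr.mean(), 3))
print("positive share test:   ", round(y_te.mean(), 3))
```

Without stratify, a small test set could by chance end up with noticeably more or fewer diabetic examples than the training set.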

Now let us build the model and check its accuracy score.

from sklearn import svm
classifier = svm.SVC(kernel='linear')
classifier.fit(X_train, Y_train)
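StandardScaler is imported at the top, but the model here is fitted on unscaled features. SVMs are generally sensitive to feature scale, so one possible variant is to chain the scaler and the classifier in a pipeline. The following is a minimal sketch on synthetic data generated with make_classification; it is not the author's setup, and no claim is made about how it would score on the diabetes data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: 8 numeric features, binary target.
X_syn, y_syn = make_classification(n_samples=400, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, stratify=y_syn, random_state=2
)

# The scaler is fitted on the training split only, then applied before the SVM.
scaled_svm = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scaled_svm.fit(X_tr, y_tr)

test_score = scaled_svm.score(X_te, y_te)
print("test accuracy:", test_score)
```

Because the scaler lives inside the pipeline, the same transformation is applied automatically whenever the pipeline predicts on new inputs.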

Let us check the accuracy on the training data.

# accuracy score on the training data (true labels first, then predictions)
X_train_prediction = classifier.predict(X_train)
training_data_accuracy = accuracy_score(Y_train, X_train_prediction)
print('Accuracy score of the training data : ', training_data_accuracy)

Let us check the accuracy on the testing data.

# accuracy score on the test data
X_test_prediction = classifier.predict(X_test)
test_data_accuracy = accuracy_score(Y_test, X_test_prediction)
print('Accuracy score of the test data : ', test_data_accuracy)

# per-class precision and recall; true labels go first, otherwise the two are swapped
from sklearn.metrics import classification_report
print(classification_report(Y_test, X_test_prediction))
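scikit-learn metric functions take the true labels as the first argument and the predictions as the second. Accuracy is unaffected by the order, but precision and recall are not. The toy labels below are made up purely to illustrate this.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Made-up labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

# Accuracy is symmetric, so the order does not change it.
acc_ok = accuracy_score(y_true, y_pred)
acc_swapped = accuracy_score(y_pred, y_true)
print(acc_ok, acc_swapped)

# Precision and recall are not symmetric: swapping the arguments swaps them.
prec_ok = precision_score(y_true, y_pred)
rec_ok = recall_score(y_true, y_pred)
prec_swapped = precision_score(y_pred, y_true)
rec_swapped = recall_score(y_pred, y_true)
print(prec_ok, rec_ok)
print(prec_swapped, rec_swapped)
```

The same applies to classification_report, whose per-class precision and recall columns trade places if the arguments are reversed.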

Now let us check the model by passing in a sample data point.

input_data = (5,166,72,19,175,25.8,0.587,51)

# convert the tuple to a NumPy array and reshape it to a single row (1 sample, 8 features)
input_data_as_numpy_array = np.asarray(input_data)
input_data_reshaped = input_data_as_numpy_array.reshape(1,-1)

# no scaling is applied here because the classifier was trained on unscaled features
std_data = input_data_reshaped
print(std_data)

prediction = classifier.predict(std_data)
print(prediction)

if prediction[0] == 0:
  print('The person is not diabetic')
else:
  print('The person is diabetic')

Save the model.

# the with block closes the file once the model has been written
with open('E:\\diabetes\\model.pkl', 'wb') as model_file:
  pickle.dump(classifier, model_file)
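A saved model is only useful if it can be loaded again. The sketch below shows a pickle round trip on a small stand-in classifier trained on random toy data and written to a temporary directory; the toy data, the temporary path, and the variable names are assumptions for the example, not part of the original workflow.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn import svm

# Hypothetical stand-in for the trained classifier: a linear SVM on toy data.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(40, 8))
y_toy = (X_toy[:, 0] > 0).astype(int)
clf = svm.SVC(kernel="linear").fit(X_toy, y_toy)

# Save to a temporary path (the walkthrough uses its own path instead).
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(clf, f)

# Load it back and confirm the restored model gives identical predictions.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

same = np.array_equal(clf.predict(X_toy), restored.predict(X_toy))
print(same)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that was used to save them.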
