Vaibhav Mali

ECG Arrhythmia Classification Using Random Forest

Model Overview

An arrhythmia is an abnormal heart rhythm. Some arrhythmias impair the contractions of the heart chambers: an abnormal electrical signal makes the heart pump too fast or too slow, so the lower chambers (ventricles) cannot fill with enough blood.

The dataset contains features extracted from two-lead ECG signals (lead II, V) of the MIT-BIH Arrhythmia dataset (PhysioNet). The features were extracted programmatically from the ECG signals so that heartbeats can be classified as regular or irregular for arrhythmia detection.

There are four ECG arrhythmia datasets here, each employing 2-lead ECG features. The datasets, obtained from PhysioNet, are the MIT-BIH Supraventricular Arrhythmia Database, the MIT-BIH Arrhythmia Database, the St Petersburg INCART 12-lead Arrhythmia Database, and the Sudden Cardiac Death Holter Database.

In each of the datasets, the first column, named "record", is the name of the subject/patient.

Each dataset contains five classes/categories: N (Normal), S (Supraventricular ectopic beat), V (Ventricular ectopic beat), F (Fusion beat), and Q (Unknown beat). The column "type" contains the class information. The code below refers to these classes by the labels N, SVEB, VEB, F, and Q.

The remaining 34 columns contain 17 features for each ECG lead (17 features for lead-II and 17 features for lead-V5).

Link Of Dataset: https://www.kaggle.com/datasets/sadmansakib7/ecg-arrhythmia-classification-dataset

Importing the Necessary Libraries

import pickle

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt

# Default seaborn theme
sns.set(context='notebook', style='darkgrid', palette='colorblind', font='sans-serif', font_scale=1, rc=None)

# Figure and font settings (set after the theme so they are not overridden)
matplotlib.rcParams['figure.figsize'] = [8, 8]
matplotlib.rcParams.update({'font.size': 15})
matplotlib.rcParams['font.family'] = 'sans-serif'

Read Dataset and check first 5 rows of the dataset

# Reading MIT-BIH Arrhythmia Dataset as an example
data_df = pd.read_csv('MIT-BIH Arrhythmia Database.csv', low_memory=False)

print("Number of rows in data =", data_df.shape[0])
print("Number of columns in data =", data_df.shape[1])
print("\n")
print("**Sample data:**")
print(data_df.head())

Description Of The Dataset

print(data_df.describe())

Checking for Null values

print(data_df.isnull().sum())

Split the data into features and class labels

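# Features: every column except the class label and the patient identifier
x_data = data_df.drop(['type', 'record'], axis=1)
# Labels: the "type" column, kept as a one-column DataFrame
y_label = data_df[['type']]
print(y_label.value_counts())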

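Converting the Multi-Class Target to a Binary Class

# Transform multi-class labels into binary-class (arrhythmia and normal).
# replace() returns a new DataFrame, so the result is assigned back
# instead of relying on inplace=True (which returns None).
y_label = y_label.replace(['VEB', 'SVEB', 'F', 'Q'], 'arrhythmia')
y_label = y_label.replace(['N'], 'normal')
# Encode the binary labels as integers: normal -> 0, arrhythmia -> 1
y_label['type'] = y_label['type'].replace(['normal', 'arrhythmia'], [0, 1])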
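The relabeling step can be checked on a toy example. The sketch below uses a hypothetical six-row label column (invented for illustration, not taken from the dataset) and assumes the raw labels are N, SVEB, VEB, F, and Q, as in the code above. The names `toy_labels`, `binary_map`, and `toy_binary` are illustrative only.

```python
import pandas as pd

# Hypothetical toy labels, for illustration only (not real dataset rows)
toy_labels = pd.DataFrame({'type': ['N', 'VEB', 'SVEB', 'F', 'Q', 'N']})

# Map normal beats to 0 and every other beat type to 1 (arrhythmia)
binary_map = {'N': 0, 'VEB': 1, 'SVEB': 1, 'F': 1, 'Q': 1}
toy_binary = toy_labels['type'].map(binary_map)

# Any label missing from binary_map would become NaN, so check for that
print("unmapped labels:", int(toy_binary.isna().sum()))
print(toy_binary.value_counts().to_dict())
```

Checking for unmapped labels is a cheap guard: if a dataset used different class names, the NaN count would be non-zero instead of silently producing wrong targets.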

Train-test Split

from sklearn.model_selection import train_test_split

# Default split: 75% training, 25% test
X_train, X_test, y_train, y_test = train_test_split(x_data, y_label, random_state=101)
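Heartbeat datasets often contain far more normal beats than abnormal ones. As an optional variation (an assumption, not part of the original notebook), `train_test_split` accepts a `stratify` argument that keeps the class ratio the same in both splits. The sketch below uses synthetic data, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced toy data: 90 "normal" (0) and 10 "arrhythmia" (1) rows
rng = np.random.default_rng(101)
X_toy = rng.normal(size=(100, 4))
y_toy = np.array([0] * 90 + [1] * 10)

# stratify=y_toy preserves the 90/10 class ratio in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, stratify=y_toy, random_state=101
)

print("train positives:", int(y_tr.sum()), "of", len(y_tr))
print("test positives:", int(y_te.sum()), "of", len(y_te))
```

Without stratification, a small minority class can end up under-represented in the test split purely by chance, which makes the reported metrics less stable.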

Feature Scaling

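from sklearn.preprocessing import MinMaxScaler

min_max_scaler = MinMaxScaler()
# Fit on the training split only, then apply the same scaling to the test split
X_train = min_max_scaler.fit_transform(X_train)
X_test = min_max_scaler.transform(X_test)

print(min_max_scaler.scale_)

# Save the fitted scaler so the same scaling can be reused later
with open('min_max_scaler.pkl', 'wb') as f:
    pickle.dump(min_max_scaler, f)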
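Min-max scaling maps each feature to x' = (x - min) / (max - min), where min and max are computed from the training split only. The sketch below reproduces that arithmetic with NumPy on made-up numbers, to show why scaled test values can fall outside the [0, 1] range. The arrays and names are illustrative, and the `scale` variable corresponds to `MinMaxScaler.scale_` only for the default (0, 1) feature range.

```python
import numpy as np

# Made-up training and test values for two features (illustration only)
train = np.array([[2.0, 10.0],
                  [4.0, 30.0],
                  [6.0, 20.0]])
test = np.array([[5.0, 40.0]])

# Statistics come from the training data only
col_min = train.min(axis=0)
col_max = train.max(axis=0)
scale = 1.0 / (col_max - col_min)  # per-feature multiplier

train_scaled = (train - col_min) * scale
test_scaled = (test - col_min) * scale

print("scale:", scale)
print("train range:", train_scaled.min(), "to", train_scaled.max())
print("test scaled:", test_scaled)
```

The second test feature (40.0) exceeds the training maximum (30.0), so its scaled value is greater than 1. This is expected behavior and not a sign of a bug.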

Model training

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(random_state=101, n_estimators=150)
model.fit(X_train, y_train.values.ravel())

Feature importance

# The model must be trained before its feature importances can be read
importances = model.feature_importances_

# Sort the feature importances in descending order
sorted_indices = np.argsort(importances)[::-1]

# Feature names, in the same column order as the training data
feat_labels = x_data.columns

for f in range(X_train.shape[1]):
    print("%2d) %-*s %f" % (f + 1, 30,
                            feat_labels[sorted_indices[f]],
                            importances[sorted_indices[f]]))
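The `feature_importances_` attribute of a fitted random forest holds impurity-based importances that are normalized to sum to 1, so each value can be read as a share of the total. The sketch below illustrates this on synthetic data generated with `make_classification`; the data, the smaller forest size, and the names `X_syn`, `y_syn`, `forest`, `imp`, and `order` are assumptions for illustration, not part of the ECG workflow.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 6 features, of which 3 carry class information
X_syn, y_syn = make_classification(
    n_samples=200, n_features=6, n_informative=3,
    n_redundant=0, random_state=101
)

forest = RandomForestClassifier(n_estimators=50, random_state=101)
forest.fit(X_syn, y_syn)

imp = forest.feature_importances_
order = np.argsort(imp)[::-1]  # indices from most to least important

for rank, idx in enumerate(order, start=1):
    print(f"{rank}) feature_{idx}: {imp[idx]:.4f}")
print("sum of importances:", round(float(imp.sum()), 6))
```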

Model testing
a) Accuracy of the model

from sklearn import metrics

y_pred = model.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

b) Confusion Matrix

print("*** Confusion Matrix ***")
print(metrics.confusion_matrix(y_test, y_pred))
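For a binary problem, scikit-learn's confusion matrix has true classes in rows and predicted classes in columns, so with labels 0 and 1 the layout is [[TN, FP], [FN, TP]]. The sketch below derives accuracy, precision, and recall from such a matrix. The counts are invented for illustration and are not results from this model.

```python
import numpy as np

# Invented counts in scikit-learn's layout: rows = true, columns = predicted
cm = np.array([[90, 10],   # true 0: 90 correct (TN), 10 flagged as 1 (FP)
               [5, 45]])   # true 1: 5 missed (FN), 45 detected (TP)

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of all beats classified correctly
precision = tp / (tp + fp)        # share of predicted arrhythmias that are real
recall = tp / (tp + fn)           # share of real arrhythmias that were detected

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

For arrhythmia detection, recall on the arrhythmia class is usually the number to watch, because a missed abnormal beat (FN) is typically more costly than a false alarm (FP).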

c) Classification Report

print("*** Classification Report ***")
print(metrics.classification_report(y_test, y_pred))
