Mental illness Prediction

Hashwanth Gogineni


Mental illness

Mental illness, also called a mental health disorder, refers to a wide range of conditions that affect your mood, thinking, and behaviour. Examples include depression, anxiety disorders, schizophrenia, eating disorders, and addictive behaviours.

Many people have mental health concerns from time to time. A mental health concern becomes a mental illness when ongoing signs and symptoms cause frequent stress and affect your ability to function.

A mental illness can make you miserable and cause problems in daily life, such as at school, at work, or in relationships. Symptoms can often be managed with a combination of medication and talk therapy (psychotherapy).

Mental illness Symptoms

Mental illness can manifest in many ways, and its symptoms can affect emotions, thoughts, and behaviours.

Examples of warning signs and symptoms include:

Feeling sad or low

Confusion in thoughts or a loss of concentration

Excessive fears or worries, or extreme feelings of guilt

Extreme mood swings with highs and lows

Withdrawal from friends and activities

Significant exhaustion, a lack of energy, or difficulty sleeping

Detachment from reality (delusions), paranoia, or hallucinations

Inability to deal with day-to-day issues or stress

Trouble understanding and relating to situations and people

Problems with alcohol or drug use

Major changes in eating habits

Sex drive changes

Excessive anger, hostility, or violence

Suicidal thinking

Physical problems, such as stomach discomfort, back pain, headaches, or other unexplained aches and pains, can sometimes be symptoms of a mental health condition.

Why Mental Illness Prediction?

This project can help tech companies identify and address their employees' mental health issues.

Dataset

The data comes from a 2014 survey that measured attitudes towards mental health in the tech workplace and the prevalence of mental health conditions among respondents.

This dataset contains the following data:

Timestamp

Age

Gender

Country

state - If you live in the United States, which state or territory do you live in?

self_employed - Are you self-employed?

family_history - Do you have a family history of mental illness?

treatment - Have you sought treatment for a mental health condition?

work_interfere - If you have a mental health condition, do you feel that it interferes with your work?

no_employees - How many employees does your company or organization have?

remote_work - Do you work remotely (outside of an office) at least 50% of the time?

tech_company - Is your employer primarily a tech company?

benefits - Does your employer provide mental health benefits?

care_options - Do you know the options for mental health care your employer provides?

wellness_program - Has your employer ever discussed mental health as part of an employee wellness program?

seek_help - Does your employer provide resources to learn more about mental health issues and how to seek help?

anonymity - Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources?

leave - How easy is it for you to take medical leave for a mental health condition?

mental_health_consequence - Do you think that discussing a mental health issue with your employer would have negative consequences?

phys_health_consequence - Do you think that discussing a physical health issue with your employer would have negative consequences?

coworkers - Would you be willing to discuss a mental health issue with your coworkers?

supervisor - Would you be willing to discuss a mental health issue with your direct supervisor(s)?

mental_health_interview - Would you bring up a mental health issue with a potential employer in an interview?

phys_health_interview - Would you bring up a physical health issue with a potential employer in an interview?

mental_vs_physical - Do you feel that your employer takes mental health as seriously as physical health?

obs_consequence - Have you heard of or observed negative consequences for coworkers with mental health conditions in your workplace?

comments - Any additional notes or comments

Random Forest

Random forest is a supervised machine learning algorithm for classification and regression problems.

It uses ensemble learning, a technique that combines several models to solve a complex problem.

A random forest is made up of many decision trees.

The forest is trained with bagging (bootstrap aggregation).

Bagging trains each tree on a random sample of the training data drawn with replacement and then combines the trees' predictions, which reduces variance.

The random forest determines its output from the predictions of the individual trees.

For classification it takes the majority vote of the trees; for regression it averages their outputs.

Accuracy generally improves as the number of trees grows, although the gain levels off beyond a certain point.

A random forest overcomes the main drawback of a single decision tree.

It reduces overfitting to the training data and usually improves accuracy.

It gives reasonable predictions without much hyperparameter tuning in packages such as scikit-learn.
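To make bagging and majority voting concrete, here is a minimal sketch that builds a tiny "forest" by hand. It assumes synthetic data from make_classification rather than the survey data, and the number of trees (25) is an arbitrary choice for illustration; scikit-learn's RandomForestClassifier does all of this internally.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption for illustration only)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)

trees = []
for i in range(25):
    # Bootstrap sample: draw row indices with replacement
    idx = rng.integers(0, len(X), size=len(X))
    # Each split considers a random subset of features, as in a random forest
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# One row of 0/1 predictions per tree, shape (n_trees, n_samples)
votes = np.stack([tree.predict(X) for tree in trees])

# Majority vote: predict class 1 when at least half of the trees say 1
majority = (votes.mean(axis=0) >= 0.5).astype(int)
bagged_accuracy = (majority == y).mean()
print("Accuracy of the voted prediction on the training data:", bagged_accuracy)
```

The accuracy printed here is measured on the same data the trees were trained on, so it only shows that the voting mechanics work, not how well the ensemble generalizes.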

Why Random Forest?

The following are a few reasons to use the random forest algorithm:

Its trees are independent of each other and can be trained in parallel, so training is usually fast.

It often predicts with high accuracy and runs efficiently even on large datasets.

Some implementations can also maintain accuracy when part of the data is missing; in this project the missing values are filled in before training.

Understanding Code

First, let us import the required libraries for the project.

import pandas as pd

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt

from sklearn.preprocessing import LabelEncoder

from sklearn.pipeline import Pipeline

from sklearn.model_selection import train_test_split

from sklearn import metrics

from sklearn.metrics import mean_squared_error as mse

from sklearn.metrics import r2_score

import joblib

import pickle

And now load the data into the system.

df=pd.read_csv("data.csv")

Also, let us have a look at a few important visualizations of our data.

from collections import Counter



country_count = Counter(df['Country'].dropna().tolist()).most_common(10)

country_idx = [country[0] for country in country_count]

country_val = [country[1] for country in country_count]

fig,ax = plt.subplots(figsize=(8,6))

sns.barplot(x = country_idx,y=country_val ,ax =ax)

plt.title('Top ten countries')

plt.xlabel('Country')

plt.ylabel('Count')

ticks = plt.setp(ax.get_xticklabels(),rotation=90)
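The same top-N counts can also be obtained with pandas alone. The sketch below assumes a small hypothetical frame named toy in place of the survey data; value_counts drops missing values and sorts by frequency, so it plays the role of Counter(...).most_common(...) above.

```python
import pandas as pd

# Hypothetical stand-in for the 'Country' column of the survey data
toy = pd.DataFrame({
    "Country": ["United States", "Canada", "United States", None,
                "India", "Canada", "United States"]
})

# value_counts ignores missing values and sorts from most to least frequent
top_countries = toy["Country"].value_counts().head(2)

country_idx = top_countries.index.tolist()   # labels for the x axis
country_val = top_countries.tolist()         # bar heights
print(country_idx, country_val)
```

The two lists can be passed to sns.barplot exactly as in the code above.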

# seaborn is already imported above; pass the column as x so the call also works on recent seaborn versions

sns.countplot(x=df['treatment'])

plt.title('Treatment Distribution')

Coming to the 'Data Preprocessing' part, let us search for missing values in the data.

df.isnull().sum()

As you can see, missing values exist in our data.

df['work_interfere'] = df['work_interfere'].fillna('Don\'t know' )

print(df['work_interfere'].unique())



df['self_employed'] = df['self_employed'].fillna('No')

print(df['self_employed'].unique())



df.drop(["Timestamp", "comments", "state"], axis = 1, inplace = True)

As you can see, I dropped the 'state' and 'comments' features because they contain a lot of missing values, and 'Timestamp' because the time a response was submitted is not useful for prediction.
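The decision of which columns to fill and which to drop can be based on the share of missing values per column. The following is a minimal sketch on a hypothetical three-column frame; the 50% threshold and the use of the most frequent value for 'self_employed' are assumptions for illustration, not rules taken from the project.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the survey data
toy = pd.DataFrame({
    "self_employed": ["No", np.nan, "Yes", "No"],
    "work_interfere": ["Often", np.nan, np.nan, "Never"],
    "comments": [np.nan, np.nan, np.nan, "Great survey"],
})

# Fraction of missing values in each column
missing_share = toy.isnull().mean()

# Drop columns where more than half of the values are missing (assumed threshold)
mostly_missing = missing_share[missing_share > 0.5].index.tolist()
toy = toy.drop(columns=mostly_missing)

# Fill the remaining gaps: most frequent answer, or an explicit "Don't know"
toy["self_employed"] = toy["self_employed"].fillna(toy["self_employed"].mode()[0])
toy["work_interfere"] = toy["work_interfere"].fillna("Don't know")

print(toy)
```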

Now let us encode the categorical values to feed the data into the model.

from sklearn import preprocessing



categorical = ['Gender', 'Country', 'self_employed', 'family_history',

       'treatment', 'work_interfere', 'no_employees', 'remote_work',

       'tech_company', 'benefits', 'care_options', 'wellness_program',

       'seek_help', 'anonymity', 'leave', 'mental_health_consequence',

       'phys_health_consequence', 'coworkers', 'supervisor',

       'mental_health_interview', 'phys_health_interview',

       'mental_vs_physical', 'obs_consequence']

for feature in categorical:

        le = preprocessing.LabelEncoder()

        df[feature] = le.fit_transform(df[feature])

As you can see, I used the 'LabelEncoder' class to convert each categorical column into integer codes.
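To see what LabelEncoder actually produces, here is a small sketch on a hypothetical list of answers. The encoder sorts the distinct labels and numbers them from 0, and the fitted encoder can map the codes back to the original text.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical answers from one categorical column
answers = ["Yes", "No", "Don't know", "Yes"]

le = LabelEncoder()
codes = le.fit_transform(answers)

# classes_ holds the sorted labels; the position of a label is its code
mapping = {str(label): code for code, label in enumerate(le.classes_)}

# inverse_transform recovers the original labels from the codes
decoded = le.inverse_transform(codes).tolist()

print(mapping)
print(codes.tolist(), decoded)
```

Because the codes follow alphabetical order, they carry no meaningful ranking. Tree-based models usually cope with this, but keeping the fitted encoder for each column is useful when predictions need to be translated back into readable answers.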

Let us split the data using the "train_test_split" function into training and testing sets.

Y = df['treatment']

X = df.drop('treatment', axis = 1)



X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2)
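A split is easier to reproduce and compare when it is seeded and stratified. The sketch below uses hypothetical arrays instead of the survey data; random_state=42 is an arbitrary seed, and stratify=y keeps the class proportions the same in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 rows, 2 features, perfectly balanced labels
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 50 + [1] * 50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # 20% of the rows go to the test set
    random_state=42,   # fixed seed so the split is repeatable
    stratify=y,        # preserve the 50/50 class balance in both parts
)

print(X_train.shape, X_test.shape)
print("Positive labels in train/test:", y_train.sum(), y_test.sum())
```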

Next, let us scale the data. Tree-based models such as random forest do not need scaled features, but scaling does no harm and keeps the data ready for other models.

from sklearn.preprocessing import StandardScaler



scaler = StandardScaler()



X_train = pd.DataFrame(scaler.fit_transform(X_train), columns = X.columns)



X_test = pd.DataFrame(scaler.transform(X_test), columns = X.columns)

As you can see, I used the "StandardScaler" class to scale the data, fitting it on the training set only and reusing the fitted scaler on the test set.
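The effect of StandardScaler can be checked directly. This sketch assumes random numbers in place of the survey features: after fitting on the training part, every training column has mean 0 and standard deviation 1, while the test part is transformed with the training statistics and so is only approximately standardized.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features (for example, ages around 30)
rng = np.random.default_rng(0)
train = rng.normal(loc=30.0, scale=5.0, size=(80, 3))
test = rng.normal(loc=30.0, scale=5.0, size=(20, 3))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learn mean and std from the training data
test_scaled = scaler.transform(test)        # reuse them, never refit on the test data

print("Train means:", train_scaled.mean(axis=0).round(6))
print("Train stds: ", train_scaled.std(axis=0).round(6))
print("Test means: ", test_scaled.mean(axis=0).round(3))
```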

Now, let us dive deep into the modelling part of the project.

from sklearn.ensemble import RandomForestClassifier





rf_model= RandomForestClassifier()

rf_model.fit(X_train, y_train)

y_pred = rf_model.predict(X_test)

# accuracy on the held-out test set; the score on the training data would be over-optimistic

rf_model.score(X_test, y_test)

I used the "Random Forest" model to solve the problem, through the "RandomForestClassifier" class of scikit-learn.
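A fitted random forest exposes more than predictions. The sketch below assumes synthetic data from make_classification instead of the survey data and arbitrary settings (100 trees, seed 0); it compares accuracy on the training and test sets and reads the feature importances.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the encoded survey features (assumption)
X, y = make_classification(n_samples=400, n_features=10,
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# A large gap between these two numbers is a sign of overfitting
train_acc = rf.score(X_train, y_train)
test_acc = rf.score(X_test, y_test)
print("Train accuracy:", train_acc, "Test accuracy:", test_acc)

# Impurity-based importances: one value per feature, summing to 1
importances = rf.feature_importances_
for i in importances.argsort()[::-1][:3]:
    print("feature", i, "importance", round(float(importances[i]), 3))
```

With the survey data, pairing the importances with X.columns would show which survey answers the model relies on most.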

Now let us have a look at the model's performance report.

from sklearn.metrics import classification_report

class_names = ['Mental illness Treatment is not required', 'Mental illness Treatment is required']

print(classification_report(y_test, y_pred, target_names=class_names))

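The report shows the precision, recall, and F1-score of each class on the test set, which is a better basis for judging the model than accuracy alone.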
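The numbers in a classification report come from the confusion matrix. Here is a small sketch with hypothetical labels (1 meaning treatment was sought) that shows how precision and recall for the positive class are derived.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true labels and predictions for ten respondents
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

precision = precision_score(y_true, y_pred)  # tp / (tp + fp)
recall = recall_score(y_true, y_pred)        # tp / (tp + fn)

print(cm)
print("precision:", precision, "recall:", round(recall, 3))
```

In this toy case the model is right about 4 of its 5 positive predictions (precision 0.8) but finds only 4 of the 6 actual positives (recall of about 0.667).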

Thank you for your time.

2 comments

Raj Gupta How can I get the dataset? Where is the dataset link?
December 11, 2021 - 1 likes this
Hashwanth Gogineni Link for the Dataset - https://www.kaggle.com/osmi/mental-health-in-tech-survey
December 20, 2021
