Term Deposits, also known as Fixed Deposits or Time Deposits, are an investment vehicle in which a lump-sum amount is placed for a fixed length of time, ranging from one month to five years, at an agreed rate of interest.
Banks, non-banking financial companies (NBFCs), credit unions, post offices, and building societies are all places where you may get a term deposit.
Term Deposits have certain monetary characteristics, such as a fixed tenure and a guaranteed rate of interest, that have made them popular among investors.
This project can help banks and financial institutions find valuable investors who are interested in buying term deposits.
The classic bank marketing dataset was originally uploaded to the UCI Machine Learning Repository. The dataset contains information on a financial institution's marketing campaign, which you must examine to discover ways to improve the bank's future marketing efforts.
CatBoost is a machine learning algorithm that was recently open-sourced by Yandex.
It's simple to interface with deep learning frameworks like TensorFlow from Google and Core ML from Apple.
It can work with a variety of data formats to assist organizations in addressing various challenges.
To top it off, its accuracy is highly competitive with the best algorithms in the industry.
It is particularly strong in two ways:
1) it produces cutting-edge results without the extensive training data that other machine learning approaches require, and
2) it provides excellent out-of-the-box support for the more descriptive data formats that often accompany business challenges.
The term "CatBoost" is derived from the phrases "Category" and "Boosting."
As previously stated, the library works well with multiple categories of data, including audio, text, image, and historical data.
The "Boost" in the name comes from gradient boosting, the machine learning technique on which the library is built.
Gradient boosting is a powerful machine learning approach that has been successfully applied to a variety of business problems such as fraud detection, recommendation systems, and forecasting.
It can also produce excellent results with a relatively small quantity of data, unlike deep learning models, which require large amounts of data to learn from.
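As a quick illustration of that out-of-the-box categorical support, here is a minimal sketch (using a hypothetical DataFrame df_example and hypothetical column names) that passes raw string columns straight to CatBoost through the cat_features argument instead of encoding them first.
from catboost import CatBoostClassifier, Pool
# Hypothetical example: 'job' and 'marital' are raw string columns,
# 'age' and 'balance' are numeric, and 'deposit' is a 0/1 target.
train_pool = Pool(data=df_example[['age', 'balance', 'job', 'marital']],
                  label=df_example['deposit'],
                  cat_features=['job', 'marital'])  # encoded internally by CatBoost
model = CatBoostClassifier(iterations=200, verbose=False)
model.fit(train_pool)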
Understanding the Code
First, let us import the required libraries for the project.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import mean_squared_error as mse
from sklearn.metrics import r2_score
import joblib
import pickle
And now load the data into the system.
df = pd.read_csv("data.csv")
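Before plotting anything, it can help to take a quick look at the data; the lines below are a small optional check (assuming the file follows the UCI bank-marketing schema with a 'deposit' target column).
print(df.shape)                       # number of rows and columns
print(df.head())                      # first few records
print(df['deposit'].value_counts())   # class balance of the target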
Also, let us look at a few important visualizations of our data.
# Pie-chart
labels = 'Yes', 'No'
sizes = [5868, 5284]
colors = ['black','greenyellow']
fig1, ax1 = plt.subplots(figsize =(7.5,7.5))
ax1.pie(sizes, colors=colors, labels=labels, shadow=True, startangle=90)
ax1.axis('equal')
plt.title("Deposit")
plt.show()
# Pie-chart
labels = 'married', 'single' , 'divorced'
sizes = [6346, 3513, 1293]
colors = ['violet', 'lightgrey','black']
fig1, ax1 = plt.subplots(figsize =(7.5,7.5))
ax1.pie(sizes, colors=colors, labels=labels, shadow=True, startangle=90)
ax1.axis('equal')
plt.title("marital")
plt.show()
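The pie-chart sizes above are hard-coded counts; if you prefer to derive them from the data itself, a sketch like the following (assuming the same 'deposit' column) keeps the plot in sync with whatever file you actually loaded.
# Compute the slice sizes directly from the data instead of hard-coding them
deposit_counts = df['deposit'].value_counts()
fig, ax = plt.subplots(figsize=(7.5, 7.5))
ax.pie(deposit_counts.values, labels=deposit_counts.index, shadow=True, startangle=90)
ax.axis('equal')
plt.title("Deposit")
plt.show()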
Coming to the 'Data Preprocessing' part, let us search for missing values in the data.
df.isnull().sum()
As you can see, no missing values exist in our data.
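Had there been missing values, a typical handling step would look something like the sketch below ('balance' is used here only as a hypothetical numeric column name).
# Only needed if isnull().sum() had reported gaps (it did not here)
df = df.dropna()                                                 # drop incomplete rows
# df['balance'] = df['balance'].fillna(df['balance'].median())   # or impute a numeric column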
Now let us encode the categorical values to feed the data into the model.
from sklearn import preprocessing
categorical = ['job', 'marital', 'education', 'default', 'housing',
               'loan', 'contact', 'month', 'poutcome']
for feature in categorical:
    le = preprocessing.LabelEncoder()
    df[feature] = le.fit_transform(df[feature])
df['deposit'] = df['deposit'].map({'yes': 1, 'no': 0})
As you can see, I used the 'LabelEncoder' class to encode our data.
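One optional refinement, shown as a variant of the loop above rather than code to run after it, is to keep each fitted encoder so the integer codes can be mapped back to their original labels later.
# Variant of the encoding loop that remembers the fitted encoders
encoders = {}
for feature in categorical:
    le = preprocessing.LabelEncoder()
    df[feature] = le.fit_transform(df[feature])
    encoders[feature] = le
print(dict(enumerate(encoders['job'].classes_)))  # integer code -> original label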
Let us split the data into training and testing sets using the "train_test_split" function.
Y = df['deposit']
X = df.drop('deposit', axis = 1)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2)
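Because the split above is random, results will vary slightly from run to run; fixing a seed and stratifying on the target (an optional tweak, not part of the original code) makes the experiment reproducible and keeps the class ratio the same in both sets.
# Reproducible, class-balanced variant of the split
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42, stratify=Y)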
Finally, we need to scale our data before feeding our data into a model.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = pd.DataFrame(scaler.fit_transform(X_train), columns = X.columns)
X_test = pd.DataFrame(scaler.transform(X_test), columns = X.columns)
As you can see, I used the "StandardScaler" function to scale the data.
Now, let us dive deep into the modelling part of the project.
from catboost import CatBoostClassifier
cb_model = CatBoostClassifier()
cb_model.fit(X_train, y_train)
y_pred = cb_model.predict(X_test)
cb_model.score(X_train, y_train) * 100  # training accuracy (%)
I used the "Catboost" model to solve the problem.
As you can see, I used the "CatBoostClassifier" function to use the "Catboost" algorithm.
Now let us have a look at the model's performance report.
from sklearn.metrics import classification_report
class_names = ['Customers who did not buy the Term Deposit', 'Customers who Bought the Term Deposit']
print(classification_report(y_test, y_pred, target_names=class_names))
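A confusion matrix is a useful companion to the classification report, since it shows exactly where the misclassifications fall.
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))  # rows: actual class, columns: predicted class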
As you can see, the model performs well on the held-out test set and is a good candidate for production use.
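Before deploying, the trained model also needs to be persisted; since joblib was imported earlier, a minimal sketch of saving and reloading the model looks like this (the file name is arbitrary).
# Persist the trained model so it can be reloaded for serving
joblib.dump(cb_model, "catboost_term_deposit.joblib")
loaded_model = joblib.load("catboost_term_deposit.joblib")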
Thank you for your time.