Term Deposit Prediction

Term Deposit


Term deposits, also known as fixed deposits, are an investment vehicle in which a lump-sum amount is placed for a fixed length of time, typically ranging from one month to five years, at an agreed rate of interest.


Banks, non-banking financial companies (NBFCs), credit unions, post offices, and building societies all offer term deposits.


Characteristics of Term Deposits


Term deposits have several characteristics that have made them popular among investors:



  • Fixed rate of interest: Term deposits provide a set rate of interest that is not affected by market changes.

  • Safety of investment: Because term deposit interest rates are unaffected by economic fluctuations, they are among the safest investment alternatives available.

  • Preset investment period: The investor can choose the investment period from the tenors that the financial institution offers. The interest rate will normally be higher for a longer tenor. Before investing, though, it is a good idea to compare the rates offered for each tenor.

  • Interest payment: The investor has the choice of receiving the interest income at maturity or on a monthly, quarterly, or annual basis.

  • Wealth generation: Even during market downturns, the constant interest earned on the investment means that the investor's wealth keeps growing.
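To make the interest payment options concrete, here is a minimal sketch of the arithmetic. The helper name `maturity_value`, the amounts, and the rate are all hypothetical. It assumes annual compounding when interest is taken at maturity, while real products may use other compounding and day-count rules.

```python
def maturity_value(principal, annual_rate, years, payouts_per_year=None):
    """Return (final balance, total interest paid out) for a fixed-rate deposit.

    payouts_per_year=None: interest is reinvested once a year and paid at maturity
    (an assumption made for this sketch).
    Otherwise: interest is paid out each period and the principal stays unchanged.
    """
    if payouts_per_year is None:
        balance = principal * (1 + annual_rate) ** years
        return balance, 0.0
    periodic_interest = principal * annual_rate / payouts_per_year
    total_paid = periodic_interest * payouts_per_year * years
    return principal, total_paid


# Hypothetical deposit: 10,000 at 5% per year for 2 years
at_maturity = maturity_value(10_000, 0.05, 2)
quarterly = maturity_value(10_000, 0.05, 2, payouts_per_year=4)
print(round(at_maturity[0], 2), round(quarterly[1], 2))
```

Taking interest at maturity yields slightly more in total because the interest itself earns interest, whereas periodic payouts provide a regular income.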


 



Why Term Deposit Prediction?


This project can help banks and financial institutions find valuable investors who are interested in buying term deposits.


Dataset 


The classic Bank Marketing dataset was originally published in the UCI Machine Learning Repository. It contains information on a financial institution's marketing campaign, which we examine to find ways to improve the bank's future marketing efforts.



CatBoost


CatBoost is a gradient boosting library developed by Yandex and open-sourced in 2017.


It can be integrated with deep learning frameworks such as Google's TensorFlow and Apple's Core ML.


It can work with a variety of data formats to help organizations address a range of problems.


To top it off, Yandex describes its accuracy as best in class.


It is particularly strong in two ways: 


1) it can produce strong results without the extensive data training that other machine learning approaches typically require, and 


2) it provides powerful out-of-the-box support for the more descriptive data formats, such as categorical features, that often accompany business problems. 


The name "CatBoost" is derived from the words "Category" and "Boosting."


As previously stated, the library works well with various data types, including audio, text, image, and historical data.


The "Boost" part comes from gradient boosting, the machine learning technique on which the library is built.


Gradient boosting is a powerful machine learning technique that has been applied successfully to business problems such as fraud detection, item recommendation, and forecasting.


It can also produce excellent results with a relatively small amount of data, unlike deep learning models, which need a large amount of data to learn from.
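To make the idea of gradient boosting concrete, here is a minimal sketch for squared-error regression built from scikit-learn decision stumps. It illustrates the general technique only and is not CatBoost's actual algorithm, which adds its own refinements. The function names and the synthetic data are assumptions made for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Fit an additive model of shallow trees, each trained on the current residuals."""
    base = float(np.mean(y))               # start from a constant prediction
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_rounds):
        residuals = y - prediction         # negative gradient of the squared loss
        tree = DecisionTreeRegressor(max_depth=1, random_state=0)
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees


def predict_gradient_boosting(model, X, learning_rate=0.1):
    """Sum the constant and the shrunken contribution of every tree."""
    base, trees = model
    return base + learning_rate * sum(tree.predict(X) for tree in trees)


# Synthetic data: a noisy sine wave
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=200)

model = fit_gradient_boosting(X_demo, y_demo)
mse_start = np.mean((y_demo - np.mean(y_demo)) ** 2)
mse_end = np.mean((y_demo - predict_gradient_boosting(model, X_demo)) ** 2)
print(mse_start, mse_end)
```

Each round fits a small tree to whatever the current ensemble still gets wrong, so the training error shrinks as trees are added.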


Understanding Code

First, let us import the required libraries for the project.


# Data handling
import pandas as pd
import numpy as np
# Visualization
import seaborn as sns
import matplotlib.pyplot as plt
# Preprocessing, splitting, and evaluation
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn import metrics
# Model persistence
import joblib
import pickle


Now, let us load the data.


df = pd.read_csv("data.csv")
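Before plotting, it can help to check the shape, the column types, and the class balance. The sketch below uses a tiny in-memory CSV as a stand-in for data.csv. The rows are made up, and the exact column set of the real file is an assumption.

```python
import io

import pandas as pd

# Made-up rows standing in for data.csv
sample_csv = io.StringIO(
    "age,job,marital,balance,deposit\n"
    "59,admin.,married,2343,yes\n"
    "41,technician,married,1270,no\n"
    "33,services,single,184,yes\n"
    "28,student,single,0,no\n"
)
demo_df = pd.read_csv(sample_csv)

print(demo_df.shape)                       # (rows, columns)
print(demo_df.dtypes)                      # numeric vs. text columns
print(demo_df["deposit"].value_counts())   # class balance of the target
```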


Also, let us look at a few important visualizations of our data.


# Pie chart of the target variable (deposit)
labels = 'Yes', 'No'
sizes = [5868, 5284]
colors = ['black','greenyellow']
fig1, ax1 = plt.subplots(figsize =(7.5,7.5))
ax1.pie(sizes,colors = colors , labels=labels, shadow=True, startangle=90)
ax1.axis('equal')
plt.title("Deposit")
plt.show()





# Pie chart of marital status
labels = 'married', 'single' , 'divorced'
sizes = [6346, 3513, 1293]
colors = ['violet', 'lightgrey','black']
fig1, ax1 = plt.subplots(figsize =(7.5,7.5))
ax1.pie(sizes,colors = colors , labels=labels, shadow=True, startangle=90)
ax1.axis('equal')
plt.title("marital")
plt.show()
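The counts in the charts above are typed in by hand. A small sketch of deriving them from the data instead, so that labels and sizes cannot drift apart, is shown here. The toy frame is a hypothetical stand-in for df.

```python
import pandas as pd

# Hypothetical stand-in for the real DataFrame
demo_df = pd.DataFrame(
    {"marital": ["married", "single", "married", "divorced", "married", "single"]}
)

counts = demo_df["marital"].value_counts()  # sorted by frequency, descending
labels = counts.index.tolist()
sizes = counts.tolist()
print(labels, sizes)
```

The resulting labels and sizes can be passed to ax1.pie in the same way as the hard-coded lists.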




Moving on to data preprocessing, let us check for missing values in the data.


df.isnull().sum()


The output shows that no missing values exist in our data.
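Note that isnull() only detects true NaN values. In copies of the bank marketing data, missing categorical entries are commonly coded as the string "unknown", which this check does not catch, so it is worth verifying on your own file. A sketch of counting such placeholders on a made-up frame:

```python
import pandas as pd

# Made-up rows; "unknown" as the placeholder value is an assumption
demo_df = pd.DataFrame(
    {
        "job": ["admin.", "unknown", "services"],
        "education": ["secondary", "unknown", "unknown"],
        "age": [59, 41, 33],
    }
)

nan_counts = demo_df.isnull().sum()             # true missing values
unknown_counts = (demo_df == "unknown").sum()   # placeholder strings
print(nan_counts)
print(unknown_counts)
```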

Now let us encode the categorical values so that the data can be fed into the model.


from sklearn import preprocessing

categorical = ['job', 'marital', 'education', 'default', 'housing',
               'loan', 'contact', 'month', 'poutcome']
# Replace each category with an integer code, one column at a time
for feature in categorical:
    le = preprocessing.LabelEncoder()
    df[feature] = le.fit_transform(df[feature])

# Map the target to 1 (bought a term deposit) and 0 (did not)
df['deposit'] = df['deposit'].map({'yes': 1, 'no': 0})

As you can see, I used the 'LabelEncoder' class to encode our data.
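The loop above discards each fitted encoder, so the same mapping cannot be reapplied to new customers later. One possible approach, sketched here on toy data, is to keep the encoders in a dictionary. The name `encoders` and the sample rows are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

train_df = pd.DataFrame({"marital": ["married", "single", "divorced", "married"]})

encoders = {}
for feature in ["marital"]:
    le = LabelEncoder()
    train_df[feature] = le.fit_transform(train_df[feature])
    encoders[feature] = le  # keep the fitted encoder for later reuse

# Apply the stored mapping to unseen rows
new_rows = pd.DataFrame({"marital": ["single", "divorced"]})
new_rows["marital"] = encoders["marital"].transform(new_rows["marital"])

print(list(encoders["marital"].classes_))
print(train_df["marital"].tolist(), new_rows["marital"].tolist())
```

LabelEncoder assigns codes in sorted order of the category names, so the integers carry no meaningful ranking.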

Let us split the data into training and testing sets using the "train_test_split" function.


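Y = df['deposit']
X = df.drop('deposit', axis=1)

from sklearn.model_selection import train_test_split
# random_state fixes the shuffle so the split is reproducible (42 is an arbitrary choice)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)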
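If the two classes are not evenly represented, a stratified split keeps the class proportions the same in both sets. A minimal sketch on synthetic labels follows. The 30/70 class mix and the seed are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows, 30% positive class
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([1] * 30 + [0] * 70)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# Share of the positive class in each split
print(y_tr.mean(), y_te.mean())
```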


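Finally, let us scale the data before feeding it into a model. Tree-based models such as CatBoost do not strictly require feature scaling, but it does no harm here.


from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# Fit on the training set only, then reuse the same statistics for the test set.
# Passing index= keeps the rows aligned with y_train and y_test.
X_train = pd.DataFrame(scaler.fit_transform(X_train), columns=X.columns, index=X_train.index)

X_test = pd.DataFrame(scaler.transform(X_test), columns=X.columns, index=X_test.index)

As you can see, I used the "StandardScaler" class to scale the data.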
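To see what the scaler does, the sketch below standardizes synthetic numbers and checks the result. The data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train = rng.normal(loc=50, scale=10, size=(100, 2))
test = rng.normal(loc=50, scale=10, size=(20, 2))

demo_scaler = StandardScaler()
train_scaled = demo_scaler.fit_transform(train)  # learns mean and std from train only
test_scaled = demo_scaler.transform(test)        # reuses the training statistics

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled.mean(axis=0))
```

The training columns end up with mean 0 and standard deviation 1. The test columns are only approximately centered, because they are transformed with the training statistics rather than their own.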

Now, let us dive into the modelling part of the project.


from catboost import CatBoostClassifier


cb_model = CatBoostClassifier()
cb_model.fit(X_train, y_train)
y_pred = cb_model.predict(X_test)
# Accuracy (%) on the training set and on the held-out test set
print(cb_model.score(X_train, y_train) * 100)
print(cb_model.score(X_test, y_test) * 100)

As you can see, I used the "CatBoostClassifier" class to apply the CatBoost algorithm to the problem.

Now let us have a look at the model's performance report.


from sklearn.metrics import classification_report
# Names are listed in label order: 0 (did not buy) first, then 1 (bought)
class_names = ['Customers who did not buy the Term Deposit', 'Customers who bought the Term Deposit']
print(classification_report(y_test, y_pred, target_names=class_names))
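The report is built from the counts of correct and incorrect predictions per class. A small sketch with made-up labels shows how precision and recall follow from the confusion matrix.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels: 1 = bought the term deposit, 0 = did not
y_true_demo = [0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true_demo, y_pred_demo)
tn, fp, fn, tp = cm.ravel()

precision = precision_score(y_true_demo, y_pred_demo)  # tp / (tp + fp)
recall = recall_score(y_true_demo, y_pred_demo)        # tp / (tp + fn)
print(cm)
print(precision, recall)
```

For a marketing campaign, recall on the "bought" class shows how many interested customers the model finds, while precision shows how many of the flagged customers are actually interested.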


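As you can see, the report gives precision, recall, and F1-score for each class on the test set. If these figures are satisfactory for both classes, the model is a reasonable candidate for production, subject to further validation.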
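Serving predictions requires the fitted preprocessing objects as well as the model. The joblib library imported at the start can persist them. Here is a minimal sketch with a toy scaler, where the file name and the data are hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy scaler standing in for the one fitted on the training data
scaler_demo = StandardScaler().fit(np.array([[1.0, 10.0], [3.0, 30.0]]))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scaler.joblib")  # hypothetical file name
    joblib.dump(scaler_demo, path)                 # save to disk
    restored = joblib.load(path)                   # load it back

sample = np.array([[2.0, 20.0]])
print(restored.transform(sample))
```

The restored object applies the same transformation as the original, so new data is preprocessed in the same way as the training data.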


Thank you for your time.

