Video Game Sales Prediction

Hashwanth Gogineni


Model Overview

Video game industry

A video game is a game played on a computer, gaming console, or smartphone. Depending on the platform, video games have traditionally been separated into two categories: computer games and console games. In recent years, however, the growth of social networks, cellphones, and tablets has led to new categories such as mobile and social games. Video games have come a long way since the first ones were introduced in the 1970s. Photorealistic graphics and realistic approximations of reality are common in today's video games.

Video games have been around for a long time and are a multibillion-dollar industry. For 2020, the worldwide PC gaming market was predicted to generate around 37 billion dollars in sales, and the mobile gaming market more than 77 billion dollars. What matters now is that the first generation of gamers has reached maturity and has significant purchasing power. Although youngsters spend a large amount of time playing games every day, gaming is no longer solely a children's pastime. In fact, video gaming is gaining popularity among parents all across the world, with a fairly even gender distribution among gaming parents.

Why Video Game Sales Prediction?

This project can help video game companies analyze market trends and plan the games they produce accordingly.

Dataset

The dataset includes a list of video games that have sold over 100,000 copies. A scrape of vgchartz.com was used to create it.

Fields include:

Name - The video game's name

Platform - The platform of the game's release (e.g. Xbox, PS4)

Year - The Year of the game's release

Genre - Genre of the game

Publisher - Publisher of the game

NA_Sales - Sales in 'North America' (in millions)

Global_Sales - Total worldwide sales.

Decision Tree

The 'decision tree' is one of the most widely used approaches for classification and prediction.

Each internal node represents a test on an attribute, each branch represents the test's outcome, and each leaf node (terminal node) holds a class label or, in a regression tree like the one used here, a predicted numeric value.
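To make the node, branch, and leaf terminology concrete, here is a minimal sketch that fits a one-split regression tree and prints its structure with scikit-learn's "export_text" helper. The toy data and the names "X_toy", "y_toy", and "toy_tree" are assumptions made up for illustration; they are not part of the video game dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical toy data: one feature "x"; the target jumps from 1.0 to 5.0 once x exceeds 2.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_toy = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])

# max_depth=1 allows exactly one internal node (one test) and two leaves.
toy_tree = DecisionTreeRegressor(max_depth=1, random_state=0)
toy_tree.fit(X_toy, y_toy)

# Text rendering of the tree: the test on "x", its two branches, and the leaf values.
tree_rules = export_text(toy_tree, feature_names=["x"])
print(tree_rules)

# Each prediction follows a branch down to a leaf and returns that leaf's value.
toy_predictions = toy_tree.predict(np.array([[1.5], [4.5]]))
print(toy_predictions)
```

The printed rules show a single test on "x" with one leaf value per branch, which is the structure described above in miniature.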

The following are some of the advantages of decision tree methods:

Decision trees produce rules that are easy to understand.

Decision trees perform classification and prediction without requiring much computation.

Decision trees can handle both continuous and categorical variables.

Decision trees show which fields are most important for classification or prediction.
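The last advantage can be inspected directly in scikit-learn through the fitted tree's "feature_importances_" attribute. The sketch below uses synthetic data (an assumption for illustration, with hypothetical column names "signal" and "noise") in which the target depends on only one of two features.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical data: the target is driven by "signal" only; "noise" is unrelated.
signal = rng.uniform(0, 10, size=200)
noise = rng.uniform(0, 10, size=200)
X_demo = np.column_stack([signal, noise])
y_demo = 3.0 * signal

demo_tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_demo, y_demo)

# feature_importances_ holds one normalized score per input column.
importances = dict(zip(["signal", "noise"], demo_tree.feature_importances_))
print(importances)
```

On the real dataset, the same attribute could be paired with "X.columns" to see which fields the fitted model relies on most.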

Understanding Code

First, let us import the necessary libraries for the project.

import numpy as np 

import pandas as pd 

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.preprocessing import LabelEncoder, StandardScaler

from sklearn.model_selection import train_test_split

from sklearn import metrics

import pickle

import joblib

Now, let us load the required data into the system.

df = pd.read_csv('data.csv')

Before we start preprocessing our data, let us explore it using a few visualizations.

plt.figure(figsize=(15, 10))

sns.barplot(x="Genre", y="Global_Sales", data=df)

y = df.groupby('Year')['Global_Sales'].sum()

plt.figure(figsize=(15,10))

plt.bar(y.index,y)

plt.xlabel('Year')

plt.ylabel('Global Sales')

plt.show()

Coming to the 'Data Preprocessing' part, let us search for missing values in the data.

df.isnull().sum()

As you can see, missing values exist in our data. So let us eliminate the rows that contain missing values in the 'Year', 'Publisher', or 'Global_Sales' columns.

df = df.dropna(axis=0, subset=['Year','Publisher', 'Global_Sales'])
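As a small illustration of what these two calls do, the sketch below applies them to a four-row frame. The rows and the names "toy_df" and "cleaned" are made up for the example and are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: one is missing 'Year', another is missing 'Publisher'.
toy_df = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D"],
    "Year": [2006.0, np.nan, 2010.0, 2012.0],
    "Publisher": ["Pub X", "Pub Y", None, "Pub Z"],
    "Global_Sales": [1.5, 0.8, 0.3, 2.1],
})

# Count the missing values in each column.
missing_per_column = toy_df.isnull().sum()
print(missing_per_column)

# Drop every row that has a missing value in any of the listed columns.
cleaned = toy_df.dropna(axis=0, subset=["Year", "Publisher", "Global_Sales"])
print(cleaned)
```

Only the rows with complete values in the listed columns survive, so it is worth checking how many rows are lost before dropping them on the full dataset.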

Now let us encode the categorical values to feed the data into the model.

# Fit one encoder per categorical column and save each fitted encoder to disk.
Platform_encoder = LabelEncoder()

df['Platform'] = Platform_encoder.fit_transform(df['Platform'])

with open('Platform_encoder.pkl', 'wb') as f:
    pickle.dump(Platform_encoder, f)

Genre_encoder = LabelEncoder()

df['Genre'] = Genre_encoder.fit_transform(df['Genre'])

with open('Genre_encoder.pkl', 'wb') as f:
    pickle.dump(Genre_encoder, f)

Publisher_encoder = LabelEncoder()

df['Publisher'] = Publisher_encoder.fit_transform(df['Publisher'])

with open('Publisher_encoder.pkl', 'wb') as f:
    pickle.dump(Publisher_encoder, f)

As you can see, I used the 'LabelEncoder' class to encode our data, with a separate encoder for each categorical column.
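The sketch below shows what a fitted encoder does and why it is worth saving: the restored encoder maps labels to the same integer codes and can translate codes back to labels. The sample genre list is an assumption for illustration, and the round trip is done in memory with "pickle.dumps" and "pickle.loads" rather than through a file.

```python
import pickle
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of genre labels.
genres = ["Sports", "Action", "Racing", "Action"]

genre_encoder = LabelEncoder()
# Classes are sorted alphabetically, then each label is replaced by its index.
codes = genre_encoder.fit_transform(genres)
print(list(genre_encoder.classes_), list(codes))

# Serialize and restore the fitted encoder (in memory here, instead of a .pkl file).
restored = pickle.loads(pickle.dumps(genre_encoder))

# The restored encoder decodes the codes and encodes new rows consistently.
decoded = restored.inverse_transform(codes)
new_codes = restored.transform(["Racing"])
print(list(decoded), list(new_codes))
```

One caveat: "transform" raises a "ValueError" for a label the encoder did not see during fitting, so new data with an unknown platform, genre, or publisher needs separate handling.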

Let us split the data using the "train_test_split" function into training and testing sets.

X = df.drop(columns=['Global_Sales', 'Name'])

Y = df['Global_Sales']



X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
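The call above draws a different random split on every run, so the reported metrics will vary slightly between runs. If reproducible results are wanted, one option is to pass "random_state", as in this sketch. The arrays and the seed value 42 are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

# The same random_state gives the same split on every call.
split_a = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
split_b = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

X_tr, X_te, y_tr, y_te = split_a
print(X_tr.shape, X_te.shape)  # 80% of the rows for training, 20% for testing
```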

Finally, let us scale the data before feeding it into the model. The scaler is fit on the training set only and then applied unchanged to the test set.

from sklearn.preprocessing import MinMaxScaler



scaler = MinMaxScaler()



X_train = pd.DataFrame(scaler.fit_transform(X_train), columns = X.columns)



X_test = pd.DataFrame(scaler.transform(X_test), columns = X.columns)



with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

As you can see, I used the "MinMaxScaler" class to scale the data.
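With its default settings, "MinMaxScaler" maps each column to (x - min) / (max - min), where min and max come from the data it was fit on. The sketch below checks this by hand on a single made-up column of years. The values and the name "year_scaler" are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical 'Year' column, split into training and test values.
train_years = np.array([[2000.0], [2010.0], [2020.0]])
test_years = np.array([[2005.0], [2030.0]])

year_scaler = MinMaxScaler()
train_scaled = year_scaler.fit_transform(train_years)  # learns min=2000, max=2020
test_scaled = year_scaler.transform(test_years)        # reuses the training min and max

# The same transformation written out by hand.
col_min = train_years.min(axis=0)
col_max = train_years.max(axis=0)
manual_scaled = (test_years - col_min) / (col_max - col_min)

print(train_scaled.ravel(), test_scaled.ravel())
```

Because the minimum and maximum come from the training data, a test value outside that range (2030 here) is mapped outside the [0, 1] interval.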

Now, let us dive deep into the modelling part of the project.

from sklearn.tree import DecisionTreeRegressor



dt_model = DecisionTreeRegressor()

dt_model.fit(X_train, y_train) 

y_pred = dt_model.predict(X_test)

dt_model.score(X_train, y_train)

I used the "Decision Tree" algorithm to solve the problem, through scikit-learn's "DecisionTreeRegressor" class. Note that the score above is the R2 score on the training data, so it does not show how well the model generalizes to unseen games.
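A fully grown decision tree can fit its training data almost perfectly, which is why the training score alone says little. The sketch below illustrates this on synthetic data (an assumption, not the video game dataset) by comparing training and test R2 for an unrestricted tree and for one limited with "max_depth". The depth value 3 is an arbitrary choice for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical noisy data: a sine curve plus Gaussian noise.
X_syn = rng.uniform(0, 10, size=(300, 1))
y_syn = np.sin(X_syn[:, 0]) + rng.normal(0, 0.3, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=0)

# An unrestricted tree versus a tree limited to three levels of splits.
full_tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
shallow_tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_tr, y_tr)

# (training R2, test R2) for each model.
scores = {
    "full": (full_tree.score(X_tr, y_tr), full_tree.score(X_te, y_te)),
    "shallow": (shallow_tree.score(X_tr, y_tr), shallow_tree.score(X_te, y_te)),
}
for name, (train_r2, test_r2) in scores.items():
    print(f"{name}: train R2 = {train_r2:.3f}, test R2 = {test_r2:.3f}")
```

The gap between the two numbers for the unrestricted tree is the thing to watch. Whether limiting the depth helps on the real dataset would need to be checked with the test-set metrics.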

Now let us have a look at the model's performance report.

print('r2 score', metrics.r2_score(y_test, y_pred))

print('MAE:', metrics.mean_absolute_error(y_test, y_pred))

print('MSE:', metrics.mean_squared_error(y_test, y_pred))

print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))

As you can see, the model performed well on the test data.
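For readers who want to see what the four reported metrics measure, here is a short sketch that computes them by hand with NumPy and compares the results with the "sklearn.metrics" functions. The four true and predicted values are made up for illustration and are not results from the model.

```python
import numpy as np
from sklearn import metrics

# Hypothetical true and predicted sales values (in millions).
y_true = np.array([3.0, 1.0, 2.0, 4.0])
y_hat = np.array([2.5, 1.0, 2.5, 3.0])

errors = y_true - y_hat

mae = np.mean(np.abs(errors))   # mean absolute error
mse = np.mean(errors ** 2)      # mean squared error
rmse = np.sqrt(mse)             # root mean squared error, in the target's units
# R2: 1 minus (residual sum of squares / total sum of squares around the mean).
r2 = 1 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print("MAE:", mae, "MSE:", mse, "RMSE:", rmse, "r2 score:", r2)

# The hand-computed values agree with scikit-learn's implementations.
print(np.isclose(mae, metrics.mean_absolute_error(y_true, y_hat)))
print(np.isclose(mse, metrics.mean_squared_error(y_true, y_hat)))
print(np.isclose(r2, metrics.r2_score(y_true, y_hat)))
```

MAE and RMSE are in the same units as 'Global_Sales', which makes them easier to interpret than MSE. An r2 score close to 1 means the predictions explain most of the variance in the test targets.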

Thank you for your time.
