
Bitcoin Price Prediction Using Time Series Analysis


Model Overview


Bitcoin is the first decentralized digital currency: it is not governed by any central bank or other authority. The cryptocurrency was created in 2009 but became extremely popular in 2017. Some experts call Bitcoin "the currency of the future" or even cite it as an example of a social revolution. The Bitcoin price increased several times over during 2017; at the same time, it is very volatile. Many economic actors are interested in tools for predicting Bitcoin prices. This is especially important for existing or potential investors and for government structures; the latter need to be ready for significant price movements in order to prepare a consistent economic policy. The demand for a Bitcoin price prediction mechanism is therefore high.


REQUIRED LIBRARIES:



DATASET DESCRIPTION:


  • Bitcoin Core Price: Bitcoin prices

  • Money Supply: The amount of Bitcoin Core (BTC) in circulation.

  • Price Volatility: The annualized daily volatility of price changes. Price volatility is computed as the standard deviation of daily returns, scaled by the square root of 365 to annualize, and expressed as a decimal.

  • Daily Transactions: The number of transactions included in the blockchain each day

  • Block Size: Miners collect Bitcoin Core (BTC) transactions into distinct packets of data called blocks. Each block is cryptographically linked to the preceding block, forming a "blockchain." As more people use the network for transactions, the block size increases

  • Transaction Fees: Total amount of Bitcoin Core (BTC) fees earned by all miners in a 24-hour period, measured in Bitcoin Core (BTC)

  • Inflation rate: The federal funds rate shapes future interest rates in the economy.




EXPLORATORY DATA ANALYSIS:

Exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods.



  • Seasonal Decomposition:


Time series decomposition involves thinking of a series as a combination of level, trend, seasonality, and noise components. Decomposition provides a useful abstract model for thinking about time series generally and for better understanding problems during time series analysis and forecasting.


DATA PREPROCESSING:



  • There are no NaNs to fill.

  • The training set goes from 06 February 2011 to 30 September 2019.

  • The test set goes from 25 October 2019 to 22 March 2020, i.e. approximately the last 150 days.


We need to prepare the dataset according to the requirements of the model, as well as split it into train and test parts. We define a function that creates the X inputs and Y labels for the model. In sequential forecasting we predict a future value based on some previous and current values, so the Y label is the value at the next (future) point in time, while the X inputs are one or several values from the past. The number of these values is set by the look_back parameter of the function. We set it to 30, meaning that the current value is predicted from the previous 30 days' values (t-1), ..., (t-30).
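The windowing function described above can be sketched as follows (the name `create_dataset` and the toy series are illustrative, not necessarily those used in the original notebook):

```python
import numpy as np

def create_dataset(values, look_back=30):
    """Build X windows of `look_back` past values and Y next-step labels."""
    X, Y = [], []
    for i in range(len(values) - look_back):
        X.append(values[i:i + look_back])   # past look_back values
        Y.append(values[i + look_back])     # the value right after the window
    return np.array(X), np.array(Y)

prices = np.arange(100, dtype=float)        # hypothetical price series
X, Y = create_dataset(prices, look_back=30)
# Each row X[i] holds 30 consecutive values; Y[i] is the 31st
```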

MODEL BUILDING:


  • Multi-Layer Perceptron(MLP):


An MLP is a class of feed-forward neural network. It consists of three types of layers: the input layer, the output layer, and hidden layers, as shown in Fig. 3. The input layer receives the input signal to be processed; the required task, such as prediction or classification, is performed by the output layer. An arbitrary number of hidden layers placed between the input and output layers is the true computational engine of the MLP. As in any feed-forward network, the data flows in the forward direction from the input to the output layer. The neurons of an MLP are trained with the backpropagation learning algorithm. MLPs are designed to approximate any continuous function and can solve problems that are not linearly separable. Their major use cases are pattern classification, recognition, prediction, and approximation.

Fig. 3
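The MLP described above can be sketched with scikit-learn's `MLPRegressor` (a minimal illustration assuming look_back=30 windows as inputs; the original may have used a different framework, and the hidden-layer sizes here are arbitrary):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

look_back = 30
series = np.sin(np.linspace(0, 20, 300))    # hypothetical stand-in series
# Windows of 30 past values as inputs, the next value as the target
X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
y = series[look_back:]

# Two hidden layers between the input and output layers
mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
mlp.fit(X, y)
pred = mlp.predict(X[:5])                   # one-step-ahead predictions
```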





  • LSTM Networks:


Long Short-Term Memory networks – usually just called "LSTMs" – are a special kind of RNN, capable of learning long-term dependencies. LSTMs are explicitly designed to avoid the long-term dependency problem; remembering information for long periods of time is practically their default behavior.
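The gating that lets an LSTM retain information over long periods can be sketched in plain NumPy (a conceptual single-cell illustration, not the deployed model; all names and sizes here are hypothetical):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the parameters for the four
    gates, each of hidden size H: rows 0:H input gate, H:2H forget gate,
    2H:3H candidate cell, 3H:4H output gate."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])           # input gate: what to write
    f = sigmoid(z[H:2*H])         # forget gate: what long-term memory to keep
    g = np.tanh(z[2*H:3*H])       # candidate cell state
    o = sigmoid(z[3*H:4*H])       # output gate: what to expose
    c = f * c_prev + i * g        # cell state carries information forward
    h = o * np.tanh(c)            # hidden state / output
    return h, c

# Tiny example: hidden size 2, input size 3, zero-initialized state
rng = np.random.default_rng(0)
H, D = 2, 3
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):                # run 5 time steps
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
```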






  • GATED RECURRENT UNIT(GRU):


The GRU is a specific kind of recurrent neural network that uses gated connections through a sequence of nodes to perform machine-learning tasks associated with memory and clustering, for instance in speech recognition.

                                                   1-Layer GRU Neural Network



                                                2-Layer GRU Neural Network
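The GRU's two gates (update and reset) can be sketched in plain NumPy, analogously to the LSTM cell (again a conceptual illustration with hypothetical names and sizes, not the trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    """One GRU time step. W, U, b stack the parameters for update gate,
    reset gate, and candidate state (each of hidden size H)."""
    H = h_prev.shape[0]
    a = W @ x + b
    u = U @ h_prev
    z = sigmoid(a[0:H] + u[0:H])                     # update gate
    r = sigmoid(a[H:2*H] + u[H:2*H])                 # reset gate
    h_tilde = np.tanh(a[2*H:3*H] + r * u[2*H:3*H])   # candidate state
    return (1 - z) * h_prev + z * h_tilde            # blend old and new state

# Tiny example: hidden size 2, input size 3
rng = np.random.default_rng(1)
H, D = 2, 3
W = rng.normal(size=(3 * H, D))
U = rng.normal(size=(3 * H, H))
b = np.zeros(3 * H)
h = np.zeros(H)
for t in range(5):                # run 5 time steps
    h = gru_step(rng.normal(size=D), h, W, U, b)
```

A 2-layer GRU, as in the second figure, simply feeds each step's hidden state `h` into a second `gru_step` with its own parameters.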





  • ARIMA (AutoRegressive Integrated Moving Average) Time Series Model:


The Bitcoin price appears to be non-stationary with an element of seasonality and trend, hence a SARIMA model is considered as one of the baseline models (Paulo Cortez, 2004). One of the most common methods used in time series forecasting is the ARIMA model, which stands for AutoRegressive Integrated Moving Average. ARIMA is a model that can be fitted to time series data in order to better understand it or predict future points in the series. Three distinct integers (p, d, q) are used to parametrize ARIMA models, which is why they are denoted ARIMA(p, d, q). Together these three parameters account for trend and noise in the data:


  • p is the autoregressive part of the model. It allows us to incorporate the effect of past values into our model. Intuitively, this would be similar to stating that it is likely to be warm tomorrow if it has been warm the past 3 days.

  • d is the integrated part of the model. It sets the amount of differencing (i.e. how many times past values are subtracted from the current value) applied to the time series. Intuitively, this would be similar to stating that it is likely to be the same temperature tomorrow if the difference in temperature over the last three days has been very small.

  • q is the moving average part of the model. This allows us to set the error of our model as a linear combination of the error values observed at previous time points in the past.







 




COMPARISON OF MSE ACROSS VARIOUS MODELS:



CONCLUSION:


  • LSTM and ARIMA gave the best results.

  • The LSTM model is used for deployment.


The same process can be followed to forecast other stock or asset prices.
