I am really new to Data Science/ML and have been working on Tensorflow to implement Linear Regression on California Housing Prices from Kaggle.
I tried to train a mode in two different ways:
In both cases, the loss of the model was really high and I have not been able to understand what are the ways to improve it.
df = pd.read_csv('california-housing-prices.zip')
df = df[['total_rooms', 'total_bedrooms', 'median_house_value', 'housing_median_age', 'median_income']]
print('Shape of dataset before removing NAs and duplicates {}'.format(df.shape))
df.dropna(inplace=True)
df.drop_duplicates(inplace=True)
input_train, input_test, target_train, target_test = train_test_split(df['total_rooms'].values, df['median_house_value'].values, test_size=0.2)
scaler = MinMaxScaler()
input_train = input_train.reshape(-1,1)
input_test = input_test.reshape(-1,1)
input_train = scaler.fit_transform(input_train)
input_test = scaler.fit_transform(input_test)
target_train = target_train.reshape(-1,1)
target_train = scaler.fit_transform(target_train)
target_test = target_test.reshape(-1,1)
target_test = scaler.fit_transform(target_test)
print('Number of training input elements {}'.format(input_train.shape))
print('Number of training target elements {}'.format(target_train.shape))
Sequential
API:BATCH_SIZE = 10
BUFFER = 5000
dataset = tf.data.Dataset.from_tensor_slices((input_train, target_train))
dataset = dataset.shuffle(BUFFER).batch(BATCH_SIZE)
DENSE_UNITS = 64
model = tf.keras.Sequential([
tf.keras.Input(shape=(1,)),
tf.keras.layers.Dense(DENSE_UNITS, activation='relu'),
tf.keras.layers.Dense(DENSE_UNITS, activation='relu'),
tf.keras.layers.Dense(1)
])
EPOCH = 5000
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), loss='mean_squared_error', metrics=['accuracy', 'mse'])
history = model.fit(dataset, epochs=EPOCH, callbacks=[early_stopping])
Epoch 1/5000
1635/1635 [==============================] - 13s 8ms/step - loss: 0.0564 - accuracy: 0.0013 - mse: 0.0564
Epoch 2/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0552 - accuracy: 0.0016 - mse: 0.0552
Epoch 3/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0012 - mse: 0.0551
Epoch 4/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 9.1766e-04 - mse: 0.0551
Epoch 5/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0013 - mse: 0.0551
Epoch 6/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0551 - accuracy: 0.0013 - mse: 0.0551
Epoch 7/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 8/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 0.0012 - mse: 0.0550
Epoch 9/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 10/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 0.0012 - mse: 0.0550
Epoch 11/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0010 - mse: 0.0549
Epoch 12/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 13/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 14/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0016 - mse: 0.0549
Epoch 15/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0011 - mse: 0.0549
Epoch 16/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0017 - mse: 0.0549
Epoch 17/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 18/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 19/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 20/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.1177e-04 - mse: 0.0549
Epoch 21/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0012 - mse: 0.0550
Epoch 22/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 9.7883e-04 - mse: 0.0549
Epoch 23/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0550 - accuracy: 7.3412e-04 - mse: 0.0549
Epoch 24/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 7.9530e-04 - mse: 0.0549
Epoch 25/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 0.0013 - mse: 0.0548
Epoch 26/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 7.9530e-04 - mse: 0.0549
Epoch 27/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 6.7295e-04 - mse: 0.0549
Epoch 28/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 0.0012 - mse: 0.0548
Epoch 29/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0549 - accuracy: 0.0013 - mse: 0.0549
Epoch 30/5000
1635/1635 [==============================] - 7s 4ms/step - loss: 0.0548 - accuracy: 9.7883e-04 - mse: 0.0549
class Linear(object):
def __init__(self):
"""
Y = mX + C
Initializing the intercet and the slope
"""
self.m = tf.Variable(tf.random.normal(shape=()))
self.C = tf.Variable(tf.random.normal(shape=()))
def __call__(self, x):
return self.m * x + self.C
# Defining a MSE loss function
def loss(predicted_y, target_y):
return tf.reduce_mean(tf.square(predicted_y - target_y))
def train(model, input, output, learning_rate):
with tf.GradientTape() as tape:
predicted_y = model(input)
current_loss = loss(predicted_y, output)
df_m, df_C = tape.gradient(current_loss, [model.m, model.C])
model.m.assign_sub(learning_rate * df_m)
model.C.assign_sub(learning_rate * df_C)
epochs = 5000
model = Linear()
print(model.m.assign_sub(1))
ms, Cs, losses = [], [], []
target_train = target_train.astype('float32')
for epoch in range(epochs):
ms.append(model.m.numpy())
Cs.append(model.C.numpy())
predicted_y = model(input_train)
current_loss = loss(predicted_y, target_train)
losses.append(current_loss)
train(model, input_train, target_train, 0.1)
if epoch % 500 == 0:
print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %
(epoch, ms[-1], Cs[-1], current_loss))
predicted_test = model(input_test[:10])
print(np.argmax(predicted_test.numpy()))
print(scaler.inverse_transform(predicted_test))
print(scaler.inverse_transform(target_test[:10]))
predicted_loss = loss(predicted_test, target_test[:10])
print(predicted_loss.numpy())
Epoch 0: W=-1.86 b=-0.09, loss=0.44381
Epoch 500: W=-1.19 b=0.47, loss=0.06470
Epoch 1000: W=-0.73 b=0.44, loss=0.06034
Epoch 1500: W=-0.39 b=0.42, loss=0.05799
Epoch 2000: W=-0.13 b=0.40, loss=0.05671
Epoch 2500: W=0.05 b=0.39, loss=0.05602
Epoch 3000: W=0.19 b=0.38, loss=0.05565
Epoch 3500: W=0.29 b=0.38, loss=0.05545
Epoch 4000: W=0.36 b=0.37, loss=0.05534
Epoch 4500: W=0.41 b=0.37, loss=0.05528
input_train = input_train.reshape(-1,1)
input_test = input_test.reshape(-1,1)
model = tf.keras.Sequential([
tf.keras.Input(shape=(8,)),
Your loss will decrease. After a few epochs I get this:
1644/1652 [============================>.] -
ETA: 0s - loss: 0.0144 - accuracy: 0.0434 - mse: 0.0144
I couldn't use your .zip
file so I did it differently. Here is all my reproducible code:
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import tensorflow as tf
from sklearn.datasets import fetch_california_housing
x, y = fetch_california_housing(return_X_y=True)
input_train, _, target_train, _ = train_test_split(x, y)
scaler = MinMaxScaler()
input_train = scaler.fit_transform(input_train)
target_train = target_train.reshape(-1,1)
target_train = scaler.fit_transform(target_train)
dataset = tf.data.Dataset.from_tensor_slices((input_train, target_train))
dataset = dataset.shuffle(5000).batch(32)
model = tf.keras.Sequential([
tf.keras.Input(shape=(8,)),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(1)])
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
loss='mean_squared_error', metrics=['accuracy', 'mse'])
history = model.fit(dataset, epochs=50, callbacks=[early_stopping])
Playing around with the learning_rate might yield better results, but it could be that your network is just too complex (computing a super non-convex function) for simple Gradient Descent to work well here. Other than that I don't spot any immediate issues, but debugging a neural network implementation can be pretty tricky sometimes.
Try out a quick switch to AdamOptimizer or another advanced optimizer or toying around with the learning_rate. If you're still getting super low test accuracy, then I'll try to go through it with a fine-toothed comb later.