
How to save labels during training to a file so that we can use them again during inference?

  • Suppose we are training a Keras model on 1000 images of 3 classes and the labels list is ["label1", "label3", "label2", ......"label3"]. How can we save these labels to a file and use them again during prediction to get the label name from the prediction array?
      January 11, 2021 5:19 PM IST
    0
  • We can save the class names to a NumPy (.npy) file:

    import numpy as np
    from tensorflow import keras
    from sklearn.preprocessing import LabelEncoder

    le = LabelEncoder()
    lab = le.fit_transform(labels)          # text labels -> integer ids
    unique_labels = le.classes_             # sorted array of the unique label names
    np.save("labels.npy", unique_labels)    # persist the label names for inference
    num_labels = len(unique_labels)

    labels = keras.utils.to_categorical(lab)  # integer ids -> one-hot matrix
    print(labels)

    In the above code, we use sklearn's LabelEncoder to convert the text labels to integers, and then keras.utils.to_categorical to convert the integer labels to a NumPy matrix of one-hot (binary) values. The unique label names are saved to the NumPy file 'labels.npy'.
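
    For instance, with a tiny hypothetical label list (not the actual data from the question), the intermediate values look like this:

    import numpy as np
    from tensorflow import keras
    from sklearn.preprocessing import LabelEncoder

    labels = ["label1", "label3", "label2", "label3"]   # hypothetical example labels
    le = LabelEncoder()
    lab = le.fit_transform(labels)     # -> [0, 2, 1, 2]
    print(le.classes_)                 # -> ['label1' 'label2' 'label3'] (sorted order)
    print(keras.utils.to_categorical(lab))
    # [[1. 0. 0.]
    #  [0. 0. 1.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]]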

    During inference, the label names can be read back from the file to map the model's predictions to class names:

    unique_labels = np.load("labels.npy", allow_pickle=True)
    yhat = model.predict(images)
    yhat = np.array(yhat)
    indices = np.argmax(yhat, axis=1)               # index of the highest-scoring class per image
    scores = yhat[np.arange(len(yhat)), indices]    # confidence of each predicted class
    predicted_categories = [unique_labels[i] for i in indices]  # map indices back to label names
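
    Alternatively (not what the code above does), the fitted LabelEncoder itself can be persisted, for example with joblib, so that inverse_transform maps the predicted indices straight back to names. A minimal sketch, with an arbitrary file name:

    import joblib

    # Training time: persist the fitted encoder (file name is an arbitrary choice)
    joblib.dump(le, "label_encoder.joblib")

    # Inference time: reload it and invert the integer predictions
    le = joblib.load("label_encoder.joblib")
    predicted_categories = le.inverse_transform(indices)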

     

      January 11, 2021 5:26 PM IST
    0
  • Here is a simple example using the TensorFlow 2.0 SavedModel format (which is the recommended format, according to the docs) for a simple MNIST classifier, using the Keras functional API, without too much fancy going on:

    # Imports
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Input, Dense, Flatten
    from tensorflow.keras.models import Model
    import matplotlib.pyplot as plt
    
    # Load data
    mnist = tf.keras.datasets.mnist # 28 x 28
    (x_train,y_train), (x_test, y_test) = mnist.load_data()
    
    # Normalize pixels [0,255] -> [0,1]
    x_train = tf.keras.utils.normalize(x_train,axis=1)
    x_test = tf.keras.utils.normalize(x_test,axis=1)
    
    # Create model
    input = Input(shape=(28,28), dtype='float64', name='graph_input')
    x = Flatten()(input)
    x = Dense(128, activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    output = Dense(10, activation='softmax', name='graph_output', dtype='float64')(x)
    model = Model(inputs=input, outputs=output)
    
    model.compile(optimizer='adam',
                 loss='sparse_categorical_crossentropy',
                 metrics=['accuracy'])
    
    # Train
    model.fit(x_train, y_train, epochs=3)
    
    # Save model in SavedModel format (Tensorflow 2.0)
    export_path = 'model'
    tf.saved_model.save(model, export_path)
    
    # ... possibly another python program 
    
    # Reload model
    loaded_model = tf.keras.models.load_model(export_path) 
    
    # Get an image sample for testing
    index = 0
    img = x_test[index][np.newaxis, ...]  # normalized earlier; add a batch dimension, which the serving signature expects
    
    # Predict using the signature definition (Tensorflow 2.0)
    predict = loaded_model.signatures["serving_default"]
    prediction = predict(tf.constant(img))
    
    # Show results
    print(np.argmax(prediction['graph_output']))   # prints the class number
    plt.imshow(x_test[index], cmap=plt.cm.binary)  # displays the image
    plt.show()

    What is serving_default? It's the name of the signature def of the tag you selected (in this case, the default serve tag was selected). The tags and signatures of a saved model can also be listed with the saved_model_cli tool.

    Disclaimers: this is just a basic example if you want to get it up and running, but it is by no means a complete answer; maybe I can update it in the future. I just wanted to give a simple example using the SavedModel format in TF 2.0 because I haven't seen one, even this simple, anywhere. @Tom's answer is a SavedModel example, but it will not work on TensorFlow 2.0 because, unfortunately, there are some breaking changes. @Vishnuvardhan Janapati's answer says TF 2.0, but it's not for the SavedModel format.
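
    For reference, the available signatures and the inputs they expect can also be inspected directly from Python. This is a minimal sketch reusing loaded_model from the code above; saved_model_cli show --dir model --all prints the same information from the command line:

    # Inspect the signatures of the reloaded SavedModel (uses `loaded_model` from above)
    print(list(loaded_model.signatures.keys()))       # e.g. ['serving_default']
    sig = loaded_model.signatures["serving_default"]
    print(sig.structured_input_signature)             # expected input tensor specs
    print(sig.structured_outputs)                     # output tensor specs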

      January 3, 2022 2:00 PM IST
    0
  • You should have two files in your current working directory (the sketch after the list below shows how they might have been created):

    • model.pkl
    • scaler.pkl
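
    These two files would have been produced during training by pickling a fitted model and scaler. Here is a minimal sketch of that step; the particular model (LogisticRegression) and scaler (MinMaxScaler) are illustrative assumptions, not something stated in this thread:

    # Hypothetical training-time step that produces model.pkl and scaler.pkl
    from sklearn.datasets import make_blobs
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.linear_model import LogisticRegression
    from pickle import dump
    # prepare dataset (same split parameters as the loading code below)
    X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=1)
    X_train, _, y_train, _ = train_test_split(X, y, test_size=0.33, random_state=1)
    # fit the scaler on the training data only, then scale it
    scaler = MinMaxScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    # fit a model on the scaled training data
    model = LogisticRegression()
    model.fit(X_train_scaled, y_train)
    # save both objects with pickle
    dump(model, open('model.pkl', 'wb'))
    dump(scaler, open('scaler.pkl', 'wb'))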

    # load model and scaler and make predictions on new data
    from sklearn.datasets import make_blobs
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score
    from pickle import load
    # prepare dataset
    X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=1)
    # split data into train and test sets
    _, X_test, _, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
    # load the model
    model = load(open('model.pkl', 'rb'))
    # load the scaler
    scaler = load(open('scaler.pkl', 'rb'))
    # check scale of the test set before scaling
    print('Raw test set range')
    for i in range(X_test.shape[1]):
    	print('>%d, min=%.3f, max=%.3f' % (i, X_test[:, i].min(), X_test[:, i].max()))
    # transform the test dataset
    X_test_scaled = scaler.transform(X_test)
    print('Scaled test set range')
    for i in range(X_test_scaled.shape[1]):
    	print('>%d, min=%.3f, max=%.3f' % (i, X_test_scaled[:, i].min(), X_test_scaled[:, i].max()))
    # make predictions on the test set
    yhat = model.predict(X_test_scaled)
    # evaluate accuracy
    acc = accuracy_score(y_test, yhat)
    print('Test Accuracy:', acc)
      January 5, 2022 2:07 PM IST
    0