Gadde Sai Shailesh's other Models Reports


Sign Language Detection


Introduction

Sign language recognition has been studied for years, but a complete solution that is widely available in society is still far off.


Most of the work on this problem follows one of two approaches: contact-based systems, such as sensor gloves, or vision-based systems, which use only cameras. The latter is much cheaper, and the boom in deep learning makes it more appealing.


Although new and accessible technologies are emerging to help those with hearing disabilities, there is still plenty of work to be done. For example, advancements in machine learning algorithms could help the deaf and hard-of-hearing even further by offering ways to better communicate using computer vision applications. Our project aims to do just that.


We sought to create a system that is capable of identifying American Sign Language (ASL) hand gestures. Since ASL has both static and dynamic hand gestures, we needed to build a system that can identify both types of gestures. This article will detail the phases of our project.


Dataset


American Sign Language Letters - v1 v1


This dataset was exported via roboflow.ai on October 20, 2020, at 4:54 PM GMT.


It includes 1728 images. Letters are annotated in TensorFlow TFRecord (raccoon) format.


The following pre-processing was applied to each image:



  • Auto-orientation of pixel data (with EXIF-orientation stripping)

  • Resize to 416x416 (Stretch)


The following augmentation was applied to create 3 versions of each source image:



  • 50% probability of horizontal flip

  • Random crop of between 0% and 20% of the image

  • Random rotation of between -5 and +5 degrees

  • Random shear of between -5° and +5° horizontally and between -5° and +5° vertically

  • Random brightness adjustment of between -25% and +25%

  • Random Gaussian blur of between 0 and 1.25 pixels
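
Any augmentation that moves pixels also has to move the bounding boxes. As a rough illustration of the first item in the list, here is a minimal sketch of a horizontal flip applied to one box. The function names and the (xmin, ymin, xmax, ymax) pixel layout are assumptions made for this example; it is not Roboflow's implementation.

```python
import random


def hflip_box(box, image_width):
    """Mirror an (xmin, ymin, xmax, ymax) pixel box across the vertical axis."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - xmax, ymin, image_width - xmin, ymax)


def maybe_flip(box, image_width, rng, probability=0.5):
    """Flip the box with the given probability; return (box, was_flipped)."""
    if rng.random() < probability:
        return hflip_box(box, image_width), True
    return box, False
```

For a 416x416 image, hflip_box((100, 50, 200, 150), 416) gives (216, 50, 316, 150); flipping twice returns the original box.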




Model Used

Before we begin the setup, make sure to change the runtime type in Colab to GPU so that we can make use of the free GPU provided.


There are many models ready to download from the TensorFlow Model Zoo.


Be careful in choosing which model to use as some are not made for Object Detection. For this tutorial we will be using the following model:


SSD MobileNet V2 FPNLite 320x320.


Download it into your Colab Notebook.

Let's Understand the Code


Let's Import all the required Libraries


Here, we are trying to automate the whole process so that when a user uploads the images with their annotations they will receive the weights of the last checkpoint.


Here is my approach to automating the process.

How to upload and unzip the dataset?


We upload the dataset to Google Colab as a zip file and then unzip it so that we can use it.





dataset.zip contains two folders, train and test:



  • train folder contains images and their respective XML files.

  • test folder contains images and their respective XML files.
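
A minimal sketch of the unzip step, using only the Python standard library. The function name extract_dataset and the destination folder are assumptions for illustration, and the sketch assumes train and test sit at the top level of the archive, as described above.

```python
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def extract_dataset(zip_path, dest_dir):
    """Unzip the archive and count the images and XML files in each split."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    summary = {}
    for split in ("train", "test"):
        split_dir = dest / split
        files = list(split_dir.iterdir()) if split_dir.is_dir() else []
        summary[split] = {
            "images": sum(f.suffix.lower() in IMAGE_SUFFIXES for f in files),
            "xml": sum(f.suffix.lower() == ".xml" for f in files),
        }
    return summary
```

Comparing the two counts per split is a quick way to spot images that are missing their annotation file before training starts.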


How to automate the process?


We create some temporary folders to store the configuration file, the TFRecords, the weights and everything else the pipeline needs.



Defining The Paths


We define the paths of everything that we use in this project.
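
The folder creation and path definitions could look roughly like the sketch below. Every folder and file name here (workspace, my_ssd_mobnet, train.record and so on) is a hypothetical choice made for this example, not necessarily the layout used in the original notebook.

```python
import os


def build_paths(root="workspace"):
    """Return (directories, files) dictionaries for the training workspace."""
    dirs = {
        "annotations": os.path.join(root, "annotations"),
        "images": os.path.join(root, "images"),
        "pretrained": os.path.join(root, "pre-trained-models"),
        "model": os.path.join(root, "models", "my_ssd_mobnet"),
    }
    files = {
        "label_map": os.path.join(dirs["annotations"], "label_map.pbtxt"),
        "train_record": os.path.join(dirs["annotations"], "train.record"),
        "test_record": os.path.join(dirs["annotations"], "test.record"),
        "pipeline_config": os.path.join(dirs["model"], "pipeline.config"),
    }
    return dirs, files


def make_dirs(dirs):
    """Create every directory; existing ones are left untouched."""
    for path in dirs.values():
        os.makedirs(path, exist_ok=True)
```

Keeping all paths in one place means the later steps (TFRecord generation, config editing, training) can refer to them by name instead of repeating string literals.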



Installing The Required Libraries


Let's install the required libraries.



How To Create Label Maps From XML & Generate TFRecords?


Creating CSV and LabelMap.



xml_to_csv.py contains:


"""
Usage:
# Create train data:
python xml_to_csv.py -i [PATH_TO_IMAGES_FOLDER]/train -o [PATH_TO_ANNOTATIONS_FOLDER]/train_labels.csv

# Create test data:
python xml_to_csv.py -i [PATH_TO_IMAGES_FOLDER]/test -o [PATH_TO_ANNOTATIONS_FOLDER]/test_labels.csv
"""

import os
import glob
import pandas as pd
import argparse
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    """Iterates through all .xml files (generated by labelImg) in a given directory and combines them in a single Pandas dataframe.

    Parameters:
    ----------
    path : {str}
        The path containing the .xml files
    Returns
    -------
    Pandas DataFrame
        The produced dataframe
    list of str
        The sorted, de-duplicated class names found in the annotations
    """
    classes_names = []
    xml_list = []
    for xml_file in glob.glob(path + "/*.xml"):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall("object"):
            classes_names.append(member[0].text)
            value = (
                root.find("filename").text,
                int(root.find("size")[0].text),
                int(root.find("size")[1].text),
                member[0].text,
                int(member[4][0].text),
                int(member[4][1].text),
                int(member[4][2].text),
                int(member[4][3].text),
            )
            xml_list.append(value)
    column_name = [
        "filename",
        "width",
        "height",
        "class",
        "xmin",
        "ymin",
        "xmax",
        "ymax",
    ]
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    classes_names = list(set(classes_names))
    classes_names.sort()
    return xml_df, classes_names


def main():
    # Initiate argument parser
    parser = argparse.ArgumentParser(
        description="Sample TensorFlow XML-to-CSV converter"
    )
    parser.add_argument(
        "-i",
        "--inputDir",
        help="Path to the folder where the input .xml files are stored",
        type=str,
    )
    parser.add_argument(
        "-o", "--outputFile", help="Name of output .csv file (including path)", type=str
    )

    parser.add_argument(
        "-l",
        "--labelMapDir",
        help="Directory path to save label_map.pbtxt file is specified.",
        type=str,
        default="",
    )

    args = parser.parse_args()

    if args.inputDir is None:
        args.inputDir = os.getcwd()
    if args.outputFile is None:
        args.outputFile = args.inputDir + "/labels.csv"

    assert os.path.isdir(args.inputDir)
    # dirname is empty when the output file is given without a folder,
    # and os.makedirs("") would raise an error
    output_dir = os.path.dirname(args.outputFile)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)
    xml_df, classes_names = xml_to_csv(args.inputDir)
    xml_df.to_csv(args.outputFile, index=None)
    print("Successfully converted xml to csv.")
    if args.labelMapDir:
        os.makedirs(args.labelMapDir, exist_ok=True)
        label_map_path = os.path.join(args.labelMapDir, "label_map.pbtxt")
        print("Generate `{}`".format(label_map_path))

        # Create the `label_map.pbtxt` file
        pbtxt_content = ""
        for i, class_name in enumerate(classes_names):
            pbtxt_content = (
                pbtxt_content
                + "item {{\n id: {0}\n name: '{1}'\n}}\n\n".format(
                    i + 1, class_name
                )
            )
        pbtxt_content = pbtxt_content.strip()
        with open(label_map_path, "w") as f:
            f.write(pbtxt_content)


if __name__ == "__main__":
    main()
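
The script above reads fields by position (member[0] for the class name, member[4] for the bounding box), which only works when the XML elements appear in the order labelImg writes them. As a hedged alternative, the sketch below looks fields up by tag name instead, assuming Pascal VOC style tags (name, bndbox, xmin and so on). The function parse_voc_objects is a hypothetical helper written for this example and is not part of the original script.

```python
import xml.etree.ElementTree as ET


def parse_voc_objects(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    rows = []
    for obj in root.findall("object"):
        # Look every field up by tag name, so element order does not matter.
        coords = [
            int(float(obj.findtext("bndbox/" + tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        rows.append((filename, width, height, obj.findtext("name"), *coords))
    return rows
```

The rows have the same column order as the DataFrame built by xml_to_csv, so they could be passed to pd.DataFrame with the same column_name list.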


Now let's create the tfrecords:



generate_tfrecord.py contains:

""" Sample TensorFlow XML-to-TFRecord converter

usage: generate_tfrecord.py [-h] [-x XML_DIR] [-l LABELS_PATH] [-o OUTPUT_PATH] [-i IMAGE_DIR] [-c CSV_PATH]

optional arguments:
  -h, --help            show this help message and exit
  -x XML_DIR, --xml_dir XML_DIR
                        Path to the folder where the input .xml files are stored.
  -l LABELS_PATH, --labels_path LABELS_PATH
                        Path to the labels (.pbtxt) file.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path of output TFRecord (.record) file.
  -i IMAGE_DIR, --image_dir IMAGE_DIR
                        Path to the folder where the input image files are stored. Defaults to the same directory as XML_DIR.
  -c CSV_PATH, --csv_path CSV_PATH
                        Path of output .csv file. If none provided, then no file will be written.
"""

import os
import glob
import pandas as pd
import io
import xml.etree.ElementTree as ET
import argparse

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # Suppress TensorFlow logging (1)
import tensorflow.compat.v1 as tf
from PIL import Image
from object_detection.utils import dataset_util, label_map_util
from collections import namedtuple

# Initiate argument parser
parser = argparse.ArgumentParser(
    description="Sample TensorFlow XML-to-TFRecord converter")
parser.add_argument("-x",
                    "--xml_dir",
                    help="Path to the folder where the input .xml files are stored.",
                    type=str)
parser.add_argument("-l",
                    "--labels_path",
                    help="Path to the labels (.pbtxt) file.", type=str)
parser.add_argument("-o",
                    "--output_path",
                    help="Path of output TFRecord (.record) file.", type=str)
parser.add_argument("-i",
                    "--image_dir",
                    help="Path to the folder where the input image files are stored. "
                         "Defaults to the same directory as XML_DIR.",
                    type=str, default=None)
parser.add_argument("-c",
                    "--csv_path",
                    help="Path of output .csv file. If none provided, then no file will be "
                         "written.",
                    type=str, default=None)

args = parser.parse_args()

if args.image_dir is None:
    args.image_dir = args.xml_dir

label_map = label_map_util.load_labelmap(args.labels_path)
label_map_dict = label_map_util.get_label_map_dict(label_map)


def xml_to_csv(path):
    """Iterates through all .xml files (generated by labelImg) in a given directory and combines
    them in a single Pandas dataframe.

    Parameters:
    ----------
    path : str
        The path containing the .xml files
    Returns
    -------
    Pandas DataFrame
        The produced dataframe
    """

    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height',
                   'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def class_text_to_int(row_label):
    return label_map_dict[row_label]


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):

    writer = tf.python_io.TFRecordWriter(args.output_path)
    path = os.path.join(args.image_dir)
    examples = xml_to_csv(args.xml_dir)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    print('Successfully created the TFRecord file: {}'.format(args.output_path))
    if args.csv_path is not None:
        examples.to_csv(args.csv_path, index=None)
        print('Successfully created the CSV file: {}'.format(args.csv_path))


if __name__ == '__main__':
    tf.app.run()
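
Inside create_tf_example, each box is stored as a fraction of the image size rather than in pixels. The small sketch below isolates that arithmetic so it can be checked by hand. The helper names normalize_box and is_valid_box are hypothetical and do not appear in the script.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel coordinates to fractions of the image width and height."""
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def is_valid_box(box):
    """A normalized box should lie inside [0, 1] and have a positive area."""
    xmin, ymin, xmax, ymax = box
    return 0.0 <= xmin < xmax <= 1.0 and 0.0 <= ymin < ymax <= 1.0
```

For a 416x416 image, the pixel box (104, 52, 312, 364) becomes (0.25, 0.125, 0.75, 0.875). A box that fails is_valid_box usually points to an annotation that no longer matches the resized image.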



Alternatively, there is another way to create TFRecords that also applies augmentation, so that we can increase the size of the dataset.


We can do the following:



  • Create TFRecord ourselves

  • Upload the annotations to Roboflow and get the dataset in TFRecord Format.


Creating the TFRecords ourselves is a bit tedious as the XML created after annotating may sometimes vary, so for the sake of ease, I suggest using Roboflow to perform the above task. They also provide an option to perform additional Data Augmentation which will increase the size of the dataset.

How to make changes in pipeline.config file?


We copy the pipeline.config file of SSD MobileNet V2 FPNLite 320x320 to our model's folder so that we can adapt the configuration to our use case without modifying the original file.


Now, let's make changes to our pipeline.config file.


The most important settings to change are:


batch_size is the number of training examples the model processes in each training step. A suitable value to start with is 8. It can be higher or lower depending on the GPU memory available.


A good suggestion given on StackOverflow is:



Max batch size = available GPU memory (bytes) / 4 / (size of tensors + trainable parameters)
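
To make the rule of thumb concrete, here is a small worked sketch. It assumes the division by 4 stands for 4 bytes per 32-bit float, and that "size of tensors" means the number of activation values held per example. Both are interpretations of the formula, and the numbers in the example are made up, not measurements of SSD MobileNet.

```python
def max_batch_size(gpu_memory_bytes, tensor_values, trainable_params, bytes_per_value=4):
    """Rule-of-thumb upper bound on the batch size (an estimate, not a guarantee)."""
    values_that_fit = gpu_memory_bytes // bytes_per_value
    return int(values_that_fit // (tensor_values + trainable_params))


# Hypothetical numbers: a 16 GiB GPU, 50 million activation values per
# example and 3 million trainable parameters.
estimate = max_batch_size(16 * 2**30, 50_000_000, 3_000_000)
```

With these made-up inputs the estimate is 81. In practice the usable batch size is lower, because the framework, optimizer state and input pipeline also take GPU memory, so it is safer to start small (such as 8) and increase gradually.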



fine_tune_checkpoint is the path to the checkpoint from which training starts (a checkpoint is how TensorFlow stores the model's weights).


If you are starting the training for the first time, set this to the checkpoint of the pre-trained model.


If you want to continue training from a previously trained checkpoint, set it to the respective checkpoint path. (Training then resumes from the weights already learned instead of starting from scratch.)
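
One simple way to apply these two changes is to edit the copied pipeline.config as plain text. The sketch below does that with regular expressions. The function name update_pipeline_config is hypothetical, and the sketch assumes the first batch_size entry in the file belongs to the training section, so the file should be checked after editing.

```python
import re


def update_pipeline_config(text, batch_size, fine_tune_checkpoint):
    """Return the config text with the first batch_size and fine_tune_checkpoint replaced."""
    # count=1 leaves any later batch_size (for example in an eval section) untouched.
    text = re.sub(
        r"batch_size:\s*\d+",
        lambda match: f"batch_size: {batch_size}",
        text,
        count=1,
    )
    # The colon right after the key keeps fine_tune_checkpoint_type from matching.
    text = re.sub(
        r'fine_tune_checkpoint:\s*"[^"]*"',
        lambda match: f'fine_tune_checkpoint: "{fine_tune_checkpoint}"',
        text,
        count=1,
    )
    return text
```

A typical use would be to read the copied file, pass its contents through this function, and write the result back to the same path. The replacements are given as functions so that backslashes in a path are not interpreted as regex escapes.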



Let's Train!


Now, let's train the model.



Use the output generated from this print statement to train the model for 5000 steps.


Now, let's run inference


Here I have added text to speech using pyttsx3 so that when the user shows a sign, we can hear the output.


This could be helpful to people who don't know sign language.


First, let's define the paths of our weights.
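
Since the goal is to hand back the weights of the last checkpoint, it helps to locate that checkpoint automatically. The sketch below assumes the checkpoint files in the model folder are named ckpt-N.index and ckpt-N.data-..., and picks the highest N numerically. The helper latest_checkpoint_prefix is hypothetical; TensorFlow's own tf.train.latest_checkpoint serves a similar purpose.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(model_dir):
    """Return the path prefix of the highest-numbered checkpoint, or None if there is none."""
    directory = Path(model_dir)
    if not directory.is_dir():
        return None

    best_step, best_prefix = None, None
    for index_file in directory.glob("ckpt-*.index"):
        match = re.fullmatch(r"ckpt-(\d+)\.index", index_file.name)
        if match is None:
            continue
        step = int(match.group(1))
        # Compare numerically so that ckpt-10 ranks above ckpt-9.
        if best_step is None or step > best_step:
            best_step, best_prefix = step, index_file.with_suffix("")
    return None if best_prefix is None else str(best_prefix)
```

The returned prefix (for example .../ckpt-10, without the .index extension) is the form that checkpoint restore calls expect.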



Now, let's restore our weights.



We can test our model either on uploaded images or with our camera.

First, let's see how we can do it with our camera.





We are using pyttsx3 for text to speech conversion.
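
With a live camera, the same letter is detected on many consecutive frames, so speaking every detection would repeat the word endlessly. The sketch below speaks a label only when it is confident and differs from the last one spoken. The 0.8 threshold, the function names and the (label, score) input format are assumptions for illustration; speak uses the basic pyttsx3 calls init, say and runAndWait.

```python
def announce_stream(detections, speak_fn, min_score=0.8):
    """Call speak_fn for each confident label that differs from the last spoken one."""
    last_label = None
    for label, score in detections:
        if score >= min_score and label != last_label:
            speak_fn(label)
            last_label = label
    return last_label


def speak(text):
    """Say the text aloud with pyttsx3 (requires audio support on the machine)."""
    import pyttsx3  # imported lazily so the filter above works without it

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

In the camera loop, the top detection of each frame would be passed in as a (label, score) pair with speak as speak_fn. Passing a list's append method instead makes the filtering easy to test without audio.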

Now let's see how we can do this with uploaded images.


First, we specify the paths of the images that we want to check.



Then,



Output Preview



Let's test on some images:









