Dev Agrawal's other Models Reports

Capital Handwritten Words & Sentence Recognition

Model Overview

Problem Statement:


In handwritten text recognition (HTR), a system converts a person's handwritten characters, words, or lines into a format that a computer can process (e.g., Unicode text). HTR spans several levels of difficulty, from recognizing isolated individual characters to recognizing whole handwritten words and sentences.


Offline handwriting recognition, closely related to optical character recognition (OCR), works on a digitized image of the handwritten document, such as a scan or photograph. Its advantage is that recognition can be done at any time after the text has been written, even years later.




Usage:


Handwriting recognition has many applications, including reading postal addresses, bank checks, and forms. OCR also plays an essential role in digital libraries, where digitization, image restoration, and recognition methods turn text in images into machine-readable form.


Dataset:


Dataset link: https://www.nist.gov/itl/products-and-services/emnist-dataset


The EMNIST dataset is a set of handwritten letters and digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset.


The EMNIST Letters split consists of 145,600 characters in 26 balanced classes.


The training and test sets are extracted with the emnist Python package as follows:


from emnist import extract_training_samples, extract_test_samples

x_train, y_train = extract_training_samples('letters')
x_test, y_test = extract_test_samples('letters')
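Since the model predicts class labels rather than characters, the labels have to be mapped back to capital letters before words can be assembled. The sketch below shows one way to do that. It assumes the labels of the 'letters' split run from 1 to 26 (A to Z), which is worth confirming on the extracted y_train before relying on it. The helper names to_class_index and decode_labels are hypothetical and not part of the emnist package.

```python
import string


def to_class_index(label):
    """Shift a label in the assumed 1..26 range to a zero-based class index."""
    if not 1 <= label <= 26:
        raise ValueError(f"label out of assumed range 1..26: {label}")
    return label - 1


def decode_labels(labels):
    """Turn a sequence of labels (assumed 1..26) into a string of capital letters."""
    return "".join(string.ascii_uppercase[to_class_index(l)] for l in labels)


# Example with hand-written labels standing in for model predictions.
print(decode_labels([8, 5, 12, 12, 15]))
```

Zero-based indices are also what a 26-way softmax output would normally be trained against, so the same shift can be applied to y_train and y_test.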

Model Used:


The neural network is built from the following layers:


Conv2D - This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs.


MaxPooling2D - Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size) for each channel of the input. Strides shift the window along each dimension.


Flatten - Converts the pooled feature maps into a single one-dimensional vector that is passed to the fully connected layer.


Dropout - Randomly sets a fraction of the layer's inputs to zero during training, so that nodes cannot depend too heavily on one another. This reduces overfitting.


Dense - Used as the output layer, with 2 units and softmax as the activation function.
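To make the role of each layer concrete, here is a dependency-free sketch of a single forward pass through the five layer types described above. It is an illustration in plain Python, not the Keras implementation and not the trained model: the 8x8 input, the single 3x3 kernel, the dropout rate, and the random weights are all made up. The output width of 26 (one unit per letter class) is also an assumption made for this example.

```python
import math
import random


def conv2d(image, kernel, bias=0.0):
    """Single-channel 'valid' convolution with stride 1 (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)) + bias
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, pool_size=2):
    """Non-overlapping max pooling: the stride equals pool_size."""
    out_h, out_w = len(fmap) // pool_size, len(fmap[0]) // pool_size
    return [
        [
            max(fmap[i * pool_size + a][j * pool_size + b]
                for a in range(pool_size) for b in range(pool_size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flatten(fmap):
    """Concatenate the rows of a 2D feature map into one vector."""
    return [v for row in fmap for v in row]


def dropout(values, rate, rng, training=True):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def dense_softmax(values, weights, biases):
    """Fully connected layer followed by a numerically stable softmax."""
    logits = [sum(w * v for w, v in zip(row, values)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy forward pass with made-up values (8x8 stands in for a 28x28 image).
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
features = flatten(max_pool2d(conv2d(image, kernel)))  # 8x8 -> 6x6 -> 3x3 -> 9
features = dropout(features, rate=0.25, rng=rng, training=False)  # inference
num_classes = 26  # assumed: one output unit per capital letter
weights = [[rng.uniform(-1, 1) for _ in features] for _ in range(num_classes)]
probs = dense_softmax(features, weights, [0.0] * num_classes)
predicted_index = max(range(num_classes), key=probs.__getitem__)
print(len(probs), predicted_index)
```

Dropout is only active during training, which is why the toy pass calls it with training=False. At inference time the values pass through unchanged.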


Results:


Some of the predictions on different images are:



Note: Input format: capital letters written on a single line, with clear spacing between words.
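The spacing requirement matters because word boundaries have to be inferred from the gaps between characters. The source does not describe how the model's pipeline does this, so the following is only a hypothetical sketch of one common approach: given a bounding box and a predicted letter for each character on the line, start a new word whenever the horizontal gap to the previous character exceeds a threshold. The function name, the box format, and the threshold value are all assumptions.

```python
def group_into_words(boxes, gap_threshold):
    """Join per-character predictions into a sentence.

    boxes: list of (x_left, x_right, letter) tuples for one text line.
    gap_threshold: horizontal gap (in pixels) above which a new word starts.
    """
    words, current, prev_right = [], [], None
    for x_left, x_right, letter in sorted(boxes):  # left-to-right order
        if prev_right is not None and x_left - prev_right > gap_threshold:
            words.append("".join(current))
            current = []
        current.append(letter)
        prev_right = x_right
    if current:
        words.append("".join(current))
    return " ".join(words)


# Made-up boxes: two tightly spaced letters, a wide gap, then three more.
example = [(0, 10, "H"), (12, 22, "I"), (50, 60, "Y"), (62, 72, "O"), (74, 84, "U")]
print(group_into_words(example, gap_threshold=15))
```

With a scheme like this, letters written too close together across a word boundary would be merged into one word, which is consistent with the note's request for clear spacing.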


Future Vision: Improve the model to handle cursive letters and extend it to multiple lines of text.
