Natural Language Processing
2 models found.
-
September 9, 2021 - Posted by Dev Agrawal
700 views, 1 like
Introduction:
Predicting the natural language of a text can be an important step in a Natural Language Processing (NLP) use case. For tasks such as translation or sentiment analysis, it helps to know the language of the input text first. For example, Google Translate detects the language of the entered text before translating it.
Dataset:
Dataset link: https://www.kaggle.com/zarajamshaid/language-identification-datasst
WiLI-2018, the Wikipedia language identification benchmark dataset, contains 235,000 paragraphs covering 235 languages. Each language contributes 1,000 rows/paragraphs.
Model used:
Features are extracted with Bag of Words (BoW). BoW builds a vocabulary of all the unique words that occur in the training documents and represents each article, paragraph, or text as a feature vector of counts over that vocabulary. These vectors are then used to train machine learning algorithms. A count vectorizer produces the N-gram features, where N is the number of consecutive words (or characters) treated as a single entity. Training is done on the following machine learning algorithms:
Random Forest Classifier
Logistic Regression
Models ranging from uni-gram to 10-gram, at both word and character level, were trained with each of these algorithms and evaluated.
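To make the Bag of Words and N-gram idea concrete, here is a minimal pure-Python sketch. It is not the author's code: the helper names (`ngrams`, `build_vocabulary`, `vectorize`) and the two toy sentences are invented for illustration, and a real count vectorizer handles tokenization and sparse storage more carefully.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined into one feature string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, n=1):
    """Map each unique word n-gram seen in the training documents to a column index."""
    features = sorted({g for doc in documents for g in ngrams(doc.lower().split(), n)})
    return {feature: index for index, feature in enumerate(features)}


def vectorize(document, vocabulary, n=1):
    """Count how often each vocabulary n-gram occurs; unseen n-grams are ignored."""
    counts = Counter(ngrams(document.lower().split(), n))
    return [counts.get(feature, 0) for feature in vocabulary]


# Toy documents, invented for this illustration (not taken from WiLI-2018).
train_docs = ["the cat sat on the mat", "le chat est sur le tapis"]

unigram_vocab = build_vocabulary(train_docs, n=1)
bigram_vocab = build_vocabulary(train_docs, n=2)

# Columns follow the sorted vocabulary: cat, chat, est, le, mat, on, sat, sur, tapis, the
print(vectorize("the cat and the dog", unigram_vocab, n=1))
print(len(unigram_vocab), len(bigram_vocab))
```

Words that never appeared in training ("and", "dog") simply contribute nothing to the vector, which is one reason short inputs carry less signal than long ones.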
Results:
Accuracy on the test dataset was compared across all the models created. After analysis, the final model chosen for predictions on arbitrary text entered by users is a uni-gram model. The accuracy achieved by this model is 95%.
NOTE: The longer the input text, the better the accuracy.
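A comparison of this kind could be scripted roughly as follows. This is a hedged sketch that assumes scikit-learn, which the post does not name explicitly. The toy texts, the `evaluate` helper, the character-level analyzer, and the choice of `ngram_range=(n, n)` are all assumptions made for illustration, and the scores it prints say nothing about the results reported above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Toy data invented for this sketch; the post trains on WiLI-2018 instead.
train_texts = [
    "the weather is nice today", "where is the train station",
    "le temps est beau aujourd'hui", "où est la gare s'il vous plaît",
    "das wetter ist heute schön", "wo ist der bahnhof bitte",
]
train_labels = ["eng", "eng", "fra", "fra", "deu", "deu"]
test_texts = ["is the station open today", "la gare est fermée", "der bahnhof ist heute zu"]
test_labels = ["eng", "fra", "deu"]


def evaluate(n, analyzer="char"):
    """Fit both classifiers on n-gram counts and return their test accuracy."""
    classifiers = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    scores = {}
    for name, classifier in classifiers.items():
        # ngram_range=(n, n) keeps only n-grams of exactly length n (an assumption).
        model = make_pipeline(
            CountVectorizer(analyzer=analyzer, ngram_range=(n, n)), classifier
        )
        model.fit(train_texts, train_labels)
        scores[name] = accuracy_score(test_labels, model.predict(test_texts))
    return scores


# The post goes up to 10-grams; three sizes keep the sketch quick.
results = {n: evaluate(n) for n in (1, 2, 3)}
for n, scores in results.items():
    print(f"{n}-gram: {scores}")
```

Passing `analyzer="word"` to `evaluate` would switch the same loop to word-level n-grams.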
-
April 29, 2021 - Posted by Pranav B
524 views, 2 likes
The model can be used to find the entity/word pertaining to a disease from a given text.