NLTK vs Stanford NLP

  • I have recently started using the NLTK toolkit for building a few solutions in Python.

    I hear a lot of community activity around Stanford NLP. Can anyone tell me what the difference between NLTK and Stanford NLP is? Are they two different libraries? I know that NLTK has an interface to Stanford NLP, but can anyone shed some light on a few basic differences, or go into more detail?

    Can Stanford NLP be used from Python?

    This post was edited by Rakesh Racharla at September 18, 2020 5:17 PM IST
      August 1, 2020 3:25 PM IST
    0
  • The choice will depend on your use case. NLTK is great for pre-processing and tokenizing text. It also includes a good POS tagger. Stanford CoreNLP is a bit of overkill for tokenizing/POS tagging alone, because it requires more resources.
    But one fundamental difference is that you can't parse syntactic dependencies out of the box with NLTK. You need to specify a grammar for that, which can be very tedious if the text domain is not restricted. Stanford NLP, on the other hand, provides a probabilistic parser for general text as a downloadable model, which is quite accurate. It also has built-in NER (Named Entity Recognition) and more. I would also recommend taking a look at spaCy, which is written in Python, easy to use, and much faster than CoreNLP.
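    The pre-processing and tokenizing mentioned above can be sketched in a few lines (a minimal example, assuming `nltk` is installed; the `TreebankWordTokenizer` works without any extra data downloads, whereas running `nltk.pos_tag` would typically require `nltk.download('averaged_perceptron_tagger')` first):

```python
# Minimal NLTK pre-processing sketch: rule-based word tokenization.
# TreebankWordTokenizer works out of the box, with no corpus downloads.
from nltk.tokenize import TreebankWordTokenizer

tokenizer = TreebankWordTokenizer()
tokens = tokenizer.tokenize("NLTK is great for tokenizing text.")
print(tokens)  # punctuation is split off as its own token
```

    For dependency parses you would still need to reach for CoreNLP, spaCy, or a hand-written grammar, as described above.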
      August 1, 2020 3:26 PM IST
    0
  • We have been using the NLTK toolkit mainly for chatbot development. It is a great tool for building innovative AI-based conversational bots. For NLP we use Python; we also use Pandas, a Python software library, for the chatbot's data processing and analysis. When it comes to Stanford NLP for chatbot development services, CoreNLP is your one-stop shop for natural language processing in Java. CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports eight languages: Arabic, Chinese, English, French, German, Hungarian, Italian, and Spanish.
      October 11, 2022 8:58 AM IST
    0
  • NLTK can be used for the learning phase, when you want to perform natural language processing from scratch and at a basic level. Stanford NLP gives you high-level flexibility to get tasks done quickly and easily.

    If you want something fast and ready for production use, go for Stanford NLP.

      September 11, 2020 4:46 PM IST
    0
  • Stanford released Stanza, a Python library based on Stanford NLP. You can find it here: https://stanfordnlp.github.io/stanza/

    If you are familiar with the spaCy NLP library, it is quite similar:

    >>> import stanza
    >>> stanza.download('en')       # download the English model (one-time setup)
    >>> nlp = stanza.Pipeline('en') # initialize the English neural pipeline
    >>> doc = nlp("Barack Obama was born in Hawaii.") # run annotation over a sentence
    >>> for ent in doc.ents:        # inspect the named entities it found
    ...     print(ent.text, ent.type)
      September 11, 2020 4:46 PM IST
    0
    • Jainew Nanda
      Jainew Nanda @Samar Patil, any chance you would know the difference between the new Stanza and the original stanfordnlp libraries?
      September 11, 2020
    • Samar Patil
      Samar Patil @Jainew Nanda, actually I can't answer that precisely, since I have never used the Java stanfordnlp in my projects because I use Python. But on their website stanfordnlp.github.io/stanza/corenlp_client.html they claim that Stanza actually accesses their native Java...
      September 11, 2020
  • They are two different libraries.

    • Stanford CoreNLP is written in Java
    • NLTK is a Python library

    The main functional difference is that NLTK offers interfaces to multiple versions of various NLP tools, while Stanford CoreNLP ships only its own implementation. NLTK also supports installing third-party Java projects, and even includes instructions on its wiki for installing some Stanford NLP packages.

    Both have good support for English, but if you are dealing with other languages, the coverage differs between the two libraries.

    That said, which one is "best" will depend on your specific application and required performance (what features you are using, language, vocabulary, desired speed, etc.).

    Can Stanford NLP be used from Python?

    Yes, there are a number of interfaces and packages for using Stanford CoreNLP from Python (independent of NLTK).
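    One such interface is NLTK's own wrapper around a running CoreNLP server. A minimal sketch (assuming `nltk` is installed and CoreNLP has been downloaded separately; the parse call is commented out because it only works once the server is actually running on port 9000):

```python
# Sketch: querying a locally running CoreNLP server through NLTK's wrapper.
# Assumes the server was started separately, e.g.:
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000
from nltk.parse.corenlp import CoreNLPParser

parser = CoreNLPParser(url="http://localhost:9000")

# With the server running, this would yield a constituency parse tree:
# tree = next(parser.raw_parse("Barack Obama was born in Hawaii."))
# tree.pretty_print()
```

    Stanza (mentioned above) is another option, and it can also act as a client to a CoreNLP server via its `stanza.server` module.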

    This post was edited by Jasmine Chacko at September 11, 2020 4:54 PM IST
      September 11, 2020 4:53 PM IST
    0