Simple, Pythonic text processing. Sentiment analysis, POS tagging, noun phrase parsing, and more.

TextBlob: Simplified Text Processing

Homepage: https://textblob.readthedocs.org/

TextBlob is a Python (2 and 3) library for processing textual data. It provides a consistent API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and more.

from text.blob import TextBlob

text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact."
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''

blob = TextBlob(text)
blob.pos_tags       # [(Word('The'), u'DT'), (Word('titular'), u'JJ'),
                    #  (Word('threat'), u'NN'), ...]
blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])

for sentence in blob.sentences:
    print(sentence.sentiment)  # sentiment of each sentence, not of the whole blob
# (0.060, 0.605)
# (-0.34, 0.77)

Get it now

$ pip install -U textblob
$ curl https://raw.github.com/sloria/TextBlob/master/download_corpora.py | python

Documentation

Hosted at ReadTheDocs: https://textblob.readthedocs.org/

Requirements

  • Python >= 2.6 or >= 3.3

Testing

Run

python run_tests.py

to run all tests.

License

TextBlob is licensed under the MIT license. See the bundled LICENSE file for more details.

Changelog

0.4.0 (unreleased)

  • New tokenizer module with WordTokenizer and SentenceTokenizer. Both textblob and NLTK tokenizer objects can be passed to TextBlob’s constructor. Tokens are accessed through the new tokens property.

  • New Blobber class for creating TextBlobs that share the same tagger, tokenizer, and np_extractor.

  • Backwards-incompatible: TextBlob.json() is now a method, not a property. This allows you to pass arguments (the same that you would pass to json.dumps()).

  • New home for documentation: https://textblob.readthedocs.org/

  • Fix bug with adding blobs to bytestrings.
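
The json() and Blobber entries above can be illustrated with a small standard-library sketch. MiniBlob, MiniBlobber, and comma_tokenizer are hypothetical stand-ins invented for this example, not TextBlob's own classes; only json.dumps() is a real API here. The sketch shows why a method, unlike a property, can forward arguments, and how a factory object can hand the same tokenizer to every blob it creates.

```python
import json


class MiniBlob(object):
    """Hypothetical stand-in for a blob; not TextBlob's actual class."""

    def __init__(self, text, tokenizer=None):
        self.raw = text
        # Fall back to whitespace splitting when no tokenizer is supplied.
        self.tokenizer = tokenizer or str.split

    @property
    def tokens(self):
        return self.tokenizer(self.raw)

    def json(self, *args, **kwargs):
        # A method (unlike a property) can forward json.dumps() arguments
        # such as indent or sort_keys.
        payload = {"raw": self.raw, "tokens": self.tokens}
        return json.dumps(payload, *args, **kwargs)


class MiniBlobber(object):
    """Hypothetical factory: every blob it creates shares one tokenizer."""

    def __init__(self, tokenizer=None):
        self.tokenizer = tokenizer or str.split

    def __call__(self, text):
        return MiniBlob(text, tokenizer=self.tokenizer)


def comma_tokenizer(text):
    # Toy tokenizer used only for this illustration.
    return [part.strip() for part in text.split(",")]


make_blob = MiniBlobber(tokenizer=comma_tokenizer)
first = make_blob("a, b")
second = make_blob("c, d")
print(first.json(sort_keys=True, indent=2))
```

Configuring the tokenizer once on the factory avoids repeating the same constructor arguments for every blob.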

0.3.10 (2013-08-02)

  • Bundled NLTK no longer overrides local installation.

  • Fix sentiment analysis of text with non-ascii characters.

0.3.9 (2013-07-31)

  • Updated nltk.

  • ConllExtractor is now Python 3-compatible.

  • Improved sentiment analysis.

  • Blobs are equal (with ==) to their string counterparts.

  • Added instructions to install textblob without nltk bundled.

  • Dropping official 3.1 and 3.2 support.
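
As a rough illustration of the "equal to their string counterparts" entry above, the following sketch shows one way a wrapper class can compare equal to a plain string. MiniBlob is a hypothetical class written for this example; it is not TextBlob's implementation, which may differ.

```python
class MiniBlob(object):
    """Hypothetical text wrapper; not TextBlob's actual class."""

    def __init__(self, text):
        self.raw = text

    def __eq__(self, other):
        if isinstance(other, MiniBlob):
            return self.raw == other.raw
        if isinstance(other, str):
            return self.raw == other
        # Let Python try the other operand's comparison instead.
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.raw)


blob = MiniBlob("hello")
print(blob == "hello")
print("hello" == blob)
```

Returning NotImplemented for unknown types lets the reversed comparison ("hello" == blob) fall through to MiniBlob.__eq__ as well.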

0.3.8 (2013-07-30)

  • Importing TextBlob is now much faster. This is because the noun phrase parsers are trained only on the first call to noun_phrases (instead of training them every time you import TextBlob).

  • Add text.taggers module, which allows users to change which POS tagger implementation to use. Currently supports PatternTagger and NLTKTagger (NLTKTagger only works with Python 2).

  • NPExtractor and Tagger objects can be passed to TextBlob’s constructor.

  • Fix bug with POS-tagger not tagging one-letter words.

  • Rename text/np_extractor.py -> text/np_extractors.py

  • Add run_tests.py script.
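
The import speed-up described in the first 0.3.8 entry relies on deferring expensive training until the first use. A minimal sketch of that pattern, assuming a hypothetical LazyExtractor class with a placeholder train() step (neither is part of TextBlob):

```python
class LazyExtractor(object):
    """Hypothetical extractor that defers its expensive setup."""

    def __init__(self):
        # Creating the object is cheap: nothing is trained yet.
        self._trained = False
        self.train_calls = 0

    def train(self):
        # Placeholder for expensive work such as loading a corpus.
        self.train_calls += 1
        self._trained = True

    def extract(self, text):
        # Train on the first call only; later calls reuse the result.
        if not self._trained:
            self.train()
        # Toy "extraction": keep capitalized words only.
        return [word for word in text.split() if word[:1].isupper()]


extractor = LazyExtractor()
phrases = extractor.extract("The Blob is a film")
extractor.extract("Another Call")
print(extractor.train_calls)
```

The cost of training moves from import time to the first extract() call, so code that never extracts noun phrases never pays for it.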

0.3.7 (2013-07-28)

  • Every word in a Blob or Sentence is a Word instance, which has methods for inflection, e.g. word.pluralize() and word.singularize().

  • Updated the np_extractor module. It now has a new implementation, ConllExtractor, that uses the CoNLL-2000 chunking corpus. Only works on Python 2.
