TextBlob: Simplified Text Processing
Release v0.5.0. (Installation)
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, translation, and more.
from text.blob import TextBlob
text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''
blob = TextBlob(text)
blob.tags           # [(u'The', u'DT'), (u'titular', u'JJ'),
                    #  (u'threat', u'NN'), (u'of', u'IN'), ...]
blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])
for sentence in blob.sentences:
    print(sentence.sentiment)  # returns (polarity, subjectivity)
    # (0.060, 0.605)
    # (-0.341, 0.767)
blob.translate(to="es")  # 'La amenaza titular de The Blob...'
Features
- Noun phrase extraction
- Part-of-speech tagging
- Sentiment analysis
- Language translation and detection powered by Google Translate (new in 0.5.0)
- Tokenization (splitting text into words and sentences)
- Word and phrase frequencies
- n-grams
- Word inflection (pluralization and singularization)
- JSON serialization
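To make three of the features above concrete (tokenization, word frequencies, and n-grams), here is a minimal standard-library sketch of what those terms mean. It does not use TextBlob and is not how TextBlob implements them; the helper names `tokenize` and `ngrams` and the sample sentence are invented for illustration, and the regex tokenizer is far cruder than a real NLP tokenizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (crude regex tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, not taken from the TextBlob documentation.
tokens = tokenize("The blob ate the town. The town fled.")
word_counts = Counter(tokens)   # word -> number of occurrences
trigrams = ngrams(tokens, n=3)  # sliding window of three tokens

print(tokens)
print(word_counts["the"])
print(trigrams[0])
```

With a library such as TextBlob, these steps are exposed as attributes and methods on the blob object, so you would not normally write them by hand; see the Quickstart for the actual API.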
Get it now
$ pip install -U textblob
$ curl https://raw.github.com/sloria/TextBlob/master/download_corpora.py | python
Guide
- License
- Installation
- Quickstart
- Create a TextBlob
- Part-of-speech Tagging
- Noun Phrase Extraction
- Sentiment Analysis
- Tokenization
- Words and Inflection
- WordLists
- Get Word and Noun Phrase Frequencies
- Translation and Language Detection
- TextBlobs Are Like Python Strings!
- n-grams
- Get Start and End Indices of Sentences
- Dealing with HTML
- Get a JSON-serialized version of a blob
- Advanced Usage
- API Reference