Pure-Python implementations of the Snowball stemmers
Project description
The traditional way of using the Snowball stemmers in Python is via the pystemmer package, which provides a Python wrapper around the Snowball C library. However, Python C extensions are problematic in some environments. Therefore, this package provides pure-Python implementations of the Snowball stemming algorithms.
The implementations of the stemming algorithms are translated from the Snowball language to Python via sbl2py.
Usage
Usually, you’ll prefer to use the pystemmer module whenever that is possible, because it’s much faster than purestemmer:
    try:
        import Stemmer
    except ImportError:
        # pystemmer is not available, use purestemmer instead
        import purestemmer as Stemmer
Since purestemmer has the same public API and provides the same algorithms as pystemmer, there should be no need to change any code when switching between pystemmer and purestemmer like this.
Please see the pystemmer documentation for details on how to use the stemming algorithms.
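As a minimal sketch of what the shared API looks like in practice, the snippet below wraps the import fallback in a helper and stems a few words. The calls `Stemmer(language)` and `stemWords` come from the pystemmer API; the helper names `load_backend` and `stem_words` are hypothetical, and returning the words unchanged when neither package is installed is an assumption made only so that the example runs anywhere.

```python
import importlib


def load_backend():
    """Return the first importable stemmer module, or None if neither is installed."""
    # "Stemmer" is the module provided by pystemmer; "purestemmer" is the fallback.
    for name in ("Stemmer", "purestemmer"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def stem_words(words, language="english"):
    """Stem a list of words with whichever backend is available (hypothetical helper)."""
    backend = load_backend()
    if backend is None:
        # Assumption for this sketch only: pass the words through unchanged.
        return list(words)
    stemmer = backend.Stemmer(language)
    return stemmer.stemWords(list(words))


if __name__ == "__main__":
    print(stem_words(["cycling", "cyclist", "cycles"]))
```

Calling code only ever touches `stem_words`, so it does not need to know which backend was loaded.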
Differences between purestemmer and pystemmer
- purestemmer has only been tested on Python 2.7.
- purestemmer.Stemmer instances are thread-safe.
- purestemmer is on average about 100x slower than pystemmer.
License
purestemmer itself is covered by the MIT License. The underlying Snowball algorithms are covered by the BSD-3 License. Please see the LICENSE file for details.
Source Distribution
File details
Details for the file purestemmer-0.1.0.tar.gz.
File metadata
- Download URL: purestemmer-0.1.0.tar.gz
- Upload date:
- Size: 77.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | f639835d8620487411bd2438027f686493c9a4b6d616eb352afe3652e63639c7
MD5 | 1625ea5318086913f877bf9a23858522
BLAKE2b-256 | 473675004761951686fa7f590bb6f4117c7b5a8e0d2a087f92319859fe205f99