Extracts body text from MediaWiki wikitext by stripping templates, HTML tags, tables, headers, etc.

Project description

mwtextextractor extracts plain body text from MediaWiki wikitext by stripping templates, HTML tags, tables, headers, and similar markup. The extracted text can then be used for word counting.

Example:

from mwtextextractor import get_body_text

# Strip the {{ipsum}} template and print the remaining body text
print(get_body_text('Lorem {{ipsum}} dolor'))
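
Since the description mentions word counting, here is a minimal sketch of that follow-up step. The `count_words` helper and its regular expression are assumptions made for illustration and are not part of mwtextextractor. The helper works on a plain string, so the sample input below stands in for text that has already been extracted.

```python
import re


def count_words(body_text):
    """Count word-like tokens in already-extracted plain text.

    Hypothetical helper, not provided by mwtextextractor.
    """
    # \w+ matches runs of letters, digits and underscores
    # (Unicode-aware for str input in Python 3).
    return len(re.findall(r"\w+", body_text))


# Stand-in for text whose markup has already been stripped.
plain = "Lorem dolor sit amet"
print(count_words(plain))  # prints 4

# With the package installed, the two steps could be chained like this:
# from mwtextextractor import get_body_text
# print(count_words(get_body_text('Lorem {{ipsum}} dolor')))
```

A regular expression is only one way to define a word. Splitting on whitespace or using a language-aware tokenizer would give different counts for punctuation-heavy or non-Latin text.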


Download files

Source Distribution

mwtextextractor-0.1.2.tar.gz (3.0 kB)

Uploaded Source

Built Distribution

mwtextextractor-0.1.2-py2.py3-none-any.whl (3.6 kB)

Uploaded Python 2 Python 3

File details

Details for the file mwtextextractor-0.1.2.tar.gz.

File hashes

Hashes for mwtextextractor-0.1.2.tar.gz
Algorithm Hash digest
SHA256 910877eac0f7a45e301e78081cbba7e59579a287a80e0170ad3f9bdadfa8537c
MD5 fa805f6a65d256fc288323f870bb2c2f
BLAKE2b-256 d5f5075d52b7fc695f19cf4fb37781d0d01374a0bf0d16e3277bb705cdc3fc36
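
A downloaded file can be checked against the published digests with Python's standard `hashlib` module. The following is a rough sketch, not an official procedure: the helper names `sha256_of` and `matches_published_hash` are made up for illustration, and the file path in the commented usage line assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Hypothetical usage, assuming the sdist sits in the current directory:
# matches_published_hash(
#     "mwtextextractor-0.1.2.tar.gz",
#     "910877eac0f7a45e301e78081cbba7e59579a287a80e0170ad3f9bdadfa8537c",
# )
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 3 kB archive but makes the helper reusable for larger downloads.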

File details

Details for the file mwtextextractor-0.1.2-py2.py3-none-any.whl.

File hashes

Hashes for mwtextextractor-0.1.2-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 1aa24b115dfd5af458850173fda7526f90136433fe0d9d86b4b31d77f22bf4ff
MD5 c522b22eb00d809f440923e026ff031a
BLAKE2b-256 a27bdb6a258f3805c9db821cb841b862a29bf55baa29bd1ac30b34dbe94b09cf
