Truncating HTML with html5lib filter

html5lib-truncation is an html5lib filter implementation that truncates HTML to a specific display length without ever breaking HTML tags.

The simplest way to use it is the shortcut function truncate_html:

>>> from html5lib_truncation import truncate_html
>>>
>>> html = u'<p>A <a href="#">very very long link</a></p>'
>>> truncate_html(html, 8)
u'<p>A <a href=#>very</a>'
>>> truncate_html(html, 8, break_words=True)
u'<p>A <a href=#>very ve</a>'
>>> truncate_html(html, 20, end='...')
u'<p>A <a href=#>very very...</a>'
>>> truncate_html(html, 20, end='...', break_words=True)
u'<p>A <a href=#>very very lon...</a>'
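
To make the idea of tag-safe truncation concrete, here is a minimal standalone sketch built only on the standard library's html.parser. It is not how html5lib-truncation is implemented, and its output format and character counting differ from the results shown above. The names TruncatingParser and truncate_sketch are hypothetical. The sketch counts only text characters against the limit, always cuts mid-word (roughly like break_words=True), and closes any tags that are still open when the budget runs out.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TruncatingParser(HTMLParser):
    """Hypothetical helper: re-emits HTML until `limit` text characters are used."""

    def __init__(self, limit, end=""):
        super().__init__(convert_charrefs=True)
        self.remaining = limit   # text characters still allowed
        self.end = end           # suffix appended at the cut point
        self.out = []            # output fragments
        self.open_tags = []      # stack of tags awaiting a closing tag
        self.done = False        # True once the text budget is exhausted

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        rendered = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
        )
        self.out.append("<%s%s>" % (tag, rendered))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done:
            return
        # Only close the innermost open tag; stray end tags are ignored.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.done:
            return
        if len(data) <= self.remaining:
            self.remaining -= len(data)
            self.out.append(escape(data, quote=False))
            return
        # Cut the text here; tags are never split because they are not counted.
        self.out.append(escape(data[:self.remaining], quote=False) + self.end)
        self.done = True

    def result(self):
        # Close whatever is still open, innermost first.
        closing = "".join("</%s>" % tag for tag in reversed(self.open_tags))
        return "".join(self.out) + closing


def truncate_sketch(html, limit, end=""):
    parser = TruncatingParser(limit, end)
    parser.feed(html)
    parser.close()
    return parser.result()


html_input = '<p>A <a href="#">very very long link</a></p>'
print(truncate_sketch(html_input, 8))             # <p>A <a href="#">very v</a></p>
print(truncate_sketch(html_input, 8, end="..."))  # <p>A <a href="#">very v...</a></p>
```

For real use, prefer truncate_html: it works on html5lib's parsed tree and supports word-boundary truncation, which this sketch does not attempt.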

Installation

pip install html5lib-truncation

Don’t forget to put it into your requirements.txt or setup.py.

API Overview

The core API of html5lib-truncation is the filter:

import html5lib
from html5lib_truncation import TruncationFilter

etree = html5lib.parse(u'<p>A <a href="#">very very long link</a></p>')
walker = html5lib.getTreeWalker('etree')

stream = walker(etree)
stream = TruncationFilter(stream, 20, end='...', break_words=True)

serializer = html5lib.serializer.HTMLSerializer()
serialized = serializer.serialize(stream)

print(u''.join(serialized).strip())

The output is <p>A <a href=#>very very lon...</a>.
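
The filter pattern itself can be sketched without html5lib installed. The example below is only an illustration of the idea and not the source of TruncationFilter: it wraps an iterable of token dicts, passes them through until the text budget is spent, then emits closing tags for everything still open. The token dicts are simplified assumptions modeled loosely on what html5lib tree walkers emit (a "type" key with values such as StartTag, EndTag, and Characters); truncate_tokens and render are hypothetical names.

```python
def truncate_tokens(tokens, limit, end=""):
    """Yield tokens until `limit` text characters have been emitted."""
    remaining = limit
    open_tags = []  # names of elements opened but not yet closed
    for token in tokens:
        kind = token["type"]
        if kind == "StartTag":
            open_tags.append(token["name"])
            yield token
        elif kind == "EndTag":
            if open_tags:
                open_tags.pop()
            yield token
        elif kind in ("Characters", "SpaceCharacters"):
            text = token["data"]
            if len(text) <= remaining:
                remaining -= len(text)
                yield token
                continue
            # Budget exhausted: emit the cut text, close open tags, stop.
            yield {"type": "Characters", "data": text[:remaining] + end}
            for name in reversed(open_tags):
                yield {"type": "EndTag", "name": name}
            return
        else:
            # Other token types (comments, doctype) pass through unchanged.
            yield token


def render(tokens):
    """Tiny serializer for the simplified tokens (attributes are ignored)."""
    parts = []
    for token in tokens:
        if token["type"] == "StartTag":
            parts.append("<%s>" % token["name"])
        elif token["type"] == "EndTag":
            parts.append("</%s>" % token["name"])
        elif token["type"] in ("Characters", "SpaceCharacters"):
            parts.append(token["data"])
    return "".join(parts)


stream = [
    {"type": "StartTag", "name": "p"},
    {"type": "Characters", "data": "A "},
    {"type": "StartTag", "name": "a"},
    {"type": "Characters", "data": "very very long link"},
    {"type": "EndTag", "name": "a"},
    {"type": "EndTag", "name": "p"},
]
print(render(truncate_tokens(stream, 8, end="...")))  # <p>A <a>very v...</a></p>
```

Because the filter is a generator over a token stream, it composes with other filters and with the serializer in the same way TruncationFilter does in the example above.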

Issues

To report bugs or other issues, please open an issue on GitHub Issues.

Contributing

You can send a pull request on GitHub.

Download files

Source Distribution

html5lib-truncation-0.1.0.tar.gz (4.4 kB)

Uploaded Source

Built Distribution

html5lib_truncation-0.1.0-py2.py3-none-any.whl (8.0 kB)

Uploaded Python 2, Python 3

File details

Details for the file html5lib-truncation-0.1.0.tar.gz.

File hashes

Hashes for html5lib-truncation-0.1.0.tar.gz

Algorithm      Hash digest
SHA256         f69d3b3e31d4e9caef138d4602ed8eb531eaafd94b6f9ee8b4932722cd3d0308
MD5            7e41e3c92ce9fdd8590c7899415dc056
BLAKE2b-256    cf0dd07cc96c60000dfa1afd446b0660e6f3f8ff59ef0d513dafc907f1d3ee60


File details

Details for the file html5lib_truncation-0.1.0-py2.py3-none-any.whl.

File hashes

Hashes for html5lib_truncation-0.1.0-py2.py3-none-any.whl

Algorithm      Hash digest
SHA256         2179c3d04a948aaf4ce8b4472b18fa3b2cb9f4956eb3810942be620faf36a9d3
MD5            01b9926eed1bebe6d6538945697b8d6b
BLAKE2b-256    ca9685470bf06ca3a5fef024aefe29514516871c169e4c42e8e1c2ff81d48513

