Python wrapper for HTML Tidy (tidylib), compatible with Python 2 and 3

Project description

0.2.0: Works on Windows! See documentation for available DLL download locations. Documentation rewritten and expanded.

PyTidyLib is a Python package that wraps the HTML Tidy library. This allows you, from Python code, to “fix” invalid (X)HTML markup. Some of the library’s many capabilities include:

  • Clean up unclosed tags and unescaped characters such as ampersands

  • Output HTML 4 or XHTML, strict or transitional, and add missing doctypes

  • Convert named entities to numeric entities, which can then be used in XML documents without an HTML doctype

  • Clean up HTML from programs such as Word (to an extent)

  • Indent the output, including proper (i.e. no) indenting for pre elements, which some (X)HTML indenting code overlooks
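
The capabilities listed above are selected through HTML Tidy configuration options, passed to PyTidyLib as an options dictionary. The following is a minimal sketch of how such a dictionary might be assembled. The helper name build_tidy_options is hypothetical and not part of PyTidyLib, and the option names and values (output-xhtml, doctype, numeric-entities, word-2000, indent) are assumed from the HTML Tidy option reference, so they should be checked against the installed Tidy version.

```python
def build_tidy_options(xhtml=True, strict=False, numeric_entities=True,
                       from_word=False, indent=True):
    """Build an options dict for tidylib (hypothetical helper, not a PyTidyLib API).

    Option names are assumed from the HTML Tidy option reference.
    """
    return {
        # Emit XHTML rather than HTML.
        "output-xhtml": 1 if xhtml else 0,
        # Doctype to declare: strict or transitional.
        "doctype": "strict" if strict else "transitional",
        # Convert named entities (e.g. &otilde;) to numeric ones.
        "numeric-entities": 1 if numeric_entities else 0,
        # Strip markup that Word adds when saving as HTML.
        "word-2000": 1 if from_word else 0,
        # Indent block-level content in the output.
        "indent": 1 if indent else 0,
    }


options = build_tidy_options(strict=True, from_word=True)
```

The resulting dictionary can be passed as the options argument of tidy_document.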

Small example of use

The following code cleans up an invalid HTML document and sets an option:

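from tidylib import tidy_document

# Tidy an invalid document; the 'numeric-entities' option converts
# named entities such as &otilde; to numeric ones.
document, errors = tidy_document('''<p>f&otilde;o <img src="bar.jpg">''',
  options={'numeric-entities':1})
print(document)  # the cleaned-up markup
print(errors)    # warnings and errors reported by HTML Tidy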
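
Since tidy_document returns the cleaned markup together with Tidy's messages as a single string, it can be convenient to split that string into individual messages. The sketch below assumes the messages arrive one per line. The wrapper tidy_and_report and its tidy parameter are hypothetical names introduced here for illustration, not part of PyTidyLib.

```python
def tidy_and_report(markup, options=None, tidy=None):
    """Tidy markup and return (document, list_of_messages).

    Hypothetical convenience wrapper. `tidy` defaults to
    tidylib.tidy_document and can be replaced with a stand-in for testing.
    """
    if tidy is None:
        # Imported lazily so the helper can be loaded without libtidy installed.
        from tidylib import tidy_document as tidy
    document, errors = tidy(markup, options=options or {})
    # Assumption: Tidy reports one message per line; drop blank lines.
    messages = [line.strip() for line in errors.splitlines() if line.strip()]
    return document, messages
```

With the HTML Tidy library installed, calling tidy_and_report('<p>f&otilde;o <img src="bar.jpg">', {'numeric-entities': 1}) returns the tidied document and a list of messages that can be logged or inspected one at a time.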

Docs

Documentation is shipped with the source distribution and is available at the PyTidyLib web page.

Download files

Source Distribution

pytidylib6-0.2.2.tar.gz (155.1 kB view details)

Uploaded Source

File details

Details for the file pytidylib6-0.2.2.tar.gz.

File metadata

  • Download URL: pytidylib6-0.2.2.tar.gz
  • Size: 155.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pytidylib6-0.2.2.tar.gz
Algorithm     Hash digest
SHA256        421ae35f32a32610cf3e8a5c85b830c92fb8c620192b42916fb8a4a34d5bc006
MD5           9bb597536c886b7a3ed56bc5d0e8113c
BLAKE2b-256   a54498ddad5e111352282b82f763aba8bf75da9adb789f473a588028edbbf145
