HTML parser based on the WHATWG Web Applications 1.0 specification

Project description

html5lib is an HTML parser designed to follow the WHATWG HTML5 specification. It handles all flavours of HTML and parses invalid documents using well-defined error-handling rules that are compatible with the behaviour of major desktop web browsers.

The parser outputs a tree structure; the current release can build ElementTree (including cElementTree and lxml.etree) trees, minidom trees, and a custom simpletree format.
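As an illustration of the error handling and tree output described above, here is a minimal sketch. It assumes a recent html5lib release (the 1.x series) is installed; `html5lib.parse` and its `treebuilder` and `namespaceHTMLElements` arguments come from that series and may differ in 0.10. The sample markup is made up for the example.

```python
import html5lib

# Deliberately invalid markup: no <html> or <body>, and unclosed <p> and <b>.
broken = "<p>first<p>second <b>bold"

# Build an ElementTree tree. namespaceHTMLElements=False keeps tag names
# plain ("p") instead of qualifying them with the XHTML namespace.
root = html5lib.parse(broken, treebuilder="etree", namespaceHTMLElements=False)
paragraphs = root.findall(".//p")
print(root.tag, len(paragraphs))

# Build a minidom tree from the same input.
dom = html5lib.parse(broken, treebuilder="dom")
dom_paragraphs = dom.getElementsByTagName("p")
print(len(dom_paragraphs))
```

Both trees should contain the implied `html`, `head` and `body` elements and two separate `p` elements, which is the same recovery a browser would apply to this input.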

html5lib also includes an HTML sanitizer, “treewalkers” that convert the various tree formats into token streams, and filters and serializers that operate on those streams.
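The treewalker, filter and serializer pipeline can be sketched as follows. This is an illustration only, assuming a recent html5lib release (the 1.x series), where the sanitizer is a stream filter (`html5lib.filters.sanitizer.Filter`, which is marked as deprecated there); the 0.10 API is organised differently. The untrusted snippet is invented for the example.

```python
import html5lib
from html5lib.filters import sanitizer
from html5lib.serializer import HTMLSerializer

# Hypothetical untrusted input with an event handler and a script element.
untrusted = '<p onclick="steal()">Hello <script>alert(1)</script><b>world</b></p>'

# 1. Parse the fragment into an ElementTree tree.
fragment = html5lib.parseFragment(untrusted, treebuilder="etree")

# 2. Walk the tree to produce a stream of tokens.
walker = html5lib.getTreeWalker("etree")
stream = walker(fragment)

# 3. Filter the stream: disallowed attributes are dropped and
#    disallowed tags are turned into plain text.
clean_stream = sanitizer.Filter(stream)

# 4. Serialize the filtered stream back to a string.
serializer = HTMLSerializer(omit_optional_tags=False)
clean_html = serializer.render(clean_stream)
print(clean_html)
```

Each stage consumes and yields the same kind of token stream, so additional filters can be chained between the walker and the serializer.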

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

html5lib-0.10.tar.gz (145.7 kB view details)

Uploaded Source

File details

Details for the file html5lib-0.10.tar.gz.

File metadata

  • Download URL: html5lib-0.10.tar.gz
  • Upload date:
  • Size: 145.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for html5lib-0.10.tar.gz
Algorithm    Hash digest
SHA256       02ac9b9d87bfe5a8e68dba07bb59be0cc9017ece1456ad0367c22dde760b47e0
MD5          48a0c483ae4e8aa71d1703cf4837c52a
BLAKE2b-256  f689c388a695d6a3d274c27c113d2e489c517bf1db8ab7e9b9ee1f8f41f69aad
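
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib
import hmac

# SHA256 digest published for html5lib-0.10.tar.gz (copied from the table above).
EXPECTED_SHA256 = "02ac9b9d87bfe5a8e68dba07bb59be0cc9017ece1456ad0367c22dde760b47e0"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

For example, `matches_expected("html5lib-0.10.tar.gz")` should return `True` only when the local file is byte-for-byte identical to the published one.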

