library to compare HTML while ignoring non-functional differences
htmlcompare
A Python library to check whether two HTML documents are "equal". Currently the functionality is very limited, but the idea is that the library should automatically ignore differences that are not relevant to HTML semantics (e.g. <img style=""> should be the same as <img>).
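The idea above can be illustrated with a small standard-library sketch. This is not htmlcompare's implementation; the names `_EventCollector`, `normalized_events` and `looks_same` are hypothetical, and the code assumes Python 3. It parses both snippets with `html.parser`, drops empty `style` attributes, ignores attribute order, and compares the resulting event lists.

```python
from html.parser import HTMLParser


class _EventCollector(HTMLParser):
    """Collects a normalized list of parse events (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Drop an empty style attribute; it carries no meaning.
        kept = [
            (name, value)
            for name, value in attrs
            if not (name == "style" and not (value or "").strip())
        ]
        # Sort so that attribute order does not affect the comparison.
        kept.sort(key=lambda pair: (pair[0], pair[1] or ""))
        self.events.append(("start", tag, tuple(kept)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Ignore whitespace-only text nodes.
        if data.strip():
            self.events.append(("text", data.strip()))


def normalized_events(html):
    collector = _EventCollector()
    collector.feed(html)
    collector.close()
    return collector.events


def looks_same(html_a, html_b):
    """Return True if both snippets produce the same normalized events."""
    return normalized_events(html_a) == normalized_events(html_b)


print(looks_same('<img style="">', '<img>'))
print(looks_same('<div>', '<p>'))
```

A real comparison needs far more HTML knowledge than this (void elements, implied end tags, whitespace rules), which is the kind of detail the library is meant to take care of.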
Usage
```python
import htmlcompare
diff = htmlcompare.compare('<div>', '<p>')
is_same = bool(diff)
```
To ease testing, the library provides some helpers:
```python
from htmlcompare import assert_different_html, assert_same_html
assert_different_html('<br>', '<p>')
assert_same_html('<div />', '<div></div>')
```
Limitations / Plans
CSS is currently not validated. Later I hope to add CSS parsing using a real CSS parser like tinycss2, but right now the only CSS support is that the contents of <style> tags are completely ignored and trailing ; characters in style attributes are stripped.
No validation of conditional comments. Not sure which library I can use here but at some point I'll likely need this as well.
JavaScript: for obvious reasons it will be impossible to implement perfect JS comparison, but it might be possible to run some kind of "beautifier" to take care of insignificant stylistic changes. However, I don't need this feature, so it is unlikely to be implemented (unless contributed by someone else).
Custom hooks could help adapt the comparison to your specific needs. However, I don't know which API would be best, so this will wait until there are real-world use cases.
Better API: The current API is very minimal and implements just what I needed right now. I hope to improve the API once I use this project in more complex scenarios.
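The trailing-semicolon handling mentioned in the CSS item above can be sketched as follows. This is only an illustration of the technique under the assumption that Python 3 is used, not the library's actual code; `normalize_style` and `same_style` are hypothetical names.

```python
def normalize_style(value):
    """Strip surrounding whitespace and trailing semicolons from a style value."""
    # rstrip removes any trailing mix of semicolons and whitespace.
    return value.rstrip("; \t\r\n").lstrip()


def same_style(style_a, style_b):
    """Compare two style attribute values after the minimal normalization."""
    return normalize_style(style_a) == normalize_style(style_b)


print(same_style("color: red;", "color: red"))
print(same_style("color:red", "color: red"))
```

The second call shows the limit of this approach: declarations that differ only in internal whitespace still compare as different, which is why a real CSS parser such as tinycss2 would be needed for anything beyond trailing semicolons.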
Other projects
xmldiff is a well-established project for comparing two XML documents. However, its code does not seem to contain knowledge of HTML-specific semantics (e.g. CSS, empty attributes, insignificant attribute order).
Misc
The code is licensed under the MIT license. It supports Python 2.7 and Python 3.4+.