This blueprint extracts the title, description and body from HTML, either via XPath or by automatic cluster analysis.
Project description
Introduction
- transmogrify.htmlcontentextractor
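To make the XPath-based approach concrete, here is a minimal sketch of pulling a title, description and body out of a page with one XPath per field. It is not the blueprint's own code or configuration: the sample page, the `FIELD_XPATHS` mapping and the `extract_fields` helper are all hypothetical, and it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and needs well-formed markup. Real-world HTML would likely need a more forgiving parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from the package).
SAMPLE_HTML = """\
<html>
  <head>
    <title>Example page</title>
    <meta name="description" content="A short summary." />
  </head>
  <body>
    <div id="nav">Home | About</div>
    <div id="content"><p>First paragraph.</p><p>Second paragraph.</p></div>
  </body>
</html>
"""

# Hypothetical field-to-XPath mapping; the field names are illustrative only.
FIELD_XPATHS = {
    "title": ".//head/title",
    "description": ".//meta[@name='description']",
    "text": ".//div[@id='content']",
}


def extract_fields(html, xpaths):
    """Return a dict of field name -> extracted text for each XPath that matches."""
    root = ET.fromstring(html)
    item = {}
    for field, xpath in xpaths.items():
        node = root.find(xpath)
        if node is None:
            continue  # leave the field out when the XPath does not match
        if node.tag == "meta":
            # meta tags carry their value in the "content" attribute
            item[field] = node.get("content", "").strip()
        else:
            # itertext() walks all descendant text, so nested tags are flattened
            item[field] = " ".join(t.strip() for t in node.itertext() if t.strip())
    return item


item = extract_fields(SAMPLE_HTML, FIELD_XPATHS)
```

With this sample page, `item` holds the three fields and the navigation `div` is ignored, because no XPath selects it.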
Changelog
1.0b1 (2010-11-03)
Ignore already-found items; better debug output [“Dylan Jay”]
Skip templates if the item has already been parsed [“Dylan Jay”]
Print automatically found XPaths [“Dylan Jay”]
Make text fields strip tail text [“Vitaliy Podoba”]
1.0dev (2010-03-22)
Split the auto templatefinder out into its own blueprint [“Dylan Jay”]
Hashes for transmogrify.htmlcontentextractor-1.0b1.zip
Algorithm | Hash digest
---|---
SHA256 | 1698e6ef619b670ba16ed72492aa05e649bbc7b78dd951b07d79f11ec161f04f
MD5 | 74cf35ddd26825c6acc3a2c92f2da7ba
BLAKE2b-256 | be872d68d34c6d889d0a230ba259b57ce4cc1898992acd021d4e5f295636be0d