Extract metadata from HTML pages using Open Graph metadata, HTML metadata, and a series of fallbacks.
Project description
HTMLmetadata
Inspired by https://metascraper.js.org
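To make the idea of "a series of fallbacks" concrete, here is a minimal sketch that prefers Open Graph tags and falls back to plain HTML metadata. It is not the package's actual implementation: the class `MetaCollector` and the function `summarize_html` are hypothetical names, and the fallback order shown is an assumption.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tag contents and the <title> text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}          # maps the property/name attribute to its content
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Open Graph uses property="og:...", plain HTML uses name="..."
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            if key and content:
                # Keep the first value seen for each key
                self.meta.setdefault(key.lower(), content.strip())
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.title = (self.title or "") + data.strip()


def summarize_html(html):
    """Return title and description, trying Open Graph first (assumed order)."""
    parser = MetaCollector()
    parser.feed(html)
    title = parser.meta.get("og:title") or parser.title
    description = (
        parser.meta.get("og:description") or parser.meta.get("description")
    )
    return {"title": title, "description": description}
```

A page with no Open Graph tags still yields a title from `<title>` and a description from `<meta name="description">`. A field with no source at all comes back as `None`.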
Install
pip install htmlmetadata
Use
You can run the module directly from the command line, passing the URL of the page:
python -m htmlmetadata http://schema.org/docs/about.html
{
    "request": {
        "url": "http://schema.org/docs/about.html"
    },
    "summary": {
        "description": "Schema.org is a set of extensible schemas that enables webmasters to embed\n structured data on their web pages for use by search engines and other applications.",
        "title": "about page - schema.org",
        "language": "en"
    }
}
Or import it in your own code.
from htmlmetadata import extract_metadata
data = extract_metadata("http://schema.org/docs/about.html")
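The JSON printed by the command-line call can also be consumed from another script. The sketch below parses a trimmed copy of that output with the standard library and reads the `summary` fields defensively. `read_summary` is a hypothetical helper, not part of htmlmetadata. Whether `extract_metadata` returns the same structure is an assumption the source does not confirm.

```python
import json

# Trimmed copy of the command-line output shown above
# (the "description" field is omitted here for brevity).
SAMPLE_OUTPUT = """
{
    "request": {"url": "http://schema.org/docs/about.html"},
    "summary": {
        "title": "about page - schema.org",
        "language": "en"
    }
}
"""


def read_summary(raw_json):
    """Flatten the request URL and summary fields into one dict.

    Missing fields come back as None instead of raising KeyError.
    """
    data = json.loads(raw_json)
    summary = data.get("summary", {})
    return {
        "url": data.get("request", {}).get("url"),
        "title": summary.get("title"),
        "description": summary.get("description"),
        "language": summary.get("language"),
    }


if __name__ == "__main__":
    print(read_summary(SAMPLE_OUTPUT))
```

Using `.get()` throughout keeps the script working when a page provides no description or language.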
Download files
Source Distribution
htmlmetadata-1.0.zip (8.3 kB)
Built Distribution
htmlmetadata-1.0-py2.py3-none-any.whl (5.4 kB)
File details
Details for the file htmlmetadata-1.0.zip.
File metadata
- Download URL: htmlmetadata-1.0.zip
- Upload date:
- Size: 8.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | f4e3934edd422e90acbc3de0fc68008564c7fde00ce944804d9c362f2c7509d5
MD5 | 643df0df7ff0bf232856ecbb227d4b93
BLAKE2b-256 | b017524ab54164fcc98279f9b3d005e15ff929a340d01d0e5f636bf48d6f63c3
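A downloaded file can be checked against the published SHA256 digest before installing it. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `matches_published_hash` are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with the value listed on the page."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_published_hash("htmlmetadata-1.0.zip", "f4e3934edd422e90acbc3de0fc68008564c7fde00ce944804d9c362f2c7509d5")`.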
File details
Details for the file htmlmetadata-1.0-py2.py3-none-any.whl.
File metadata
- Download URL: htmlmetadata-1.0-py2.py3-none-any.whl
- Upload date:
- Size: 5.4 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6715d60226dbcbb826f79472771b45089f08cf9d6e65d323b5c9ba77bb27f171
MD5 | 43a4cab795f1e1446d51d22300524e74
BLAKE2b-256 | 177ea3bd8045025135c40615b2318e5d741d4e8465b9895ad754fc1ea323b68e