Parse hentry from microformats.

Project description

Parse a well-structured webpage that uses microformats markup. If you are not familiar with microformats, see http://microformats.org/wiki/hentry.

Typical hentry markup looks like this:

<article class="hentry">
    <h1 class="entry-title">Article title</h1>
    <time class="updated" datetime="2014-11-06T20:00:00Z" pubdate>2014-11-06</time>
    <div class="entry-content">
        <p>Here is the content</p>
    </div>
    <div class="entry-tags">
        <a href="#tag1" rel="tag">tag1</a>
        <a href="#tag2" rel="tag">tag2</a>
    </div>
    <div class="vcard author">
        <span class="fn">Author Name</span>
    </div>
</article>

With this library, hentry.py, you can parse the HTML into metadata:

hentry.parse_html(text, format='html')
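To make the mapping from markup to metadata concrete, here is a minimal sketch that pulls a few fields out of the sample above using only the standard library. It is an illustration, not how hentry is implemented: the class HEntrySketch, the function sketch_parse, and the choice to read pubdate from the datetime attribute are all assumptions made for this example.

```python
from html.parser import HTMLParser

SAMPLE = """
<article class="hentry">
    <h1 class="entry-title">Article title</h1>
    <time class="updated" datetime="2014-11-06T20:00:00Z" pubdate>2014-11-06</time>
    <div class="entry-tags">
        <a href="#tag1" rel="tag">tag1</a>
        <a href="#tag2" rel="tag">tag2</a>
    </div>
    <div class="vcard author"><span class="fn">Author Name</span></div>
</article>
"""


class HEntrySketch(HTMLParser):
    """Collects a few hentry fields. Hypothetical helper, illustration only."""

    def __init__(self):
        super().__init__()
        self.result = {"title": None, "author": None, "pubdate": None, "tags": []}
        self._field = None  # field whose text should be captured next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        rels = (attrs.get("rel") or "").split()
        if "entry-title" in classes:
            self._field = "title"
        elif "fn" in classes:
            self._field = "author"
        elif tag == "a" and "tag" in rels:
            self._field = "tags"
        elif "updated" in classes and attrs.get("datetime"):
            # Assumption: take the machine-readable timestamp, not the visible text.
            self.result["pubdate"] = attrs["datetime"]

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "tags":
            self.result["tags"].append(text)
        else:
            self.result[self._field] = text
        self._field = None  # capture only the first text node per element


def sketch_parse(html_text):
    parser = HEntrySketch()
    parser.feed(html_text)
    parser.close()
    return parser.result


entry = sketch_parse(SAMPLE)
print(entry)
```

The sketch ignores nested markup inside a field and handles only one entry per page; a real parser has to deal with both.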

Installation

Install hentry with pip:

$ pip install hentry

Basic Usage

Parse a webpage with a url:

hentry.parse_url(url)

Parse a webpage with html content:

hentry.parse_html(content)
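When the input may be either a URL or raw HTML, a small dispatcher can pick between the two calls. This is a sketch under assumptions: looks_like_url and parse_any are hypothetical helpers, not part of hentry, and the two parser functions are passed in as arguments so the snippet runs without the library installed.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Return True for strings that parse as an http(s) URL with a host."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def parse_any(source, parse_url, parse_html):
    """Call parse_url for URL-like input, parse_html for everything else."""
    if looks_like_url(source):
        return parse_url(source.strip())
    return parse_html(source)
```

With the library installed, the call would presumably be parse_any(source, hentry.parse_url, hentry.parse_html).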

The result is a dict with the following keys:

  1. title
  2. content
  3. author
  4. pubdate
  5. tags
  6. categories
  7. image
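The description does not say whether every key is always present or what type each value has, so code that consumes the result is safer reading it defensively. The sketch below uses a hand-written dict as a stand-in for a parsed result; the function summarize, the fallback strings, and the assumption that tags is a list of strings are all illustrative.

```python
def summarize(entry):
    """Build a one-line summary from a parsed-entry dict, tolerating missing keys."""
    title = entry.get("title") or "(untitled)"
    author = entry.get("author") or "unknown author"
    pubdate = entry.get("pubdate") or "undated"
    tags = entry.get("tags") or []  # assumption: a list of strings
    line = f"{title} by {author} ({pubdate})"
    if tags:
        line += " [" + ", ".join(tags) + "]"
    return line


# Stand-in for a parsed result; the values are taken from the sample markup.
example = {
    "title": "Article title",
    "author": "Author Name",
    "pubdate": "2014-11-06T20:00:00Z",
    "tags": ["tag1", "tag2"],
}
print(summarize(example))
```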

Project details


Version: 0.1

Source Distribution

hentry-0.1.tar.gz (3.5 kB)

File hashes

Hashes for hentry-0.1.tar.gz:

Algorithm     Hash digest
SHA256        0d433180f01b66966f556d8f01aef38fb2655c59f7462f29ee103043eea6b43a
MD5           4d44f9a1745e5851ec6e84d20e49fd02
BLAKE2b-256   afb96ab464dd1abf3b511598ac5bd0d28254c23bf2ec07a58aba0600e689f2e0
