Scraping micro-framework based on pyquery.
Project description
PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x.
Documentation: http://demiurge.readthedocs.org
Installing demiurge
$ pip install demiurge
Quick start
Define items to be scraped using a declarative (Django-inspired) syntax:
>>> import demiurge >>> class TorrentDetails(demiurge.Item): ... label = demiurge.TextField(selector='strong') ... value = demiurge.TextField() ... def clean_value(self, value): ... unlabel = value[value.find(':') + 1:] ... return unlabel.strip() ... class Meta: ... selector = 'div#specifications p' ... >>> class Torrent(demiurge.Item): ... url = demiurge.AttributeValueField( ... selector='td:eq(2) a:eq(1)', attr='href') ... name = demiurge.TextField(selector='td:eq(2) a:eq(2)') ... size = demiurge.TextField(selector='td:eq(3)') ... details = demiurge.RelatedItem( ... TorrentDetails, selector='td:eq(2) a:eq(2)', attr='href') ... class Meta: ... selector = 'table.maintable:gt(0) tr:gt(0)' ... base_url = 'http://www.mininova.org' ... >>> >>> t = Torrent.one('/search/ubuntu/seeds') >>> t.name 'Ubuntu 7.10 Desktop Live CD' >>> t.size u'695.81\xa0MB' >>> t.url '/get/1053846' >>> t.html u'<td>19\xa0Dec\xa007</td><td><a href="/cat/7">Software</a></td><td>...' >>> results = Torrent.all('/search/ubuntu/seeds') >>> len(results) 116 >>> for t in results[:3]: ... print t.name, t.size ... Ubuntu 7.10 Desktop Live CD 695.81 MB Super Ubuntu 2008.09 - VMware image 871.95 MB Portable Ubuntu 9.10 for Windows 559.78 MB ... >>> t = Torrent.one('/search/ubuntu/seeds') >>> for detail in t.details: ... print detail.label, detail.value ... Category: Software > GNU/Linux Total size: 695.81 megabyte Added: 2467 days ago by Distribution Share ratio: 17 seeds, 2 leechers Last updated: 35 minutes ago Downloads: 29,085
See documentation for details: http://demiurge.readthedocs.org
Why demiurge?
Plato, as the speaker Timaeus, refers to the Demiurge frequently in the Socratic dialogue Timaeus, c. 360 BC. The main character refers to the Demiurge as the entity who “fashioned and shaped” the material world. Timaeus describes the Demiurge as unreservedly benevolent, and hence desirous of a world as good as possible. The world remains imperfect, however, because the Demiurge created the world out of a chaotic, indeterminate non-being.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file demiurge-0.2.tar.gz
.
File metadata
- Download URL: demiurge-0.2.tar.gz
- Upload date:
- Size: 5.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | be04a5a2b06a44d3ae3c94aa24a92625b2544147bf46227da9114a55936fa760 |
|
MD5 | 2cdbfcd8bf629f2158847c4a4df7351c |
|
BLAKE2b-256 | 568a495fa2401c757c2bf336fa9d05c3e0c765fb2c9b9d8f1067e8e7506b429d |