
Local ScraperWiki Python Library
================================
This library aims to be a drop-in replacement for the Python `scraperwiki` library
for use locally. That is, functions will work the same way, and data will go
into a local SQLite database; a targeted bombing of ScraperWiki's servers
will not stop this local library from working, unless you happen to be running
it on one of ScraperWiki's servers.
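
For instance, a minimal scraper like this sketch should behave the same in both environments (the `id` unique key and the default `swdata` table follow the standard library's conventions):

```python
import scraperwiki

# Save a row; the unique key makes repeat runs upsert rather than duplicate.
scraperwiki.sqlite.save(unique_keys=['id'], data={'id': 1, 'name': 'Towel Day'})

# Read it back from the default table, "swdata".
print(scraperwiki.sqlite.select('* from swdata where id = 1'))
# [{'id': 1, 'name': 'Towel Day'}]
```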

## Installing

This package is available on PyPI, so `pip install scraperwiki_local` should work; you can also install straight from the git repository.

## Documentation

Read the standard ScraperWiki Python library's [documentation](https://scraperwiki.com/docs/python/python_help_documentation/),
then look below for some quirks about the local version.

## Quirks

Although the local library aims to be a drop-in replacement,
the two do diverge: the local version sometimes works better,
and not all of the features have been implemented.

## Differences

### Datastore differences

The local `scraperwiki.sqlite` is powered by
[DumpTruck](http://dumptruck.io), so some things
work a bit differently.

Data is stored in a local SQLite database named `scraperwiki.sqlite`.

Bizarre table and column names are supported.
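
For instance, this sketch (the names are purely illustrative) saves into a table whose name contains spaces:

```python
import scraperwiki

# DumpTruck quotes identifiers for you, so even these names work.
scraperwiki.sqlite.save(
    unique_keys=[],
    data={'column with spaces': 42},
    table_name='table with spaces',
)
```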

`scraperwiki.sqlite.execute` returns an empty list of keys when a
select statement matches no rows.
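
A sketch, assuming the standard library's `{'keys': ..., 'data': ...}` return shape:

```python
import scraperwiki

scraperwiki.sqlite.execute('CREATE TABLE IF NOT EXISTS nothing_here (a, b)')
result = scraperwiki.sqlite.execute('SELECT * FROM nothing_here')
print(result)  # {'keys': [], 'data': []} -- no rows, so no keys either
```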

`scraperwiki.sqlite.attach` downloads the whole datastore from ScraperWiki,
so you might not want to use it too often on large databases.
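
A sketch with a hypothetical scraper name; once attached, its tables are queried here under that name, matching the hosted library's behaviour:

```python
import scraperwiki

# 'some_other_scraper' is a placeholder for a real scraper's short name.
scraperwiki.sqlite.attach('some_other_scraper')

print(scraperwiki.sqlite.select('* from some_other_scraper.swdata limit 10'))
```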

## Status of implementation
In general, features that have not been implemented raise a `NotImplementedError`.

### Datastore
`scraperwiki.sqlite` is missing the following features.

* All of the `verbose` keyword arguments (these control what is printed in the ScraperWiki code editor); passing one raises `NotImplementedError`, as sketched below
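
A sketch of what to expect locally:

```python
import scraperwiki

try:
    scraperwiki.sqlite.save(['id'], {'id': 1}, verbose=0)
except NotImplementedError:
    # There is no code editor locally, so verbose has nothing to control.
    print('verbose is not supported by the local library')
```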

### Geo
The UK geocoding helpers (`scraperwiki.geo`) documented on scraperwiki.com have been implemented. They partially depend on scraperwiki.com being available.

<!-- They have also been released as a separate library (ukgeo). Not true yet. -->
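
A sketch; the helper name `gb_postcode_to_latlng` is assumed here from the scraperwiki.com geo documentation, and the lookup needs scraperwiki.com to be reachable:

```python
import scraperwiki

# Assumed helper: resolve a UK postcode to a (lat, lng) pair.
lat, lng = scraperwiki.geo.gb_postcode_to_latlng('SW1A 1AA')
print(lat, lng)
```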

### Utils
`scraperwiki.utils` is implemented, as well as the following functions; a short usage sketch follows the list.

* `scraperwiki.log`
* `scraperwiki.scrape`
* `scraperwiki.pdftoxml`
* `scraperwiki.swimport`
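
A quick sketch of a few of these (the URL is a placeholder, and the commented-out calls are left as hints because they need external inputs):

```python
import scraperwiki

# Fetch a URL and return its body as a string.
html = scraperwiki.scrape('http://example.org/')

# Record a message in the run log.
scraperwiki.log('fetched %d bytes' % len(html))

# Convert raw PDF bytes to an XML string for parsing:
# xml = scraperwiki.pdftoxml(open('some.pdf', 'rb').read())

# Import another (hypothetical) scraper's code as a module:
# other = scraperwiki.swimport('some_scraper_name')
```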

### Deprecated
These submodules are deprecated and thus will not be implemented.

* `scraperwiki.apiwrapper`
* `scraperwiki.datastore`
* `scraperwiki.jsqlite`
* `scraperwiki.metadata`
* `scraperwiki.newsql`

### Development

Run tests with `./runtests`; this small wrapper cleans up after itself.

### Specs
Here are some ScraperWiki scrapers that demonstrate the non-local library's quirks.

* https://scraperwiki.com/scrapers/scraperwiki_local/
* https://scraperwiki.com/scrapers/cast/
* https://scraperwiki.com/scrapers/things_happen_when_you_do_not_commit/
* https://scraperwiki.com/scrapers/what_does_show_tables_return/
* https://scraperwiki.com/scrapers/on_conflict/
* https://scraperwiki.com/scrapers/spaces_in_table_names/
* https://scraperwiki.com/scrapers/spaces_in_table_names_1/

Download files

Source distribution: scraperwiki_local-0.0.6.tar.gz (14.8 kB)

Hashes for scraperwiki_local-0.0.6.tar.gz:

| Algorithm   | Hash digest |
| ----------- | ----------- |
| SHA256      | 1a8c721d589f1c61f52e36f8658c9b5895274143279b35592f225de0d9ad673a |
| MD5         | eafa859b71078759c46f90740c29c4ce |
| BLAKE2b-256 | b5eef24ca44efb7e366b5029d26cbf4f51260ea317919adeb0036177a6a65498 |

