Project description

scrapelib is a library for making requests to less-than-reliable websites.

Source: https://github.com/jamesturk/scrapelib

Documentation: https://jamesturk.github.io/scrapelib/

Issues: https://github.com/jamesturk/scrapelib/issues

Features

scrapelib originated as part of the Open States project to scrape the websites of all 50 state legislatures, and was therefore designed with features desirable when dealing with sites that have intermittent errors or require rate-limiting.

Advantages of using scrapelib over using requests as-is:

  • HTTP(S) and FTP requests via an identical API
  • support for simple caching with pluggable cache backends
  • highly configurable request throttling
  • configurable retries for non-permanent site failures
  • all of the power of the superb requests library
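
To make the retry feature concrete, here is a minimal standard-library sketch of retrying a call that fails intermittently. It is not scrapelib's implementation: the function name `fetch_with_retries`, its parameters, the choice of `ConnectionError` as the "non-permanent" failure, and the doubling wait between attempts are all assumptions made for illustration.

```python
import time


def fetch_with_retries(fetch, attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call fetch() and retry up to `attempts` times on ConnectionError.

    Hypothetical helper for illustration only; the wait doubles after
    each failed attempt (an assumption of this sketch).
    """
    for attempt in range(attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts:
                # Out of retries: let the caller see the failure.
                raise
            sleep(wait_seconds * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays. Any other exception is treated as permanent and propagates immediately.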

Installation

scrapelib is on PyPI, and can be installed via any standard package management tool:

poetry add scrapelib

or:

pip install scrapelib

Example Usage

  import scrapelib
  s = scrapelib.Scraper(requests_per_minute=10)

  # Grab Google front page
  s.get('http://google.com')

  # Will be throttled to 10 HTTP requests per minute
  while True:
      s.get('http://example.com')
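
The throttling in the example above can be pictured with a small standard-library sketch. This is one plausible way to limit calls to a fixed number per minute, not necessarily how scrapelib does it internally; the `Throttle` class and its `clock` and `sleep` parameters are hypothetical names introduced here.

```python
import time


class Throttle:
    """Space calls evenly so that at most `per_minute` happen each minute.

    Hypothetical helper for illustration; not part of scrapelib.
    """

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # minimum seconds between calls
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until the next call is allowed, then record its time."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now
```

With `per_minute=10`, calling `wait()` before each request leaves at least six seconds between consecutive requests, which matches the rate in the example above.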

Download files

Source Distribution

scrapelib-2.2.0.tar.gz (15.3 kB)

Uploaded Source

Built Distribution

scrapelib-2.2.0-py3-none-any.whl (16.9 kB)

Uploaded Python 3

File details

Details for the file scrapelib-2.2.0.tar.gz.

File metadata

  • Download URL: scrapelib-2.2.0.tar.gz
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.1 CPython/3.11.2 Darwin/21.6.0

File hashes

Hashes for scrapelib-2.2.0.tar.gz
Algorithm Hash digest
SHA256 2f71c6e46c1ba0e110973671a5fa4cded0551101526319108c51f86eed75718c
MD5 c5dfd61102e9d629c1dfe6ae212a394d
BLAKE2b-256 b2ee12fb6996491f5ec17bad4e1455ce688ad0ee230fb63c12f00d4fc955a4f5

File details

Details for the file scrapelib-2.2.0-py3-none-any.whl.

File metadata

  • Download URL: scrapelib-2.2.0-py3-none-any.whl
  • Size: 16.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.1 CPython/3.11.2 Darwin/21.6.0

File hashes

Hashes for scrapelib-2.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ae663b620ddc568736ac09a4162cf61921fe7c7d1d00e44e7bc2d0d98b3551f9
MD5 891bb9634ab01943e2bc926d36562491
BLAKE2b-256 1dd3210ae7068ebb9f7a7c7cc5cecb2bc741d50a6ffb4815998ad475faf8f1ba
