Scrapy Downloader Middleware that helps to integrate Scrapy with Nimble Web API.

Scrapy Nimble Middleware

scrapy-nimble is a Scrapy Downloader Middleware that integrates Scrapy with the Nimble Web API.

Install

You can install scrapy-nimble as a regular Python package from PyPI using:

pip install scrapy-nimble

Configuration

  1. If you don't have one yet, open an account with Nimble.

  2. Provide your credentials and enable the middleware through Scrapy settings.

    # settings.py
    NIMBLE_ENABLED = True
    
    NIMBLE_USERNAME = "username"
    NIMBLE_PASSWORD = "password"
    
  3. Add the downloader middleware to your DOWNLOADER_MIDDLEWARES Scrapy setting.

    # settings.py
    DOWNLOADER_MIDDLEWARES = {
        "scrapy_nimble.middlewares.NimbleWebApiMiddleware": 570,
    }
    

    If you have scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware enabled (it is enabled by default in the DOWNLOADER_MIDDLEWARES_BASE setting with an order of 590), give the scrapy-nimble middleware a lower order value, such as the 570 used above, so that it is placed before it.
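The three configuration steps can be combined into a single settings file. The following is a minimal sketch, not the project's reference configuration: reading the credentials from environment variables and the lowercase helper names `nimble_order` and `http_compression_order` are assumptions added for illustration. Only the NIMBLE_* settings, the middleware path, and the 570/590 order values come from the steps above.

```python
# settings.py (sketch combining steps 2 and 3)
import os

NIMBLE_ENABLED = True

# Assumption: credentials are exported as environment variables with these
# names; the fallbacks are the placeholder values used in step 2.
NIMBLE_USERNAME = os.environ.get("NIMBLE_USERNAME", "username")
NIMBLE_PASSWORD = os.environ.get("NIMBLE_PASSWORD", "password")

# Hypothetical helper names, lowercase so Scrapy does not load them as settings.
nimble_order = 570
http_compression_order = 590  # default order of HttpCompressionMiddleware

DOWNLOADER_MIDDLEWARES = {
    "scrapy_nimble.middlewares.NimbleWebApiMiddleware": nimble_order,
}

# Fail fast if the ordering requirement described above is violated.
assert nimble_order < http_compression_order
```

Keeping the two order values side by side makes the ordering requirement explicit if either number is changed later.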

Basic Usage

Once the downloader middleware is properly configured, every request goes through Nimble's Web API. There is no need to change anything in your spider's code.

Real-time URL request

scrapy-nimble uses the Nimble Web API with Real-time URL requests. In addition to the default GET request for a specific URL, this API provides extra options that let you, among other things, execute geolocated requests and render dynamic content.

The following request options are currently supported. Check the documentation for usage and the valid values for each one. If an option is not given, the Web API's default value is used.

  • method
  • country
  • locale
  • headers
  • cookies
  • render
  • render_options

Add the options you want to use to the meta key of your request, prefixing each option name with nimble_, for example:

# Inside your spider
yield scrapy.Request(
    "https://nimbleway.com",
    meta={
        "nimble_country": "DE",
        "nimble_locale": "uk",
        "nimble_render": True,
    },
)
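If many requests share the same options, the prefixing rule can be wrapped in a small helper. The sketch below is an illustration only: `nimble_meta` and `NIMBLE_OPTIONS` are hypothetical names that are not part of scrapy-nimble, and the set of accepted names simply mirrors the option list above.

```python
# Option names copied from the list above; the helper itself is hypothetical.
NIMBLE_OPTIONS = frozenset(
    {"method", "country", "locale", "headers", "cookies", "render", "render_options"}
)


def nimble_meta(**options):
    """Build a Request.meta dict, prefixing each option name with 'nimble_'."""
    unknown = set(options) - NIMBLE_OPTIONS
    if unknown:
        # Catch typos early instead of sending a meta key that is never read.
        raise ValueError(f"Unsupported Nimble options: {sorted(unknown)}")
    return {f"nimble_{name}": value for name, value in options.items()}


meta = nimble_meta(country="DE", locale="uk", render=True)
```

Inside a spider, the result can be passed directly, for example `scrapy.Request("https://nimbleway.com", meta=nimble_meta(country="DE", render=True))`.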

Development

We suggest using pyenv to manage your Python version and to create an isolated environment where you can develop safely. After installing it, prepare the environment with the following commands:

$ pyenv virtualenv 3.11.6 myvenv
$ pyenv activate myvenv
$ python -m pip install -e .

To keep code formatting consistent and run linter checks, we use pre-commit hooks. Install the pre-commit package, then install the project hooks using:

$ pre-commit install

Now you are ready to start development.
