Page Object pattern for Scrapy

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. It lets you write spiders in which the extraction logic is separated from the crawling logic. With scrapy-poet, a single spider can support many sites with different layouts.
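To make the separation concrete, here is a minimal, library-free sketch of the Page Object idea. It deliberately does not use the scrapy-poet or web-poet API: the class names, the `PAGE_OBJECTS` registry, the `parse` function and the example domains are all hypothetical, chosen only to show how one crawling callback can serve several layouts.

```python
import re
from urllib.parse import urlparse


class BookPage:
    """Hypothetical page object: owns the extraction logic for one layout."""

    def __init__(self, url: str, html: str):
        self.url = url
        self.html = html

    def to_item(self) -> dict:
        # Assumed layout A: the book name lives in an <h1> element.
        match = re.search(r"<h1>(.*?)</h1>", self.html, re.S)
        name = match.group(1).strip() if match else None
        return {"url": self.url, "name": name}


class OtherShopBookPage(BookPage):
    """Hypothetical page object for a second site with a different layout."""

    def to_item(self) -> dict:
        # Assumed layout B: the book name lives in <span class="title">.
        match = re.search(r'<span class="title">(.*?)</span>', self.html, re.S)
        name = match.group(1).strip() if match else None
        return {"url": self.url, "name": name}


# Illustrative registry mapping a domain to the page object that understands it.
PAGE_OBJECTS = {
    "books.example": BookPage,
    "shop.example": OtherShopBookPage,
}


def parse(url: str, html: str) -> dict:
    """Crawling-side callback: it picks a page object and never touches the HTML."""
    page_cls = PAGE_OBJECTS[urlparse(url).netloc]
    return page_cls(url, html).to_item()
```

Supporting another site in this sketch means adding one page object class and one registry entry; `parse` stays unchanged. See the scrapy-poet documentation for how the library itself wires page objects into spider callbacks.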

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.7+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also directly invoke pre-commit run --all-files or tox -e linters to run them without performing a commit.
