Page Object pattern for Scrapy

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. It lets you write spiders in which extraction logic is kept separate from crawling logic. With scrapy-poet, a single spider can support many sites with different layouts.
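To illustrate the idea of separating extraction from crawling, here is a minimal, dependency-free sketch of the Page Object pattern. It deliberately does not use scrapy-poet's or web-poet's actual API: the class names, the PAGE_OBJECTS registry, the parse_product function, and the example domains are all hypothetical and exist only for illustration.

```python
import re
from urllib.parse import urlparse


class ShopAProductPage:
    """Extraction logic for a hypothetical site whose title lives in <h1>."""

    def __init__(self, url, html):
        self.url = url
        self.html = html

    def to_item(self):
        match = re.search(r"<h1>(.*?)</h1>", self.html, re.S)
        return {"url": self.url, "name": match.group(1).strip() if match else None}


class ShopBProductPage:
    """Same item shape, different layout: the title is in <span class="title">."""

    def __init__(self, url, html):
        self.url = url
        self.html = html

    def to_item(self):
        match = re.search(r'<span class="title">(.*?)</span>', self.html, re.S)
        return {"url": self.url, "name": match.group(1).strip() if match else None}


# Hypothetical registry: maps a domain to the page object that understands it.
PAGE_OBJECTS = {
    "shop-a.example": ShopAProductPage,
    "shop-b.example": ShopBProductPage,
}


def parse_product(url, html):
    """Crawling-side callback: it only asks a page object for an item."""
    page_cls = PAGE_OBJECTS[urlparse(url).netloc]
    return page_cls(url, html).to_item()


items = [
    parse_product("https://shop-a.example/p/1", "<html><h1> Blue Mug </h1></html>"),
    parse_product("https://shop-b.example/p/9", '<html><span class="title">Red Mug</span></html>'),
]
```

The crawling code never touches a selector, so supporting a new layout means adding one page object class rather than changing the callback. scrapy-poet's role, per the description above, is to provide this pattern for real Scrapy spiders.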

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.8+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also invoke pre-commit run --all-files or tox -e linters directly to run them without making a commit.

Download files

Source Distribution

scrapy-poet-0.15.1.tar.gz (49.4 kB)

Uploaded Source

Built Distribution

scrapy_poet-0.15.1-py3-none-any.whl (26.8 kB)

Uploaded Python 3

File details

Details for the file scrapy-poet-0.15.1.tar.gz.

File metadata

  • Download URL: scrapy-poet-0.15.1.tar.gz
  • Size: 49.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for scrapy-poet-0.15.1.tar.gz
Algorithm Hash digest
SHA256 9a090e91e940c5e027db45d63cfe569a6a09fe25d6b964658e3c5a5e666317f0
MD5 f9d96c8b9fb3cd2694b594694fbf1d90
BLAKE2b-256 238b5c7e5be9ebf374e1aae02221f01d7206fbda38e495773d1513d6013f7b9c

File details

Details for the file scrapy_poet-0.15.1-py3-none-any.whl.

File metadata

  • Download URL: scrapy_poet-0.15.1-py3-none-any.whl
  • Size: 26.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for scrapy_poet-0.15.1-py3-none-any.whl
Algorithm Hash digest
SHA256 198efaec9a35a2af3644b1b89670a43fda6f1e63eff6e43a9670cdc3f57e9d06
MD5 a971bb2119e841405d07631f9157a50e
BLAKE2b-256 3030b6984c7046f5a65209c4585307a01e6df69aed56f3751a1057063e659583
