Page Object pattern for Scrapy

Project description

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. It lets you write spiders in which the extraction logic is separated from the crawling logic. With scrapy-poet, a single spider can support many sites with different layouts.
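
To illustrate the separation described above, here is a minimal, library-free sketch of the Page Object idea. It does not use the scrapy-poet or web-poet API: the class names, the PAGE_OBJECTS registry, the parse_product function and the example domains are all hypothetical, chosen only to show how one crawling callback could serve sites with different layouts.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class ShopAProductPage:
    """Hypothetical page object: this site keeps the product name in <h1>."""
    url: str
    html: str

    def to_item(self) -> dict:
        match = re.search(r"<h1>(.*?)</h1>", self.html)
        return {"url": self.url, "name": match.group(1) if match else None}


@dataclass
class ShopBProductPage:
    """Hypothetical page object: this site uses <span class="title">."""
    url: str
    html: str

    def to_item(self) -> dict:
        match = re.search(r'<span class="title">(.*?)</span>', self.html)
        return {"url": self.url, "name": match.group(1) if match else None}


# Assumed registry mapping a domain to the page object that understands it.
PAGE_OBJECTS = {
    "shop-a.example": ShopAProductPage,
    "shop-b.example": ShopBProductPage,
}


def parse_product(url: str, html: str) -> dict:
    """Crawling-side callback: picks a page object by domain, knows no selectors."""
    page_cls = PAGE_OBJECTS[urlparse(url).netloc]
    return page_cls(url=url, html=html).to_item()
```

The callback never changes when a new layout is added; only a new page object and a registry entry are needed. In scrapy-poet the library takes care of building page objects and passing them to callbacks; see the documentation for the actual API.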

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.7+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also directly invoke pre-commit run --all-files or tox -e linters to run them without performing a commit.

Download files

Source Distribution

scrapy-poet-0.7.0.tar.gz (36.7 kB)

Uploaded Source

Built Distribution

scrapy_poet-0.7.0-py3-none-any.whl (21.6 kB)

Uploaded Python 3

File details

Details for the file scrapy-poet-0.7.0.tar.gz.

File metadata

  • Download URL: scrapy-poet-0.7.0.tar.gz
  • Upload date:
  • Size: 36.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for scrapy-poet-0.7.0.tar.gz
Algorithm Hash digest
SHA256 065495ac684ea31473369aedf0f9683e02d1992d1ae1881a2c6f0d6e8acb4fb8
MD5 683e058561e61f5ed4482dfc98dc2e4e
BLAKE2b-256 b4839c874ab4c663ea45f35bfde2a5724f00b67454646d9359210c0eb9e22242
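
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard hashlib module; the helper names sha256_of and matches_published_hash are hypothetical, and the local file path is whatever location you saved the source distribution to.

```python
import hashlib

# SHA256 digest published above for scrapy-poet-0.7.0.tar.gz.
EXPECTED_SHA256 = "065495ac684ea31473369aedf0f9683e02d1992d1ae1881a2c6f0d6e8acb4fb8"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_hash("scrapy-poet-0.7.0.tar.gz") should return True for an intact copy of the source distribution saved in the current directory.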

File details

Details for the file scrapy_poet-0.7.0-py3-none-any.whl.

File metadata

  • Download URL: scrapy_poet-0.7.0-py3-none-any.whl
  • Upload date:
  • Size: 21.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for scrapy_poet-0.7.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5bd5620f214fc0f6d174bb0cba35a3e0a53a7a6f746330fd0a8b8182be5c69d9
MD5 bfe08d09eac2b6a27029f58b015e5318
BLAKE2b-256 a6696b121328ac15a49262d0848b8b54ce8f7fb4dffad1a7f886ab6353a171ad
