Page Object pattern for Scrapy

Project description

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. scrapy-poet lets you write spiders in which the extraction logic is separated from the crawling logic. With scrapy-poet, it is possible to write a single spider that supports many sites with different layouts.
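
The separation described above can be illustrated without any library. The following is a minimal, standard-library-only sketch of the Page Object idea, not scrapy-poet's or web-poet's actual API: the names Page, TitlePage, HeadingPage, PAGE_OBJECTS, parse and first_text, as well as the a.example and b.example domains, are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urlparse


class _TagText(HTMLParser):
    """Collect the text of the first occurrence of a given tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self._inside = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching tag is captured.
        if tag == self.tag and self.text is None:
            self._inside = True
            self.text = ""

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.text += data


def first_text(html, tag):
    """Return the stripped text of the first <tag> element, or None."""
    parser = _TagText(tag)
    parser.feed(html)
    return parser.text.strip() if parser.text is not None else None


@dataclass
class Page:
    """Hypothetical base page object: holds a response, knows how to extract."""
    url: str
    html: str

    def to_item(self):
        raise NotImplementedError


class TitlePage(Page):
    """Layout A (assumed): the item name lives in <title>."""

    def to_item(self):
        return {"url": self.url, "name": first_text(self.html, "title")}


class HeadingPage(Page):
    """Layout B (assumed): the item name lives in the first <h1>."""

    def to_item(self):
        return {"url": self.url, "name": first_text(self.html, "h1")}


# Hypothetical mapping from site to the page object that understands its layout.
PAGE_OBJECTS = {"a.example": TitlePage, "b.example": HeadingPage}


def parse(url, html):
    """Crawling side: picks a page object by domain and contains no extraction logic."""
    page_cls = PAGE_OBJECTS[urlparse(url).hostname]
    return page_cls(url=url, html=html).to_item()
```

In this sketch, supporting another site means adding one page object class and one mapping entry, while parse stays unchanged. scrapy-poet offers this kind of separation inside Scrapy; see its documentation for the real classes and callback signatures.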

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.7+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now, every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also invoke pre-commit run --all-files or tox -e linters directly to run them without performing a commit.

Download files

Source Distribution

scrapy-poet-0.6.0.tar.gz (33.7 kB)

Uploaded Source

Built Distribution

scrapy_poet-0.6.0-py3-none-any.whl (20.7 kB)

Uploaded Python 3

File details

Details for the file scrapy-poet-0.6.0.tar.gz.

File metadata

  • Download URL: scrapy-poet-0.6.0.tar.gz
  • Size: 33.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.0

File hashes

Hashes for scrapy-poet-0.6.0.tar.gz
Algorithm Hash digest
SHA256 f9b21e7e00423b0e275572840d180bcd0fb744e03a580873c1f35f8bc9d427d4
MD5 23ef6e1cdf48e7368446131420a5da79
BLAKE2b-256 ca32cc1a4f19cb5468e9ce8f0eed4b5f657965955a4940b6d0015d1f1bd1a58b

File details

Details for the file scrapy_poet-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: scrapy_poet-0.6.0-py3-none-any.whl
  • Size: 20.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.0

File hashes

Hashes for scrapy_poet-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 64555204026e883bdd05f0958b8ab0369e05108bb8d8ee2bf3ac6133f3449aee
MD5 475a6ab7fc87b790303c0f1521be34a0
BLAKE2b-256 92df01caef9023e408f3a036989b577d5bcf0ec64333b732bc4ff13bf255d7e1
