
Scrapy spider middleware to split an item into multiple items on a multi-valued key



SplitVariantsMiddleware is a Scrapy spider middleware used to split single items into multiple items when they have a “variants” key with multiple values.

Example usage

Let’s assume your spider outputs an item with different size options (from an ecommerce website for example):

item = {"id": 12,
        "name": "Big chair",
        "variants": [{"size": "XL", "price": 200, "currency": "USD"},
                     {"size": "L", "price": 100, "currency": "USD"}]}

When SplitVariantsMiddleware is enabled, this single item becomes two items, one per variant, each combining the item's shared fields with that variant's values:

{"id": 12, "name": "Big chair", "size": "XL", "price": 200, "currency": "USD"}
{"id": 12, "name": "Big chair", "size": "L", "price": 100, "currency": "USD"}

Installation

Install scrapy-splitvariants using pip:

$ pip install scrapy-splitvariants

Configuration

  1. Add SplitVariantsMiddleware by including it in SPIDER_MIDDLEWARES in your settings.py file:

    SPIDER_MIDDLEWARES = {
        'scrapy_splitvariants.SplitVariantsMiddleware': 100,
    }

    Here, priority 100 is just an example. Set its value depending on other middlewares you may have enabled already.

  2. Enable the middleware by setting SPLITVARIANTS_ENABLED to True in your settings.py.
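
Taken together, the two steps above might look like this in settings.py. The priority value is the same placeholder used in step 1, so adjust it to fit any other spider middlewares you have enabled.

```python
# settings.py (fragment)

# Step 1: register the middleware; 100 is only an example priority.
SPIDER_MIDDLEWARES = {
    'scrapy_splitvariants.SplitVariantsMiddleware': 100,
}

# Step 2: switch the middleware on.
SPLITVARIANTS_ENABLED = True
```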


