Common interface for Scrapy items

itemadapter

The ItemAdapter class is a wrapper for data container objects, providing a common interface to handle objects of different types in a uniform manner, regardless of their underlying implementation.

This package started as an initiative to support dataclass objects as items [1]. It was extracted into a standalone package so that it can be used independently.

Currently supported types are:

  • scrapy.item.Item
  • dict
  • dataclass-based classes
  • attrs-based classes

Requirements

  • Python 3.5+
  • dataclasses (stdlib in Python 3.7+, or its backport in Python 3.6): optional, needed to interact with dataclass-based items
  • attrs: optional, needed to interact with attrs-based items

API

ItemAdapter class

class itemadapter.ItemAdapter(item: Any)

ItemAdapter implements the MutableMapping interface, providing a dict-like API to manipulate data for the object it wraps (which is modified in-place).
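To illustrate what a MutableMapping wrapper that modifies its item in-place looks like, here is a minimal sketch using only the standard library. DataclassAdapter is a hypothetical, simplified stand-in that handles dataclass items only; it is not the ItemAdapter implementation.

```python
from collections.abc import MutableMapping
from dataclasses import dataclass, fields


class DataclassAdapter(MutableMapping):
    """Hypothetical stand-in for ItemAdapter; supports dataclass items only."""

    def __init__(self, item):
        self.item = item
        self._names = [f.name for f in fields(item)]

    def __getitem__(self, name):
        if name not in self._names or not hasattr(self.item, name):
            raise KeyError(name)
        return getattr(self.item, name)

    def __setitem__(self, name, value):
        if name not in self._names:
            raise KeyError(name)
        # Write through to the wrapped object, so it is modified in-place.
        setattr(self.item, name, value)

    def __delitem__(self, name):
        if name not in self._names or not hasattr(self.item, name):
            raise KeyError(name)
        delattr(self.item, name)

    def __iter__(self):
        return (n for n in self._names if hasattr(self.item, n))

    def __len__(self):
        return sum(1 for _ in self)


@dataclass
class InventoryItem:
    name: str
    price: int


item = InventoryItem(name="foo", price=10)
adapter = DataclassAdapter(item)
adapter["price"] = 5          # __setitem__
adapter.update(name="bar")    # update() comes for free from MutableMapping
```

Only the five abstract methods are written by hand; methods such as update(), get(), keys() and items() are supplied by the MutableMapping base class, which is what makes the wrapper usable wherever a dict-like object is expected.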

Two additional methods are available:

get_field_meta(field_name: str) -> MappingProxyType

Return a MappingProxyType object with metadata about the given field, or raise TypeError if the item class does not support field metadata.

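The returned value is taken from the following sources, depending on the item type:

  • the Field definition, for scrapy.item.Item objects
  • the metadata argument of dataclasses.field, for dataclass objects
  • the metadata argument of attr.ib, for attrs objects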

field_names() -> List[str]

Return a list with the names of all the defined fields for the item.
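For dataclass items, both methods map naturally onto the standard library's dataclasses.fields function. The sketch below shows one way that lookup could work; dataclass_field_meta and dataclass_field_names are hypothetical helper names, and raising KeyError for an unknown field is a choice made for this sketch, not documented ItemAdapter behaviour.

```python
from dataclasses import dataclass, field, fields
from types import MappingProxyType


def dataclass_field_meta(item, field_name: str) -> MappingProxyType:
    """Hypothetical helper: return the metadata of one dataclass field."""
    for f in fields(item):
        if f.name == field_name:
            # dataclasses already stores metadata as a read-only mappingproxy.
            return f.metadata
    raise KeyError(field_name)


def dataclass_field_names(item) -> list:
    """Hypothetical helper: list the names of all defined fields."""
    return [f.name for f in fields(item)]


@dataclass
class InventoryItem:
    name: str = field(metadata={"serializer": str})
    value: int = field(metadata={"serializer": int, "limit": 100})


item = InventoryItem(name="foo", value=10)
meta = dataclass_field_meta(item, "value")
names = dataclass_field_names(item)
```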

is_item function

itemadapter.is_item(obj: Any) -> bool

Return True if the given object belongs to one of the supported types, False otherwise.
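A check of this kind can be sketched with the standard library alone. The hypothetical looks_like_item function below covers only dict and dataclass instances; the real is_item also recognises scrapy.item.Item and attrs objects, which are left out to keep the sketch free of third-party imports.

```python
from dataclasses import dataclass, is_dataclass


def looks_like_item(obj) -> bool:
    """Hypothetical, reduced version of an item check (dict and dataclass only)."""
    if isinstance(obj, dict):
        return True
    # is_dataclass() is also true for the dataclass type itself,
    # so classes are excluded and only instances count as items.
    return bool(is_dataclass(obj)) and not isinstance(obj, type)


@dataclass
class InventoryItem:
    name: str
    price: int


checks = [
    looks_like_item({"name": "foo"}),
    looks_like_item(InventoryItem(name="foo", price=10)),
    looks_like_item(InventoryItem),
    looks_like_item(["foo", 10]),
]
```

The third check is the subtle one: a dataclass type is not an item, only its instances are.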

Metadata support

scrapy.item.Item, dataclass and attrs objects allow the inclusion of arbitrary field metadata, which can be retrieved with the ItemAdapter.get_field_meta method. How the metadata is defined depends on the underlying type.

scrapy.item.Item objects

>>> from scrapy.item import Item, Field
>>> from itemadapter import ItemAdapter
>>> class InventoryItem(Item):
...     name = Field(serializer=str)
...     value = Field(serializer=int, limit=100)
...
>>> adapter = ItemAdapter(InventoryItem(name="foo", value=10))
>>> adapter.get_field_meta("name")
mappingproxy({'serializer': <class 'str'>})
>>> adapter.get_field_meta("value")
mappingproxy({'serializer': <class 'int'>, 'limit': 100})

dataclass objects

>>> from dataclasses import dataclass, field
>>> from itemadapter import ItemAdapter
>>> @dataclass
... class InventoryItem:
...     name: str = field(metadata={"serializer": str})
...     value: int = field(metadata={"serializer": int, "limit": 100})
...
>>> adapter = ItemAdapter(InventoryItem(name="foo", value=10))
>>> adapter.get_field_meta("name")
mappingproxy({'serializer': <class 'str'>})
>>> adapter.get_field_meta("value")
mappingproxy({'serializer': <class 'int'>, 'limit': 100})

attrs objects

>>> import attr
>>> from itemadapter import ItemAdapter
>>> @attr.s
... class InventoryItem:
...     name = attr.ib(metadata={"serializer": str})
...     value = attr.ib(metadata={"serializer": int})
...
>>> adapter = ItemAdapter(InventoryItem(name="foo", value=10))
>>> adapter.get_field_meta("name")
mappingproxy({'serializer': <class 'str'>})
>>> adapter.get_field_meta("value")
mappingproxy({'serializer': <class 'int'>})

Other types

In fact, any supported object with a fields attribute whose values are mappings works:

>>> from itemadapter import ItemAdapter
>>> class DictWithFields(dict):
...     fields = {
...         "name": {"serializer": str},
...         "value": {"serializer": int, "limit": 100},
...     }
...
>>> adapter = ItemAdapter(DictWithFields(name="foo", value=10))
>>> adapter.get_field_meta("name")
mappingproxy({'serializer': <class 'str'>})
>>> adapter.get_field_meta("value")
mappingproxy({'serializer': <class 'int'>, 'limit': 100})
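Once retrieved, field metadata is an ordinary read-only mapping, so calling code decides what the keys mean. As a rough sketch of one possible use, the hypothetical serialize function below applies the "serializer" entry from each field's metadata to the field's value. It reads the metadata directly from a dataclass via the standard library, and the interpretation of "serializer" is an assumption of this example, not something the package enforces.

```python
from dataclasses import dataclass, field, fields


def serialize(item) -> dict:
    """Hypothetical consumer: apply each field's 'serializer' metadata, if any."""
    result = {}
    for f in fields(item):
        # Fall back to the identity function when no serializer is declared.
        serializer = f.metadata.get("serializer", lambda value: value)
        result[f.name] = serializer(getattr(item, f.name))
    return result


@dataclass
class InventoryItem:
    name: str = field(metadata={"serializer": str})
    value: int = field(metadata={"serializer": int, "limit": 100})


# Dataclasses do not enforce annotations, so a raw string can arrive here.
raw_item = InventoryItem(name="foo", value="10")
serialized = serialize(raw_item)
```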

Examples

scrapy.item.Item objects

>>> from scrapy.item import Item, Field
>>> from itemadapter import ItemAdapter
>>> class InventoryItem(Item):
...     name = Field()
...     price = Field()
...
>>> item = InventoryItem(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
{'name': 'bar', 'price': 5}

dict

>>> from itemadapter import ItemAdapter
>>> item = dict(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
{'name': 'bar', 'price': 5}

dataclass objects

>>> from dataclasses import dataclass
>>> from itemadapter import ItemAdapter
>>> @dataclass
... class InventoryItem:
...     name: str
...     price: int
...
>>> item = InventoryItem(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
InventoryItem(name='bar', price=5)

attrs objects

>>> import attr
>>> from itemadapter import ItemAdapter
>>> @attr.s
... class InventoryItem:
...     name = attr.ib()
...     price = attr.ib()
...
>>> item = InventoryItem(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
InventoryItem(name='bar', price=5)

[1]: dataclass objects as items: issue and pull request
