Zyte's Page Object pattern for web scraping
Project description
web-poet implements the Page Object pattern for web scraping. It defines a standard for writing web data extraction code, which makes that code portable and reusable.
The license is BSD 3-clause.
Installation
pip install web-poet
It requires Python 3.7+.
Overview
web-poet is a library that defines a standard for writing and organizing web data extraction code.
If web scraping code is written as web-poet Page Objects, it can be reused in different contexts. For example, such code can be developed in an IPython notebook, then tested in isolation, and then plugged into a Scrapy spider or used as part of a custom aiohttp-based web scraping framework.
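To make the idea concrete, here is a minimal sketch of the Page Object pattern using only the Python standard library. It is not the web-poet API: the names `BookPage`, `_TitleParser`, and `to_item` are hypothetical and chosen only for illustration. The point it shows is that the extraction code receives the page content as input and does no downloading itself, so the same class can be called from a notebook, a test, or a crawler.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside the first <h1> element (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <h1> is of interest; later ones are ignored.
        if tag == "h1" and self.title is None:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.title = (self.title or "") + data


@dataclass
class BookPage:
    """Hypothetical Page Object: it is given a URL and HTML, and only extracts."""

    url: str
    html: str

    def to_item(self):
        parser = _TitleParser()
        parser.feed(self.html)
        title = parser.title.strip() if parser.title else None
        return {"url": self.url, "title": title}


# The caller decides how the HTML was obtained (file, HTTP client, fixture).
page = BookPage(
    url="https://example.com/book/1",
    html="<html><body><h1> Dune </h1></body></html>",
)
print(page.to_item())
```

Because `BookPage` depends only on its inputs, a unit test can build it from a saved HTML fixture, while a crawler can build it from a live response. web-poet standardizes this separation; see its documentation for the actual base classes and input types.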
Currently, the following integrations are available:
Scrapy, via scrapy-poet
See Documentation for more.