13 projects
extruct
Extract embedded metadata from HTML markup
scrapyd-client
A client for Scrapyd
scrapyd
A service for running Scrapy spiders, with an HTTP API
cssselect
Parses CSS3 selectors and translates them to XPath 1.0
scrapy-deltafetch
Scrapy middleware to ignore previously crawled pages
scrapy-jsonschema
Scrapy schema validation pipeline and Item builder using JSON Schema
sketchtml
Helper library to experiment with HTML fingerprinting
scrapylib
Scrapy helper functions and processors
scrapy-splitvariants
Scrapy spider middleware to split an item into multiple items on a multi-valued key
scrapy-hcf
Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
scrapy-querycleaner
Scrapy spider middleware to clean up query parameters in request URLs
scrapy-magicfields
Scrapy middleware to add extra "magic" fields to items
parslepy
Parsley extraction library using lxml