18 projects
Scrapy
A high-level web crawling and web scraping framework
scrapyd
A service for running Scrapy spiders, with an HTTP API
w3lib
Library of web-related functions
queuelib
Collection of persistent (disk-based) and non-persistent (memory-based) queues
parsel
Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
shub
Scrapinghub Command Line Client
dateparser
Date parsing library designed to parse dates from HTML pages
scrapy-crawlera
Crawlera middleware for Scrapy
splash
A JavaScript rendering service with an HTTP API
scrapely
A pure-Python HTML screen-scraping library
slybot
Slybot crawler
frontera
A scalable frontier for web crawlers
webstruct
A library for creating statistical NER systems that work on HTML data
hubstorage
Client interface for Scrapinghub HubStorage
scrapylib
Scrapy helper functions and processors
adblockparser
Parser for Adblock Plus rules
scrapy-dotpersistence
Scrapy extension to sync `.scrapy` folder to an S3 bucket
scrapyjs
JavaScript support for Scrapy using Splash