19 projects
Scrapy
A high-level Web Crawling and Web Scraping framework
scrapyd
A service for running Scrapy spiders, with an HTTP API
w3lib
Library of web-related functions
queuelib
Collection of persistent (disk-based) and non-persistent (memory-based) queues
shub
Scrapinghub Command Line Client
dateparser
Date parsing library designed to parse dates from HTML pages
scrapinghub
Client interface for Scrapinghub API
scrapy-crawlera
Crawlera middleware for Scrapy
splash
A JavaScript rendering service with an HTTP API
scrapely
A pure-Python HTML screen-scraping library
slybot
Slybot crawler
webstruct
A library for creating statistical NER systems that work on HTML data
hubstorage
Client interface for Scrapinghub HubStorage
scrapylib
Scrapy helper functions and processors
adblockparser
Parser for Adblock Plus rules
scrapy-dotpersistence
Scrapy extension to sync `.scrapy` folder to an S3 bucket
scrapyjs
JavaScript support for Scrapy using Splash
flatson
Tool to flatten a stream of JSON-like objects, configured via a schema
crawl-frontier
A flexible frontier for web crawlers