
A simple web spider with pluggable recursion strategies

Project description

A simple web spider with several recursion strategies.

It doesn’t do much except follow links and report their status. I mostly use it for quick-and-dirty smoke testing and link checking.

The only unusual feature is the --traversal=pattern option, which does recursive traversal in an unusual order: it tries to recognize patterns in URLs, and follows URLs with novel patterns before those with patterns it has already seen. If you use this for smoke-testing a typical modern web app, it will very quickly hit all your views/controllers at least once… usually.
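To make the idea of "novel patterns first" concrete, here is a minimal sketch of one way such an ordering could work. It is not spydey's actual implementation (read the source for that). The names `url_pattern` and `PatternFrontier` are hypothetical, and the heuristic (digits in the path collapse to `N`, query values are dropped) is only an assumption about what a useful URL pattern might look like.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def url_pattern(url):
    """Reduce a URL to a rough 'shape' (hypothetical heuristic)."""
    parts = urlsplit(url)
    # Collapse runs of digits in each path segment, so /users/1 and
    # /users/42 share the pattern /users/N.
    segments = [re.sub(r"\d+", "N", seg) for seg in parts.path.split("/")]
    # Keep query parameter names, but ignore their values.
    keys = sorted(pair.split("=", 1)[0] for pair in parts.query.split("&") if pair)
    query = "?" + "&".join(keys) if keys else ""
    return parts.netloc + "/".join(segments) + query


class PatternFrontier:
    """Pending URLs, handed out least-visited pattern first."""

    def __init__(self):
        self._pending = []
        self._queued = set()
        self._pattern_counts = Counter()

    def add(self, url):
        # Queue each URL at most once.
        if url not in self._queued:
            self._queued.add(url)
            self._pending.append(url)

    def pop(self):
        # Pick the pending URL whose pattern has been handed out the
        # fewest times; ties go to the URL that was queued earliest.
        best = min(
            range(len(self._pending)),
            key=lambda i: self._pattern_counts[url_pattern(self._pending[i])],
        )
        url = self._pending.pop(best)
        self._pattern_counts[url_pattern(url)] += 1
        return url

    def __bool__(self):
        return bool(self._pending)
```

With three `/users/<id>` links, one `/posts/<id>` link, and `/about` queued in that order, this sketch hands out one URL of each pattern before returning to the remaining `/users/` links. The linear scan in `pop` is fine for a few hundred requests; a larger crawl would want a heap or per-pattern queues.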

Also, it’s designed so that adding a new recursion strategy is trivial. Spydey was originally written for the purpose of experimenting with different recursive crawling strategies. Read the source.
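As a rough illustration of what "pluggable" can mean here, the sketch below treats a recursion strategy as any object with `add`, `pop`, and `__len__`, so the crawl loop never needs to know which order it is using. This is an assumed design, not spydey's real interface. `BreadthFirst`, `DepthFirst`, `crawl`, and `get_links` are hypothetical names, and links come from an in-memory callable rather than real HTTP requests.

```python
from collections import deque


class BreadthFirst:
    """Visit URLs in the order they were discovered (FIFO)."""

    def __init__(self):
        self._queue = deque()

    def add(self, url):
        self._queue.append(url)

    def pop(self):
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)


class DepthFirst:
    """Visit the most recently discovered URL first (LIFO)."""

    def __init__(self):
        self._stack = []

    def add(self, url):
        self._stack.append(url)

    def pop(self):
        return self._stack.pop()

    def __len__(self):
        return len(self._stack)


def crawl(start, get_links, strategy, max_requests=100):
    """Return the URLs visited, in order, using the given strategy.

    get_links(url) is a caller-supplied function returning the links
    found on that page (hypothetical; no network access happens here).
    """
    seen = {start}
    strategy.add(start)
    visited = []
    while len(strategy) and len(visited) < max_requests:
        url = strategy.pop()
        visited.append(url)
        for link in get_links(url):
            # Queue each link once, however many pages point at it.
            if link not in seen:
                seen.add(link)
                strategy.add(link)
    return visited
```

Under this design, trying a new traversal order means writing one small class with those three methods and passing an instance to `crawl`.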

Oh, and if you install Fabulous, console output is in color.

For smoke testing, I typically run it like:

spydey -r --max-requests=100 --traversal=pattern --profile --log-referrer URL

There are a number of other command-line options, many stolen from wget. Use --help to see what they are.

Home page is at http://github.com/slinkp/spydey.


Source Distribution

spydey-0.1.tar.gz (7.9 kB)


File hashes

Hashes for spydey-0.1.tar.gz
Algorithm Hash digest
SHA256 098026f9d5da35b15282aa6ef035861bd1a040545d59240a320a9cb87a7269f5
MD5 56d159ae8de9fcc889fa6c43bff49707
BLAKE2b-256 eb3325763e044afb693e0053e6f4d8445ed17e28bb862d36c1c64833a0a9cb90
