Fetch a given sitemap and retrieve all URLs in it.

fetch-sitemap

Retrieves all URLs from a given sitemap.xml URL and fetches each of them. Useful for (load) testing the entire site for error responses.
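
As a rough illustration of the first step, the sketch below pulls the page URLs out of a sitemap document using only the standard library. It is not taken from fetch-sitemap's source: the function name `extract_urls` and the sample document are hypothetical, and it assumes a plain `<urlset>` sitemap that uses the standard sitemaps.org namespace (sitemap index files are not handled).

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list:
    """Return the text of every <url><loc> entry in a sitemap document.

    Hypothetical helper for illustration, not fetch-sitemap's own code.
    """
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample document; a real run would download this from sitemap_url.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(url)
```

Each extracted URL would then be requested and its response status checked.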

Sample Output

Note: The default concurrency limit is 10, so up to 10 URLs are fetched at once. Depending on your server's worker count, this might already be enough to DoS it. Try --concurrency-limit=2 first and increase it once you feel comfortable.
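
To make the effect of a concurrency limit concrete, here is a minimal sketch of how such a cap can be enforced with `asyncio.Semaphore`. Whether fetch-sitemap works this way internally is an assumption; `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` only sleeps instead of making a real HTTP request.

```python
import asyncio


async def fetch_all(urls, fetch, concurrency_limit=10):
    """Run fetch(url) for every URL, with at most concurrency_limit in flight.

    Hypothetical helper for illustration, not fetch-sitemap's own code.
    """
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(url):
        # Wait here until one of the concurrency_limit slots is free.
        async with semaphore:
            return await fetch(url)

    # gather() returns the results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_fetch(url):
    """Stand-in for a real HTTP request: sleeps briefly, reports status 200."""
    await asyncio.sleep(0.01)
    return url, 200


if __name__ == "__main__":
    demo_urls = [f"https://example.com/page-{i}" for i in range(6)]
    for url, status in asyncio.run(fetch_all(demo_urls, fake_fetch, 2)):
        print(status, url)
```

With a limit of 2, the server never sees more than two of these requests at the same time, no matter how many URLs the sitemap contains.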

usage: fetch-sitemap 
    [-h] 
    [--basic-auth BASIC_AUTH] 
    [-l LIMIT] 
    [-c CONCURRENCY_LIMIT] 
    [-t REQUEST_TIMEOUT] 
    [--random] 
    [--report-path REPORT_PATH] 
    sitemap_url

Fetch a given sitemap and retrieve all URLs in it.

positional arguments:
  sitemap_url           URL of the sitemap to fetch

options:
  -h, --help            show this help message and exit
  --basic-auth BASIC_AUTH
                        Basic auth information. Use: 'username:password'.
  -l LIMIT, --limit LIMIT
                        Max number of URLs to fetch from the given sitemap.xml. Default: All
  -c CONCURRENCY_LIMIT, --concurrency-limit CONCURRENCY_LIMIT
                        Max number of concurrent requests. Default: 10
  -t REQUEST_TIMEOUT, --request-timeout REQUEST_TIMEOUT
                        Timeout for fetching a URL. Default: 30
  --random              Append a random string like ?12334232343 to each URL to bypass frontend cache. Default: False
  --report-path REPORT_PATH
                        Store results in a CSV file. Example: ./report.csv
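
The idea behind --random can be sketched as follows: adding a different random query string to each request gives it a URL the frontend cache has not seen before, so the request reaches the backend. The helper below is a hypothetical illustration, not the tool's actual code. The name `add_cache_buster`, the token length, and the handling of URLs that already carry a query string are all assumptions.

```python
import random
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def add_cache_buster(url: str, rng: Optional[random.Random] = None) -> str:
    """Return url with a random numeric token added to its query string.

    Hypothetical helper for illustration, not fetch-sitemap's own code.
    """
    rng = rng or random.Random()
    # An 11-digit token, similar in shape to the ?12334232343 example.
    token = str(rng.randrange(10**10, 10**11))
    parts = urlsplit(url)
    # Keep an existing query string intact and append the token to it.
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    print(add_cache_buster("https://example.com/about"))
    print(add_cache_buster("https://example.com/search?q=test"))
```

Going through `urlsplit` rather than plain string concatenation keeps URLs that already contain a query string or a fragment valid.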

Download files

Source Distribution

fetch-sitemap-3.tar.gz (497.7 kB)

Uploaded Source

Built Distribution

fetch_sitemap-3-py3-none-any.whl (6.4 kB)

Uploaded Python 3

File details

Details for the file fetch-sitemap-3.tar.gz.

File metadata

  • Download URL: fetch-sitemap-3.tar.gz
  • Upload date:
  • Size: 497.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for fetch-sitemap-3.tar.gz
Algorithm Hash digest
SHA256 e056e71f753b5cd451a459bc609cae7e329b3b7e42c6b0e365e56612bb7141d3
MD5 9606f4dc1000c248ee1dac582584e2f9
BLAKE2b-256 edef0d3051a91659fb521821e2db0e047aba3bff194105314f85ac4e881e7bc9

File details

Details for the file fetch_sitemap-3-py3-none-any.whl.

File metadata

  • Download URL: fetch_sitemap-3-py3-none-any.whl
  • Upload date:
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for fetch_sitemap-3-py3-none-any.whl
Algorithm Hash digest
SHA256 1b77510d1b350d51317baf6fe6ecff695254b0e2edff2733932ee63605b8e7c2
MD5 13e4fe0e0b19e471d04bf9bc12944c54
BLAKE2b-256 1c2af5e15718e9ac840045cca7c024d72b614aac98167f75e8b75caa7bf5ec15
