Fetch a given sitemap and retrieve all URLs in it.
fetch-sitemap
Retrieves all URLs from a given sitemap.xml URL and fetches each page. Useful for (load) testing the entire site for error responses.
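As a rough illustration of the retrieval step, the sketch below pulls every `<loc>` entry out of a sitemap document using only the standard library. The function name `extract_urls` and the sample document are hypothetical and not taken from fetch-sitemap itself; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return the text of every <loc> element found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample document, used only for this illustration.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

urls = extract_urls(EXAMPLE_SITEMAP)
print(urls)
```

Because the search matches `<loc>` at any depth, the same helper would also return the child sitemap locations of a sitemap index file; whether fetch-sitemap follows those is not stated here.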
Note: The default concurrency limit is 5, so up to 5 URLs are fetched at once. Depending on your server's worker count, this might already be enough to DoS it. Try `--concurrency-limit=2` and increase it once you feel comfortable.
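To make the effect of a concurrency limit concrete, here is a minimal sketch of bounded parallel fetching with `asyncio.Semaphore`. It is an assumption about how such a limit can be enforced, not a description of fetch-sitemap's internals; `fetch_all` and `FakeFetcher` are hypothetical names, and the fetcher only simulates latency instead of making real requests.

```python
import asyncio


async def fetch_all(urls, fetch, concurrency_limit=5):
    """Run fetch(url) for every URL with at most concurrency_limit in flight."""
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def bounded(url):
        # Each task waits here until one of the limited slots is free.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(bounded(url) for url in urls))


class FakeFetcher:
    """Stand-in for an HTTP client that records the peak number of parallel calls."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def __call__(self, url):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulate network latency
        self.in_flight -= 1
        return 200


fetcher = FakeFetcher()
urls = [f"https://example.com/page-{i}" for i in range(20)]
statuses = asyncio.run(fetch_all(urls, fetcher, concurrency_limit=2))
print(len(statuses), "responses, peak parallel requests:", fetcher.peak)
```

With a limit of 2, the server never sees more than two of these requests at the same time, which is why lowering the limit is the safer starting point.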
```
Usage: fetch-sitemap [-h] [--basic-auth BASIC_AUTH] [-l LIMIT] [-c CONCURRENCY_LIMIT]
                     [-t REQUEST_TIMEOUT] [--random] [--report-path REPORT_PATH]
                     [-o OUTPUT] [-v]
                     sitemap_url

Fetch a given sitemap and retrieve all URLs in it.

Positional Arguments:
  sitemap_url           URL of the sitemap to fetch

Options:
  -h, --help            show this help message and exit
  --basic-auth BASIC_AUTH
                        Basic auth information. Use: 'username:password' (default: None)
  -l, --limit LIMIT     Maximum number of URLs to fetch from the given sitemap.xml
                        (default: None)
  -c, --concurrency-limit CONCURRENCY_LIMIT
                        Max number of concurrent requests (default: 5)
  -t, --request-timeout REQUEST_TIMEOUT
                        Timeout for fetching a URL in seconds (default: 30)
  --random              Append a random string like ?12334232343 to each URL to bypass
                        frontend cache (default: False)
  --report-path REPORT_PATH
                        Store results in a CSV file (example: ./report.csv) (default:
                        None)
  -o, --output-dir OUTPUT
                        Store all fetched sitemap documents in this folder (default: None)
  -v, --version         Show program's version number and exit
```
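The `--random` option is described as appending a random string such as `?12334232343` to each URL. One way this kind of cache busting could be done is sketched below. The helper `add_cache_buster` is hypothetical, and how it treats URLs that already carry a query string (the token is appended with `&`) is an assumption, not documented behaviour of the tool.

```python
import random
from urllib.parse import urlsplit, urlunsplit


def add_cache_buster(url, rng=None):
    """Append a random numeric token to the URL's query string."""
    rng = rng or random
    token = str(rng.randrange(10**10, 10**11))  # 11-digit number
    parts = urlsplit(url)
    # Assumption: keep an existing query and add the token after it.
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit(parts._replace(query=query))


plain = add_cache_buster("https://example.com/about")
with_query = add_cache_buster("https://example.com/search?q=test")
print(plain)
print(with_query)
```

Since the token differs on every call, a frontend cache keyed on the full URL treats each request as a new resource and forwards it to the backend.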
🤺 Local Development
poetry install
poetry run fetch-sitemap -h
poetry run ./tests.sh