Skip to main content

Fetch a given sitemap and retrieve all URLs in it.

Project description

fetch-sitemap

Retrieves all URLs of a given sitemap.xml URL and fetches each page one by one. Useful for (load) testing the entire site for error responses.

Sample Output

Installation

$ pip install fetch-sitemap

Usage

$ fetch-sitemap --help

 Usage: fetch-sitemap [OPTIONS] SITEMAP_URL

 Fetch a given sitemap and retrieve all URLs in it.

╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --basic-auth                -a  TEXT              Basic auth information. Format: 'username:password'                                                          │
│ --limit                     -l  INT [>=1]         Maximum number of URLs to fetch from the given sitemap.xml.                                                  │
│ --recursive/--no-recursive                        Recursively fetch all sitemap documents from the given sitemap.xml. [default: recursive]                     │
│ --concurrency-limit         -c  INT [>=1]         Max number of concurrent requests. [default: 5; >=1]                                                         │
│ --request-timeout           -t  INT [>=1]         Timeout for fetching a URL in seconds. [default: 30; >=1]                                                    │
│ --random                    -r                    Append a random string like ?12334232343 to each URL to bypass frontend cache.                               │
│ --random-length                 INT [1 to 100]    Length of the --random hash. [default: 15; 1 to 100]                                                         │
│ --report-path               -p  FILE              Store results in a CSV file. Example: ./report.csv                                                           │
│ --output-dir                -o  DIRECTORY         Store all fetched sitemap documents in this folder. Example: /tmp/my.domain.com/                             │
│ --slow-threshold                FLOAT [>=0.0]     Responses slower than this (in seconds) are considered 'slow'. [default: 5.0; >=0.0]                         │
│ --slow-num                      INTEGER OR "ALL"  How many 'slow' responses to show. [default: 10]                                                             │
│ --user-agent                    TEXT              User-Agent string set in the HTTP header. [default: Mozilla/5.0 (compatible; fetch-sitemap/23)]              │
│ --version                                         Show the version and exit.                                                                                   │
│ --help                                            Show this message and exit.                                                                                  │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

🤺 Local Development

poetry install
poetry run fetch-sitemap -h
poetry run ./tests.sh

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fetch_sitemap-27.tar.gz (9.2 kB view details)

Uploaded Source

Built Distribution

fetch_sitemap-27-py3-none-any.whl (9.9 kB view details)

Uploaded Python 3

File details

Details for the file fetch_sitemap-27.tar.gz.

File metadata

  • Download URL: fetch_sitemap-27.tar.gz
  • Upload date:
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.7 Darwin/24.0.0

File hashes

Hashes for fetch_sitemap-27.tar.gz
Algorithm Hash digest
SHA256 ff888992d0e3eee82075b42f5d441784ad3d20816ebc3f480cd2a0750a504052
MD5 3acd3e540c3ed5bc239399c9d62a0fd9
BLAKE2b-256 5ff7a438aafe4b8c25943177c300dc6067523d2df642393682b9587ccc8a2d44

See more details on using hashes here.

File details

Details for the file fetch_sitemap-27-py3-none-any.whl.

File metadata

  • Download URL: fetch_sitemap-27-py3-none-any.whl
  • Upload date:
  • Size: 9.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.7 Darwin/24.0.0

File hashes

Hashes for fetch_sitemap-27-py3-none-any.whl
Algorithm Hash digest
SHA256 4f9f4606303c416a46be20ff82450b87391ece12fef33bd810da1514b1519ad4
MD5 5bfdba457048eec23c644589d86ce6d2
BLAKE2b-256 09a38e87dab6872b12ee477adc5005e30c3c01a5ef6cbe95b6fa0315bacb569e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page