Fetch a given sitemap and retrieve all URLs in it.

fetch-sitemap

Retrieves all URLs listed in a given sitemap.xml and fetches each page, with up to 5 requests in flight at a time by default. Useful for (load) testing an entire site for error responses.
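
To illustrate the first step, here is a minimal sketch of pulling page URLs out of a sitemap document using only the Python standard library. It is not taken from fetch-sitemap's source: the function name `extract_urls`, the `limit` parameter (loosely mirroring `--limit`), and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str, limit: Optional[int] = None) -> List[str]:
    """Return the text of every <loc> element, optionally capped at `limit`.

    Hypothetical helper for illustration; not fetch-sitemap's own code.
    """
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return urls if limit is None else urls[:limit]


# Made-up sample sitemap used only for this example.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(url)
```

Because the search matches `<loc>` at any depth, the same helper also returns the child sitemap URLs of a sitemap index file, which would then need to be fetched and parsed in turn.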

Installation

$ pip install fetch-sitemap

Usage

$ fetch-sitemap --help

 Usage: fetch-sitemap [OPTIONS] SITEMAP_URL

 Fetch a given sitemap and retrieve all URLs in it.

╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --basic-auth         -a  TEXT       Basic auth information. Format: 'username:password'                            │
│ --limit              -l  INTEGER    Maximum number of URLs to fetch from the given sitemap.xml.                    │
│ --concurrency-limit  -c  INTEGER    Max number of concurrent requests. [default: 5]                                │
│ --request-timeout    -t  INTEGER    Timeout for fetching a URL in seconds. [default: 30]                           │
│ --random             -r             Append a random string like ?12334232343 to each URL to bypass frontend cache. │
│ --random-length          INTEGER    Length of the --random hash. [default: 15]                                     │
│ --report-path        -p  FILE       Store results in a CSV file. Example: ./report.csv                             │
│ --output-dir         -o  DIRECTORY  Store all fetched sitemap documents in this folder. Example:                   │
│                                     /tmp/my.domain.com/                                                            │
│ --slow-threshold         FLOAT      Responses slower than this (in seconds) are considered 'slow'. [default: 5.0]  │
│ --slow-num               INTEGER    How many 'slow' responses to show. [default: 10]                               │
│ --version            -v             Show the version and exit.                                                     │
│ --help                              Show this message and exit.                                                    │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
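
The options above can be pictured with a short sketch. It shows one plausible way to combine a cache-busting suffix (`--random`, `--random-length`), a cap on concurrent requests (`--concurrency-limit`), and a slow-response threshold (`--slow-threshold`). This is an assumption-laden illustration, not fetch-sitemap's actual implementation: `add_cache_buster`, `fetch_all`, and the injected `fetch` coroutine are hypothetical names, and using `&` when a URL already has a query string is a guess about sensible behavior.

```python
import asyncio
import secrets
import time


def add_cache_buster(url: str, length: int = 15) -> str:
    """Append a random digit string, e.g. ?123342323430000 (hypothetical helper)."""
    token = "".join(secrets.choice("0123456789") for _ in range(length))
    # Assumption: use '&' if the URL already carries a query string.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{token}"


async def fetch_all(urls, fetch, concurrency_limit=5, slow_threshold=5.0):
    """Run `fetch(url)` for every URL with at most `concurrency_limit` in flight.

    `fetch` is any coroutine function returning a status code; it is injected
    so the sketch stays independent of a specific HTTP client.
    Returns (all results, results slower than `slow_threshold` seconds).
    """
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def timed(url):
        async with semaphore:  # blocks while the limit is reached
            start = time.perf_counter()
            status = await fetch(url)
            return url, status, time.perf_counter() - start

    results = await asyncio.gather(*(timed(url) for url in urls))
    slow = [result for result in results if result[2] > slow_threshold]
    return results, slow
```

A semaphore keeps the number of in-flight requests bounded without splitting the URL list into fixed batches, so a single slow page does not hold up the other slots.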

🤺 Local Development

poetry install
poetry run fetch-sitemap --help
poetry run ./tests.sh
