asgi-sitemaps

Sitemap generation for ASGI applications. Inspired by Django's sitemap framework.

Note: This is alpha software. Be sure to pin your dependencies to the latest minor release.

Features

  • Build and compose sitemap sections into a single dynamic ASGI endpoint.
  • Supports drawing sitemap items from a variety of sources (static lists, (async) ORM queries, etc.).
  • Compatible with any ASGI framework.
  • Fully type annotated.
  • 100% test coverage.

Installation

Install with pip:

$ pip install asgi-sitemaps

asgi-sitemaps requires Python 3.7+.

Quickstart

Let's build a static sitemap for a "Hello, world!" application. The sitemap will contain a single URL entry for the home / endpoint.

Here is the project file structure:

.
└── server
    ├── __init__.py
    ├── app.py
    └── sitemap.py

First, declare a sitemap section by subclassing Sitemap, then wrap it in a SitemapApp:

# server/sitemap.py
import asgi_sitemaps

class Sitemap(asgi_sitemaps.Sitemap):
    async def items(self):
        return ["/"]

    def location(self, item: str):
        return item

    def changefreq(self, item: str):
        return "monthly"

sitemap = asgi_sitemaps.SitemapApp(Sitemap(), domain="example.io")

Now, register the sitemap endpoint as a route onto your ASGI app. For example, if using Starlette:

# server/app.py
from starlette.applications import Starlette
from starlette.responses import PlainTextResponse
from starlette.routing import Route
from .sitemap import sitemap

async def home(request):
    return PlainTextResponse("Hello, world!")

routes = [
    Route("/", home),
    Route("/sitemap.xml", sitemap),
]

app = Starlette(routes=routes)

Serve the app using $ uvicorn server.app:app, then request the sitemap:

$ curl http://localhost:8000/sitemap.xml
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.io/</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

Tada!
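
To check the response programmatically rather than by eye, the XML can be parsed with the standard library. This is a minimal sketch, assuming the response body matches the output shown above; `extract_locations` is a hypothetical helper name, not part of asgi-sitemaps.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by the sitemap protocol, as seen in the <urlset> tag above.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Body copied from the curl output above.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.io/</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
"""


def extract_locations(xml_body: bytes) -> List[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_body)
    return [loc.text or "" for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


print(extract_locations(SAMPLE))  # ['http://example.io/']
```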

To learn more:

  • See How-To for more advanced usage, including splitting the sitemap into multiple sections and dynamically generating entries from database queries.
  • See the Sitemap API reference for all supported sitemap options.

How-To

Sitemap sections

You can combine multiple sitemap classes into a single sitemap endpoint. This is useful for splitting the sitemap into multiple sections that may have different items() and/or sitemap attributes. Such sections could be static pages, blog posts, recent articles, etc.

To do so, declare multiple sitemap classes, then pass them as a list to SitemapApp:

# server/sitemap.py
import asgi_sitemaps

class StaticSitemap(asgi_sitemaps.Sitemap):
    ...

class BlogSitemap(asgi_sitemaps.Sitemap):
    ...

sitemap = asgi_sitemaps.SitemapApp([StaticSitemap(), BlogSitemap()], domain="example.io")

Entries from each sitemap will be concatenated when building the final sitemap.xml.

Dynamic generation from database queries

Sitemap.items() can be an async method, and it can return any iterable or async iterable. This means you can integrate with an async database client or ORM so that Sitemap.items() fetches and returns the rows used to generate your sitemap.

Here's an example using Databases, assuming you have a Database instance in server/resources.py:

# server/sitemap.py
import asgi_sitemaps
from .resources import database

class Sitemap(asgi_sitemaps.Sitemap):
    async def items(self):
        query = "SELECT permalink, updated_at FROM articles;"
        return await database.fetch_all(query)

    def location(self, row: dict):
        return row["permalink"]
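
The query above also selects updated_at, which could feed a lastmod() method (see the API reference). Here is a minimal sketch, written as a standalone class so that it runs without a database. `ArticleFieldsMixin` is a hypothetical name, and rows are assumed to be mappings whose updated_at value is already a datetime object.

```python
import datetime as dt
from typing import Any, Mapping, Optional


class ArticleFieldsMixin:
    """Hypothetical mixin: maps article rows to sitemap fields."""

    def location(self, row: Mapping[str, Any]) -> str:
        return row["permalink"]

    def lastmod(self, row: Mapping[str, Any]) -> Optional[dt.datetime]:
        # Assumes the database driver already returns a datetime here.
        return row["updated_at"]


# A plain dict stands in for a database row.
row = {"permalink": "/blog/my-article", "updated_at": dt.datetime(2020, 7, 5, 12, 0)}
fields = ArticleFieldsMixin()
print(fields.location(row), fields.lastmod(row))
```

In a real project, these two methods would sit directly on the Sitemap subclass next to items().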

Advanced web framework integration

While asgi-sitemaps is framework-agnostic, you can use the .scope attribute available on Sitemap instances to feed the ASGI scope into your framework-specific APIs for inspecting and manipulating request information.

Here is an example with Starlette where we build a sitemap of static pages. To decouple from the raw URL paths, pages are referred to by view name. We reverse-lookup their URLs by building a Request instance from the ASGI .scope, and using .url_for():

# server/sitemap.py
import asgi_sitemaps
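from starlette.datastructures import URL
from starlette.requests import Request

class StaticSitemap(asgi_sitemaps.Sitemap):
    def items(self):
        return ["home", "about", "blog:home"]

    def location(self, name: str):
        request = Request(scope=self.scope)
        # .url_for() returns an absolute URL, but .location() must return
        # an absolute path, so keep only the path component.
        return URL(str(request.url_for(name))).path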

The corresponding Starlette routing table could look something like this:

# server/routes.py
from starlette.routing import Route
from . import views
from .sitemap import sitemap

routes = [
    Route("/", views.home, name="home"),
    Route("/about", views.about, name="about"),
    Route("/blog/", views.blog_home, name="blog:home"),
    Route("/sitemap.xml", sitemap),
]

API Reference

class Sitemap

Represents a source of sitemap entries.

You can specify the type T of sitemap items for extra type safety:

import asgi_sitemaps

class MySitemap(asgi_sitemaps.Sitemap[str]):
    ...

async items

Signature: async def () -> Union[Iterable[T], AsyncIterable[T]]

(Required) Return an iterable or an asynchronous iterable of items of the same type. Each item will be passed as-is to .location(), .lastmod(), .changefreq(), and .priority().

Examples:

# Simplest usage: return a list
def items(self) -> List[str]:
    return ["/", "/contact"]

# Async operations are also supported
async def items(self) -> List[dict]:
    query = "SELECT permalink, updated_at FROM pages;"
    return await database.fetch_all(query)

# Sync and async generators are also supported
async def items(self) -> AsyncIterator[dict]:
    query = "SELECT permalink, updated_at FROM pages;"
    async for row in database.aiter_rows(query):
        yield row
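
Since items() may hand back either kind of iterable, code that consumes it has to handle both. The sketch below shows one way a caller could normalize the three cases above. It is an illustration only, not asgi-sitemaps' actual implementation, and `collect_items` is a hypothetical name.

```python
import asyncio
import collections.abc
import inspect
from typing import Any, AsyncIterator, List


async def collect_items(result: Any) -> List[Any]:
    """Turn the return value of any items() flavor into a list."""
    # `async def items()` without `yield` returns a coroutine: await it first.
    if inspect.isawaitable(result):
        result = await result
    # Async generators and other async iterables need `async for`.
    if isinstance(result, collections.abc.AsyncIterable):
        return [item async for item in result]
    return list(result)


def sync_items() -> List[str]:
    return ["/", "/contact"]


async def async_items() -> List[str]:
    return ["/", "/contact"]


async def generated_items() -> AsyncIterator[str]:
    for path in ["/", "/contact"]:
        yield path


async def main() -> List[List[str]]:
    return [
        await collect_items(sync_items()),
        await collect_items(async_items()),
        await collect_items(generated_items()),
    ]


results = asyncio.run(main())
print(results)  # three identical lists: ['/', '/contact']
```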

location

Signature: def (item: T) -> str

(Required) Return the absolute path of a sitemap item.

"Absolute path" means a URL path without a protocol or domain. For example: /blog/my-article. (So https://mydomain.com/blog/my-article is not a valid location, nor is mydomain.com/blog/my-article.)
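
The rule can be expressed as a small check. This is a sketch using only the standard library; `is_absolute_path` is a hypothetical helper, and whether asgi-sitemaps performs a similar validation internally is not stated here.

```python
from urllib.parse import urlsplit


def is_absolute_path(value: str) -> bool:
    """True if `value` has no protocol, no domain, and starts with a slash."""
    parts = urlsplit(value)
    return not parts.scheme and not parts.netloc and parts.path.startswith("/")


print(is_absolute_path("/blog/my-article"))                      # True
print(is_absolute_path("https://mydomain.com/blog/my-article"))  # False
print(is_absolute_path("mydomain.com/blog/my-article"))          # False
```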

lastmod

Signature: def (item: T) -> Optional[datetime.datetime]

(Optional) Return the date of last modification of a sitemap item as a datetime object, or None (the default) for no lastmod field.

changefreq

Signature: def (item: T) -> Optional[str]

(Optional) Return the change frequency of a sitemap item.

Possible values are:

  • None - No changefreq field (the default).
  • "always"
  • "hourly"
  • "daily"
  • "weekly"
  • "monthly"
  • "yearly"
  • "never"
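
When the value comes from user data or configuration, it may be worth checking it against this list before returning it. A minimal sketch; `check_changefreq` and `CHANGEFREQ_VALUES` are hypothetical names, not part of asgi-sitemaps.

```python
from typing import Optional

# The seven string values listed above. None means "no changefreq field".
CHANGEFREQ_VALUES = frozenset(
    {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
)


def check_changefreq(value: Optional[str]) -> Optional[str]:
    """Return `value` unchanged if it is allowed, else raise ValueError."""
    if value is not None and value not in CHANGEFREQ_VALUES:
        raise ValueError(f"unsupported changefreq: {value!r}")
    return value


print(check_changefreq("monthly"))  # monthly
print(check_changefreq(None))       # None
```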

priority

Signature: def (item: T) -> float

(Optional) Return the priority of a sitemap item. Must be between 0 and 1. Defaults to 0.5.
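
As an illustration, priority could be derived from how deep a page sits in the site. The policy below (1.0 for the home page, minus 0.2 per path segment) is purely hypothetical; the clamping keeps the result within the required 0 to 1 range.

```python
def priority_for_path(path: str) -> float:
    """Hypothetical policy: 1.0 for "/", minus 0.2 per path segment."""
    depth = len([segment for segment in path.split("/") if segment])
    # Clamp so that very deep paths never produce a negative priority.
    return max(0.0, min(1.0, round(1.0 - 0.2 * depth, 1)))


print(priority_for_path("/"))                 # 1.0
print(priority_for_path("/blog/my-article"))  # 0.6
```

Inside a Sitemap subclass, .priority() could then return priority_for_path(self.location(item)).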

protocol

Type: str

(Optional) This attribute defines the protocol used to build URLs of the sitemap.

Possible values are:

  • "auto" - The protocol with which the sitemap was requested (the default).
  • "http"
  • "https"
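
To make the "auto" behavior concrete: the ASGI HTTP scope carries the request scheme under the "scheme" key, which the ASGI specification treats as "http" when absent. The function below sketches how the three values could be resolved. It is an illustration under that assumption, not the library's actual code, and `resolve_protocol` is a hypothetical name.

```python
from typing import Any, Mapping


def resolve_protocol(protocol: str, scope: Mapping[str, Any]) -> str:
    """Pick the URL scheme for sitemap entries."""
    if protocol == "auto":
        # Use the scheme the sitemap was requested with.
        return scope.get("scheme", "http")
    if protocol not in ("http", "https"):
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return protocol


print(resolve_protocol("auto", {"type": "http", "scheme": "https"}))   # https
print(resolve_protocol("https", {"type": "http", "scheme": "http"}))   # https
```

Setting protocol to "https" explicitly can be useful when the app sits behind a TLS-terminating proxy that forwards plain HTTP.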

scope

This property returns the ASGI scope of the current HTTP request.

class SitemapApp

An ASGI application that responds to HTTP requests with the sitemap.xml contents of the sitemap.

Parameters:

  • (Required) sitemaps - A Sitemap object or a list of Sitemap objects, used to generate sitemap entries.
  • (Required) domain - The domain to use when generating sitemap URLs.

Examples:

sitemap = SitemapApp(Sitemap(), domain="mydomain.com")
sitemap = SitemapApp([StaticSitemap(), BlogSitemap()], domain="mydomain.com")
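
Because SitemapApp is a plain ASGI application, it should be possible to call it directly in tests without starting a server. The harness below is a sketch of that idea using only the standard library. `call_asgi_app` is a hypothetical helper, and `fake_sitemap_app` is a stand-in so the example runs on its own; it is not asgi-sitemaps code. The scope keys follow the ASGI HTTP specification.

```python
import asyncio
from typing import Any, Dict, List, Tuple

Message = Dict[str, Any]


async def call_asgi_app(app: Any, path: str) -> Tuple[int, bytes]:
    """Send one GET request to an ASGI app and return (status, body)."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "scheme": "http",
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "root_path": "",
        "headers": [],
        "server": ("localhost", 8000),
        "client": ("127.0.0.1", 50000),
    }
    sent: List[Message] = []

    async def receive() -> Message:
        # An empty request body, as for a GET request.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message: Message) -> None:
        sent.append(message)

    await app(scope, receive, send)
    status = next(m["status"] for m in sent if m["type"] == "http.response.start")
    body = b"".join(m.get("body", b"") for m in sent if m["type"] == "http.response.body")
    return status, body


async def fake_sitemap_app(scope: Message, receive: Any, send: Any) -> None:
    # Stand-in for a SitemapApp instance (assumption: any ASGI app fits here).
    headers = [(b"content-type", b"application/xml")]
    await send({"type": "http.response.start", "status": 200, "headers": headers})
    await send({"type": "http.response.body", "body": b"<urlset/>"})


status, body = asyncio.run(call_asgi_app(fake_sitemap_app, "/sitemap.xml"))
print(status, body)  # 200 b'<urlset/>'
```

In a real test, the `sitemap` object from server/sitemap.py would be passed in place of `fake_sitemap_app`.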

License

MIT

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

0.3.0b1 - 2020-07-05

Beta release.

0.2.0 - 2020-06-01

Changed

  • Project was renamed from sitemaps to asgi-sitemaps - sitemap generation for ASGI apps. (Pull #2)
  • Change options of CLI and programmatic API to fit new "ASGI-only" project scope. (Pull #2)
  • CLI now reads from stdin (for --check mode) and outputs sitemap to stdout. (Pull #2)

Removed

  • Drop support for crawling arbitrary remote servers. (Pull #2)

Fixed

  • Don't include non-200 or non-HTML URLs in sitemap. (Pull #2)

0.1.0 - 2020-05-31

Added

  • Initial implementation: CLI and programmatic async API.
