Library / middleware for URI-based assertions
Project description
:mod:`repoze.urispace` -- Hierarchical URI-based metadata
*********************************************************
:Author: Tres Seaver
:Version: |version|
.. module:: repoze.urispace
:synopsis: Hierarchical URI-based metadata
.. topic:: Overview
:mod:`repoze.urispace` implements the URISpace_ 1.0 spec, as proposed
to the W3C by Akamai. Its aim is to provide an implementation of
that language as a vehicle for asserting declarative metadata about a
resource based on pattern matching against its URI.
Once asserted, such metadata can be used to guide the application in
serving the resource, with possible applciations including:
- Setting cache control headers.
- Selecting externally applied themes, e.g. in :mod:`Deliverance`
- Restricting access, e.g. to emulate Zope's "placeful security."
URISpace Specification
----------------------
The URISpace_ specification provides for matching on the following
portions of a URI:
- scheme
- authority (see URIRFC_)
o host, including wildcarding (leading only) and port
o user (if specified in the URI)
- path elements, including nesting and wildcarding, as well as
parameters, where used.
- query elements, including test for presence or for specific value
- fragments (likely irrelevant for server-side applications)
.. Note:: :mod:`repoze.urispace` does not yet provide support for
fragment matching.
The asserted metadata can be scalar, or can use RDF Bag and Sequences
to indicate sets or ordered collections.
.. Note:: :mod:`repoze.urispace` does not yet provide support for
parsing multi-valued assertions using RDF.
Operators are provided to allow for incrementally updating or clearing
the value for a given metadata element. Specified operators include:
``replace``
Completely replace any previously asserted value with a new one.
This is the default operator.
``clear``
Remove any previously asserted value.
``union``
Perform a set union: ``old | new``
``intersection``
Perform a set intersection: ``old & new``
``rev-intersection``
Perform a set exclusion: ``old ^ new``
``difference``
Perform set subtraction: ``old - new``
``rev-difference``
Perform set subtraction: ``new - old``
``prepend``
Insert ``new`` values at the head of ``old`` values
``append``
Insert ``new`` values at the tail of ``old`` values
Example
-------
Suppose we want to select different Delieverance themes and or rulesets
based on the URI of the resource being themed. In particular:
- The ``news``, ``lifestyle``, and ``sports`` sections of the site each get
custom themes, with the homepage and any other sections sharing the
default theme.
- Within the news section, the ``world``, ``national``, and ``local``
sections all use a different theme URL (one with the desired color
scheme name encoded as a query string).
- Within any section, the ``index.html`` page should use a different
ruleset, than that for stories in that section (whose final path element
will be ``<slug>.html``): the index page's HTML structured very differently
from that used for stories.
A URISpace file specifying these policies would look like:
.. include:: examples/dv_news.xml
:literal:
Given that URISpace file, one can test how given URIs matches using
the ``uri_test`` script::
$ /path/to/bin/uri_test examples/dv_news.xml \
http://example.com/ \
http://example.com/foo \
http://example.com/news/ \
http://example.com/news/index.html \
http://example.com/news/world/index.html \
http://example.com/sports/ \
http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
URI: http://example.com/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/foo
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/news/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/world/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html?style=world
------------------------------------------------------------------------------
URI: http://example.com/sports/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
------------------------------------------------------------------------------
URI: http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
Using a URISpace parser in Python Code
--------------------------------------
Once parsing is complete, the URISpace is available as tree-like object.
The canonical operators to extract metadata for a given URI are:
.. code-block:: python
from urlparse import urlsplit
scheme, nethost, path, query, fragment = urlsplit(uri)
path = path.split('/')
if len(path) > 1 and path[0] == '':
path = path[1:]
info = {'scheme': scheme,
'nethost': nethost,
'path': path,
'query': parse_qs(query, keep_blank_values=1),
'fragment': fragment,
}
operators = urispace.collect(info)
assertions = {}
for operator in operators:
operator.apply(assertions)
At this point, ``assertions`` will contain keys and values for all
operators found while matching against the URI.
Using URISpace as WSGI Middleware
---------------------------------
One application of a URISpace might be to make assertions about the
URI of a WSGI request, in order to allow other parts of the application
to use those assertions. :mod:`repoze.urispace` provides a component
which can be used as middleware for this purpose.
To configure the middleware in a :mod:`PasteDeploy` config file::
[filter:urispace]
use = egg:repoze.urispace#urispace
file = %{here)s/urispace.xml
You should then be able to add the middleware to your pipeline::
[pipeline:main]
pipeline =
urispace
your_app
In your application, you can get to the assertions made by the middleware
using the :func:`repoze.urispace.middleware.getAssertions` API, e.g.:
.. code-block:: python
from repoze.urispace.middleware import getAssertions
def your_app(environ, start_response):
assertions = getAssertions(environ)
Development Notes
-----------------
Extending :mod:`repoze.urispace`
++++++++++++++++++++++++++++++++
- Registering custom selectors (TBD)
- Registering operator converters (TBD)
.. toctree::
:maxdepth: 2
parser
.. _URISpace: http://www.w3.org/TR/urispace.html
.. _URIRFC: http://www.ietf.org/rfc/rfc2396.txt
.. target-notes::
``repoze.urispace`` Changelog
=============================
0.1 (2009-07-04)
----------------
- Initial release.
*********************************************************
:Author: Tres Seaver
:Version: |version|
.. module:: repoze.urispace
:synopsis: Hierarchical URI-based metadata
.. topic:: Overview
:mod:`repoze.urispace` implements the URISpace_ 1.0 spec, as proposed
to the W3C by Akamai. Its aim is to provide an implementation of
that language as a vehicle for asserting declarative metadata about a
resource based on pattern matching against its URI.
Once asserted, such metadata can be used to guide the application in
serving the resource, with possible applciations including:
- Setting cache control headers.
- Selecting externally applied themes, e.g. in :mod:`Deliverance`
- Restricting access, e.g. to emulate Zope's "placeful security."
URISpace Specification
----------------------
The URISpace_ specification provides for matching on the following
portions of a URI:
- scheme
- authority (see URIRFC_)
o host, including wildcarding (leading only) and port
o user (if specified in the URI)
- path elements, including nesting and wildcarding, as well as
parameters, where used.
- query elements, including test for presence or for specific value
- fragments (likely irrelevant for server-side applications)
.. Note:: :mod:`repoze.urispace` does not yet provide support for
fragment matching.
The asserted metadata can be scalar, or can use RDF Bag and Sequences
to indicate sets or ordered collections.
.. Note:: :mod:`repoze.urispace` does not yet provide support for
parsing multi-valued assertions using RDF.
Operators are provided to allow for incrementally updating or clearing
the value for a given metadata element. Specified operators include:
``replace``
Completely replace any previously asserted value with a new one.
This is the default operator.
``clear``
Remove any previously asserted value.
``union``
Perform a set union: ``old | new``
``intersection``
Perform a set intersection: ``old & new``
``rev-intersection``
Perform a set exclusion: ``old ^ new``
``difference``
Perform set subtraction: ``old - new``
``rev-difference``
Perform set subtraction: ``new - old``
``prepend``
Insert ``new`` values at the head of ``old`` values
``append``
Insert ``new`` values at the tail of ``old`` values
Example
-------
Suppose we want to select different Delieverance themes and or rulesets
based on the URI of the resource being themed. In particular:
- The ``news``, ``lifestyle``, and ``sports`` sections of the site each get
custom themes, with the homepage and any other sections sharing the
default theme.
- Within the news section, the ``world``, ``national``, and ``local``
sections all use a different theme URL (one with the desired color
scheme name encoded as a query string).
- Within any section, the ``index.html`` page should use a different
ruleset, than that for stories in that section (whose final path element
will be ``<slug>.html``): the index page's HTML structured very differently
from that used for stories.
A URISpace file specifying these policies would look like:
.. include:: examples/dv_news.xml
:literal:
Given that URISpace file, one can test how given URIs matches using
the ``uri_test`` script::
$ /path/to/bin/uri_test examples/dv_news.xml \
http://example.com/ \
http://example.com/foo \
http://example.com/news/ \
http://example.com/news/index.html \
http://example.com/news/world/index.html \
http://example.com/sports/ \
http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
URI: http://example.com/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/foo
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/news/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/world/index.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/news.html?style=world
------------------------------------------------------------------------------
URI: http://example.com/sports/
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
------------------------------------------------------------------------------
URI: http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
rules = http://static.example.com/rules/default.xml
theme = http://themes.example.com/sports.html
Using a URISpace parser in Python Code
--------------------------------------
Once parsing is complete, the URISpace is available as tree-like object.
The canonical operators to extract metadata for a given URI are:
.. code-block:: python
from urlparse import urlsplit
scheme, nethost, path, query, fragment = urlsplit(uri)
path = path.split('/')
if len(path) > 1 and path[0] == '':
path = path[1:]
info = {'scheme': scheme,
'nethost': nethost,
'path': path,
'query': parse_qs(query, keep_blank_values=1),
'fragment': fragment,
}
operators = urispace.collect(info)
assertions = {}
for operator in operators:
operator.apply(assertions)
At this point, ``assertions`` will contain keys and values for all
operators found while matching against the URI.
Using URISpace as WSGI Middleware
---------------------------------
One application of a URISpace might be to make assertions about the
URI of a WSGI request, in order to allow other parts of the application
to use those assertions. :mod:`repoze.urispace` provides a component
which can be used as middleware for this purpose.
To configure the middleware in a :mod:`PasteDeploy` config file::
[filter:urispace]
use = egg:repoze.urispace#urispace
file = %{here)s/urispace.xml
You should then be able to add the middleware to your pipeline::
[pipeline:main]
pipeline =
urispace
your_app
In your application, you can get to the assertions made by the middleware
using the :func:`repoze.urispace.middleware.getAssertions` API, e.g.:
.. code-block:: python
from repoze.urispace.middleware import getAssertions
def your_app(environ, start_response):
assertions = getAssertions(environ)
Development Notes
-----------------
Extending :mod:`repoze.urispace`
++++++++++++++++++++++++++++++++
- Registering custom selectors (TBD)
- Registering operator converters (TBD)
.. toctree::
:maxdepth: 2
parser
.. _URISpace: http://www.w3.org/TR/urispace.html
.. _URIRFC: http://www.ietf.org/rfc/rfc2396.txt
.. target-notes::
``repoze.urispace`` Changelog
=============================
0.1 (2009-07-04)
----------------
- Initial release.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
repoze.urispace-0.1.tar.gz
(28.1 kB
view hashes)