cssselect is a parser for CSS Selectors that can translate to XPath 1.0

cssselect is a parser for CSS Selectors Level 3 that can also translate selectors to XPath 1.0 queries. Such queries can be used in lxml to find the matching elements in an XML or HTML document.

This module used to live inside of lxml as lxml.cssselect before it was extracted as a stand-alone project.

The CSSSelector class

The most important class in the cssselect module is CSSSelector. It provides the same interface as lxml’s XPath class, but accepts a CSS selector expression as input:

>>> from cssselect import CSSSelector
>>> sel = CSSSelector('div.content')
>>> sel  #doctest: +ELLIPSIS
<CSSSelector ... for 'div.content'>
>>> sel.css
'div.content'

The selector actually compiles to XPath, and you can see the expression by inspecting the object:

>>> sel.path
"descendant-or-self::div[contains(concat(' ', normalize-space(@class), ' '), ' content ')]"
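The predicate in that expression is a whole-token test on the class attribute. As a rough illustration only, the same check can be sketched in plain Python; has_class is a hypothetical helper written for this example and is not part of cssselect or lxml:

```python
import re


def has_class(class_attr, name):
    """Mimic contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    # normalize-space() strips leading/trailing XML whitespace and
    # collapses internal runs of it into single spaces.
    tokens = re.split(r"[ \t\r\n]+", class_attr.strip(" \t\r\n"))
    normalized = " ".join(tokens)
    # Padding both sides with a space turns "is one of the tokens"
    # into a plain substring test.
    return f" {name} " in f" {normalized} "


print(has_class("content body", "content"))      # True
print(has_class("  body\n content ", "content"))  # True
print(has_class("contents", "content"))          # False
```

The padding is what keeps div.content from matching class="contents": a bare substring test would accept it.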

To use the selector, simply call it with a document or element object:

>>> from lxml.etree import fromstring
>>> h = fromstring('''<div id="outer">
...   <div id="inner" class="content body">
...       text
...   </div></div>''')
>>> [e.get('id') for e in sel(h)]
['inner']

CSS Selectors

This library attempts to implement CSS selectors as described in the W3C specification. Many of the pseudo-classes do not apply in this context, including all dynamic pseudo-classes. In particular, these are not available:

  • link state: :link, :visited, :target

  • actions: :hover, :active, :focus

  • UI states: :enabled, :disabled, :indeterminate (:checked and :unchecked are available)

Also, none of the pseudo-elements apply, because the selector only returns elements and pseudo-elements select portions of text, like ::first-line.

Namespaces

In CSS you can use namespace-prefix|element, similar to namespace-prefix:element in an XPath expression. In fact, it maps one-to-one, and the same rules are used to map namespace prefixes to namespace URIs.
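The prefix-to-URI mapping on the XPath side can be illustrated without cssselect at all. The sketch below uses the XPath subset in the standard library's xml.etree.ElementTree rather than lxml, so it only demonstrates how a prefix in a path is resolved through a mapping; the document, the prefixes and the variable names are made up for the example. Following the description above, the CSS form of the same query would be svg|rect.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# One <rect> in the SVG namespace (document prefix "s") and one
# <rect> in no namespace.
doc = ET.fromstring(
    '<root xmlns:s="http://www.w3.org/2000/svg">'
    '<s:rect id="a"/><rect id="b"/>'
    "</root>"
)

# The prefix used in the query ("svg") need not match the prefix used
# in the document ("s"); only the URI it maps to matters.
namespaces = {"svg": SVG_NS}
ids = [e.get("id") for e in doc.findall(".//svg:rect", namespaces)]
print(ids)  # ['a']
```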

Limitations

These applicable pseudo-classes are not yet implemented:

  • :lang(language)

  • *:first-of-type, *:last-of-type, *:nth-of-type, *:nth-last-of-type, *:only-of-type. All of these work when you specify an element type, but not with *

Unlike XPath, you cannot provide parameters in your expressions; all expressions are completely static.

XPath has underspecified string quoting rules (there seems to be no string quoting at all), so if you use expressions that contain characters that require quoting, you might have problems with the translation from CSS to XPath.
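To make the quoting problem concrete: an XPath 1.0 string literal is delimited by either single or double quotes and has no escape character, so a value containing both kinds of quote cannot be written as a single literal. A common workaround is to assemble it with concat(). The helper below is a minimal sketch of that workaround; xpath_string_literal is a hypothetical name, and nothing here implies that this version of cssselect does this internally.

```python
def xpath_string_literal(value):
    """Return an XPath 1.0 expression that evaluates to `value`."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote kinds are present: split on single quotes and join the
    # pieces with concat(), writing each "'" as a double-quoted literal.
    pieces = []
    for i, part in enumerate(value.split("'")):
        if i:
            pieces.append('"\'"')
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


print(xpath_string_literal("plain"))    # 'plain'
print(xpath_string_literal("it's"))     # "it's"
print(xpath_string_literal('a\'b"c'))   # concat('a', "'", 'b"c')
```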
