Parsing Expressions
pe is a library for parsing expressions, including parsing expression grammars (PEGs). It aims to join the expressive power of parsing expressions with the familiarity of regular expressions. For example:
>>> import pe
>>> m = pe.match(r'["] (!["\\] . / "\\" .)* ["]',
... '"escaped \\"string\\"" ...')
>>> m.group()
'"escaped \\"string\\""'
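For readers coming from regular expressions, the same quoted-string match can be written with Python's re module. This is only a comparison sketch using the standard library, not part of pe; the pattern is an assumed rough equivalent of the parsing expression above.

```python
import re

# Rough re counterpart of  ["] (!["\\] . / "\\" .)* ["] :
# an opening quote, then any run of ordinary characters or
# backslash escapes, then a closing quote.
pattern = re.compile(r'"(?:[^"\\]|\\.)*"')

m = pattern.match('"escaped \\"string\\"" ...')
matched = m.group()
print(matched)
```

In the parsing expression, the negative lookahead `!["\\] .` plays the role of the negated character class `[^"\\]`.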
Current Status
Please note that pe is very new and is currently alpha-level software. The API or behavior may change significantly as things are finalized.
Features and Goals
- Grammar notation is backward-compatible with standard PEG, with only a few extensions
- A specification describes the semantic effect of parsing (e.g., for mapping expressions to function calls)
- Parsers are fast and memory efficient
- The API is intuitive and familiar; it's modeled on the re module of Python's standard library
- Grammar definitions and parser implementations are separate
- Optimizations target the abstract grammar definitions
- Multiple parsers are available: currently packrat for recursive descent and machine for an iterative "parsing machine", as described by Medeiros and Ierusalimschy (2008) and implemented in LPeg
Syntax Overview
pe is backward compatible with standard PEG syntax and it is conservative with extensions.
# terminals
. # any single character
"abc" # string literal
'abc' # string literal
[abc] # character class
# repeating expressions
e # exactly one
e? # zero or one (optional)
e* # zero or more
e+ # one or more
# combining expressions
e1 e2 # sequence of e1 and e2
e1 / e2 # ordered choice of e1 and e2
(e) # subexpression
# lookahead
&e # positive lookahead
!e # negative lookahead
# (extension) raw substring
~e # result of e is matched substring
# (extension) binding
name:e # bind result of e to 'name'
# grammars
Name <- ... # define a rule named 'Name'
... <- Name # refer to rule named 'Name'
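To make the semantics of these operators concrete, here is a minimal sketch of a few of them as plain Python functions. It is a toy illustration, not pe's implementation: every name in it (lit, dot, seq, choice, star, not_) is hypothetical, and each parser simply maps a string and a position to the end position, or None on failure.

```python
# Toy PEG combinators (hypothetical names, not pe's API).
# A parser is a function (s, pos) -> end position, or None on failure.

def lit(text):                       # "abc"
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def dot(s, pos):                     # .
    return pos + 1 if pos < len(s) else None

def seq(*parsers):                   # e1 e2
    def parse(s, pos):
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):                # e1 / e2
    def parse(s, pos):
        for p in parsers:
            end = p(s, pos)
            if end is not None:
                return end           # the first alternative that succeeds wins
        return None
    return parse

def star(p):                         # e*
    def parse(s, pos):
        while True:
            end = p(s, pos)
            if end is None or end == pos:
                return pos           # stop on failure or on an empty match
            pos = end
    return parse

def not_(p):                         # !e  (consumes no input)
    def parse(s, pos):
        return pos if p(s, pos) is None else None
    return parse

# The quoted-string expression  ["] (!["\\] . / "\\" .)* ["]
quote, backslash = lit('"'), lit('\\')
string = seq(
    quote,
    star(choice(seq(not_(choice(quote, backslash)), dot),
                seq(backslash, dot))),
    quote,
)

text = '"escaped \\"string\\"" ...'
end = string(text, 0)
print(text[:end])

# Ordered choice commits to the first success, so "ab" is never tried.
ordered = choice(lit('a'), lit('ab'))('ab', 0)
```

The last line shows the main difference from regular-expression alternation in practice: the order of alternatives matters, and a later alternative is only tried when the earlier ones fail.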
Matching Inputs with Parsing Expressions
When a parsing expression matches an input, it returns a Match object, which is similar to those of Python's re module for regular expressions. By default, matching terminals produces no value, but the raw (~) operator returns the substring matched by its expression, similar to a regular expression's capturing groups:
>>> e = pe.compile(r'[0-9] [.] [0-9]')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
()
>>> e = pe.compile(r'~([0-9] [.] [0-9])')
>>> m = e.match('1.4')
>>> m.group()
'1.4'
>>> m.groups()
('1.4',)
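For comparison, the same contrast can be reproduced with Python's standard re module, where a pattern without a capturing group also yields an empty groups() tuple. This sketch uses only re and makes no claims about pe itself.

```python
import re

# Without a capturing group: the match succeeds but groups() is empty.
plain = re.compile(r'[0-9][.][0-9]').match('1.4')

# With a capturing group: the matched substring shows up in groups(),
# much as the raw (~) operator does in the example above.
captured = re.compile(r'([0-9][.][0-9])').match('1.4')

print(plain.group(), plain.groups())
print(captured.group(), captured.groups())
```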
Value Bindings
A value binding takes a sub-match (e.g., of a sequence, choice, or
repetition) and extracts it from the match's value while associating
it with a name that is made available in the Match.groupdict()
dictionary.
>>> e = pe.compile(r'~[0-9] x:(~[.]) ~[0-9]')
>>> m = e.match('1.4')
>>> m.groups()
('1', '4')
>>> m.groupdict()
{'x': '.'}
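Readers used to re may notice a difference here: in re, a named group still appears in groups(), whereas the example above shows the bound value being extracted from the positional values. The following sketch uses only the standard re module to show that difference and to emulate the extraction; the variable names are illustrative.

```python
import re

m = re.match(r'([0-9])(?P<x>[.])([0-9])', '1.4')

# In re, named groups are also positional groups.
all_groups = m.groups()
named = m.groupdict()

# Emulate the extraction described above by dropping named groups
# from the positional values.
named_indexes = set(m.re.groupindex.values())
positional = tuple(
    value
    for index, value in enumerate(m.groups(), start=1)
    if index not in named_indexes
)

print(all_groups, positional, named)
```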
Actions
Actions are functions that are called on a match as follows:
action(*match.groups(), **match.groupdict())
While you can define your own functions that follow this signature, pe provides some helper functions for common operations, such as pack(func), which packs the *args into a list and calls func(args), or join(func, sep=''), which joins all *args into a string with sep.join(args) and calls func(argstring).
The return value of the action becomes the value of the expression. Note that the return value of Match.groups() is always an iterable while Match.value() can return a single object.
>>> from pe.actions import join
>>> e = pe.compile(r'~([0-9] [.] [0-9])',
... actions={'Start': float})
>>> m = e.match('1.4')
>>> m.groups()
(1.4,)
>>> m.groupdict()
{}
>>> m.value()
1.4
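The calling convention and the two helpers described above can be sketched in plain Python. These are hypothetical stand-ins written from the prose description only, not pe's actual implementations; in particular, ignoring keyword arguments inside the helpers is an assumption.

```python
def apply_action(action, groups, groupdict):
    """Call an action the way the text describes:
    action(*match.groups(), **match.groupdict())."""
    return action(*groups, **groupdict)

def pack(func):
    """Stand-in for pack(func): collect *args into a list, call func(args)."""
    def action(*args, **kwargs):     # kwargs are ignored (assumption)
        return func(list(args))
    return action

def join(func, sep=''):
    """Stand-in for join(func, sep=''): call func(sep.join(args))."""
    def action(*args, **kwargs):     # kwargs are ignored (assumption)
        return func(sep.join(args))
    return action

# Three separately captured pieces joined into one string, then converted.
number = apply_action(join(float), ('1', '.', '4'), {})

# Positional values packed into a single list argument.
total = apply_action(pack(sum), (1, 2, 3), {})

# A hand-written action that uses both positional and bound values.
def describe(*digits, x=None):
    return x.join(digits)

described = apply_action(describe, ('1', '4'), {'x': '.'})
print(number, total, described)
```

Wrapping func in a closure keeps the action signature uniform, so any helper can be used wherever a plain function such as float is accepted.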
Similar Projects