PENMAN notation for graphs (e.g., AMR)

This package models graphs encoded in PENMAN notation (e.g., AMR), such as the following for "The boy wants to go":

(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go
            :ARG0 b))

The Penman package may be used as a Python library or as a script.

Features

  • Read and write PENMAN-serialized graphs or triple conjunctions
  • Read metadata in comments (e.g., # ::id 1234)
  • Read surface alignments (e.g., foo~e.1,2)
  • Inspect and manipulate the graph or tree structures
  • Customize graphs for writing:
    • adjust indentation and compactness
    • select a new top node
    • rearrange edges (partially implemented)
    • restructure the tree shape
  • Transform the graph
    • Canonicalize roles
    • Reify edges
    • Reify attributes
    • Embed the tree structure with additional TOP triples
  • AMR model: role inventory and transformations
  • Tested (but not yet 100% coverage)
  • Documented (see the documentation)
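
As a rough illustration of the metadata and alignment features listed above, the sketch below pulls `# ::key value` pairs out of comment lines and splits an aligned token such as foo~e.1,2 into its parts. It uses only the standard `re` module. `read_metadata` and `split_alignment` are hypothetical names for this example, not functions from the Penman API, and handling only one key per comment line is a simplifying assumption.

```python
import re

# Hypothetical helpers for illustration only; these are not part of the Penman API.
METADATA_LINE = re.compile(r'^#\s*::(\S+)\s*(.*)$')
ALIGNMENT = re.compile(
    r'^(?P<token>[^~]+)~(?P<prefix>[a-zA-Z]\.?)?(?P<indices>\d+(?:,\d+)*)$'
)


def read_metadata(lines):
    """Collect `# ::key value` comment lines into a dict (one key per line assumed)."""
    metadata = {}
    for line in lines:
        match = METADATA_LINE.match(line.strip())
        if match:
            key, value = match.groups()
            metadata[key] = value.strip()
    return metadata


def split_alignment(token):
    """Split 'foo~e.1,2' into ('foo', 'e.', (1, 2)).

    A token without an alignment is returned as (token, None, ()).
    """
    match = ALIGNMENT.match(token)
    if match is None:
        return token, None, ()
    indices = tuple(int(i) for i in match.group('indices').split(','))
    return match.group('token'), match.group('prefix'), indices
```

Lines that are not metadata comments, such as the graph itself, are simply ignored by `read_metadata`.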

Library Usage

>>> import penman
>>> g = penman.decode('(b / bark :ARG0 (d / dog))')
>>> g.triples
[('b', ':instance', 'bark'), ('b', ':ARG0', 'd'), ('d', ':instance', 'dog')]
>>> print(penman.encode(g))
(b / bark
   :ARG0 (d / dog))
>>> print(penman.encode(g, top='d', indent=6))
(d / dog
      :ARG0-of (b / bark))
>>> print(penman.encode(g, indent=False))
(b / bark :ARG0 (d / dog))
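
In the `top='d'` call above, the edge from b to d is written from the other end, so :ARG0 appears as :ARG0-of. The following is a minimal sketch of that inversion on plain tuples. `invert_role` and `invert_triple` are hypothetical names, not the library's API, and the sketch assumes every role ending in -of is an inverted role, which is not true of all AMR roles (e.g., :consist-of).

```python
def invert_role(role):
    """Toggle the '-of' suffix: ':ARG0' <-> ':ARG0-of'.

    Assumption for this sketch: any role ending in '-of' is an inverted role.
    """
    if role.endswith('-of'):
        return role[:-len('-of')]
    return role + '-of'


def invert_triple(triple):
    """Swap source and target and invert the role of a (source, role, target) triple."""
    source, role, target = triple
    return (target, invert_role(role), source)


# The edge from the example above, seen from 'd' instead of 'b'.
inverted = invert_triple(('b', ':ARG0', 'd'))
```

Inverting twice returns the original triple, which is why the same graph can be serialized from either top node.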

Script Usage

$ penman --help
usage: penman [-h] [-V] [--model FILE | --amr] [--indent N] [--compact]
              [--triples] [--canonicalize-roles] [--reify-edges]
              [--reify-attributes] [--indicate-branches]
              [FILE [FILE ...]]

Read and write graphs in the PENMAN notation.

positional arguments:
  FILE                  read graphs from FILEs instead of stdin

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  --model FILE          JSON model file describing the semantic model
  --amr                 use the AMR model

formatting options:
  --indent N            indent N spaces per level ("no" for no newlines)
  --compact             compactly print node attributes on one line
  --triples             print graphs as triple conjunctions

normalization options:
  --canonicalize-roles  canonicalize role forms
  --reify-edges         reify all eligible edges
  --reify-attributes    reify all attributes
  --indicate-branches   insert triples to indicate tree structure

$ penman <<< "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go :ARG0 b))"
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go
            :ARG0 b))

Requirements

  • Python 3.6+

PENMAN Notation

The PENMAN project was a large effort at natural language generation. What I'm calling "PENMAN notation" is more accurately "Sentence Plan Language" (SPL; Kasper 1989), but I'll stick with "PENMAN notation": the name may be more familiar to modern users, and it sounds less specific to sentence representations, which matters if someone wants to use the format to encode arbitrary graphs.

This module expands the notation slightly to allow for untyped nodes (e.g., (x)) and anonymous relations (e.g., (x : (y))). A PEG* definition for the notation is given below (for simplicity, whitespace is not explicitly included; assume all nonterminals can be surrounded by /\s+/):

# Syntactic productions
Start     <- Node
Node      <- '(' Variable NodeLabel? Relation* ')'
NodeLabel <- '/' Concept Alignment?
Concept   <- Atom
Relation  <- Role Alignment? (Edge | Attribute)
Edge      <- Variable Alignment?
           | Node
Attribute <- Atom Alignment?
Atom      <- String | Float | Integer | Symbol
Variable  <- Symbol
# Lexical productions
Role      <- /:[^\s()\/,:~]*/
String    <- /"[^"\\]*(?:\\.[^"\\]*)*"/
Float     <- /[-+]?(((\d+\.\d*|\.\d+)([eE][-+]?\d+)?)|\d+[eE][-+]?\d+)/
Integer   <- /[-+]?\d+(?=[ )\/:])/
Symbol    <- /[^\s()\/,:~]+/
Alignment <- /~([a-zA-Z]\.?)?\d+(,\d+)*/

* Note: I use | above for ordered-choice instead of / so that / can be used to surround regular expressions.
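
To make the lexical productions concrete, here is a minimal tokenizer sketch that transcribes them into Python regular expressions. It is not the library's lexer. The token names, the three structural tokens, and the `tokenize` function are choices made for this example, and the order of the alternatives stands in for the grammar's ordered choice.

```python
import re

# Lexical productions from the grammar above as Python regular expressions.
# Inner groups are non-capturing so that only the named groups identify tokens.
TOKEN_PATTERNS = [
    ('LPAREN',    r'\('),
    ('RPAREN',    r'\)'),
    ('SLASH',     r'/'),
    ('ROLE',      r':[^\s()\/,:~]*'),
    ('STRING',    r'"[^"\\]*(?:\\.[^"\\]*)*"'),
    ('ALIGNMENT', r'~(?:[a-zA-Z]\.?)?\d+(?:,\d+)*'),
    ('FLOAT',     r'[-+]?(?:(?:\d+\.\d*|\.\d+)(?:[eE][-+]?\d+)?|\d+[eE][-+]?\d+)'),
    ('INTEGER',   r'[-+]?\d+(?=[ )\/:])'),
    ('SYMBOL',    r'[^\s()\/,:~]+'),
]
TOKEN_RE = re.compile(
    '|'.join(f'(?P<{name}>{pattern})' for name, pattern in TOKEN_PATTERNS)
)


def tokenize(text):
    """Return (token_type, token_text) pairs; whitespace between tokens is skipped."""
    return [(match.lastgroup, match.group()) for match in TOKEN_RE.finditer(text)]


tokens = tokenize('(b / bark :ARG0 (d / dog))')
```

Because `finditer` skips anything that matches no pattern, this sketch silently ignores stray characters; a real lexer would report them as errors.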

The grammar is ambiguous in that both Edge and Attribute can resolve their first token to Symbol (via the Variable and Atom productions, respectively); they are kept as separate productions because they differ semantically. A Variable is distinguished from other Symbol tokens simply by its use as the identifier of a node. Examples of non-variable Symbol tokens in AMR are concepts (e.g., want-01), - in :polarity -, expressive in :mode expressive, etc.
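
One way to picture the distinction is to resolve it after parsing, on the triples themselves. The sketch below treats every triple source as a variable and then sorts the remaining triples into edges and attributes by whether the target is one of those variables. `classify_triples` is a hypothetical helper, not part of the library, and it assumes each node is the source of at least one triple.

```python
def classify_triples(triples):
    """Split (source, role, target) triples into variables, edges, and attributes.

    Assumption for this sketch: every node is the source of at least one triple,
    so the set of sources is the set of variables.
    """
    variables = {source for source, _, _ in triples}
    edges, attributes = [], []
    for triple in triples:
        _, role, target = triple
        if role == ':instance':
            continue  # node labels (concepts) are neither edges nor attributes
        if target in variables:
            edges.append(triple)
        else:
            attributes.append(triple)
    return variables, edges, attributes


triples = [
    ('w', ':instance', 'want-01'),
    ('w', ':polarity', '-'),
    ('w', ':ARG0', 'b'),
    ('b', ':instance', 'boy'),
]
variables, edges, attributes = classify_triples(triples)
```

Here b is a variable because it identifies a node, so (w, :ARG0, b) is an edge, while - never identifies a node, so (w, :polarity, -) is an attribute.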

The above grammar is for the PENMAN graphs that this library supports, but AMR is more restrictive. A variant for AMR might make the NodeLabel nonterminal required on Node, change Atom to Symbol on Concept, change Role to require at least one character after : or even spell out all valid roles, and change Variable to a form like /[a-z]+[0-9]*/. See also Nathan Schneider's PEG for AMR.
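
Two of the AMR restrictions suggested above are easy to sketch as stricter regular expressions. The names below are hypothetical and this is not an official AMR grammar: the role pattern only changes * to + so that at least one character must follow the colon, and the variable pattern is the form given in the paragraph above.

```python
import re

# Stricter, AMR-flavored variants of two lexical productions (illustrative only).
AMR_ROLE = re.compile(r':[^\s()\/,:~]+')   # '+' instead of '*': a bare ':' is rejected
AMR_VARIABLE = re.compile(r'[a-z]+[0-9]*')


def is_amr_role(token):
    """True if the whole token is a role with at least one character after ':'."""
    return AMR_ROLE.fullmatch(token) is not None


def is_amr_variable(symbol):
    """True if the whole symbol has the form of lowercase letters then optional digits."""
    return AMR_VARIABLE.fullmatch(symbol) is not None
```

Note that the variable pattern alone cannot tell a variable from a concept such as boy; as described earlier, that still depends on whether the symbol is used as a node identifier.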

Disclaimer

This project is not affiliated with ISI, the PENMAN project, or the AMR project.
