Skip to main content

A library to develop scalable production rule systems for any parsing expression grammar

Project description

Syntreenet facilitates easy development of any finite domain formal theory possible in any language described by a Parsing Expression Grammar.

It provides a knowledge base in the form of a performant and scalable production system, where rules and facts interact to produce new rules and facts. Facts in this context are the top productions in the provided PEGs, and rules are fundamentally composed of a set of those facts as conditions and another set of facts as consecuences.

It’s only dependence is on the excellent Parsimonious PEG parser by Erik Rose.

Example Usage

Let’s start with a simple grammar with which we can build triples consisting on a subject, a predicate, and an object, with 2 predicates, ∈ and ⊆, thus producing classifications of things. To make this example simpler, we will use the ASCII string “element-of” in place of unicode ∈, and “subset-of” in place of ⊆.

The grammar for this might be something like:

fact        = word ws pred ws word
pred        = element / subset
element     = "element-of"
subset      = "subset-of"
word        = ~"[a-z0-9]+"
ws          = ~"\s+"

With this, we can have “facts” such as:

a element-of b
b subset-of c
c subset-of d

On top of this language, we might want a logic where, if you add the previous 3 facts to a knowledge base, you would also have that:

a element-of c
a element-of d
b subset-of d

For this, we need a logic in which variables range over the “word” productions. So we modify the grammar substituting the “word” rule with:

word        = v_word / __var__
v_word      = ~"[a-z0-9]+"

With this grammar we can now build a knowledge base, and add rules appropriate for our purposes:

>>> from syntreenet import KnowledgeBase

>>> grammar = """
>>>    fact        = word ws pred ws word
>>>    pred        = element / subset
>>>    element     = "element-of"
>>>    subset      = "subset-of"
>>>    word        = v_word / __var__
>>>    v_word      = ~"[a-z0-9]+"
>>>    ws          = ~"\s+"
>>> """

>>> kb = KnowledgeBase(grammar)

>>> kb.tell("X1 element-of X2 ; X2 subset-of X3 -> X1 element-of X3")
>>> kb.tell("X1 subset-of X2 ; X2 subset-of X3 -> X1 subset-of X3")

Now we can add facts to the knowledge base, and query for them or for their conscuences:

>>> kb.tell("a element-of b")
>>> kb.tell("b subset-of c")
>>> kb.tell("c subset-of d")

>>> kb.query("a element-of b")
True
>>> kb.query("a element-of d")
True
>>> kb.query("c element-of d")
False
>>> kb.query("X1 subset-of d")
[{'X1': 'b'}, {'X1': 'c'}]

Goals

Scalability:

Adding new facts or rules is essentially O(1) in the number of rules plus facts already present in the knowledge base. Theoretically, this is due to the fact that the DAGs that hold the data are only ever searched by consulting Python dictionaries. Practically, I am getting a fairly constant value of a couple tenths of a millisecond per fact (this will depend on the complexity of the grammar), up to the capacity of my laptop (totalling around 2 million facts and rules).

Universality:

The “free” in the heading caption is in the sense of a “free object” over the “set of PEGs”: syntreenet knows nothing about the grammar underlying the particular logic it deals with at any particular moment.

Clear and tested code:

The code follows best practices for readability and is tested with 99% coverage including branch analysis.

Detailed Usage

Install

syntreenet is available at pypi, just use pip in a Python >= 3.7 environment:

$ pip install syntreenet
$ python
>>> import syntreenet

Test

To run the tests, you can download the sources from a mirror, create a Python environment for it, and use nose2:

$ git clone https://git.sr.ht/~enriquepablo/syntreenet
$ cd syntreenet/
$ virtualenv venv
$ source venv/bin/activate
$ python setup.py develop easy_install syntreenet[testing]
$ nose2

Grammar requirements

Note that these requirements can be overridden in the init method for KnowledgeBase.

  • The top production in the grammar must be called “fact”.

  • The productions that must be in the range of the logical variables must have a name starting with "v_".

  • These “logical” productions must happen in higher productions as alternatives to the builtin production “__var__”.

  • To make rules, 2 sets of facts (the conditions and the consecuences) must be joined by semicolons, and joined among them with the string “ -> “.

  • The conditions and consecuences in the rules can have variables in place of “logical” productions.

  • You can query the knowledge base with facts, that can also contain variables.

  • Variables start with an “X”, followed by any number of digits.

  • No grammar production can have a name starting and ending with 2 underscores.

Basic API

The API is extremelly simple. As seen above, the entry point for syntreenet is the KnowledgeBase class. It is instantiated with a string containing a PEG appropriate for Parsimonious and subject to the restrictions stated above.

Objects of this class offer 3 methods:

  • tell: accepts a fact or a rule in the form of a string and incorporates it to the knowledge base.

  • query: accepts a fact (possibly with variables) in the form of a string, and returns whether the fact can be found in the knowledge base. If it has variables, it will return the variable substitutions that result in facts present in the knowledge base, in the form of a dict of strings to strings.

  • goal: provided with a fact, it will return the facts that would be needed to get it to the knowledge base (without directly adding it). This is a form of backtracking.

Copyright (c) 2019 by Enrique Pérez Arnaud <enrique@cazalla.net>

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

syntreenet-1.0.0b4.tar.gz (32.4 kB view details)

Uploaded Source

File details

Details for the file syntreenet-1.0.0b4.tar.gz.

File metadata

  • Download URL: syntreenet-1.0.0b4.tar.gz
  • Upload date:
  • Size: 32.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3

File hashes

Hashes for syntreenet-1.0.0b4.tar.gz
Algorithm Hash digest
SHA256 61fd569612f2fd01880105205d2f40956557de790a320149d9b421ba16503201
MD5 ac20fe8e3fc3cf5ba7de4ad6165a2506
BLAKE2b-256 f4a769983a07f4fe2a45a5ea8026c6542e7cc9f8644a1e9bc52c58af257736a3

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page