A library to develop scalable production rule systems for any parsing expression grammar
Project description
Free logics from PEGs
Syntreenet facilitates easy development of any finite domain formal theory possible in any language described by a Parsing Expression Grammar.
It provides a knowledge base in the form of a performant and scalable production system, where rules and facts interact to produce new rules and facts. Facts in this context are the top productions in the provided PEGs, and rules are fundamentally composed of a set of those facts as conditions and another set of facts as consecuences.
It’s only dependence is on the excellent Parsimonious PEG parser by Erik Rose.
Example Usage
Let’s start with a simple grammar with which we can build triples consisting on a subject, a predicate, and an object, with 2 predicates, ∈ and ⊆, thus producing classifications of things. To make this example simpler, we will use the ASCII string “element-of” in place of unicode ∈, and “subset-of” in place of ⊆.
The grammar for this might be something like:
fact = word ws pred ws word pred = element / subset element = "element-of" subset = "subset-of" word = ~"[a-z0-9]+" ws = ~"\s+"
With this, we can have facts such as:
a element-of b b subset-of c c subset-of d
On top of this language, we might want a logic where, if the previous 3 facts are added to a knowledge base, it would also have that:
a element-of c a element-of d b subset-of d
For this, we need a logic in which variables range over the “word” productions. So we modify the grammar substituting the “word” rule with:
word = v_word / __var__ v_word = ~"[a-z0-9]+"
With this grammar we can now build a knowledge base, and add rules appropriate for our purposes:
>>> from syntreenet import KnowledgeBase
>>> grammar = """
>>> fact = word ws pred ws word
>>> pred = element / subset
>>> element = "element-of"
>>> subset = "subset-of"
>>> word = v_word / __var__
>>> v_word = ~"[a-z0-9]+"
>>> ws = ~"\s+"
>>> """
>>> kb = KnowledgeBase(grammar)
>>> kb.tell("X1 element-of X2 ; X2 subset-of X3 -> X1 element-of X3")
>>> kb.tell("X1 subset-of X2 ; X2 subset-of X3 -> X1 subset-of X3")
Now we can add facts to the knowledge base, and query for them or for their conscuences:
>>> kb.tell("a element-of b")
>>> kb.tell("b subset-of c")
>>> kb.tell("c subset-of d")
>>> kb.query("a element-of b")
True
>>> kb.query("a element-of d")
True
>>> kb.query("c element-of d")
False
>>> kb.query("X1 subset-of d")
[{'X1': 'b'}, {'X1': 'c'}]
Goals
- Scalability:
Adding new facts or rules is essentially O(1) in the number of rules plus facts already present in the knowledge base. Theoretically, this is due to the fact that the DAGs that hold the data (facts and rules) are only ever searched by consulting Python dictionaries. Practically, I am getting a fairly constant value of a couple tenths of a millisecond per fact (this will depend on the complexity of the grammar), up to the capacity of my laptop (totalling around 2 million facts and rules).
- Universality:
The “free” in the heading caption is in the sense of a free object over the formal languages described by PEGs: syntreenet knows nothing about the grammar underlying the particular logic it deals with at any particular moment.
- Clear and tested code:
The code follows best practices for readability and is tested with 99% coverage including branch analysis.
Detailed Usage
Install
syntreenet is available at pypi, just use pip in a Python >= 3.7 environment:
$ pip install syntreenet $ python >>> import syntreenet
Test
To run the tests, you can download the sources from a mirror, create a Python environment for it, and use nose2:
$ git clone https://git.sr.ht/~enriquepablo/syntreenet $ cd syntreenet/ $ virtualenv venv $ source venv/bin/activate $ python setup.py develop easy_install syntreenet[testing] $ nose2
Grammar requirements
Note that these requirements can be overridden in the __init__ method for KnowledgeBase.
The top production in the grammar must be called “fact”.
The productions that must be in the range of the logical variables must have a name starting with "v_".
These “logical” productions must happen in higher productions as alternatives to the builtin production “__var__”.
To make rules, 2 sets of facts (the conditions and the consecuences) must be joined by semicolons, and joined among them with the string “ -> “.
Only conditions, consecuences, and queries can have variables in place of “logical” productions. Facts cannot.
No grammar production can have a name starting and ending with 2 underscores.
Variables start with an “X”, followed by any number of digits.
Basic API
The API is extremelly simple. As seen above, the entry point for syntreenet is the KnowledgeBase class. It is instantiated with a string containing a PEG appropriate for Parsimonious and subject to the restrictions stated above.
Objects of this class offer 3 methods:
tell(self, sentence): accepts a fact or a rule in the form of a string and incorporates it to the knowledge base.
query(self, fact): accepts a fact (possibly with variables) in the form of a string, and returns whether the fact can be found in the knowledge base. If it has variables, it will return the variable substitutions that result in facts present in the knowledge base, in the form of a dict of strings to strings.
goal(self, fact): provided with a fact, it will return the facts that would be needed to get it to the knowledge base (without directly adding it). This is a form of backward chaining.
Copyright (c) 2019 by Enrique Pérez Arnaud <enrique@cazalla.net>
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file syntreenet-1.0.0b6.tar.gz
.
File metadata
- Download URL: syntreenet-1.0.0b6.tar.gz
- Upload date:
- Size: 32.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: Python-urllib/3.7
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 17e865f6a4ac9f7896a3f52d314de0784e3dbab536b115737aa0fa99b606efc1 |
|
MD5 | bfe91ed94df2d7716086dc16fb8067d1 |
|
BLAKE2b-256 | 3ece86bca1649d21dcc94ed7729cf4da00e8d6f4dc8141b9000c9a1ab815dcbc |