A library for training probabilistic context-free grammars (PCFGs) on text and
scoring new sentences with them.
* Kasami, T. (1965). An efficient recognition and syntax analysis algorithm
for context-free languages. (No. Scientific-2). Hawaii University, Dept. of
Electrical Engineering.
# Example use
```python
>>> from bllipparser import RerankingParser
>>>
>>> from kasami import TreeScorer
>>> from kasami.normalizers import bllip
>>>
>>> # Loading WSJ-PTB3 treebank into bllip's RerankingParser
... bllip_rrp = RerankingParser.fetch_and_load('WSJ-PTB3')
>>> bllip_parse = lambda s: bllip.normalize_tree(bllip_rrp.parse(s)[0].ptb_parse)
>>>
>>> tree = bllip_parse("I am a little teapot")
>>> print(tree)
(S1 (S (NP (PRP 'I')) (VP (VBP 'am') (NP (DT 'a') (JJ 'little') (NN 'teapot')))))
>>> print(tree.format(depth=1))
	(S1
		(S
			(NP
				(PRP 'I')
			)
			(VP
				(VBP 'am')
				(NP
					(DT 'a')
					(JJ 'little')
					(NN 'teapot')
				)
			)
		)
	)
>>>
>>> for production in tree:
...     print(str(production))
...
(S1 S)
(S NP VP)
(NP PRP)
(PRP 'I')
(VP VBP NP)
(VBP 'am')
(NP DT JJ NN)
(DT 'a')
(JJ 'little')
(NN 'teapot')
>>> sentences = ["I am a little teapot",
...              "Here is my handle",
...              "Here is my spout",
...              "When I get all steamed up I just shout tip me over and pour me out",
...              "I am a very special pot",
...              "It is true",
...              "Here is an example of what I can do",
...              "I can turn my handle into a spout",
...              "Tip me over and pour me out"]
>>>
>>>
>>> teapot_grammar = TreeScorer.from_tree_bank(bllip_parse(s) for s in sentences)
>>>
>>> teapot_grammar.score(bllip_parse("Here is a little teapot"))
-9.392661928770137
>>> teapot_grammar.score(bllip_parse("It is my handle"))
-10.296301543090733
>>> teapot_grammar.score(bllip_parse("I am a spout"))
-10.40166205874856
>>> teapot_grammar.score(bllip_parse("Your teapot is gay"))
-12.96352974967269
>>> teapot_grammar.score(bllip_parse("Your mom's teapot is asldasnldansldal"))
-19.424997926026403
```
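For intuition about what a score like the ones above represents, here is a minimal, self-contained sketch of PCFG scoring. It is not kasami's implementation: the nested-tuple tree format, the `productions` helper, the `ToyScorer` class, the natural-log base, and the fixed `unseen_log_prob` penalty are all assumptions made for illustration. Each production's probability is estimated as its count divided by the count of its left-hand symbol, and a tree's score is the sum of the log probabilities of its productions.

```python
import math
from collections import Counter

# Hypothetical tree format: (symbol, child, child, ...).
# A child that is a plain string is a token (terminal).
def productions(tree):
    """Yield (lhs, rhs) pairs in pre-order, one per internal node."""
    symbol, *children = tree
    # Terminals are stored via repr() so the token 'I' cannot collide
    # with a non-terminal symbol named I.
    rhs = tuple(repr(c) if isinstance(c, str) else c[0] for c in children)
    yield symbol, rhs
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


class ToyScorer:
    """Illustrative PCFG scorer; not the kasami TreeScorer."""

    def __init__(self, treebank, unseen_log_prob=-10.0):
        self.rule_counts = Counter()
        self.lhs_counts = Counter()
        # Assumed penalty for productions never seen in training.
        self.unseen_log_prob = unseen_log_prob
        for tree in treebank:
            for lhs, rhs in productions(tree):
                self.rule_counts[(lhs, rhs)] += 1
                self.lhs_counts[lhs] += 1

    def score(self, tree):
        """Sum of log P(rhs | lhs) over every production in the tree."""
        total = 0.0
        for lhs, rhs in productions(tree):
            count = self.rule_counts[(lhs, rhs)]
            if count == 0:
                total += self.unseen_log_prob
            else:
                total += math.log(count / self.lhs_counts[lhs])
        return total


teapot = ("S1",
          ("S",
           ("NP", ("PRP", "I")),
           ("VP", ("VBP", "am"),
            ("NP", ("DT", "a"), ("JJ", "little"), ("NN", "teapot")))))

# Same structure, but with a pronoun the toy grammar has never seen.
other = ("S1",
         ("S",
          ("NP", ("PRP", "You")),
          ("VP", ("VBP", "am"),
           ("NP", ("DT", "a"), ("JJ", "little"), ("NN", "teapot")))))

scorer = ToyScorer([teapot])
print(scorer.score(teapot))
print(scorer.score(other))
```

Trees built from productions that were frequent in the training data score closer to zero, while a tree containing an unseen production pays the fixed penalty. How kasami itself treats unseen productions is not stated here, so the penalty should be read as a placeholder.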
# Author
* Aaron Halfaker -- https://github.com/halfak
... and substantially informed by https://github.com/aetilley