**wikify** your texts!
*micro-framework for text wikification*
goals - avoid conflicts between text modification rules,
and be easy to extend and debug
**author**: anatoly techtonik <techtonik@gmail.com>
**license**: Public Domain
[![Build Status](https://drone.io/bitbucket.org/techtonik/wikify/status.png)](https://drone.io/bitbucket.org/techtonik/wikify/latest)
#### the problem and solution
this example is pasted from real-world replacement rules of
the Roundup issue tracker:
>>> import re
>>> rules = [
...     # link to debian bug tracker
...     (re.compile(r'debian:#(?P<id>\d+)'),
...      r'<a href="http://bugs.debian.org/\g<id>">debian#\g<id></a>'),
...     # link to local issue
...     (re.compile(r'#(?P<id>\d+)'),
...      r'<a href="issue\g<id>">#\g<id></a>'),
... ]
>>> text = "debian:#222"
>>> for search, replace in rules:
... text = search.sub(replace, text)
...
>>> text
'<a href="http://bugs.debian.org/222">debian<a href="issue222">#222</a></a>'
the expected output is:
'<a href="http://bugs.debian.org/222">debian#222</a>'
the solution:
>>> import wikify
>>> wrules = [wikify.RegexpRule(s,r) for s,r in rules]
>>> wikify.wikify("debian:#222", wrules)
'<a href="http://bugs.debian.org/222">debian#222</a>'
#### usage
1. define rules that match and process parts of text
2. text = wikify(text, rules)
a `rule` is a function (or an object with a `run()` method) that takes
text and returns either `None` (meaning no match) or the text split into
three parts `[ not-matched, processed, the-rest ]`, where the `processed`
part has already been modified by the rule.
example of a rule in action:
>>> import wikify
>>> wikify.rule_link_wikify('wikify your texts!')
('', '<a href="https://bitbucket.org/techtonik/wikify/">wikify</a>', ' your texts!')
and its source code:
    def rule_link_wikify(text):
        """ replace `wikify` text with a link to repository """
        if 'wikify' not in text:
            return None
        res = text.split('wikify', 1)
        site = 'https://bitbucket.org/techtonik/wikify/'
        url = '<a href="%s">wikify</a>' % site
        return (res[0], url, res[1])
using the rule with wikify to get processed text:
>>> from wikify import wikify, rule_link_wikify
>>> wikify('wikify your texts!', rule_link_wikify)
'<a href="https://bitbucket.org/techtonik/wikify/">wikify</a> your texts!'
you will probably want to change the URL and the searched string, so to
avoid rewriting the rule from scratch, **wikify** provides some ready-made
rule builders.
#### API
###### RegexpRule(search, replace=r'\0')
wikify rule class. `search` is a regexp, `replace` can be a string
with backreferences (like `\0`, `\1`, etc.) or a callable that receives
the match object.

    r = RegexpRule(r'(\d+)', r'[\1]')
    print(wikify('wrap list 1 2 3 45', r))
    # wrap list [1] [2] [3] [45]

in comparison to standard `re.sub`, RegexpRule expands `\0` in the
replacement template to the whole matched string.
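for comparison, plain `re.sub` rejects `\0` as a backreference and spells the whole match as `\g<0>` instead - a quick standard-library illustration, independent of wikify:

```python
import re

# in re.sub the whole match is written \g<0>, not \0
print(re.sub(r'(\d+)', r'[\g<0>]', 'wrap list 1 2 3 45'))
# wrap list [1] [2] [3] [45]
```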
###### tracker_link_rule(url)
chained function rule (a function that returns a list of rules) that
replaces references like `#123` or `issue #123` with a link to `url`
with the issue number appended.
w = tracker_link_rule('https://bitbucket.org/techtonik/wikify/issue/')
print(wikify('issue #123, Ᾱ', w))
# <a href="https://bitbucket.org/techtonik/wikify/issue/123">issue #123</a>, Ᾱ
###### wikify(text, rules)
the `rules` argument can be a single rule or a list of rules. **wikify**
ensures that text processed by one rule is not reachable by the others.
if you process text with just a series of replacement commands instead,
a later replacement may affect text that was just pasted in by a previous
one. **wikify** was made to prevent this from happening.
#### using as a Sphinx extension
**wikify** is also a Sphinx extension. the following lines, if added
to `conf.py`, will link issue numbers on the `changes` page to the bug
tracker for the `sphinx` project:
    extensions = ['wikify']

    # set up wikify extension to convert issue references to links
    from wikify import RegexpRule, tracker_link_rule
    wikify_html_rules = [
        # PR#123 or pull request #123
        RegexpRule(r'(PR|pull request\s)\s*#(\d+)',
                   r'<a href="https://bitbucket.org/birkenfeld/sphinx/pull-request/\2">\0</a>'),
        # issue #123 or just #123
        tracker_link_rule('https://bitbucket.org/birkenfeld/sphinx/issue/')
    ]
    wikify_html_pages = ['changes']
#### operation (flat algorithm)
for each region:
- find the region in the text being processed
- process the text matched by the region
- exclude the processed text from further processing
note: (flat algorithm) doesn't process nested markup,
such as:
*`bold preformatted text`*
example - replace all wiki:something with HTML links
- [x] wrap text into list with single item
- [x] split text into three parts using regexp `wiki:\w+`
- [x] copy 1st part (not-matched) into the resulting list
- [x] replace matched part with link, insert (processed)
into the resulting list
- [ ] process (the-rest) until text list doesn't change
- [x] repeat the above for the rest of rules, skipping
(processed) parts
- [x] reassemble text from the list
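the checklist above can be sketched in plain Python. this is a minimal illustration of the flat algorithm, not the library's actual implementation, and `link_rule` is a hypothetical rule made up for the example:

```python
import re

def link_rule(text):
    """Hypothetical rule: turn the first wiki:something into an HTML link."""
    m = re.search(r'wiki:(\w+)', text)
    if not m:
        return None  # not matched
    link = '<a href="%s">%s</a>' % (m.group(1), m.group(0))
    return (text[:m.start()], link, text[m.end():])

def apply_rules(text, rules):
    """Flat algorithm sketch: chunks produced by one rule are marked
    done and skipped by all later rules."""
    chunks = [(text, False)]               # wrap text into single-item list
    for rule in rules:
        out = []
        for chunk, done in chunks:
            if done:                       # skip already-processed parts
                out.append((chunk, True))
                continue
            rest = chunk
            while True:                    # process the-rest until no match
                parts = rule(rest)
                if parts is None:
                    out.append((rest, False))
                    break
                unmatched, processed, rest = parts
                out.append((unmatched, False))   # copy not-matched part
                out.append((processed, True))    # insert processed part
        chunks = out
    return ''.join(c for c, _ in chunks)   # reassemble text from the list

print(apply_rules('see wiki:Page and wiki:Other', [link_rule]))
# see <a href="Page">wiki:Page</a> and <a href="Other">wiki:Other</a>
```

the `(chunk, done)` flag is what makes processed text unreachable by later rules; the real library may track regions differently, but the guarantee is the same.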
#### roadmap
- [ ] optimize - measure performance of using indexes
instead of text chunks
- [x] write docs
- [x] upload to PyPI
#### history
- 1.5 - fixed major flaw in subst order for single rule
- 1.4 - support named group replacements in RegexpRule
- 1.3 - renamed create_tracker_link_rule to tracker_link_rule
- 1.2 - convert create_regexp_rule to RegexpRule class
- 1.1 - allow rules to be classes (necessary for Sphinx)
- 1.0 - use wikify as Sphinx extension
- 0.9 - case insensitive match in tracker link rule
- 0.8 - python 3 compatibility
- 0.7 - fixed major flaw in text replacements mapping
- 0.6 - flatten nested rule lists
- 0.5 - helper to build rules to link tracker references
- 0.4 - accept single rule in wikify in addition to list
- 0.3 - allow callables in replacements for regexp rules
- 0.2 - helper to build regexp based rules
- 0.1 - proof of concept, production ready, no API sugar and optimizations