A native Python2/3 reader module for the SWORD Project Bible Modules
Project description
A native Python reader of the SWORD Project Bible Modules
This project is not an official CrossWire project. It merely provides an alternative way to read the Bible modules created by CrossWire's SWORD Project.
Features
Read SWORD bibles (not commentaries etc.)
Detection of locally installed bible modules.
Supports all known SWORD module formats (ztext, ztext4, rawtext, rawtext4)
Read from zipped modules, like those available from http://www.crosswire.org/sword/modules/ModDisp.jsp?modType=Bibles
Clean text of OSIS, GBF or ThML tags.
Supports both Python 2 and 3 (tested with 2.7 and 3.5)
License
Since parts of the code are derived and/or copied from the SWORD project (see canons.py), which is GPL2, this code is also under the GPL2 license.
Installation
PySword's source code can be downloaded from PySword's release list, but it is also available from PyPI for installation using pip or easy_install. It is also available for Arch Linux (AUR), and will soon be available as a package in Debian and Fedora.
Example code
Use modules from default datapath
from pysword.modules import SwordModules
# Find available modules/bibles in standard data path.
# For non-standard data path, pass it as an argument to the SwordModules constructor.
modules = SwordModules()
# In this case we'll assume the modules found are something like:
# {'KJV': {'description': 'KingJamesVersion(1769)withStrongsNumbersandMorphology', 'encoding': 'UTF-8', ...}}
found_modules = modules.parse_modules()
bible = modules.get_bible_from_module('KJV')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])
Load module from zip-file
from pysword.modules import SwordModules
# Load module in zip
# NB: the zip content is only available as long as the SwordModules object exists
modules = SwordModules('KJV.zip')
# In this case the module found is:
# {'KJV': {'description': 'KingJamesVersion(1769)withStrongsNumbersandMorphology', 'encoding': 'UTF-8', ...}}
found_modules = modules.parse_modules()
bible = modules.get_bible_from_module('KJV')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])
Manually create bible
from pysword.bible import SwordBible
# Create the bible. The arguments are:
# SwordBible(<module path>, <module type>, <versification>, <encoding>, <text formatting>)
# Only the first is required, the rest have default values which should work in most cases.
bible = SwordBible('/home/me/.sword/modules/texts/ztext/kjv/', 'ztext', 'default', 'utf8', 'OSIS')
# Get John chapter 3 verse 16
output = bible.get(books=['john'], chapters=[3], verses=[16])
Run tests
To run the test suite, first run the script that downloads the files used for testing, and then use nosetests to run the tests:
$ python tests/resources/download_bibles.py
$ nosetests -v tests/
The tests should run and pass using both Python 2 and 3.
Contributing
If you want to contribute, you are most welcome to do so! Feel free to report issues and create merge requests at https://gitlab.com/tgc-dk/pysword. If you create a merge request, please include a test that proves that your code actually works.
Module formats
I’ll use Python’s struct module’s format strings to describe byte formatting. See https://docs.python.org/3/library/struct.html
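As a quick illustration of the notation, the sketch below packs and unpacks one record with the '<IIH' format string used later in this section. The values 7, 1024 and 120 are made up for the example and do not come from a real module.

```python
import struct

# '<' = little-endian without padding,
# 'I' = unsigned 4-byte integer, 'H' = unsigned 2-byte integer.
VERSE_RECORD = '<IIH'

# Size of one record in bytes: 4 + 4 + 2.
record_size = struct.calcsize(VERSE_RECORD)

# Pack three illustrative values into a record and unpack them again.
raw = struct.pack(VERSE_RECORD, 7, 1024, 120)
buffer_num, verse_start, verse_len = struct.unpack(VERSE_RECORD, raw)
```

Because the format string starts with '<', struct adds no alignment padding, so the record size is exactly the sum of the field sizes.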
There are currently four formats for Bible modules in SWORD.
ztext format documentation
Taking the Old Testament (OT) as an example, there are three files:
ot.bzv: Maps verses to character ranges in compressed buffers. 10 bytes (‘<IIH’) for each verse in the Bible:
buffer_num (I): which compressed buffer the verse is located in
verse_start (I): the location in the uncompressed buffer where the verse begins
verse_len (H): length of the verse, in uncompressed characters
These 10-byte records are densely packed, indexed by VerseKey ‘indices’ (docs later). So the record for the verse with index x starts at byte 10*x.
ot.bzs: Tells where the compressed buffers start and end. 12 bytes (‘<III’) for each compressed buffer:
offset (I): where the compressed buffer starts in the file
size (I): the length of the compressed data, in bytes
uc_size (I): the length of the uncompressed data, in bytes (unused)
These 12-byte records are densely packed, indexed by buffer_num (see previous). So the record for compressed buffer buffer_num starts at byte 12*buffer_num.
ot.bzz: Contains the compressed text. Read ‘size’ bytes starting at ‘offset’.
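The lookup described above can be sketched as follows. This is a minimal illustration, not PySword's actual implementation: the function names are hypothetical, the buffers are assumed to be zlib-compressed, and verse_start/verse_len are treated as byte offsets into the uncompressed buffer. The demo data is built in memory rather than read from a real module.

```python
import io
import struct
import zlib

VERSE_FMT = '<IIH'   # buffer_num, verse_start, verse_len (10 bytes)
BUFFER_FMT = '<III'  # offset, size, uc_size (12 bytes)


def read_ztext_verse(bzv, bzs, bzz, index):
    """Return the raw bytes of the verse with the given index.

    bzv, bzs and bzz are binary file-like objects for the three files.
    """
    # Verse records are densely packed, so record x starts at 10*x.
    verse_size = struct.calcsize(VERSE_FMT)
    bzv.seek(verse_size * index)
    buffer_num, verse_start, verse_len = struct.unpack(
        VERSE_FMT, bzv.read(verse_size))

    # Buffer records are densely packed, so record n starts at 12*n.
    buffer_size = struct.calcsize(BUFFER_FMT)
    bzs.seek(buffer_size * buffer_num)
    offset, size, _uc_size = struct.unpack(
        BUFFER_FMT, bzs.read(buffer_size))

    # Read 'size' bytes at 'offset' and decompress them.
    # Assumption: the buffer is zlib-compressed.
    bzz.seek(offset)
    text = zlib.decompress(bzz.read(size))
    return text[verse_start:verse_start + verse_len]


def build_demo_files():
    """Build tiny in-memory stand-ins for the bzv, bzs and bzz files."""
    buffers = [b'In the beginning.And the earth.', b'And God said.']
    bzz = b''
    bzs = b''
    for plain in buffers:
        packed = zlib.compress(plain)
        bzs += struct.pack(BUFFER_FMT, len(bzz), len(packed), len(plain))
        bzz += packed
    # (buffer_num, verse_start, verse_len) for three made-up verses.
    verses = [(0, 0, 17), (0, 17, 14), (1, 0, 13)]
    bzv = b''.join(struct.pack(VERSE_FMT, *v) for v in verses)
    return io.BytesIO(bzv), io.BytesIO(bzs), io.BytesIO(bzz)
```

With real files, the three arguments would be the ot.bzv, ot.bzs and ot.bzz files opened in binary mode. The returned bytes still contain any OSIS, GBF or ThML markup and need decoding with the module's encoding.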
ztext4 format documentation
ztext4 is the same as ztext, except that in the bzv file verse_len is represented by a 4-byte integer (I), making each record 12 bytes in all.
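In struct terms, the only change is the verse record's format string. The sketch below shows the two layouts side by side; the helper name and the sample values are hypothetical and only serve to show a verse length that would not fit in a 2-byte field.

```python
import struct

ZTEXT_VERSE_FMT = '<IIH'   # ztext: 2-byte verse_len, 10-byte record
ZTEXT4_VERSE_FMT = '<III'  # ztext4: 4-byte verse_len, 12-byte record


def unpack_verse_record(data, index, fmt):
    """Unpack (buffer_num, verse_start, verse_len) for verse `index`.

    `data` holds the bytes of a bzv file; `fmt` selects the layout.
    """
    return struct.unpack_from(fmt, data, struct.calcsize(fmt) * index)


# Two illustrative ztext4 records; 70000 exceeds the 2-byte maximum (65535).
demo = (struct.pack(ZTEXT4_VERSE_FMT, 0, 0, 1)
        + struct.pack(ZTEXT4_VERSE_FMT, 3, 500, 70000))
record = unpack_verse_record(demo, 1, ZTEXT4_VERSE_FMT)
```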
rawtext format documentation
Again using the OT as an example, there are two files:
ot.vss: Maps verses to character ranges in text file. 6 bytes (‘<IH’) for each verse in the Bible:
verse_start (I): the location in the textfile where the verse begins
verse_len (H): length of the verse, in characters
ot: Contains the text. Read ‘verse_len’ characters starting at ‘verse_start’.
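A minimal sketch of this lookup is shown below. It assumes the vss records are densely packed and indexed by verse index in the same way as the ztext records, and it treats verse_start and verse_len as byte counts. The function name and the in-memory demo data are hypothetical.

```python
import io
import struct

RAW_VERSE_FMT = '<IH'  # verse_start, verse_len (6 bytes)


def read_rawtext_verse(vss, text_file, index):
    """Return the raw bytes of the verse with the given index.

    vss and text_file are binary file-like objects.
    """
    # Assumption: record x starts at 6*x, as for the ztext index.
    record_size = struct.calcsize(RAW_VERSE_FMT)
    vss.seek(record_size * index)
    verse_start, verse_len = struct.unpack(
        RAW_VERSE_FMT, vss.read(record_size))

    # Read 'verse_len' bytes starting at 'verse_start'.
    text_file.seek(verse_start)
    return text_file.read(verse_len)


# In-memory stand-ins for the 'ot' text file and the 'ot.vss' index.
demo_text = b'In the beginning.And the earth.'
demo_vss = (struct.pack(RAW_VERSE_FMT, 0, 17)
            + struct.pack(RAW_VERSE_FMT, 17, 14))
second = read_rawtext_verse(io.BytesIO(demo_vss), io.BytesIO(demo_text), 1)
```

Since nothing is compressed, the text file can be read directly, which is why this format needs only two files.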
rawtext4 format documentation
rawtext4 is the same as rawtext, except that in the vss file verse_len is represented by a 4-byte integer (I), making each record 8 bytes in all.