pluggable command-line tool for validating the formatting and orthography of text files
text-validator
You configure the validator plugins with a TOML file like:
["text_validator.plugins.whitespace"]
CHECK_CRLF = true
CHECK_TABS = true
CHECK_TRAILING_WHITESPACE = true
CHECK_NO_EOF_NEWLINE = true
["text_validator.plugins.unicode"]
CONFIRM_UTF_8_NFC = true
["text_validator.plugins.ref_line_format"]
REF_REGEX = "(\\d+|EP|SB)\\.\\d+(\\.\\d+)?$" # example from AF
["text_validator.plugins.characters"]
REPLACE_CHARS = [
    # bad character, suggested replacement
    ["\u02BC", "\u2019"],
    ["\u1FBF", "\u2019"],
    ["\u037E", "\u003B"],
    ["\u0387", "\u00B7"],
    ["\u0374", "\u02B9"],
    ["\u03D5", "\u03C6"],
    ["\u03D1", "\u03B8"],
]
and the plugins will validate the texts you give them:
tests/test_0001.txt:1:line ends with CRLF
tests/test_0001.txt:2:line ends with CRLF
tests/test_0002.txt:1:no newline at end of file
tests/test_0003.txt:1:line contains a tab
tests/test_0004.txt:1:trailing whitespace
tests/test_0006.txt:1:not NFC
tests/test_0007.txt:2:BLANK LINE
tests/test_0008.txt:1:BAD WHITESPACE
tests/test_0008.txt:2:BAD WHITESPACE
tests/test_0009.txt:4:BAD REFERENCE FORM
tests/test_0009.txt:5:BAD REFERENCE FORM
tests/test_0010.txt:2:29:bad U+02BC; consider replacing with U+2019
tests/test_0010.txt:3:29:bad U+1FBF; consider replacing with U+2019
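To make the report format concrete, here is a minimal sketch of checks that produce messages shaped like the ones above. It is not the package's own plugin code: the names `check_line` and `check_text` are hypothetical, only two of the configured `REPLACE_CHARS` pairs are included, and the column is assumed to be a 1-based character position.

```python
import io
import unicodedata

# Subset of the REPLACE_CHARS table from the config above: bad -> suggested.
REPLACE_CHARS = {"\u02BC": "\u2019", "\u1FBF": "\u2019"}


def check_line(line):
    """Yield (column, message) pairs for one raw line, line ending included.

    column is None for whole-line problems and 1-based for character problems.
    """
    if line.endswith("\r\n"):
        yield None, "line ends with CRLF"
    body = line.rstrip("\r\n")
    if "\t" in body:
        yield None, "line contains a tab"
    if body != body.rstrip():
        yield None, "trailing whitespace"
    if unicodedata.normalize("NFC", body) != body:
        yield None, "not NFC"
    for column, char in enumerate(body, start=1):
        if char in REPLACE_CHARS:
            suggestion = REPLACE_CHARS[char]
            yield column, (
                f"bad U+{ord(char):04X}; "
                f"consider replacing with U+{ord(suggestion):04X}"
            )


def check_text(name, text):
    """Return report lines in the name:line[:column]:message shape."""
    report = []
    count = 0
    # newline="" keeps the original line endings so CRLF can be detected.
    for count, line in enumerate(io.StringIO(text, newline=""), start=1):
        for column, message in check_line(line):
            where = f"{name}:{count}"
            if column is not None:
                where = f"{where}:{column}"
            report.append(f"{where}:{message}")
    if text and not text.endswith("\n"):
        report.append(f"{name}:{count}:no newline at end of file")
    return report
```

Calling `check_text("sample.txt", "a\r\nb\tc\n")` would report a CRLF ending on line 1 and a tab on line 2. The real plugins may order, word, or number their findings differently.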
To install:
pip install text-validator
Then you can either run from the command line:
validate-text tests/config_004.toml tests/test_0007.txt tests/test_0008.txt tests/test_0009.txt
or programmatically from Python, either with the helper function validate:
from text_validator.main import validate
validate("tests/config_003.toml", ["tests/test_0005.txt", "tests/test_0006.txt"])
or by working directly with a Suite instance:
from text_validator.base import Suite
suite = Suite()
suite.load_toml("tests/config_002.toml")
suite.validate_files(["tests/test_0005.txt", "tests/test_0006.txt"])
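If you want to validate a whole directory tree rather than listing files by hand, something like the following sketch may help. The helpers `collect_text_files` and `validate_tree` are hypothetical names, not part of text-validator; the only package calls used are the `Suite`, `load_toml`, and `validate_files` shown above, and the `*.txt` pattern is an assumption about how your texts are named.

```python
from pathlib import Path


def collect_text_files(root, pattern="*.txt"):
    """Return the sorted paths (as strings) of files under root matching pattern."""
    return sorted(str(path) for path in Path(root).rglob(pattern) if path.is_file())


def validate_tree(config_path, root):
    """Validate every matching file under root with a single Suite."""
    # Imported lazily so collect_text_files works without text-validator installed.
    from text_validator.base import Suite

    suite = Suite()
    suite.load_toml(config_path)
    suite.validate_files(collect_text_files(root))
```

For example, `validate_tree("tests/config_002.toml", "tests")` would run the configured plugins over every `.txt` file under `tests/`.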
Also see:
Download files
Source Distribution
text-validator-0.2.tar.gz (5.0 kB)
Built Distribution
text_validator-0.2-py3-none-any.whl (6.8 kB)
File details
Details for the file text-validator-0.2.tar.gz.
File metadata
- Download URL: text-validator-0.2.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | ac3d87a5bbe2fab5ab3fd37c75ef51605d6f18e7f5bd976154d5beaef47ea9df
MD5 | 987d20468cfc984c88c500eb51d8c3e3
BLAKE2b-256 | db44a7a5f988243eb23e676d9bfcba686455e23f8b0659a33e083b69972e7579
File details
Details for the file text_validator-0.2-py3-none-any.whl.
File metadata
- Download URL: text_validator-0.2-py3-none-any.whl
- Upload date:
- Size: 6.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | afb37ced4f2d6bdc3dac95203edc72f4c1b055dbf788e9269cfde3d2dfb5e32d
MD5 | 32605a960e82376aa9bb5b68a7a6faa3
BLAKE2b-256 | b1ce1d35c0e3efc4b12cfeaeab2acb510c763350012c69af4159cd05d016a0be