Python API for CJK datasets. Part of the cihai project.
cihai - United front to provide open, accessible, and standardized access to CJK data
In development
Tool
A single tool for interfacing with CJK data; compare to cjklib.
API, in Python, for programmatically interfacing with data.
Compatible with Python 2.7, 3.3+, and pypy/pypy3.
Designed against a robust test suite. See Travis Builds and Revision History.
Supports Unihan. Support for character decomposition and dictionaries (CEDict) is upcoming.
Extensible. To add new data sets, read more about how you can extend cihai to support new datapackages-compatible datasets.
For more, see internals for design philosophy.
Workgroup and Standardization
Find undigitized data sets relating to CJK
Clarify and negotiate license details of data sets, see permissively licensing your dataset.
Create standardized, consistent packages for all data sets
Maintain aforementioned datasets
Continue to improve current infrastructure and packages while seeking out rare and undigitized CJK data for preservation and access
Troubleshooting
Python 2.7 and UCS
To use cihai on Python 2.7, Python must be built with UCS4 support via --enable-unicode=ucs4. You can test for UCS4 with:
>>> import sys
>>> sys.maxunicode > 0xffff
True
Most packaged and bundled Python distributions are already built with UCS4 (such as Ubuntu’s system Python). On Python 3.3 and greater, this distinction no longer exists, so no action is needed.
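To illustrate why the build type matters for CJK data, here is a minimal sketch, not part of cihai itself. The names `SAMPLE`, `is_wide_build` and `check_supplementary_support` are hypothetical helpers introduced for this example. It combines the `sys.maxunicode` test above with the length of a character outside the Basic Multilingual Plane:

```python
from __future__ import print_function

import sys

# U+20000 is the first code point of CJK Unified Ideographs Extension B,
# which lies outside the Basic Multilingual Plane (above U+FFFF).
SAMPLE = u"\U00020000"


def is_wide_build():
    """Return True if this interpreter stores code points above U+FFFF whole."""
    return sys.maxunicode > 0xFFFF


def check_supplementary_support(sample=SAMPLE):
    """Return (wide, length) for a supplementary-plane character.

    On a UCS4 (wide) build, or on Python 3.3+, the character has length 1.
    On a narrow Python 2.7 build it is stored as a surrogate pair, so
    len() reports 2.
    """
    return is_wide_build(), len(sample)


wide, length = check_supplementary_support()
print("wide build:", wide, "| len of U+20000:", length)
```

A length of 2 means characters above U+FFFF are split into surrogate pairs, so character-by-character handling of those characters will not behave as expected.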
Python support | Python 2.7, >= 3.3, pypy
License | BSD