Library for CJK (Chinese, Japanese, Korean) language data.
cihai - Python library for CJK (Chinese, Japanese, Korean) data
This project is under active development. Follow our progress and check back for updates!
Usage
API / Library (this repository)
$ pip install --user cihai
from cihai.core import Cihai

c = Cihai()

# Options forwarded to the Unihan bootstrap. An empty dict is assumed here
# to mean "use the defaults"; adjust as needed.
unihan_options = {}

if not c.unihan.is_bootstrapped:  # download and install Unihan to db
    c.unihan.bootstrap(unihan_options)

query = c.unihan.lookup_char('好')
glyph = query.first()
print("lookup for 好: %s" % glyph.kDefinition)
# lookup for 好: good, excellent, fine; well

query = c.unihan.reverse_char('good')
print('matches for "good": %s ' % ', '.join([glyph.char for glyph in query]))
# matches for "good": 㑘, 㑤, 㓛, 㘬, 㙉, 㚃, 㚒, 㚥, 㛦, 㜴, 㜺, 㝖, 㤛, 㦝, ...
See API documentation and /examples.
CLI (cihai-cli)
$ pip install --user cihai[cli]
# character lookup
$ cihai info 好
char: 好
kCantonese: hou2 hou3
kDefinition: good, excellent, fine; well
kHangul: 호
kJapaneseOn: KOU
kKorean: HO
kMandarin: hǎo
kTang: '*xɑ̀u *xɑ̌u'
kTotalStrokes: '6'
kVietnamese: háo
ucn: U+597D
# reverse lookup
$ cihai reverse library
char: 圕
kCangjie: WLGA
kCantonese: syu1
kCihaiT: '308.302'
kDefinition: library
kMandarin: tú
kTotalStrokes: '13'
ucn: U+5715
--------
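The CLI output above is a flat list of key: value lines, so it can be post-processed with the standard library alone. The following is a minimal sketch, assuming one field per line, optional single quotes around values, and a dashed separator between results. The helper names parse_info and ucn_to_char are hypothetical and not part of cihai or cihai-cli.

```python
def parse_info(text):
    """Parse `key: value` lines (as printed by `cihai info`) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and the '--------' separator between results.
        if not line or line.startswith("#") or set(line) == {"-"}:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop surrounding whitespace and optional single quotes.
        fields[key.strip()] = value.strip().strip("'")
    return fields


def ucn_to_char(ucn):
    """Convert Unicode code point notation such as 'U+597D' to its character."""
    return chr(int(ucn[2:], 16))


sample = """\
char: 好
kDefinition: good, excellent, fine; well
kTotalStrokes: '6'
ucn: U+597D
--------
"""

info = parse_info(sample)
print(info["kDefinition"])
print(ucn_to_char(info["ucn"]) == info["char"])
```

Splitting on the first colon only keeps values that themselves contain colons intact.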
UNIHAN data
All datasets that cihai uses have stand-alone tools to export their data. No library required.
- unihan-etl - UNIHAN data exports for csv, yaml and json.
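Because the exports are plain files, they can be read without cihai installed. Below is a minimal sketch of a reverse lookup over a CSV export, using only the standard library. The header row (char, ucn, kDefinition) and the in-memory sample are assumptions standing in for a real export, and the actual column layout depends on how unihan-etl was run. The function names are hypothetical.

```python
import csv
import io

# Stand-in for an exported CSV file; the header row is an assumption.
SAMPLE_CSV = '''char,ucn,kDefinition
好,U+597D,"good, excellent, fine; well"
圕,U+5715,library
'''


def load_rows(fileobj):
    """Read the export into a list of dicts keyed by column name."""
    return list(csv.DictReader(fileobj))


def reverse_lookup(rows, term):
    """Return characters whose definition contains `term` (case-insensitive)."""
    term = term.lower()
    return [row["char"] for row in rows if term in row["kDefinition"].lower()]


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(reverse_lookup(rows, "library"))
print(reverse_lookup(rows, "good"))
```

For a real file, pass an object opened with open(path, newline="", encoding="utf-8") in place of the StringIO sample.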
Developing
poetry is required for development.
git clone https://github.com/cihai/cihai.git
cd cihai
poetry install -E "docs test coverage lint format"
Makefile commands prefixed with watch_ will watch files and rerun.
Tests
poetry run py.test
Helpers: make test
Rerun tests on file change: make watch_test (requires entr(1))
Documentation
Default preview server: http://localhost:8035
cd docs/ and make html to build. make serve to start http server.
Helpers: make build_docs, make serve_docs
Rebuild docs on file change: make watch_docs (requires entr(1))
Rebuild docs and run server via one terminal: make dev_docs (requires above, and a make(1) with -J support, e.g. GNU Make)
Formatting / Linting
The project uses black and isort (one after the other) and runs flake8 via CI. See the configuration in pyproject.toml and setup.cfg:
make black isort: Run black first, then isort to handle import nuances
make flake8, to watch (requires entr(1)): make watch_flake8
Releasing
As of 0.10, poetry handles virtualenv creation, package requirements, versioning, building, and publishing. Therefore there are no setup.py or requirements files.
Update __version__ in __about__.py and the version in pyproject.toml:
git commit -m 'build(cihai): Tag v0.1.1'
git tag v0.1.1
git push
git push --tags
poetry build
poetry publish
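The release steps above rely on the version string being the same in __about__.py and pyproject.toml. The following is a minimal sketch of a pre-release consistency check, assuming the version appears as a quoted __version__ assignment in one file and as a version = "..." line in the other. The helper names are hypothetical and this check is not part of the project's tooling.

```python
import re


def version_from_about(text):
    """Extract the value assigned to __version__ in Python source text."""
    match = re.search(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", text, re.MULTILINE)
    return match.group(1) if match else None


def version_from_pyproject(text):
    """Extract the first `version = "..."` line from pyproject text."""
    match = re.search(r"""^version\s*=\s*['"]([^'"]+)['"]""", text, re.MULTILINE)
    return match.group(1) if match else None


def versions_match(about_text, pyproject_text):
    """True only if both versions were found and are identical."""
    about_version = version_from_about(about_text)
    pyproject_version = version_from_pyproject(pyproject_text)
    return about_version is not None and about_version == pyproject_version


# Sample file contents; real use would read the two files from disk.
about = "__title__ = 'cihai'\n__version__ = '0.1.1'\n"
pyproject = '[tool.poetry]\nname = "cihai"\nversion = "0.1.1"\n'
print(versions_match(about, pyproject))
```

Running such a check before tagging catches the case where only one of the two files was updated.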
Quick links
- Usage
- Datasets: a full list of current and future data sets
- Python API
- Roadmap
- Python support: >= 3.6, pypy
- Source: https://github.com/cihai/cihai
- Docs: https://cihai.git-pull.com
- Changelog: https://cihai.git-pull.com/history.html
- API: https://cihai.git-pull.com/api.html
- Issues: https://github.com/cihai/cihai/issues
- Test coverage: https://codecov.io/gh/cihai/cihai
- pypi: https://pypi-hypernode.com/pypi/cihai
- OpenHub: https://www.openhub.net/p/cihai
- License: MIT