Library for CJK (Chinese, Japanese, Korean) language data.

Project description

cihai - Python library for CJK (Chinese, Japanese, Korean) data

This project is under active development. Follow our progress and check back for updates!

Usage

API / Library (this repository)

$ pip install --user cihai

from cihai.core import Cihai

c = Cihai()

unihan_options = {}  # assumption: an empty dict selects the default Unihan settings

if not c.unihan.is_bootstrapped:  # download and install Unihan to db
    c.unihan.bootstrap(unihan_options)

query = c.unihan.lookup_char('好')
glyph = query.first()
print("lookup for 好: %s" % glyph.kDefinition)
# lookup for 好: good, excellent, fine; well

query = c.unihan.reverse_char('good')
print('matches for "good": %s' % ', '.join([glyph.char for glyph in query]))
# matches for "good": 㑘, 㑤, 㓛, 㘬, 㙉, 㚃, 㚒, 㚥, 㛦, 㜴, 㜺, 㝖, 㤛, 㦝, ...

See API documentation and /examples.
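
The lookup above assumes the character is present in the database. Below is a minimal sketch of a defensive wrapper. It assumes that `lookup_char` returns a query object whose `first()` yields `None` when nothing matches, as the query-style call in the example suggests. `lookup_definition` is a hypothetical helper for illustration, not part of cihai.

```python
def lookup_definition(c, char):
    """Return the kDefinition for ``char``, or None if no usable entry exists.

    ``c`` is expected to be a bootstrapped ``Cihai`` instance, as created in
    the usage example. Hypothetical helper; not part of the cihai API.
    """
    glyph = c.unihan.lookup_char(char).first()
    if glyph is None:  # assumption: first() returns None when nothing matches
        return None
    # Treat a missing or empty definition the same as "no definition".
    return getattr(glyph, "kDefinition", None) or None
```

With the instance from the example, `lookup_definition(c, '好')` would be expected to return the definition string shown above, and a character missing from the database would give `None` instead of raising an `AttributeError`.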

CLI (cihai-cli)

$ pip install --user 'cihai[cli]'
# character lookup
$ cihai info 好
char: 好
kCantonese: hou2 hou3
kDefinition: good, excellent, fine; well
kHangul: 호
kJapaneseOn: KOU
kKorean: HO
kMandarin: hǎo
kTang: '*xɑ̀u *xɑ̌u'
kTotalStrokes: '6'
kVietnamese: háo
ucn: U+597D

# reverse lookup
$ cihai reverse library
char: 圕
kCangjie: WLGA
kCantonese: syu1
kCihaiT: '308.302'
kDefinition: library
kMandarin: tú
kTotalStrokes: '13'
ucn: U+5715
--------
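
The ucn field in the output above is the character's Unicode code point in U+XXXX notation. Below is a short sketch of converting between the two forms using only the standard library. `ucn_to_char` and `char_to_ucn` are illustrative names, not cihai functions.

```python
def ucn_to_char(ucn):
    """Convert 'U+597D' style notation to the character itself."""
    if not ucn.upper().startswith("U+"):
        raise ValueError("expected U+XXXX notation, got %r" % ucn)
    return chr(int(ucn[2:], 16))


def char_to_ucn(char):
    """Convert a single character to 'U+XXXX' notation (at least 4 hex digits)."""
    return "U+%04X" % ord(char)


print(ucn_to_char("U+597D"))  # 好
print(char_to_ucn("圕"))  # U+5715
```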

UNIHAN data

All datasets that cihai uses have stand-alone tools to export their data. No library required.

Developing

Development requires poetry.

git clone https://github.com/cihai/cihai.git

cd cihai

poetry install -E "docs test coverage lint format"

Makefile commands prefixed with watch_ will watch files and rerun.

Tests

poetry run py.test

Helpers: make test

Rerun tests on file change: make watch_test (requires entr(1))

Documentation

Default preview server: http://localhost:8035

To build, run cd docs/ and then make html. Run make serve to start the HTTP server.

Helpers: make build_docs, make serve_docs

Rebuild docs on file change: make watch_docs (requires entr(1))

Rebuild docs and run server via one terminal: make dev_docs (requires the above, and a make(1) with -j support, e.g. GNU Make)

Formatting / Linting

The project uses black and isort (one after the other) and runs flake8 via CI. See the configuration in pyproject.toml and setup.cfg:

make black isort: run black first, then isort to handle import nuances

make flake8; to watch (requires entr(1)): make watch_flake8

Releasing

As of 0.10, poetry handles virtualenv creation, package requirements, versioning, building, and publishing. Therefore there is no setup.py or requirements files.

Update __version__ in __about__.py and pyproject.toml:

git commit -m 'build(cihai): Tag v0.1.1'
git tag v0.1.1
git push
git push --tags
poetry build
poetry publish
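
Because the version lives in two files, it is easy to bump one and forget the other. Below is a rough sketch of a pre-release check. It assumes that __about__.py contains a line like `__version__ = '...'` and that pyproject.toml contains a line like `version = "..."`. The function names are hypothetical and not part of the project's tooling.

```python
import re


def extract_version(text, key):
    """Return the quoted value assigned to ``key`` at the start of a line, or None."""
    pattern = r"^%s\s*=\s*['\"]([^'\"]+)['\"]" % re.escape(key)
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def versions_match(about_text, pyproject_text):
    """True if both files declare the same, non-empty version string."""
    about = extract_version(about_text, "__version__")
    pyproject = extract_version(pyproject_text, "version")
    return about is not None and about == pyproject
```

One way to use it is to read both files, pass their contents to `versions_match`, and stop before tagging if it returns False.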
