Download and export UNIHAN to Python, CSV, JSON and YAML
*unihan-tabular* - a tool to build `UNIHAN`_ into tabular-friendly formats
like Python, JSON, CSV and YAML. Part of the `cihai`_ project.
|pypi| |docs| |build-status| |coverage| |license|
Unihan's data is dispersed across multiple files in the format of::

    U+3400 kCantonese jau1
    U+3400 kDefinition (same as U+4E18 丘) hillock or mound
    U+3400 kMandarin qiū
    U+3401 kCantonese tim2
    U+3401 kDefinition to lick; to taste, a mat, bamboo bark
    U+3401 kHanyuPinyin 10019.020:tiàn
    U+3401 kMandarin tiàn

``unihan_tabular/process.py`` downloads ``Unihan.zip`` and builds all files into a
single tabular-friendly format.
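
The merge step described above can be illustrated with a minimal sketch. This
is not the code in ``process.py``; the ``merge_rows`` function and the
``SAMPLE`` data are hypothetical names used for illustration, and the sketch
assumes Python 3 and whitespace-separated ``UCN field value`` lines:

.. code-block:: python

    from collections import defaultdict

    # A few lines in the per-field layout shown above (illustrative sample).
    SAMPLE = """\
    U+3400 kCantonese jau1
    U+3400 kDefinition (same as U+4E18 丘) hillock or mound
    U+3400 kMandarin qiū
    U+3401 kCantonese tim2
    U+3401 kHanyuPinyin 10019.020:tiàn
    """


    def merge_rows(lines):
        """Group 'UCN field value' lines into one dict per character."""
        rows = defaultdict(dict)
        for line in lines:
            line = line.strip()
            # Skip blank lines and '#' comment lines.
            if not line or line.startswith("#"):
                continue
            # Split into three parts only, so the value keeps its spaces.
            ucn, field, value = line.split(None, 2)
            rows[ucn][field] = value
        result = []
        for ucn, fields in rows.items():
            # "U+3400" -> the character at code point 0x3400.
            char = chr(int(ucn[2:], 16))
            result.append({"char": char, "ucn": ucn, **fields})
        return result


    records = merge_rows(SAMPLE.splitlines())

Each entry in ``records`` holds only the fields seen for that character, so a
real exporter would still need to fill absent columns with an empty value.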
CSV (default output: *./data/unihan.csv*)::

    char,ucn,kCantonese,kDefinition,kHanyuPinyin,kMandarin
    㐀,U+3400,jau1,(same as U+4E18 丘) hillock or mound,,qiū
    㐁,U+3401,tim2,"to lick; to taste, a mat, bamboo bark",10019.020:tiàn,tiàn

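
As a rough sketch of consuming this output, the CSV can be read with Python's
standard ``csv`` module. The sample is inlined here so the snippet is
self-contained; ``SAMPLE_CSV`` and ``by_char`` are illustrative names, and
with a real export you would open ``data/unihan.csv`` instead:

.. code-block:: python

    import csv
    import io

    # The two rows shown above, inlined for illustration.
    SAMPLE_CSV = (
        "char,ucn,kCantonese,kDefinition,kHanyuPinyin,kMandarin\n"
        "㐀,U+3400,jau1,(same as U+4E18 丘) hillock or mound,,qiū\n"
        '㐁,U+3401,tim2,"to lick; to taste, a mat, bamboo bark",'
        "10019.020:tiàn,tiàn\n"
    )

    # DictReader uses the header row as keys and handles quoted commas.
    rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

    # Index the rows by character for quick lookups.
    by_char = {row["char"]: row for row in rows}

    print(by_char["㐁"]["kDefinition"])

Note that a missing value, such as ``kHanyuPinyin`` for U+3400, is read back
from CSV as an empty string rather than ``None``.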
JSON (default output: *./data/unihan.json*):

.. code-block:: json

    [
      {
        "char": "㐀",
        "ucn": "U+3400",
        "kCantonese": "jau1",
        "kDefinition": "(same as U+4E18 丘) hillock or mound",
        "kHanyuPinyin": null,
        "kMandarin": "qiū"
      },
      {
        "char": "㐁",
        "ucn": "U+3401",
        "kCantonese": "tim2",
        "kDefinition": "to lick; to taste, a mat, bamboo bark",
        "kHanyuPinyin": "10019.020:tiàn",
        "kMandarin": "tiàn"
      }
    ]

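
The JSON export can be loaded with the standard ``json`` module. The following
is a small sketch using the two records above, inlined so it runs on its own;
``SAMPLE_JSON``, ``mandarin`` and ``missing`` are illustrative names, not part
of *unihan-tabular*:

.. code-block:: python

    import json

    # The two records shown above, inlined for illustration.
    SAMPLE_JSON = """
    [
      {"char": "㐀", "ucn": "U+3400", "kCantonese": "jau1",
       "kDefinition": "(same as U+4E18 丘) hillock or mound",
       "kHanyuPinyin": null, "kMandarin": "qiū"},
      {"char": "㐁", "ucn": "U+3401", "kCantonese": "tim2",
       "kDefinition": "to lick; to taste, a mat, bamboo bark",
       "kHanyuPinyin": "10019.020:tiàn", "kMandarin": "tiàn"}
    ]
    """

    records = json.loads(SAMPLE_JSON)

    # Map each character to its Mandarin reading.
    mandarin = {r["char"]: r["kMandarin"] for r in records}

    # JSON null becomes None, so absent fields are easy to detect.
    missing = [r["ucn"] for r in records if r["kHanyuPinyin"] is None]

    print(mandarin)
    print(missing)

Unlike the CSV export, an absent field arrives as ``None`` here, which keeps
it distinct from an empty string.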
YAML (default output: *./data/unihan.yaml*):

.. code-block:: yaml

    - char: 㐀
      kCantonese: jau1
      kDefinition: (same as U+4E18 丘) hillock or mound
      kHanyuPinyin: null
      kMandarin: qiū
      ucn: U+3400
    - char: 㐁
      kCantonese: tim2
      kDefinition: to lick; to taste, a mat, bamboo bark
      kHanyuPinyin: 10019.020:tiàn
      kMandarin: tiàn
      ucn: U+3401

``process.py`` supports command line arguments. See `unihan_tabular/process.py CLI
arguments`_ for information on how you can specify custom columns, files,
download URLs and output destinations.
.. _cihai: https://cihai.git-pull.com
.. _cihai-handbook: https://github.com/cihai/cihai-handbook
.. _cihai team: https://github.com/cihai?tab=members
.. _cihai-python: https://github.com/cihai/cihai-python
.. _unihan-tabular on github: https://github.com/cihai/unihan-tabular
Usage
-----
To download and build your own UNIHAN export, install the package:

.. code-block:: bash

    $ pip install unihan-tabular

Then run it:

.. code-block:: bash

    $ unihan-tabular

This creates ``data/unihan.json``.

To output CSV::

    $ unihan-tabular -F csv

To output YAML::

    $ pip install pyyaml
    $ unihan-tabular -F yaml

To output only the ``kDefinition`` field as CSV::

    $ unihan-tabular -F csv -f kDefinition

See `unihan_tabular/process.py CLI arguments`_ for advanced usage examples.
.. _unihan_tabular/process.py CLI arguments: http://unihan-tabular.readthedocs.org/en/latest/cli.html
Structure
---------
.. code-block:: bash

    # output (JSON)
    data/unihan.json

    # output (CSV)
    data/unihan.csv

    # script to download + build a SDF csv of unihan.
    unihan_tabular/process.py

    # unit tests to verify behavior / consistency of builder
    tests/*

    # python 2/3 compatibility modules
    unihan_tabular/_compat.py
    unihan_tabular/unicodecsv.py

    # utility / helper functions
    unihan_tabular/util.py

- ``data/unihan.csv`` - CSV export file.
- ``unihan_tabular/process.py`` - creates ``data/unihan.csv``.
.. _MIT: http://opensource.org/licenses/MIT
.. _API: http://cihai.readthedocs.org/en/latest/api.html
.. _UNIHAN: http://www.unicode.org/charts/unihan.html
.. |pypi| image:: https://img.shields.io/pypi/v/unihan-tabular.svg
    :alt: Python Package
    :target: http://badge.fury.io/py/unihan-tabular

.. |build-status| image:: https://img.shields.io/travis/cihai/unihan-tabular.svg
    :alt: Build Status
    :target: https://travis-ci.org/cihai/unihan-tabular

.. |coverage| image:: https://codecov.io/gh/cihai/unihan-tabular/branch/master/graph/badge.svg
    :alt: Code Coverage
    :target: https://codecov.io/gh/cihai/unihan-tabular

.. |license| image:: https://img.shields.io/github/license/cihai/unihan-tabular.svg
    :alt: License

.. |docs| image:: https://readthedocs.org/projects/unihan-tabular/badge/?version=latest
    :alt: Documentation Status
    :scale: 100%
    :target: https://readthedocs.org/projects/unihan-tabular/