Zhon provides constants used in Chinese text processing.

Project description

Zhon is a Python library that provides constants commonly used in Chinese text processing.

About

Zhon’s constants can be used in Chinese text processing, for example:

  • Find CJK characters in a string:

    >>> import re
    >>> import zhon.hanzi
    >>> re.findall('[%s]' % zhon.hanzi.characters, 'I broke a plate: 我打破了一个盘子.')
    ['我', '打', '破', '了', '一', '个', '盘', '子']
  • Validate Pinyin syllables, words, or sentences:

    >>> import re
    >>> import zhon.pinyin
    >>> re.findall(zhon.pinyin.syllable, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I)
    ['Yuàn', 'zi', 'lǐ', 'tíng', 'zhe', 'yí', 'liàng', 'chē']

    >>> re.findall(zhon.pinyin.word, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I)
    ['Yuànzi', 'lǐ', 'tíngzhe', 'yí', 'liàng', 'chē']

    >>> re.findall(zhon.pinyin.sentence, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I)
    ['Yuànzi lǐ tíngzhe yí liàng chē.']
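
The findall calls above locate matches inside a longer string. To check that a whole string consists only of Chinese characters, the same constant can be wrapped in a character class and matched against the full input. The following is a minimal sketch, assuming Python 3 (for re.fullmatch); the helper name is_all_hanzi is hypothetical and not part of Zhon, and the fallback range is only there so the sketch runs without Zhon installed.

```python
import re

try:
    # Preferred: the character constant that Zhon ships.
    from zhon.hanzi import characters as HANZI
except ImportError:
    # Assumption for illustration only: the basic CJK Unified Ideographs
    # block (U+4E00 to U+9FFF), which is narrower than Zhon's constant.
    HANZI = '\u4e00-\u9fff'

# The constant is meant to be placed inside a regex character class.
HANZI_ONLY = re.compile('[%s]+' % HANZI)


def is_all_hanzi(text):
    """Return True if text is non-empty and made up only of CJK characters."""
    return HANZI_ONLY.fullmatch(text) is not None
```

With this helper, is_all_hanzi('我打破了一个盘子') is True, while the mixed sentence from the first example is rejected because of its Latin letters, spaces, and ASCII punctuation.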

Features

  • Includes commonly used constants:
    • CJK characters and radicals
    • Chinese punctuation marks
    • Chinese sentence regular expression pattern
    • Pinyin vowels, consonants, lowercase, uppercase, and punctuation
    • Pinyin syllable, word, and sentence regular expression patterns
    • Zhuyin characters and marks
    • Zhuyin syllable regular expression pattern
    • CC-CEDICT characters
  • Runs on Python 2.7 and 3
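
As an illustration of the punctuation constants listed above, the sketch below strips Chinese punctuation marks from a string. It assumes zhon.hanzi.punctuation is used as a character class in the same way as zhon.hanzi.characters; the helper name strip_hanzi_punctuation is hypothetical, and the fallback string is a small hand-picked subset included only so the sketch runs without Zhon installed.

```python
import re

try:
    # Preferred: the punctuation constant that Zhon ships.
    from zhon.hanzi import punctuation as HANZI_PUNCT
except ImportError:
    # Assumption for illustration only: a few common marks, far fewer
    # than Zhon's constant contains.
    HANZI_PUNCT = '。，、；：？！'

# Any single Chinese punctuation mark.
PUNCT_RE = re.compile('[%s]' % HANZI_PUNCT)


def strip_hanzi_punctuation(text):
    """Return text with Chinese punctuation marks removed."""
    return PUNCT_RE.sub('', text)
```

For example, strip_hanzi_punctuation('你好，世界！') gives '你好世界'. ASCII punctuation is left untouched, since the constant covers Chinese marks only.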

Download files

Source Distribution

zhon-1.1.4.tar.gz (100.1 kB view details)

Uploaded Source

File details

Details for the file zhon-1.1.4.tar.gz.

File metadata

  • Download URL: zhon-1.1.4.tar.gz
  • Size: 100.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for zhon-1.1.4.tar.gz
Algorithm    Hash digest
SHA256       40abb082e8573fc1fda04fbb64b75e140fb2bbb35a34ffa07feb23b69c56f7c7
MD5          1a9145e0d95f92289c9a0c34870cb9df
BLAKE2b-256  64dd3b6184c4a3e5d2972b4bed74757e49b59548eefd794251126b2342dd7161
