Conversion between Traditional and Simplified Chinese
Open Chinese Convert 開放中文轉換
Introduction 介紹
Open Chinese Convert (OpenCC, 開放中文轉換) is an open-source project for converting between Traditional Chinese, Simplified Chinese and Japanese Kanji (Shinjitai). It supports character-level and phrase-level conversion, character variant conversion, and regional idioms used in Mainland China, Taiwan and Hong Kong. It is not a translation tool between, for example, Mandarin and Cantonese.
中文簡繁轉換開源項目,支持詞彙級別的轉換、異體字轉換和地區習慣用詞轉換(中國大陸、臺灣、香港、日本新字體)。不提供普通話與粵語的轉換。
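To illustrate why phrase-level conversion matters, here is a minimal Python sketch of forward longest-match conversion with a character-level fallback. The `PHRASES` and `CHARS` tables and the `convert` function are hypothetical and exist only for this illustration; they are not OpenCC's real dictionaries or its actual algorithm.

```python
# Toy tables for illustration only; these are NOT OpenCC's real data.
PHRASES = {"头发": "頭髮", "发展": "發展"}
CHARS = {"头": "頭", "发": "發", "汉": "漢"}


def convert(text, phrases=PHRASES, chars=CHARS):
    """Convert by forward longest match over phrases, then single characters."""
    max_len = max((len(key) for key in phrases), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate phrase first, down to length 2.
        for size in range(min(max_len, len(text) - i), 1, -1):
            piece = text[i:i + size]
            if piece in phrases:
                out.append(phrases[piece])
                i += size
                break
        else:
            # No phrase matched here: fall back to the character table,
            # leaving unknown characters unchanged.
            out.append(chars.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert("头发"))              # phrase-level: 頭髮
print(convert("头发", phrases={}))  # character-level only: 頭發
```

With the phrase table, 发 in 头发 (hair) becomes 髮. Without it, the character table can only offer a single default, 發, which is wrong in this context.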
Discussion (Telegram): https://t.me/open_chinese_convert
Features 特點
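- Strictly distinguishes "one Simplified character to many Traditional characters" from "one Simplified character to many variant characters".
- Fully compatible with variant characters, which can be substituted dynamically.
- One-to-many Simplified-to-Traditional entries are strictly proofread, following the principle "keep entries separate wherever possible rather than merging them".
- Supports conversion of variant characters and regional idioms among Mainland China, Taiwan and Hong Kong, such as 「裏」「裡」 and 「鼠標」「滑鼠」.
- Dictionaries and the library are completely separate, so dictionaries can be freely modified, imported and extended.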
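Keeping dictionaries apart from the conversion code can be sketched as follows. This is a minimal Python illustration that assumes a plain-text layout of one entry per line, with the key, a tab, and then space-separated candidates. The `SAMPLE` text and the `parse_dictionary` and `extend` helpers are hypothetical and are not part of OpenCC's API.

```python
# Hypothetical dictionary text: "key<TAB>candidate1 candidate2 ...".
SAMPLE = "发\t發 髮\n鼠标\t滑鼠\n"


def parse_dictionary(text):
    """Parse tab-separated entries into {key: [candidates, ...]}."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, values = line.partition("\t")
        candidates = values.split()
        if candidates:
            table[key] = candidates
    return table


def extend(table, extra):
    """Return a new table with user entries overriding existing ones."""
    merged = dict(table)
    merged.update(extra)
    return merged


table = parse_dictionary(SAMPLE)
custom = extend(table, {"鼠标": ["鼠標"]})
print(table["发"])      # one Simplified character, several Traditional candidates
print(custom["鼠标"])   # user override applied without touching the code
```

Because the data lives outside the code, an entry can be overridden or added without changing the conversion logic.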
Installation 安裝
See Download.
Usage 使用
Online demo 線上轉換展示
Warning: This is NOT an API. You will be banned if you make calls programmatically.
Node.js
npm install opencc
const OpenCC = require('opencc');
const opencc = new OpenCC('s2t.json');
opencc.convertPromise("汉字").then(converted => {
  console.log(converted);  // 漢字
});
See demo.js.
Python
pip3 install opencc
import opencc
converter = opencc.OpenCC('s2t.json')
converter.convert('汉字')  # 漢字
C++
#include "opencc.h"
int main() {
  // SimpleConverter lives in the opencc namespace.
  const opencc::SimpleConverter converter("s2t.json");
  converter.Convert("汉字");  // 漢字
  return 0;
}
Document 文檔: https://byvoid.github.io/OpenCC/
Command Line
opencc --help
opencc_dict --help
opencc_phrase_extract --help
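For scripted use, the `opencc` command can be driven from Python. The sketch below only builds the argument list. It assumes the `-i` (input), `-o` (output) and `-c` (config) options, which should be confirmed against `opencc --help` on your installation. `build_opencc_command` and the file names are hypothetical.

```python
import shlex


def build_opencc_command(input_path, output_path, config="s2t.json"):
    """Build an argv list for the opencc CLI (flags assumed: -i, -o, -c)."""
    return ["opencc", "-i", input_path, "-o", output_path, "-c", config]


cmd = build_opencc_command("input.txt", "output.txt", "s2twp.json")
print(shlex.join(cmd))

# To actually run it (requires opencc on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.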
Others (Unofficial)
- Swift (iOS): SwiftyOpenCC
- Java: opencc4j
- Android: android-opencc
- PHP: opencc4php
- WebAssembly: wasm-opencc
Configurations 配置文件
Preset configuration files 預設配置文件
- s2t.json: Simplified Chinese to Traditional Chinese 簡體到繁體
- t2s.json: Traditional Chinese to Simplified Chinese 繁體到簡體
- s2tw.json: Simplified Chinese to Traditional Chinese (Taiwan Standard) 簡體到臺灣正體
- tw2s.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese 臺灣正體到簡體
- s2hk.json: Simplified Chinese to Traditional Chinese (Hong Kong Standard) 簡體到香港繁體(香港小學學習字詞表標準)
- hk2s.json: Traditional Chinese (Hong Kong Standard) to Simplified Chinese 香港繁體(香港小學學習字詞表標準)到簡體
- s2twp.json: Simplified Chinese to Traditional Chinese (Taiwan Standard) with Taiwanese idiom 簡體到繁體(臺灣正體標準)並轉換爲臺灣常用詞彙
- tw2sp.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese with Mainland Chinese idiom 繁體(臺灣正體標準)到簡體並轉換爲中國大陸常用詞彙
- t2tw.json: Traditional Chinese (OpenCC Standard) to Taiwan Standard 繁體(OpenCC 標準)到臺灣正體
- t2hk.json: Traditional Chinese (OpenCC Standard) to Hong Kong Standard 繁體(OpenCC 標準)到香港繁體(香港小學學習字詞表標準)
- t2jp.json: Traditional Chinese Characters (Kyūjitai) to New Japanese Kanji (Shinjitai) 繁體(OpenCC 標準,舊字體)到日文新字體
- jp2t.json: New Japanese Kanji (Shinjitai) to Traditional Chinese Characters (Kyūjitai) 日文新字體到繁體(OpenCC 標準,舊字體)
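The preset names follow a `<source>2<target>` pattern, with a trailing `p` for the variants that also convert regional idioms. The Python sketch below builds a name from those parts and checks it against the presets listed above. `config_name` is a hypothetical helper for illustration, not part of the OpenCC API.

```python
# The preset files listed in this section.
PRESETS = {
    "s2t.json", "t2s.json", "s2tw.json", "tw2s.json",
    "s2hk.json", "hk2s.json", "s2twp.json", "tw2sp.json",
    "t2tw.json", "t2hk.json", "t2jp.json", "jp2t.json",
}


def config_name(source, target, idioms=False):
    """Return the preset file name, e.g. ("s", "tw", True) -> "s2twp.json"."""
    name = f"{source}2{target}{'p' if idioms else ''}.json"
    if name not in PRESETS:
        raise ValueError(f"no preset named {name}")
    return name


print(config_name("s", "tw", idioms=True))  # s2twp.json
print(config_name("t", "jp"))               # t2jp.json
```

Validating against the known list catches combinations that have no preset here, such as an idiom variant for Hong Kong.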
Build 編譯
Build with CMake
Linux (g++ 4.6 or later is required) and Mac OS X (clang 3.2 or later is required):
make
Windows Visual Studio:
cmake -S. -Bbuild -DCMAKE_INSTALL_PREFIX:PATH=.
cmake --build build --config Release --target install
Test 測試
make test
Benchmark 基準測試
make benchmark
Example results (from Travis CI):
1: ------------------------------------------------------------------
1: Benchmark Time CPU Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/s2t 27325410 ns 27337754 ns 26
1: BM_Initialization/t2s 1427929 ns 1428890 ns 492
1: BM_Initialization/s2tw 26888809 ns 26900500 ns 26
1: BM_Initialization/s2twp 27286513 ns 27297972 ns 25
1: BM_Initialization/tw2s 1442091 ns 1442939 ns 475
1: BM_Initialization/tw2sp 1737702 ns 1738815 ns 398
1: BM_Initialization/s2hk 27070874 ns 27081523 ns 26
1: BM_Initialization/hk2s 1515165 ns 1516135 ns 466
1: BM_Initialization/t2jp 147005 ns 146864 ns 4850
1: BM_Initialization/jp2t 246554 ns 246479 ns 2859
1: BM_Convert 531 ms 531 ms 1
1/1 Test #1: performance ...................... Passed 11.52 sec
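The initialization times above are easier to compare in milliseconds. In these results the configurations that start from Simplified Chinese (s2t, s2tw, s2twp, s2hk) take roughly 27 ms to initialize, while the others take under 2 ms. A minimal Python sketch for extracting those figures from the report follows. The `SAMPLE` excerpt is copied from the output above, and `init_times_ms` is a hypothetical helper.

```python
import re

# Excerpt of the benchmark report shown above.
SAMPLE = """\
1: BM_Initialization/s2t 27325410 ns 27337754 ns 26
1: BM_Initialization/t2s 1427929 ns 1428890 ns 492
1: BM_Initialization/t2jp 147005 ns 146864 ns 4850
"""

# Capture the config name and the first (wall-clock "Time") column.
LINE = re.compile(r"BM_Initialization/(\w+)\s+(\d+) ns")


def init_times_ms(report):
    """Map each config name to its initialization time in milliseconds."""
    return {m.group(1): int(m.group(2)) / 1e6 for m in LINE.finditer(report)}


times = init_times_ms(SAMPLE)
for name, ms in sorted(times.items(), key=lambda item: item[1]):
    print(f"{name}: {ms:.2f} ms")
```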
Projects using OpenCC 使用 OpenCC 的項目
License 許可協議
Apache License 2.0
Third Party Library 第三方庫
- darts-clone BSD License
- marisa-trie BSD License
- tclap MIT License
- rapidjson MIT License
- Google Test BSD License
All these libraries are statically linked.
Change History 版本歷史
Links 相關鏈接
- Introduction 詳細介紹 https://github.com/BYVoid/OpenCC/wiki/%E7%B7%A3%E7%94%B1
- 現代漢語常用簡繁一對多字義辨析表 http://ytenx.org/byohlyuk/KienxPyan
Contributors 貢獻者
- BYVoid
- 佛振
- Peng Huang
- LI Daobing
- Kefu Chai
- Kan-Ru Chen
- Ma Xiaojun
- Jiang Jiang
- Ruey-Cheng Chen
- Paul Meng
- Lawrence Lau
- 瑾昀
- 內木一郎
- Marguerite Su
- Brian White
- Qijiang Fan
- LEOYoon-Tsaw
- Steven Yao
- Pellaeon Lin
- stony
- steelywing
- 吕旭东
- Weng Xuetian
- Ma Tao
- Heinz Wiesinger
- J.W
- Amo Wu
- Mark Tsai
- Zhe Wang
- sgqy
- Qichuan (Sean) ZHANG
- Flandre Scarlet
- 宋辰文
- iwater
- Xpol Wan
- Weihang Lo
- Cychih
- kyleskimo
- Ryuan Choi
- Tony Able
- Xiao Liang
Please update this list if you have contributed to OpenCC.