Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

zhon

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

zhon

Zhon provides constants used in Chinese text processing.

  • 2.1.1
  • PyPI
  • Socket score

Maintainers
1

==== Zhon

.. image:: https://badge.fury.io/py/zhon.svg :target: https://pypi.org/project/zhon

.. image:: https://github.com/tsroten/zhon/actions/workflows/ci.yml/badge.svg :target: https://github.com/tsroten/zhon/actions/workflows/ci.yml

Zhon is a Python library that provides constants commonly used in Chinese text processing.

  • Documentation: https://tsroten.github.io/zhon/
  • GitHub: https://github.com/tsroten/zhon
  • Support: https://github.com/tsroten/zhon/issues
  • Free software: MIT license <http://opensource.org/licenses/MIT>_

About

Zhon's constants can be used in Chinese text processing, for example:

  • Find CJK characters in a string:

    .. code:: python

    re.findall('[{}]'.format(zhon.hanzi.characters), 'I broke a plate: 我打破了一个盘子.') ['我', '打', '破', '了', '一', '个', '盘', '子']

  • Validate Pinyin syllables, words, or sentences:

    .. code:: python

    re.findall(zhon.pinyin.syllable, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I) ['Yuàn', 'zi', 'lǐ', 'tíng', 'zhe', 'yí', 'liàng', 'chē']

    re.findall(zhon.pinyin.word, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I) ['Yuànzi', 'lǐ', 'tíngzhe', 'yí', 'liàng', 'chē']

    re.findall(zhon.pinyin.sentence, 'Yuànzi lǐ tíngzhe yí liàng chē.', re.I) ['Yuànzi lǐ tíngzhe yí liàng chē.']

Features

Zhon includes the following commonly-used constants:

  • CJK characters and radicals
  • Chinese punctuation marks
  • Chinese sentence regular expression pattern
  • Pinyin vowels, consonants, lowercase, uppercase, and punctuation
  • Pinyin syllable, word, and sentence regular expression patterns
  • Zhuyin characters and marks
  • Zhuyin syllable regular expression pattern
  • CC-CEDICT characters

Getting Started

  • Install Zhon <https://tsroten.github.io/zhon/installation.html>_
  • Learn how to use Zhon <https://tsroten.github.io/zhon/api.html>_
  • Contribute <https://github.com/tsroten/zhon/blob/develop/CONTRIBUTING.rst>_ documentation, code, or feedback

Keywords

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc