nlpo3

Python binding for nlpO3 Thai language processing library in Rust



nlpO3 Python binding

Python binding for nlpO3, a Thai natural language processing library in Rust.

Features

  • Thai word tokenizer
    • segment() - tokenizes text with a maximal-matching, dictionary-based algorithm that honors Thai Character Cluster boundaries
      • 2.5x faster than a similar pure-Python implementation (PyThaiNLP's newmm)
    • load_dict() - loads a dictionary from a plain text file (one word per line)
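
To illustrate what dictionary-based maximal matching means, here is a simplified pure-Python sketch. It is not nlpO3's implementation: it ignores Thai Character Cluster boundaries, and the function name max_match and the two-word dictionary are made up for illustration. It picks a split with the fewest tokens and falls back to single characters for text that no dictionary word covers.

```python
from typing import List, Set


def max_match(text: str, words: Set[str]) -> List[str]:
    """Split text into the fewest tokens, preferring dictionary words.

    Characters not covered by any dictionary word become one-character tokens.
    """
    n = len(text)
    max_len = max((len(w) for w in words), default=1)
    # best[i] = (token count, tokens) for the best split of text[i:]
    best = [None] * (n + 1)
    best[n] = (0, [])
    for i in range(n - 1, -1, -1):
        # Fallback: treat one character as an unknown token.
        count, tokens = best[i + 1]
        candidate = (count + 1, [text[i]] + tokens)
        # Try every multi-character dictionary word that starts at position i.
        for j in range(i + 2, min(n, i + max_len) + 1):
            if text[i:j] in words:
                count, tokens = best[j]
                if count + 1 < candidate[0]:
                    candidate = (count + 1, [text[i:j]] + tokens)
        best[i] = candidate
    return best[0][1]


# Hypothetical two-word dictionary, for illustration only.
print(max_match("สวัสดีครับ", {"สวัสดี", "ครับ"}))
```

With both words in the dictionary the sketch yields two tokens; with an empty dictionary it would return one token per character, which is why the choice of dictionary matters so much for this family of tokenizers.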

Dictionary file

  • To keep the library small, nlpO3 does not bundle a dictionary and makes no assumption about which one the developer wants to use. The dictionary-based word tokenizer needs one, so you must supply your own.
  • For tokenization dictionary, try

Install

pip install nlpo3

Usage

Load the file path/to/dict.file into memory and assign it the name dict_name. Then tokenize a text with the dict_name dictionary:

from nlpo3 import load_dict, segment

load_dict("path/to/dict.file", "dict_name")
segment("สวัสดีครับ", "dict_name")

This returns a list of strings:

['สวัสดี', 'ครับ']

(result depends on words included in the dictionary)
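
Because the dictionary file is plain text with one word per line, a small custom one can be written with the standard library alone. The sketch below is only an illustration: the helper name write_dictionary, the file name, and the word list are hypothetical, UTF-8 encoding is an assumption, and deduplicating and sorting the words is a convenience choice rather than something nlpO3 is known to require.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def write_dictionary(words: Iterable[str], path: Path) -> int:
    """Write unique, non-empty words to path, one per line; return the count."""
    unique = sorted({w.strip() for w in words if w.strip()})
    path.write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)


# Hypothetical word list and file name, for illustration only.
dict_path = Path(tempfile.mkdtemp()) / "my_words.txt"
count = write_dictionary(["สวัสดี", "ครับ", "ครับ", " "], dict_path)
print(f"wrote {count} words to {dict_path}")
```

The resulting path can then be passed to load_dict() in place of path/to/dict.file.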

Use multithreaded mode, again with the dict_name dictionary:

segment("สวัสดีครับ", dict_name="dict_name", parallel=True)

Use safe mode to avoid long waiting times in edge cases where the text has many ambiguous word boundaries:

segment("สวัสดีครับ", dict_name="dict_name", safe=True)
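
The two flags can be wrapped in a small helper so that callers set them in one place. This is a sketch under assumptions: tokenize_lines is a hypothetical helper, not part of nlpO3, and it takes the segmentation function as a parameter so that it runs without the library installed. It relies only on the call forms shown above (text first, then dict_name, safe, and parallel as keywords) and assumes that safe and parallel may be passed together.

```python
from typing import Callable, Iterable, List


def tokenize_lines(
    lines: Iterable[str],
    segment_fn: Callable[..., List[str]],
    dict_name: str,
    safe: bool = False,
    parallel: bool = False,
) -> List[List[str]]:
    """Tokenize each non-blank line with the given segment function."""
    results = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines instead of tokenizing them
        results.append(
            segment_fn(text, dict_name=dict_name, safe=safe, parallel=parallel)
        )
    return results


# Stand-in for nlpo3's segment so the sketch runs without the library:
# it just splits on whitespace and ignores the other arguments.
def fake_segment(text, dict_name, safe=False, parallel=False):
    return text.split()


print(tokenize_lines(["hello world", "", "foo"], fake_segment, "dict_name"))
```

With the library installed, the segment function imported from nlpo3 would be passed in place of fake_segment, after the dictionary has been registered with load_dict().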

Build

Requirements

Steps

python -m pip install --upgrade build
python -m build

This should generate a wheel file in the dist/ directory, which can be installed with pip.

Issues

Please report issues at https://github.com/PyThaiNLP/nlpo3/issues
