nlpo3

Python binding for nlpO3 Thai language processing library in Rust

  • 1.3.1
  • PyPI

SPDX-FileCopyrightText: 2024 PyThaiNLP Project
SPDX-License-Identifier: Apache-2.0

nlpO3 Python binding


Python binding for nlpO3, a Thai natural language processing library in Rust.

To install:

pip install nlpo3

Features

  • Thai word tokenizer
    • segment() - use maximal-matching dictionary-based tokenization algorithm and honor Thai Character Cluster boundaries
      • 2.5x faster than similar pure Python implementation (PyThaiNLP's newmm)
    • load_dict() - load a dictionary from a plain text file (one word per line)
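To illustrate the idea behind dictionary-based maximal matching, here is a minimal pure-Python sketch. It is not nlpO3's implementation: it ignores Thai Character Cluster boundaries, and the function name maximal_match and its tie-breaking rule are assumptions made for this example. It splits a text into the fewest tokens possible, treating any character not covered by a dictionary word as a single-character token.

```python
def maximal_match(text, words):
    """Split text into the fewest tokens, using dictionary words where possible.

    Illustrative sketch only: no Thai Character Cluster handling, and
    out-of-dictionary characters become single-character tokens.
    """
    n = len(text)
    max_len = max((len(w) for w in words), default=1)

    # cost[i] = fewest tokens needed to cover text[i:]
    # nxt[i]  = end index of the first token in that best split
    cost = [0] * (n + 1)
    nxt = [n] * (n + 1)

    for i in range(n - 1, -1, -1):
        # Fallback: emit one character as an unknown token.
        cost[i] = cost[i + 1] + 1
        nxt[i] = i + 1
        # Try every dictionary word that starts at position i.
        for j in range(i + 1, min(n, i + max_len) + 1):
            # "<=" prefers the longer word when two splits tie on token count.
            if text[i:j] in words and cost[j] + 1 <= cost[i]:
                cost[i] = cost[j] + 1
                nxt[i] = j

    # Walk the recorded choices from the start of the text.
    tokens = []
    i = 0
    while i < n:
        tokens.append(text[i:nxt[i]])
        i = nxt[i]
    return tokens


if __name__ == "__main__":
    print(maximal_match("สวัสดีครับ", {"สวัสดี", "ครับ"}))
```

A production tokenizer such as segment() also has to handle cluster boundaries and ambiguous inputs efficiently, which this sketch does not attempt.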

Use

Load file path/to/dict.file to memory and assign a name dict_name to it.

Then tokenize a text with the dict_name dictionary:

from nlpo3 import load_dict, segment

load_dict("path/to/dict.file", "dict_name")
segment("สวัสดีครับ", "dict_name")

It returns a list of strings:

['สวัสดี', 'ครับ']

(result depends on words included in the dictionary)
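For a quick experiment without an existing dictionary file, the sketch below writes a small word list in the one-word-per-line format and then loads it. The helper names write_dict_file and tokenize_with_temp_dict, the dictionary name demo_dict, and the choice of UTF-8 encoding are assumptions made for this example. Only load_dict() and segment() come from nlpo3, called with the same arguments as in the example above.

```python
import os
import tempfile


def write_dict_file(words, path):
    """Write words to a plain text file, one word per line (UTF-8 assumed)."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word + "\n")
    return path


def tokenize_with_temp_dict(text, words, dict_name="demo_dict"):
    """Load a throwaway dictionary into nlpo3 and tokenize text with it."""
    # Imported here so write_dict_file() stays usable without nlpo3 installed.
    from nlpo3 import load_dict, segment

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = write_dict_file(words, os.path.join(tmp_dir, "words.txt"))
        # load_dict() reads the file into memory under the given name.
        load_dict(path, dict_name)
        return segment(text, dict_name)
```

Calling tokenize_with_temp_dict("สวัสดีครับ", ["สวัสดี", "ครับ"]) requires nlpo3 to be installed. As noted above, the tokens returned depend on the words in the dictionary.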

To use multithreaded mode with the same dict_name dictionary:

segment("สวัสดีครับ", dict_name="dict_name", parallel=True)

Use safe mode to avoid long processing times in edge cases where the text has many ambiguous word boundaries:

segment("สวัสดีครับ", dict_name="dict_name", safe=True)

Dictionary

  • In the interest of library size, nlpO3 does not assume which dictionary the user would like to use, and it does not come with a dictionary.
  • A dictionary is needed for the dictionary-based word tokenizer.
  • For tokenization dictionary, try

Build

Requirements

  • Rust 2018 Edition
  • Python 3.7 or newer (PyO3's minimum supported version)
  • Python Development Headers
    • Ubuntu: sudo apt-get install python3-dev
    • macOS: No action needed
  • PyO3 - already included in Cargo.toml
  • setuptools-rust

Steps

python -m pip install --upgrade build
python -m build

This should generate a wheel file in the dist/ directory, which can be installed with pip.

To install a wheel from a local directory (the exact file name depends on your platform and Python version):

pip install dist/nlpo3-1.3.1-cp311-cp311-macosx_12_0_x86_64.whl 

Test

To run the Python unit tests:

cd tests
python -m unittest

Issues

Please report issues at https://github.com/PyThaiNLP/nlpo3/issues

License

nlpO3 Python binding is copyrighted by its authors and licensed under the terms of the Apache Software License 2.0 (Apache-2.0). See the LICENSE file for details.

Binary wheels

A pre-built binary package is available from PyPI for these platforms:

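| Python | OS | Architecture | Has binary wheel? |
|---|---|---|---|
| 3.13 | Windows | x86 | |
| 3.13 | Windows | AMD64 | |
| 3.13 | macOS | x86_64 | |
| 3.13 | macOS | arm64 | |
| 3.13 | manylinux | x86_64 | |
| 3.13 | manylinux | i686 | |
| 3.13 | musllinux | x86_64 | |
| 3.12 | Windows | x86 | |
| 3.12 | Windows | AMD64 | |
| 3.12 | macOS | x86_64 | |
| 3.12 | macOS | arm64 | |
| 3.12 | manylinux | x86_64 | |
| 3.12 | manylinux | i686 | |
| 3.12 | musllinux | x86_64 | |
| 3.11 | Windows | x86 | |
| 3.11 | Windows | AMD64 | |
| 3.11 | macOS | x86_64 | |
| 3.11 | macOS | arm64 | |
| 3.11 | manylinux | x86_64 | |
| 3.11 | manylinux | i686 | |
| 3.11 | musllinux | x86_64 | |
| 3.10 | Windows | x86 | |
| 3.10 | Windows | AMD64 | |
| 3.10 | macOS | x86_64 | |
| 3.10 | macOS | arm64 | |
| 3.10 | manylinux | x86_64 | |
| 3.10 | manylinux | i686 | |
| 3.10 | musllinux | x86_64 | |
| 3.9 | Windows | x86 | |
| 3.9 | Windows | AMD64 | |
| 3.9 | macOS | x86_64 | |
| 3.9 | macOS | arm64 | |
| 3.9 | manylinux | x86_64 | |
| 3.9 | manylinux | i686 | |
| 3.9 | musllinux | x86_64 | |
| 3.8 | Windows | x86 | |
| 3.8 | Windows | AMD64 | |
| 3.8 | macOS | x86_64 | |
| 3.8 | macOS | arm64 | |
| 3.8 | manylinux | x86_64 | |
| 3.8 | manylinux | i686 | |
| 3.8 | musllinux | x86_64 | |
| 3.7 | Windows | x86 | |
| 3.7 | Windows | AMD64 | |
| 3.7 | macOS | x86_64 | |
| 3.7 | macOS | arm64 | |
| 3.7 | manylinux | x86_64 | |
| 3.7 | manylinux | i686 | |
| 3.7 | musllinux | x86_64 | |
| PyPy 3.10 | Windows | x86 | |
| PyPy 3.10 | Windows | AMD64 | |
| PyPy 3.10 | macOS | x86_64 | |
| PyPy 3.10 | macOS | arm64 | |
| PyPy 3.10 | manylinux | x86_64 | |
| PyPy 3.10 | manylinux | i686 | |
| PyPy 3.9 | Windows | x86 | |
| PyPy 3.9 | Windows | AMD64 | |
| PyPy 3.9 | macOS | x86_64 | |
| PyPy 3.9 | macOS | arm64 | |
| PyPy 3.9 | manylinux | x86_64 | |
| PyPy 3.9 | manylinux | i686 | |
| PyPy 3.8 | Windows | x86 | |
| PyPy 3.8 | Windows | AMD64 | |
| PyPy 3.8 | macOS | x86_64 | |
| PyPy 3.8 | macOS | arm64 | |
| PyPy 3.8 | manylinux | x86_64 | |
| PyPy 3.8 | manylinux | i686 | |
| PyPy 3.7 | Windows | x86 | |
| PyPy 3.7 | Windows | AMD64 | |
| PyPy 3.7 | macOS | x86_64 | |
| PyPy 3.7 | macOS | arm64 | |
| PyPy 3.7 | manylinux | x86_64 | |
| PyPy 3.7 | manylinux | i686 | |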
