Latest Threat Research:SANDWORM_MODE: Shai-Hulud-Style npm Worm Hijacks CI Workflows and Poisons AI Toolchains.Details
Socket
Book a DemoInstallSign in
Socket

pywordsegment

Package Overview
Dependencies
Maintainers
1
Versions
12
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

pywordsegment

Concatenated-word segmentation Python library written in Rust

pipPyPI
Version
0.4.1
Maintainers
1

Logo

Concatenated-word segmentation Python library written in Rust

license Python OS Build PyPi

Table of Contents

About The Project

A fast concatenated-word segmentation library written in Rust, inspired by wordninja and wordsegment. The binding uses pyo3 to interact with the rust package.

Built With

Installation

pip3 install pywordsegment

Usage

import pywordsegment

# The internal UNIGRAMS & BIGRAMS corpuses are lazy initialized
# once per the whole module. Multiple WordSegmenter instances would
# not create new dictionaries.

# Segments a word to its parts
pywordsegment.WordSegmenter.segment(
    text="theusashops",
)
# ["the", "usa", "shops"]


# This function checks whether the substring exists as a whole segment
# inside text.
pywordsegment.WordSegmenter.exist_as_segment(
    substring="inter",
    text="internationalairport",
)
# False

pywordsegment.WordSegmenter.exist_as_segment(
    substring="inter",
    text="intermilan",
)
# True

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Gal Ben David - gal@intsights.com

Project Link: https://github.com/intsights/pywordsegment

Keywords

word

FAQs

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts