lulalala_address_tokenizer


  • 0.1.1
  • Rubygems
LulalalaAddressTokenizer

Postal address tokenizer using a Wapiti model.

Intended for addresses written in CJK (Chinese, Japanese, Korean) characters. After the Wapiti model labels each token (character), this gem combines adjacent characters that share a label into a single phrase. This matters for CJK languages because their phrases (combinations of words) are not separated by spaces.
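The combining step can be illustrated with a minimal sketch. It assumes the labeler yields one [character, label] pair per character; `merge_adjacent` is a hypothetical helper written for illustration, not the gem's actual implementation or API.

```ruby
# Hypothetical helper (not part of the gem): merges runs of adjacent
# characters that carry the same label into one phrase per run.
# Input:  array of [character, label] pairs, in address order.
# Output: array of [label, phrase] pairs, in the same order.
def merge_adjacent(labeled_chars)
  labeled_chars
    .chunk_while { |left, right| left.last == right.last } # group equal neighbouring labels
    .map { |run| [run.first.last, run.map(&:first).join] } # [label, joined characters]
end

# Assumed labeler output for the first two parts of "AA縣BB鎮CC路D號".
labeled = [
  ["A", "city"], ["A", "city"], ["縣", "city"],
  ["B", "district"], ["B", "district"], ["鎮", "district"]
]

merged = merge_adjacent(labeled)
```

Calling `to_h` on the pairs would give a hash keyed by label, but a label that appears in two separate runs would then keep only its last phrase, which is why this sketch returns pairs.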

For tokenizing Taiwanese addresses.

Installation

Add this line to your application's Gemfile:

gem 'lulalala_address_tokenizer'

And then execute:

$ bundle

Or install it yourself as:

$ gem install lulalala_address_tokenizer

Usage

tokenizer = LulalalaAddressTokenizer.new('address.mod')
tokenizer.parse("AA縣BB鎮CC路D號")
# {"city"=>"AA縣", "district"=>"BB鎮", "street"=>"CC路", "housenumber"=>"D號"}
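A result like the one above might be consumed as follows. This is only a sketch: it assumes `parse` returns a plain Hash keyed by label, as the comment suggests, and it works on a literal copy of that hash rather than calling the gem. `join_fields` is a hypothetical helper, not something the gem provides.

```ruby
# Literal copy of the example result from the usage comment above.
parsed = {
  "city" => "AA縣", "district" => "BB鎮",
  "street" => "CC路", "housenumber" => "D號"
}

# Hypothetical helper: rebuilds part of an address from the chosen
# labels, silently skipping any label missing from the result.
def join_fields(parsed, keys)
  parsed.values_at(*keys).compact.join
end

area = join_fields(parsed, %w[city district])
```

Skipping missing labels is a design choice for this sketch; whether the gem omits labels it does not find in an address is not stated here.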

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/lulalala_address_tokenizer.


Package last updated on 22 Nov 2016
