mecab-ko-dic

mecab-ko-dic in UTF-8 format - a Korean dictionary for use with Mecab.

  • 0.1.3
  • latest

mecab-ko-dic-utf-8

mecab-ko-dic (a Korean dictionary for use with Mecab), in UTF-8 format, packaged as a CocoaPods pod and an npm package for use on iOS/macOS.

Installation

Installing from Cocoapods

Specify this pod in your Podfile:

pod 'mecab-ko-dic-utf-8'

Then update your pods:

pod update

Installing as a Cocoapod from npm (for React Native iOS apps)

Add this npm package:

yarn add mecab-ko-dic

# or:

npm install --save mecab-ko-dic

Next, specify this pod in your Podfile:

pod 'mecab-ko-dic-utf-8', :podspec => '../node_modules/mecab-ko-dic/mecab-ko-dic-utf-8.podspec'

Don't forget to install the pods.

cd ios
pod update

Feature schema

The dictionary format and the part-of-speech tags used by mecab-ko-dic are documented in a Google Spreadsheet that is linked from the README of mecab-ko-dic's repo.

Note that ko-dic has one fewer feature column than NAIST JDIC and provides an altogether different set of information (e.g. it doesn't provide the "original form" of the word).

The tags are a slight modification of the 세종 (Sejong) tag set, the part-of-speech tag set used by the Sejong corpus. The mappings from Sejong to mecab-ko-dic's tag names are given in the "태그 v2.0" tab of the spreadsheet.

The dictionary format is specified fully (in Korean) in tab "사전 형식 v2.0" of the spreadsheet. Any blank values default to *.

| index | role (Korean) | role (English) | notes |
|---|---|---|---|
| 0 | 품사 태그 | part-of-speech tag | See "태그 v2.0" tab on spreadsheet |
| 1 | 의미 부류 | semantic class | e.g. 행위 ("action") in the example output; otherwise * |
| 2 | 종성 유무 | presence or absence of a final consonant (jongseong) | T for true; F for false; else * |
| 3 | 읽기 | reading | Usually matches the surface form, but may differ for foreign words, e.g. words written in Chinese characters |
| 4 | 타입 | type | One of: Inflect (활용); Compound (복합명사); or Preanalysis (기분석) |
| 5 | 첫번째 품사 | first part-of-speech | e.g. given a part-of-speech tag of "VV+EM+VX+EP", would return VV |
| 6 | 마지막 품사 | last part-of-speech | e.g. given a part-of-speech tag of "VV+EM+VX+EP", would return EP |
| 7 | 표현 | expression | "활용, 복합명사, 기분석이 어떻게 구성되는지 알려주는 필드": a field that shows how inflected forms, compound nouns, and pre-analysed entries are composed |

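The eight columns above can be read into named fields. The following is a minimal Python sketch, not part of mecab-ko-dic or MeCab: the names KoDicFeatures, parse_features and first_and_last_pos are hypothetical, as are the English field names. It assumes no field contains a comma, which holds for the example strings in this README.

```python
from typing import NamedTuple, Tuple

FIELD_COUNT = 8  # mecab-ko-dic feature strings have eight columns (index 0 to 7)


class KoDicFeatures(NamedTuple):
    # Hypothetical English names for columns 0..7; the dictionary only fixes the order.
    pos_tag: str
    semantic_class: str
    has_final_consonant: str
    reading: str
    entry_type: str
    first_pos: str
    last_pos: str
    expression: str


def parse_features(feature_string: str) -> KoDicFeatures:
    """Split a comma-separated feature string into eight named fields."""
    parts = feature_string.split(",")
    # Blank values default to "*", as stated for the dictionary format.
    parts = [part if part else "*" for part in parts]
    # Pad short strings with "*" so that all eight fields are always present.
    parts += ["*"] * (FIELD_COUNT - len(parts))
    return KoDicFeatures(*parts[:FIELD_COUNT])


def first_and_last_pos(pos_tag: str) -> Tuple[str, str]:
    """Return the first and last tags of a compound tag such as "VV+EM+VX+EP"."""
    tags = pos_tag.split("+")
    return tags[0], tags[-1]


features = parse_features("VV+ETM,*,T,위한,Inflect,VV,ETM,위하/VV/*+ᆫ/ETM/*")
print(features.pos_tag, features.entry_type, features.first_pos, features.last_pos)
print(first_and_last_pos("VV+EM+VX+EP"))
```

Columns 5 and 6 are only filled in for some entries (Inflect entries in the example output), so first_and_last_pos shows how the same information can be derived from column 0 when they are *.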
Example feature strings

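Input sentence ("mecab-ko-dic is a project for performing Korean morphological analysis using MeCab."):

mecab-ko-dic은 MeCab을 사용하여, 한국어 형태소 분석을 하기 위한 프로젝트입니다.

Output: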

mecab    SL,*,*,*,*,*,*,*
-    SY,*,*,*,*,*,*,*
ko    SL,*,*,*,*,*,*,*
-    SY,*,*,*,*,*,*,*
dic    SL,*,*,*,*,*,*,*
은    JX,*,T,은,*,*,*,*
MeCab    SL,*,*,*,*,*,*,*
을    JKO,*,T,을,*,*,*,*
사용    NNG,행위,T,사용,*,*,*,*
하    XSV,*,F,하,*,*,*,*
여    EC,*,F,여,*,*,*,*
,    SC,*,*,*,*,*,*,*
한국어    NNG,*,F,한국어,Compound,*,*,한국/NNG/*+어/NNG/*
형태소    NNG,*,F,형태소,Compound,*,*,형태/NNG/*+소/NNG/*
분석    NNG,행위,T,분석,*,*,*,*
을    JKO,*,T,을,*,*,*,*
하    VV,*,F,하,*,*,*,*
기    ETN,*,F,기,*,*,*,*
위한    VV+ETM,*,T,위한,Inflect,VV,ETM,위하/VV/*+ᆫ/ETM/*
프로젝트    NNG,*,F,프로젝트,*,*,*,*
입니다    VCP+EF,*,F,입니다,Inflect,VCP,EF,이/VCP/*+ᄇ니다/EF/*
.    SF,*,*,*,*,*,*,*
EOS
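Output like the above can be turned into tokens with a few lines of code. This is a rough Python sketch under stated assumptions: parse_mecab_output and split_expression are hypothetical helper names, each output line is taken to be a surface form, a tab (or, as rendered here, a run of spaces), and then the comma-separated features, and each component of the expression field is taken to have the surface/tag/class shape seen in the examples.

```python
from typing import List, Tuple


def parse_mecab_output(output: str) -> List[Tuple[str, List[str]]]:
    """Return (surface, features) pairs for the lines before the first EOS."""
    tokens = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "EOS":
            break  # end of the first sentence; later sentences are ignored here
        # Prefer a tab separator; fall back to the first run of whitespace.
        parts = line.split("\t", 1) if "\t" in line else line.split(None, 1)
        if len(parts) != 2:
            continue  # skip lines that do not look like "surface<sep>features"
        surface, features = parts
        tokens.append((surface.strip(), features.strip().split(",")))
    return tokens


def split_expression(expression: str) -> List[Tuple[str, ...]]:
    """Split an expression such as "한국/NNG/*+어/NNG/*" into its components."""
    if expression == "*":
        return []
    # Components are joined with "+"; each one is split on "/".
    return [tuple(part.split("/")) for part in expression.split("+")]


sample = (
    "한국어\tNNG,*,F,한국어,Compound,*,*,한국/NNG/*+어/NNG/*\n"
    "분석\tNNG,행위,T,분석,*,*,*,*\n"
    "EOS\n"
)
for surface, features in parse_mecab_output(sample):
    print(surface, features[0], split_expression(features[7]))
```

Splitting the expression on "+" and "/" is deliberately naive: it would misread an entry whose surface form itself contains either character, so treat it as a starting point rather than a complete parser.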

License

mecab-ko-dic-utf-8 is available only under the Apache 2.0 licence: mecab-ko-dic-utf-8/bundleContents/COPYING.

Package last updated on 21 Mar 2020
