@lazy-cjk/japanese
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
$ npm install --save japanese
var japanese = require('japanese');
japanese.hiraganize('ヱヴァンゲリヲン');
For crazy syntax sugar junkies:
var japanese = require('japanese/sugar');
'ヱヴァンゲリヲン'.hiraganize();
A command-line interface is also available.
$ npm install japanese -g
$ japanese
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Usage:
japanese <input> [options]
Options:
-h, --hiraganize hiraganize input string
-k, --katakanize katakanize input string
-r, --romanize romanize input string
Example
japanese ヱヴァンゲリヲン --hiraganize
Convert input katakana into hiragana.
text
The text to hiraganize
japanese.hiraganize('ヱヴァンゲリヲン'); // ゑゔぁんげりをん
japanese.hiraganize('チヨコバナヽ'); // ちよこばなゝ
japanese.hiraganize('ヹルタースオリジナル'); // ゑ゙るたーすおりじなる
japanese.hiraganize('板垣死ス𪜈'); // 板垣死すとも
Convert input hiragana into katakana.
text
The text to katakanize
japanese.katakanize('抹茶あいす'); // 抹茶アイス
japanese.katakanize('ばゞへらあいす'); // バヾヘラアイス
japanese.katakanize('ゐ゙よろん'); // ヸヨロン
japanese.katakanize('本日ゟかき氷解禁'); // 本日ヨリカキ氷解禁
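For most characters, the two conversions above amount to a fixed code point shift, because the Unicode hiragana and katakana blocks are laid out in parallel, 0x60 apart. The following is a minimal sketch of that idea and not the implementation used by japanese.js; the names `shiftKana`, `hiraganizeSketch`, and `katakanizeSketch` are hypothetical. It deliberately leaves out the special cases shown in the examples, such as ヹ, ゐ゙, ゟ, and 𪜈, which need more than a shift.

```javascript
// Sketch only: NOT the library's implementation.
// Hiragana ぁ (U+3041) and katakana ァ (U+30A1) are 0x60 code points apart.
const KANA_OFFSET = 0x60;

// Shift every code point that falls inside one of `ranges` by `delta`.
// Array.from iterates by code point, so astral characters stay intact.
function shiftKana(text, ranges, delta) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    const inRange = ranges.some(([lo, hi]) => cp >= lo && cp <= hi);
    return inRange ? String.fromCodePoint(cp + delta) : ch;
  }).join('');
}

// Katakana ァ..ヶ and the iteration marks ヽ ヾ map down to hiragana.
function hiraganizeSketch(text) {
  return shiftKana(text, [[0x30a1, 0x30f6], [0x30fd, 0x30fe]], -KANA_OFFSET);
}

// Hiragana ぁ..ゖ and the iteration marks ゝ ゞ map up to katakana.
function katakanizeSketch(text) {
  return shiftKana(text, [[0x3041, 0x3096], [0x309d, 0x309e]], KANA_OFFSET);
}
```

Characters outside the listed ranges, including kanji and the long vowel mark ー, pass through unchanged.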
Convert input text into romaji.
Important: Most definitions of Japanese romanization require full understanding of the text, but robots cannot actually think or understand! Some conversions are hopelessly poor. For example, ISO 3602 specifies that "こうし" meaning "講師" must be romanized as "kôsi", while "こうし" meaning "子牛" must be romanized as "kousi" (because 子牛 is a compound of 子 and 牛), even though the two are identical in kana. While japanese.js is very... very very thoroughly tested, this module (like any other romanization machine) cannot distinguish between these meanings. So unfortunately, you cannot use this function for official writing or the like. Ugh.
text
The text to romanize
config
The configuration object or string used to romanize. Described below.
japanese.romanize('れんあいかんじょう'); // ren'aikanjō
japanese.romanize('ツァトゥグァ'); // tsatugwa
japanese.romanize('くうぼをきゅう', 'kunrei'); // kûbookyû
japanese.romanize('でんぢゃらす', 'nihon'); // dendyarasu
japanese.romanize('いいづか とおる', {
'いい': 'ii',
'おお': 'oh',
}); // iizuka tohru
Config is represented as a plain object, where each key stands for a group of similar characters and its value determines how those characters are converted. So the object is not simply a conversion table.
The available parameters are as follows.
Key | Available Values |
---|---|
し | si, shi |
ち | ti, chi |
つ | tu, tsu |
ふ | hu, fu |
じ | zi, ji |
ぢ | di, zi, ji, dzi, dji |
づ | du, zu, dsu, dzu |
ああ | aa, ah, â, ā, a |
いい | ii, ih, î, ī, i |
うう | uu, uh, û, ū, u |
ええ | ee, eh, ê, ē, e |
おお | oo, oh, ô, ō, o |
あー | a-, aa, ah, â, ā, a |
えい | ei, ee, eh, ê, ē, e |
おう | ou, oo, oh, ô, ō, o |
んあ | na, n'a, n-a |
んば | nba, mba |
っち | tti, tchi, cchi |
ゐ | i, wi |
を | o, wo |
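Since each key accepts only a fixed set of values, a config object can be checked before it is passed to romanize. The following is a minimal sketch under stated assumptions: `ALLOWED` copies just three rows from the table above, and `findConfigProblems` is a hypothetical helper, not part of japanese.js.

```javascript
// Allowed values copied from three rows of the table above.
// The remaining rows are omitted for brevity, so keys from those rows
// would be reported as unknown by this sketch.
const ALLOWED = {
  'し': ['si', 'shi'],
  'ぢ': ['di', 'zi', 'ji', 'dzi', 'dji'],
  'おお': ['oo', 'oh', 'ô', 'ō', 'o'],
};

// Hypothetical helper: returns a list of problems found in a config object.
function findConfigProblems(config) {
  const problems = [];
  for (const [key, value] of Object.entries(config)) {
    if (!Object.prototype.hasOwnProperty.call(ALLOWED, key)) {
      problems.push(`unknown key: ${key}`);
    } else if (!ALLOWED[key].includes(value)) {
      problems.push(`invalid value for ${key}: ${value}`);
    }
  }
  return problems;
}
```

An empty result means every key and value in the object appears in the (partial) table.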
You can also specify these predefined configs by supplying a string. Default is wikipedia.
Key | 'wikipedia' | 'traditional hepburn' | 'modified hepburn' | 'kunrei' | 'nihon' |
---|---|---|---|---|---|
し | shi | shi | shi | si | si |
ち | chi | chi | chi | ti | ti |
つ | tsu | tsu | tsu | tu | tu |
ふ | fu | fu | fu | hu | hu |
じ | ji | ji | ji | zi | zi |
ぢ | ji | ji | ji | zi | di |
づ | zu | zu | zu | zu | du |
ああ | aa | aa | ā | â | ā |
いい | ii | ii | ii | î | ī |
うう | ū | ū | ū | û | ū |
ええ | ee | ee | ē | ê | ē |
おお | ō | ō | ō | ô | ō |
あー | ā | ā | ā | â | ā |
えい | ei | ei | ei | ei | ei |
おう | ō | ō | ō | ô | ō |
んあ | n'a | n-a | n'a | n'a | n'a |
んば | nba | mba | nba | nba | nba |
っち | tchi | tchi | tchi | tti | tti |
ゐ | i | i | i | i | wi |
を | o | wo | o | o | wo |
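One way to picture how a preset name and a custom object relate is sketched below. This is an illustration, not the library's code: `PRESETS` holds only three rows of two presets copied from the table above, and `resolveConfig` is a hypothetical helper. Merging a custom object over the default preset is an assumption, inferred from the `iizuka tohru` example, where づ still comes out as `zu` even though the supplied object does not mention it.

```javascript
// Partial presets: three rows copied from the table above, two presets only.
const PRESETS = {
  wikipedia: { 'し': 'shi', 'づ': 'zu', 'おお': 'ō' },
  kunrei: { 'し': 'si', 'づ': 'zu', 'おお': 'ô' },
};

// Hypothetical helper: turn a preset name or a partial object into a full config.
function resolveConfig(config = 'wikipedia') {
  if (typeof config === 'string') {
    if (!Object.prototype.hasOwnProperty.call(PRESETS, config)) {
      throw new Error(`unknown preset: ${config}`);
    }
    return { ...PRESETS[config] };
  }
  // Assumption: keys missing from a custom object fall back to the default preset.
  return { ...PRESETS.wikipedia, ...config };
}
```

With this sketch, `resolveConfig('kunrei')` yields the kunrei rows unchanged, while `resolveConfig({ 'おお': 'oh' })` keeps the wikipedia values for し and づ and overrides only おお.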
And here are short notes about these romanizations.
Source: http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Japan-related_articles#Romanization
The most modern and widely used form of romanization. Wikipedia uses this guideline for its article titles and text. It is a mix of traditional and modified Hepburn and is easily recognizable for everyone.
Source: http://en.wikipedia.org/wiki/Hepburn_romanization
Actually, this is not a specification. Hepburn romanization is very widely known, but nobody other than Hepburn knows the REAL definition of this method.
Source: http://www.iso.org/iso/catalogue_detail.htm?csnumber=9029
Kunrei-shiki is defined as ISO 3602 and Nihon-shiki as ISO 3602 Strict. These romanizations are somewhat obsolete today but are still the only standardized romanizations in the world.
--input <file> and --output <file> option
japanese --hiraganize <string> to work
...and any proposal or idea for enhancing japanese.js is welcome! Tell me, tell me, tell me!
MIT © hakatashi