japanese.js
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Install
$ npm install --save japanese
Usage
var japanese = require('japanese');
japanese.hiraganize('ヱヴァンゲリヲン');
For crazy syntax sugar junkies:
var japanese = require('japanese/sugar');
'ヱヴァンゲリヲン'.hiraganize();
Command
A command-line interface is also available.
$ npm install japanese -g
$ japanese
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Usage:
japanese <input> [options]
Options:
-h, --hiraganize hiraganize input string
-k, --katakanize katakanize input string
-r, --romanize romanize input string
Example
japanese ヱヴァンゲリヲン --hiraganize
API
japanese.hiraganize(text)
Convert input katakana into hiragana.
Arguments
text
The text to hiraganize
Example
japanese.hiraganize('ヱヴァンゲリヲン');
japanese.hiraganize('チヨコバナヽ');
japanese.hiraganize('ヹルタースオリジナル');
japanese.hiraganize('板垣死ス𪜈');
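For the basic kana block, this conversion can be pictured as a fixed code point shift: each katakana from ァ (U+30A1) to ヶ (U+30F6) sits exactly 0x60 above its hiragana counterpart in Unicode. The following is a minimal sketch under that assumption. `simpleHiraganize` is a hypothetical helper, not the library's actual implementation, and it ignores the harder cases shown above, such as ヹ and 𪜈.

```javascript
// Hypothetical helper for illustration; not part of japanese.js.
function simpleHiraganize(text) {
  // Array.from iterates by code point, so surrogate pairs stay intact.
  return Array.from(text, function (ch) {
    var code = ch.codePointAt(0);
    var isKatakana = code >= 0x30A1 && code <= 0x30F6; // ァ..ヶ
    var isIterationMark = code === 0x30FD || code === 0x30FE; // ヽ, ヾ
    // Shift down by 0x60 to reach the matching hiragana; keep the rest.
    return (isKatakana || isIterationMark)
      ? String.fromCodePoint(code - 0x60)
      : ch;
  }).join('');
}

console.log(simpleHiraganize('ヱヴァンゲリヲン')); // 'ゑゔぁんげりをん'
console.log(simpleHiraganize('チヨコバナヽ')); // 'ちよこばなゝ'
```

Characters outside those ranges, including kanji and Latin letters, pass through unchanged.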
japanese.katakanize(text)
Convert input hiragana into katakana.
Arguments
text
The text to katakanize
Example
japanese.katakanize('抹茶あいす');
japanese.katakanize('ばゞへらあいす');
japanese.katakanize('ゐ゙よろん');
japanese.katakanize('本日ゟかき氷解禁');
japanese.romanize(text[, config])
Convert input text into romaji.
Important: Most romanization schemes for Japanese text assume full understanding of the
text, but robots cannot actually think or understand!
Some conversions are hopelessly poor. For example, ISO 3602 specifies that "こうし" meaning
"講師" must be romanized as "kôsi", while "こうし" meaning "子牛" must be romanized
as "kousi" (because 子牛 is a compound of 子 and 牛), even though the two are identical
in kana. While japanese.js is very... very very thoroughly tested, this module (like any
other romanization machine) cannot distinguish between these meanings. So unfortunately,
you cannot rely on this function for official writing. Ugh.
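The ambiguity described above can be made concrete with a small sketch. The data below only restates the "こうし" example from the note, and `findAmbiguousKana` is a hypothetical helper, not part of japanese.js. It shows that any function taking kana alone as input cannot produce both expected outputs.

```javascript
// Two words that share one kana spelling but differ in their
// expected ISO 3602 romanization (values taken from the note above).
var entries = [
  { word: '講師', kana: 'こうし', romaji: 'kôsi' },
  { word: '子牛', kana: 'こうし', romaji: 'kousi' }
];

// Hypothetical helper: list kana spellings that have more than one
// expected romanization, which a kana-only converter cannot resolve.
function findAmbiguousKana(list) {
  var seen = {};
  list.forEach(function (entry) {
    seen[entry.kana] = seen[entry.kana] || [];
    if (seen[entry.kana].indexOf(entry.romaji) === -1) {
      seen[entry.kana].push(entry.romaji);
    }
  });
  return Object.keys(seen).filter(function (kana) {
    return seen[kana].length > 1;
  });
}

console.log(findAmbiguousKana(entries)); // [ 'こうし' ]
```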
Arguments
text
The text to romanize
config
The configuration object or string used to romanize. Described below.
Example
japanese.romanize('れんあいかんじょう');
japanese.romanize('ツァトゥグァ');
japanese.romanize('くうぼをきゅう', 'kunrei');
japanese.romanize('でんぢゃらす', 'nihon');
japanese.romanize('いいづか とおる', {
'いい': 'ii',
'おお': 'oh',
});
Configs
A config is represented as a plain object, where each key stands for a group of
similar characters and its value determines how those characters are converted.
So the object is not simply a conversion table.
The available parameters are as follows.
Key | Available Values |
---|---|
し | si, shi |
ち | ti, chi |
つ | tu, tsu |
ふ | hu, fu |
じ | zi, ji |
ぢ | di, zi, ji, dzi, dji |
づ | du, zu, dsu, dzu |
ああ | aa, ah, â, ā, a |
いい | ii, ih, î, ī, i |
うう | uu, uh, û, ū, u |
ええ | ee, eh, ê, ē, e |
おお | oo, oh, ô, ō, o |
あー | a-, aa, ah, â, ā, a |
えい | ei, ee, eh, ê, ē, e |
おう | ou, oo, oh, ô, ō, o |
んあ | na, n'a, n-a |
んば | nba, mba |
っち | tti, tchi, cchi |
ゐ | i, wi |
を | o, wo |
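To illustrate how one key can stand for a whole group of characters, here is a minimal sketch. It assumes that the し key also governs related kana such as しゃ. The `groups` data and the `expandConfig` function are hypothetical and only model the idea; they are not the library's internal tables.

```javascript
// Hypothetical illustration only; not japanese.js's internal data.
// Each config key names a group, and each allowed value selects one
// spelling for every kana in that group.
var groups = {
  'し': {
    shi: { 'し': 'shi', 'しゃ': 'sha', 'しゅ': 'shu', 'しょ': 'sho' },
    si: { 'し': 'si', 'しゃ': 'sya', 'しゅ': 'syu', 'しょ': 'syo' }
  },
  'つ': {
    tsu: { 'つ': 'tsu' },
    tu: { 'つ': 'tu' }
  }
};

// Expand a config object into a flat kana-to-romaji conversion table.
function expandConfig(config) {
  var table = {};
  Object.keys(config).forEach(function (key) {
    var variants = groups[key];
    var value = config[key];
    if (!variants || !Object.prototype.hasOwnProperty.call(variants, value)) {
      throw new Error('Unsupported config: ' + key + ' = ' + value);
    }
    Object.assign(table, variants[value]);
  });
  return table;
}

var table = expandConfig({ 'し': 'si', 'つ': 'tsu' });
console.log(table['しゃ']); // 'sya'
console.log(table['つ']); // 'tsu'
```

A single entry such as `'し': 'si'` therefore changes several rows of the resulting table at once, which is why a config is more compact than a plain conversion table.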
You can also select one of the following predefined configs by passing its name as a string. The default is 'wikipedia'.
Key | 'wikipedia' | 'traditional hepburn' | 'modified hepburn' | 'kunrei' | 'nihon' |
---|---|---|---|---|---|
し | shi | shi | shi | si | si |
ち | chi | chi | chi | ti | ti |
つ | tsu | tsu | tsu | tu | tu |
ふ | fu | fu | fu | hu | hu |
じ | ji | ji | ji | zi | zi |
ぢ | ji | ji | ji | zi | di |
づ | zu | zu | zu | zu | du |
ああ | aa | aa | ā | â | ā |
いい | ii | ii | ii | î | ī |
うう | ū | ū | ū | û | ū |
ええ | ee | ee | ē | ê | ē |
おお | ō | ō | ō | ô | ō |
あー | ā | ā | ā | â | ā |
えい | ei | ei | ei | ei | ei |
おう | ō | ō | ō | ô | ō |
んあ | n'a | n-a | n'a | n'a | n'a |
んば | nba | mba | nba | nba | nba |
っち | tchi | tchi | tchi | tti | tti |
ゐ | i | i | i | i | wi |
を | o | wo | o | o | wo |
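Read column by column, each preset is just one choice per key. As a sketch, the 'kunrei' column above can be written out as a config object. It is an assumption, not verified against the library, that passing this object behaves the same as passing the string 'kunrei'. The names `kunreiConfig` and `customConfig` are hypothetical.

```javascript
// The 'kunrei' column of the table above, written out as a config object.
// Assumption: japanese.romanize(text, kunreiConfig) would behave like
// japanese.romanize(text, 'kunrei').
var kunreiConfig = {
  'し': 'si', 'ち': 'ti', 'つ': 'tu', 'ふ': 'hu',
  'じ': 'zi', 'ぢ': 'zi', 'づ': 'zu',
  'ああ': 'â', 'いい': 'î', 'うう': 'û', 'ええ': 'ê', 'おお': 'ô',
  'あー': 'â', 'えい': 'ei', 'おう': 'ô',
  'んあ': "n'a", 'んば': 'nba', 'っち': 'tti',
  'ゐ': 'i', 'を': 'o'
};

// Derive a custom variant by overriding a single key.
var customConfig = Object.assign({}, kunreiConfig, { 'を': 'wo' });

console.log(customConfig['を']); // 'wo'
console.log(customConfig['し']); // 'si'
```

Starting from a preset and overriding only the keys that differ keeps a custom config short and makes the deviation from the standard easy to see.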
And here are short notes about these romanizations.
Wikipedia style
Source: http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Japan-related_articles#Romanization
The most modern and widely used form of romanization. Wikipedia uses this guideline for
its article titles and text. It is a mix of traditional and modified Hepburn
and is easily recognizable by everyone.
Traditional and Modified Hepburn
Source: http://en.wikipedia.org/wiki/Hepburn_romanization
Actually, this is not a specification. Hepburn romanization is very widely known, but nobody
other than Hepburn knows the REAL definition of this method.
Kunrei-shiki and Nihon-shiki
Source: http://www.iso.org/iso/catalogue_detail.htm?csnumber=9029
Kunrei-shiki is defined as ISO 3602 and Nihon-shiki as ISO 3602 Strict. These romanizations
are somewhat obsolete today, but they are still the only standardized romanizations in the world.
Roadmap
- japanese.deromanize()
- japanese.cyrillize()
- japanese.decyrillize()
- japanese.hangulize()
- japanese.dehangulize()
- japanese.arabize()
- japanese.dearabize()
- japanese.gyarumojize()
- japanese.isKatakana()
- japanese.isHiragana()
- japanese.isKanji()
- japanese.isJoyoKanji()
- japanese.isKinsoku() (JIS X 4051 compatibility is preferred)
- CLI
  - --input <file> and --output <file> options
  - japanese --hiraganize <string> to work
...and any proposal or idea for enhancing japanese.js is welcome! Tell me, tell me, tell me!
License
MIT © hakatashi