Version 6.0.0 (latest), published on npm.

trigrams

Trigrams for 500+ languages.

What is this?

This package exposes trigrams for 500+ natural languages. The trigrams are based on the most translated copyright-free document on the planet: the Universal Declaration of Human Rights (UDHR).

When should I use this?

When you are dealing with natural language detection.

Install

This package is ESM only. In Node.js (version 18+), install with npm:

npm install trigrams

In Deno with esm.sh:

import {min, top} from 'https://esm.sh/trigrams@6'

In browsers with esm.sh:

<script type="module">
  import {min, top} from 'https://esm.sh/trigrams@6?bundle'
</script>

Use

import {min, top} from 'trigrams'

console.log((await min()).nld)
console.log((await top()).pam)

Yields:

[ // 300 top trigrams.
  ' ar',
  'eer',
  'tij',
  // …
  'de ',
  'an ',
  'en ' // Most common trigram.
]
{ // 300 top trigrams.
  'isa': 6,
  'upa': 6,
  'i k': 6,
  // …
  'ang': 273,
  'ing': 282,
  'ng ': 572 // Most common trigram with how often it was found.
}
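
Since the package targets language detection, here is a rough sketch of how lists shaped like the `min()` output might be compared against a piece of text. It uses a simple rank-order ("out-of-place") comparison, which is one common approach and not necessarily how any particular detector works. The helper names (`rankTrigrams`, `distance`, `detect`) are hypothetical, and the two models are only the abbreviated samples shown above, so real results would need the full lists.

```javascript
// Abbreviated models taken from the example output above.
// Each list runs from least common to most common, like the `min()` output.
const models = {
  nld: [' ar', 'eer', 'tij', 'de ', 'an ', 'en '],
  pam: ['isa', 'upa', 'i k', 'ang', 'ing', 'ng ']
}

// List the trigrams of a text, most common first.
// Assumption: a simplified clean-up (lowercase, collapse white space, pad with one space).
function rankTrigrams(text) {
  const padded = ' ' + text.toLowerCase().replace(/\s+/g, ' ').trim() + ' '
  const counts = new Map()
  for (let index = 0; index + 3 <= padded.length; index++) {
    const trigram = padded.slice(index, index + 3)
    counts.set(trigram, (counts.get(trigram) || 0) + 1)
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([trigram]) => trigram)
}

// Sum how far each text trigram is from its rank in the model.
// Trigrams missing from the model cost a fixed penalty; matches are capped at it.
function distance(textTrigrams, model) {
  const ranks = new Map([...model].reverse().map((trigram, rank) => [trigram, rank]))
  const penalty = model.length
  let total = 0
  textTrigrams.forEach((trigram, rank) => {
    total += ranks.has(trigram)
      ? Math.min(Math.abs(ranks.get(trigram) - rank), penalty)
      : penalty
  })
  return total
}

// Return [code, distance] pairs, closest model first.
function detect(text, models) {
  const textTrigrams = rankTrigrams(text)
  return Object.entries(models)
    .map(([code, model]) => [code, distance(textTrigrams, model)])
    .sort((a, b) => a[1] - b[1])
}

console.log(detect('een van de', models))
```

With the real package, the `models` object could presumably be replaced by the result of `await min()`, and the text would ideally be cleaned the same way the data was.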

API

This package exports the identifiers min and top. It exports no TypeScript types. There is no default export.

min()

Get top trigrams.

Returns

Returns a promise resolving to an object mapping UDHR in Unicode codes to arrays containing the top 300 trigrams, sorted from least occurring to most occurring (Promise<Record<string, Array<string>>>).
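
Because each list runs from least to most common, the most frequent trigrams sit at the end. A minimal sketch of reading them most-common-first follows; `mostCommon` is a hypothetical helper, and the sample array is the abbreviated Dutch output from the example above rather than the full 300 entries.

```javascript
// Sample shaped like one entry of the object `min()` resolves to
// (abbreviated; ordered from least common to most common).
const dutch = [' ar', 'eer', 'tij', 'de ', 'an ', 'en ']

// Hypothetical helper: return the `count` most common trigrams, most common first.
function mostCommon(trigrams, count) {
  const start = Math.max(0, trigrams.length - count)
  // `slice` copies, so the original list is not reversed in place.
  return trigrams.slice(start).reverse()
}

console.log(mostCommon(dutch, 2))
```

With the package installed, the same helper could be applied to real data, for example `mostCommon((await min()).nld, 10)`.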

top()

Get top trigrams to occurrence counts.

Returns

Returns a promise resolving to an object mapping UDHR in Unicode codes to objects mapping the top 300 trigrams to occurrence counts (Promise<Record<string, Record<string, number>>>).
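
The count objects can be sorted or normalized before use. The sketch below shows one way to do that; `byCountDescending` and `shares` are hypothetical helpers, and the sample object is the abbreviated Pampangan output from the example above, so the shares are relative to those six entries only.

```javascript
// Sample shaped like one entry of the object `top()` resolves to (abbreviated).
const pampangan = {isa: 6, upa: 6, 'i k': 6, ang: 273, ing: 282, 'ng ': 572}

// Hypothetical helper: list [trigram, count] pairs, most common first.
function byCountDescending(counts) {
  return Object.entries(counts).sort((a, b) => b[1] - a[1])
}

// Hypothetical helper: share of each trigram within the given counts only.
function shares(counts) {
  const total = Object.values(counts).reduce((sum, count) => sum + count, 0)
  return Object.fromEntries(
    Object.entries(counts).map(([trigram, count]) => [trigram, count / total])
  )
}

console.log(byCountDescending(pampangan)[0])
console.log(shares(pampangan))
```

With the package installed, the same helpers could be applied to real data, for example `byCountDescending((await top()).pam)`.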

Data

The trigrams are based on the Unicode versions of the Universal Declaration of Human Rights.

The files are created from all paragraphs made available by wooorm/udhr. Headings and similar non-paragraph content are not included.

Before creating trigrams:

  • the Unicode characters from \u0021 to \u0040 (inclusive) are removed
  • one or more white space characters (\s+) are replaced with a single space
  • alphabetic characters ([A-Z]) are lowercased

Additionally, the input is padded with two spaces on both sides.
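
The clean-up steps and padding described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script that generated the data: the helper names (`clean`, `countTrigrams`, `topTrigrams`) are hypothetical, trimming before padding is an assumption, and slicing works on UTF-16 code units, so characters outside the Basic Multilingual Plane would be split.

```javascript
// Normalize text using the three steps listed above.
function clean(value) {
  return value
    .replace(/[\u0021-\u0040]/g, '') // Remove U+0021 to U+0040 (ASCII punctuation, digits, symbols).
    .replace(/\s+/g, ' ') // Collapse runs of white space into one space.
    .replace(/[A-Z]/g, (character) => character.toLowerCase()) // Lowercase A-Z.
}

// Count every three-character window of the padded, cleaned input.
function countTrigrams(value) {
  // Assumption: trim first so the padding is exactly two spaces per side.
  const padded = '  ' + clean(value).trim() + '  '
  const counts = {}
  for (let index = 0; index + 3 <= padded.length; index++) {
    const trigram = padded.slice(index, index + 3)
    counts[trigram] = (counts[trigram] || 0) + 1
  }
  return counts
}

// Keep the `limit` most frequent trigrams as [trigram, count] pairs,
// ordered from least common to most common.
function topTrigrams(counts, limit = 300) {
  const entries = Object.entries(counts).sort((a, b) => a[1] - b[1])
  return entries.slice(Math.max(0, entries.length - limit))
}

console.log(topTrigrams(countTrigrams('Ab, ab!')))
```

The ascending order in `topTrigrams` mirrors the least-to-most ordering that `min()` is documented to return.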

Code | Name
007 | Sãotomense
008 | Crioulo, Upper Guinea (008)
009 | Mbundu (009)
010 | Tetun Dili
011 | Umbundu (011)
013 | (Mijisa)
014 | (Maiunan)
016 | (Minjiang, spoken)
017 | (Minjiang, written)
020 | Drung
021 | (Muzzi)
022 | (Klau)
025 | (Bizisa)
026 | (Yeonbyeon)
027 | Gumuz
028 | Kafa
029 | Sidamo
030 | Kituba (2)
032 | South Azerbaijani
041 | Latvian (2)
042 | Spanish (resolution)
043 | Zarma
044 | Mirandese
045 | Maasai
046 | Malay, Papuan
047 | Malay, Ambonese
048 | Minangkabau (2)
049 | Banjar
050 | (Bataknese)
052 | Morisyen
053 | Hausa (2)
054 | Catalan (2)
055 | Jamaican Creole English
056 | Saint Lucian Creole French
057 | Maay
058 | Somali (Af Marka)
059 | North Saami (2)
060 | Inari Saami
061 | Skolt Saami
062 | Swahili (Chimwiini)
063 | Swahili (Kibajuni)
064 | Dabarre
065 | Garre
066 | Jiiddu
067 | Finnish (2)
068 | French (Welche)
069 | Maori (2)
071 | Kabyle
aar | Afar
abk | Abkhaz
ace | Aceh
acu | Achuar-Shiwiar
acu_1 | Achuar-Shiwiar (1)
ada | Dangme
ady | Adyghe
afr | Afrikaans
agr | Aguaruna
aii | Assyrian Neo-Aramaic
ajg | Aja
aka_akuapem | Twi (Akuapem)
aka_asante | Twi (Asante)
aka_fante | Fante
als | Albanian, Tosk
alt | Altai, Southern
amc | Amahuaca
ame | Yaneshaʼ
amh | Amharic
ami | Amis
amr | Amarakaeri
arb | Arabic, Standard
arl | Arabela
arn | Mapudungun
ast | Asturian
auc | Waorani
auv | Occitan (Auvergnat)
ayo | Ayoreo
ayr | Aymara, Central
azj_cyrl | Azerbaijani, North (Cyrillic)
azj_latn | Azerbaijani, North (Latin)
bam | Bamanankan
ban | Bali
bax | Bamun
bba | Baatonum
bci | Baoulé
bcl | Bicolano, Central
bel | Belarusan
bem | Bemba
ben | Bengali
bfa | Bari
bho | Bhojpuri
bin | Edo
bis | Bislama
blt | Tai Dam
blu | Hmong Njua
boa | Bora
bod | Tibetan, Central
bos_cyrl | Bosnian (Cyrillic)
bos_latn | Bosnian (Latin)
bre | Breton
btb | Bulu
buc | Bushi
bug | Bugis
bul | Bulgarian
bvi | Belanda Viri
cab | Garifuna
cak | Kaqchikel, Central
cas | Tsimané
cat | Catalan
cbi | Chachi
cbr | Cashibo-Cacataibo
cbs | Cashinahua
cbt | Chayahuita
cbu | Candoshi-Shapra
ccx | Zhuang, Yongbei
ceb | Cebuano
ces | Czech
cha | Chamorro
chj | Chinantec, Ojitlán
chk | Chuukese
chr_cased | Cherokee (cased)
chr_uppercase | Cherokee (uppercase)
chv | Chuvash
cic | Chickasaw
cjk | Chokwe
cjk_AO | Chokwe (Angola)
cjs | Shor
ckb | Kurdish, Central
cnh | Chin, Haka
cni | Asháninka
cnr | Montenegrin
cof | Colorado
cos | Corsican
cot | Caquinte
cpu | Ashéninka, Pichis
crh | Crimean Tatar
crs | Seselwa Creole French
csa | Chinantec, Chiltepec
csw | Cree, Swampy
ctd | Chin, Tedim
cym | Welsh
dag | Dagbani
dan | Danish
ddn | Dendi
deu_1901 | German, Standard (1901)
deu_1996 | German, Standard (1996)
dga | Dagaare, Southern
dip | Dinka, Northeastern
div | Maldivian
dyo | Jola-Fonyi
dyu | Jula
dzo | Dzongkha
ell_monotonic | Greek (monotonic)
ell_polytonic | Greek (polytonic)
emk | Maninkakan, Eastern
eml | Romagnolo
eng | English
epo | Esperanto
ese | Ese Ejja
est | Estonian
eus | Basque
eve | Even
evn | Evenki
ewe | Éwé
fao | Faroese
fij | Fijian
fin | Finnish
fkv | Finnish, Kven
flm | Chin, Falam
fon | Fon
fra | French
fri | Frisian, Western
fuf | Pular
fur | Friulian
fuv | Fulfulde, Nigerian
fuv2 | Fulfulde, Nigerian (2)
fvr | Fur
gaa | Ga
gag | Gagauz
gax | Oromo, Borana-Arsi-Guji
gjn | Gonja
gkp | Kpelle, Guinea
gla | Gaelic, Scottish
gld | Nanai
gle | Gaelic, Irish
glg | Galician
glv | Manx
gnw | Guarani, Western Bolivian
gsw1 | Alemannisch (Elsassisch)
guc | Wayuu
gug | Guaraní, Paraguayan
guj | Gujarati
guu | Yanomamö
gyr | Guarayu
hat_kreyol | Haitian Creole French (Kreyol)
hat_popular | Haitian Creole French (Popular)
hau_NE | Hausa (Niger)
hau_NG | Hausa (Nigeria)
hau_3 | Hausa
haw | Hawaiian
hea | Hmong, Northern Qiandong
heb | Hebrew
hil | Hiligaynon
hin | Hindi
hlt | Chin, Matu
hms | Hmong, Southern Qiandong
hna | Gen
hni | Hani
hns | Hindustani, Sarnami
hrv | Croatian
hsb | Sorbian, Upper
hsf | Huastec (Sierra de Otontepec)
hun | Hungarian
hus | Huastec (Veracruz)
huu | Huitoto, Murui
hva | Huastec (San Luís Potosí)
hye | Armenian
ibb | Ibibio
ibo | Igbo
ido | Ido
idu | Idoma
ijs | Ijo, Southeast
ike | Inuktitut, Eastern Canadian
ilo | Ilocano
ina | Interlingua
ind | Indonesian
isl | Icelandic
ita | Italian
jav | Javanese (Latin)
jav_java | Javanese (Javanese)
jiv | Shuar
jpn | Japanese
jpn_osaka | Japanese (Osaka)
jpn_tokyo | Japanese (Tokyo)
kaa | Karakalpak
kal | Inuktitut, Greenlandic
kan | Kannada
kat | Georgian
kaz | Kazakh
kbd | Kabardian
kbp | Kabiyé
kde | Makonde
kdh | Tem
kea | Kabuverdianu
kek | Q'eqchi'
kha | Khasi
khk | Mongolian, Halh (Cyrillic)
khm | Khmer, Central
kin | Rwanda
kir | Kirghiz
kjh | Khakas
kkh_lana | Khün
kmb | Mbundu
kmr | Kurdish, Northern
knc | Kanuri, Central
kng | Koongo
kng_AO | Koongo (Angola)
koi | Komi-Permyak
koo | Konjo
kor | Korean
kqn | Kaonde
kqs | Kissi, Northern
kri | Krio
krl | Karelian
ktu | Kituba
kwi | Awa-Cuaiquer
lad | Ladino
lao | Lao
lat | Latin
lat_1 | Latin (1)
lav | Latvian
lia | Limba, West-Central
lij | Ligurian
lin | Lingala
lin_tones | Lingala (tones)
lit | Lithuanian
lld | Ladin
lnc | Occitan (Languedocien)
lns | Lamnso'
lob | Lobi
lot | Otuho
loz | Lozi
ltz | Luxembourgeois
lua | Luba-Kasai
lue | Luvale
lug | Ganda
lun | Lunda
lus | Mizo
mad | Madura
mag | Magahi
mah | Marshallese
mai | Maithili
mal | Malayalam
mal_chillus | Malayalam
mam | Mam, Northern
mar | Marathi
maz | Mazahua Central
mcd | Sharanahua
mcf | Matsés
men | Mende
mfq | Moba
mic | Micmac
min | Minangkabau
miq | Mískito
mkd | Macedonian
mlt | Maltese
mly_arab | Malay (Arabic)
mly_latn | Malay (Latin)
mnw | Mon
mor | Moro
mos | Mòoré
mri | Maori
mto | Mixe, Totontepec
mtp | Wichí Lhamtés Nocten
mxi | Mozarabic
mxv | Mixtec, Metlatónoc
mya | Burmese
mzi | Mazatec, Ixcatlán
nav | Navajo
nba | Nyemba
nbl | Ndebele
ndo | Ndonga
nds | Saxon, Low
nep | Nepali
nhn | Nahuatl, Central
nio | Nganasan
niu | Niue
niv | Gilyak
njo | Naga, Ao
nku | Kulango, Bouna
nld | Dutch
nno | Norwegian, Nynorsk
nob | Norwegian, Bokmål
not | Nomatsiguenga
nso | Sotho, Northern
nya_chechewa | Nyanja (Chechewa)
nya_chinyanja | Nyanja (Chinyanja)
nym | Nyamwezi
nyn | Nyankore
nzi | Nzema
oaa | Orok
oci_1 | Francoprovençal (Fribourg)
oci_2 | Francoprovençal (Savoie)
oci_3 | Francoprovençal (Vaud)
oci_4 | Francoprovençal (Valais)
ojb | Ojibwa, Northwestern
oki | Okiek
orh | Oroqen
oss | Osetin
ote | Otomi, Mezquital
pam | Pampangan
pan | Panjabi, Eastern
pap | Papiamentu
pau | Palauan
pbb | Páez
pbu | Pashto, Northern
pcd | Picard
pcm | Pidgin, Nigerian
pes_1 | Farsi, Western
pes_2 | Dari
pis | Pijin
piu | Pintupi-Luritja
plt | Malagasy, Plateau
pnb | Panjabi, Western
pol | Polish
pon | Pohnpeian
por_BR | Portuguese (Brazil)
por_PT | Portuguese (Portugal)
pov | Crioulo, Upper Guinea
ppl | Pipil
prv | Occitan
quc | K'iche', Central
qud | Quechua (Unified Quichua, old Hispanic orthography)
qug | Quichua, Chimborazo Highland
qul | Quechua, North Bolivian
quy | Quechua, Ayacucho
quz | Quechua, Cusco
qva | Quechua, Ambo-Pasco
qvc | Quechua, Cajamarca
qvh | Quechua, Huamalíes-Dos de Mayo Huánuco
qvm | Quechua, Margos-Yarowilca-Lauricocha
qvn | Quechua, North Junín
qwh | Quechua, Huaylas Ancash
qxa | Quechua, South Bolivian
qxn | Quechua, Northern Conchucos Ancash
qxu | Quechua, Arequipa-La Unión
rar | Rarotongan
rmn | Romani, Balkan
rmn_1 | Romani, Balkan (1)
rmy | Aromanian
roh | Romansch
roh_puter | Romansch (Puter)
roh_rumgr | Romansch (Grischun)
roh_surmiran | Romansch (Surmiran)
roh_sursilv | Romansch (Sursilvan)
roh_sutsilv | Romansch (Sutsilvan)
roh_vallader | Romansch (Vallader)
ron_1953 | Romanian (1953)
ron_1993 | Romanian (1993)
ron_2006 | Romanian (2006)
run | Rundi
rus | Russian
sag | Sango
sah | Yakut
san | Sanskrit
sco | Scots
sey | Secoya
shk | Shilluk
shn | Shan
shp | Shipibo-Conibo
sin | Sinhala
skr | Seraiki
slk | Slovak
slr | Salar
slv | Slovenian
sme | North Saami
smo | Samoan
sna | Shona
snk | Soninke
snn | Siona
som | Somali
sot | Sotho, Southern
spa | Spanish
src | Sardinian, Logudorese
srp_cyrl | Serbian (Cyrillic)
srp_latn | Serbian (Latin)
srq | Sirionó
srr | Serer-Sine
ssw | Swati
suk | Sukuma
sun | Sunda
sus | Susu
swb | Comorian, Maore
swe | Swedish
swh | Swahili
tah | Tahitian
tam | Tamil
tam_LK | Tamil (Sri Lanka)
tat | Tatar
tbz | Ditammari
tca | Ticuna
tel | Telugu
tem | Themne
tet | Tetun
tgk | Tajiki
tgl | Tagalog
tha | Thai
tha2 | Thai (2)
tir | Tigrigna
tiv | Tiv
tji | Tujia, Nothern
tly | Talysh
tna | Tacana
tob | Toba
toi | Tonga
toj | Tojolabal
ton | Tongan
top | Totonac, Papantla
tpi | Tok Pisin
trn | Trinitario
tsn | Tswana
tso_MZ | Tsonga (Mozambique)
tso_ZW | Tsonga (Zimbabwe)
tsz | Purepecha
tuk_cyrl | Turkmen (Cyrillic)
tuk_latn | Turkmen (Latin)
tur | Turkish
tyv | Tuva
tzc | Tzotzil (Chamula)
tzh | Tzeltal, Oxchuc
tzm | Tamazight, Central Atlas
udu | Uduk
uig_arab | Uyghur (Arabic)
uig_latn | Uyghur (Latin)
ukr | Ukrainian
umb | Umbundu
ura | Urarina
urd | Urdu
urd_2 | Urdu (2)
uzn_cyrl | Uzbek, Northern (Cyrillic)
uzn_latn | Uzbek, Northern (Latin)
vai | Vai
vec | Venetian
ven | Venda
ven2 | Venda
vep | Veps
vie | Vietnamese
vmw | Makhuwa
war | Waray-Waray
wln | Walloon
wol | Wolof
wwa | Waama
xho | Xhosa
xsm | Kasem
yad | Yagua
yao | Yao
yap | Yapese
ydd | Yiddish, Eastern
ykg | Yukaghir, Northern
yor | Yoruba
yrk | Nenets
yua | Maya, Yucatán
yuz | Yuracare
zam | Zapotec, Miahuatlán
zdj | Comorian, Ngazidja
zgh | Tamazight, Standard Morocan
zro | Záparo
ztu | Zapotec, Güilá
zul | Zulu

Compatibility

This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 18+. It also works in Deno and modern browsers.

Contribute

Yes please! See How to Contribute to Open Source.

Security

This package is safe.

License

MIT © Titus Wormer

Package last updated on 21 Mar 2025
