What is language-subtag-registry?
The language-subtag-registry npm package provides the IANA Language Subtag Registry as JSON files. The registry contains the subtags used to identify languages, scripts, regions, and variants in language tags that follow the IETF BCP 47 standard. The package ships data only; searching, validating, and parsing tags is handled by the companion language-tags package, which offers a JavaScript API on top of this data.
What are language-subtag-registry's main functionalities?
Searching for language subtags
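This package ships data rather than a search function, so a subtag is looked up through the JSON files. In the code sample, language.json maps the 'en' subtag, which corresponds to English, to the position of its record in registry.json. The data/json/ path is an assumption about the published package layout; adjust it if your installed version differs.
// Paths assume the JSON files sit under data/json/ in the installed package.
const registry = require('language-subtag-registry/data/json/registry.json');
const language = require('language-subtag-registry/data/json/language.json');
console.log(registry[language['en']]);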
Validating language subtags
The registry data makes it possible to check that every subtag in a tag is registered, but this package does not export a validation function itself. The code sample uses the check function of the companion language-tags package to validate 'en-US', which represents English as used in the United States.
// Assumes the separate language-tags package is installed.
const tags = require('language-tags');
const isValid = tags.check('en-US');
console.log(isValid);
Parsing language subtags
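Splitting a language tag into its constituent parts, such as language, script, and region, is likewise done with the companion language-tags package rather than with this package. The code sample parses 'zh-Hant-HK', which represents Chinese written in the Traditional Han script as used in Hong Kong, and prints the type and code of each subtag.
// Assumes the separate language-tags package is installed.
const tags = require('language-tags');
const parsed = tags('zh-Hant-HK');
console.log(parsed.subtags().map((subtag) => subtag.type() + ': ' + subtag.format()));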
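The split into language, script, and region described above can be approximated from the shape of each subtag alone. The following is a minimal sketch under simplifying assumptions, not the behavior of any package: splitTag is a hypothetical helper, it does not consult the registry, and it does not handle extended language subtags, which end up in rest together with variants, extensions, and private-use subtags.

```javascript
// Simplified sketch: classify the subtags of a BCP 47 tag by position and shape.
// splitTag is a hypothetical helper, not an export of any package.
function splitTag(tag) {
  const parts = tag.split('-');
  const result = { language: parts[0].toLowerCase(), script: null, region: null, rest: [] };
  let i = 1;

  // A four-letter subtag right after the language is a script (title case).
  if (i < parts.length && /^[A-Za-z]{4}$/.test(parts[i])) {
    result.script = parts[i][0].toUpperCase() + parts[i].slice(1).toLowerCase();
    i += 1;
  }

  // Two letters or three digits come next and form the region (upper case).
  if (i < parts.length && /^([A-Za-z]{2}|[0-9]{3})$/.test(parts[i])) {
    result.region = parts[i].toUpperCase();
    i += 1;
  }

  // Everything else is left unclassified in this sketch.
  result.rest = parts.slice(i);
  return result;
}

console.log(splitTag('zh-Hant-HK'));
```

A shape check like this says nothing about whether a subtag is actually registered; that is what the registry data is for.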
Other packages similar to language-subtag-registry
bcp-47
The bcp-47 package parses and serializes BCP 47 language tags. It works on the syntax of the tags and does not provide access to the registry data itself.
iso-639-1
The iso-639-1 package offers functionality for working with ISO 639-1 language codes, including getting names of languages and converting between codes and names. It is similar in that it deals with language codes, but it is limited to the ISO 639-1 standard and does not cover scripts or regions.
iso-3166-1
This package is designed to work with ISO 3166-1 country codes, providing functionality to search, validate, and get information about countries. It covers only one part of a language tag (the region subtag), but offers similar utility for validating and looking up codes for geographic regions.
IANA Language Tags
IANA's official repository is in record-jar format and is hard to parse. This project provides neatly organized JSON files representing that data.
See data/ for all the JSON files available. The registry.json file contains all records in a flat array and meta.json contains its metadata. There's a separate JSON file for each 'scope' (e.g. macrolanguage.json) and 'type' (e.g. language.json). These files contain JSON objects keyed by tag or subtag and with the index integer for the corresponding entry in registry.json as a value. This makes lookups fast.
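The relationship between the flat array and the per-type files can be sketched in miniature. The example below is illustrative only: the three records are hand-written samples rather than a copy of registry.json, buildTypeIndex and lookup are hypothetical helpers, and lowercasing the keys is an assumption.

```javascript
// Miniature stand-in for registry.json: a flat array of records.
// These records are illustrative samples, not the real file contents.
const registry = [
  { Type: 'language', Subtag: 'en', Description: ['English'] },
  { Type: 'script', Subtag: 'Latn', Description: ['Latin'] },
  { Type: 'language', Subtag: 'zh', Description: ['Chinese'], Scope: 'macrolanguage' },
];

// Build an index for one type: subtag -> position in the flat array.
// buildTypeIndex is a hypothetical helper; lowercased keys are an assumption.
function buildTypeIndex(records, type) {
  const index = {};
  records.forEach((record, position) => {
    if (record.Type === type) {
      index[record.Subtag.toLowerCase()] = position;
    }
  });
  return index;
}

// Plays the role of language.json in this sketch.
const language = buildTypeIndex(registry, 'language');

// A lookup is two property accesses instead of a scan over every record.
function lookup(index, subtag) {
  const key = subtag.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(index, key)) {
    return null;
  }
  return registry[index[key]];
}

console.log(language);
console.log(lookup(language, 'zh'));
```

Storing positions rather than copies of the records keeps each per-type file small while registry.json remains the single source of record data.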
Updates
This project will be updated as the registry changes. Non-breaking updates will result in the patch version number being bumped.
Run make update to force an update from the latest official IANA-hosted version. The registry file format is converted to JSON automatically and the files in data/ are updated.
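To give a sense of what the conversion involves, here is a minimal sketch of reading the record-jar format: records are separated by %% lines, each field is a Name: value pair, a line starting with whitespace continues the previous value, and a field such as Description may repeat. This is not the project's actual converter; parseRecordJar is a hypothetical helper and the sample text, including its File-Date value, is illustrative.

```javascript
// Hypothetical helper: parse record-jar text into an array of records.
// Every field value is kept in an array because fields may repeat.
function parseRecordJar(text) {
  const records = [];
  let record = {};
  let lastField = null;

  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') {
      continue; // ignore blank lines
    }
    if (line.trim() === '%%') {
      // Separator: close the current record and start a new one.
      records.push(record);
      record = {};
      lastField = null;
    } else if (/^\s/.test(line) && lastField !== null) {
      // Folded line: continue the most recent value of the previous field.
      const values = record[lastField];
      values[values.length - 1] += ' ' + line.trim();
    } else if (line.includes(':')) {
      const separator = line.indexOf(':');
      lastField = line.slice(0, separator).trim();
      if (!record[lastField]) {
        record[lastField] = [];
      }
      record[lastField].push(line.slice(separator + 1).trim());
    }
  }

  records.push(record);
  return records;
}

// Illustrative sample in the registry's layout (not the real file).
const sample = [
  'File-Date: 2024-01-01',
  '%%',
  'Type: language',
  'Subtag: ia',
  'Description: Interlingua (International Auxiliary Language',
  '  Association)',
  'Added: 2005-10-16',
  '%%',
  'Type: language',
  'Subtag: es',
  'Description: Spanish',
  'Description: Castilian',
  'Suppress-Script: Latn',
].join('\n');

console.log(JSON.stringify(parseRecordJar(sample), null, 2));
```

Folded lines and repeated fields are the two details that make a naive line-by-line split insufficient, which is why ready-made JSON is more convenient to consume.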
If there are changes, please make a pull request.
Usage
See language-tags for a JavaScript API.
Credits and collaboration
The JSON database is licensed under the Creative Commons Zero v1.0 Universal (CC0-1.0) license.
Comments, feedback and suggestions are welcome. Please feel free to raise an issue or pull request. Enjoy.