iso639-lang handles ISO 639 codes for individual languages and language groups.
>>> from iso639 import Lang
>>> Lang("French")
Lang(name='French', pt1='fr', pt2b='fre', pt2t='fra', pt3='fra', pt5='')
$ pip install iso639-lang
iso639-lang supports Python 3.8+.
Begin by importing the Lang class.
>>> from iso639 import Lang
Let's try with the identifier of an individual language.
>>> lg = Lang("deu")
>>> lg.name # 639-3 reference name
'German'
>>> lg.pt1 # 639-1 identifier
'de'
>>> lg.pt2b # 639-2/B bibliographic identifier
'ger'
>>> lg.pt2t # 639-2/T terminological identifier
'deu'
>>> lg.pt3 # 639-3 identifier
'deu'
And now with the identifier of a group of languages.
>>> lg = Lang("cel")
>>> lg.name # 639-5 English name
'Celtic languages'
>>> lg.pt2b # 639-2/B bibliographic identifier
'cel'
>>> lg.pt2t # 639-2/T terminological identifier
'cel'
>>> lg.pt5 # 639-5 identifier
'cel'
Lang can be instantiated with any ISO 639 identifier or reference name.
>>> Lang("German") == Lang("de") == Lang("deu") == Lang("ger")
True
Lang also recognizes all non-reference English names associated with a language identifier in ISO 639.
>>> Lang("Chinese, Mandarin") # 639-3 inverted name
Lang(name='Mandarin Chinese', pt1='', pt2b='', pt2t='', pt3='cmn', pt5='')
>>> Lang("Uyghur") # other 639-3 printed name
Lang(name='Uighur', pt1='ug', pt2b='uig', pt2t='uig', pt3='uig', pt5='')
>>> Lang("Valencian") # other 639-2 English name
Lang(name='Catalan', pt1='ca', pt2b='cat', pt2t='cat', pt3='cat', pt5='')
Please note that Lang is case-sensitive.
>>> Lang("ak")
Lang(name='Akan', pt1='ak', pt2b='aka', pt2t='aka', pt3='aka', pt5='')
>>> Lang("Ak")
Lang(name='Ak', pt1='', pt2b='', pt2t='', pt3='akq', pt5='')
You can use the asdict method to return ISO 639 values as a Python dictionary.
>>> Lang("fra").asdict()
{'name': 'French', 'pt1': 'fr', 'pt2b': 'fre', 'pt2t': 'fra', 'pt3': 'fra', 'pt5': ''}
In addition to their reference name, some language identifiers may be associated with other names. You can list them using the other_names method.
>>> lg = Lang("ast")
>>> lg.name
'Asturian'
>>> lg.other_names()
['Asturleonese', 'Bable', 'Leonese']
Use the type method to get the type of a language.
>>> lg = Lang("Latin")
>>> lg.type()
'Historical'
Use the scope method to determine whether a language is a macrolanguage or an individual language.
>>> lg = Lang("Arabic")
>>> lg.scope()
'Macrolanguage'
Use the macro method to get the macrolanguage of an individual language.
>>> lg = Lang("Wu Chinese")
>>> lg.macro()
Lang(name='Chinese', pt1='zh', pt2b='chi', pt2t='zho', pt3='zho', pt5='')
Conversely, you can also list all the individual languages that share a common macrolanguage.
>>> lg = Lang("Persian")
>>> lg.individuals()
[Lang(name='Iranian Persian', pt1='', pt2b='', pt2t='', pt3='pes', pt5=''),
Lang(name='Dari', pt1='', pt2b='', pt2t='', pt3='prs', pt5='')]
As Lang is hashable, Lang instances can be added to a set or used as dictionary keys.
>>> {Lang("de"): "foo", Lang("fr"): "bar"}
{Lang(name='German', pt1='de', pt2b='ger', pt2t='deu', pt3='deu', pt5=''): 'foo', Lang(name='French', pt1='fr', pt2b='fre', pt2t='fra', pt3='fra', pt5=''): 'bar'}
Lists of Lang instances are sortable by name.
>>> [lg.name for lg in sorted([Lang("deu"), Lang("rus"), Lang("eng")])]
['English', 'German', 'Russian']
iter_langs() iterates through all possible Lang instances, ordered alphabetically by name.
>>> from iso639 import iter_langs
>>> [lg.name for lg in iter_langs()]
["'Are'are", "'Auhelawa", "A'ou", ... , 'ǂHua', 'ǂUngkue', 'ǃXóõ']
When an invalid language value is passed to Lang, an InvalidLanguageValue exception is raised.
>>> from iso639.exceptions import InvalidLanguageValue
>>> try:
...     Lang("foobar")
... except InvalidLanguageValue as e:
...     e.msg
...
"'foobar' is not a valid Lang argument."
When a deprecated language value is passed to Lang, a DeprecatedLanguageValue exception is raised.
>>> from iso639.exceptions import DeprecatedLanguageValue
>>> try:
...     Lang("gsc")
... except DeprecatedLanguageValue as e:
...     lg = Lang(e.change_to)
...     f"{e.name} replaced by {lg.name}."
...
'Gascon replaced by Occitan (post 1500).'
If you don't want to handle exceptions, use the is_language checker instead.
The is_language function checks whether a language value exists in ISO 639.
>>> from iso639 import is_language
>>> is_language("fr")
True
>>> is_language("French")
True
You can restrict the check to certain identifiers or names by passing an additional argument.
>>> is_language("fr", "pt3") # only 639-3
False
>>> is_language("fre", ("pt2b", "pt2t")) # only 639-2/B or 639-2/T
True
iso639-lang loads its mappings into memory to process calls much faster than Python libraries that rely on an embedded database.
As of November 11, 2024, iso639-lang is based on the latest tables provided by the ISO 639 registration authorities. Please open a new issue if you find that this library uses out-of-date data files.
Set | Description | Registration Authority | Last Modified |
---|---|---|---|
Set 1 | two-letter language identifiers for major, mostly national individual languages | Infoterm | 2009-09-01 |
Set 2 | three-letter language identifiers for a larger number of widely known individual languages and a number of language groups | Library of Congress | 2017-12-21 |
Set 3 | three-letter language identifiers covering all individual languages, including living, extinct and ancient languages | SIL International | 2024-10-10 |
Set 5 | three-letter language identifiers covering a larger set of language groups, living and extinct | Library of Congress | 2013-02-11 |
To learn more about how the source tables are processed by the iso639-lang library, read the generate.py script.
We welcome contributions from the community to help improve this Python library. If you're interested in contributing, please follow the project's contribution guidelines.