
python-iso639 is a Python package for ISO 639 language codes, names, and other associated information.
Current features:
pip install python-iso639
python-iso639 revolves around a Language class. Instances of Language have attributes and methods that you will find useful.

Note that while the package name registered on PyPI is python-iso639, the actual import name during runtime is iso639 (which means you should do import iso639 in your Python code).
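
The difference between the PyPI name and the import name also matters when checking what is installed. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of python-iso639.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent.

    `distribution` is the name registered on PyPI (e.g. "python-iso639"),
    not the name used in import statements (e.g. "iso639").
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Query by the PyPI name, not the import name.
print(installed_version("python-iso639"))
```

This prints a version string when the package is installed and None otherwise.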
Language Instances

Create a Language instance by one of the class methods.

from_part3, with an ISO 639-3 code

>>> import iso639
>>> lang1 = iso639.Language.from_part3('fra')
>>> type(lang1)
<class 'iso639.language.Language'>
>>> lang1
Language(part3='fra', part2b='fre', part2t='fra', part1='fr', scope='I', type='L', status='A', name='French', comment=None, other_names=None, macrolanguage=None, retire_reason=None, retire_change_to=None, retire_remedy=None, retire_date=None)
Fast object instantiation for retrieving language information (run on Python 3.13, macOS 15.3.1, Apple M1 Pro)
In [1]: import iso639
In [2]: %timeit iso639.Language.from_part3("fra")
217 ns ± 0.139 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
>>> lang2 = iso639.Language.from_part2b('fre') # ISO 639-2 (bibliographic)
>>> lang3 = iso639.Language.from_part2t('fra') # ISO 639-2 (terminological)
>>> lang4 = iso639.Language.from_part1('fr') # ISO 639-1
>>> lang5 = iso639.Language.from_name('French') # ISO 639-3 reference language name
LanguageNotFoundError is Raised for Invalid Inputs

>>> iso639.Language.from_part3('Fra') # The user input is case-sensitive!
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
LanguageNotFoundError: 'Fra' isn't an ISO language code or name
>>>
>>> iso639.Language.from_name("unknown language")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
LanguageNotFoundError: 'unknown language' isn't an ISO language code or name
>>> lang1
Language(part3='fra', part2b='fre', part2t='fra', part1='fr', scope='I', type='L', status='A', name='French', comment=None, other_names=None, macrolanguage=None, retire_reason=None, retire_change_to=None, retire_remedy=None, retire_date=None)
>>> lang1.part3
'fra'
>>> lang1.name
'French'
>>> lang1 == lang2 == lang3 == lang4 == lang5 # All are French
True
>>> lang6 = iso639.Language.from_part3('spa') # Spanish
>>> lang1 == lang6 # French vs. Spanish
False
>>> 'French' == lang1.name == lang2.name == lang3.name == lang4.name == lang5.name
True
>>> lang6.name
'Spanish'
match
You don't know which code set or name your input is from? Use the match classmethod:
>>> lang1 = iso639.Language.match('fra')
>>> lang2 = iso639.Language.match('fre')
>>> lang3 = iso639.Language.match('fr')
>>> lang4 = iso639.Language.match('French')
>>> lang1 == lang2 == lang3 == lang4
True
By default, the classmethod match supports case-insensitive matching and ignores leading/trailing whitespace. To enforce exact matching instead, pass in exact=True:
>>> lang5 = iso639.Language.match('FRA')
>>> lang6 = iso639.Language.match('fra ')
>>> lang7 = iso639.Language.match('french')
>>> lang4 == lang5 == lang6 == lang7
True
>>> iso639.Language.match("french", exact=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
LanguageNotFoundError: 'french' isn't an ISO language code or name
The classmethod match is particularly useful for consistently accessing a specific attribute from unknown inputs, e.g., the ISO 639-3 code.
>>> 'fra' == lang1.part3 == lang2.part3 == lang3.part3 == lang4.part3 == lang5.part3 == lang6.part3 == lang7.part3
True
If there's no match, a LanguageNotFoundError is raised, which you may want to catch:
>>> try:
... lang = iso639.Language.match('not gonna find a match')
... except iso639.LanguageNotFoundError:
... print("no match found!")
...
no match found!
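
Catching the error is handy when cleaning a batch of free-form inputs. The sketch below wraps Language.match so that misses become None; the helper names to_part3 and normalize_all are hypothetical, and only the Language.match, part3, and LanguageNotFoundError names come from the examples above.

```python
from typing import Iterable, List, Optional


def to_part3(user_input: str) -> Optional[str]:
    """Return the ISO 639-3 code for `user_input`, or None if nothing matches."""
    # Imported inside the function so this sketch can be loaded even where
    # python-iso639 is not installed; a top-level import works just as well.
    import iso639

    try:
        return iso639.Language.match(user_input).part3
    except iso639.LanguageNotFoundError:
        return None


def normalize_all(user_inputs: Iterable[str]) -> List[Optional[str]]:
    """Map each raw input to an ISO 639-3 code, keeping None for misses."""
    return [to_part3(text) for text in user_inputs]
```

Going by the matching behavior shown above, normalize_all(['fr', 'French', 'FRA', 'not gonna find a match']) should give ['fra', 'fra', 'fra', None] with the package installed.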
>>> language = iso639.Language.match('yue')
>>> language.name
'Yue Chinese' # also commonly known as Cantonese
>>> language.macrolanguage
'zho' # Chinese
>>> language.other_names
[Name(print='Yue Chinese', inverted='Chinese, Yue')]
>>> for name in language.other_names:
... print(f'{name.print} | {name.inverted}')
...
Yue Chinese | Chinese, Yue
>>> language = iso639.Language.match('bvs')
>>> language.part3
'bvs'
>>> language.name
'Belgian Sign Language'
>>> language.status
'R' # (R)etired
>>> language.retire_reason
'S' # (S)plit
>>> language.retire_change_to is None
True
>>> language.retire_remedy
'Split into Langue des signes de Belgique Francophone [sfb], and Vlaamse Gebarentaal [vgt]'
>>> language.retire_date
datetime.date(2007, 7, 18)
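
One way to act on these retirement attributes is to check status first, then prefer retire_change_to, and fall back to the retire_remedy text when no single replacement code exists. This is only a sketch: describe_retirement is a hypothetical helper, and it accepts any object exposing the attributes used above.

```python
from types import SimpleNamespace
from typing import Any


def describe_retirement(language: Any) -> str:
    """Summarize what to do with a possibly retired ISO 639-3 code.

    `language` is expected to expose part3, status, retire_change_to,
    and retire_remedy, as a Language instance does.
    """
    if language.status != "R":
        return f"{language.part3} is active"
    if language.retire_change_to is not None:
        # A single replacement code is available.
        return f"{language.part3} is retired; use {language.retire_change_to}"
    # No single replacement (e.g. a split): surface the remedy text instead.
    return f"{language.part3} is retired; {language.retire_remedy}"


# Stand-in object mirroring the 'bvs' values shown above, so the sketch runs
# without the package; pass a real Language instance in practice.
bvs = SimpleNamespace(
    part3="bvs",
    status="R",
    retire_change_to=None,
    retire_remedy=(
        "Split into Langue des signes de Belgique Francophone [sfb], "
        "and Vlaamse Gebarentaal [vgt]"
    ),
)
print(describe_retirement(bvs))
```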
Language Instance

A Language instance has the following attributes:
Attribute | Data type | Can it be None? | Description |
---|---|---|---|
part3 | str | ✗ | ISO 639-3 code |
part2b | str | ✓ | ISO 639-2 code (bibliographic) |
part2t | str | ✓ | ISO 639-2 code (terminological) |
part1 | str | ✓ | ISO 639-1 code |
scope | str | ✗ | One of {(I)ndividual, (M)acrolanguage, (S)pecial} |
type | str | ✓ | One of {(A)ncient, (C)onstructed, (E)xtinct, (H)istorical, (L)iving, (S)pecial} [1] |
status | str | ✗ | One of {(A)ctive, (R)etired}, describing the ISO 639-3 code |
name | str | ✗ | Reference language name in ISO 639-3 |
comment | str | ✓ | Comment from ISO 639-3 |
other_names | List[Name] | ✓ | Other print and inverted names [2] |
macrolanguage | str | ✓ | Macrolanguage |
retire_reason | str | ✓ | Retirement reason, one of {(C)hange, (D)uplicate, (N)on-existent, (S)plit, (M)erge} |
retire_change_to | str | ✓ | ISO 639-3 code to which this language can be changed, if retirement reason is one of {(C)hange, (D)uplicate, (M)erge} |
retire_remedy | str | ✓ | Instructions for updating this retired language code |
retire_date | datetime.date | ✓ | The date the retirement became effective |
[1] If the ISO 639-3 code is retired, then the type attribute is None, because its value is not clearly discernible from the SIL data source.

[2] A Name instance has the attributes print and inverted, for the print name and inverted name, respectively. If reference name, print name, and inverted name are all the same, then that particular (print name, inverted name) pair is excluded from the other_names attribute. For example, for Spanish (ISO 639-3: spa), one (print name, inverted name) pair is (Spanish, Spanish) from the SIL data source, but this pair is excluded from its list of other_names.
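
The exclusion rule in [2] can be sketched as a small filter. This is not the package's own code: Name here is a stand-in type, filter_other_names is a hypothetical helper, the input pairs are illustrative, and returning None for an empty result is an assumption based on the other_names=None seen for French.

```python
from typing import List, NamedTuple, Optional, Tuple


class Name(NamedTuple):
    """Stand-in for the package's Name type (print and inverted attributes)."""

    print: str
    inverted: str


def filter_other_names(
    reference_name: str, pairs: List[Tuple[str, str]]
) -> Optional[List[Name]]:
    """Drop (print, inverted) pairs that only repeat the reference name."""
    kept = [
        Name(print=print_name, inverted=inverted_name)
        for print_name, inverted_name in pairs
        if not (print_name == reference_name and inverted_name == reference_name)
    ]
    # Assumption: an empty result is reported as None rather than [].
    return kept or None


# Illustrative input pairs; the real pairs come from the SIL name index.
print(filter_other_names("Spanish", [("Spanish", "Spanish"), ("Castilian", "Castilian")]))
print(filter_other_names("Yue Chinese", [("Yue Chinese", "Chinese, Yue")]))
```

The (Spanish, Spanish) pair is dropped because all three names coincide, while (Yue Chinese, Chinese, Yue) is kept because the inverted name differs from the reference name.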
How Language.match Matches the Language

At a high level, Language.match assumes the input is more likely to be a language code rather than a language name. Beyond that, the precise order in matching is as follows:

As soon as a match is found, Language.match returns a Language instance. If there isn't a match, a LanguageNotFoundError is raised.
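
The "first match wins" behavior can be pictured as trying a series of lookup tables in order. The sketch below is not the package's implementation: first_match, NotFoundError, and TABLES are hypothetical, the tables hold only a few illustrative entries, and the only ordering taken from the text is that codes are tried before names.

```python
from typing import Dict, List, Tuple


class NotFoundError(LookupError):
    """Hypothetical stand-in for LanguageNotFoundError."""


def first_match(
    user_input: str,
    tables: List[Tuple[str, Dict[str, str]]],
    exact: bool = False,
) -> str:
    """Try each lookup table in order and return the first hit.

    `tables` is an ordered list of (label, mapping) pairs, where each
    mapping goes from a code or name to an ISO 639-3 code.
    """
    key = user_input if exact else user_input.strip().casefold()
    for _label, table in tables:
        for candidate, part3 in table.items():
            normalized = candidate if exact else candidate.casefold()
            if normalized == key:
                return part3  # stop at the first table that matches
    raise NotFoundError(f"{user_input!r} isn't an ISO language code or name")


# Illustrative tables only: codes come before names, as described above;
# the ordering within each group is an assumption.
TABLES = [
    ("part3", {"fra": "fra", "spa": "spa"}),
    ("part1", {"fr": "fra", "es": "spa"}),
    ("name", {"French": "fra", "Spanish": "spa"}),
]
print(first_match(" French ", TABLES))  # fra
print(first_match("fr", TABLES))  # fra
```

The linear scan keeps the sketch short; a real implementation would more likely use prebuilt dictionaries per code set for constant-time lookups.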
Language is a dataclass

The Language class is a dataclass. All functionality of dataclasses applies to Language and its instances, e.g., dataclasses.asdict:
>>> import dataclasses, iso639
>>> language = iso639.Language.match('fra')
>>> dataclasses.asdict(language)
{'part3': 'fra', 'part2b': 'fre', 'part2t': 'fra', 'part1': 'fr', 'scope': 'I', 'type': 'L', 'status': 'A', 'name': 'French', 'comment': None, 'other_names': None, 'macrolanguage': None, 'retire_reason': None, 'retire_change_to': None, 'retire_remedy': None, 'retire_date': None}
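
A common next step is serializing that dictionary to JSON. Since retire_date can be a datetime.date, which json.dumps does not handle on its own, a default handler is needed. The following is a sketch with hypothetical helper names; it works on any dataclass instance, including a Language.

```python
import dataclasses
import datetime
import json
from typing import Any


def _json_default(value: Any) -> str:
    """Serialize values that json.dumps cannot handle natively."""
    if isinstance(value, datetime.date):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def language_to_json(language: Any) -> str:
    """Dump a dataclass instance (such as a Language) to a JSON string."""
    return json.dumps(dataclasses.asdict(language), default=_json_default)
```

For example, language_to_json(iso639.Language.match('bvs')) would encode the retirement date as an ISO 8601 string instead of raising a TypeError.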
DATA_LAST_UPDATED: The release date of the included language code data from SIL
>>> import iso639
>>> iso639.DATA_LAST_UPDATED
datetime.date(2025, 1, 15)
ALL_LANGUAGES: The set of all Language objects based on the included language code data
>>> import iso639
>>> type(iso639.ALL_LANGUAGES)
<class 'set'>
>>> len(iso639.ALL_LANGUAGES)
8307
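
Because ALL_LANGUAGES is iterable, it lends itself to quick summaries. The sketch below tallies languages by scope; count_by_scope is a hypothetical helper, and the sample objects are stand-ins so the code runs without the package installed.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def count_by_scope(languages: Iterable[Any]) -> Counter:
    """Tally languages by their scope attribute ('I', 'M', or 'S')."""
    return Counter(language.scope for language in languages)


# Stand-in objects for illustration; in practice, pass iso639.ALL_LANGUAGES.
sample = [
    SimpleNamespace(part3="fra", scope="I"),
    SimpleNamespace(part3="yue", scope="I"),
    SimpleNamespace(part3="zho", scope="M"),
]
print(count_by_scope(sample))  # Counter({'I': 2, 'M': 1})
```

A set has no guaranteed iteration order, so when stable output matters, sort first, e.g. sorted(iso639.ALL_LANGUAGES, key=lambda language: language.part3).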
The python-iso639 code is released under an Apache 2.0 license. Please see LICENSE.txt for details.
The data source that backs this package is the language code tables published by SIL. The tables are included in this package under src/iso639/_data/. They are the UTF-8-encoded *.tab tab-separated files bundled as a ZIP archive file, typically found at a URL that looks like https://iso639-3.sil.org/sites/iso639-3/files/downloads/iso-639-3_Code_Tables_YYYYMMDD.zip (replace YYYYMMDD with the data release date).
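
Filling in the YYYYMMDD placeholder is a matter of date formatting. A minimal sketch follows, where URL_TEMPLATE and code_tables_url are hypothetical names; the URL pattern is only the typical one quoted above, so a file is not guaranteed to exist for any given date.

```python
import datetime

URL_TEMPLATE = (
    "https://iso639-3.sil.org/sites/iso639-3/files/downloads/"
    "iso-639-3_Code_Tables_{release}.zip"
)


def code_tables_url(release_date: datetime.date) -> str:
    """Fill the YYYYMMDD placeholder with the given release date."""
    return URL_TEMPLATE.format(release=release_date.strftime("%Y%m%d"))


# Uses the DATA_LAST_UPDATED value shown earlier as the release date.
print(code_tables_url(datetime.date(2025, 1, 15)))
```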
Note that SIL resources have their terms of use.
Both packages iso639 and iso-639 exist on PyPI. However, as of this writing (May 2022), they were last updated in 2016 and no longer seem to be maintained with updated language codes. pycountry is a great package, but what if you want a more lightweight package with only the language codes and none of the other stuff? :-)
If you ever notice that the upstream ISO 639-3 tables from SIL have been updated and yet this package isn't using the latest data, please ping me by opening a GitHub issue.