
best-effort representations using smaller coded character sets (ASCII,
ISO 8859, etc.). The translation tables used by the codecs are from
the transtab
collection by Markus Kuhn.
Three types of transliterating codecs are provided:

- "long", using as many characters as needed to make a natural
  replacement. For example, \u00e4 LATIN SMALL LETTER A WITH
  DIAERESIS ä will be replaced with ae.
- "short", using the minimum number of characters to make a
  replacement. For example, \u00e4 LATIN SMALL LETTER A WITH
  DIAERESIS ä will be replaced with a.
- "one", only performing single character replacements. Characters
  that can not be transliterated with a single character are passed
  through unchanged. For example, \u2639 WHITE FROWNING FACE ☹
  will be passed through unchanged.
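
The difference between the three modes can be illustrated with a toy mapping. This is a minimal sketch, not the package's implementation: the names ``LONG_TABLE``, ``SHORT_TABLE`` and ``make_one_table`` are hypothetical, the tables hold only a few characters taken from the examples in this document, and deriving the "one" table by filtering the "short" table is an assumption made for illustration.

```python
# Toy tables for illustration only (hypothetical); the real codecs use
# the much larger transtab data.
LONG_TABLE = {
    ord("ä"): "ae",
    ord("á"): "a",
    ord("€"): "EUR",
    ord("☺"): ":-)",
}

SHORT_TABLE = {
    ord("ä"): "a",
    ord("á"): "a",
    ord("€"): "E",
    ord("☺"): ":-)",
}


def make_one_table(short_table):
    """Keep only single-character replacements.

    Characters whose replacement is longer than one character are left
    out of the table, so str.translate passes them through unchanged.
    """
    return {cp: rep for cp, rep in short_table.items() if len(rep) == 1}


ONE_TABLE = make_one_table(SHORT_TABLE)

text = "fácil ä € ☺"
long_result = text.translate(LONG_TABLE)    # 'facil ae EUR :-)'
short_result = text.translate(SHORT_TABLE)  # 'facil a E :-)'
one_result = text.translate(ONE_TABLE)      # 'facil a E ☺'
```

In the "one" result the smiley survives because its only replacement in the toy table is three characters long.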
Using the codecs is simple::

  >>> import translitcodec
  >>> import codecs
  >>> codecs.encode('fácil € ☺', 'translit/long')
  'facil EUR :-)'
  >>> codecs.encode('fácil € ☺', 'translit/short')
  'facil E :-)'

The codecs return Unicode by default. To receive a bytestring back, either chain the output of encode() to another codec, or append the name of the desired byte encoding to the codec name::

  >>> codecs.encode('fácil € ☺', 'translit/one').encode('ascii', 'replace')
  'facil E ?'
  >>> 'fácil € ☺'.encode('translit/one/ascii', 'replace')
  'facil E ?'

The package also supplies a 'transliterate' codec, an alias for 'translit/long'.
Another way to use the library is through an error handler. The examples below use the following handlers: replace/translit/long, replace/translit/short, replace/translit/one, ignore/translit/long, ignore/translit/short and ignore/translit/one.
These error handlers work similarly to Python's built-in ones. The difference is that transliteration is attempted first::

  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/long').decode('ISO-8859-2')
  'Zażółć gęślą jaźń EUR :-)?!@#'
  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/short').decode('ISO-8859-2')
  'Zażółć gęślą jaźń E :-)?!@#'
  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/one').decode('ISO-8859-2')
  'Zażółć gęślą jaźń E ??!@#'
  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/long').decode('ISO-8859-2')
  'Zażółć gęślą jaźń EUR :-)!@#'
  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/short').decode('ISO-8859-2')
  'Zażółć gęślą jaźń E :-)!@#'
  >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/one').decode('ISO-8859-2')
  'Zażółć gęślą jaźń E !@#'

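
The "transliterate first, then fall back" behaviour can be approximated with the standard library alone. The following is a rough sketch under stated assumptions, not the package's code: the handler name ``replace/translit-toy``, the function ``replace_translit_toy`` and the two-entry ``TOY_TABLE`` are hypothetical, and the table values are assumed to be plain ASCII so that the target encoding can always encode them.

```python
import codecs

# Toy table for illustration only (hypothetical, not the package's data).
# All replacement values are ASCII, so any ASCII-compatible codec accepts them.
TOY_TABLE = {"€": "EUR", "☺": ":-)"}


def replace_translit_toy(error):
    """Transliterate each unencodable character, or fall back to '?'."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    # The span of the input that the target encoding could not represent.
    bad = error.object[error.start:error.end]
    replacement = "".join(TOY_TABLE.get(ch, "?") for ch in bad)
    # Return the replacement text and the position at which to resume encoding.
    return replacement, error.end


# Register the handler under a made-up name so it cannot clash with real ones.
codecs.register_error("replace/translit-toy", replace_translit_toy)

encoded = "jaźń € ☺另".encode("ISO-8859-2", "replace/translit-toy")
decoded = encoded.decode("ISO-8859-2")
```

Here ``ź`` and ``ń`` are encoded directly because ISO-8859-2 contains them, ``€`` and ``☺`` are transliterated from the toy table, and ``另`` has no entry and so becomes ``?``. An "ignore" variant would use an empty string in place of ``"?"``.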
Released on May 8, 2021
Released on December 13, 2020
Released on January 19, 2020
Released on January 19, 2020
Released on January 18, 2020
Complete coverage of the Vietnamese alphabet
Removed Python 2 support
Released on May 11, 2015
Released on February 14, 2011
Fixes to the transtab table rebuilding tool.
Added translitcodec.version
Released on January 27, 2011
Resolves the "TypeError: character mapping must return integer, None or unicode" raised when a blank value (e.g. \N{ZERO WIDTH SPACE}, \u200B) was encoded. Unicode blanks are now returned.
Characters in the ASCII range are no longer included in the translation tables.
Released on December 28, 2008