
hyperglot
Hyperglot is an open research project dedicated to documenting how the world’s languages are written. By mapping orthographies and their requirements, it supports inclusive, multilingual type design and equitable access to high-quality typography for underserved communities. Hyperglot currently covers 783 languages, representing approximately 7.3 billion speakers, and is developed as open source by Rosetta Type/Research in collaboration with a global community of contributors and licensed under the Apache 2.0 license.
Hyperglot is available as:
- a command-line tool, hyperglot,
- a Python package, import hyperglot (see examples for basic usage).
📖 Learn more about Hyperglot
🙋 Read the FAQ
💰 Sponsor via GitHub or directly via Hyperglot sponsorship. Any and all contributions are much appreciated! 🙏
Hyperglot is a work in progress and provided AS IS. The validity of language data varies and continues to improve. Each language includes a validity label (todo, draft, preliminary, verified) to help you assess the data.
Mapping all the world’s languages is a huge task—we need help from native speakers and language users! If you notice an error or see that a language is missing, please get in touch (via email or Issues). We welcome contributions and will credit your input.
The data structure is documented in a separate README file along with guidelines for contributing.
The following concepts are essential to understanding how Hyperglot works.
A language can be written in one or more scripts. Each such writing system is represented in Hyperglot as an orthography. Most languages have a single primary orthography; however, some use multiple orthographies either independently (for example, in different regions) or concurrently (such as Serbian or Japanese).
In the database, an orthography contains the following character sets:
- base – the required, essential characters,
- aux – non-essential, recommended characters,
- marks – combining marks,
- punctuation,
- numerals, and
- currency.
A script, however, is more than a collection of characters. It also defines how characters interact when combined. This behavior is known as shaping and, in digital fonts, is implemented using OpenType features.
Read the detailed description of the database structure
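The character sets of an orthography can be pictured as a small record with one set per category. The following is a minimal sketch under stated assumptions, not Hyperglot's own data model: the `Orthography` class, its `required` method, and the `missing_characters` helper are hypothetical names, and the sample characters are toy data rather than entries from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orthography:
    """Hypothetical stand-in for one orthography entry (not Hyperglot's class)."""
    script: str
    base: frozenset = frozenset()         # required, essential characters
    aux: frozenset = frozenset()          # non-essential, recommended characters
    marks: frozenset = frozenset()        # combining marks
    punctuation: frozenset = frozenset()
    numerals: frozenset = frozenset()
    currency: frozenset = frozenset()

    def required(self, check=("base",)):
        """Return the union of the named character sets."""
        chars = set()
        for name in check:
            chars |= getattr(self, name)
        return chars


def missing_characters(orthography, font_chars, check=("base",)):
    """Characters from the selected sets that the font does not contain."""
    return orthography.required(check) - set(font_chars)


# Toy data for illustration only; not taken from the Hyperglot database.
toy = Orthography(
    script="Latin",
    base=frozenset("abcde\u00e9"),
    aux=frozenset("\u00fc"),
    numerals=frozenset("0123456789"),
)
font_chars = set("abcde\u00e90123456789")

print(missing_characters(toy, font_chars))                   # base set only
print(missing_characters(toy, font_chars, ("base", "aux")))  # base and aux
```

Keeping each category as its own set makes it cheap to widen or narrow what counts as "required" without touching the data itself.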
To detect language support in a font, Hyperglot performs the following checks:
Additional design-related notes are provided for the user’s discretion when assessing design quality. Hyperglot does not assess the font design in any way.
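As an illustration of the character-level part of such a check, the sketch below treats coverage as a subset test and shows one way the handling of precomposed characters (the -d, --decomposed option) could work. It is a simplified model, not Hyperglot's implementation: `required_characters` and `is_covered` are hypothetical names, using Unicode NFD normalization to find component characters is an assumption, and shaping checks are not modelled at all.

```python
import unicodedata


def required_characters(chars, decomposed=False):
    """Return the set of characters a font would need for `chars`.

    With decomposed=True, a precomposed character is replaced by its
    NFD components; otherwise the character is required as listed.
    """
    required = set()
    for ch in chars:
        parts = unicodedata.normalize("NFD", ch)
        if decomposed and len(parts) > 1:
            required.update(parts)  # e.g. U+00E9 -> "e" + U+0301
        else:
            required.add(ch)
    return required


def is_covered(chars, font_chars, decomposed=False):
    """True when every required character is present in the font."""
    return required_characters(chars, decomposed) <= set(font_chars)


# Toy font containing "a", "e" and the combining acute accent,
# but not the precomposed U+00E9.
font_chars = {"a", "e", "\u0301"}
orthography_chars = "a\u00e9"

print(is_covered(orthography_chars, font_chars))                   # strict check
print(is_covered(orthography_chars, font_chars, decomposed=True))  # components only
```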
You will need to have Python 3 installed. Install via pip:
pip install hyperglot
Besides the main hyperglot command used for font inspection, the package also includes:
- hyperglot-report – explore missing language support (see below).
- hyperglot-data – review language data stored in the database.
- hyperglot-validate, hyperglot-save, and hyperglot-export – manage and process data when contributing.
Use:
hyperglot path/to/font.otf
to output a list of supported languages (and other data) for a font. Use:
hyperglot path/to/font.otf path/to/anotherfont.otf …
to check several fonts at once, or their combined coverage (with -m union).
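The difference between checking fonts individually and checking their combined coverage can be sketched with plain sets. This is a hedged illustration only: `combined_coverage` and `supported` are hypothetical helpers, the font names and language data are toy values, and the assumption is that "union" pools the characters of all fonts while "intersection" keeps only the characters every font contains.

```python
def combined_coverage(font_charsets, mode="union"):
    """Combine per-font character sets according to `mode`."""
    sets = [set(chars) for chars in font_charsets.values()]
    if not sets:
        return set()
    if mode == "union":
        return set().union(*sets)        # characters found in any font
    if mode == "intersection":
        return set.intersection(*sets)   # characters found in every font
    raise ValueError(f"unknown mode: {mode!r}")


def supported(languages, available):
    """Names of languages whose required characters are all available."""
    return sorted(
        name for name, required in languages.items()
        if set(required) <= available
    )


# Toy data for illustration only.
fonts = {"a.otf": "abc", "b.otf": "cde"}
languages = {"lang-one": "abc", "lang-two": "ade", "lang-three": "c"}

print(supported(languages, combined_coverage(fonts, "union")))
print(supported(languages, combined_coverage(fonts, "intersection")))
```

In this model a language counts as supported under "intersection" exactly when every individual font supports it, while "union" can also pick up languages that no single font covers on its own.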
-c, --check: Specify which character sets to check against. Options are 'base, auxiliary, punctuation, numerals, currency, all', or a comma-separated combination of these. (Default: 'base')
--validity: Filter languages by data validity level. Options are 'todo, draft, preliminary, verified'. (Default: 'preliminary')
-s, --status: Specify which languages to consider when checking support. Options are 'living, historical, constructed, all', or a comma-separated combination of these. (Default: 'living,constructed')
-o, --orthography: Which orthographies to consider when checking support for a language. Options are 'primary, secondary, historical, transliteration, all', or a comma-separated combination of these. (Default: 'primary')
-d, --decomposed: For precomposed character combinations, require only the individual component characters. By default, precomposed character combinations are also required when they have a unique code point in Unicode. (Default: False)
-m, --marks: Require that a font include all combining marks used by a language’s orthography. By default, only marks that are not part of precomposed character combinations are required. (Default: False)
--sort: Specify the sort order. Use "speakers" to sort by number of speakers. (Default: "alphabetic")
--sort-dir: Specify the sort direction. Use "desc" for descending order. (Default: "asc" for ascending order)
-y, --output: Specify a file path to write the output to, in YAML format. For a single input font, the output is a subset of the Hyperglot database containing the languages and orthographies supported by the font. When multiple fonts are provided, the YAML file contains a top-level key for each font. If the -m option is provided, the output includes the specific intersection or union result.
-t, --shaping-threshold: Set the frequency threshold for complex-script shaping checks. A font passes when it renders correctly for combinations at or above this threshold. Frequencies range from 1.0 (most frequent combinations) to 0.0 (rarest combinations). (Default: 0.01)
--no-shaping: Disable shaping checks (mark attachment, joining behavior, and conjunct shaping). (Default: shaping checks enabled)
-v, --verbose: Enable verbose logging.
-V, --version: Print the Hyperglot version number.
The hyperglot-report command reports missing characters and shaping support. A common use case is identifying languages that could be supported with minimal additional work in a given font. The command accepts the same options as hyperglot and the following options:
--report-missing: Report languages missing n or fewer characters. If n is 0, all languages with any number of missing characters are reported. (Default: 0)
--report-marks: Report languages missing n or fewer mark-attachment sequences. If n is 0, all languages with any number of missing mark-attachment sequences are reported. (Default: 0)
--report-joining: Report languages missing n or fewer joining sequences. If n is 0, all languages with any number of missing joining sequences are reported. (Default: 0)
--report-all: Set or override all other --report-* options.
The comparison of Hyperglot and the Unicode CLDR (this might be outdated at the moment).
The other directory is replicated from various public-domain and open-source origins for comparison and aggregation (mostly present in historic commits of this repository).
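Returning to the --report-missing option of hyperglot-report, the "n or fewer" threshold can be sketched as a simple filter over per-language missing characters. This is an illustrative model, not the tool's code: `report_missing` is a hypothetical function, the language data is toy data, and it assumes that fully supported languages are left out of the report.

```python
def report_missing(languages, font_chars, n=0):
    """Map language name -> sorted missing characters.

    Languages missing between 1 and n characters are included;
    n == 0 means no upper limit. Fully covered languages are skipped
    (an assumption of this sketch).
    """
    font_chars = set(font_chars)
    report = {}
    for name, required in languages.items():
        missing = set(required) - font_chars
        if missing and (n == 0 or len(missing) <= n):
            report[name] = sorted(missing)
    return report


# Toy data for illustration only.
languages = {"lang-one": "abc", "lang-two": "abcx", "lang-three": "abxyz"}
font_chars = "abc"

print(report_missing(languages, font_chars))        # every language with gaps
print(report_missing(languages, font_chars, n=1))   # only near-complete ones
```

A small n surfaces the languages that are closest to being supported, which matches the use case of finding support that could be added with minimal extra work.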