fonemas is a Python library of methods and functions for the phonological and phonetic transcription of Spanish words.
This library is part of the research project Sound and Meaning in Spanish Golden Age Literature. It was originally intended to analyse only the phonological features relevant to verse scansion, but it has since expanded into a fully featured phonological and phonetic analyser with IPA and SAMPA support.
pip3 install fonemas
The library provides the class Transcription(sentence, mono, epenthesis, aspiration, rehash, sampastr). The class takes the obligatory argument sentence, a string containing one or more Spanish words. It also accepts the optional Boolean arguments mono, epenthesis and aspiration, which default to False, as well as rehash and sampastr.
mono sets whether the output shows graphic stresses for monosyllabic words.
epenthesis sets the behaviour of S before a consonant in onset (spiritu -> es pi ri tu | spi ri tu).
aspiration inserts an aspiration modifier 'ʰ' in onset. This may be useful when dealing with ambiguous verses in classic poetry to choose which synaloepha to break.
rehash moves the last consonant of a word's final-syllable coda to the onset of the next word's first syllable if that word begins with a vowel.
sampastr sets an alternative stress symbol to replace the default '"', e.g. to prevent issues when the output is used in a CSV file.
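The resyllabification that rehash describes can be illustrated with a minimal sketch. This is not the library's implementation: the function name rehash_sketch, the VOWELS set, and the list-of-syllables input format are all assumptions made for the example, and it works on plain spelling rather than on transcribed phonemes.

```python
# Hypothetical illustration of cross-word resyllabification; not fonemas code.
VOWELS = set("aeiouáéíóúü")


def rehash_sketch(words):
    """Move a word-final consonant to the onset of the next word
    when that word begins with a vowel.

    `words` is a list of words, each given as a list of syllable strings.
    """
    result = [list(word) for word in words]  # copy so the input is not mutated
    for current, following in zip(result, result[1:]):
        last = current[-1]
        ends_in_consonant = len(last) > 1 and last[-1] not in VOWELS
        if ends_in_consonant and following[0][0] in VOWELS:
            current[-1] = last[:-1]                  # drop the coda consonant
            following[0] = last[-1] + following[0]   # attach it as an onset
    return result


print(rehash_sketch([["los"], ["o", "jos"]]))   # [['lo'], ['so', 'jos']]
print(rehash_sketch([["los"], ["ga", "tos"]]))  # unchanged: next word starts with a consonant
```

The sketch ignores silent h, stress marks, and multi-consonant codas, all of which a real analyser would need to handle.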
A Transcription object has three dataclass attributes. Each of them has two attributes of its own, words and syllables, and each of those contains a list of strings (whole words or individual syllables, respectively).
phonology for the phonological transcription (requires Unicode support).
phonetics for the phonetic transcription in IPA symbols (requires Unicode support).
sampa for the SAMPA transliteration of the phonetic transcription.
>>> from fonemas import Transcription
>>> a = Transcription('Averigüéis')
>>> a.phonology.words
['abeɾiˈgwejs']
>>> a.phonology.syllables
['a', 'be', 'ɾi', 'ˈgwejs']
>>> a.phonetics.words
['aβeɾiˈɣwejs']
>>> a.phonetics.syllables
['a', 'βe', 'ɾi', 'ˈɣwejs']
>>> a.sampa.words
['aBeri"Gwejs']
>>> a.sampa.syllables
['a', 'Be', 'ri', '"Gwejs']
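The SAMPA stress mark '"' in the output above is also the default quote character in CSV, which is the kind of issue sampastr is meant to avoid. The following sketch uses only the standard csv module to show what happens to such a string when written to CSV. The helper csv_line is hypothetical, and the replace call with "'" is an arbitrary stand-in for an alternative stress symbol, not a demonstration of how sampastr itself works.

```python
import csv
import io


def csv_line(fields):
    """Serialise one row with the csv module's default dialect settings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()


default_mark = 'aBeri"Gwejs'                  # SAMPA output with the default stress mark
alternative = default_mark.replace('"', "'")  # arbitrary placeholder symbol (assumption)

print(csv_line(["averigüéis", default_mark]), end="")  # averigüéis,"aBeri""Gwejs"
print(csv_line(["averigüéis", alternative]), end="")   # averigüéis,aBeri'Gwejs
```

With the default mark, the writer has to wrap the field in quotes and double the embedded quote, so any tool that reads the file without a proper CSV parser will see a mangled transcription. With a different symbol the field is written verbatim.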
The transcription is done according to the Spanish phonology and phonotactics described by Quilis (2019).
The phonetic transcription lacks the allophones that IPA represents with diacritics. These require two characters per sound, which need a workaround to be evaluated. This can be solved with hacks for 'special cases', which I will do until I figure out a general solution.
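The two-character problem can be seen with the standard unicodedata module. The sketch below is only an illustration of the issue, not code from fonemas: the helper clusters is hypothetical, and β with a lowering diacritic is just one example of an IPA symbol written as a base character plus a combining mark.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for char in text:
        if groups and unicodedata.combining(char):
            groups[-1] += char   # attach the diacritic to the preceding base
        else:
            groups.append(char)  # start a new cluster
    return groups


approximant = "\u03b2\u031e"  # β followed by COMBINING DOWN TACK BELOW

print(len(approximant))                         # 2: one sound, two code points
print(len(clusters("a" + approximant + "e")))   # 3: a, β plus diacritic, e
```

Any routine that walks a transcription one character at a time will treat the diacritic as a separate segment unless it groups the code points first, as clusters does here.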
Words from other languages that share a spelling pattern with Spanish but follow different prosodic rules will cause problems, e.g. Latin 'amor', 'amabor', 'amabar', 'amer' vs. Spanish 'amor', 'labor', 'acabar', 'temer'.
Feel free to contribute using the GitHub Issue Tracker for feedback, suggestions, or bug reports.
Authors of scientific papers including results generated using fonemas are encouraged to cite the following paper.
@article{SanzLazaroF_RHD2023,
author = {Sanz-Lázaro, Fernando},
title = {Del fonema al verso: una caja de herramientas digitales de escansión teatral},
volume = {8},
journal = {Revista de Humanidades Digitales},
doi = {https://doi.org/10.5944/rhd.vol.8.2023.37830},
pages = {74--89},
langid = {Spanish},
}
Copyright (C) 2022 Fernando Sanz-Lázaro <fsanzl@gmail.com>
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library. If not, see <https://www.gnu.org/licenses/>.
Quilis, Antonio, Tratado de fonología y fonética españolas. Madrid, Gredos, 2019.