
pip install bgnlp
Note: the first time you run any of these operations, the corresponding model is downloaded, so the first run may take longer.
from bgnlp import pos
print(pos("Това е библиотека за обработка на естествен език."))
[{
"word": "Това",
"tag": "PDOsn",
"bg_desc": "местоимение",
"en_desc": "pronoun"
}, {
"word": "е",
"tag": "VLINr3s",
"bg_desc": "глагол",
"en_desc": "verb"
}, {
"word": "библиотека",
"tag": "NCFsof",
"bg_desc": "съществително име",
"en_desc": "noun"
}, {
"word": "за",
"tag": "R",
"bg_desc": "предлог",
"en_desc": "preposition"
}, {
"word": "обработка",
"tag": "NCFsof",
"bg_desc": "съществително име",
"en_desc": "noun"
}, {
"word": "на",
"tag": "R",
"bg_desc": "предлог",
"en_desc": "preposition"
}, {
"word": "естествен",
"tag": "Asmo",
"bg_desc": "прилагателно име",
"en_desc": "adjective"
}, {
"word": "език",
"tag": "NCMsom",
"bg_desc": "съществително име",
"en_desc": "noun"
}, {
"word": ".",
"tag": "U",
"bg_desc": "препинателен знак",
"en_desc": "punctuation"
}]
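Since `pos` returns a list of dictionaries, the result can be post-processed with ordinary Python. The following is a minimal sketch, not part of bgnlp: `group_by_pos` is a hypothetical helper, and the sample data is copied (shortened) from the output above so that the snippet runs without downloading a model.

```python
from collections import defaultdict

# Sample copied from the output above (first four tokens only).
tagged = [
    {"word": "Това", "tag": "PDOsn", "bg_desc": "местоимение", "en_desc": "pronoun"},
    {"word": "е", "tag": "VLINr3s", "bg_desc": "глагол", "en_desc": "verb"},
    {"word": "библиотека", "tag": "NCFsof", "bg_desc": "съществително име", "en_desc": "noun"},
    {"word": "за", "tag": "R", "bg_desc": "предлог", "en_desc": "preposition"},
]


def group_by_pos(tokens):
    """Hypothetical helper: map each English description to its words."""
    groups = defaultdict(list)
    for token in tokens:
        groups[token["en_desc"]].append(token["word"])
    return dict(groups)


by_pos = group_by_pos(tagged)
print(by_pos["noun"])
```

In real use, `tagged` would be replaced by the value returned from `pos(...)`, assuming it has the same shape as the printed output.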
from bgnlp import lemmatize
text = "Добре дошли!"
print(lemmatize(text))
[{'word': 'Добре', 'lemma': 'Добре'}, {'word': 'дошли', 'lemma': 'дойда'}, {'word': '!', 'lemma': '!'}]
# Generating a string of lemmas.
print(lemmatize(text, as_string=True))
Добре дойда!
Currently, the available NER tags are:
PER - Person
ORG - Organization
LOC - Location
from bgnlp import ner
text = "Барух Спиноза е роден в Амстердам"
print(f"Input: {text}")
print("Result:", ner(text))
Input: Барух Спиноза е роден в Амстердам
Result: [{'word': 'Барух Спиноза', 'entity_group': 'PER'}, {'word': 'Амстердам', 'entity_group': 'LOC'}]
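The `entity_group` codes in the result can be mapped back to the readable tag names listed earlier. This is a small sketch under the assumption that `ner` always returns dictionaries with the `word` and `entity_group` keys shown above; `TAG_NAMES` and `describe_entities` are illustrative names, not part of bgnlp.

```python
# Readable names for the NER tags listed in this README.
TAG_NAMES = {"PER": "Person", "ORG": "Organization", "LOC": "Location"}

# Sample copied from the result above, so no model is needed to run this.
entities = [
    {"word": "Барух Спиноза", "entity_group": "PER"},
    {"word": "Амстердам", "entity_group": "LOC"},
]


def describe_entities(found):
    """Hypothetical helper: format each entity as 'word (Tag name)'."""
    lines = []
    for entity in found:
        code = entity["entity_group"]
        # Fall back to the raw code if a tag is not in the table.
        lines.append(f"{entity['word']} ({TAG_NAMES.get(code, code)})")
    return lines


described = describe_entities(entities)
for line in described:
    print(line)
```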
from pprint import pprint
from bgnlp import extract_keywords
# Read the text from a file, since it may be too large to paste here.
# The current input is this Bulgarian news article (only the text, no HTML!):
# https://novini.bg/sviat/eu/781622
with open("input_text.txt", "r", encoding="utf-8") as f:
    text = f.read()
# Extracting keywords with probability of at least 0.5.
keywords = extract_keywords(text, threshold=0.5)
print("Keywords:")
pprint(keywords)
Keywords:
[{'keyword': 'Еманюел Макрон', 'score': 0.8759163320064545},
 {'keyword': 'Г-7', 'score': 0.5938143730163574},
 {'keyword': 'Япония', 'score': 0.607077419757843}]
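The keywords in the sample output are not ordered by score. If a ranked list is wanted, the result can be sorted afterwards. The sketch below assumes only the `keyword` and `score` keys visible in the output above; `rank_keywords` is a hypothetical helper, not a bgnlp function.

```python
# Sample copied from the output above, so no model is needed to run this.
keywords = [
    {"keyword": "Еманюел Макрон", "score": 0.8759163320064545},
    {"keyword": "Г-7", "score": 0.5938143730163574},
    {"keyword": "Япония", "score": 0.607077419757843},
]


def rank_keywords(items, min_score=0.0):
    """Hypothetical helper: keep items at or above min_score, best first."""
    kept = [item for item in items if item["score"] >= min_score]
    return sorted(kept, key=lambda item: item["score"], reverse=True)


ranked = rank_keywords(keywords)
print([item["keyword"] for item in ranked])
```

Passing a higher `min_score` here filters an existing result further without running the model again.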
from pprint import pprint
from bgnlp import commatize
text = "Човекът искащ безгрижно писане ме помоли да създам този модел."
print("Without metadata:")
print(commatize(text))
print("\nWith metadata:")
pprint(commatize(text, return_metadata=True))
Without metadata:
Човекът, искащ безгрижно писане, ме помоли да създам този модел.
With metadata:
('Човекът, искащ безгрижно писане, ме помоли да създам този модел.',
 [{'end': 12,
   'score': 0.9301406145095825,
   'start': 0,
   'substring': 'Човекът, иск'},
  {'end': 34,
   'score': 0.93571537733078,
   'start': 24,
   'substring': ' писане, м'}])
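In the sample above, each metadata entry's `start` and `end` appear to be character offsets into the returned (commatized) string, since slicing that string with them reproduces `substring`. The sketch below checks this on the sample and locates the inserted commas. It is an observation about this one output, not a documented guarantee, and `comma_positions` is a hypothetical helper.

```python
# Sample copied from the "With metadata" output above.
commatized = "Човекът, искащ безгрижно писане, ме помоли да създам този модел."
metadata = [
    {"end": 12, "score": 0.9301406145095825, "start": 0, "substring": "Човекът, иск"},
    {"end": 34, "score": 0.93571537733078, "start": 24, "substring": " писане, м"},
]


def comma_positions(text, spans):
    """Hypothetical helper: index of the first comma inside each span."""
    positions = []
    for span in spans:
        window = text[span["start"]:span["end"]]
        offset = window.find(",")
        if offset != -1:
            positions.append(span["start"] + offset)
    return positions


# Each slice of the output string should match the reported substring.
for span in metadata:
    assert commatized[span["start"]:span["end"]] == span["substring"]

positions = comma_positions(commatized, metadata)
print(positions)
```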
FAQs
Package for Bulgarian Natural Language Processing (NLP)
We found that bgnlp demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.