
compling
compling is a Python module that provides some Natural Language Processing and Computational Linguistics functionalities to work with human language data. It incorporates various Data and Text Mining features from other famous libraries (e.g. spacy, nltk, sklearn, ...) in order to arrange a pipeline aimed at the analysis of corpora of JSON documents.
See documentation here.
You can install compling with:
$ pip install compling
compling requires:
You also need to download:
a spacy language model
See here the available models. You can choose based on the language of your corpus documents.
By default, compling expects you to download sm models. You can still choose to download larger models, but remember to edit the config.ini file so that it works properly.
Example
Let's assume the language of your documents is English.
You could download the spacy small english model:
$ python -m spacy download en_core_web_sm
some nltk functionalities:
$ python -m nltk.downloader stopwords
$ python -m nltk.downloader punkt
The functionalities offered by compling may require a large variety of parameters. To facilitate their use, default values for some parameters are provided in the config.ini file.
You can see a preview below:
[Corpus]
;The language of documents in your corpus.
language = english
;Documents in your corpus store their text in this key.
text_key = text
;Documents in your corpus store their date values as string in this format.
;For a complete list of formatting directives, see: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior.
date_format = %d/%m/%Y
;The size of the spacy model to be used in text processing.
spacy_model_size = md
[Document_record]
;Document records metadata:
;If lower==1, a lowercase version will be stored for each document.
lower = 0
;If lemma==1, a version with tokens replaced by their lemma will be stored for each document.
lemma = 0
;If stem==1, a version with tokens replaced by their stem will be stored for each document.
stem = 0
;If negations==1, a version where negated tokens are preceded by the 'NOT_' prefix will be stored for each document.
negations = 1
;If named_entities==1, the occurring named entities will be stored in a list for each document.
named_entities = 1
; ...
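To see how values like these can be read and applied, here is a minimal sketch that uses only Python's standard configparser module. It does not use compling's own API. The inlined preview is shortened, and the sample document and its "date" key are hypothetical, chosen only to illustrate the text_key and date_format settings.

```python
import configparser
from datetime import datetime

# A shortened copy of the preview above, inlined so the sketch is self-contained.
PREVIEW = """
[Corpus]
language = english
text_key = text
date_format = %d/%m/%Y
spacy_model_size = md

[Document_record]
lower = 0
negations = 1
"""

# interpolation=None keeps '%d/%m/%Y' literal; the default interpolation
# treats '%' as special and would fail when this value is read.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(PREVIEW)

corpus = parser["Corpus"]
record = parser["Document_record"]

# Hypothetical JSON-like document; the "date" key is an assumption.
doc = {"text": "This is not a bad example.", "date": "25/12/2020"}

# Look up the text through the configured key and parse the date
# with the configured format.
text = doc[corpus["text_key"]]
date = datetime.strptime(doc["date"], corpus["date_format"])

# getboolean maps the 0/1 flags to False/True.
store_lower = record.getboolean("lower")
store_negations = record.getboolean("negations")
```

Disabling interpolation matters here because date_format contains % directives; with the parser's default settings, reading that value raises an error.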
compling provides the ConfigManager class to make it easier for you to edit the config.ini file and to help you handle corpus processing.
You can see a short example of usage at https://github.com/FrancescoPeriti/compling.
See the documentation for more details.