compling
Computational Linguistics with Python

compling is a Python module that provides Natural Language Processing and Computational Linguistics functionality for working with human language data. It combines data and text mining features from well-known libraries (e.g. spacy, nltk, sklearn) into a pipeline for analyzing corpora of JSON documents.
Documentation
See documentation here.
Installation
You can install compling with:
$ pip install compling
compling requires:
- Python (>= 3.6)
- numpy
- spacy
- nltk
- gensim
- tqdm
- unicodedata2
- unidecode
- configparser
- vaderSentiment
- wordcloud
You also need to download:
- a spacy language model
See here for the available models. Choose one based on the language of the documents in your corpus.
By default, compling expects you to download the sm models. You can still download larger models, but remember to edit the config.ini file so that compling can find them.
Example
Let's assume the language of your documents is English.
You could download the small English spacy model:
$ python -m spacy download en_core_web_sm
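spacy models are installed as regular Python packages, so one way to check whether the download succeeded is to look the package up by name. The sketch below uses only the standard library; the helper name `is_model_installed` is made up for this illustration and is not part of compling or spacy.

```python
import importlib.util


def is_model_installed(model_name):
    """Return True if a top-level package with this name can be found."""
    # Hypothetical helper: spacy models are pip packages, so find_spec
    # returns a spec when the model is installed and None otherwise.
    return importlib.util.find_spec(model_name) is not None


if not is_model_installed("en_core_web_sm"):
    print("Model missing, run: python -m spacy download en_core_web_sm")
```

This only tells you that the package is present; it does not load the model or check that its version matches your spacy installation.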
- some nltk functionalities:
config.ini
compling's functions can take a large number of parameters. To make them easier to use, default values are provided for some of them:
- some can be overridden in the function call, since many functions accept optional parameters;
- others are stored in the config.ini file.
This file configures how your corpora are processed. It holds the values of some special parameters (e.g. the language of the documents in your corpus).
You can see a preview below:
[Corpus]
language = english
text_key = text
date_format = %d/%m/%Y
spacy_model_size = md
[Document_record]
lower = 0
lemma = 0
stem = 0
negations = 1
named_entities = 1
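As a rough sketch, the preview above can be read with Python's standard configparser module. This is not necessarily how compling loads the file internally; it only illustrates how the values could be interpreted. Interpolation is disabled because date_format contains literal % characters, which the default parser would otherwise try to expand.

```python
import configparser
from datetime import datetime

# Copy of the preview above; for a real file you would call parser.read(path).
CONFIG_PREVIEW = """
[Corpus]
language = english
text_key = text
date_format = %d/%m/%Y
spacy_model_size = md

[Document_record]
lower = 0
lemma = 0
stem = 0
negations = 1
named_entities = 1
"""

# interpolation=None keeps the '%' characters in date_format literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_PREVIEW)

language = parser.get("Corpus", "language")
date_format = parser.get("Corpus", "date_format")

# The 0/1 switches can be read as booleans.
flags = {
    name: parser.getboolean("Document_record", name)
    for name in parser.options("Document_record")
}

# date_format is a strptime/strftime pattern (day/month/year here).
parsed = datetime.strptime("25/12/2020", date_format)
```

With these values, a date such as 25/12/2020 is parsed as day 25, month 12, and the negations and named_entities switches are the only ones turned on.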
ConfigManager
compling provides the ConfigManager class to make it easier to edit the config.ini file and to help you handle corpus processing.
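The ConfigManager API is not shown in this README, so the sketch below does not use it. It only illustrates the kind of edit involved, here switching spacy_model_size to sm, using the standard configparser module on a temporary copy of the file. The helper name `set_option` and the temporary path are assumptions made for this example.

```python
import configparser
import os
import tempfile


def set_option(path, section, option, value):
    """Set one option in an INI file and write the file back."""
    # Hypothetical helper, not part of compling.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, str(value))
    with open(path, "w") as handle:
        parser.write(handle)


# Work on a throwaway file so the real config.ini is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "config.ini")
with open(demo_path, "w") as handle:
    handle.write(
        "[Corpus]\n"
        "language = english\n"
        "date_format = %d/%m/%Y\n"
        "spacy_model_size = md\n"
    )

set_option(demo_path, "Corpus", "spacy_model_size", "sm")

# Read the file back to confirm the change.
check = configparser.ConfigParser(interpolation=None)
check.read(demo_path)
```

Note that configparser does not preserve comments when it rewrites a file, which is one reason to prefer the ConfigManager class described in the documentation for real edits.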
Example of usage
You can see a short example of usage at https://github.com/FrancescoPeriti/compling.
See the documentation for more details.