
A simple NLP library for profiling datasets with one or more text columns.
Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level/granular statistical information about the text in that column.
In short: think of it as using the pandas.describe() function or running Pandas Profiling on your data frame, but for datasets containing text columns rather than the usual columnar datasets.
The profiled output can in turn be summarised with pandas.describe() on the dataframe. See the screenshots under the Jupyter section and also under Screenshots for further illustrations.
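To make the idea of granular text statistics concrete, here is a minimal standard-library sketch. It is not NLP Profiler's implementation: the function names `profile_text` and `summarise`, and the column names, are hypothetical and chosen only for illustration. It computes a few per-row counts and then a describe()-style summary of one of them.

```python
import statistics


def profile_text(text):
    """Return a few granular statistics for one piece of text.

    The keys below are illustrative names, not the library's columns.
    """
    words = text.split()
    return {
        "characters_count": len(text),
        "words_count": len(words),
        "exclamation_marks_count": text.count("!"),
    }


def summarise(values):
    """Rough analogue of a describe() summary for one numeric column."""
    values = list(values)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


# A tiny stand-in for a text column in a dataset.
rows = [
    "I love this library!",
    "Not sure about it.",
    "Great, great work!!",
]

profiles = [profile_text(row) for row in rows]
words_summary = summarise(p["words_count"] for p in profiles)

print(profiles[0])
print(words_summary)
```

The library itself goes well beyond simple counts like these, so treat this only as a picture of the "one row of text in, one row of numbers out" shape.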
Under the hood it makes use of a number of libraries that are popular in the AI and ML communities, but its functionality can be extended by replacing or adding other libraries as well.
A simple notebook has been provided to illustrate the usage of the library.
Please join the Gitter.im community and say "hello" to us, share your feedback, have a fun time with us.
Note: this is a new endeavour and it may have rough edges, i.e. NLP_Profiler in its current version is probably NOT capable of doing many things. Many of these gaps are opportunities we can work on and plug as we go along using it. Please provide constructive feedback to help improve this library. One such gap, scaling to larger datasets, was addressed recently.
Dependencies are listed in requirements.txt.
A short demo of the NLP Profiler library is also available.
For Conda/Miniconda environments:
conda config --set pip_interop_enabled True
pip install "spacy >= 2.3.0,<3.0.0" # in case spacy is not present
python -m spacy download en_core_web_sm
### now perform any of the below pathways/options
From PyPI:
pip install -U nlp_profiler
From the GitHub repo:
pip install -U git+https://github.com/neomatrix369/nlp_profiler.git@master
From the source (only for development purposes), see Developer guide
import nlp_profiler.core as nlpprof
new_text_column_dataset = nlpprof.apply_text_profiling(dataset, 'text_column')
or
from nlp_profiler.core import apply_text_profiling
new_text_column_dataset = apply_text_profiling(dataset, 'text_column')
See Notebooks section for further illustrations.
See the Developer guide to learn how to build, test, and contribute to the library.
After successful installation of the library, RESTART Jupyter kernels or Google Colab runtimes for the changes to take effect.
See Notebooks for usage and further details.
See Screenshots
See CHANGELOG.md
Refer to the licensing (and warranty) policy.
Contributions are Welcome!
Please have a look at the CONTRIBUTING guidelines.
Please share it with the wider community (and get credited for it)!