
Preprocessing and Extraction of Linguistic Information for Computational Analysis
.. |logo| image:: https://raw.githubusercontent.com/ypauli/pelican_nlp/main/docs/images/pelican_logo.png
   :alt: pelican_nlp Logo
   :width: 200px

+------------+-------------------------------------------------------------------+
| |logo|     | pelican_nlp stands for "Preprocessing and Extraction of Linguistic|
|            | Information for Computational Analysis - Natural Language         |
|            | Processing". This package enables the creation of standardized and|
|            | reproducible language processing pipelines, extracting linguistic |
|            | features from various tasks like discourse, fluency, and image    |
|            | descriptions.                                                     |
+------------+-------------------------------------------------------------------+

.. image:: https://img.shields.io/pypi/v/pelican_nlp.svg
   :target: https://pypi.org/project/pelican_nlp/
   :alt: PyPI version

.. image:: https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg
   :target: https://github.com/ypauli/pelican_nlp/blob/main/LICENSE
   :alt: License CC BY-NC 4.0

.. image:: https://img.shields.io/pypi/pyversions/pelican_nlp.svg
   :target: https://pypi.org/project/pelican_nlp/
   :alt: Supported Python Versions

.. image:: https://img.shields.io/badge/Contributions-Welcome-brightgreen.svg
   :target: https://github.com/ypauli/pelican_nlp/blob/main/CONTRIBUTING.md
   :alt: Contributions Welcome

Create a conda environment:

.. code-block:: bash

    conda create -n pelican-nlp -c defaults python=3.10

Activate the environment:

.. code-block:: bash

    conda activate pelican-nlp

Install the package using pip:

.. code-block:: bash

    pip install pelican_nlp

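For the latest development version, install directly from the GitHub repository (a release page URL such as https://github.com/ypauli/pelican_nlp/releases/tag/v0.1.2-alpha is an HTML page, not an installable archive):

.. code-block:: bash

    pip install git+https://github.com/ypauli/pelican_nlp.git
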
To run pelican_nlp, you need a configuration.yml file in your project directory that specifies the settings for your project. Sample configuration files are available in the pelican_nlp GitHub repository: https://github.com/ypauli/pelican_nlp/tree/main/sample_configuration_files

Adapt a sample file to your needs and save it as configuration.yml in your main project directory.

Running pelican_nlp with your configurations can be done directly from the command line interface or via Python script.
Run from the command line:

Navigate to your main project directory and enter the following commands. The directory must contain both your subjects folder and your configuration.yml file.

.. code-block:: bash

    conda activate pelican-nlp
    pelican-run

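
Before launching a run, it can help to confirm that the project directory has the two required items. The following is a minimal sketch, not part of pelican_nlp: the helper ``missing_project_items`` is hypothetical, and the folder name ``subjects`` is an assumption that should be adjusted to match your own layout.

.. code-block:: python

    from pathlib import Path


    def missing_project_items(project_dir, subjects_dirname="subjects",
                              config_name="configuration.yml"):
        """Return the names of required items that are absent from project_dir."""
        root = Path(project_dir)
        missing = []
        # The configuration file must be a regular file in the project root.
        if not (root / config_name).is_file():
            missing.append(config_name)
        # The subjects folder name is an assumption; adjust it to your layout.
        if not (root / subjects_dirname).is_dir():
            missing.append(subjects_dirname)
        return missing


    if __name__ == "__main__":
        problems = missing_project_items(Path.cwd())
        if problems:
            print("Missing from project directory:", ", ".join(problems))
        else:
            print("Project directory contains the required items.")

Running this script from the main project directory lists anything that is missing, so a misplaced configuration file is caught before processing starts.
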
Run with a Python script:

Create a Python file in the IDE of your choice (e.g. Visual Studio Code or PyCharm) and copy the following code into it. Make sure the project uses the previously created conda environment 'pelican-nlp'.

.. code-block:: python

    from pelican_nlp.main import Pelican

    # Path to the configuration file in your main project folder
    configuration_file = "/path/to/your/config/file.yml"

    pelican = Pelican(configuration_file)
    pelican.run()

Replace "/path/to/your/config/file.yml" with the path to the configuration file located in your main project folder.

For reliable operation, data must be stored in the Language Processing Data Structure (LPDS) format, which is inspired by the Brain Imaging Data Structure (BIDS) conventions.

Text and audio files should follow this naming convention::

    [subjectID][sessionID][task][task-supplement][corpus].[extension]

Example filenames:
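
As a hedged illustration of how a name following this pattern could be assembled, the sketch below joins the parts in the documented order. The helper ``lpds_filename`` is hypothetical and not part of pelican_nlp. The underscore separator, the choice of which parts are optional, and the sample values such as ``sub-01`` are all assumptions borrowed from BIDS-style naming, so check them against the LPDS specification before relying on them.

.. code-block:: python

    def lpds_filename(subject, task, extension, session=None,
                      supplement=None, corpus=None, sep="_"):
        """Join the non-empty name parts in the documented order."""
        # Order follows the convention: subject, session, task, supplement, corpus.
        parts = [subject, session, task, supplement, corpus]
        # The separator is an assumption; the pattern above does not state one.
        stem = sep.join(part for part in parts if part)
        return f"{stem}.{extension}"


    if __name__ == "__main__":
        # Hypothetical identifiers, used only to show the shape of the result.
        print(lpds_filename("sub-01", "fluency", "txt", session="ses-01"))

Keeping the separator as a parameter makes it easy to adapt the helper once the exact delimiter used by your data set is confirmed.
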
To optimize performance, close other programs and limit GPU usage during language processing.
Feature 1: Cleaning text files
Feature 2: Linguistic Feature Extraction
You can find example setups in the `examples <https://github.com/ypauli/pelican_nlp/tree/main/examples>`_ folder of the GitHub repository.

Contributions are welcome! Please check out the `contributing guide <https://github.com/ypauli/pelican_nlp/blob/main/CONTRIBUTING.md>`_.

This project is licensed under Attribution-NonCommercial 4.0 International. See the `LICENSE <https://github.com/ypauli/pelican_nlp/blob/main/LICENSE>`_ file for details.