
edu-segmentation


This project aims to improve EDU segmentation performance, building on Segbot. Because Segbot has an encoder-decoder architecture, its bidirectional GRU encoder can be replaced with generative pretrained models such as BART and T5. The new model is evaluated on the RST dataset, trained in a few-shot setting (e.g. 100 examples) instead of on the full dataset.

0.0.115
PyPI

Maintainers: 1

Final Year Project on EDU Segmentation:

Segbot:
http://138.197.118.157:8000/segbot/
https://www.ijcai.org/proceedings/2018/0579.pdf

Installation

To use the EDUSegmentation module, follow these steps:

Import the download module to download all models:

from edu_segmentation.download import download_models
download_models()

Import the edu_segmentation module and its related classes:

from edu_segmentation.main import EDUSegmentation, ModelFactory, BERTUncasedModel, BERTCasedModel, BARTModel

Usage

The edu_segmentation module provides an easy-to-use interface to perform EDU segmentation using different strategies and models. Follow these steps to use it:

Create a segmentation strategy:

You can choose between the default segmentation strategy or a conjunction-based segmentation strategy.

Conjunction-based segmentation strategy: After the text has been EDU-segmented, any conjunction at the start or end of a segment is split off into its own segment.

Default segmentation strategy: No post-processing occurs after the text has been EDU-segmented.

from edu_segmentation.main import DefaultSegmentation, ConjunctionSegmentation
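To make the conjunction-based post-processing concrete, here is a minimal sketch of the idea in plain Python. It is not the package's implementation: the function name isolate_conjunctions, the punctuation handling, and the rule of peeling at most one word from each end are all assumptions made for illustration.

```python
def isolate_conjunctions(segments, conjunctions):
    """Split a leading or trailing conjunction off each segment.

    Hypothetical helper for illustration only; the package's own
    ConjunctionSegmentation may behave differently.
    """
    conj = {c.lower() for c in conjunctions}
    result = []
    for segment in segments:
        words = segment.split()
        if not words:
            continue  # skip empty segments
        leading, trailing = [], []
        # Peel one conjunction off the front, keeping at least one word behind.
        if len(words) > 1 and words[0].strip(",.;:").lower() in conj:
            leading.append(words.pop(0))
        # Peel one conjunction off the end, keeping at least one word behind.
        if len(words) > 1 and words[-1].strip(",.;:").lower() in conj:
            trailing.append(words.pop())
        result.extend(leading)
        result.append(" ".join(words))
        result.extend(trailing)
    return result


segments = ["The food is good,", "but the service is bad."]
print(isolate_conjunctions(segments, ["and", "but", "however"]))
# ['The food is good,', 'but', 'the service is bad.']
```

With the default strategy, the two input segments would be returned unchanged.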

Create a model using the ModelFactory.

Choose from BERT Uncased, BERT Cased, or BART models.

model_type = "bert_uncased"  # or "bert_cased", "bart"
model = ModelFactory.create_model(model_type)
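For readers unfamiliar with the factory pattern, the sketch below shows one common way a string such as "bart" can be mapped to a model object. It is an illustration under assumptions, not the package's ModelFactory: PlaceholderModel, the _REGISTRY dictionary, and the error message are hypothetical.

```python
class PlaceholderModel:
    """Hypothetical stand-in for a segmentation model (not the package's class)."""

    def __init__(self, name):
        self.name = name


# Hypothetical registry mapping model-type strings to constructors.
_REGISTRY = {
    "bert_uncased": lambda: PlaceholderModel("bert_uncased"),
    "bert_cased": lambda: PlaceholderModel("bert_cased"),
    "bart": lambda: PlaceholderModel("bart"),
}


def create_model(model_type):
    """Return a model for model_type, or raise ValueError for unknown names."""
    try:
        return _REGISTRY[model_type]()
    except KeyError:
        raise ValueError(
            f"Unknown model type {model_type!r}; expected one of {sorted(_REGISTRY)}"
        ) from None


model = create_model("bart")
print(model.name)
```

A registry like this keeps the list of supported names in one place, so a typo in the model type can fail early with a readable message.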

Create an instance of EDUSegmentation using the chosen model:

edu_segmenter = EDUSegmentation(model)

Segment the text using the chosen strategy:

text = "Your input text here."
granularity = "conjunction_words"  # or "default"
conjunctions = ["and", "but", "however"]  # Customize conjunctions if needed
device = 'cpu'  # Choose your device, e.g., 'cuda:0'

segmented_output = edu_segmenter.run(text, granularity, conjunctions, device)
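If you are unsure whether a GPU is available, the device string can be chosen at runtime. The sketch below assumes the models run on PyTorch, which the README does not state explicitly; pick_device is a hypothetical helper, not part of edu_segmentation.

```python
def pick_device(preferred="cuda:0"):
    """Return preferred if PyTorch reports a usable GPU, otherwise 'cpu'.

    Hypothetical helper; assumes the package's models run on PyTorch.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to CPU.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed as the device argument to edu_segmenter.run.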

Example

Here's a simple example demonstrating how to use the edu_segmentation module:

from edu_segmentation.download import download_models
from edu_segmentation.main import ModelFactory, EDUSegmentation

download_models()

# Create a BART model
model = ModelFactory.create_model("bart")  # or "bert_cased" or "bert_uncased"

# Create an instance of EDUSegmentation using the model
edu_segmenter = EDUSegmentation(model)

# Segment the text using the conjunction-based segmentation strategy
text = "The food is good, but the service is bad."
granularity = "conjunction_words" # or default
conjunctions = ["and", "but", "however"] # customise as needed
device = 'cpu' # or cuda

segmented_output = edu_segmenter.run(text, granularity, conjunctions, device)
print(segmented_output)
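To segment several texts with the same settings, a thin wrapper around run is enough. This is a sketch that assumes run takes the four positional arguments shown above; segment_many is a hypothetical helper, and the example uses a stub segmenter so that it runs without downloading any models.

```python
def segment_many(segmenter, texts, granularity="default", conjunctions=None, device="cpu"):
    """Call segmenter.run once per text and collect the results in order.

    Hypothetical helper; `segmenter` is any object exposing
    run(text, granularity, conjunctions, device).
    """
    if conjunctions is None:
        conjunctions = ["and", "but", "however"]
    return [segmenter.run(text, granularity, list(conjunctions), device) for text in texts]


class EchoSegmenter:
    """Stub used only to show the call shape; it performs no real segmentation."""

    def run(self, text, granularity, conjunctions, device):
        return (text, granularity, device)


results = segment_many(EchoSegmenter(), ["First text.", "Second text."])
print(results)
```

In real use, pass the EDUSegmentation instance created above in place of the stub.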
