edu-segmentation

  • 0.0.115
  • PyPI

Final Year Project on EDU Segmentation:

This project aims to improve EDU (elementary discourse unit) segmentation performance using Segbot. Because Segbot has an encoder-decoder architecture, its bidirectional GRU encoder can be replaced with a generatively pretrained model such as BART or T5. The new model is evaluated on the RST dataset in a few-shot setting, training on a small subset (e.g. 100 examples) instead of the full dataset.
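The few-shot setting described above can be sketched as drawing a fixed-size, reproducible subset from the training data. This is only an illustration: `sample_few_shot` and the placeholder `full_train` list are hypothetical names, not part of this package, and the project's actual data loading and sampling procedure are not shown in this README.

```python
import random


def sample_few_shot(examples, k=100, seed=0):
    """Draw k examples without replacement, reproducibly for a given seed."""
    if k > len(examples):
        raise ValueError(f"requested {k} examples but only {len(examples)} available")
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.sample(examples, k)


# Placeholder data standing in for annotated training sentences (hypothetical).
full_train = [f"sentence {i}" for i in range(1000)]

few_shot_train = sample_few_shot(full_train, k=100, seed=42)
print(len(few_shot_train))
```

Fixing the seed keeps the 100-example subset identical across runs, which makes comparisons between encoders easier to interpret.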

Segbot:
http://138.197.118.157:8000/segbot/
https://www.ijcai.org/proceedings/2018/0579.pdf


Installation

To use the EDUSegmentation module, install the package from PyPI (pip install edu-segmentation), then follow these steps:

  1. Import the download module to download all models:
from edu_segmentation.download import download_models
download_models()
  2. Import the edu_segmentation module and its related classes:
from edu_segmentation.main import EDUSegmentation, ModelFactory, BERTUncasedModel, BERTCasedModel, BARTModel

Usage

The edu_segmentation module provides an easy-to-use interface to perform EDU segmentation using different strategies and models. Follow these steps to use it:

  1. Create a segmentation strategy:

    You can choose between the default segmentation strategy or a conjunction-based segmentation strategy.

    Conjunction-based segmentation strategy: After the text has been EDU-segmented, any conjunction found at the start or end of a segment is split off into its own segment.

    Default segmentation strategy: No post-processing occurs after the text has been EDU-segmented.

from edu_segmentation.main import DefaultSegmentation, ConjunctionSegmentation
  2. Create a model using the ModelFactory.

    Choose from the BERT Uncased, BERT Cased, or BART model.
model_type = "bert_uncased"  # or "bert_cased", "bart"
model = ModelFactory.create_model(model_type)
  3. Create an instance of EDUSegmentation using the chosen model:
edu_segmenter = EDUSegmentation(model)
  4. Segment the text using the chosen strategy:
text = "Your input text here."
granularity = "conjunction_words"  # or "default"
conjunctions = ["and", "but", "however"]  # Customize conjunctions if needed
device = 'cpu'  # Choose your device, e.g., 'cuda:0'

segmented_output = edu_segmenter.run(text, granularity, conjunctions, device)
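To make the conjunction-based strategy concrete, here is a minimal sketch of the post-processing step it describes. It is not the package's implementation: `isolate_conjunctions` is a hypothetical helper, and the input segments are written by hand rather than produced by a model.

```python
def isolate_conjunctions(segments, conjunctions):
    """Split a leading or trailing conjunction of each segment into its own segment."""
    conj = {c.lower() for c in conjunctions}
    result = []
    for segment in segments:
        words = segment.split()
        if not words:
            continue  # skip empty segments
        leading, trailing = [], []
        # Peel a conjunction off the front, as long as other words remain.
        if len(words) > 1 and words[0].lower().strip(",.;:") in conj:
            leading.append(words.pop(0))
        # Peel a conjunction off the end, as long as other words remain.
        if len(words) > 1 and words[-1].lower().strip(",.;:") in conj:
            trailing.append(words.pop())
        result.extend(leading)
        result.append(" ".join(words))
        result.extend(trailing)
    return result


# Hand-written segments standing in for model output (hypothetical).
segments = ["The food is good,", "but the service is bad."]
print(isolate_conjunctions(segments, ["and", "but", "however"]))
# ['The food is good,', 'but', 'the service is bad.']
```

With the default strategy, the segments would be returned unchanged.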

Example

Here's a simple example demonstrating how to use the edu_segmentation module:

from edu_segmentation.download import download_models
from edu_segmentation.main import ModelFactory, EDUSegmentation

download_models()

# Create a BART model
model = ModelFactory.create_model("bart")  # or "bert_cased", "bert_uncased"

# Create an instance of EDUSegmentation using the model
edu_segmenter = EDUSegmentation(model)

# Segment the text using the conjunction-based segmentation strategy
text = "The food is good, but the service is bad."
granularity = "conjunction_words"  # or "default"
conjunctions = ["and", "but", "however"]  # customise as needed
device = 'cpu'  # or a GPU device, e.g. 'cuda:0'

segmented_output = edu_segmenter.run(text, granularity, conjunctions, device)
print(segmented_output)
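If you call the segmenter from several places, it may help to wrap the steps above in one function that validates its arguments first. The sketch below is an assumption-laden convenience wrapper: `segment_text`, `SUPPORTED_MODELS`, and `SUPPORTED_GRANULARITIES` are hypothetical names, and only the model types, granularity values, and calls shown in this README are used. It assumes `download_models()` has already been run.

```python
SUPPORTED_MODELS = ("bert_uncased", "bert_cased", "bart")
SUPPORTED_GRANULARITIES = ("default", "conjunction_words")


def segment_text(text, model_type="bart", granularity="default",
                 conjunctions=("and", "but", "however"), device="cpu"):
    """Validate the arguments, then run EDU segmentation as in the example above."""
    if model_type not in SUPPORTED_MODELS:
        raise ValueError(f"model_type must be one of {SUPPORTED_MODELS}, got {model_type!r}")
    if granularity not in SUPPORTED_GRANULARITIES:
        raise ValueError(f"granularity must be one of {SUPPORTED_GRANULARITIES}, got {granularity!r}")

    # Imported lazily so that a bad argument fails fast, before any model is loaded.
    from edu_segmentation.main import EDUSegmentation, ModelFactory

    model = ModelFactory.create_model(model_type)
    return EDUSegmentation(model).run(text, granularity, list(conjunctions), device)


# An unsupported model type is rejected before the package is imported.
try:
    segment_text("The food is good, but the service is bad.", model_type="gpt")
except ValueError as err:
    print(err)
```

A valid call would look like `segment_text("The food is good, but the service is bad.", granularity="conjunction_words")`.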
