ragbooster

  • 0.1.1
  • PyPI

RAGBooster

RAGBooster improves the performance of retrieval-based large language models by learning which data sources are important for retrieving high-quality data.

We provide an example notebook that shows how to boost RedPajama-INCITE-Instruct-3B-v1, a small LLM with 3 billion parameters, to be on par with OpenAI's GPT-3.5 (175 billion parameters) on a question answering task, using Bing web search and ragbooster.

We also provide a second example notebook that demonstrates how to boost a tiny QA model to within 5% of GPT-3.5's accuracy on a data imputation task.

Core classes

At the core of RAGBooster are RetrievalAugmentedModels, which fetch external data to improve prediction quality. Retrieval augmentation requires two components:

  • A retriever, which retrieves external data for a prediction sample. We currently only implement a BingRetriever, which queries Microsoft's Bing Websearch API.
  • A generator, which generates the final prediction from the prediction sample and the external data. This is typically a large language model. We provide the Generator interface, which makes it very easy to leverage LLMs available via an API, for example from OpenAI.
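
To make the two components concrete, here is a minimal sketch of how a retriever and a generator might be composed. The names ToyRetrievalAugmentedModel, toy_retriever, and toy_generator are hypothetical and are not part of the ragbooster API; the in-memory corpus stands in for a web search, and the generator stands in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A retrieved document as a (source, text) pair, e.g. ("example.org", "snippet").
Document = Tuple[str, str]


@dataclass
class ToyRetrievalAugmentedModel:
    """Hypothetical composition of a retriever and a generator (not the ragbooster API)."""

    retrieve: Callable[[str], List[Document]]
    generate: Callable[[str, List[Document]], str]

    def predict(self, sample: str) -> str:
        # Step 1: fetch external data for the prediction sample.
        documents = self.retrieve(sample)
        # Step 2: produce the final prediction from the sample and the external data.
        return self.generate(sample, documents)


def toy_retriever(query: str) -> List[Document]:
    # Stand-in for a web search: look the query up in a tiny in-memory corpus.
    corpus = {
        "capital of France": [("example.org", "Paris")],
    }
    return corpus.get(query, [])


def toy_generator(sample: str, documents: List[Document]) -> str:
    # Stand-in for an LLM: answer with the first retrieved snippet, if any.
    return documents[0][1] if documents else "unknown"


model = ToyRetrievalAugmentedModel(retrieve=toy_retriever, generate=toy_generator)
print(model.predict("capital of France"))
```

Keeping the retriever and generator as separate callables means either one can be swapped out without touching the other, which is the property the two-component design relies on.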

Once you have defined your retrieval-augmented model, you can use RAGBooster to boost its performance by learning the data importance of retrieval sources (e.g., web domains). This often increases accuracy by a few percent.
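
The idea of learning which retrieval sources matter can be illustrated with a simple leave-one-out estimate on a validation set. This is a simplified stand-in, not the algorithm from the paper and not the ragbooster API: the function names, the majority-vote predictor, and the toy data below are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple

# One validation sample: retrieved (source, answer) pairs plus the correct answer.
Sample = Tuple[List[Tuple[str, str]], str]


def majority_vote(retrieved: List[Tuple[str, str]], allowed: Set[str]) -> Optional[str]:
    # Predict the most frequent answer among documents from allowed sources.
    votes = Counter(answer for source, answer in retrieved if source in allowed)
    return votes.most_common(1)[0][0] if votes else None


def accuracy(samples: List[Sample], allowed: Set[str]) -> float:
    correct = sum(majority_vote(retrieved, allowed) == truth for retrieved, truth in samples)
    return correct / len(samples)


def leave_one_out_importance(samples: List[Sample], sources: Set[str]) -> Dict[str, float]:
    # Importance of a source = accuracy with all sources minus accuracy without it.
    full = accuracy(samples, sources)
    return {s: full - accuracy(samples, sources - {s}) for s in sources}


# Toy validation data (fabricated purely for illustration).
validation: List[Sample] = [
    ([("good.org", "Paris"), ("good.org", "Paris"), ("spam.com", "Lyon")], "Paris"),
    ([("good.org", "Berlin"), ("spam.com", "Munich"), ("spam.com", "Munich")], "Berlin"),
    ([("ok.net", "Rome"), ("good.org", "Rome")], "Rome"),
]
sources = {"good.org", "ok.net", "spam.com"}

importance = leave_one_out_importance(validation, sources)
# Keep only sources whose removal does not improve validation accuracy.
kept = {s for s, value in importance.items() if value >= 0}

print(importance)
print(kept, accuracy(validation, kept))
```

On this toy data, the source with negative importance is dropped, and validation accuracy rises from 2/3 to 1.0. Leave-one-out needs one extra evaluation pass per source, so it only serves to convey the intuition here.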

Background

See our paper, Improving Retrieval-Augmented Large Language Models with Data-Centric Refinement, for detailed algorithms, proofs, and experimental results.

Installation

RAGBooster is available as a pip package and can be installed as follows:

pip install ragbooster
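
After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. The helper name installed_version is hypothetical; importlib.metadata is part of Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if ragbooster is installed in this environment, else None.
print(installed_version("ragbooster"))
```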

Installation for Development

  • Requires Python 3.9 and Rust to be available
  1. Clone the repository: git clone git@github.com:amsterdata/ragbooster.git
  2. Change to the project directory: cd ragbooster
  3. Create a virtualenv: python3.9 -m venv venv
  4. Activate the virtualenv: source venv/bin/activate
  5. Install the dev dependencies: pip install ".[dev]"
  6. Build the project: maturin develop --release
  • Optional steps:
    • Run the tests with cargo test --release
    • Run the benchmarks with RUSTFLAGS="-C target-cpu=native" cargo bench
    • Run linting for the Python code with flake8 python
    • Start Jupyter with jupyter notebook and run the example notebooks
