outlines-core


Structured Text Generation in Rust

0.2.11

PyPI

Maintainers: 2

Structured generation (in Rust).

Outlines-core

This package provides the core functionality for structured generation, formerly implemented in Outlines, with a focus on performance and portability. It offers a convenient way to:

build regular expressions from JSON schemas
construct an Index object by combining a Vocabulary and a regular expression, so that tokens from the given vocabulary map efficiently to state transitions in a finite-state automaton
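
To make the first point concrete, here is a deliberately simplified sketch in plain Python of what "a regular expression built from a JSON schema" can look like. It is not the library's implementation: `toy_regex_from_schema` and the patterns are hypothetical, and the sketch assumes a flat object whose properties are all required, appear in schema order, and are either strings without escapes or integers.

```python
import json
import re

# Hypothetical, simplified patterns; real JSON Schema support is far broader.
STRING = r'"[^"\\]*"'
INTEGER = r"-?(0|[1-9][0-9]*)"
TYPE_PATTERNS = {"string": STRING, "integer": INTEGER}


def toy_regex_from_schema(schema_text: str) -> str:
    """Build a regex for a flat object whose properties are all required."""
    schema = json.loads(schema_text)
    parts = []
    for name, spec in schema["properties"].items():
        key = re.escape(json.dumps(name))  # the quoted property name
        parts.append(key + r"\s*:\s*" + TYPE_PATTERNS[spec["type"]])
    return r"\{\s*" + r"\s*,\s*".join(parts) + r"\s*\}"


schema = (
    '{"type": "object", "properties": '
    '{"name": {"type": "string"}, "age": {"type": "integer"}}}'
)
pattern = re.compile(toy_regex_from_schema(schema))

print(bool(pattern.fullmatch('{"name": "Ada", "age": 36}')))    # valid document
print(bool(pattern.fullmatch('{"name": "Ada", "age": "36"}')))  # age has the wrong type
```

Any text that fully matches such a regex is a document of the expected shape, which is what makes the regex usable as a constraint during generation.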

Example

Basic example of how it all fits together.

use outlines_core::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Define a JSON schema
    let schema = r#"{
        "type": "object",
        "properties": {
            "name": { "type": "string" },
            "age": { "type": "integer" }
        },
        "required": ["name", "age"]
    }"#;

    // Generate a regular expression from it
    let regex = json_schema::regex_from_str(&schema, None)?;

    // Create a `Vocabulary` from a pretrained large language model
    // (building one manually is also possible)
    let vocabulary = Vocabulary::from_pretrained("openai-community/gpt2", None)?;

    // Create a new `Index` from the regex and the given `Vocabulary`
    let index = Index::new(&regex, &vocabulary)?;

    // Walk one step through the automaton
    let initial_state = index.initial_state();
    let allowed_tokens = index.allowed_tokens(&initial_state).expect("Some allowed token ids");
    let token_id = allowed_tokens.first().expect("First token id");
    let next_state = index.next_state(&initial_state, token_id);
    let final_states = index.final_states();

    Ok(())
}
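
The idea behind the index can be pictured with a toy model in plain Python. This is only a sketch under stated assumptions, not how the crate is implemented: the automaton for the regex `ab*c` is written out by hand, and `DFA`, `VOCABULARY`, `walk` and `build_index` are hypothetical names. The point it shows is that a token may span several characters, so the index precomputes, for each state, which token ids keep the automaton alive and where they lead.

```python
# Toy DFA for the regex `ab*c`: 0 --a--> 1, 1 --b--> 1, 1 --c--> 2 (final).
DFA = {0: {"a": 1}, 1: {"b": 1, "c": 2}, 2: {}}
INITIAL_STATE, FINAL_STATES = 0, {2}

# Hypothetical vocabulary: token string -> token id.
VOCABULARY = {"a": 0, "b": 1, "c": 2, "ab": 3, "bc": 4, "x": 5}


def walk(state, token):
    """Feed a token character by character; return the end state, or None if rejected."""
    for char in token:
        state = DFA[state].get(char)
        if state is None:
            return None
    return state


def build_index():
    """Precompute, for every state, the allowed token ids and their target states."""
    index = {}
    for state in DFA:
        index[state] = {}
        for token, token_id in VOCABULARY.items():
            target = walk(state, token)
            if target is not None:
                index[state][token_id] = target
    return index


index = build_index()
allowed_tokens = sorted(index[INITIAL_STATE])        # token ids allowed at the start
next_state = index[INITIAL_STATE][VOCABULARY["ab"]]  # state after consuming "ab"
print(allowed_tokens, next_state, next_state in FINAL_STATES)  # prints: [0, 3] 1 False
```

With the table built once up front, looking up allowed tokens or the next state during generation is a dictionary access rather than a regex match per token.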

Python Bindings

Additionally, the project provides interfaces to integrate the crate's functionality with Python.

import json

from outlines_core.json_schema import build_regex_from_schema
from outlines_core.guide import Guide, Index, Vocabulary

schema =  {
  "title": "Foo",
  "type": "object",
  "properties": {"date": {"type": "string", "format": "date"}}
}
regex = build_regex_from_schema(json.dumps(schema))

vocabulary = Vocabulary.from_pretrained("openai-community/gpt2")
index = Index(regex, vocabulary)
guide = Guide(index)

# Get current state of the Guide:
current_state = guide.get_state()

# Get allowed tokens for the current state of the Guide:
allowed_tokens = guide.get_tokens()

# Advance Guide to the next state via some token_id and return allowed tokens for that new state:
next_allowed_tokens = guide.advance(allowed_tokens[-1])

# Check whether the Guide is finished:
if guide.is_finished():
    # If it's finished then this assertion holds:
    assert guide.get_tokens() == [vocabulary.get_eos_token_id()]
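
A list of allowed token ids is typically used to constrain sampling. The following sketch shows one common way to do that, masking model scores before picking a token. It is an illustration in plain Python rather than part of this package: `mask_logits`, `greedy_pick` and the scores are made up, and the hard-coded `allowed_tokens` list stands in for whatever `guide.get_tokens()` returns.

```python
import math


def mask_logits(logits, allowed_token_ids):
    """Return a copy of `logits` with every disallowed token set to -inf."""
    allowed = set(allowed_token_ids)
    return [
        score if token_id in allowed else -math.inf
        for token_id, score in enumerate(logits)
    ]


def greedy_pick(logits):
    """Pick the id of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


# Made-up scores for a five-token vocabulary; token 1 scores highest overall.
logits = [0.1, 2.5, -0.3, 1.7, 0.9]
allowed_tokens = [0, 3, 4]  # stand-in for the list returned by `guide.get_tokens()`

masked = mask_logits(logits, allowed_tokens)
token_id = greedy_pick(masked)
print(token_id)  # prints: 3
```

In a generation loop, the chosen `token_id` would then be passed to `guide.advance(...)` to obtain the allowed tokens for the next step, repeating until `guide.is_finished()` returns true.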

How to contribute?

Setup

Fork the repository on GitHub and clone the fork locally:

git clone git@github.com:YourUserName/outlines-core.git
cd outlines-core

Create a new virtual environment and install the dependencies in editable mode:

python -m venv .venv
source .venv/bin/activate
pip install -e ".[test]"
pre-commit install

Before pushing your code

If you are working with the Python bindings, don't forget to build the Rust extension before testing, for example in debug mode:

make build-extension-debug

Run Python tests:

pytest

Run Rust tests:

cargo test

Alternatively, use the Makefile to run both:

make test

Finally, run the code style checks:

pre-commit run --all-files

Or use the Makefile:

make pcc

If necessary, you can run the benchmarks locally:

make pybench

Join us

💡 Have an idea? Come chat with us on Discord
Found a bug? Open an issue

Keywords

machine learning

deep learning

language models

structured generation
