A simple Python package that provides a unified interface to several LLM providers of chat fine-tuned models [OpenAI, AzureOpenAI, PaLM, Cohere and local HuggingFace Models].
Note: llmx wraps multiple API providers, and its interface may change as the providers and the general field of LLMs evolve.
There is nothing particularly special about this library, but these are some of the requirements I had when I started building it (which other libraries did not meet):
from llmx import llm
gen = llm(provider="openai") # also supports azureopenai models
gen = llm(provider="palm") # or google
gen = llm(provider="cohere")
gen = llm(provider="hf", model="HuggingFaceH4/zephyr-7b-beta", device_map="auto") # run huggingface model locally
Each message is an object with a role (system, user, or assistant) and content (see below). A single request is a list with only one message (e.g., write code to plot a cosine wave signal). A conversation is a list of messages, e.g., write code for x, update the axis to y, etc. The format is the same for all models.
messages = [
{"role": "system", "content": "You are a helpful assistant that can explain concepts clearly to a 6 year old child."},
{"role": "user", "content": "What is gravity?"}
]
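The role/content message format described above can be checked before a request is sent. The following is a minimal sketch in plain Python; `validate_messages` and `ALLOWED_ROLES` are hypothetical names introduced here for illustration and are not part of llmx.

```python
from typing import Dict, List

# Roles named in the message format description above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper (not part of llmx): check the role/content shape."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for i, msg in enumerate(messages):
        # Each message needs at least a role and a content field.
        if not {"role", "content"} <= set(msg):
            raise ValueError(f"message {i} needs 'role' and 'content' keys")
        if msg["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has unknown role {msg['role']!r}")
        if not isinstance(msg["content"], str):
            raise ValueError(f"message {i} content must be a string")
    return messages


# A single request is a list with one message.
single = validate_messages(
    [{"role": "user", "content": "write code to plot a cosine wave signal"}]
)

# A conversation extends the same list with further turns.
conversation = single + [
    {"role": "assistant", "content": "(code from the previous turn)"},
    {"role": "user", "content": "update the axis to y"},
]
validate_messages(conversation)
```

Because the format is the same for all models, a check like this can run once, regardless of which provider is selected.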
To force a fresh response instead of a cached one, set use_cache=False in the generate call.
response = gen.generate(messages=messages, config=TextGenerationConfig(n=1, use_cache=True))
Output looks like
TextGenerationResponse(
text=[Message(role='assistant', content="Gravity is like a magical force that pulls things towards each other. It's what keeps us on the ground and stops us from floating away into space. ... ")],
config=TextGenerationConfig(n=1, temperature=0.1, max_tokens=8147, top_p=1.0, top_k=50, frequency_penalty=0.0, presence_penalty=0.0, provider='openai', model='gpt-4', stop=None),
logprobs=[], usage={'prompt_tokens': 34, 'completion_tokens': 69, 'total_tokens': 103})
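For readers curious how a use_cache flag like the one above might behave, here is a rough sketch of a response cache keyed on the messages and config. It is an assumption about the general technique, not llmx's actual implementation; `cache_key`, `CachedGenerator`, and `fake_backend` are hypothetical names, and the backend is a stand-in that makes no API calls.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

Messages = List[Dict[str, str]]


def cache_key(messages: Messages, config: Dict[str, Any]) -> str:
    """Derive a stable key from the request (hypothetical, not llmx's scheme)."""
    payload = json.dumps({"messages": messages, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedGenerator:
    """Wraps a generate function and reuses responses for identical requests."""

    def __init__(self, generate_fn: Callable[[Messages, Dict[str, Any]], str]):
        self._generate_fn = generate_fn
        self._cache: Dict[str, str] = {}
        self.calls = 0  # number of times the backend was actually invoked

    def generate(self, messages: Messages, config: Dict[str, Any],
                 use_cache: bool = True) -> str:
        key = cache_key(messages, config)
        if use_cache and key in self._cache:
            return self._cache[key]
        self.calls += 1
        result = self._generate_fn(messages, config)
        self._cache[key] = result  # refresh the stored response
        return result


def fake_backend(messages: Messages, config: Dict[str, Any]) -> str:
    """Stand-in for a provider call; echoes the last message."""
    return f"echo: {messages[-1]['content']}"


gen = CachedGenerator(fake_backend)
msgs = [{"role": "user", "content": "What is gravity?"}]
cfg = {"model": "gpt-4", "n": 1}

first = gen.generate(msgs, cfg)                    # backend call
second = gen.generate(msgs, cfg)                   # served from the cache
third = gen.generate(msgs, cfg, use_cache=False)   # forces a new backend call
```

Keying on both messages and config means that changing any setting, such as n or the model, produces a different key and therefore a fresh request.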
Are there other libraries that do things like this really well? Yes! I'd recommend looking at guidance, which does a lot more. Interested in optimized inference? Try something like vllm.
Install from PyPI. Please use Python 3.10 or higher.
pip install llmx
Install in development mode
git clone
cd llmx
pip install -e .
Note that you may want to use the latest version of pip to install this package.
python3 -m pip install --upgrade pip
First set the API keys for each service you plan to use.
# for openai and cohere
export OPENAI_API_KEY=<your key>
export COHERE_API_KEY=<your key>
# for PaLM via MakerSuite
export PALM_API_KEY=<your key>
# for PaLM (Vertex AI), set up a GCP project and get a service account key file
export PALM_SERVICE_ACCOUNT_KEY_FILE=<path to your service account key file>
export PALM_PROJECT_ID=<your gcp project id>
export PALM_PROJECT_LOCATION=<your project location>
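Before creating a generator, it can help to confirm that the variables above are actually set. The sketch below uses only the standard library. The variable names come from the exports above; the grouping by provider and the `missing_env` helper are assumptions made for illustration, not something llmx provides.

```python
import os
from typing import Dict, List, Mapping

# Environment variables listed above, grouped by provider (grouping is an assumption).
REQUIRED_ENV: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "cohere": ["COHERE_API_KEY"],
    "palm (MakerSuite)": ["PALM_API_KEY"],
    "palm (Vertex AI)": [
        "PALM_SERVICE_ACCOUNT_KEY_FILE",
        "PALM_PROJECT_ID",
        "PALM_PROJECT_LOCATION",
    ],
}


def missing_env(env: Mapping[str, str] = os.environ) -> Dict[str, List[str]]:
    """Return, per provider, the variables that are unset or empty."""
    missing: Dict[str, List[str]] = {}
    for provider, names in REQUIRED_ENV.items():
        unset = [name for name in names if not env.get(name)]
        if unset:
            missing[provider] = unset
    return missing


# Report anything that still needs to be exported.
for provider, names in missing_env().items():
    print(f"{provider}: missing {', '.join(names)}")
```

Only the providers you intend to use need their variables set, so a missing entry for an unused provider can be ignored.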
You can also set the default provider and the list of supported providers via a config file. Use the YAML format in the sample config.default.yml file and set LLMX_CONFIG_PATH to the path of the config file.
from llmx import llm
from llmx.datamodel import TextGenerationConfig
messages = [
{"role": "system", "content": "You are a helpful assistant that can explain concepts clearly to a 6 year old child."},
{"role": "user", "content": "What is gravity?"}
]
openai_gen = llm(provider="openai")
openai_config = TextGenerationConfig(model="gpt-4", max_tokens=50)
openai_response = openai_gen.generate(messages, config=openai_config, use_cache=True)
print(openai_response.text[0].content)
See the tutorial for more examples.
While llmx can use the Hugging Face transformers library to run inference with local models, you might get more mileage from a well-optimized server endpoint such as vllm or FastChat. These tools expose an OpenAI-compatible endpoint and also implement optimizations such as dynamic batching and quantization to improve throughput. The general steps are: start the server so that it serves an endpoint (the example below uses port 8000), then use openai as the provider and point api_base at that endpoint.
from llmx import llm
hfgen_gen = llm(
provider="openai",
api_base="http://localhost:8000",
api_key="EMPTY",
)
...
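If the local server is not running, requests to it will fail, so a quick reachability check can save some debugging. This is a minimal sketch using only the standard library; `endpoint_reachable` is a hypothetical helper, not part of llmx, and it only tests that something accepts TCP connections at the address, not that the server speaks the OpenAI API.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(api_base: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the api_base host and port succeeds."""
    parsed = urlparse(api_base)
    host = parsed.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Same address as the api_base used in the example above.
if not endpoint_reachable("http://localhost:8000"):
    print("No server is listening on http://localhost:8000 yet.")
```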
If you use this library in your work, please cite:
@software{victordibiallmx,
author = {Victor Dibia},
license = {MIT},
month = {10},
title = {LLMX - An API for Chat Fine-Tuned Language Models},
url = {https://github.com/victordibia/llmx},
year = {2023}
}