
llama-cpp-agent
A framework for building LLM based AI agents with llama.cpp.
The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval augmented generation, and processing text using agentic chains with tools.
The framework uses guided sampling to constrain model output to user-defined structures, so even models that were not fine-tuned for function calling or JSON output can produce them reliably.
The framework is compatible with the llama.cpp server, with llama-cpp-python and its server, and with TGI and vLLM servers.
Install the llama-cpp-agent framework using pip:
pip install llama-cpp-agent
You can find the latest documentation here!
You can find the get started guide here!
Join the Discord Community here
The llama-cpp-agent framework provides a wide range of examples demonstrating its capabilities. Here are some key examples:
- Simple Chat: how to initiate a chat with an LLM using the llama.cpp server backend (a minimal sketch follows this list).
- Parallel Function Calling: parallel function calling with the FunctionCallingAgent class, showing how to define and execute multiple functions concurrently.
- Structured Output: generating structured output objects with the StructuredOutputAgent class, e.g. creating a dataset entry for a book from unstructured text.
- Retrieval Augmented Generation: RAG with ColBERT reranking; requires installing the optional rag dependencies (ragatouille).
- llama-index Tools: using llama-index tools and query engines with the FunctionCallingAgent class.
- Sequential Chain: creating a complete product launch campaign with a sequential chain.
- Mapping Chain: creating a mapping chain that summarizes multiple articles into a single summary.
- Knowledge Graph: creating a knowledge graph with the llama-cpp-agent framework, based on an example from the Instructor library for OpenAI.
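For orientation, here is a minimal sketch of the simple chat case, assuming a llama.cpp server running locally on port 8080; the LlamaCppServerProvider class and parameter names follow recent versions of the library and should be checked against the current documentation:

from llama_cpp_agent import LlamaCppAgent, MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppServerProvider

# Connect to a llama.cpp server assumed to be listening on localhost:8080.
provider = LlamaCppServerProvider("http://127.0.0.1:8080")

# Create an agent with a system prompt and a predefined message formatter.
agent = LlamaCppAgent(
    provider,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

# Send a single user message and print the model's reply.
print(agent.get_chat_response("Hello, how are you?"))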
The llama-cpp-agent framework provides predefined message formatters to format messages for the LLM model. The MessagesFormatterType enum defines the available formatters:
- MessagesFormatterType.MISTRAL: Formats messages using the MISTRAL format.
- MessagesFormatterType.CHATML: Formats messages using the CHATML format.
- MessagesFormatterType.VICUNA: Formats messages using the VICUNA format.
- MessagesFormatterType.LLAMA_2: Formats messages using the LLAMA 2 format.
- MessagesFormatterType.SYNTHIA: Formats messages using the SYNTHIA format.
- MessagesFormatterType.NEURAL_CHAT: Formats messages using the NEURAL CHAT format.
- MessagesFormatterType.SOLAR: Formats messages using the SOLAR format.
- MessagesFormatterType.OPEN_CHAT: Formats messages using the OPEN CHAT format.
- MessagesFormatterType.ALPACA: Formats messages using the ALPACA format.
- MessagesFormatterType.CODE_DS: Formats messages using the CODE DS format.
- MessagesFormatterType.B22: Formats messages using the B22 format.
- MessagesFormatterType.LLAMA_3: Formats messages using the LLAMA 3 format.
- MessagesFormatterType.PHI_3: Formats messages using the PHI 3 format.
You can create your own custom messages formatter by instantiating the MessagesFormatter class with the desired parameters:
from llama_cpp_agent.messages_formatter import MessagesFormatter, PromptMarkers, Roles

# Start and end markers for each role in the conversation.
custom_prompt_markers = {
    Roles.system: PromptMarkers("<|system|>", "<|endsystem|>"),
    Roles.user: PromptMarkers("<|user|>", "<|enduser|>"),
    Roles.assistant: PromptMarkers("<|assistant|>", "<|endassistant|>"),
    Roles.tool: PromptMarkers("<|tool|>", "<|endtool|>"),
}

custom_formatter = MessagesFormatter(
    pre_prompt="",  # text prepended before the first message
    prompt_markers=custom_prompt_markers,
    include_sys_prompt_in_first_user_message=False,
    default_stop_sequences=["<|endsystem|>", "<|enduser|>", "<|endassistant|>", "<|endtool|>"],
)
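You can then hand the custom formatter to an agent. A hedged sketch, assuming the LlamaCppAgent constructor accepts a custom_messages_formatter argument as in recent versions of the library:

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent.providers import LlamaCppServerProvider

provider = LlamaCppServerProvider("http://127.0.0.1:8080")

# custom_messages_formatter is assumed to be the constructor argument
# for supplying a user-defined formatter; reuses custom_formatter from above.
agent = LlamaCppAgent(
    provider,
    system_prompt="You are a helpful assistant.",
    custom_messages_formatter=custom_formatter,
)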
We welcome contributions to the llama-cpp-agent framework! If you'd like to contribute, please follow these guidelines:
- Fork the repository and create a new branch off master for your feature or bug fix.
- Submit a pull request against the master branch.
If you encounter any issues or have suggestions for improvements, please open an issue on the GitHub repository.
The llama-cpp-agent framework is released under the MIT License.
FAQs
Q: How do I install the optional dependencies for RAG?
A: To use the RAGColbertReranker class and the RAG example, you need to install the optional rag dependencies (ragatouille). You can do this by running pip install llama-cpp-agent[rag].
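For orientation, a hedged sketch of how the reranker might be used once the rag extras are installed; the module path, the persistent keyword, and the add_document/retrieve_documents method names are assumptions based on the RAG example and may differ between versions:

from llama_cpp_agent.rag.rag_colbert_reranker import RAGColbertReranker

# Build an ephemeral ColBERT index (persistent=False keeps it in memory).
rag = RAGColbertReranker(persistent=False)
rag.add_document("The Eiffel Tower is located in Paris, France.")
rag.add_document("The Colosseum is located in Rome, Italy.")

# Retrieve the top-k documents most relevant to the query.
results = rag.retrieve_documents("Where is the Eiffel Tower?", k=1)
print(results)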
Q: Can I contribute to the llama-cpp-agent project?
A: Absolutely! We welcome contributions from the community. Please refer to the Contributing section for guidelines on how to contribute.
Q: Is llama-cpp-agent compatible with the latest version of llama-cpp-python?
A: Yes, llama-cpp-agent is designed to work with the latest version of llama-cpp-python. However, if you encounter any compatibility issues, please open an issue on the GitHub repository.