The simplest way to use control flow (like if statements and for loops) to build production-grade prompts for LLMs.
Defining and constructing production-grade LLM prompts via rich structured templates.
import datetime
import logging

from hermes_cai import build_structured_prefix
from hermes_cai.structured_prefix import StructuredPrefix, ChatContextMessage

# Retrieve data. Assume the following have already been loaded:
# chat: dict
# character: dict
# user: dict
# persona_character: dict
# turns: list[dict]
# candidates: dict[str, dict]

# Parse persona data.
persona_definition = None
username = user["name"]
if persona_character:
    persona_definition = (
        persona_character.get("sanitized_definition")
        or persona_character.get("definition")
    )
    if persona_character.get("name") != "My Persona":
        username = persona_character.get("name")

chat_context_messages = []
for turn in turns:
    candidate = candidates.get(turn["primary_candidate_id"])
    chat_context_messages.append(
        ChatContextMessage(
            author=turn["author_name"],
            text=candidate["raw_content"],
            is_pinned=turn["is_pinned"],
            type=0,  # unused
        )
    )

# Prepare the raw data.
raw_prompt_data = {
    "character": character,
    "chat_type": chat["chat_type"],
    "user_id": user["id"],
    "character_id": character["id"],
    "chat_id": chat["id"],
    "persona_definition": persona_definition,
    "username": username,
}

# Prepare the structured prefix for the hermes-cai package.
structured_prefix = StructuredPrefix(
    # METADATA
    reply_prompt=f"{character.get('name')}:",
    timestamp=datetime.datetime.utcnow(),
    space_added=True,
    use_hermes_generation=True,
    hermes_generation_template_name="production_raw.yml.j2",
    # IMPORTANT: TOKEN_LIMIT must match the model server's limit,
    # otherwise Hermes falls back to the legacy prompt.
    token_limit=TOKEN_LIMIT,
    # DATA
    raw_prompt_data_dict=raw_prompt_data,
    chat_context_messages=chat_context_messages,
)

hermes_structured_prefix = build_structured_prefix(
    contextual_logger=logging.getLogger(__name__),  # Provides contextual logging.
    structured_prefix=structured_prefix,
    close_last_message=True,
)
Fundamentally, Hermes is split into two layers -- the Templating Layer and the Logical Layer. We aim to keep a clear separation between these layers such that the Templating Layer exclusively handles the representation of the prompt and the Logical Layer handles the mechanics of constructing the prompt.
Prompt templates are expressive, human-readable files that define the prompt structure, data placement, and formatting of the final prompt. The templating engine aims to strike a balance between readability and explicitness, with no magic under the hood. We have therefore chosen a combination of YAML and Jinja syntax to represent prompt templates, a combination familiar from DevOps tools like Ansible.
Fundamentally, prompt templates are YAML files. Once the Jinja syntax is fully rendered, they contain a repeating sequence of prompt parts. Each part contains the following fields:
- name: a unique name given to the prompt part, used for readability and sometimes functionally, such as for truncation.
- content: the string payload representing the content to be tokenized for this prompt part.
- truncation_priority: the priority given to the prompt part during truncation.
We construct the final prompt by concatenating the content of these parts, making a best-effort attempt to follow the truncation policy implied by the truncation_priority field. In the future we may support additional fields. We use Jinja syntax to express arbitrary complexity in the templates, such as control flow and function calls.
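As an illustration, a template in this style might look like the following sketch. The part names, variables, and priority values here are hypothetical, not taken from hermes-cai's actual production templates:

```yaml
# Hypothetical YAML + Jinja prompt template. Once Jinja renders the
# variables and control flow, what remains is plain YAML: a list of parts.
- name: system_preamble
  content: "You are {{ character.name }}. {{ character.definition }}"
  truncation_priority: 0
{% for message in chat_context_messages %}
- name: "message_{{ loop.index }}"
  content: "{{ message.author }}: {{ message.text }}"
  truncation_priority: {{ 1 if message.is_pinned else 10 }}
{% endfor %}
- name: reply_prompt
  content: "{{ reply_prompt }}"
  truncation_priority: 0
```

Note how the for loop lets the template itself decide how many parts the final prompt contains, while pinned messages receive a different truncation_priority than ordinary turns.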
The Template Registry is a central repository where all prompt templates are stored. It serves as a single source of truth for prompt definitions, ensuring consistency and ease of access.
The Logical Layer contains the necessary logic for rendering templates and performing tokenization and truncation. It handles the dynamic aspects of prompt construction, ensuring that the final prompts adhere to specified length constraints and are correctly formatted.
The rendering process takes a template and fills in the necessary data to produce a final prompt.
Tokenization is the process of breaking down the final prompt into tokens that the LLM can understand.
Truncation ensures the prompt fits within the model's maximum token_limit, using the truncation_priority field to determine which parts of the prompt can be truncated if necessary. Validation ensures the final prompt adheres to all specified constraints and formatting rules.
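The concatenate-and-truncate mechanics can be sketched in plain Python. This is a simplified stand-in, not hermes-cai's actual implementation: it counts whitespace-separated words instead of real model tokens, and it assumes a higher truncation_priority means a part is dropped sooner, with priority 0 meaning "never drop".

```python
# Simplified sketch of priority-based truncation. Assumptions (not from
# hermes-cai): whitespace tokenization, and higher truncation_priority
# values are dropped first; parts with priority 0 are never dropped.

def count_tokens(text: str) -> int:
    """Stand-in tokenizer: a real system uses the model's tokenizer."""
    return len(text.split())

def build_prompt(parts: list[dict], token_limit: int) -> str:
    kept = list(parts)
    # Drop the most expendable parts until the prompt fits the budget.
    while sum(count_tokens(p["content"]) for p in kept) > token_limit:
        droppable = [p for p in kept if p["truncation_priority"] > 0]
        if not droppable:
            break  # nothing left that we are allowed to drop
        kept.remove(max(droppable, key=lambda p: p["truncation_priority"]))
    return "".join(p["content"] for p in kept)

parts = [
    {"name": "preamble", "content": "You are a helpful bot. ", "truncation_priority": 0},
    {"name": "old_turn", "content": "user: hi there friend ", "truncation_priority": 10},
    {"name": "new_turn", "content": "user: what is Hermes? ", "truncation_priority": 5},
    {"name": "reply_prompt", "content": "bot:", "truncation_priority": 0},
]
print(build_prompt(parts, token_limit=12))
# → You are a helpful bot. user: what is Hermes? bot:
```

The oldest turn carries the highest truncation_priority, so it is the first to go when the four parts exceed the 12-token budget, while the preamble and reply prompt survive unconditionally.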
By adhering to these principles and design choices, Hermes aims to provide a robust, flexible, and easy-to-use system for constructing high-quality LLM prompts.