flux-llm-kai
A functional programming LLM client with tool calling
PyPI · Version 0.1.0

LLM Client - Enhanced Functional Programming Architecture

A powerful, functional programming approach to LLM orchestration with tool calling, provider routing, and automatic failover capabilities.

Architecture Overview

The LLM Client follows a functional programming architecture with clear separation of concerns:

┌─────────────────────────────────────────────────────────────┐
│                    LLM Client Architecture                  │
├─────────────────────────────────────────────────────────────┤
│  Core Layer (orchestrator_v2.py)                           │
│  ├── Data Structures (LLMRequest, LLMResponse, etc.)       │
│  ├── Tool Functions (execute_tool_call, get_available_tools)│
│  ├── Router Service (ProviderRouter, RoutingStrategy)      │
│  └── Core Orchestrator Functions (4 main functions)        │
├─────────────────────────────────────────────────────────────┤
│  Integration Layer (__init__.py)                           │
│  ├── Low-level Functions (requests.py, serialization.py)   │
│  ├── Tool System (tools/)                                  │
│  └── Public API (6 core functions)                         │
├─────────────────────────────────────────────────────────────┤
│  Provider Layer (requests.py, serialization.py)            │
│  ├── HTTP Request Building                                 │
│  ├── Response Parsing (OpenAI, Google formats)             │
│  └── Streaming Support                                     │
└─────────────────────────────────────────────────────────────┘

Key Design Principles

1. Functional Programming

  • Pure Functions: No side effects, predictable behavior
  • Composability: Functions can be easily combined and pipelined (see the sketch after this list)
  • Immutability: Data structures are immutable, no hidden state
  • Stateless: No instance variables to manage
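
A minimal sketch of what this composition looks like in practice, assuming LLMRequest is importable from utils.llm_client as in the usage examples below; the helper functions here are illustrative, not part of the package API:

from dataclasses import replace
from utils.llm_client import LLMRequest  # assumed import path

def with_temperature(request: LLMRequest, temperature: float) -> LLMRequest:
    # Pure transform: returns a new request instead of mutating the input
    return replace(request, temperature=temperature)

def without_tools(request: LLMRequest) -> LLMRequest:
    return replace(request, tools_enabled=False)

request = LLMRequest(messages=[{"role": "user", "content": "Hello!"}])
tuned = without_tools(with_temperature(request, 0.2))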

2. Automatic Tool Detection

  • Tools are automatically detected from ToolMeta.registry
  • No manual tool management required
  • Tools are included only when the tools_enabled flag is set (see the sketch below)
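
A hedged sketch of this detection step based on the description above, assuming ToolMeta.registry is a name-to-tool mapping and that it is importable from utils.llm_client; the real get_available_tools in orchestrator_v2.py may differ:

from utils.llm_client import ToolMeta  # assumed import path

def collect_tools(tools_enabled: bool) -> list:
    # Every tool registered via the @Tool decorator lands in ToolMeta.registry
    if not tools_enabled:
        return []
    return list(ToolMeta.registry.values())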

3. Multiple Routing Strategies

  • PRIORITY: Try providers in priority order (default)
  • RANDOM: Start with random provider, then priority
  • CYCLE: Cycle through providers continuously

4. Proper Tool Call Parsing

  • OpenAI Format: Parses tool_calls with function.name and function.arguments
  • Google Format: Parses function_call with name and args
  • Automatic Detection: The correct parsing format is selected based on the provider (see the sketch below)
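
The difference between the two formats can be illustrated with a short, hypothetical parser; this is a sketch of the behaviour described above, not the package's actual parsing code:

import json

def parse_tool_call(provider: str, payload: dict):
    if provider == "openai":
        # OpenAI: tool_calls[i].function.name / .arguments (JSON-encoded string)
        call = payload["tool_calls"][0]["function"]
        return call["name"], json.loads(call["arguments"])
    # Google Gemini: function_call.name / .args (already a dict)
    call = payload["function_call"]
    return call["name"], call["args"]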

Core Components

Data Structures

from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class LLMRequest:
    messages: List[Dict[str, str]]
    model: Optional[str] = None
    temperature: float = 0.7
    max_tokens: Optional[int] = None
    tools_enabled: bool = True
    metadata: Optional[Dict[str, Any]] = None

@dataclass
class ProviderConfig:
    name: str
    priority: int
    status: ProviderStatus = ProviderStatus.AVAILABLE  # ProviderStatus is an enum defined in the same module
    retry_count: int = 0
    max_retries: int = 3
    backoff_seconds: int = 60

Core Functions (Reduced Set)

1. stream_llm_response()

Stream from a single provider with proper tool call parsing.

async for chunk_type, content in stream_llm_response(request, "openai"):
    if chunk_type == "t":  # text
        print(content)
    elif chunk_type == "f":  # function call
        print(f"Function: {content}")

2. stream_with_router()

Stream with automatic failover and routing strategies.

router = create_router(["openai", "google_gemini"], strategy=RoutingStrategy.PRIORITY)
async for chunk_type, content, provider in stream_with_router(request, router):
    print(f"[{provider}]: {content}")

3. chat_with_tools()

Single provider chat with automatic tool execution.

async for chunk_type, content in chat_with_tools(request, "openai", max_iterations=5):
    print(content)

4. chat_with_tools_and_router()

Multi-provider chat with tools and failover.

async for chunk_type, content, provider in chat_with_tools_and_router(request, router):
    print(f"[{provider}]: {content}")

Convenience Functions

1. quick_chat()

Simple single-provider chat.

response = await quick_chat("Hello!", "openai", tools_enabled=True)

2. quick_chat_with_router()

Simple chat with routing strategies.

response, provider = await quick_chat_with_router(
    "Hello!", 
    ["openai", "google_gemini"],
    strategy=RoutingStrategy.RANDOM
)

Routing Strategies

PRIORITY Strategy (Default)

# Try providers in priority order: openai -> google_gemini -> anthropic
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.PRIORITY
)

RANDOM Strategy

# Start with random provider, then follow priority order
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.RANDOM
)

CYCLE Strategy

# Cycle through providers continuously
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.CYCLE
)

# Use with max_cycles parameter
async for chunk_type, content, provider in stream_with_router(request, router, max_cycles=3):
    print(f"[{provider}]: {content}")

Tool System Integration

Tool Registration

from utils.llm_client import Tool

@Tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        # Note: eval() is shown for brevity; do not use it on untrusted input
        result = eval(expression)
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

Automatic Tool Detection

# Tools are automatically detected and included
request = LLMRequest(
    messages=[{"role": "user", "content": "What's 5 * 7?"}],
    tools_enabled=True  # Automatically includes available tools
)

Tool Call Execution

# Tool calls are automatically executed and results fed back to LLM
async for chunk_type, content in chat_with_tools(request, "openai"):
    print(content)  # Includes both LLM response and tool results

Provider Support

Supported Providers

  • OpenAI (GPT models)
  • Google Gemini (Gemini models)
  • Anthropic (Claude models)
  • OpenRouter (Multiple models)

Provider Configuration

# Custom provider configurations
providers = [
    ProviderConfig(
        name="openai",
        priority=0,
        status=ProviderStatus.AVAILABLE,
        max_retries=2,
        backoff_seconds=30
    ),
    ProviderConfig(
        name="google_gemini",
        priority=1,
        status=ProviderStatus.AVAILABLE,
        max_retries=3,
        backoff_seconds=60
    )
]

router = ProviderRouter(providers, RoutingStrategy.PRIORITY)

Error Handling & Resilience

Automatic Failover

  • If a provider fails, the router automatically tries the next available provider
  • Configurable retry limits and backoff periods
  • Status tracking for each provider (a sketch of this logic follows below)
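
A hedged sketch of the failover bookkeeping described above, built on the documented ProviderConfig fields; ProviderStatus.UNAVAILABLE is assumed here, and the actual ProviderRouter logic may differ:

from typing import List, Optional

def record_failure(provider: ProviderConfig) -> None:
    provider.retry_count += 1
    if provider.retry_count >= provider.max_retries:
        # Taken out of rotation until its backoff_seconds window has elapsed
        provider.status = ProviderStatus.UNAVAILABLE  # assumed enum member

def next_available(providers: List[ProviderConfig]) -> Optional[ProviderConfig]:
    # Pick the highest-priority provider that is still marked available
    candidates = [p for p in providers if p.status == ProviderStatus.AVAILABLE]
    return min(candidates, key=lambda p: p.priority, default=None)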

Tool Call Error Handling

  • Graceful handling of malformed tool calls
  • Error messages are returned to the LLM for context
  • Exception handling for tool execution failures (see the sketch below)
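
A minimal sketch of that error path, assuming registered tools are looked up in ToolMeta.registry and are callable; the actual execute_tool_call implementation may differ:

def run_tool_safely(name: str, arguments: dict) -> str:
    try:
        tool = ToolMeta.registry[name]  # registry of @Tool-decorated functions
        return str(tool(**arguments))
    except KeyError:
        return f"Error: unknown tool '{name}'"
    except Exception as exc:
        # The error text is fed back to the LLM as the tool result for context
        return f"Error: {exc}"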

Retry Logic

# Custom retry configuration
provider = ProviderConfig(
    name="openai",
    max_retries=3,
    backoff_seconds=60  # Wait 60 seconds before retry
)

Usage Examples

Basic Usage

from utils.llm_client import quick_chat, quick_chat_with_router, RoutingStrategy

# Simple chat
response = await quick_chat("Hello!", "openai")

# Chat with failover
response, provider = await quick_chat_with_router(
    "Hello!", 
    ["openai", "google_gemini"]
)

Advanced Usage

from utils.llm_client import (
    LLMRequest, create_router, stream_with_router, 
    RoutingStrategy, Tool
)

@Tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

# Create request
request = LLMRequest(
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools_enabled=True
)

# Create router
router = create_router(
    ["openai", "google_gemini"],
    strategy=RoutingStrategy.RANDOM
)

# Stream with failover
async for chunk_type, content, provider in stream_with_router(request, router):
    if chunk_type == "t":
        print(f"[{provider}]: {content}")

Custom Pipeline

# Create a custom pipeline with your own middleware
async def my_pipeline(request):
    router = create_router(["openai", "google_gemini"])

    async for chunk_type, content, provider in stream_with_router(request, router):
        # process_chunk is a user-defined transformation, not part of the package
        processed_content = process_chunk(content)
        yield chunk_type, processed_content, provider

# Use the pipeline
async for chunk_type, content, provider in my_pipeline(request):
    print(f"[{provider}]: {content}")

Benefits

  • Simplified API: Only 6 core functions instead of many
  • Automatic Detection: Tools are automatically detected and included
  • Flexible Routing: Multiple routing strategies for different use cases
  • Proper Parsing: Correct tool call parsing for different providers
  • Cycle Support: Can cycle through providers for load balancing
  • Error Resilience: Automatic failover and retry logic
  • Functional Composition: Easy to compose and extend
  • Provider Agnostic: Works with multiple LLM providers
  • Tool Integration: Seamless tool calling with automatic execution
  • Performance: Efficient streaming and concurrent tool execution

Migration from OOP Approach

Before (OOP)

client = LLMClient("openai")
client.enable_tools()
async for chunk in client.chat_with_tools(messages):
    print(chunk)

After (Functional)

request = LLMRequest(messages=messages, tools_enabled=True)
async for chunk_type, content in chat_with_tools(request, "openai"):
    if chunk_type == "t":
        print(content)

The functional approach is more flexible and composable, and it provides better separation of concerns while remaining simple to use.
