langfuse-haystack
langfuse-haystack integrates tracing capabilities into Haystack (2.x) pipelines using Langfuse. This package enhances the visibility of pipeline runs by capturing comprehensive details of the execution traces, including API calls, context data, prompts, and more. Whether you're monitoring model performance, pinpointing areas for improvement, or creating datasets for fine-tuning and testing from your pipeline executions, langfuse-haystack is the right tool for you.
Features
- Easy integration with Haystack pipelines
- Capture the full context of the execution
- Track model usage and cost
- Collect user feedback
- Identify low-quality outputs
- Build fine-tuning and testing datasets
Installation
To install langfuse-haystack, run the following command:
pip install langfuse-haystack
Usage
To enable tracing in your Haystack pipeline, add the LangfuseConnector
to your pipeline.
You also need to set the LANGFUSE_SECRET_KEY
and LANGFUSE_PUBLIC_KEY
environment variables in order to connect to Langfuse account.
You can get these keys by signing up for an account on the Langfuse website.
⚠️ Important: To ensure proper tracing, always set environment variables before importing any Haystack components. This is crucial because Haystack initializes its internal tracing components during import.
Here's the correct way to set up your script:
import os
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["HAYSTACK_CONTENT_TRACING_ENABLED"] = "true"
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
from haystack_integrations.components.connectors.langfuse import LangfuseConnector
Alternatively, an even better practice is to set these environment variables in your shell before running the script.
Here's a full example:
import os
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["HAYSTACK_CONTENT_TRACING_ENABLED"] = "true"
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline
from haystack_integrations.components.connectors.langfuse import LangfuseConnector
if __name__ == "__main__":
pipe = Pipeline()
pipe.add_component("tracer", LangfuseConnector("Chat example"))
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", OpenAIChatGenerator(model="gpt-3.5-turbo"))
pipe.connect("prompt_builder.prompt", "llm.messages")
messages = [
ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
ChatMessage.from_user("Tell me about {{location}}"),
]
response = pipe.run(
data={"prompt_builder": {"template_variables": {"location": "Berlin"}, "template": messages}}
)
print(response["llm"]["replies"][0])
print(response["tracer"]["trace_url"])
In this example, we add the LangfuseConnector
to the pipeline with the name "tracer". Each run of the pipeline produces one trace viewable on the Langfuse website with a specific URL. The trace captures the entire execution context, including the prompts, completions, and metadata.
Trace Visualization
Langfuse provides a user-friendly interface to visualize and analyze the traces generated by your Haystack pipeline. Login into your Langfuse account and navigate to the trace URL to view the trace details.
Contributing
hatch
is the best way to interact with this project. To install it, run:
pip install hatch
With hatch
installed, run all the tests:
hatch run test
Run the linters ruff
and mypy
:
hatch run lint:all
License
langfuse-haystack
is distributed under the terms of the Apache-2.0 license.