---
sidebar_position: 55
---
Promptfoo supports OpenTelemetry (OTLP) tracing to help you understand the internal operations of your LLM providers during evaluations.
This feature allows you to collect detailed performance metrics and debug complex provider implementations.
Promptfoo acts as an OpenTelemetry receiver, collecting traces from your providers and displaying them in the web UI. This eliminates the need for external observability infrastructure during development and testing.
Tracing provides visibility into the individual steps inside a provider call, such as document retrieval, context preparation, and LLM generation in a RAG pipeline.
Add tracing configuration to your `promptfooconfig.yaml`:
```yaml
tracing:
  enabled: true # Required to send OTLP telemetry
  otlp:
    http:
      enabled: true # Required to start the built-in OTLP receiver
```
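To see what the receiver accepts, here is a hypothetical smoke test that posts one span as OTLP/HTTP JSON to the default endpoint. It assumes the receiver is running (with the config above, it starts when an evaluation runs), and note that promptfoo only links traces to test results when providers use the trace context promptfoo supplies:

```javascript
// Hypothetical smoke test: POST a minimal OTLP/HTTP JSON payload to the
// built-in receiver on the default port 4318.
const payload = {
  resourceSpans: [
    {
      resource: {
        attributes: [{ key: 'service.name', value: { stringValue: 'smoke-test' } }],
      },
      scopeSpans: [
        {
          spans: [
            {
              traceId: '5b8aa5a2d2c872e8321cf37308d69df2', // 32 hex chars
              spanId: '051581bf3cb55c13', // 16 hex chars
              name: 'smoke-test-span',
              kind: 1, // SPAN_KIND_INTERNAL
              startTimeUnixNano: `${Date.now() - 1000}000000`,
              endTimeUnixNano: `${Date.now()}000000`,
            },
          ],
        },
      ],
    },
  ],
};

fetch('http://localhost:4318/v1/traces', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
}).then((res) => console.log('receiver responded with HTTP', res.status));
```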
Promptfoo passes a W3C trace context to providers via the `traceparent` field.
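The `traceparent` value follows the W3C Trace Context format, `version-traceId-spanId-flags`. A purely illustrative sketch of pulling the IDs apart by hand (in practice the SDK's propagation API below handles this for you):

```javascript
// Example value: '00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01'
function parseTraceparent(traceparent) {
  const [version, traceId, spanId, flags] = traceparent.split('-');
  return { version, traceId, spanId, flags };
}
```

Use this context to create child spans: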
```javascript
const { trace, context, propagation, SpanStatusCode } = require('@opentelemetry/api');
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');

// Initialize tracer
const provider = new NodeTracerProvider();
const exporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
});
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

const tracer = trace.getTracer('my-provider');

module.exports = {
  async callApi(prompt, promptfooContext) {
    // Parse trace context from Promptfoo
    if (promptfooContext.traceparent) {
      const activeContext = propagation.extract(context.active(), {
        traceparent: promptfooContext.traceparent,
      });
      return context.with(activeContext, async () => {
        // startSpan picks up the extracted context as the parent
        const span = tracer.startSpan('provider.call');
        try {
          // Your provider logic here
          span.setAttribute('prompt.length', prompt.length);
          const result = await yourLLMCall(prompt);
          span.setStatus({ code: SpanStatusCode.OK });
          return { output: result };
        } catch (error) {
          span.recordException(error);
          span.setStatus({
            code: SpanStatusCode.ERROR,
            message: error.message,
          });
          throw error;
        } finally {
          span.end();
        }
      });
    }
    // Fallback for when tracing is disabled
    return { output: await yourLLMCall(prompt) };
  },
};
```
After running an evaluation, view traces in the web UI:

1. Run your evaluation:

   ```bash
   promptfoo eval
   ```

2. Open the web UI:

   ```bash
   promptfoo view
   ```

3. Click the magnifying glass (🔎) icon on any test result
4. Scroll to the "Trace Timeline" section
The complete set of tracing options:

```yaml
tracing:
  enabled: true # Enable/disable tracing
  otlp:
    http:
      enabled: true # Required to start the OTLP receiver
      # port: 4318 # Optional - defaults to 4318 (standard OTLP HTTP port)
      # host: '0.0.0.0' # Optional - defaults to '0.0.0.0'
```
You can also configure tracing via environment variables:
```bash
# Enable tracing
export PROMPTFOO_TRACING_ENABLED=true

# Configure OTLP endpoint (for providers)
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"

# Set service name
export OTEL_SERVICE_NAME="my-rag-application"

# Authentication headers (if needed)
export OTEL_EXPORTER_OTLP_HEADERS="api-key=your-key"
```
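If you set the endpoint this way, the exporter can be constructed without a hardcoded URL; the OpenTelemetry SDKs read `OTEL_EXPORTER_OTLP_ENDPOINT` from the environment automatically:

```javascript
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');

// With OTEL_EXPORTER_OTLP_ENDPOINT set, the exporter resolves its URL from
// the environment (appending /v1/traces), so no explicit url option is needed
const exporter = new OTLPTraceExporter();
```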
Forward traces to external observability platforms:
```yaml
tracing:
  enabled: true
  otlp:
    http:
      enabled: true
  forwarding:
    enabled: true
    endpoint: 'http://jaeger:4318' # or Tempo, Honeycomb, etc.
    headers:
      'api-key': '${OBSERVABILITY_API_KEY}'
```
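If you need a local backend to test forwarding against, one common option is Jaeger's all-in-one image (an illustrative command, not promptfoo-specific; check Jaeger's docs for current flags):

```bash
# Run Jaeger locally with OTLP/HTTP ingestion on 4318 and the UI on 16686
docker run --rm \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 -p 4318:4318 \
  jaegertracing/all-in-one:latest
```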
For complete provider implementation details, see the JavaScript Provider documentation. For tracing-specific examples, see the OpenTelemetry tracing example.
Key points:

- Uses `SimpleSpanProcessor` for immediate trace export
- Extracts the `traceparent` passed by Promptfoo so provider spans join the evaluation's trace
For complete provider implementation details, see the Python Provider documentation.
```python
from opentelemetry import trace
from opentelemetry.propagate import extract
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Setup
provider = TracerProvider()
exporter = OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)


def call_api(prompt, options, context):
    # Extract trace context
    if 'traceparent' in context:
        ctx = extract({"traceparent": context["traceparent"]})
        with tracer.start_as_current_span("provider.call", context=ctx) as span:
            span.set_attribute("prompt.length", len(prompt))
            # Your provider logic here
            result = your_llm_call(prompt)
            return {"output": result}
    # Fallback without tracing
    return {"output": your_llm_call(prompt)}
```
Promptfoo includes a built-in trace viewer that displays all collected telemetry data. Since Promptfoo functions as an OTLP receiver, you can view traces directly without configuring external tools like Jaeger or Grafana Tempo.
The web UI displays traces as a hierarchical timeline showing:
```
[Root Span: provider.call (500ms)]
├─ [Retrieve Documents (100ms)]
├─ [Prepare Context (50ms)]
└─ [LLM Generation (300ms)]
```
Each bar's width represents its duration relative to the total trace time. Hover over any span to see its details, such as the full span name, exact duration, and recorded attributes.
Use descriptive, hierarchical span names:
```javascript
// Good
'rag.retrieve_documents';
'rag.rank_results';
'llm.generate_response';

// Less informative
'step1';
'process';
'call_api';
```
Include context that helps debugging:
```javascript
span.setAttributes({
  'prompt.tokens': tokenCount,
  'documents.count': documents.length,
  'model.name': 'gpt-4',
  'cache.hit': false,
});
```
Always record exceptions and set error status:
```javascript
try {
  // Operation
} catch (error) {
  span.recordException(error);
  span.setStatus({
    code: SpanStatusCode.ERROR,
    message: error.message,
  });
  throw error;
}
```
Add metadata that appears in the UI:
```javascript
span.setAttributes({
  'user.id': userId,
  'feature.flags': JSON.stringify(featureFlags),
  version: packageVersion,
});
```
Reduce overhead in high-volume scenarios:
```javascript
const { TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');

const provider = new NodeTracerProvider({
  sampler: new TraceIdRatioBasedSampler(0.1), // Sample 10% of traces
});
```
Trace across multiple services:
```javascript
const { context, propagation } = require('@opentelemetry/api');

// Service A: Forward trace context
const headers = {};
propagation.inject(context.active(), headers);
await fetch(serviceB, { headers });

// Service B: Extract and continue trace
const extractedContext = propagation.extract(context.active(), request.headers);
```
If traces aren't appearing:

- Verify `tracing.enabled: true` in your config
- Check that providers export to `http://localhost:4318/v1/traces`
- Log the `traceparent` value to ensure it's being passed

If you see `context.active is not a function`, rename the OpenTelemetry import:
```javascript
// Avoid conflict with promptfoo's context parameter
const { context: otelContext } = require('@opentelemetry/api');

module.exports = {
  async callApi(prompt, promptfooContext) {
    // Use otelContext for OpenTelemetry operations
    // Use promptfooContext for Promptfoo's context (vars, traceparent, etc.)
  },
};
```
For production use, prefer `BatchSpanProcessor` over `SimpleSpanProcessor` so spans are exported in batches rather than one at a time (see the sketch after the debug-log commands below).

Enable debug logs to troubleshoot:
```bash
# Promptfoo debug logs
DEBUG=promptfoo:* promptfoo eval

# OpenTelemetry debug logs
OTEL_LOG_LEVEL=debug promptfoo eval
```
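As noted above, a minimal sketch of swapping `SimpleSpanProcessor` for `BatchSpanProcessor` in the tracer setup (the tuning values here are illustrative):

```javascript
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');

// Queues spans and exports them periodically, reducing per-span overhead
provider.addSpanProcessor(
  new BatchSpanProcessor(exporter, {
    maxQueueSize: 2048, // spans beyond this backlog are dropped
    scheduledDelayMillis: 5000, // export a batch every 5 seconds
  }),
);
```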
Example: tracing each phase of a RAG pipeline as child spans of a root span (assumes `tracer` from the setup above):

```javascript
const { trace, context, SpanStatusCode } = require('@opentelemetry/api');

async function ragPipeline(query) {
  const span = tracer.startSpan('rag.pipeline');
  // Make the pipeline span the parent of the phase spans below
  const parentContext = trace.setSpan(context.active(), span);
  try {
    // Retrieval phase
    const retrieveSpan = tracer.startSpan('rag.retrieve', undefined, parentContext);
    const documents = await vectorSearch(query);
    retrieveSpan.setAttribute('documents.count', documents.length);
    retrieveSpan.end();

    // Reranking phase
    const rerankSpan = tracer.startSpan('rag.rerank', undefined, parentContext);
    const ranked = await rerank(query, documents);
    rerankSpan.setAttribute('documents.reranked', ranked.length);
    rerankSpan.end();

    // Generation phase
    const generateSpan = tracer.startSpan('llm.generate', undefined, parentContext);
    const response = await llm.generate(query, ranked);
    generateSpan.setAttribute('response.tokens', response.tokenCount);
    generateSpan.end();

    span.setStatus({ code: SpanStatusCode.OK });
    return response;
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw error;
  } finally {
    span.end();
  }
}
```
Example: comparing multiple models in parallel, with a child span per model:

```javascript
const { trace, context } = require('@opentelemetry/api');

async function compareModels(prompt) {
  const span = tracer.startSpan('compare.models');
  const parentContext = trace.setSpan(context.active(), span);
  const models = ['gpt-4', 'claude-3', 'llama-3'];
  try {
    const promises = models.map(async (model) => {
      const modelSpan = tracer.startSpan(`model.${model}`, undefined, parentContext);
      try {
        const result = await callModel(model, prompt);
        modelSpan.setAttribute('model.name', model);
        modelSpan.setAttribute('response.latency', result.latency);
        return result;
      } finally {
        modelSpan.end();
      }
    });
    return await Promise.all(promises);
  } finally {
    span.end();
  }
}
```