This example demonstrates how to evaluate LLM function/tool calling capabilities using promptfoo.
You can run this example with:
npx promptfoo@latest init --example tool-use
This example shows how to configure and test function/tool calling capabilities across multiple LLM providers: OpenAI, Anthropic, AWS Bedrock, and Groq.
Each provider has slightly different syntax and requirements for implementing function/tool calling.
This example requires the following environment variables:
- OPENAI_API_KEY - Your OpenAI API key
- ANTHROPIC_API_KEY - Your Anthropic API key
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY - For AWS Bedrock (if using the Bedrock example)
- GROQ_API_KEY - If using Groq's LLaMA models
You can set these in a .env file or directly in your environment.
Each provider implements tool use with different syntax.
The configuration for this example is in:
- promptfooconfig.yaml - Main example with OpenAI, Anthropic, and Groq
- promptfooconfig.bedrock.yaml - Example specifically for AWS Bedrock models
To run the main example:
promptfoo eval
To run the Bedrock example:
promptfoo eval -c promptfooconfig.bedrock.yaml
After running the evaluation, view the results with:
promptfoo view
This example uses a simple weather lookup function that takes a location and optionally a temperature unit. The example illustrates how different providers handle the same function definition with different syntaxes.
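As a rough sketch of that difference, the same weather tool might be declared as follows for two of the providers. The tool name get_current_weather, the descriptions, and the model IDs are illustrative assumptions, not values copied from this example's config files. OpenAI-style definitions wrap the JSON schema in a function object under parameters, while Anthropic-style definitions use a flat object with input_schema.

```yaml
# Sketch only: tool name, descriptions, and model IDs are assumptions.
providers:
  # OpenAI-style: each tool is wrapped in {type: function, function: {...}}
  - id: openai:chat:gpt-4o-mini
    config:
      tools:
        - type: function
          function:
            name: get_current_weather
            description: Get the current weather for a location
            parameters:
              type: object
              properties:
                location:
                  type: string
                  description: 'City name, for example Boston'
                unit:
                  type: string
                  enum: [celsius, fahrenheit]
              required: [location] # unit stays optional

  # Anthropic-style: flat object, with the schema under input_schema
  - id: anthropic:messages:claude-3-5-sonnet-20241022
    config:
      tools:
        - name: get_current_weather
          description: Get the current weather for a location
          input_schema:
            type: object
            properties:
              location:
                type: string
                description: 'City name, for example Boston'
              unit:
                type: string
                enum: [celsius, fahrenheit]
            required: [location]
```

Only the wrapper differs; the JSON schema describing location and unit is the same in both cases.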
External tools can also be loaded from separate files, as demonstrated with external_tools.yaml.
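A minimal sketch of that setup, assuming the provider config points at the file with a file:// reference and that the file holds a list of OpenAI-style tool definitions. The model ID and tool name are hypothetical, and the actual contents of external_tools.yaml in this example may differ.

```yaml
# promptfooconfig.yaml (excerpt); model ID is an assumption
providers:
  - id: openai:chat:gpt-4o-mini
    config:
      # Load the tool list from a separate file instead of inlining it
      tools: file://external_tools.yaml
---
# external_tools.yaml (assumed shape: a list of tool definitions)
- type: function
  function:
    name: get_current_weather # hypothetical tool name
    description: Get the current weather for a location
    parameters:
      type: object
      properties:
        location:
          type: string
        unit:
          type: string
          enum: [celsius, fahrenheit]
      required: [location]
```

Keeping the definitions in their own file lets several providers or configs share one copy of the schema.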
This example also demonstrates the use of finish-reason assertions to validate why a model stopped generating:
- tool_calls: Verifies the model stopped to make a function/tool call (e.g., weather lookup for cities)
The example shows that when models are asked about weather in real cities (Boston, New York, Paris), they correctly stop generation to make tool calls, resulting in a tool_calls finish reason. This helps ensure your models are using tools appropriately when they should be.
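The assertion described above could look roughly like this in a promptfoo config. The prompt wording and the city variable name are assumptions made for illustration; the assertion type finish-reason and the value tool_calls come from the description above.

```yaml
# Sketch only: prompt text and variable name are assumptions.
prompts:
  - 'What is the weather like in {{city}}?'

tests:
  - vars:
      city: Boston
    assert:
      # Passes only if the model stopped in order to call a tool
      - type: finish-reason
        value: tool_calls
  - vars:
      city: New York
    assert:
      - type: finish-reason
        value: tool_calls
  - vars:
      city: Paris
    assert:
      - type: finish-reason
        value: tool_calls
```

A test fails if the model answers in plain text instead of requesting the weather tool, which makes missed tool calls visible in the results view.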