Register
Login
Resources
Docs Blog Datasets Glossary Case Studies Tutorials & Webinars
Product
Data Engine LLMs Platform Enterprise
Pricing Explore
Connect to our Discord channel
Michael abf1612226
docs(providers): remove deprecated claude-3-sonnet-20240229 model references (#5018)
1 month ago
..
abf1612226
docs(providers): remove deprecated claude-3-sonnet-20240229 model references (#5018)
1 month ago
0970efdd03
chore: add vision grading example (#4090)
3 months ago
0970efdd03
chore: add vision grading example (#4090)
3 months ago
abf1612226
docs(providers): remove deprecated claude-3-sonnet-20240229 model references (#5018)
1 month ago

README.md

You have to be logged in to leave a comment. Sign In

claude-vs-gpt-image (Image Analysis Example)

You can run this example with:

npx promptfoo@latest init --example claude-vs-gpt-image

This example compares an image analysis task using:

  • gpt-4.1 via OpenAI
  • claude-sonnet-4 via Amazon Bedrock
  • claude-3.5 via Anthropic

GPT-4.1 and Claude have different prompt formats. We use custom provider functions in Python and JavaScript to dynamically format the prompt based on context about the provider. The responses are scored using llm-rubric with a vision-capable OpenAI model.

To get started, set your environment variables:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

If you do not have access to all of these providers, simply comment out the providers you do not have access to in promptfooconfig.yaml.

Then run:

npx promptfoo@latest eval

Afterwards, you can view the results by running:

npx promptfoo@latest view
Tip!

Press p or to see the previous file or, n or to see the next file

Comments

Loading...