

Promptfoo: LLM evals & red teaming


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Quick Start

# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
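As a rough illustration of what an evaluation consumes, the sketch below writes a minimal config file by hand. The `promptfoo-demo` directory, the prompt text, and the `topic` variable are made up for illustration, and the `echo` provider (assumed here to return the rendered prompt unchanged) is only a stand-in so that no API key is needed; the file generated by `init` may look different.

```shell
#!/bin/sh
# Hypothetical demo directory; any empty directory works.
mkdir -p promptfoo-demo

# Write a minimal config: one prompt, one provider, one test case.
# The quoted 'EOF' keeps the shell from touching the {{topic}} placeholder.
cat > promptfoo-demo/promptfooconfig.yaml <<'EOF'
prompts:
  - "Write one sentence about {{topic}}."
providers:
  - echo   # assumption: stand-in provider that returns the prompt as-is
tests:
  - vars:
      topic: bananas
    assert:
      - type: contains
        value: bananas
EOF

echo "Config written. Next step: cd promptfoo-demo && npx promptfoo eval"
```

Swapping `echo` for a real model provider turns the same file into an actual model test; the assertion block stays the same.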

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team
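For the CI/CD item above, one possible shape for a pipeline step is a small wrapper that runs a command and fails the step when it exits non-zero. The `ci_gate` function is a hypothetical helper, not part of promptfoo, and the sketch assumes the evaluation command signals failed checks through a non-zero exit status, which is worth confirming for your setup.

```shell
#!/bin/sh
# Hypothetical helper: run the given command and report pass/fail for a CI step.
ci_gate() {
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    # Surface the underlying exit code in the CI log, then fail the step.
    echo "eval gate failed (exit $status)" >&2
    return 1
  fi
  echo "eval gate passed"
}

# In a pipeline this might be invoked as (assumption, not run here):
#   ci_gate npx promptfoo eval
```

Keeping the gate separate from the command it wraps makes it easy to try locally with any command before wiring it into a workflow.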

Here's what it looks like in action:

[Image: prompt evaluation matrix - web viewer]

It works on the command line too:

[Image: prompt evaluation matrix - command line]

It can also generate security vulnerability reports:

[Image: gen ai red team]

Why promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.


About

Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
