Promptfoo: LLM evals & red teaming


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Quick Start

```sh
# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval
```

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
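After `init`, evaluation is driven by a `promptfooconfig.yaml` file. A minimal sketch of what one can look like (the prompt text, provider ID, and test values here are illustrative assumptions, not the literal output of `init`):

```yaml
# promptfooconfig.yaml -- minimal illustrative example
prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  - openai:gpt-4o-mini   # assumes OPENAI_API_KEY is set in your environment

tests:
  - vars:
      text: "Promptfoo runs evaluations locally and compares LLM outputs."
    assert:
      - type: contains
        value: "Promptfoo"
```

`npx promptfoo eval` then runs every test case against every provider and grades each output with the listed assertions; `npx promptfoo view` opens the results in the web viewer.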

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team
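The CI/CD check mentioned above needs no special tooling, since `promptfoo eval` is a plain CLI command that fails when assertions fail. A hedged GitHub Actions sketch (the workflow name, config path, and Node version are assumptions for illustration):

```yaml
# .github/workflows/promptfoo.yml (illustrative)
name: LLM evals
on: [pull_request]

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # A non-zero exit from the eval fails the CI job,
      # gating the pull request on the prompt tests
      - run: npx promptfoo@latest eval -c promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```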

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: prompt evaluation matrix on the command line]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team report]

Why promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community


Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.
