Promptfoo: LLM evals & red teaming

promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Quick Start

# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
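For orientation, here is a minimal sketch of writing a small config by hand instead of running `init`. The helper name `write_example_config`, the prompt text, the provider ID, and the assertion are illustrative assumptions, not taken from this README; check the Getting Started guide for the exact fields your version supports.

```shell
#!/usr/bin/env bash
# Sketch only: writes an illustrative promptfoo config to the given path.
# write_example_config is a hypothetical helper, not part of promptfoo.
write_example_config() {
  local target="${1:-promptfooconfig.yaml}"
  cat > "$target" <<'EOF'
# Illustrative config; adjust prompts, providers, and tests to your app.
prompts:
  - "Translate the following text to French: {{input}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      input: Hello world
    assert:
      - type: icontains
        value: bonjour
EOF
  echo "wrote $target"
}
```

Calling `write_example_config` with no argument writes `promptfooconfig.yaml` in the current directory, after which `npx promptfoo eval` should pick it up, assuming the provider's API key is set in your environment.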

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team
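As a rough sketch of the CI/CD item above, a pipeline step can wrap the eval command and fail the build when the command exits non-zero. The function name `run_gate` is hypothetical, and the sketch assumes only that the eval command signals failure through its exit status.

```shell
#!/usr/bin/env bash
# Sketch only: run_gate is a hypothetical wrapper, not a promptfoo command.
# It runs whatever command it is given and propagates a failing exit status.
run_gate() {
  if "$@"; then
    echo "evals passed"
  else
    local status=$?   # exit status of the command that just failed
    echo "evals failed (exit $status)" >&2
    return "$status"
  fi
}
```

In a CI job, this could be invoked as `run_gate npx promptfoo@latest eval`, so that a failing evaluation stops the pipeline.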

Here's what it looks like in action:

[Image: prompt evaluation matrix - web viewer]

It works on the command line too:

[Image: prompt evaluation matrix - command line]

It can also generate security vulnerability reports:

[Image: gen ai red team]

Why promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.
