
README.md


Promptfoo: LLM evals & red teaming


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Quick Start

# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
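
For orientation, here is a minimal sketch of the kind of config file an evaluation reads. It is an illustration, not the exact file that `init` generates: the prompt text, the `topic` variable, the `openai:gpt-4o-mini` provider ID, and the `icontains` check are all assumptions chosen for the example, so check field names against the Getting Started guide before relying on them.

```shell
# Write a small example config to the current directory.
# NOTE: this overwrites any existing promptfooconfig.yaml.
# The prompt, variable name, provider ID, and assertion are illustrative
# assumptions, not values taken from this README.
cat > promptfooconfig.yaml <<'EOF'
description: Minimal example eval
prompts:
  - "Write a one-sentence summary of {{topic}}."
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      topic: photosynthesis
    assert:
      - type: icontains
        value: light
EOF

echo "wrote promptfooconfig.yaml"
```

With a file like this in place, `npx promptfoo eval` should pick it up from the current directory. The provider assumed here needs an `OPENAI_API_KEY` in the environment.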

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: prompt evaluation matrix on the command line]

It can also generate security vulnerability reports:

[Screenshot: generative AI red team report]

Why promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.


About

Test your prompts. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality.
