
Promptfoo: LLM evals & red teaming


promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Quick Start

# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
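As a rough illustration of what the evaluation step reads, the sketch below writes a small `promptfooconfig.yaml` by hand. The prompt text, the provider ID `openai:gpt-4o-mini`, and the test values are assumptions chosen for the example, not taken from this README; the file that `init` scaffolds may look different.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a small example config. All values below are illustrative assumptions.
cat > promptfooconfig.yaml <<'EOF'
description: "Example eval (illustrative)"
prompts:
  - "Write a one-sentence summary of {{topic}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      topic: bananas
    assert:
      - type: icontains
        value: banana
EOF

echo "wrote promptfooconfig.yaml"

# Then evaluate it (needs network access and an API key for the provider):
#   npx promptfoo@latest eval -c promptfooconfig.yaml
```

Each entry under `tests` fills the `{{topic}}` placeholder and checks the model output against its assertions, so one config can cover several prompts and providers.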

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team
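For the CI/CD item in the list above, one possible shape is a small wrapper that fails the pipeline step when the evaluation command fails. This is a sketch: the `gate` function is a hypothetical helper, and it assumes `promptfoo eval` exits with a non-zero status when tests fail.

```shell
#!/usr/bin/env bash

# gate: run the given command and report pass/fail from its exit status.
# Hypothetical helper for illustration; not part of promptfoo.
gate() {
  if "$@"; then
    echo "evals passed"
  else
    local status=$?
    echo "evals failed (exit $status)" >&2
    return "$status"
  fi
}

# In a CI step this might be called as (needs network access and API keys):
#   gate npx promptfoo@latest eval -c promptfooconfig.yaml -o results.json
```

Keeping the wrapper separate from the eval command makes it easy to dry-run the pipeline step with a stand-in command before wiring in real provider credentials.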

Here's what it looks like in action:

[Image: prompt evaluation matrix - web viewer]

It works on the command line too:

[Image: prompt evaluation matrix - command line]

It can also generate security vulnerability reports:

[Image: gen AI red team]

Why promptfoo?

  • 🚀 Developer-first: Fast, with features like live reload and caching
  • 🔒 Private: Runs 100% locally - your prompts never leave your machine
  • 🔧 Flexible: Works with any LLM API or programming language
  • 💪 Battle-tested: Powers LLM apps serving 10M+ users in production
  • 📊 Data-driven: Make decisions based on metrics, not gut feel
  • 🤝 Open source: MIT licensed, with an active community

Star the Project ⭐

If you find promptfoo useful, please star it on GitHub! Stars help the project grow and ensure you stay updated on new releases and features.

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.
