You have to be logged in to leave a comment.

sidebar_position	sidebar_label
2	Quickstart

import CodeBlock from '@theme/CodeBlock'; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';

Quickstart

Promptfoo is an open-source tool for red teaming gen AI applications.

Automatically scans 30+ vulnerability types:
- <a href="/docs/red-team/llm-vulnerability-types/#privacy-and-security" style={{background: '#f0f0f0', borderRadius: '4px', padding: '2px 4px', fontSize: '14px'}}>Security & data privacy: jailbreaks, injections, RAG poisoning, etc.
- <a href="/docs/red-team/llm-vulnerability-types/" style={{background: '#f0f0f0', borderRadius: '4px', padding: '2px 4px', fontSize: '14px'}}>Compliance & ethics: harmful & biased content, content filter validation, OWASP/NIST/EU compliance, etc.
- <a href="/docs/red-team/configuration/#custom-policies" style={{background: '#f0f0f0', borderRadius: '4px', padding: '2px 4px', fontSize: '14px'}}>Custom policies: enforce organizational guidelines.
Generates dynamic attack probes tailored to your application using specialized uncensored models.
Implements state-of-the-art adversarial ML research from Microsoft, Meta, and others.
Integrates with CI/CD.
Tests via HTTP API, browser, or direct model access.

![llm red team report](/img/riskreport-1@2x.png)

Prerequisites

Install Node 18 or later
Optional: Set your OPENAI_API_KEY environment variable

Initialize the project

npx promptfoo@latest redteam setup Install: npm install -g promptfoo

Run:
<CodeBlock language="bash">
  promptfoo redteam setup
</CodeBlock>

Install: brew install promptfoo

Run:
<CodeBlock language="bash">
  promptfoo redteam setup
</CodeBlock>

The setup command will open a web UI that asks questions to help you configure your red teaming project.

Start by providing some details about the target application. The more details we provide, the more tailored the generated test cases will be.

At a minimum, be sure to fill out the Purpose field with a description of your application.

Next, configure Promptfoo to communicate with your target application or model.

Because the Promptfoo scanner runs locally on your machine, it can attack any endpoint accessible from your machine or network.

See below for more info on how to talk with non-HTTP targets such as models (local or remote) or custom code.

Next, select the plugins that you want to use. Plugins are adversarial generators. They produce malicious inputs that are sent to your application.

Check off the individual plugins you want to use, or select a preset that includes a combination of plugins (if in doubt, stick with "Default").

Now we select strategies. Strategies are adversarial techniques that wrap the generated inputs in a specific attack pattern.

This is how Promptfoo generates more sophisticated jailbreaks and injections.

Finally, download the generated configuration file. You'll use this to run the red team from your local machine.

Save the file as promptfooconfig.yaml. Then, navigate to the directory where you saved the file and run promptfoo redteam run.

:::info If you don't want to use the UI to start a red team, you can use the init command instead:

promptfoo redteam init --no-gui

:::

Attacking an API endpoint

Edit the config to set up the target endpoint. For example:

targets:
  - id: https
    label: 'travel-agent-agent'
    config:
      url: 'https://example.com/generate'
      method: 'POST'
      headers:
        'Content-Type': 'application/json'
      body:
        myPrompt: '{{prompt}}'

purpose: 'The user is a budget traveler looking for the best deals. The system is a travel agent that helps the user plan their trip. The user is anonymous and should not be able to access any information about other users, employees, or other individuals.'

The label is used to create issues and report the results of the red teaming. Make sure to re-use the same label when generating new redteam configs for the same target.

Setting the purpose is optional, but it will significantly improve the quality of the generated test cases and grading. Be specific about who the user of the system is and what information and actions they should be able to access.

For more information on configuring an HTTP target, see HTTP requests.

Alternative: Test specific prompts and models

If you don't have a live endpoint, you can edit the config to set the specific prompt(s) and the LLM(s) to test:

prompts:
  - 'Act as a travel agent and help the user plan their trip. User query: {{query}}'
  # Paths to prompts also work:
  # - file://path/to/prompt.txt

targets:
  - id: openai:gpt-4o-mini
    label: 'travel-agent-mini'

For more information on supported targets, see Custom Providers. For more information on supported prompt formats, see prompts.

Alternative: Talking directly to your app

Promptfoo can hook directly into your existing LLM app to attack targets via Python, Javascript, RAG or agent workflows, HTTP API, and more. See custom providers for details on setting up:

HTTP requests to your API
Custom Python scripts for precise control
Javascript, any executable, local providers like ollama, or other provider types

Run the eval

Now that we've generated the test cases, we're ready to run the adversarial evaluation.

npx promptfoo@latest redteam run promptfoo redteam run promptfoo redteam run

This command will generate several hundred adversarial inputs across many categories of potential harm and save them in redteam.yaml. Then, it will run the test cases against the target.

View the results

npx promptfoo@latest redteam report promptfoo redteam report promptfoo redteam report

Promptfoo provides a report view that lets you dig into specific red team failure cases:

That view includes a breakdown of specific test types that are connected to the eval view:

Clicking into a specific test case to view logs will display the raw inputs and outputs:

Understanding the report view

The red teaming results provide insights into various aspects of your LLM application's behavior:

Vulnerability categories: Identifies the types of vulnerabilities discovered, such as prompt injections, context poisoning, or unintended behaviors.
Severity levels: Classifies vulnerabilities based on their potential impact and likelihood of occurrence.
Logs: Provides concrete instances of inputs that triggered vulnerabilities.
Suggested mitigations: Recommendations for addressing identified vulnerabilities, which may include prompt engineering, additional safeguards, or architectural changes.

Continuous improvement

Red teaming is not a one-time activity but an ongoing process. As you develop and refine your LLM application, regularly running red team evaluations helps ensure that:

New features or changes don't introduce unexpected vulnerabilities
Your application remains robust against evolving attack techniques
You can quantify and demonstrate improvements in safety and reliability over time

Check out the CI/CD integration docs for more info.

Resources

Configuration guide for detailed info on configuring your red team
Full guide for info examples of dynamically generated prompts, RAG/chain, etc.
Types of LLM vulnerabilities for an overview of supported plugins
Guides on red teaming agents and RAGs

Tip!

Press p or to see the previous file or, n or to see the next file

Specify your S3 bucket

Bucket name cannot be the same as the repository name. Please change one of them.

Bucket url and prefix

Region

Endpoint Url

Disable SSL verification

quickstart.md 9.8 KB

Permalink History Raw

Quickstart

Prerequisites

Initialize the project

Attacking an API endpoint

Alternative: Test specific prompts and models

Alternative: Talking directly to your app

Run the eval

View the results

Understanding the report view

Continuous improvement

Resources

Comments

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

nirbarazida / promptfoo mirror of https://github.com/promptfoo/promptfoo

quickstart.md 9.8 KB Permalink History Raw

Quickstart

Prerequisites

Initialize the project

Attacking an API endpoint

Alternative: Test specific prompts and models

Alternative: Talking directly to your app

Run the eval

View the results

Understanding the report view

Continuous improvement

Resources

Comments

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

nirbarazida
/
promptfoo
mirror of https://github.com/promptfoo/promptfoo

quickstart.md 9.8 KB

Permalink History Raw