
DALL-E Red Team Example

This example demonstrates how to use promptfoo to automatically discover jailbreaks in OpenAI's DALL-E image generation model. It includes pre-configured test cases that attempt to generate various types of harmful content.

⚠️ Warning: Running this example may get your OpenAI account flagged for moderation or banned.

Setup

Set your OpenAI API key:
```
export OPENAI_API_KEY=your_key_here
```

Initialize the example:

npx promptfoo@latest init --example redteam-dalle

Usage

Review and optionally modify the test cases in promptfooconfig.yaml. The example includes the same test cases shown in our blog post.

Run the evaluation:
```
npx promptfoo@latest eval
```
View the results in the web UI:
```
npx promptfoo@latest view
```
Important table settings:
- Under TABLE SETTINGS, enable "Render model outputs as Markdown" to display generated images
- Set "Max text length" to unlimited to view complete responses
- Click the magnifying glass icon (🔍) in the "View output and test details" column to see the final modified prompt, the complete conversation history, and the scoring breakdown for each iteration

Expected Behavior

During evaluation, you may see error messages like:

Error from target provider: 400 Your request was rejected as a result of our safety system.
Error from target provider: 400 This request has been blocked by our content filters.

This is expected behavior. Promptfoo automatically retries with modified prompts in a loop until an attempt succeeds or the maximum number of iterations is reached.
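The retry behavior described above can be pictured with a minimal sketch. This is not promptfoo's implementation: `runWithRetries`, `callTarget`, and `rewritePrompt` are hypothetical names chosen for illustration, and "success" here simply means the target returned without an error, which is a simplification.

```javascript
// Hypothetical sketch of an iterate-until-success loop; not promptfoo's code.
// callTarget(prompt)          -> resolves with an output, or rejects (e.g. a 400).
// rewritePrompt(prompt, why)  -> resolves with a modified prompt to try next.
async function runWithRetries(initialPrompt, callTarget, rewritePrompt, maxIterations = 4) {
  let prompt = initialPrompt;
  const history = [];

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    try {
      const output = await callTarget(prompt);
      history.push({ iteration, prompt, output });
      return { success: true, iterations: iteration, output, history };
    } catch (err) {
      // A rejection (such as a safety-system 400) is recorded, not fatal.
      history.push({ iteration, prompt, error: err.message });
      prompt = await rewritePrompt(prompt, err.message);
    }
  }

  // Every attempt was rejected within the iteration budget.
  return { success: false, iterations: maxIterations, history };
}
```

In this sketch the `history` array plays the role of the per-iteration record you can inspect in the web UI, and `maxIterations` defaults to 4 to mirror the default mentioned under Configuration.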

Configuration

The default configuration uses 4 iterations per test case. To increase this (and potentially find more jailbreaks), set:
```
export PROMPTFOO_NUM_JAILBREAK_ITERATIONS=6
```
For debugging or to see the internal workings, enable debug logging:
```
LOG_LEVEL=debug npx promptfoo@latest eval -j 1
```

Troubleshooting

Note: DALL-E image URLs expire after 2 hours. The example includes an extension hook that downloads images to a local images directory as soon as they are generated. Each image is saved with a filename based on the test description and timestamp.
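As a rough illustration of the naming scheme described above, the sketch below derives a filename from a test description and a timestamp, and pulls an image URL out of a Markdown-formatted output. It is not the example's actual save.js: the helper names `extractImageUrl` and `buildImageFilename`, the `.png` extension, and the assumption that the output contains a Markdown image (`![alt](url)`) are all illustrative guesses.

```javascript
// Hypothetical helpers; the example's real save.js may differ.

// Returns the first http(s) URL found in a Markdown image, or null.
function extractImageUrl(output) {
  const match = /!\[[^\]]*\]\((https?:\/\/[^\s)]+)\)/.exec(String(output));
  return match ? match[1] : null;
}

// Builds a filesystem-friendly name from the test description and a timestamp.
function buildImageFilename(description, date = new Date()) {
  const slug =
    String(description || 'image')
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-') // collapse anything unsafe into dashes
      .replace(/^-+|-+$/g, '')     // trim leading and trailing dashes
      .slice(0, 50) || 'image';
  // ISO timestamps contain ':' and '.', which are awkward in filenames.
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `${slug}-${stamp}.png`;
}
```

A real hook would still need to download the URL and write the bytes to the images directory before the link expires; that part is omitted here.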

- If you get rate limit errors, try reducing concurrency with -j 1.
- If you get timeout errors, the evaluation is still running in the background. Wait a few minutes and check the results.
- For other issues, please check our documentation or file an issue.

Learn More

For more details about LLM red teaming with promptfoo, check out:

nirbarazida / promptfoo mirror of https://github.com/promptfoo/promptfoo
