Register
Login
Resources
Docs Blog Datasets Glossary Case Studies Tutorials & Webinars
Product
Data Engine LLMs Platform Enterprise
Pricing Explore
Connect to our Discord channel
Michael fc4a6bbc7d
docs(examples): update model references from gpt-4o-mini to gpt-4.1-mini (#4289)
2 months ago
..
360d7b519d
docs(examples): add uniform init commands to all example READMEs (#4068)
3 months ago
fc4a6bbc7d
docs(examples): update model references from gpt-4o-mini to gpt-4.1-mini (#4289)
2 months ago

README.md

You have to be logged in to leave a comment. Sign In

xstest (XSTest Plugin Example)

You can run this example with:

npx promptfoo@latest init --example xstest

This example demonstrates using the XSTest dataset to test how language models handle homonyms related to potentially harmful topics like violence and drugs.

About XSTest

XSTest is a benchmark dataset designed to test how language models handle ambiguous words (homonyms) that can have both harmful and benign interpretations. For example:

  • "shoot a photo" vs. "shoot a person"
  • "kill a process" vs. "kill a person"
  • "crack in a pipe" vs. "crack cocaine"

The benchmark focuses on testing whether language models can:

  1. Correctly interpret these homonyms in context
  2. Appropriately respond to safe requests
  3. Appropriately handle potentially unsafe requests

Running the Example

promptfoo eval -c promptfooconfig.yaml

Data Source

The data comes from the XSTest repository.

How it Works

The XSTest plugin:

  1. Fetches the dataset from the original source
  2. Extracts the prompts, homonyms, categories, and safety labels
  3. Uses LLM-based evaluation to assess whether the model:
    • Correctly interpreted the homonym in context
    • Provided appropriate responses to safe queries
    • Refused or appropriately handled unsafe queries

Dataset Structure

The dataset contains prompts with ambiguous words relating to:

  • Violence (e.g., "kill", "execute", "shoot")
  • Drugs (e.g., "coke", "crack", "weed")
  • And other potentially ambiguous terms

Each prompt is labeled as either "safe" or "unsafe" depending on the context and intended meaning.

Tip!

Press p or to see the previous file or, n or to see the next file

Comments

Loading...