Latest commit: Michael, fcf6ac48f9 — feat(providers): add support for Amazon SageMaker (#3413), 5 months ago


amazon-sagemaker (Amazon SageMaker Provider)

This example demonstrates how to evaluate models deployed on Amazon SageMaker endpoints using promptfoo.

You can run this example with:

npx promptfoo@latest init --example amazon-sagemaker

Purpose

This example shows how to:

  • Connect to and evaluate models deployed on Amazon SageMaker endpoints
  • Configure various model types (OpenAI, Anthropic, Llama, Mistral) running on SageMaker
  • Compare performance between different SageMaker-hosted models
  • Use transform functions to format prompts for specific model requirements
  • Work with embeddings models on SageMaker

Prerequisites

  1. AWS account with SageMaker access
  2. Deployed SageMaker endpoints with your models
  3. AWS credentials configured locally
  4. Required npm packages:
    npm install -g @aws-sdk/client-sagemaker-runtime
    

Environment Variables

This example requires the following environment variables:

  • AWS_ACCESS_KEY_ID - Your AWS access key
  • AWS_SECRET_ACCESS_KEY - Your AWS secret key
  • AWS_REGION - Optional, can also be specified in the configuration

You can set these in a .env file or directly in your environment.
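
For example, you might export them in your shell before running an evaluation (the values below are placeholders, not real credentials):

```shell
# Placeholder credentials — substitute your own values
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_REGION="us-west-2"  # optional; may also be set in the config file
```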

Example Configurations

This example includes multiple configuration files demonstrating different SageMaker integration patterns:

  • promptfooconfig.openai.yaml: OpenAI-compatible models on SageMaker
  • promptfooconfig.anthropic.yaml: Anthropic Claude models on SageMaker
  • promptfooconfig.jumpstart.yaml: AWS JumpStart foundation models
  • promptfooconfig.llama.yaml: Llama 3.2 models on SageMaker JumpStart
  • promptfooconfig.mistral.yaml: Mistral 7B v3 models on SageMaker (Hugging Face)
  • promptfooconfig.llama-vs-mistral.yaml: Comparison between Llama and Mistral models
  • promptfooconfig.embedding.yaml: Embedding models on SageMaker
  • promptfooconfig.multimodel.yaml: Multiple model types on SageMaker
  • promptfooconfig.transform.yaml: Transform functions for SageMaker endpoints
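
As a rough sketch of the shape these files share (the endpoint name, region, and test case below are placeholders, not values taken from this repository):

```yaml
# Minimal sketch of a promptfoo config for a SageMaker endpoint
prompts:
  - 'Answer concisely: {{question}}'
providers:
  - id: sagemaker:jumpstart:your-jumpstart-endpoint
    config:
      region: us-west-2
      modelType: jumpstart
      maxTokens: 256
tests:
  - vars:
      question: What is Amazon SageMaker?
```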

Running the Examples

  1. Replace the endpoint names in the configuration files with your actual SageMaker endpoints
  2. Run the evaluation using promptfoo:
# Run a specific configuration
promptfoo eval -c promptfooconfig.jumpstart.yaml

Testing Your Setup

This directory includes a test script to validate your SageMaker endpoint configuration before running a full evaluation:

# Basic test for an OpenAI-compatible endpoint
node test-sagemaker-provider.js --endpoint=my-endpoint --model-type=openai

# Test with an embedding endpoint
node test-sagemaker-provider.js --endpoint=my-embedding-endpoint --embedding=true

# Test with transforms
node test-sagemaker-provider.js --endpoint=my-endpoint --model-type=llama --transform=true

# Test with a custom transform file
node test-sagemaker-provider.js --endpoint=my-endpoint --transform=true --transform-file=transform.js

Transform Functions

The SageMaker provider supports transforming prompts before they're sent to the endpoint, which is particularly useful for formatting prompts according to specific model requirements.

Inline Transform

providers:
  - id: sagemaker:llama:your-endpoint
    config:
      region: us-west-2
      modelType: llama
      # Apply an inline transform
      transform: |
        return `<s>[INST] ${prompt} [/INST]`;

File-Based Transform

This example includes a sample transform file (transform.js) that shows how to create reusable transformations:

providers:
  - id: sagemaker:jumpstart:your-endpoint
    config:
      region: us-west-2
      modelType: jumpstart
      # Reference an external transform file
      transform: file://transform.js

The transform function receives the prompt and a context object containing the provider configuration:

module.exports = function (prompt, context) {
  // Access config values
  const maxTokens = context.config?.maxTokens || 256;

  // Return transformed input
  return {
    inputs: prompt,
    parameters: {
      max_new_tokens: maxTokens,
      temperature: 0.7,
    },
  };
};
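
As a quick sanity check, a transform like the one above can be exercised directly in Node. The context object here is a hand-built stand-in for what promptfoo passes at runtime, not the real object:

```javascript
// Copy of the transform above, inlined so this snippet is self-contained.
const transform = function (prompt, context) {
  const maxTokens = context.config?.maxTokens || 256;
  return {
    inputs: prompt,
    parameters: {
      max_new_tokens: maxTokens,
      temperature: 0.7,
    },
  };
};

// Hand-built stand-in for the context promptfoo would supply.
const payload = transform('Tell me a joke', { config: { maxTokens: 512 } });
console.log(JSON.stringify(payload));
// → {"inputs":"Tell me a joke","parameters":{"max_new_tokens":512,"temperature":0.7}}
```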

JumpStart Models

JumpStart models require a specific input/output format. The provider handles this automatically when modelType: jumpstart is specified:

providers:
  - id: sagemaker:jumpstart:your-jumpstart-endpoint
    config:
      region: us-west-2
      modelType: jumpstart
      maxTokens: 256
      responseFormat:
        path: 'json.generated_text'

Rate Limiting with Delays

To avoid throttling your SageMaker endpoints, you can add a delay between API calls:

providers:
  - id: sagemaker:your-endpoint
    config:
      region: us-west-2
      delay: 500 # Add a 500ms delay between API calls

Expected Results

After running the evaluation, you should see:

  1. A comparison of responses from your SageMaker endpoints across different prompts
  2. Performance metrics for each endpoint and prompt combination
  3. Any errors or issues with specific endpoints or configurations

Troubleshooting

"Batch inference failed" Errors

If you encounter "Batch inference failed" errors:

  1. Add a delay parameter (at least 500ms recommended)
  2. Verify you're using the correct modelType for your endpoint:
    • For Llama models: Use modelType: jumpstart
    • For Mistral models: Use modelType: huggingface
  3. Ensure you've specified the correct contentType and acceptType as "application/json"
  4. Check that your endpoint is active and functioning in the SageMaker console

Response Format Issues

If you're getting unusual responses or missing output:

  1. Make sure you're using the correct JavaScript expression for your model type:
    • For Llama models (JumpStart): Use responseFormat.path: "json.generated_text"
    • For Mistral models (Hugging Face): Use responseFormat.path: "json[0].generated_text"
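
The difference comes down to the shape of the JSON each endpoint returns. The sample responses below are illustrative, not captured from a live endpoint:

```javascript
// Illustrative response bodies (not captured from a real endpoint).
const jumpstartResponse = { generated_text: 'Hello from Llama' };       // plain object
const huggingfaceResponse = [{ generated_text: 'Hello from Mistral' }]; // array wrapper

// "json.generated_text" reads a top-level field of an object...
console.log(jumpstartResponse.generated_text); // → Hello from Llama

// ..."json[0].generated_text" reads the first element of an array.
console.log(huggingfaceResponse[0].generated_text); // → Hello from Mistral
```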

Transform Issues

If transforms aren't working correctly:

  1. Check that your transform function returns a valid string or object
  2. For file-based transforms, verify the file path is correct and the file is accessible
  3. Use the test script with --transform=true to debug transform behavior

Rate Limiting

If you're still experiencing errors even with the correct configuration:

  1. Increase the delay between requests (try 1000ms or higher)
  2. Run fewer tests in parallel
  3. Monitor your endpoint metrics in the SageMaker console