---
sidebar_label: Databricks
---

Databricks Foundation Model APIs

The Databricks provider integrates with Databricks' Foundation Model APIs, offering access to state-of-the-art models through a unified OpenAI-compatible interface. It supports multiple deployment modes to match your specific use case and performance requirements.

Overview

Databricks Foundation Model APIs provide three main deployment options:

  1. Pay-per-token endpoints: Pre-configured endpoints for popular models with usage-based pricing
  2. Provisioned throughput: Dedicated endpoints with guaranteed performance for production workloads
  3. External models: Unified access to models from providers like OpenAI, Anthropic, and Google through Databricks

Prerequisites

  1. A Databricks workspace with Foundation Model APIs enabled
  2. A Databricks access token for authentication
  3. Your workspace URL (e.g., https://your-workspace.cloud.databricks.com)

Set up your environment:

export DATABRICKS_WORKSPACE_URL=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=your-token-here

Basic Usage

Pay-per-token Endpoints

Access pre-configured Foundation Model endpoints with simple configuration:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      workspaceUrl: https://your-workspace.cloud.databricks.com
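
Put together with a prompt and a test, a minimal end-to-end config might look like the following sketch (the prompt text and variable names are illustrative):

prompts:
  - 'Summarize in one sentence: {{input}}'

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true

tests:
  - vars:
      input: 'Databricks Foundation Model APIs expose hosted models behind an OpenAI-compatible interface.'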

Available pay-per-token models include:

  • databricks-meta-llama-3-3-70b-instruct - Meta's Llama 3.3 70B instruction-tuned model
  • databricks-claude-3-7-sonnet - Anthropic Claude with reasoning capabilities
  • databricks-gte-large-en - Text embeddings model
  • databricks-dbrx-instruct - Databricks' own foundation model

Provisioned Throughput Endpoints

For production workloads requiring guaranteed performance:

providers:
  - id: databricks:my-custom-endpoint
    config:
      workspaceUrl: https://your-workspace.cloud.databricks.com
      temperature: 0.7
      max_tokens: 500

External Models

Access external models through Databricks' unified API:

providers:
  - id: databricks:my-openai-endpoint
    config:
      workspaceUrl: https://your-workspace.cloud.databricks.com
      # External model endpoints proxy to providers like OpenAI, Anthropic, etc.

Configuration Options

The Databricks provider extends the OpenAI configuration options with the following Databricks-specific parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| workspaceUrl | Databricks workspace URL; can also be set via the DATABRICKS_WORKSPACE_URL environment variable | - |
| isPayPerToken | Whether this is a pay-per-token endpoint (true) or a custom deployed endpoint (false) | false |
| usageContext | Optional metadata for usage tracking and cost attribution | - |
| aiGatewayConfig | AI Gateway features configuration (safety filters, PII handling) | - |

Advanced Configuration

providers:
  - id: databricks:databricks-claude-3-7-sonnet
    config:
      isPayPerToken: true
      workspaceUrl: https://your-workspace.cloud.databricks.com

      # Standard OpenAI parameters
      temperature: 0.7
      max_tokens: 2000
      top_p: 0.9

      # Usage tracking for cost attribution
      usageContext:
        project: 'customer-support'
        team: 'engineering'
        environment: 'production'

      # AI Gateway features (if enabled on endpoint)
      aiGatewayConfig:
        enableSafety: true
        piiHandling: 'mask' # Options: none, block, mask

Environment Variables

| Variable | Description |
| --- | --- |
| DATABRICKS_WORKSPACE_URL | Your Databricks workspace URL |
| DATABRICKS_TOKEN | Authentication token for Databricks API access |
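
When both variables are set, workspaceUrl can be omitted from the provider config. A minimal sketch:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      # workspaceUrl is omitted here and read from DATABRICKS_WORKSPACE_URL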

Features

Vision Models

Vision models on Databricks require structured JSON prompts similar to OpenAI's format. Here's how to use them:

prompts:
  - file://vision-prompt.json

providers:
  - id: databricks:databricks-claude-3-7-sonnet
    config:
      isPayPerToken: true

tests:
  - vars:
      question: "What's in this image?"
      image_url: 'https://example.com/image.jpg'

Create a vision-prompt.json file with the proper format:

[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "{{question}}"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "{{image_url}}"
        }
      }
    ]
  }
]

Structured Outputs

Get responses in a specific JSON schema:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      response_format:
        type: 'json_schema'
        json_schema:
          name: 'product_info'
          schema:
            type: 'object'
            properties:
              name:
                type: 'string'
              price:
                type: 'number'
            required: ['name', 'price']
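
Because schema support can vary by model, it is worth validating the response in your tests as well. A sketch using the built-in is-json assertion (the query variable is illustrative and should match your prompt's variables):

tests:
  - vars:
      query: 'The Gizmo Pro costs $49.99' # illustrative input
    assert:
      - type: is-json # optionally pass a JSON schema as `value` for stricter checks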

Monitoring and Usage Tracking

Track usage and costs with detailed context:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      usageContext:
        application: 'chatbot'
        customer_id: '12345'
        request_type: 'support_query'
        priority: 'high'

Usage data is available through Databricks system tables:

  • system.serving.endpoint_usage - Token usage and request metrics
  • system.serving.served_entities - Endpoint metadata

Best Practices

  1. Choose the right deployment mode:

    • Use pay-per-token for experimentation and low-volume use cases
    • Use provisioned throughput for production workloads requiring SLAs
    • Use external models when you need specific providers' capabilities
  2. Enable AI Gateway features for production endpoints:

    • Safety guardrails prevent harmful content
    • PII detection protects sensitive data
    • Rate limiting controls costs and prevents abuse
  3. Implement proper error handling:

    • Pay-per-token endpoints may have rate limits
    • Provisioned endpoints may have token-per-second limits
    • External model endpoints inherit provider-specific limitations

Example: Multi-Model Comparison

prompts:
  - 'Explain quantum computing to a 10-year-old'

providers:
  # Databricks native model
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      temperature: 0.7

  # External model via Databricks
  - id: databricks:my-gpt4-endpoint
    config:
      temperature: 0.7

  # Custom deployed model
  - id: databricks:my-finetuned-llama
    config:
      temperature: 0.7

tests:
  - assert:
      - type: llm-rubric
        value: 'Response should be simple, clear, and use age-appropriate analogies'

Troubleshooting

Common issues and solutions:

  1. Authentication errors: Verify that your DATABRICKS_TOKEN has the necessary permissions (see the check below)
  2. Endpoint not found:
    • For pay-per-token: Ensure you're using the exact endpoint name (e.g., databricks-meta-llama-3-3-70b-instruct)
    • For custom endpoints: Verify the endpoint exists and is running
  3. Rate limiting: Pay-per-token endpoints have usage limits; consider provisioned throughput for high-volume use
  4. Token count errors: Some models have specific token limits; adjust max_tokens accordingly
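
For authentication and endpoint-name issues, a quick sanity check is to list the serving endpoints visible to your token via the Databricks REST API, reusing the environment variables set earlier:

curl -s \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "$DATABRICKS_WORKSPACE_URL/api/2.0/serving-endpoints"

If your endpoint appears in the response, the provider id (the part after databricks:) should match its name exactly.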
