---
sidebar_label: Databricks
---

Databricks Foundation Model APIs

The Databricks provider integrates with Databricks' Foundation Model APIs, offering access to state-of-the-art models through a unified OpenAI-compatible interface. It supports multiple deployment modes to match your specific use case and performance requirements.

Overview

Databricks Foundation Model APIs provide three main deployment options:

  1. Pay-per-token endpoints: Pre-configured endpoints for popular models with usage-based pricing
  2. Provisioned throughput: Dedicated endpoints with guaranteed performance for production workloads
  3. External models: Unified access to models from providers like OpenAI, Anthropic, and Google through Databricks

Prerequisites

  1. A Databricks workspace with Foundation Model APIs enabled
  2. A Databricks access token for authentication
  3. Your workspace URL (e.g., https://your-workspace.cloud.databricks.com)

Set up your environment:

export DATABRICKS_WORKSPACE_URL=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=your-token-here

Basic Usage

Pay-per-token Endpoints

Access pre-configured Foundation Model endpoints with simple configuration:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      workspaceUrl: https://your-workspace.cloud.databricks.com
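
Put together with a prompt and a test, a minimal end-to-end config might look like the following sketch (the prompt text and variable names are illustrative):

prompts:
  - 'Summarize in one sentence: {{input}}'

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true

tests:
  - vars:
      input: 'Databricks Foundation Model APIs expose hosted models behind an OpenAI-compatible interface.'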

Available pay-per-token models include:

  • databricks-meta-llama-3-3-70b-instruct - Meta's Llama 3.3 70B instruction-tuned model
  • databricks-claude-3-7-sonnet - Anthropic Claude with reasoning capabilities
  • databricks-gte-large-en - Text embeddings model
  • databricks-dbrx-instruct - Databricks' own foundation model

Provisioned Throughput Endpoints

For production workloads requiring guaranteed performance:

providers:
  - id: databricks:my-custom-endpoint
    config:
      workspaceUrl: https://your-workspace.cloud.databricks.com
      temperature: 0.7
      max_tokens: 500

External Models

Access external models through Databricks' unified API:

providers:
  - id: databricks:my-openai-endpoint
    config:
      workspaceUrl: https://your-workspace.cloud.databricks.com
      # External model endpoints proxy to providers like OpenAI, Anthropic, etc.

Configuration Options

The Databricks provider extends the OpenAI configuration options with the following Databricks-specific parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| workspaceUrl | Databricks workspace URL; can also be set via the DATABRICKS_WORKSPACE_URL environment variable | - |
| isPayPerToken | Whether this is a pay-per-token endpoint (true) or a custom deployed endpoint (false) | false |
| usageContext | Optional metadata for usage tracking and cost attribution | - |
| aiGatewayConfig | AI Gateway features configuration (safety filters, PII handling) | - |

Advanced Configuration

providers:
  - id: databricks:databricks-claude-3-7-sonnet
    config:
      isPayPerToken: true
      workspaceUrl: https://your-workspace.cloud.databricks.com

      # Standard OpenAI parameters
      temperature: 0.7
      max_tokens: 2000
      top_p: 0.9

      # Usage tracking for cost attribution
      usageContext:
        project: 'customer-support'
        team: 'engineering'
        environment: 'production'

      # AI Gateway features (if enabled on endpoint)
      aiGatewayConfig:
        enableSafety: true
        piiHandling: 'mask' # Options: none, block, mask

Environment Variables

| Variable | Description |
| --- | --- |
| DATABRICKS_WORKSPACE_URL | Your Databricks workspace URL |
| DATABRICKS_TOKEN | Authentication token for Databricks API access |
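
When both variables are set, workspaceUrl can be omitted from the provider config. A minimal sketch:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      # workspaceUrl is omitted here and read from DATABRICKS_WORKSPACE_URL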

Features

Vision Models

Vision models on Databricks require structured JSON prompts similar to OpenAI's format. Here's how to use them:

prompts:
  - file://vision-prompt.json

providers:
  - id: databricks:databricks-claude-3-7-sonnet
    config:
      isPayPerToken: true

tests:
  - vars:
      question: "What's in this image?"
      image_url: 'https://example.com/image.jpg'

Create a vision-prompt.json file with the proper format:

[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "{{question}}"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "{{image_url}}"
        }
      }
    ]
  }
]

Structured Outputs

Get responses in a specific JSON schema:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      response_format:
        type: 'json_schema'
        json_schema:
          name: 'product_info'
          schema:
            type: 'object'
            properties:
              name:
                type: 'string'
              price:
                type: 'number'
            required: ['name', 'price']
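
Because schema support can vary by model, it is worth validating the response in your tests as well. A sketch using the built-in is-json assertion (the query variable is illustrative and should match your prompt's variables):

tests:
  - vars:
      query: 'The Gizmo Pro costs $49.99' # illustrative input
    assert:
      - type: is-json # optionally pass a JSON schema as `value` for stricter checks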

Monitoring and Usage Tracking

Track usage and costs with detailed context:

providers:
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      usageContext:
        application: 'chatbot'
        customer_id: '12345'
        request_type: 'support_query'
        priority: 'high'

Usage data is available through Databricks system tables:

  • system.serving.endpoint_usage - Token usage and request metrics
  • system.serving.served_entities - Endpoint metadata

Best Practices

  1. Choose the right deployment mode:

    • Use pay-per-token for experimentation and low-volume use cases
    • Use provisioned throughput for production workloads requiring SLAs
    • Use external models when you need specific providers' capabilities
  2. Enable AI Gateway features for production endpoints:

    • Safety guardrails prevent harmful content
    • PII detection protects sensitive data
    • Rate limiting controls costs and prevents abuse
  3. Implement proper error handling:

    • Pay-per-token endpoints may have rate limits
    • Provisioned endpoints may have token-per-second limits
    • External model endpoints inherit provider-specific limitations

Example: Multi-Model Comparison

prompts:
  - 'Explain quantum computing to a 10-year-old'

providers:
  # Databricks native model
  - id: databricks:databricks-meta-llama-3-3-70b-instruct
    config:
      isPayPerToken: true
      temperature: 0.7

  # External model via Databricks
  - id: databricks:my-gpt4-endpoint
    config:
      temperature: 0.7

  # Custom deployed model
  - id: databricks:my-finetuned-llama
    config:
      temperature: 0.7

tests:
  - assert:
      - type: llm-rubric
        value: 'Response should be simple, clear, and use age-appropriate analogies'

Troubleshooting

Common issues and solutions:

  1. Authentication errors: Verify that your DATABRICKS_TOKEN has the necessary permissions (see the check below)
  2. Endpoint not found:
    • For pay-per-token: Ensure you're using the exact endpoint name (e.g., databricks-meta-llama-3-3-70b-instruct)
    • For custom endpoints: Verify the endpoint exists and is running
  3. Rate limiting: Pay-per-token endpoints have usage limits; consider provisioned throughput for high-volume use
  4. Token count errors: Some models have specific token limits; adjust max_tokens accordingly
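
For authentication and endpoint-name issues, a quick sanity check is to list the serving endpoints visible to your token via the Databricks REST API, reusing the environment variables set earlier:

curl -s \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "$DATABRICKS_WORKSPACE_URL/api/2.0/serving-endpoints"

If your endpoint appears in the response, the provider id (the part after databricks:) should match its name exactly.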
