---
sidebar_position: 42
---
The `vertex` provider is compatible with Google's Vertex AI offering, which provides access to models such as Gemini and Bison.
You can use it by specifying any of the available stable or latest model versions offered by Vertex AI. These include:
- `vertex:chat-bison`
- `vertex:chat-bison@001`
- `vertex:chat-bison@002`
- `vertex:chat-bison-32k`
- `vertex:chat-bison-32k@001`
- `vertex:chat-bison-32k@002`
- `vertex:codechat-bison`
- `vertex:codechat-bison@001`
- `vertex:codechat-bison@002`
- `vertex:codechat-bison-32k`
- `vertex:codechat-bison-32k@001`
- `vertex:codechat-bison-32k@002`
- `vertex:gemini-pro`
- `vertex:gemini-ultra`
- `vertex:gemini-1.0-pro-vision`
- `vertex:gemini-1.0-pro-vision-001`
- `vertex:gemini-1.0-pro`
- `vertex:gemini-1.0-pro-001`
- `vertex:gemini-1.0-pro-002`
- `vertex:gemini-pro-vision`
- `vertex:gemini-1.5-pro-latest`
- `vertex:gemini-1.5-pro-preview-0409`
- `vertex:gemini-1.5-pro-preview-0514`
- `vertex:gemini-1.5-pro`
- `vertex:gemini-1.5-pro-001`
- `vertex:gemini-1.5-flash-preview-0514`
- `vertex:gemini-1.5-flash-001`
- `vertex:aqa`
Embedding models such as `vertex:embedding:text-embedding-004` are also supported.
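
For example, a minimal config might reference a couple of these model IDs. This is an illustrative sketch; the prompt and test values are placeholders:

```yaml
# Illustrative config; prompt and test vars are placeholders
prompts:
  - 'Summarize the following text: {{text}}'

providers:
  - vertex:gemini-1.5-pro
  - vertex:gemini-1.5-flash-001

tests:
  - vars:
      text: 'Vertex AI offers access to Gemini and Bison models.'
```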
:::tip
If you are using Google AI Studio, see the `google` provider.
:::
To call Vertex AI models in Node, you'll need to install Google's official auth client as a peer dependency:

```sh
npm i google-auth-library
```
Make sure the Vertex AI API is enabled for the relevant project in Google Cloud. Then, ensure that you've selected that project in the `gcloud` CLI:

```sh
gcloud config set project PROJECT_ID
```
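
If the API is not yet enabled, one way to do it from the CLI is shown below; this assumes the standard `aiplatform.googleapis.com` service name (you can also enable it from the Cloud Console):

```sh
# Enable the Vertex AI API for the currently selected project
gcloud services enable aiplatform.googleapis.com
```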
Next, make sure that you've authenticated to Google Cloud using one of these methods:

- Log in with `gcloud auth application-default login`.
- Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of a service account credentials file.

The provider also reads the following environment variables:

- `VERTEX_API_KEY` - gcloud API token. The easiest way to get an API key is to run `gcloud auth print-access-token`.
- `VERTEX_PROJECT_ID` - gcloud project ID
- `VERTEX_REGION` - gcloud region, defaults to `us-central1`
- `VERTEX_PUBLISHER` - model publisher, defaults to `google`
- `VERTEX_API_HOST` - used to override the full Google API host, e.g. for an LLM proxy, defaults to `{region}-aiplatform.googleapis.com`
- `VERTEX_API_VERSION` - API version to use, defaults to `v1`
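
As a rough sketch, these variables could be exported in your shell before running an eval; the project ID below is a placeholder:

```sh
# Placeholder values; substitute your own project and region
export VERTEX_PROJECT_ID=my-gcp-project
export VERTEX_REGION=us-central1
# Optional: use a short-lived access token instead of application-default credentials
export VERTEX_API_KEY="$(gcloud auth print-access-token)"
```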
The Vertex provider also supports various configuration options such as `context`, `examples`, `temperature`, `maxOutputTokens`, and more, which can be used to customize the behavior of the model like so:
```yaml
providers:
  - id: vertex:chat-bison-32k
    config:
      generationConfig:
        temperature: 0
        maxOutputTokens: 1024
```
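
With that configuration saved in your `promptfooconfig.yaml`, the eval runs as usual:

```sh
npx promptfoo@latest eval
```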
AI safety settings can be configured using the `safetySettings` key. For example:
```yaml
- id: vertex:gemini-pro
  config:
    safetySettings:
      - category: HARM_CATEGORY_HARASSMENT
        threshold: BLOCK_ONLY_HIGH
      - category: HARM_CATEGORY_VIOLENCE
        threshold: BLOCK_MEDIUM_AND_ABOVE
```
See Google's SafetySetting API documentation for more details.
To use Vertex models for model grading (e.g. `llm-rubric` and `factuality` assertions), or to override the embeddings provider for the `similar` assertion, use `defaultTest` to override providers for all tests:
```yaml
defaultTest:
  options:
    provider:
      # Use gemini-pro for model-graded evals (e.g. assertions such as llm-rubric)
      text: vertex:chat:gemini-pro
      # Use vertex embeddings for similarity
      embedding: vertex:embedding:text-embedding-004
```
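
For context, here is a sketch of a test that would exercise both graders configured above; the rubric wording, reference answer, and threshold are illustrative placeholders:

```yaml
# Illustrative test case; rubric wording and threshold are placeholders
tests:
  - vars:
      question: 'What is Vertex AI?'
    assert:
      # Graded by the text provider (vertex:chat:gemini-pro)
      - type: llm-rubric
        value: 'Answer accurately describes a Google Cloud machine learning platform'
      # Scored by the embedding provider (vertex:embedding:text-embedding-004)
      - type: similar
        value: 'Vertex AI is Google Cloud''s managed machine learning platform.'
        threshold: 0.8
```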
| Option | Description | Default Value |
| --- | --- | --- |
| `apiKey` | gcloud API token. | None |
| `apiHost` | Full Google API host, e.g. for an LLM proxy. | `{region}-aiplatform.googleapis.com` |
| `apiVersion` | API version to use. | `v1` |
| `projectId` | gcloud project ID. | None |
| `region` | gcloud region. | `us-central1` |
| `publisher` | Model publisher. | `google` |
| `context` | Context for the model to consider when generating responses. | None |
| `examples` | Examples to prime the model. | None |
| `safetySettings` | Safety settings to filter generated content. | None |
| `generationConfig.temperature` | Controls randomness. Lower values make the model more deterministic. | None |
| `generationConfig.maxOutputTokens` | Maximum number of tokens to generate. | None |
| `generationConfig.topP` | Nucleus sampling: higher values cause the model to consider more candidates. | None |
| `generationConfig.topK` | Controls diversity via random sampling: lower values make sampling more deterministic. | None |
| `generationConfig.stopSequences` | Set of string outputs that will stop output generation. | `[]` |
| `toolConfig` | Configuration for tool usage. | None |
| `systemInstruction` | System prompt. Nunjucks template variables `{{var}}` are supported. | None |
Note that not all models support all parameters. Please consult the Google documentation on how to use and format parameters.
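
As a rough sketch, several of these options can be combined in a single provider entry; the specific values and the Nunjucks variable below are purely illustrative:

```yaml
# Illustrative values only; check which parameters your chosen model supports
providers:
  - id: vertex:gemini-1.5-pro
    config:
      region: us-central1
      systemInstruction: 'You are a helpful assistant. Always answer in {{language}}.'
      generationConfig:
        temperature: 0.2
        maxOutputTokens: 512
        topP: 0.95
        topK: 40
        stopSequences:
          - 'END'
```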
If you get a `gcloud` reauth-related error that looks like this:

```
API call error: Error: {"error":"invalid_grant","error_description":"reauth related error (invalid_rapt)","error_uri":"https://support.google.com/a/answer/9368756","error_subtype":"invalid_rapt"}
```
This is due to an authentication issue. You can fix this by running:

```sh
gcloud auth application-default login
```