vertex-ai
The vertex-ai
provider is used to interact with the Google Vertex AI services, specifically the following endpoints:
Example:
Authorization
The vertex-ai
provider uses the Google Cloud SDK to authenticate with a temporary access token. We generate these Google Cloud Authentication Tokens using Google Cloud service account credentials. We do not store this token, and it is only used for the duration of the request.
Instructions for downloading Google Cloud credentials
- Go to the Google Cloud Console.
- Click on the project you want to use.
- Select the
IAM & Admin
section, and click onService Accounts
. - Select an existing service account or create a new one.
- Click on the service account and select
Add Key
. - Choose the JSON key type and click
Create
. - Set the
GOOGLE_APPLICATION_CREDENTIALS
environment variable to the path of the file.
See the Google Cloud Application Default Credentials Docs for more information.
The project_id
of your client object must match the project_id
of your credentials file.
The options are passed through directly to the API, barring a few. Here’s a shorthand of the options:
Non-forwarded options
The base URL for the API.
Default: https://{LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/{LOCATION}/publishers/google/models/
Can be used in lieu of the project_id
and location
fields, to manually set the request URL.
Vertex requires a Google Cloud project ID for each request. See the Google Cloud Project ID Docs for more information.
Vertex requires a location for each request. Some locations may have different models avaiable.
Common locations include:
us-central1
us-west1
us-east1
us-south1
See the Vertex Location Docs for all locations and supported models.
Path to a JSON credentials file or a JSON object containing the credentials.
Default: env.GOOGLE_APPLICATION_CREDENTIALS
This field cannot be used in the BAML Playground. For the playground, use the credentials_content
instead.
Overrides contents of the Google Cloud Application Credentials. Default: env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
Only use this for the BAML Playground only. Use credentials
for your runtime code.
Directly set Google Cloud Authentication Token in lieu of token generation via env.GOOGLE_APPLICATION_CREDENTIALS
or env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
fields.
Additional headers to send with the request.
Example:
The role to use if the role is not in the allowed_roles. Default: "user"
usually, but some models like OpenAI’s gpt-4o
will use "system"
Picked the first role in allowed_roles
if not “user”, otherwise “user”.
Which roles should we forward to the API? Default: ["system", "user", "assistant"]
usually, but some models like OpenAI’s o1-mini
will use ["user", "assistant"]
When building prompts, any role not in this list will be set to the default_role
.
Which role metadata should we forward to the API? Default: []
For example you can set this to ["foo", "bar"]
to forward the cache policy to the API.
If you do not set allowed_role_metadata
, we will not forward any role metadata to the API even if it is set in the prompt.
Then in your prompt you can use something like:
You can use the playground to see the raw curl request to see what is being sent to the API.
Whether the internal LLM client should use the streaming API. Default: true
Then in your prompt you can use something like:
Forwarded options
Safety settings to apply to the request. You can stack different safety settings with a new safetySettings
header for each one. See the Google Vertex API Request Docs for more information on what safety settings can be set.
Generation configurations to apply to the request. See the Google Vertex API Request Docs for more information on what properties can be set.
For all other options, see the official Vertex AI documentation.
Publishers Other Than Google
If you are using models from publishers other than Google, such as Llama from
Meta, use your project endpoint as the base_url
in BAML: