vertex-ai
The vertex-ai
provider is used to interact with the Google Vertex AI services, specifically the following endpoints:
Example:
Authorization
The vertex-ai
provider uses the Google Cloud SDK to authenticate with a temporary access token. We generate these Google Cloud Authentication Tokens using Google Cloud service account credentials. We do not store this token, and it is only used for the duration of the request.
Instructions for downloading Google Cloud credentials
- Go to the Google Cloud Console.
- Click on the project you want to use.
- Select the
IAM & Admin
section, and click onService Accounts
. - Select an existing service account or create a new one.
- Click on the service account and select
Add Key
. - Choose the JSON key type and click
Create
. - Set the
GOOGLE_APPLICATION_CREDENTIALS
environment variable to the path of the file.
See the Google Cloud Application Default Credentials Docs for more information.
The project_id
of your client object must match the project_id
of your credentials file.
The options are passed through directly to the API, barring a few. Here’s a shorthand of the options:
Non-forwarded options
The base URL for the API.
Default: https://{LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/{LOCATION}/publishers/google/models/
Can be used in lieu of the project_id
and location
fields, to manually set the request URL.
Vertex requires a Google Cloud project ID for each request. See the Google Cloud Project ID Docs for more information.
Vertex requires a location for each request. Some locations may have different models avaiable.
Common locations include:
us-central1
us-west1
us-east1
us-south1
See the Vertex Location Docs for all locations and supported models.
Path to a JSON credentials file or a JSON object containing the credentials.
Default: env.GOOGLE_APPLICATION_CREDENTIALS
This field cannot be used in the BAML Playground. For the playground, use the credentials_content
instead.
Overrides contents of the Google Cloud Application Credentials. Default: env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
Only use this for the BAML Playground only. Use credentials
for your runtime code.
Directly set Google Cloud Authentication Token in lieu of token generation via env.GOOGLE_APPLICATION_CREDENTIALS
or env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
fields.
The default role for any prompts that don’t specify a role. Default: user
Additional headers to send with the request.
Example:
Which role metadata should we forward to the API? Default: []
For example you can set this to ["foo", "bar"]
to forward the cache policy to the API.
If you do not set allowed_role_metadata
, we will not forward any role metadata to the API even if it is set in the prompt.
Then in your prompt you can use something like:
You can use the playground to see the raw curl request to see what is being sent to the API.
Whether the internal LLM client should use the streaming API. Default: true
Then in your prompt you can use something like:
Forwarded options
Safety settings to apply to the request. You can stack different safety settings with a new safetySettings
header for each one. See the Google Vertex API Request Docs for more information on what safety settings can be set.
Generation configurations to apply to the request. See the Google Vertex API Request Docs for more information on what properties can be set.
For all other options, see the official Vertex AI documentation.