google-ai | Boundary Documentation

The google-ai provider supports the https://generativelanguage.googleapis.com/v1beta/models/{model_id}/generateContent and https://generativelanguage.googleapis.com/v1beta/models/{model_id}/streamGenerateContent endpoints.

The use of v1beta rather than v1 aligns with the endpoint conventions established in Google’s SDKs and offers access to both the existing v1 models and additional models exclusive to v1beta.

BAML will automatically pick streamGenerateContent if you call the streaming interface.

Example:

BAML

1 client<llm> MyClient {
2   provider google-ai
3   options {
4     model "gemini-2.5-flash"
5   }
6 }

BAML-specific request `options`

These unique parameters (aka options) modify the API request sent to the provider.

You can use this to modify the headers and base_url for example.

api_key

string

Will be passed as the x-goog-api-key header. Default: env.GOOGLE_API_KEY

x-goog-api-key: $api_key

base_url

string

The base URL for the API. Default: https://generativelanguage.googleapis.com/v1beta

model

string

The model to use. Default: gemini-2.5-flash

We don’t have any checks for this field, you can pass any string you wish.

Model	Use Case	Context	Key Features
gemini-2.5-pro	Complex tasks, coding, STEM	1M	Adaptive thinking, multimodal
gemini-2.5-flash	Production apps, balanced performance	1M	Best price/performance
gemini-2.5-flash-lite	High-volume, cost-sensitive	1M	Lowest cost, fastest

See the Google Model Docs for the latest models.

Some parameters, like temperature, for Gemini Models are specified in the generationConfig object. See Docs

headers

object

Additional headers to send with the request.

Example:

BAML

1 client<llm> MyClient {
2   provider google-ai
3   options {
4     model "gemini-2.5-flash"
5     headers {
6       "X-My-Header" "my-value"
7     }
8     generationConfig {
9       temperature 0.5
10     }
11   }
12 }

default_role

string

The role to use if the role is not in the allowed_roles. Default: "user" usually, but some models like OpenAI’s gpt-5 will use "system"

Picked the first role in allowed_roles if not “user”, otherwise “user”.

allowed_roles

string[]

Which roles should we forward to the API? Default: ["system", "user", "assistant"] usually, but some models like OpenAI’s o1-mini will use ["user", "assistant"]

When building prompts, any role not in this list will be set to the default_role.

remap_roles

map<string, string>

A mapping to transform role names before sending to the API. Default: {} (no remapping)

For google-ai provider, the default is: { "assistant": "model" }

This allows you to use standard role names in your prompts (like “user”, “assistant”, “system”) but send different role names to the API. The remapping happens after role validation and default role assignment.

Example:

1 {
2   "user": "human",
3   "assistant": "ai",
4 }

With this configuration, {{ _.role("user") }} in your prompt will result in a message with role “human” being sent to the API.

allowed_role_metadata

string[]

Which role metadata should we forward to the API? Default: []

For example you can set this to ["foo", "bar"] to forward the cache policy to the API.

If you do not set allowed_role_metadata, we will not forward any role metadata to the API even if it is set in the prompt.

Then in your prompt you can use something like:

1 client<llm> Foo {
2   provider openai
3   options {
4     allowed_role_metadata: ["foo", "bar"]
5   }
6 }
7 
8 client<llm> FooWithout {
9   provider openai
10   options {
11   }
12 }
13 template_string Foo() #"
14   {{ _.role('user', foo={"type": "ephemeral"}, bar="1", cat=True) }}
15   This will be have foo and bar, but not cat metadata. But only for Foo, not FooWithout.
16   {{ _.role('user') }}
17   This will have none of the role metadata for Foo or FooWithout.
18 "#

You can use the playground to see the raw curl request to see what is being sent to the API.

supports_streaming

boolean

Whether the internal LLM client should use the streaming API. Default: true

Then in your prompt you can use something like:

1 client<llm> MyClientWithoutStreaming {
2   provider anthropic
3   options {
4     model claude-3-5-haiku-20241022
5     api_key env.ANTHROPIC_API_KEY
6     max_tokens 1000
7     supports_streaming false
8   }
9 }
10 
11 function MyFunction() -> string {
12   client MyClientWithoutStreaming
13   prompt #"Write a short story"#
14 }

1 # This will be streamed from your python code perspective, 
2 # but under the hood it will call the non-streaming HTTP API
3 # and then return a streamable response with a single event
4 b.stream.MyFunction()
5 
6 # This will work exactly the same as before
7 b.MyFunction()

finish_reason_allow_list

string[]

Which finish reasons are allowed? Default: null

version 0.73.0 onwards: This is case insensitive.

Will raise a BamlClientFinishReasonError if the finish reason is not in the allow list. See Exceptions for more details.

Note, only one of finish_reason_allow_list or finish_reason_deny_list can be set.

For example you can set this to ["stop"] to only allow the stop finish reason, all other finish reasons (e.g. length) will treated as failures that PREVENT fallbacks and retries (similar to parsing errors).

Then in your code you can use something like:

1 client<llm> MyClient {
2   provider "openai"
3   options {
4     model "gpt-5-mini"
5     api_key env.OPENAI_API_KEY
6     // Finish reason allow list will only allow the stop finish reason
7     finish_reason_allow_list ["stop"]
8   }
9 }

finish_reason_deny_list

string[]

Which finish reasons are denied? Default: null

version 0.73.0 onwards: This is case insensitive.

Will raise a BamlClientFinishReasonError if the finish reason is in the deny list. See Exceptions for more details.

Note, only one of finish_reason_allow_list or finish_reason_deny_list can be set.

For example you can set this to ["length"] to stop the function from continuing if the finish reason is length. (e.g. LLM was cut off because it was too long).

Then in your code you can use something like:

1 client<llm> MyClient {
2   provider "openai"
3   options {
4     model "gpt-5-mini"
5     api_key env.OPENAI_API_KEY
6     // Finish reason deny list will allow all finish reasons except length
7     finish_reason_deny_list ["length"]
8   }
9 }

`media_url_handler`

Controls how media URLs are processed before sending to the provider. This allows you to override the default behavior for handling images, audio, PDFs, and videos.

1 client<llm> MyClient {
2   provider openai
3   options {
4     media_url_handler {
5       image "send_base64"                    // Options: send_base64 | send_url | send_url_add_mime_type | send_base64_unless_google_url
6       audio "send_url"
7       pdf "send_url_add_mime_type"
8       video "send_url"
9     }
10   }
11 }

Options

Each media type can be configured with one of these modes:

send_base64 - Always download URLs and convert to base64 data URIs
send_url - Pass URLs through unchanged to the provider
send_url_add_mime_type - Ensure MIME type is present (may require downloading to detect)
send_base64_unless_google_url - Only process non-gs:// URLs (keep Google Cloud Storage URLs as-is)

Provider Defaults

If not specified, each provider uses these defaults:

Provider	Image	Audio	PDF	Video
OpenAI	`send_url`	`send_base64`	`send_url`	`send_url`
Anthropic	`send_url`	`send_url`	`send_base64`	`send_url`
Google AI	`send_base64_unless_google_url`	`send_url`	`send_url`	`send_url`
Vertex AI	`send_url_add_mime_type`	`send_url_add_mime_type`	`send_url`	`send_url`
AWS Bedrock	`send_base64`	`send_base64`	`send_base64`	`send_url`
Azure OpenAI	`send_url`	`send_base64`	`send_url`	`send_url`

When to Use

Use send_base64 when your provider doesn’t support external URLs and you need to embed media content
Use send_url when your provider handles URL fetching and you want to avoid the overhead of base64 conversion
Use send_url_add_mime_type when your provider requires MIME type information (e.g., Vertex AI)
Use send_base64_unless_google_url when working with Google Cloud Storage and want to preserve gs:// URLs

URL fetching happens at request time and may add latency. Consider caching or pre-converting frequently used media when using send_base64 mode.

Google AI uses send_base64_unless_google_url by default for images, which preserves Google Cloud Storage URLs (gs://) while converting other URLs to base64.

Provider request parameters

These are other options that are passed through to the provider, without modification by BAML. For example if the request has a temperature field, you can define it in the client here so every call has that set.

Consult the specific provider’s documentation for more information.

contents

DO NOT USE

BAML will auto construct this field for you from the prompt

For all other options, see the official Google Gemini API documentation.

BAML-specific request options

media_url_handler

Options

Provider Defaults

When to Use

Provider request parameters

BAML-specific request `options`

`media_url_handler`