azure-openai — Boundary Documentation

For azure-openai, we provide a client that can be used to interact with the OpenAI API hosted on Azure using the /chat/completions endpoint.

Example:

BAML

client<llm> MyClient {
  provider azure-openai
  options {
    resource_name "my-resource-name"
    deployment_id "my-deployment-id"
    // Alternatively, you can use the base_url field
    // base_url "https://my-resource-name.openai.azure.com/openai/deployments/my-deployment-id"
    api_version "2024-02-01"
    api_key env.AZURE_OPENAI_API_KEY
  }
}

api_version is required. Azure returns a "not found" error if the version is not specified.

Most options are passed through directly to the API; a few are handled by BAML itself. Here is a summary of the options:

BAML-specific request `options`

These parameters (also called options) modify the API request sent to the provider.

For example, you can use them to change the Azure API key, base URL, and API version.

api_key

string

Will be injected via the header API-KEY. Default: env.AZURE_OPENAI_API_KEY

API-KEY: $api_key

base_url

string

The base URL for the API. Default: https://${resource_name}.openai.azure.com/openai/deployments/${deployment_id}

May be used instead of resource_name and deployment_id.

deployment_id

string (required)

See the base_url field.

resource_name

string (required)

See the base_url field.

api_version

string (required)

Will be passed via a query parameter api-version.
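As a rough illustration of how these options fit together, the sketch below assembles a request URL and headers from resource_name, deployment_id, api_version, and api_key. The helper name build_azure_request is hypothetical and not part of BAML, and appending /chat/completions directly to the base URL is an assumption about how the pieces are joined.

```python
from urllib.parse import urlencode


def build_azure_request(resource_name, deployment_id, api_version, api_key, base_url=None):
    """Return (url, headers) for a chat completions call.

    Hypothetical helper for illustration only; not part of BAML.
    """
    # base_url, when given, replaces the resource_name/deployment_id pair.
    if base_url is None:
        base_url = (
            f"https://{resource_name}.openai.azure.com"
            f"/openai/deployments/{deployment_id}"
        )
    # api_version travels as the api-version query parameter.
    query = urlencode({"api-version": api_version})
    # Assumption: the endpoint path is appended directly to the base URL.
    url = f"{base_url.rstrip('/')}/chat/completions?{query}"
    # api_key travels in the API-KEY header.
    headers = {"API-KEY": api_key}
    return url, headers
```

Calling it with the values from the example client shows which part of the final URL each option contributes to.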

headers

object

Additional headers to send with the request.

Example:

BAML

client<llm> MyClient {
  provider azure-openai
  options {
    resource_name "my-resource-name"
    deployment_id "my-deployment-id"
    api_version "2024-02-01"
    api_key env.AZURE_OPENAI_API_KEY
    headers {
      "X-My-Header" "my-value"
    }
  }
}

default_role

string

The role to use if the role is not in the allowed_roles. Default: "user" usually, but some models like OpenAI’s gpt-4o will use "system"

If not set, BAML picks "user" when it is in allowed_roles; otherwise it picks the first role in allowed_roles.

allowed_roles

string[]

Which roles should we forward to the API? Default: ["system", "user", "assistant"] usually, but some models like OpenAI’s o1-mini will use ["user", "assistant"]

When building prompts, any role not in this list will be set to the default_role.
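The interaction between allowed_roles and default_role can be sketched in a few lines of Python. This is only an illustration of the rule described here, not BAML's implementation: the function names are hypothetical, and the fallback choice ("user" if allowed, otherwise the first allowed role) is one reading of the default described above.

```python
def resolve_default_role(allowed_roles, default_role=None):
    """Pick the fallback role (assumed rule, for illustration)."""
    if default_role is not None:
        return default_role
    # Assumption: prefer "user" when allowed, else the first allowed role.
    return "user" if "user" in allowed_roles else allowed_roles[0]


def normalize_roles(messages, allowed_roles=("system", "user", "assistant"), default_role=None):
    """Rewrite any role outside allowed_roles to the fallback role."""
    fallback = resolve_default_role(allowed_roles, default_role)
    return [
        {**m, "role": m["role"] if m["role"] in allowed_roles else fallback}
        for m in messages
    ]
```

With allowed_roles set to ["user", "assistant"], a message written with the "system" role would be sent with the "user" role instead.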

allowed_role_metadata

string[]

Which role metadata should we forward to the API? Default: []

For example, you can set this to ["foo", "bar"] to forward the foo and bar metadata (such as a cache policy) to the API.

If you do not set allowed_role_metadata, we will not forward any role metadata to the API even if it is set in the prompt.

Then in your prompt you can use something like:

client<llm> Foo {
  provider openai
  options {
    allowed_role_metadata: ["foo", "bar"]
  }
}

client<llm> FooWithout {
  provider openai
  options {
  }
}

template_string Foo() #"
  {{ _.role('user', foo={"type": "ephemeral"}, bar="1", cat=True) }}
  This will have foo and bar, but not cat metadata. But only for Foo, not FooWithout.
  {{ _.role('user') }}
  This will have none of the role metadata for Foo or FooWithout.
"#

You can use the playground to see the raw curl request to see what is being sent to the API.
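The filtering rule for role metadata can be pictured as a simple key filter. The sketch below is a hypothetical Python illustration (filter_role_metadata is not a BAML function) that mirrors the Foo and FooWithout clients: only keys listed in allowed_role_metadata survive, and an empty list forwards nothing.

```python
def filter_role_metadata(metadata, allowed_role_metadata=()):
    """Keep only the metadata keys that the client allows.

    Hypothetical helper for illustration; the default of () matches
    the documented default of [] (forward nothing).
    """
    allowed = set(allowed_role_metadata)
    return {key: value for key, value in metadata.items() if key in allowed}


metadata = {"foo": {"type": "ephemeral"}, "bar": "1", "cat": True}

# Like client Foo: foo and bar are forwarded, cat is dropped.
forwarded_foo = filter_role_metadata(metadata, ["foo", "bar"])

# Like client FooWithout: nothing is forwarded.
forwarded_without = filter_role_metadata(metadata)
```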

supports_streaming

boolean

Whether the internal LLM client should use the streaming API. Default: true

Then in your prompt you can use something like:

client<llm> MyClientWithoutStreaming {
  provider anthropic
  options {
    model claude-3-haiku-20240307
    api_key env.ANTHROPIC_API_KEY
    max_tokens 1000
    supports_streaming false
  }
}

function MyFunction() -> string {
  client MyClientWithoutStreaming
  prompt #"Write a short story"#
}

# The generated BAML client exposes `b`
from baml_client import b

# This will be streamed from your Python code's perspective,
# but under the hood it will call the non-streaming HTTP API
# and then return a streamable response with a single event
b.stream.MyFunction()

# This will work exactly the same as before
b.MyFunction()

finish_reason_allow_list

string[]

Which finish reasons are allowed? Default: null

From version 0.73.0 onwards, this is case-insensitive.

Will raise a BamlClientFinishReasonError if the finish reason is not in the allow list. See Exceptions for more details.

Note, only one of finish_reason_allow_list or finish_reason_deny_list can be set.

For example, you can set this to ["stop"] to allow only the stop finish reason. All other finish reasons (e.g. length) will be treated as failures that PREVENT fallbacks and retries (similar to parsing errors).

Then in your code you can use something like:

client<llm> MyClient {
  provider "openai"
  options {
    model "gpt-4o-mini"
    api_key env.OPENAI_API_KEY
    // Finish reason allow list will only allow the stop finish reason
    finish_reason_allow_list ["stop"]
  }
}

finish_reason_deny_list

string[]

Which finish reasons are denied? Default: null

From version 0.73.0 onwards, this is case-insensitive.

Will raise a BamlClientFinishReasonError if the finish reason is in the deny list. See Exceptions for more details.

Note, only one of finish_reason_allow_list or finish_reason_deny_list can be set.

For example, you can set this to ["length"] to stop the function from continuing if the finish reason is length (e.g. the LLM output was cut off because it was too long).

Then in your code you can use something like:

client<llm> MyClient {
  provider "openai"
  options {
    model "gpt-4o-mini"
    api_key env.OPENAI_API_KEY
    // Finish reason deny list will allow all finish reasons except length
    finish_reason_deny_list ["length"]
  }
}
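To make the allow and deny semantics concrete, here is a minimal Python sketch of the check as described in this section: case-insensitive matching, only one of the two lists set, and an error raised on a rejected finish reason. The function check_finish_reason and the FinishReasonError class are hypothetical stand-ins, not BAML's actual code or its BamlClientFinishReasonError type.

```python
class FinishReasonError(Exception):
    """Hypothetical stand-in for BamlClientFinishReasonError."""


def check_finish_reason(finish_reason, allow_list=None, deny_list=None):
    """Return finish_reason if accepted, otherwise raise FinishReasonError."""
    # Only one of the two lists may be set.
    if allow_list is not None and deny_list is not None:
        raise ValueError("set only one of allow_list or deny_list")
    # Matching is case-insensitive.
    reason = finish_reason.lower()
    if allow_list is not None and reason not in {r.lower() for r in allow_list}:
        raise FinishReasonError(finish_reason)
    if deny_list is not None and reason in {r.lower() for r in deny_list}:
        raise FinishReasonError(finish_reason)
    return finish_reason
```

With neither list set (the default of null), every finish reason is accepted.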

client_response_type

openai | anthropic | google | vertex (defaults to openai)

Please let us know on Discord if you have this use case! This is in alpha and we’d like to make sure we continue to cover your use cases.

The type of response to return from the client.

Sometimes you may expect a different response format than the provider default. For example, using Azure you may be proxying to an endpoint that returns a different format than the OpenAI default.

Default: openai

Provider request parameters

These are other options that are passed through to the provider, without modification by BAML. For example if the request has a temperature field, you can define it in the client here so every call has that set.

Consult the specific provider’s documentation for more information.

For reasoning models (like o1 or o1-mini), you must use max_completion_tokens instead of max_tokens. Please set max_tokens to null in order to get this to work.

See the OpenAI API documentation and OpenAI Reasoning Docs for more details about token handling.

Example:

BAML

client<llm> AzureO1 {
  provider azure-openai
  options {
    deployment_id "o1-mini"
    max_tokens null
  }
}

messages

DO NOT USE

BAML constructs this field automatically from the prompt.

stream

DO NOT USE

BAML constructs this field automatically based on how you call the client in your code.

For all other options, see the official Azure API documentation.
