AsyncClient / SyncClient

BAML generates both a sync client and an async client. They offer the exact same public API but methods are either synchronous or asynchronous.

BAML Functions

The generated client exposes every function you've defined in your BAML files as a method. Suppose we have this file named baml_src/literature.baml:

baml_src/literature.baml
function TellMeAStory() -> string {
  client "openai/gpt-4o"
  prompt #"
    Tell me a story
  "#
}

function WriteAPoemAbout(input: string) -> string {
  client "openai/gpt-4o"
  prompt #"
    Write a poem about {{ input }}
  "#
}

After running baml-cli generate, you can call these functions directly from your code. Here's an example using the async client:

Python
from baml_client.async_client import b

async def example():
    # Call your BAML functions.
    story = await b.TellMeAStory()
    poem = await b.WriteAPoemAbout("Roses")

The sync client exposes the same methods but doesn't need an async runtime. Each call simply blocks until the result is ready.

Python
from baml_client.sync_client import b

def example():
    # Call your BAML functions.
    story = b.TellMeAStory()
    poem = b.WriteAPoemAbout("Roses")
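
One practical difference between the two clients is that awaitable calls can be run concurrently. The following is a minimal sketch of that idea using only the standard library. The functions tell_me_a_story and write_a_poem_about are hypothetical stand-ins for b.TellMeAStory and b.WriteAPoemAbout so that the snippet runs without a generated client; they are not part of BAML.

```python
import asyncio


# Hypothetical stand-ins for the generated async methods b.TellMeAStory and
# b.WriteAPoemAbout; each sleeps briefly to mimic network latency.
async def tell_me_a_story() -> str:
    await asyncio.sleep(0.01)
    return "Once upon a time..."


async def write_a_poem_about(topic: str) -> str:
    await asyncio.sleep(0.01)
    return f"A poem about {topic}"


async def run_concurrently() -> tuple[str, str]:
    # asyncio.gather runs both coroutines concurrently and returns the
    # results in the order the awaitables were passed in.
    story, poem = await asyncio.gather(
        tell_me_a_story(),
        write_a_poem_about("Roses"),
    )
    return story, poem


if __name__ == "__main__":
    # asyncio.run bridges a synchronous entry point into async code.
    print(asyncio.run(run_concurrently()))
```

With the real async client, the same pattern would presumably apply by passing the awaitables returned by the generated methods to asyncio.gather.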

Call Patterns

Besides the function methods themselves, the client object exposes helper objects (.stream, .request, .stream_request, .parse and .parse_stream) that call your functions in different ways.

.stream

The .stream object is used to stream the response from a function.

Python
from baml_client.async_client import b

async def example():
    stream = b.stream.TellMeAStory()

    async for partial in stream:
        print(partial)

    print(await stream.get_final_response())

.request

This feature was added in: v0.79.0

The .request object builds the raw HTTP request for a function call but does not send it. The async client still returns an awaitable, because building the request may require resolving media inputs such as images and converting them to base64 or whatever format the LLM requires.

Python
from baml_client.async_client import b

async def example():
    request = await b.request.TellMeAStory()
    print(request.url)
    print(request.headers)
    print(request.body.json())
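
Because the request is meant to be sent to a provider, its headers may carry credentials, so printing them verbatim can leak a key into logs. The sketch below masks such values before logging. It assumes the headers behave like a string-to-string mapping; redact_headers and the SENSITIVE_HEADERS set are illustrative names, not part of the generated client.

```python
from typing import Mapping

# Header names treated as sensitive here are an assumption; extend as needed.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "api-key"}


def redact_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of headers with sensitive values masked for logging."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


if __name__ == "__main__":
    # Example values are made up; in practice you would pass request.headers.
    example = {
        "Authorization": "Bearer sk-test",
        "Content-Type": "application/json",
    }
    print(redact_headers(example))
```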

.stream_request

This feature was added in: v0.79.0

Same as .request, but the request it builds has the streaming option set to true.

Python
from baml_client.async_client import b

async def example():
    request = await b.stream_request.TellMeAStory()
    print(request.url)
    print(request.headers)
    print(request.body.json())

.parse

This feature was added in: v0.79.0

The .parse object parses the raw text returned by the LLM into the function's return type. It can be used in combination with .request, so that you send the HTTP request yourself and hand the response back to BAML for parsing.

Python
import requests
# requests is not async so for simplicity we'll use the sync client.
from baml_client.sync_client import b

def example():
    # Get the HTTP request.
    request = b.request.TellMeAStory()

    # Send the HTTP request.
    response = requests.post(request.url, headers=request.headers, json=request.body.json())

    # Parse the LLM response.
    parsed = b.parse.TellMeAStory(response.json()["choices"][0]["message"]["content"])

    # Fully parsed response.
    print(parsed)
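
The example above indexes straight into response.json(), which raises an unhelpful KeyError or IndexError if the provider returns an error body instead of a completion. A more defensive extraction step might look like the following sketch. It assumes an OpenAI Chat Completions-style payload; extract_message_content is a hypothetical helper, not part of the generated client.

```python
from typing import Any


def extract_message_content(payload: dict[str, Any]) -> str:
    """Pull the assistant text out of a Chat Completions-style response body."""
    choices = payload.get("choices") or []
    if not choices:
        # Error responses typically carry an "error" object instead of choices.
        raise ValueError(f"no choices in response: {payload.get('error')!r}")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError("first choice has no text content")
    return content
```

The returned string is what you would then pass to b.parse.TellMeAStory.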

.parse_stream

This feature was added in: v0.79.0

Same as .parse, but for streaming responses: it parses the partial text received so far. It can be used in combination with .stream_request.

Python
from openai import AsyncOpenAI
from baml_client.async_client import b

async def example():
    client = AsyncOpenAI()

    request = await b.stream_request.TellMeAStory()
    stream = await client.chat.completions.create(**request.body.json())

    llm_response: list[str] = []
    async for chunk in stream:
        if len(chunk.choices) > 0 and chunk.choices[0].delta.content is not None:
            llm_response.append(chunk.choices[0].delta.content)
            # Parse everything received so far, not just the latest delta.
            print(b.parse_stream.TellMeAStory("".join(llm_response)))
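
Note that the example passes the whole text accumulated so far to .parse_stream, not the latest delta alone. The accumulation step can be isolated as in this sketch, where accumulate_deltas is a hypothetical helper and the deltas are plain strings (or None for chunks without text) rather than real provider chunks.

```python
from typing import Iterable, Iterator, Optional


def accumulate_deltas(deltas: Iterable[Optional[str]]) -> Iterator[str]:
    """Yield the full text received so far after each non-empty delta."""
    parts: list[str] = []
    for delta in deltas:
        if delta is None:
            # Some chunks (for example role-only or final chunks) carry no text.
            continue
        parts.append(delta)
        yield "".join(parts)


if __name__ == "__main__":
    for text_so_far in accumulate_deltas(["Once", None, " upon", " a time"]):
        print(text_so_far)
```

Each yielded string is the kind of value you would hand to b.parse_stream.TellMeAStory.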