Requires BAML version >=0.79.0
BAML’s high-level API treats functions as first-class citizens and hides the details of their execution from the developer. You simply call a BAML function, and prompt rendering, HTTP request building, the LLM API network call, and response parsing are all handled for you. Basic example:
Now we can use this function in our server code after running baml-cli generate:
However, sometimes we may want to execute a function with less abstraction
or inspect the HTTP request before sending it. For this, BAML provides a
lower-level API that exposes the HTTP request and the LLM response parser to the
caller. Here’s an example that uses the requests library in Python, the
fetch API in Node.js and the Net::HTTP library in Ruby to manually send an
HTTP request to OpenAI’s API and parse the LLM response.
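The Python flow described above can be sketched as follows. This is a minimal sketch, not the generated client's definitive usage: `ExtractResume` is a hypothetical BAML function name, the generated client `b` and the HTTP `post` callable are passed in as parameters so the snippet does not depend on generated code, and the response indexing assumes OpenAI's chat completions response shape.

```python
from typing import Any, Callable


def call_function_manually(b: Any, post: Callable[..., Any], resume_text: str) -> Any:
    """Build the request with BAML, send it ourselves, then parse the reply.

    `b` is the generated BAML client (sync flavour assumed) and `post` is an
    HTTP POST callable with a `requests.post`-style signature.
    `ExtractResume` is a hypothetical function name used for illustration.
    """
    # 1. Let BAML render the prompt and build the HTTP request without sending it.
    req = b.request.ExtractResume(resume_text)

    # 2. Send the request ourselves. The body is a dict, so `json=` serializes it.
    res = post(req.url, headers=req.headers, json=req.body.json())

    # 3. Extract the assistant message (assumes the chat completions response shape).
    content = res.json()["choices"][0]["message"]["content"]

    # 4. Hand the raw LLM text back to BAML to get the typed result.
    return b.parse.ExtractResume(content)
```

With the generated sync client this would be called as `call_function_manually(b, requests.post, "John Doe | Software Engineer")`, where `b` comes from `baml_client.sync_client`.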
Note that request.body.json() returns an object (dict in Python, hash in Ruby)
which we are then serializing to JSON, but request.body also exposes the raw
binary buffer so we can skip the serialization:
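A minimal sketch of sending the raw buffer instead of re-serializing the dict. The accessor name `raw()` is an assumption here (the text only says the buffer is exposed), so check the generated client for the exact name. As before, `post` is any `requests.post`-style callable passed in by the caller.

```python
from typing import Any, Callable


def send_raw_body(req: Any, post: Callable[..., Any]) -> Any:
    """Forward the already-serialized request body as bytes.

    ASSUMPTION: the raw buffer is exposed as `req.body.raw()` and returns bytes.
    """
    raw_body = req.body.raw()

    # `data=` sends the bytes untouched, so no second JSON serialization happens.
    return post(req.url, headers=req.headers, data=raw_body)
```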
We can use the same modular API with the official SDKs. Here are some examples:
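For instance, with the OpenAI Python SDK the request body can be unpacked straight into the SDK call. This is a hedged sketch: `ExtractResume` is a hypothetical function name, `client` is assumed to be an `openai.OpenAI()` instance passed in by the caller, and the body is assumed to contain only keyword arguments that `chat.completions.create` accepts.

```python
from typing import Any


def call_with_openai_sdk(b: Any, client: Any, resume_text: str) -> Any:
    """Use BAML only for the prompt and the parsing; the SDK does the network call."""
    # Build the request body with BAML (hypothetical function name).
    req = b.request.ExtractResume(resume_text)

    # Unpack the body dict as keyword arguments for the SDK call.
    res = client.chat.completions.create(**req.body.json())

    # Parse the assistant message with BAML's response parser.
    return b.parse.ExtractResume(res.choices[0].message.content)
```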
The OpenAI Responses API uses the /v1/responses endpoint and is designed for enhanced reasoning capabilities. BAML supports this through the openai-responses provider:
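A minimal sketch of the same pattern against the Responses API, assuming the BAML function's client uses the openai-responses provider so that the request body matches the keyword arguments of the SDK's `responses.create`. `ExtractResume` is a hypothetical function name and `client` is assumed to be an `openai.OpenAI()` instance.

```python
from typing import Any


def call_with_responses_api(b: Any, client: Any, resume_text: str) -> Any:
    """Send a BAML-built request through the OpenAI SDK's Responses API."""
    req = b.request.ExtractResume(resume_text)

    # The body is assumed to target /v1/responses, so it unpacks into responses.create.
    res = client.responses.create(**req.body.json())

    # `output_text` is the SDK's convenience accessor for the concatenated text output.
    return b.parse.ExtractResume(res.output_text)
```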
Remember that the client is defined in the BAML function (or you can use the client registry):
The modular API now returns requests for Bedrock’s Converse API. You can modify the request, sign it, and forward it with any HTTP client. Signing with the SignatureV4 SDK is required; we provide examples of how to do this below.
ℹ️ Streaming modular requests are not yet supported for Bedrock. Call
b.request (non-streaming) when targeting AWS, and re-sign after any modifications to the body or headers.
The return type of request.body.json() is Any, so you won’t get full type
checking in Python when using the SDKs. Here are some workarounds:
1. Using typing.cast
2. Manually setting the arguments
This preserves the type hints for the OpenAI SDK, but it doesn’t work for Anthropic. The Gemini SDK / REST API, on the other hand, is built in a way that essentially forces this pattern, as seen in the example above.
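Both workarounds can be sketched without any SDK installed. In this illustration `ChatParams` and `create` are hypothetical stand-ins for an SDK's parameter type and its typed call; with a real SDK you would cast to the parameter types that SDK exports instead.

```python
from typing import Any, Dict, List, TypedDict, cast


class ChatParams(TypedDict):
    """Hypothetical stand-in for an SDK's request-parameters type."""

    model: str
    messages: List[Dict[str, Any]]


def create(*, model: str, messages: List[Dict[str, Any]]) -> str:
    """Hypothetical stand-in for a typed SDK call."""
    return f"{model}:{len(messages)}"


def with_cast(body: Any) -> str:
    # Workaround 1: tell the type checker which shape the untyped body has.
    # cast() does nothing at runtime, so the body is not validated.
    params = cast(ChatParams, body)
    return create(**params)


def with_explicit_arguments(body: Any) -> str:
    # Workaround 2: pass each argument by name, so the checker can at least
    # verify the argument names against the call's signature.
    return create(model=body["model"], messages=body["messages"])
```

Here `body` plays the role of the value returned by request.body.json(). Neither variant changes runtime behaviour; they only affect what the type checker can see.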
TypeScript doesn’t have keyword arguments like Python; SDK methods take a single options object instead, so you can just cast the body to the expected type:
Streaming requests and parsing are also supported. Here’s an example using the OpenAI SDK:
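A minimal Python sketch of that streaming flow, written as a generator that yields each partial parse followed by the final parse. Assumptions: `ExtractResume` is a hypothetical function name, `client` is a synchronous `openai.OpenAI()` instance passed in by the caller, and the body built by `b.stream_request` already asks the API to stream.

```python
from typing import Any, Iterator, List


def stream_with_openai_sdk(b: Any, client: Any, resume_text: str) -> Iterator[Any]:
    """Yield partial results while the response streams in, then the final result."""
    # Build a streaming request (hypothetical function name).
    req = b.stream_request.ExtractResume(resume_text)
    stream = client.chat.completions.create(**req.body.json())

    pieces: List[str] = []
    for chunk in stream:
        # Some chunks carry no choices or no text delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content is not None:
            pieces.append(chunk.choices[0].delta.content)
            # Parse everything received so far into a partial result.
            yield b.parse_stream.ExtractResume("".join(pieces))

    # Once the stream ends, parse the complete text into the final type.
    yield b.parse.ExtractResume("".join(pieces))
```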
Currently, BAML doesn’t support OpenAI’s Batch API out of the box, but you can use the modular API to build the prompts and parse the responses of batch jobs. Here’s an example:
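One way to sketch this in Python is to build the batch input file with BAML and parse the batch output file afterwards. Uploading the file and creating the batch job are left to the OpenAI SDK and omitted here. Assumptions: `ExtractResume` is a hypothetical function name, the batch targets /v1/chat/completions, and the input and output lines follow the JSONL layout of OpenAI's Batch API.

```python
import json
from typing import Any, Dict, List


def build_batch_jsonl(b: Any, resumes: Dict[str, str]) -> str:
    """Return batch input JSONL: one BAML-built request body per custom_id."""
    lines: List[str] = []
    for custom_id, text in resumes.items():
        req = b.request.ExtractResume(text)  # hypothetical function name
        lines.append(json.dumps({
            "custom_id": custom_id,
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": req.body.json(),
        }))
    return "\n".join(lines)


def parse_batch_output(b: Any, output_jsonl: str) -> Dict[str, Any]:
    """Map each custom_id to its parsed result, skipping failed requests."""
    results: Dict[str, Any] = {}
    for line in output_jsonl.splitlines():
        if not line.strip():
            continue
        item = json.loads(line)
        # Requests that failed carry no response; skip them here.
        if item.get("response") is None:
            continue
        content = item["response"]["body"]["choices"][0]["message"]["content"]
        results[item["custom_id"]] = b.parse.ExtractResume(content)
    return results
```

Keying results by custom_id matters because batch output lines are not guaranteed to come back in input order.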