Multi-Modal (Images / Audio)
Multi-modal input
You can use audio or image input types in BAML prompts. Just create an input argument of that type and render it in the prompt.
Check the “raw curl” checkbox in the playground to see how BAML translates multi-modal input into the LLM Request body.
See how to test images in the playground.
Calling Multimodal BAML Functions
Images
To call a BAML function that has an image input argument, pass it an Image object (see image types).
The from_url and from_base64 methods create an Image object from a URL or from base64-encoded data, respectively.
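As a rough illustration of the two constructors, the sketch below prepares base64 input from raw bytes and then builds Image objects in Python. It assumes the Python runtime is importable as baml_py and that from_base64 takes the media type followed by the base64 string. The encode_image helper, the example URL, and the DescribeImage function named in the final comment are hypothetical names used only for this example.

```python
import base64
import mimetypes


def encode_image(data: bytes, filename: str) -> tuple[str, str]:
    """Return (media_type, base64_text) for raw image bytes.

    Hypothetical helper: the media type is guessed from the file extension only.
    """
    media_type, _ = mimetypes.guess_type(filename)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"cannot infer an image media type from {filename!r}")
    return media_type, base64.b64encode(data).decode("ascii")


# Placeholder bytes (the 8-byte PNG signature), not a complete image.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
media_type, image_b64 = encode_image(PNG_SIGNATURE, "photo.png")

try:
    from baml_py import Image  # assumed import path for the BAML Python runtime
except ImportError:
    Image = None  # runtime not installed; the helper above still works

if Image is not None:
    # Build an Image from a (hypothetical) URL.
    img_from_url = Image.from_url("https://example.com/photo.png")
    # Build an Image from a media type plus base64-encoded data.
    img_from_b64 = Image.from_base64(media_type, image_b64)
    # Either object would then be passed to a generated client function,
    # for example (hypothetical function name):
    #   result = await b.DescribeImage(img=img_from_b64)
```

Which constructor to use depends on where the image lives: a URL avoids sending the bytes from your process, while base64 works for local or generated images.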
Audio
To call a function that has an audio input argument, pass it a value of the audio type (see audio types).
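A minimal Python sketch for audio follows. It assumes that baml_py exposes an Audio class with from_url and from_base64 constructors analogous to Image; check the audio types reference before relying on this. The silent_wav_base64 helper, the example URL, and the TranscribeAudio function named in the final comment are hypothetical. The helper only produces a short silent WAV clip in memory so the example has data to encode.

```python
import base64
import io
import wave


def silent_wav_base64(num_frames: int = 800, sample_rate: int = 8000) -> str:
    """Return a base64 string holding a short, silent, mono 16-bit WAV clip."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * num_frames)  # silence
    return base64.b64encode(buf.getvalue()).decode("ascii")


audio_b64 = silent_wav_base64()

try:
    from baml_py import Audio  # assumed import path and class name
except ImportError:
    Audio = None  # runtime not installed; the helper above still works

if Audio is not None:
    # Assumed constructors, mirroring Image.from_url / Image.from_base64.
    audio_from_url = Audio.from_url("https://example.com/clip.ogg")
    audio_from_b64 = Audio.from_base64("audio/wav", audio_b64)
    # The object would then be passed to a generated client function,
    # for example (hypothetical function name):
    #   result = await b.TranscribeAudio(audio=audio_from_b64)
```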