Testing functions

You can test your BAML functions in the VSCode Playground by adding a test snippet into a BAML file:

enum Category {
    Refund
    CancelOrder
    TechnicalSupport
    AccountIssue
    Question
}

function ClassifyMessage(input: string) -> Category {
  client GPT4Turbo
  prompt #"
    ... truncated ...
  "#
}

test Test1 {
  functions [ClassifyMessage]
  args {
    // input is the first argument of ClassifyMessage
    input "Can't access my account using my usual login credentials, and each attempt results in an error message stating 'Invalid username or password.' I have tried resetting my password using the 'Forgot Password' link, but I haven't received the promised password reset email."
  }
  // 'this' is the output of the function
  @@assert( {{ this == "AccountIssue" }})
}

The BAML playground provides a starter snippet that matches your function signature, which you can copy into your BAML file.

BAML doesn’t use colons (:) between key-value pairs, except in function parameters.

Complex object inputs

Objects are injected as dictionaries:

class Message {
  user string
  content string
}

// The parameter type is a list of the Message class defined above
function ClassifyMessage(messages: Message[]) -> Category {
...
}

test Test1 {
  functions [ClassifyMessage]
  args {
    messages [
      {
        user "hey there"
        // multi-line string using the #"..."# syntax
        content #"
          You can also add a multi-line
          string with the hashtags
          Instead of ugly json with \n
        "#
      }
    ]
  }
}

Test Image Inputs in the Playground

For a function that takes an image as input, like so:

function MyFunction(myImage: image) -> string {
  client GPT4o
  prompt #"
    Describe this image: {{myImage}}
  "#
}

You can define test cases using image files, URLs, or base64 strings.

File

Committing many images to your repository can make it slow to clone and pull. If you expect to commit more than 500 MiB of images, read GitHub’s size limit documentation and consider setting up large file storage.

test Test1 {
  functions [MyFunction]
  args {
    myImage {
      file "../path/to/image.png"
    }
  }
}

file (string, required)

The path to the image file, relative to the directory containing the current BAML file. Image files must be somewhere in baml_src/.

media_type (string, optional)

The MIME type of the image. If it is not set and the provider expects a MIME type, BAML tries to infer it, first from the file extension and then from the contents of the file.
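
The URL and base64 forms are not shown above. The following is a minimal sketch of what they might look like, assuming the keys are named url and base64 and sit where file does in the previous example; confirm the exact names against the starter snippet the playground generates. The URL, the base64 payload, and the test names are placeholders.

```baml
// Assumption: a `url` key replaces `file` for remote images.
test UrlImageTest {
  functions [MyFunction]
  args {
    myImage {
      url "https://example.com/image.png"
    }
  }
}

// Assumption: a `base64` key carries the encoded bytes. With no file
// extension to infer from, setting media_type explicitly is the safer choice.
test Base64ImageTest {
  functions [MyFunction]
  args {
    myImage {
      base64 "<base64-encoded image data>"
      media_type "image/png"
    }
  }
}
```

The URL and base64 forms avoid committing binary files to the repository, at the cost of depending on a remote host or carrying a long string in the BAML file.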

Test Audio Inputs in the Playground

For a function that takes audio as input, like so:

function MyFunction(myAudio: audio) -> string {
  client GPT4o
  prompt #"
    Describe this audio: {{myAudio}}
  "#
}

You can define test cases using audio files, URLs, or base64 strings.

File

Committing many audio files to your repository can make it slow to clone and pull. If you expect to commit more than 500 MiB of audio, read GitHub’s size limit documentation and consider setting up large file storage.

test Test1 {
  functions [MyFunction]
  args {
    myAudio {
      file "../path/to/audio.mp3"
    }
  }
}

file (string, required)

The path to the audio file, relative to the directory containing the current BAML file. Audio files must be somewhere in baml_src/.

media_type (string, optional)

The MIME type of the audio. If it is not set and the provider expects a MIME type, BAML tries to infer it, first from the file extension and then from the contents of the file.
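
Audio inputs given by URL or base64 string would presumably mirror the file form. This is a sketch only, assuming the keys are named url and base64; confirm the exact names against the playground's generated snippet. The URL, the payload, and the test names are placeholders.

```baml
// Assumption: a `url` key replaces `file` for remote audio.
test UrlAudioTest {
  functions [MyFunction]
  args {
    myAudio {
      url "https://example.com/audio.mp3"
    }
  }
}

// Assumption: a `base64` key carries the encoded bytes.
// audio/mpeg is the registered MIME type for MP3 data.
test Base64AudioTest {
  functions [MyFunction]
  args {
    myAudio {
      base64 "<base64-encoded audio data>"
      media_type "audio/mpeg"
    }
  }
}
```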

Assertions

Test blocks in BAML code may contain checks and asserts. These attributes behave similarly to value-level Checks and Asserts, with several additional variables available in the Jinja expressions you can write in a test:

The _ variable contains the fields result, checks, and latency_ms.
The this variable refers to the value computed by the test, and is shorthand for _.result.
In a given check or assert, _.checks.$NAME refers to the result of an earlier check named NAME that ran in the same test block. By referring to prior checks, you can build compound checks and asserts, for example asserting that all checks of a certain type passed.

The following example illustrates how each of these features can be used to validate a test result.

test MyTest {
  functions [EchoString]
  args {
    input "example input"
  }
  @@check( nonempty, {{ this|length > 0 }} )
  @@check( small_enough, {{ _.result|length < 1000 }} )
  @@assert( {{ _.checks.nonempty and _.checks.small_enough }})
  @@assert( {{ _.latency_ms < 1000 }})
}

@@check and @@assert behave differently:

A @@check represents a property of the test result that should either be manually checked or checked by a subsequent stage in the test. Multiple @@check predicates can fail without causing a hard failure of the test.
An @@assert represents a hard guarantee. The first failing assert will halt the remainder of the checks and asserts in this particular test.

For more information about the syntax used inside @@check and @@assert attributes, see Checks and Asserts.
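
To make the difference concrete, here is a sketch that applies both attributes to the ClassifyMessage function from the first example. The test name, the input text, the expected category, and the latency threshold are illustrative assumptions, not recommended values.

```baml
test RefundCheckThenAssert {
  functions [ClassifyMessage]
  args {
    input "I was charged twice for my last order and would like my money back."
  }
  // Soft properties: a failing check is recorded, and evaluation continues.
  @@check( is_refund, {{ this == "Refund" }} )
  @@check( fast_enough, {{ _.latency_ms < 2000 }} )
  // Hard guarantee: if this fails, the remaining checks and asserts
  // in this test block are not evaluated.
  @@assert( {{ _.checks.is_refund }} )
  // Evaluated only when the assert above passed.
  @@check( not_a_question, {{ this != "Question" }} )
}
```

Placing checks before the assert that references them matters, because _.checks can only see checks that have already run in the same test block.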

Dynamic Types Tests

Classes and enums marked with the @@dynamic attribute can be modified in tests using the type_builder and dynamic blocks.

class DynamicClass {
    static_prop string
    @@dynamic
}

function ReturnDynamicClass(input: string) -> DynamicClass {
    // ...
}

test DynamicClassTest {
    functions [ReturnDynamicClass]
    type_builder {
        dynamic class DynamicClass {
            new_prop_here string
        }
    }
    args {
        input "test data"
    }
}

The type_builder block can contain new types scoped to the parent test block and also dynamic blocks that act as modifiers for dynamic classes or enums.
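
As a sketch of combining the two, the test below declares a new class inside type_builder and then uses it as the type of a property added to DynamicClass. The Address class, its fields, and the address property are hypothetical names chosen for illustration; DynamicClass and ReturnDynamicClass are the definitions from the example above.

```baml
test DynamicClassWithNewTypeTest {
    functions [ReturnDynamicClass]
    type_builder {
        // New type, scoped to this test block only (hypothetical)
        class Address {
            street string
            city string
        }
        // Modifier for the existing class marked @@dynamic
        dynamic class DynamicClass {
            address Address
        }
    }
    args {
        input "test data"
    }
}
```

Because Address is scoped to the test, it does not need to exist anywhere else in baml_src/.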