Testing functions

You can test your BAML functions in the VSCode Playground by adding a test snippet into a BAML file:

enum Category {
  Refund
  CancelOrder
  TechnicalSupport
  AccountIssue
  Question
}

function ClassifyMessage(input: string) -> Category {
  client GPT4Turbo
  prompt #"
    ... truncated ...
  "#
}

test Test1 {
  functions [ClassifyMessage]
  args {
    // input is the first argument of ClassifyMessage
    input "Can't access my account using my usual login credentials, and each attempt results in an error message stating 'Invalid username or password.' I have tried resetting my password using the 'Forgot Password' link, but I haven't received the promised password reset email."
  }
  // this is the output of the function; the enum value is compared as a string
  @@assert( {{ this == "AccountIssue" }})
}

See the interactive examples

The BAML playground generates a starter test snippet that matches your function signature, which you can copy into your BAML file.

BAML doesn’t use colons : between key-value pairs except in function parameters.


Complex object inputs

Objects are injected as dictionaries, written as key-value pairs inside braces:

class Message {
  user string
  content string
}

function ClassifyMessage(messages: Message[]) -> Category {
  ...
}

test Test1 {
  functions [ClassifyMessage]
  args {
    messages [
      {
        user "hey there"
        // multi-line string using the #"..."# syntax
        content #"
          You can also add a multi-line
          string with the hashtags
          Instead of ugly json with \n
        "#
      }
    ]
  }
}

Test Image Inputs in the Playground

For a function that takes an image as input, like so:

function MyFunction(myImage: image) -> string {
  client GPT4o
  prompt #"
    Describe this image: {{myImage}}
  "#
}

You can define test cases using image files, URLs, or base64 strings.

Committing a lot of images into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of images, please read GitHub’s size limit documentation and consider setting up large file storage.

test Test1 {
  functions [MyFunction]
  args {
    myImage {
      file "../path/to/image.png"
    }
  }
}

file (string, required)

The path to the image file, relative to the directory containing the current BAML file.

Image files must be somewhere in baml_src/.

media_type (string, optional)

The MIME type of the image. If it is not set and the provider expects a MIME type, BAML tries to infer it, first from the file extension and then from the contents of the file.
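
Only the file form is shown above. As a rough sketch of the other two forms mentioned earlier, the tests below assume that url and base64 are accepted as keys in place of file, and that media_type is supplied alongside base64 data because there is no file extension to infer it from. The test names, the URL, and the base64 payload are placeholders invented for illustration, so compare against the snippet the playground generates before relying on them.

```baml
// Assumed form: load the image from a URL instead of a local file.
// The URL below is a placeholder, not a real asset.
test ImageFromUrl {
  functions [MyFunction]
  args {
    myImage {
      url "https://example.com/path/to/image.png"
    }
  }
}

// Assumed form: pass the image inline as base64 data.
// Replace the placeholder with real base64-encoded image data.
test ImageFromBase64 {
  functions [MyFunction]
  args {
    myImage {
      base64 "<base64-encoded image data>"
      media_type "image/png"
    }
  }
}
```

The URL form keeps binary files out of the repository, which avoids the clone-size concern noted above, at the cost of making the test depend on the remote resource staying available.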


Test Audio Inputs in the Playground

For a function that takes audio as input, like so:

function MyFunction(myAudio: audio) -> string {
  client GPT4o
  prompt #"
    Describe this audio: {{myAudio}}
  "#
}

You can define test cases using audio files, URLs, or base64 strings.

Committing a lot of audio files into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of audio, please read GitHub’s size limit documentation and consider setting up large file storage.

test Test1 {
  functions [MyFunction]
  args {
    myAudio {
      file "../path/to/audio.mp3"
    }
  }
}

file (string, required)

The path to the audio file, relative to the directory containing the current BAML file.

Audio files must be somewhere in baml_src/.

media_type (string, optional)

The MIME type of the audio. If it is not set and the provider expects a MIME type, BAML tries to infer it, first from the file extension and then from the contents of the file.
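
Audio inputs given as URLs or base64 strings can be sketched the same way as the file form. The tests below assume that url and base64 are accepted as keys in place of file, and that media_type accompanies the base64 data. The test names, the URL, and the payload are placeholders invented for illustration.

```baml
// Assumed form: load the audio from a URL instead of a local file.
// The URL below is a placeholder, not a real asset.
test AudioFromUrl {
  functions [MyFunction]
  args {
    myAudio {
      url "https://example.com/path/to/audio.mp3"
    }
  }
}

// Assumed form: pass the audio inline as base64 data.
// Replace the placeholder with real base64-encoded audio data.
test AudioFromBase64 {
  functions [MyFunction]
  args {
    myAudio {
      base64 "<base64-encoded audio data>"
      media_type "audio/mpeg"
    }
  }
}
```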

Assertions

Test blocks in BAML code may contain checks and asserts. These attributes behave like value-level Checks and Asserts, but the Jinja expressions you write in a test have several additional variables available:

  • The _ variable contains fields result, checks and latency_ms.
  • The this variable refers to the value computed by the test, and is shorthand for _.result.
  • In a given check or assert, _.checks.$NAME can refer to the NAME of any earlier check that was run in the same test block. By referring to prior checks, you can build compound checks and asserts, for example asserting that all checks of a certain type passed.

The following example illustrates how each of these features can be used to validate a test result.

test MyTest {
  functions [EchoString]
  args {
    input "example input"
  }
  @@check( nonempty, {{ this|length > 0 }} )
  @@check( small_enough, {{ _.result|length < 1000 }} )
  @@assert( {{ _.checks.nonempty and _.checks.small_enough }})
  @@assert( {{ _.latency_ms < 1000 }})
}

@@check and @@assert behave differently:

  • A @@check represents a property of the test result that should either be manually checked or checked by a subsequent stage in the test. Multiple @@check predicates can fail without causing a hard failure of the test.
  • An @@assert represents a hard guarantee. The first failing assert will halt the remainder of the checks and asserts in this particular test.
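
To make the difference concrete, here is a minimal sketch that reuses the EchoString function from the example above. The test name, the check names, and the thresholds are made up for illustration, and the placement of // comments between attributes is an assumption. Both checks are evaluated and recorded whether or not they pass; the first assert then turns one of them into a hard requirement, and if it fails, the second assert is never evaluated.

```baml
test CheckVersusAssert {
  functions [EchoString]
  args {
    input "example input"
  }
  // Soft properties: a failing check is recorded but does not stop the test.
  @@check( nonempty, {{ this|length > 0 }} )
  @@check( short, {{ this|length < 10 }} )
  // Hard guarantee: if this fails, the remaining asserts are skipped.
  @@assert( {{ _.checks.nonempty }} )
  // Only reached when the assert above passed.
  @@assert( {{ _.latency_ms < 5000 }} )
}
```

Keeping exploratory properties as checks and reserving asserts for conditions that must hold lets a single run report several soft failures at once.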

For more information about the syntax used inside @@check and @@assert attributes, see Checks and Asserts.
