Comparisons

Comparing Marvin

Marvin lets developers run extraction and classification tasks in Python as shown below (TypeScript is not supported):

import marvin
import pydantic

class Location(pydantic.BaseModel):
    city: str
    state: str

marvin.extract("I moved from NY to CHI", target=Location)

You can also provide instructions:

marvin.extract(
    "I paid $10 for 3 tacos and got a dollar and 25 cents back.",
    target=float,
    instructions="Only extract money"
)

# [10.0, 1.25]

Or use enums to classify:

from enum import Enum
import marvin

class RequestType(Enum):
    SUPPORT = "support request"
    ACCOUNT = "account issue"
    INQUIRY = "general inquiry"

request = marvin.classify("Reset my password", RequestType)
assert request == RequestType.ACCOUNT

For enum classification, you can attach extra instructions to each label, but then you don't get fully typed outputs, nor can you reuse the enum in your own code. You're back to working with raw strings.

# Classifying a task based on project specifications
project_specs = {
    "Frontend": "Tasks involving UI design, CSS, and JavaScript.",
    "Backend": "Tasks related to server, database, and application logic.",
    "DevOps": "Tasks involving deployment, CI/CD, and server maintenance."
}

task_description = "Set up the server for the new application."

task_category = marvin.classify(
    task_description,
    labels=list(project_specs.keys()),
    instructions="Match the task to the project category based on the provided specifications."
)
assert task_category == "Backend"
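Because the labels-based API returns a plain string, recovering a typed value is left to the caller. A minimal sketch of that manual round-trip (the `ProjectArea` enum and `to_area` helper are hypothetical, not part of Marvin):

```python
from enum import Enum

# Hypothetical enum mirroring the label strings above. Marvin's
# labels-based classify returns a raw str, so mapping it back to a
# typed value requires this kind of boilerplate.
class ProjectArea(Enum):
    FRONTEND = "Frontend"
    BACKEND = "Backend"
    DEVOPS = "DevOps"

def to_area(label: str) -> ProjectArea:
    # Enum(value) looks up the member whose value matches the raw label,
    # raising ValueError on an unexpected string.
    return ProjectArea(label)

assert to_area("Backend") is ProjectArea.BACKEND
```

This lookup fails loudly if the model returns a label outside the known set, which is exactly the kind of edge case typed outputs would handle for you.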

Marvin has some inherent limitations. For example:

  1. How do I use a different model?
  2. What is the full prompt? Where does it live? What if I want to change it because it doesn't work well for my use case? How many tokens is it?
  3. How do I test this function?
  4. How do I visualize results over time in production?

Using BAML

Here is the BAML equivalent of this classification task, based on the prompt Marvin uses under the hood. Note how the prompt becomes transparent to you with BAML. You can easily make it more complex or simpler depending on the model.

enum RequestType {
  SUPPORT @alias("support request")
  ACCOUNT @alias("account issue") @description("A detailed description")
  INQUIRY @alias("general inquiry")
}

function ClassifyRequest(input: string) -> RequestType {
  client GPT4 // choose even open source models
  prompt #"
    You are an expert classifier that always maintains as much semantic meaning
    as possible when labeling text. Classify the provided data,
    text, or information as one of the provided labels:

    TEXT:
    ---
    {{ input }}
    ---

    {{ ctx.output_format }}

    The best label for the text is:
  "#
}

And you can call this function in your code:

from baml_client import baml as b

...
requestType = await b.ClassifyRequest("Reset my password")
# fully typed output
assert requestType == RequestType.ACCOUNT
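Because the return value is a real enum rather than a raw string, downstream code can dispatch on it directly. A sketch of what that enables (the local `RequestType` here mirrors the BAML enum above for a self-contained example; in practice the generated client provides the type, and the `route` helper and queue names are hypothetical):

```python
from enum import Enum

# Local mirror of the RequestType enum defined in BAML above,
# so this sketch runs standalone.
class RequestType(Enum):
    SUPPORT = "support request"
    ACCOUNT = "account issue"
    INQUIRY = "general inquiry"

def route(request: RequestType) -> str:
    # With a typed enum, routing is an exhaustive lookup over known
    # members instead of fragile string matching.
    routes = {
        RequestType.SUPPORT: "support-queue",
        RequestType.ACCOUNT: "account-team",
        RequestType.INQUIRY: "faq-bot",
    }
    return routes[request]

assert route(RequestType.ACCOUNT) == "account-team"
```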

The prompt string may be wordier, but with BAML you now have:

  1. Fully typed responses, guaranteed
  2. Full transparency and flexibility of the prompt string
  3. Full freedom for what model to use
  4. Helper functions to manipulate types in prompts (print_enum)
  5. Testing capabilities using the VSCode playground
  6. Analytics in the Boundary Dashboard
  7. Support for TypeScript
  8. A better understanding of how prompt engineering works

Marvin was a big source of inspiration for us — their approach is simple and elegant. We recommend checking out Marvin if you’re just starting out with prompt engineering or want to do a one-off simple task in Python. But if you’d like a whole added set of features, we’d love for you to give BAML a try and let us know what you think.

Limitations of BAML

BAML does have some limitations we are continuously working on. Here are a few of them:

  1. It is a new language. However, it is fully open source and getting started takes less than 10 minutes. We are on-call 24/7 to help with any issues (and even provide prompt engineering tips).
  2. Developing requires VSCode. You could use vim, and we have workarounds, but we don't recommend it.