Let’s say you want to extract structured data from resumes. It starts simple enough…
But first, let’s see where we’re going with this story:
[Figure: BAML, what it is and how it helps, showing the full developer experience]
You begin with a basic LLM call to extract a name and skills:
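A minimal sketch of that first version might look like the following. The `complete` parameter and `fake_complete` are hypothetical stand-ins for whichever provider client you use, so the example runs without an API key.

```python
from typing import Callable


def extract_resume(resume_text: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a name and skills; returns whatever free text comes back."""
    prompt = (
        "Extract the candidate's name and skills from this resume:\n\n"
        f"{resume_text}"
    )
    return complete(prompt)


def fake_complete(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; a real model may phrase
    # its answer differently on every run.
    return "The candidate is Jane Doe. Skills: Python, SQL."


result = extract_resume("Jane Doe\nSkills: Python, SQL", fake_complete)
print(result)
```

Note that the return type is just `str`: nothing tells the caller where the name ends and the skills begin.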
This works… sometimes. But you need structured data, not free text.
So you try JSON mode and add Pydantic for validation:
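As a rough illustration of the validation step, here is a sketch that parses the model's JSON reply into a typed object. It assumes a standard-library `dataclass` with hand-written checks in place of a Pydantic model so that it runs without extra dependencies; the `Resume` fields and `parse_resume` helper are illustrative names, not taken from the original code.

```python
import json
from dataclasses import dataclass


@dataclass
class Resume:
    name: str
    skills: list[str]


def parse_resume(raw_json: str) -> Resume:
    """Parse and validate the model's JSON reply."""
    data = json.loads(raw_json)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    skills = data.get("skills")
    if not isinstance(name, str):
        raise ValueError("'name' must be a string")
    if not isinstance(skills, list) or not all(isinstance(s, str) for s in skills):
        raise ValueError("'skills' must be a list of strings")
    return Resume(name=name, skills=skills)


resume = parse_resume('{"name": "Jane Doe", "skills": ["Python", "SQL"]}')
print(resume)
```

With Pydantic, the manual `isinstance` checks would be replaced by the model's own validation, but the shape of the code is the same.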
Better! But now you need more fields. You add education, experience, and location:
The prompt gets longer and more complex. But wait - how do you test this without burning tokens?
Every test costs money and takes time:
You try mocking, but then you’re not testing your actual extraction logic. Your prompt could be completely broken and tests would still pass.
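To make the problem concrete, here is a small sketch in which the prompt is deliberately wrong and the mocked test still passes. `extract_name` and `mock_complete` are hypothetical names used only for this illustration.

```python
import json
from typing import Callable


def extract_name(resume_text: str, complete: Callable[[str], str]) -> str:
    # Deliberately broken prompt: it asks for a translation, not an extraction.
    prompt = "Translate this resume to French:\n\n" + resume_text
    return json.loads(complete(prompt))["name"]


def mock_complete(prompt: str) -> str:
    # The mock ignores the prompt entirely and returns a canned answer.
    return '{"name": "Jane Doe"}'


def test_extract_name_with_mock() -> None:
    assert extract_name("Jane Doe\nSkills: Python", mock_complete) == "Jane Doe"


test_extract_name_with_mock()  # passes even though the prompt is wrong
```

The test exercises the JSON handling but says nothing about whether a real model would understand the prompt.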
Real resumes break your extraction. The LLM returns malformed JSON:
You add retry logic, JSON fixing, error handling:
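The kind of glue code this leads to can be sketched as follows. The repair rules here (trimming surrounding chatter, dropping trailing commas) are illustrative assumptions about common failure modes, not an exhaustive fix, and `repair_json` and `extract_with_retry` are hypothetical helper names.

```python
import json
import re
from typing import Callable


def repair_json(raw: str) -> str:
    """Best-effort cleanup of common LLM output problems."""
    text = raw.strip()
    # Keep only the outermost {...} span, which also drops Markdown fences
    # and chatty text around the payload.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start : end + 1]
    # Remove trailing commas before a closing brace or bracket.
    # (Naive: it would also touch a ",}" that appears inside a string value.)
    return re.sub(r",\s*([}\]])", r"\1", text)


def extract_with_retry(
    prompt: str, complete: Callable[[str], str], max_attempts: int = 3
) -> dict:
    """Call the model until its reply parses as JSON or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        raw = complete(prompt)
        try:
            return json.loads(repair_json(raw))
        except json.JSONDecodeError as exc:
            last_error = exc
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error


messy = 'Sure! Here is the JSON:\n{"name": "Jane", "skills": ["Python", "SQL",],}\nHope it helps!'
print(json.loads(repair_json(messy)))
```

Every retry is another paid API call, and none of this code has anything to do with resumes.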
Your simple extraction function is now 50+ lines of infrastructure code.
Your company wants to use Claude for some tasks (better reasoning) and GPT-4-mini for others (cost savings):
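A simplified sketch of where this heads is shown below. The two `*_stub` functions are hypothetical placeholders that only imitate the general shape of each provider's response payload; real clients also differ in authentication, parameters, and error handling.

```python
def call_openai_stub(prompt: str) -> dict:
    # Placeholder imitating a chat-completions style payload.
    return {"choices": [{"message": {"content": f"openai:{prompt}"}}]}


def call_anthropic_stub(prompt: str) -> dict:
    # Placeholder imitating a messages style payload.
    return {"content": [{"type": "text", "text": f"anthropic:{prompt}"}]}


def complete(provider: str, prompt: str) -> str:
    """Route the prompt to a provider and unwrap its response format."""
    if provider == "openai":
        response = call_openai_stub(prompt)
        return response["choices"][0]["message"]["content"]
    elif provider == "anthropic":
        response = call_anthropic_stub(prompt)
        return response["content"][0]["text"]
    else:
        raise ValueError(f"unknown provider: {provider}")


print(complete("openai", "hello"))
print(complete("anthropic", "hello"))
```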
Each provider has different APIs, different response formats, different capabilities. Your code becomes a mess of if/else statements.
Your extraction fails on certain resumes. You need to debug, but what was actually sent to the LLM?
You start adding logging, token counting, prompt inspection tools…
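One way this tends to look is a logging wrapper around the model call, sketched below. The `with_prompt_logging` helper is a hypothetical name, and the whitespace-based token count is a crude approximation, not a real tokenizer.

```python
import logging
from typing import Callable

logger = logging.getLogger("resume_extraction")


def with_prompt_logging(complete: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a completion function so every prompt and response is logged."""

    def wrapped(prompt: str) -> str:
        approx_tokens = len(prompt.split())  # rough estimate only
        logger.info("prompt (~%d tokens): %s", approx_tokens, prompt)
        response = complete(prompt)
        logger.info("response: %s", response)
        return response

    return wrapped


logging.basicConfig(level=logging.INFO)
logged_complete = with_prompt_logging(lambda prompt: '{"name": "Jane Doe"}')
reply = logged_complete("Extract the name from: Jane Doe")
```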
Now you need to classify seniority levels:
But the LLM doesn’t know what these levels mean! You update the prompt:
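As a sketch of what this looks like, the example below defines the levels in code and then repeats their meaning in the prompt. The specific levels and their descriptions are assumptions made up for illustration; your own definitions would differ.

```python
from enum import Enum


class Seniority(Enum):
    JUNIOR = "junior"
    MID = "mid"
    SENIOR = "senior"


# Hypothetical definitions; the business meaning lives in strings, not types.
DESCRIPTIONS = {
    Seniority.JUNIOR: "0-2 years of experience, works under close guidance",
    Seniority.MID: "2-5 years of experience, works independently",
    Seniority.SENIOR: "5+ years of experience, leads projects",
}


def build_prompt(resume_text: str) -> str:
    """Render every level and its description into the prompt."""
    level_lines = "\n".join(
        f"- {level.value}: {DESCRIPTIONS[level]}" for level in Seniority
    )
    return (
        "Classify the candidate's seniority as exactly one of:\n"
        f"{level_lines}\n\n"
        f"Resume:\n{resume_text}"
    )


def parse_level(answer: str) -> Seniority:
    """Map the model's answer back onto the enum; raises ValueError if unknown."""
    return Seniority(answer.strip().lower())


print(build_prompt("Jane Doe, 7 years at Acme"))
print(parse_level(" Senior\n"))
```

Adding a level now means touching the enum, the descriptions, and the prompt wording together.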
Your prompt is getting huge and your business logic is scattered between code and strings.
In production, you need:
Your simple extraction function becomes a complex service:
What if you could go back to something simple, but keep all the power?
Look what you get immediately:
[Figure: BAML playground showing successful resume extraction with clear prompts and structured output]
Test in the VS Code playground without API calls or token costs:
Build up a library of test cases that run instantly.
BAML’s breakthrough innovation, Schema-Aligned Parsing (SAP), follows Postel’s Law: “Be conservative in what you do, be liberal in what you accept from others.”
Instead of rejecting imperfect outputs, SAP actively transforms them to match your schema using custom edit distance algorithms.
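To give a feel for the idea of coercing output toward a schema rather than rejecting it, here is a toy sketch that maps misspelled or differently cased keys onto expected field names using plain Levenshtein distance. This is not BAML's actual algorithm; `edit_distance`, `align_keys`, and the `max_distance` threshold are assumptions for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def align_keys(data: dict, schema_keys: list[str], max_distance: int = 2) -> dict:
    """Rename each key to its closest schema key; drop keys that match nothing."""
    aligned = {}
    for key, value in data.items():
        normalized = key.strip().lower()
        best = min(schema_keys, key=lambda k: edit_distance(normalized, k))
        if edit_distance(normalized, best) <= max_distance:
            aligned[best] = value
    return aligned


llm_output = {"Name": "Jane Doe", "skils": ["Python"], "hobby": "chess"}
print(align_keys(llm_output, ["name", "skills"]))
```

A strict parser would reject this output for the unknown `skils` key; the lenient version recovers the intended fields and ignores the rest.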
SAP vs Other Approaches:
Key insight: in this comparison, SAP with GPT-3.5 Turbo beats GPT-4o with structured outputs, so you save money while improving accuracy.
BAML automatically generates fully typed clients for every language it supports.
See how changes instantly update the prompt:
Change your types → Prompt automatically updates → See the difference immediately
BAML’s semantic streaming lets you build real UIs with loading bars and type-safe implementations:
What this enables:
[Figure: Semantic streaming in action, with structured data streaming and loading states]
You started with: A simple LLM call
You ended up with: Hundreds of lines of infrastructure code
With BAML, you get:
BAML is what LLM development should have been from the start. Ready to see the difference? Get started with BAML.