Comparing AI SDK

AI SDK by Vercel is a powerful toolkit for building AI-powered applications in TypeScript. It’s particularly popular for Next.js and React developers.

Let’s explore how AI SDK handles structured extraction and where the complexity creeps in.

Why working with LLMs requires more than just AI SDK

AI SDK makes structured data generation look elegant at first:

```typescript
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string())
});

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: 'John Doe, Python, Rust'
});
```

Clean and simple! But let’s make it more realistic by adding education:

```typescript
const Education = z.object({
  school: z.string(),
  degree: z.string(),
  year: z.number()
});

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string()),
  education: z.array(Education)
});

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: `John Doe
Python, Rust
University of California, Berkeley, B.S. in Computer Science, 2020`
});
```

Still works! But… what’s the actual prompt being sent? How many tokens is this costing?
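
The result does include a usage object you can log alongside object (the exact field names differ across AI SDK versions), so raw token counts are at least recoverable - but the rendered prompt and schema sent to the model never surface:

```typescript
// Token counts are available on the result...
const { object, usage } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: resumeText // hypothetical variable holding the raw resume string
});
console.log(usage); // prompt/completion token counts (names vary by version)

// ...but there is no way to inspect the actual prompt text or schema
// representation that was sent to the model.
```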

The visibility problem

Your manager asks: “Why did the extraction fail for this particular resume?”

```typescript
// How do you debug what went wrong?
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: complexResumeText
});

// You can't see:
// - The actual prompt sent to the model
// - The schema format used
// - Why certain fields were missed
```

You start digging through the AI SDK source code to understand the prompt construction…

Classification challenges

Now your PM wants to classify resumes by seniority level:

```typescript
const SeniorityLevel = z.enum(['junior', 'mid', 'senior', 'staff']);

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string()),
  education: z.array(Education),
  seniority: SeniorityLevel
});
```

But wait… how do you tell the model what “junior” vs “senior” means? Zod enums are just string literals:

```typescript
// You can't add descriptions to enum values!
// How does the model know junior = 0-2 years experience?

// You try adding a comment...
const SeniorityLevel = z.enum([
  'junior', // 0-2 years
  'mid',    // 2-5 years
  'senior', // 5-10 years
  'staff'   // 10+ years
]);
// But comments aren't sent to the model!

// So you end up doing this hack:
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: `Extract resume information.

Seniority levels:
- junior: 0-2 years experience
- mid: 2-5 years experience
- senior: 5-10 years experience
- staff: 10+ years experience

Resume:
${resumeText}`
});
```
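
A partial workaround is Zod's .describe(), which attaches a description to the whole field (and which the SDK can forward as part of the JSON schema). But it's still one opaque string, not per-value metadata:

```typescript
// Partial workaround: describe the whole enum field with .describe().
// The text rides along in the JSON schema, but it's a single blob -
// you still can't attach meaning to individual enum values.
const SeniorityLevel = z
  .enum(['junior', 'mid', 'senior', 'staff'])
  .describe('junior: 0-2 yrs, mid: 2-5 yrs, senior: 5-10 yrs, staff: 10+ yrs');
```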

Your clean abstraction is leaking…

Multi-provider pain

Your company wants to use different models for different use cases:

```bash
# First, install a bunch of packages
npm install @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google @ai-sdk/mistral
```

```typescript
// Import from different packages
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';

// Now you need provider detection logic
function getModel(provider: string) {
  switch (provider) {
    case 'openai': return openai('gpt-4o');
    case 'anthropic': return anthropic('claude-3-opus-20240229');
    case 'google': return google('gemini-pro');
    // Don't forget to handle errors...
  }
}

// And manage different API keys
const providers = {
  openai: process.env.OPENAI_API_KEY,
  anthropic: process.env.ANTHROPIC_API_KEY,
  google: process.env.GOOGLE_API_KEY,
  // More environment variables to manage...
};
```
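
And when one provider has an outage, you hand-roll fallback logic too. A minimal sketch of that glue code (a hypothetical helper, reusing the Resume schema from earlier):

```typescript
// Hypothetical fallback helper: try each provider in order until one succeeds
async function extractWithFallback(prompt: string) {
  const candidates = [
    openai('gpt-4o'),
    anthropic('claude-3-opus-20240229'),
    google('gemini-pro'),
  ];

  for (const model of candidates) {
    try {
      return await generateObject({ model, schema: Resume, prompt });
    } catch (error) {
      console.warn('Provider failed, trying the next one...', error);
    }
  }
  throw new Error('All providers failed');
}
```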

Testing without burning money

You want to test your extraction logic:

```typescript
// How do you test this without API calls?
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: testResumeText
});

// Mock the entire AI SDK?
jest.mock('ai', () => ({
  generateObject: jest.fn().mockResolvedValue({
    object: { name: 'Test', skills: ['JS'] }
  })
}));

// But you're not testing your schema or prompt...
// Just that your mocks return the right shape
```

The real-world spiral

As your app grows, you need:

  • Custom extraction strategies for different document types
  • Retry logic for flaky models
  • Token usage tracking for cost control
  • Prompt versioning for A/B testing

Your code evolves into:

```typescript
class ResumeExtractor {
  private tokenCounter: TokenCounter;
  private promptTemplates: Map<string, string>;
  private retryConfig: RetryConfig;

  async extract(text: string, options?: ExtractOptions) {
    const model = this.selectModel(options);
    const prompt = this.buildPrompt(text, options);

    return this.withRetry(async () => {
      const start = Date.now();
      const tokens = this.tokenCounter.estimate(prompt);

      try {
        const result = await generateObject({
          model,
          schema: Resume,
          prompt
        });

        this.logUsage({ tokens, duration: Date.now() - start });
        return result;
      } catch (error) {
        this.handleError(error);
      }
    });
  }

  // ... dozens more methods
}
```

The simple AI SDK call is now buried in layers of infrastructure code.

Enter BAML

BAML was designed for the reality of production LLM applications. Here’s the same resume extraction:

```baml
class Education {
  school string
  degree string
  year int
}

enum SeniorityLevel {
  JUNIOR @description("0-2 years of experience")
  MID @description("2-5 years of experience")
  SENIOR @description("5-10 years of experience")
  STAFF @description("10+ years of experience, technical leadership")
}

class Resume {
  name string
  skills string[]
  education Education[]
  seniority SeniorityLevel
}

function ExtractResume(resume_text: string) -> Resume {
  client GPT4
  prompt #"
    Extract the following information from the resume.

    Pay attention to the seniority descriptions:
    {{ ctx.output_format.seniority }}

    Resume:
    ---
    {{ resume_text }}
    ---

    {{ ctx.output_format }}
  "#
}
```

Notice what you get immediately:

  1. The prompt is right there - No digging through source code
  2. Enums with descriptions - The model knows what each value means
  3. Type definitions that become prompts - Fewer tokens, clearer instructions

Multi-model made simple

```baml
// All providers in one place
client<llm> GPT4 {
  provider openai
  options {
    model "gpt-4o"
    temperature 0.1
  }
}

client<llm> Claude {
  provider anthropic
  options {
    model "claude-3-opus-20240229"
    temperature 0.1
  }
}

client<llm> Gemini {
  provider google
  options {
    model "gemini-pro"
  }
}

client<llm> Llama {
  provider ollama
  options {
    model "llama3"
  }
}

// Same function, any model
function ExtractResume(resume_text: string) -> Resume {
  client GPT4 // Just change this
  prompt #"..."#
}
```

Use it in TypeScript:

```typescript
import { baml } from '@/baml_client';

// Use default model
const resume = await baml.ExtractResume(resumeText);

// Switch models based on your needs
const complexResume = await baml.ExtractResume(complexText, { client: "Claude" });
const simpleResume = await baml.ExtractResume(simpleText, { client: "Llama" });

// Everything is fully typed!
console.log(resume.seniority); // TypeScript knows this is SeniorityLevel
```
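
For reference, the generated client also ships plain TypeScript types for every BAML definition - roughly along these lines (an illustrative shape, not the exact emitted code):

```typescript
// Illustrative shape of the generated types (not the exact emitted code)
type SeniorityLevel = 'JUNIOR' | 'MID' | 'SENIOR' | 'STAFF';

interface Education {
  school: string;
  degree: string;
  year: number;
}

interface Resume {
  name: string;
  skills: string[];
  education: Education[];
  seniority: SeniorityLevel;
}
```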

Testing that actually tests

With BAML’s VSCode extension, you can:

  1. Test prompts without API calls - Instant feedback
  2. See exactly what will be sent - Full transparency
  3. Iterate on prompts instantly - No deploy cycles
  4. Save test cases for regression testing

[Screenshot: BAML development tools in VSCode]
[Screenshot: BAML code lens showing test options]

No mocking required - you’re testing the actual prompt and parsing logic.

The bottom line

AI SDK is fantastic for building streaming AI applications in Next.js. But for structured extraction, you end up fighting the abstractions.

BAML’s advantages over AI SDK:

  • Prompt transparency - See and control exactly what’s sent to the LLM
  • Purpose-built types - Enums with descriptions, aliases, better schema format
  • Unified model interface - All providers work the same way, switch with one line
  • Real testing - Test in VSCode without API calls or burning tokens
  • Schema-Aligned Parsing - Get structured outputs from any model
  • Better token efficiency - Optimized schema format uses fewer tokens
  • Production features - Built-in retries, fallbacks, and error handling

What this means for your TypeScript apps:

  • Faster development - Test prompts instantly without running Next.js
  • Better debugging - Know exactly why extraction failed
  • Cost optimization - See token usage and optimize prompts
  • Model flexibility - Never get locked into one provider
  • Cleaner code - No wrapper classes or infrastructure code needed

AI SDK is great for: streaming UI, Next.js integration, rapid prototyping.
BAML is great for: production structured extraction, multi-model apps, cost optimization.

We built BAML because we were tired of elegant APIs that fall apart when you need production reliability and control.

Limitations of BAML

BAML does have some limitations:

  1. It’s a new language (but learning takes < 10 minutes)
  2. Best experience requires VSCode
  3. Focused on structured extraction, not general AI features

If you’re building a Next.js app with streaming UI, use AI SDK. If you want bulletproof structured extraction with full control, try BAML.