Comparing AI SDK

AI SDK by Vercel is a powerful toolkit for building AI-powered applications in TypeScript. It’s particularly popular for Next.js and React developers.

Let’s explore how AI SDK handles structured extraction and where the complexity creeps in.

Why working with LLMs requires more than just AI SDK

AI SDK makes structured data generation look elegant at first:

```typescript
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string())
});

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: 'John Doe, Python, Rust'
});
```

Clean and simple! But let’s make it more realistic by adding education:

```typescript
const Education = z.object({
  school: z.string(),
  degree: z.string(),
  year: z.number()
});

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string()),
  education: z.array(Education)
});

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: `John Doe
Python, Rust
University of California, Berkeley, B.S. in Computer Science, 2020`
});
```

Still works! But… what’s the actual prompt being sent? How many tokens is this costing?
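
The result does include a usage object you can log alongside object (the exact field names differ across AI SDK versions), so raw token counts are at least recoverable - but the rendered prompt and schema sent to the model never surface:

```typescript
// Token counts are available on the result...
const { object, usage } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: resumeText // hypothetical variable holding the raw resume string
});
console.log(usage); // prompt/completion token counts (names vary by version)

// ...but there is no way to inspect the actual prompt text or schema
// representation that was sent to the model.
```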

The visibility problem

Your manager asks: “Why did the extraction fail for this particular resume?”

```typescript
// How do you debug what went wrong?
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: complexResumeText
});

// You can't see:
// - The actual prompt sent to the model
// - The schema format used
// - Why certain fields were missed
```

You start digging through the AI SDK source code to understand the prompt construction…

Classification challenges

Now your PM wants to classify resumes by seniority level:

```typescript
const SeniorityLevel = z.enum(['junior', 'mid', 'senior', 'staff']);

const Resume = z.object({
  name: z.string(),
  skills: z.array(z.string()),
  education: z.array(Education),
  seniority: SeniorityLevel
});
```

But wait… how do you tell the model what “junior” vs “senior” means? Zod enums are just string literals:

```typescript
// You can't add descriptions to enum values!
// How does the model know junior = 0-2 years experience?

// You try adding a comment...
const SeniorityLevel = z.enum([
  'junior', // 0-2 years
  'mid',    // 2-5 years
  'senior', // 5-10 years
  'staff'   // 10+ years
]);
// But comments aren't sent to the model!

// So you end up doing this hack:
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: `Extract resume information.

Seniority levels:
- junior: 0-2 years experience
- mid: 2-5 years experience
- senior: 5-10 years experience
- staff: 10+ years experience

Resume:
${resumeText}`
});
```
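
A partial workaround is Zod's .describe(), which attaches a description to the whole field (and which the SDK can forward as part of the JSON schema). But it's still one opaque string, not per-value metadata:

```typescript
// Partial workaround: describe the whole enum field with .describe().
// The text rides along in the JSON schema, but it's a single blob -
// you still can't attach meaning to individual enum values.
const SeniorityLevel = z
  .enum(['junior', 'mid', 'senior', 'staff'])
  .describe('junior: 0-2 yrs, mid: 2-5 yrs, senior: 5-10 yrs, staff: 10+ yrs');
```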

Your clean abstraction is leaking…

Multi-provider pain

Your company wants to use different models for different use cases:

```bash
# First, install a bunch of packages
npm install @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google @ai-sdk/mistral
```

```typescript
// Import from different packages
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';

// Now you need provider detection logic
function getModel(provider: string) {
  switch (provider) {
    case 'openai': return openai('gpt-4o');
    case 'anthropic': return anthropic('claude-3-opus-20240229');
    case 'google': return google('gemini-pro');
    // Don't forget to handle errors...
  }
}

// And manage different API keys
const providers = {
  openai: process.env.OPENAI_API_KEY,
  anthropic: process.env.ANTHROPIC_API_KEY,
  google: process.env.GOOGLE_API_KEY,
  // More environment variables to manage...
};
```
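
And when one provider has an outage, you hand-roll fallback logic too. A minimal sketch of that glue code (a hypothetical helper, reusing the Resume schema from earlier):

```typescript
// Hypothetical fallback helper: try each provider in order until one succeeds
async function extractWithFallback(prompt: string) {
  const candidates = [
    openai('gpt-4o'),
    anthropic('claude-3-opus-20240229'),
    google('gemini-pro'),
  ];

  for (const model of candidates) {
    try {
      return await generateObject({ model, schema: Resume, prompt });
    } catch (error) {
      console.warn('Provider failed, trying the next one...', error);
    }
  }
  throw new Error('All providers failed');
}
```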

Testing without burning money

You want to test your extraction logic:

```typescript
// How do you test this without API calls?
const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: Resume,
  prompt: testResumeText
});

// Mock the entire AI SDK?
jest.mock('ai', () => ({
  generateObject: jest.fn().mockResolvedValue({
    object: { name: 'Test', skills: ['JS'] }
  })
}));

// But you're not testing your schema or prompt...
// Just that your mocks return the right shape
```

The real-world spiral

As your app grows, you need:

  • Custom extraction strategies for different document types
  • Retry logic for flaky models
  • Token usage tracking for cost control
  • Prompt versioning for A/B testing

Your code evolves into:

```typescript
class ResumeExtractor {
  private tokenCounter: TokenCounter;
  private promptTemplates: Map<string, string>;
  private retryConfig: RetryConfig;

  async extract(text: string, options?: ExtractOptions) {
    const model = this.selectModel(options);
    const prompt = this.buildPrompt(text, options);

    return this.withRetry(async () => {
      const start = Date.now();
      const tokens = this.tokenCounter.estimate(prompt);

      try {
        const result = await generateObject({
          model,
          schema: Resume,
          prompt
        });

        this.logUsage({ tokens, duration: Date.now() - start });
        return result;
      } catch (error) {
        this.handleError(error);
      }
    });
  }

  // ... dozens more methods
}
```

The simple AI SDK call is now buried in layers of infrastructure code.

Enter BAML

BAML was designed for the reality of production LLM applications. Here’s the same resume extraction:

```baml
class Education {
  school string
  degree string
  year int
}

enum SeniorityLevel {
  JUNIOR @description("0-2 years of experience")
  MID @description("2-5 years of experience")
  SENIOR @description("5-10 years of experience")
  STAFF @description("10+ years of experience, technical leadership")
}

class Resume {
  name string
  skills string[]
  education Education[]
  seniority SeniorityLevel
}

function ExtractResume(resume_text: string) -> Resume {
  client GPT4
  prompt #"
    Extract the following information from the resume.

    Pay attention to the seniority descriptions:
    {{ ctx.output_format.seniority }}

    Resume:
    ---
    {{ resume_text }}
    ---

    {{ ctx.output_format }}
  "#
}
```

Notice what you get immediately:

  1. The prompt is right there - No digging through source code
  2. Enums with descriptions - The model knows what each value means
  3. Type definitions that become prompts - Fewer tokens, clearer instructions

Multi-model made simple

```baml
// All providers in one place
client<llm> GPT4 {
  provider openai
  options {
    model "gpt-4o"
    temperature 0.1
  }
}

client<llm> Claude {
  provider anthropic
  options {
    model "claude-3-opus-20240229"
    temperature 0.1
  }
}

client<llm> Gemini {
  provider google
  options {
    model "gemini-pro"
  }
}

client<llm> Llama {
  provider ollama
  options {
    model "llama3"
  }
}

// Same function, any model
function ExtractResume(resume_text: string) -> Resume {
  client GPT4 // Just change this
  prompt #"..."#
}
```

Use it in TypeScript:

```typescript
import { baml } from '@/baml_client';

// Use default model
const resume = await baml.ExtractResume(resumeText);

// Switch models based on your needs
const complexResume = await baml.ExtractResume(complexText, { client: "Claude" });
const simpleResume = await baml.ExtractResume(simpleText, { client: "Llama" });

// Everything is fully typed!
console.log(resume.seniority); // TypeScript knows this is SeniorityLevel
```
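
For reference, the generated client also ships plain TypeScript types for every BAML definition - roughly along these lines (an illustrative shape, not the exact emitted code):

```typescript
// Illustrative shape of the generated types (not the exact emitted code)
type SeniorityLevel = 'JUNIOR' | 'MID' | 'SENIOR' | 'STAFF';

interface Education {
  school: string;
  degree: string;
  year: number;
}

interface Resume {
  name: string;
  skills: string[];
  education: Education[];
  seniority: SeniorityLevel;
}
```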

Testing that actually tests

With BAML’s VSCode extension, you can:

  1. Test prompts without API calls - Instant feedback
  2. See exactly what will be sent - Full transparency
  3. Iterate on prompts instantly - No deploy cycles
  4. Save test cases for regression testing

[Screenshot: BAML development tools in VSCode]
[Screenshot: BAML code lens showing test options]

No mocking required - you’re testing the actual prompt and parsing logic.

The bottom line

AI SDK is fantastic for building streaming AI applications in Next.js. But for structured extraction, you end up fighting the abstractions.

BAML’s advantages over AI SDK:

  • Prompt transparency - See and control exactly what’s sent to the LLM
  • Purpose-built types - Enums with descriptions, aliases, better schema format
  • Unified model interface - All providers work the same way, switch with one line
  • Real testing - Test in VSCode without API calls or burning tokens
  • Schema-Aligned Parsing - Get structured outputs from any model
  • Better token efficiency - Optimized schema format uses fewer tokens
  • Production features - Built-in retries, fallbacks, and error handling

What this means for your TypeScript apps:

  • Faster development - Test prompts instantly without running Next.js
  • Better debugging - Know exactly why extraction failed
  • Cost optimization - See token usage and optimize prompts
  • Model flexibility - Never get locked into one provider
  • Cleaner code - No wrapper classes or infrastructure code needed

AI SDK is great for: streaming UI, Next.js integration, rapid prototyping.
BAML is great for: production structured extraction, multi-model apps, cost optimization.

We built BAML because we were tired of elegant APIs that fall apart when you need production reliability and control.

Limitations of BAML

BAML does have some limitations:

  1. It’s a new language (but learning takes < 10 minutes)
  2. Best experience requires VSCode
  3. Focused on structured extraction, not general AI features

If you’re building a Next.js app with streaming UI, use AI SDK. If you want bulletproof structured extraction with full control, try BAML.