Pdf values passed to BAML functions can be created in the client libraries. This document explains how to use the Pdf helper functions, both at compile time and at runtime, to handle Pdf data. For more details, refer to pdf types.
Pdf instances can be created from URLs, Base64 data, or local files. URL processing is controlled by your client’s media_url_handler configuration.
Please note that many websites will block requests to directly fetch PDFs.
Some providers, such as Vertex AI, require the media type to be specified explicitly. Provide the mediaType parameter whenever possible for better compatibility.
Usage Examples
Test Pdf in the Playground
To test a function that accepts a pdf in the VSCode playground using a local file, add a test block to your .baml file:
The path to the PDF file. Supports relative paths (resolved from the current BAML file) or absolute paths. The file does not need to be inside baml_src/.
Static Methods
Creates a Pdf object from a URL. The mediaType parameter is optional but recommended for better model compatibility. If not provided, the media type will be inferred when the content is fetched.
Creates a Pdf object using Base64 encoded data along with the given MIME type. The mediaType parameter is required.
Instance Methods
Check if the Pdf is stored as a URL.
Get the URL if the Pdf is stored as a URL. Throws an Error if the Pdf is not stored as a URL.
Get the base64 data and media type if the Pdf is stored as base64. Returns [base64Data, mediaType]. Throws an Error if the Pdf is not stored as base64.
Convert the Pdf to a JSON representation. Returns either a URL object or a base64 object with media type, depending on how the Pdf was created.
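The behavior described for the static and instance methods can be modeled with a small stand-in class. This is a sketch only: `PdfSketch`, its method names (`fromUrl`, `fromBase64`, `isUrl`, `asUrl`, `asBase64`, `toJSON`), the argument order, and the JSON shape are assumptions made for illustration, not the client library's actual class or signatures.

```typescript
// Illustrative stand-in only: this is NOT the BAML client library's Pdf class.
// Method names, argument order, and the JSON shape are assumptions.
type PdfJson =
  | { url: string; mediaType?: string }
  | { base64: string; mediaType: string };

class PdfSketch {
  private readonly content: PdfJson;

  private constructor(content: PdfJson) {
    this.content = content;
  }

  // URL-backed value; mediaType is optional but recommended.
  static fromUrl(url: string, mediaType?: string): PdfSketch {
    return new PdfSketch(mediaType === undefined ? { url } : { url, mediaType });
  }

  // Base64-backed value; mediaType is required.
  static fromBase64(base64: string, mediaType: string): PdfSketch {
    return new PdfSketch({ base64, mediaType });
  }

  isUrl(): boolean {
    return "url" in this.content;
  }

  // Throws if the value is not stored as a URL.
  asUrl(): string {
    if (!("url" in this.content)) throw new Error("Pdf is not stored as a URL");
    return this.content.url;
  }

  // Returns [base64Data, mediaType]; throws if the value is not stored as base64.
  asBase64(): [string, string] {
    if (!("base64" in this.content)) throw new Error("Pdf is not stored as base64");
    return [this.content.base64, this.content.mediaType];
  }

  toJSON(): PdfJson {
    return { ...this.content };
  }
}

// Check the storage kind first so the throwing accessors are never hit.
function describePdf(pdf: PdfSketch): string {
  if (pdf.isUrl()) return `url: ${pdf.asUrl()}`;
  const [data, mediaType] = pdf.asBase64();
  return `base64 (${mediaType}, ${data.length} chars)`;
}
```

Because both accessors throw when the Pdf is stored the other way, branching on the URL check first, as `describePdf` does, avoids wrapping every access in a try/catch.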
URL Handling
PDF URLs are processed according to your client’s media_url_handler configuration:
- Anthropic: By default converts to base64 (send_base64) as required by their API.
- AWS Bedrock: By default converts to base64 (send_base64).
- OpenAI: By default keeps URLs as-is (send_url).
- Google AI: By default keeps URLs as-is (send_url).
- Vertex AI: By default keeps URLs as-is (send_url).
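The defaults listed above can be written down as a lookup table, which is handy when deciding whether a PDF URL must be reachable for fetching and embedding. This is a sketch: the provider keys are the display names from the list, not necessarily the identifiers used in a client definition, and `fetchesPdfByDefault` is a hypothetical helper.

```typescript
type UrlHandling = "send_base64" | "send_url";

// Keys are illustrative labels taken from the list above, not provider identifiers.
const DEFAULT_PDF_URL_HANDLING: Record<string, UrlHandling> = {
  "Anthropic": "send_base64",
  "AWS Bedrock": "send_base64",
  "OpenAI": "send_url",
  "Google AI": "send_url",
  "Vertex AI": "send_url",
};

// True when a URL-based Pdf is fetched and embedded as base64 by default,
// which means the URL must be fetchable rather than merely forwarded.
function fetchesPdfByDefault(provider: string): boolean {
  const mode = DEFAULT_PDF_URL_HANDLING[provider];
  if (mode === undefined) throw new Error(`Unknown provider: ${provider}`);
  return mode === "send_base64";
}
```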
Many websites block direct PDF fetching. If you encounter issues with URL-based PDFs, try:
- Using media_url_handler.pdf = "send_base64" to fetch and embed the content
- Downloading the PDF locally and using from_file
- Using a proxy or authenticated request
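When a PDF has already been downloaded, one way to sidestep URL fetching entirely is to encode the bytes yourself and build a base64-backed Pdf from the result. The sketch below assumes a Node.js runtime (for the global `Buffer`); `pdfBytesToBase64` is a hypothetical helper, not part of the BAML client library.

```typescript
// Hypothetical helper: turns raw PDF bytes into the two values a
// base64-backed Pdf needs (the encoded data and its media type).
function pdfBytesToBase64(bytes: Uint8Array): { base64: string; mediaType: string } {
  // A well-formed PDF begins with the ASCII marker "%PDF-". Checking it catches
  // the common case where a blocked download returned an HTML error page.
  const header = String.fromCharCode(...bytes.subarray(0, 5));
  if (header !== "%PDF-") {
    throw new Error("Data does not look like a PDF");
  }
  return {
    base64: Buffer.from(bytes).toString("base64"),
    mediaType: "application/pdf",
  };
}
```

In Node.js, `readFileSync(path)` from `node:fs` returns a `Buffer`, which is a `Uint8Array`, so its result can be passed to this helper directly.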
Model Compatibility
Different AI models have varying levels of support for PDF input methods (as of July 2025).
Most models, with the exception of Anthropic, do not accept direct HTTPS URLs. Prefer base64, file uploads, or the appropriate cloud storage or file upload mechanism for your provider. Always specify the correct MIME type (e.g., application/pdf) when required.