Configuring Timeouts
Timeouts help you build resilient applications by preventing requests from hanging indefinitely. BAML provides granular timeout controls at multiple stages of the request lifecycle.
Why Use Timeouts?
Without timeouts, your application can stall when:
- LLM provider endpoints are unreachable
- Providers accept requests but take too long to respond
- Network connections stall mid-stream
- Long-running requests exceed your application’s latency requirements
Timeouts let you fail fast and either retry or fall back to alternative clients.
Quick Start
Add timeouts to any client by specifying timeout values in the http block within options:
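As a minimal sketch, a client with two timeouts might look like the following. The client name, model, and the specific millisecond values are illustrative assumptions, not recommendations from this guide; only the http block inside options and the two timeout keys come from the text.

```baml
// Illustrative sketch: client name, model, and values are assumptions.
client<llm> MyClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    http {
      connect_timeout_ms 5000    // give up if no connection is established within 5 seconds
      request_timeout_ms 30000   // cap the whole request-response cycle at 30 seconds
    }
  }
}
```

Any timeout you leave out of the http block is simply not set by this configuration.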
Available Timeout Types
BAML supports four types of timeouts for individual requests, plus a fifth timeout type for composite clients (fallback, round-robin):
connect_timeout_ms
Maximum time to establish a connection to the LLM provider.
When to use: Detect unreachable endpoints quickly.
time_to_first_token_timeout_ms
Maximum time to receive the first token after sending the request.
When to use: Detect when the provider accepts your request but takes too long to start generating.
This timeout is especially useful for streaming responses where you want to ensure the LLM starts responding quickly, even if the full response takes longer.
idle_timeout_ms
Maximum time between receiving data chunks during streaming.
When to use: Detect stalled connections where the provider stops sending data mid-response.
request_timeout_ms
Maximum total time for the entire request-response cycle.
When to use: Ensure requests complete within your application’s latency requirements.
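To see how the four per-request timeouts fit together, here is a hedged sketch that sets all of them on one client. The client name, model, and values are assumptions chosen only to show a plausible ordering (connect shortest, total request longest); tune them against your own latency data.

```baml
// Illustrative sketch: client name, model, and values are assumptions.
client<llm> StreamingClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    http {
      connect_timeout_ms 5000               // stage 1: establishing the connection
      time_to_first_token_timeout_ms 10000  // stage 2: waiting for generation to start
      idle_timeout_ms 15000                 // stage 3: maximum gap between streamed chunks
      request_timeout_ms 60000              // overall cap on the whole request
    }
  }
}
```

With values like these, a stream that keeps producing chunks never trips idle_timeout_ms, but it is still cut off once request_timeout_ms elapses.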
Timeouts with Retry Policies
Each retry attempt gets the full timeout duration:
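A sketch of a configuration matching the scenario described here, assuming a retry policy with 3 retries (4 attempts in total) and a 30-second request timeout. The policy name, client name, model, and backoff strategy are illustrative assumptions.

```baml
// Illustrative sketch: names, model, and strategy are assumptions.
retry_policy ThreeRetries {
  max_retries 3   // 1 initial attempt + 3 retries = up to 4 attempts
  strategy {
    type exponential_backoff
  }
}

client<llm> RetryingClient {
  provider openai
  retry_policy ThreeRetries
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    http {
      request_timeout_ms 30000   // applies to each attempt separately
    }
  }
}
```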
If the first attempt times out at 30 seconds, the retry mechanism kicks in and the next attempt gets a fresh 30-second timeout.
Total time: up to 4 attempts × 30 s = 120 s, plus retry delays, so slightly over 2 minutes in the worst case.
Runtime Timeout Overrides
Override timeouts at runtime using the Client Registry:
Handling Timeout Errors
Timeout errors are a subclass of BamlClientError called BamlTimeoutError. You can catch them specifically:
For more on error handling, see Error Handling.
Recommended Production Timeouts
For most production applications, we recommend starting with:
For fallback clients with stricter requirements:
Tips and Best Practices
Start Conservative, Then Optimize
Begin with generous timeouts and monitor your application’s performance. Tighten timeouts gradually based on real-world data.
Different Timeouts for Different Models
Faster models can use stricter timeouts:
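One way this could look, as a sketch only: two clients where the smaller, faster model gets tighter limits than the larger one. The client names, model choices, and all values are assumptions for illustration, not measured recommendations.

```baml
// Illustrative sketch: names, models, and values are assumptions.
client<llm> FastClient {
  provider openai
  options {
    model "gpt-4o-mini"
    api_key env.OPENAI_API_KEY
    http {
      time_to_first_token_timeout_ms 3000   // a fast model should start responding quickly
      request_timeout_ms 15000
    }
  }
}

client<llm> SlowClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    http {
      time_to_first_token_timeout_ms 10000  // allow more time before the first token
      request_timeout_ms 60000
    }
  }
}
```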
Monitor Timeout Rates
Track how often timeouts occur using BAML Studio or your own observability tools. A high timeout rate suggests you should do one or more of the following:
- Increase timeout values
- Use faster models
- Optimize your prompts
- Add more fallback clients
Timeouts vs Abort Controllers
Timeouts and abort controllers serve different purposes:
- Timeouts: Automatic, configuration-based time limits
- Abort controllers: Manual, user-initiated cancellation
Use timeouts for resilience and SLAs. Use abort controllers when users explicitly cancel operations.
You can use both together: