Timeout Configuration
Configure timeouts on any BAML client to prevent requests from hanging indefinitely.
Overview
Timeouts can be configured on leaf clients (OpenAI, Anthropic, etc.).
Timeout Options
All timeout values are specified in milliseconds as positive integers.
- connect_timeout_ms: Maximum time to establish a network connection to the provider. Default: no timeout (infinite).
- time_to_first_token_timeout_ms: Maximum time to receive the first token after sending the request. Default: no timeout (infinite). Particularly useful for detecting when a provider accepts the request but takes too long to start generating.
- idle_timeout_ms: Maximum time between receiving consecutive data chunks. Default: no timeout (infinite). Important for detecting stalled streaming connections.
- request_timeout_ms: Maximum total time for the entire request-response cycle. Default: no timeout (infinite). For streaming responses, this covers the entire stream, up to the last token.
Timeout Composition
When composite clients reference subclients with their own timeouts, the minimum (most restrictive) timeout wins.
Example
Effective timeouts, where the parent (composite) client sets connect_timeout_ms to 5000 and idle_timeout_ms to 15000, FastClient sets connect_timeout_ms to 3000 and request_timeout_ms to 20000, and SlowClient sets request_timeout_ms to 60000:
When calling FastClient:
- connect_timeout_ms: min(5000, 3000) = 3000ms (FastClient is stricter)
- request_timeout_ms: min(∞, 20000) = 20000ms (only FastClient defines it)
- idle_timeout_ms: min(15000, ∞) = 15000ms (only parent defines it)
When calling SlowClient:
- connect_timeout_ms: min(5000, ∞) = 5000ms (only parent defines it)
- request_timeout_ms: min(∞, 60000) = 60000ms (only SlowClient defines it)
- idle_timeout_ms: min(15000, ∞) = 15000ms (only parent defines it)
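The "minimum wins" rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not BAML's implementation: the function name compose_timeouts and the dictionary representation are assumptions made for this example, and a missing key stands for "no timeout" (infinite).

```python
from typing import Dict

# Timeout settings for one client; a missing key means "no timeout" (infinite).
Timeouts = Dict[str, int]


def compose_timeouts(parent: Timeouts, child: Timeouts) -> Timeouts:
    """Hypothetical helper: per option, keep the smallest value that is defined."""
    effective: Timeouts = {}
    for key in sorted(set(parent) | set(child)):
        defined = [t[key] for t in (parent, child) if key in t]
        effective[key] = min(defined)
    return effective


# Values taken from the effective-timeout breakdown above.
parent = {"connect_timeout_ms": 5000, "idle_timeout_ms": 15000}
fast_client = {"connect_timeout_ms": 3000, "request_timeout_ms": 20000}
slow_client = {"request_timeout_ms": 60000}

fast_effective = compose_timeouts(parent, fast_client)
slow_effective = compose_timeouts(parent, slow_client)
print(fast_effective)
print(slow_effective)
```

An option that neither client defines never appears in the result, which corresponds to the "no timeout" default.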
Timeout Evaluation
All timeouts are evaluated concurrently. A request fails when any timeout is exceeded:
- Connection phase: connect_timeout_ms applies
- After connection:
  - time_to_first_token_timeout_ms starts when request is sent
  - request_timeout_ms starts when request is sent
  - idle_timeout_ms starts after each chunk is received
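To make the timer start points concrete, here is a simplified Python model that reports which timeout would fire first for a given timeline. It is a sketch under stated assumptions, not how the BAML runtime is implemented: the function first_timeout is hypothetical, chunk times are measured in milliseconds from the moment the request is sent, and a chunk time of infinity stands for a chunk that never arrives.

```python
import math
from typing import List, Optional

INF = math.inf  # models "no timeout"


def first_timeout(
    connect_ms: float,
    chunk_times_ms: List[float],
    *,
    connect_timeout_ms: float = INF,
    time_to_first_token_timeout_ms: float = INF,
    idle_timeout_ms: float = INF,
    request_timeout_ms: float = INF,
) -> Optional[str]:
    """Hypothetical model: name of the first timeout exceeded, or None."""
    # Connection phase: only connect_timeout_ms applies.
    if connect_ms > connect_timeout_ms:
        return "connect_timeout_ms"

    previous = None  # arrival time of the previous chunk, if any
    for arrival in chunk_times_ms:
        # request_timeout_ms always counts from when the request is sent.
        deadlines = {"request_timeout_ms": request_timeout_ms}
        if previous is None:
            # Waiting for the first token: counted from when the request is sent.
            deadlines["time_to_first_token_timeout_ms"] = time_to_first_token_timeout_ms
        else:
            # The idle timer restarts after each received chunk.
            deadlines["idle_timeout_ms"] = previous + idle_timeout_ms
        name, deadline = min(deadlines.items(), key=lambda item: item[1])
        if arrival > deadline:
            return name
        previous = arrival
    return None


slow_start = first_timeout(
    100, [12000, 12500],
    time_to_first_token_timeout_ms=10000, request_timeout_ms=30000,
)
stalled = first_timeout(
    100, [500, 1000, INF],
    idle_timeout_ms=15000, request_timeout_ms=30000,
)
print(slow_start, stalled)
```

In the second call the stream stalls after the chunk at 1000ms, so the idle deadline (16000ms) is reached before the request deadline (30000ms).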
Interaction with Retry Policies
When a client has both timeouts and a retry policy:
- Each retry attempt gets the full timeout duration
- A timeout triggers the retry mechanism (if configured)
- Worst-case total elapsed time = (number of attempts) × (timeout per attempt) + (retry delays)
Example: with a 30s timeout per attempt and 4 attempts in total, the maximum possible time is ~30s × 4 attempts + exponential backoff delays.
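The worst-case arithmetic can be worked through in Python. The 30s timeout and 4 attempts come from the example; the backoff parameters (initial delay of 200ms, multiplier of 2, delay cap of 10000ms) are hypothetical values chosen only to make the sum concrete, and worst_case_total_ms is an illustrative helper rather than part of BAML.

```python
def worst_case_total_ms(
    timeout_ms: int,
    attempts: int,
    initial_delay_ms: int,
    multiplier: int,
    max_delay_ms: int,
) -> int:
    """Upper bound on elapsed time if every attempt runs into its timeout."""
    total = attempts * timeout_ms  # each attempt gets the full timeout
    delay = initial_delay_ms
    # There is one backoff delay between each pair of consecutive attempts.
    for _ in range(attempts - 1):
        total += min(delay, max_delay_ms)
        delay *= multiplier
    return total


# 30s per attempt, 4 attempts; backoff values are assumed for illustration.
total_ms = worst_case_total_ms(
    timeout_ms=30000,
    attempts=4,
    initial_delay_ms=200,
    multiplier=2,
    max_delay_ms=10000,
)
print(total_ms)  # 4 * 30000 + (200 + 400 + 800) = 121400
```

When sizing an upstream deadline (for example, an HTTP handler that wraps the call), this bound, rather than the per-attempt timeout, is the number to budget for.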
Runtime Overrides
Override timeout values at runtime using the client registry:
Runtime overrides follow the same composition rules: the minimum timeout wins when composing runtime values with config file values.
Error Handling
Timeout errors are represented by BamlTimeoutError, a subclass of BamlClientError.
Timeout errors include structured fields:
- client: The client name that timed out
- timeout_type: The specific timeout that was exceeded
- configured_value_ms: The configured timeout value in milliseconds
- elapsed_ms: The actual elapsed time in milliseconds
- message: A human-readable error message
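A handling pattern built on these fields might look like the following Python sketch. The two classes here are local stand-ins defined only so the snippet runs without the BAML runtime; they mirror the class names and fields listed above but are not the real classes. The import path for the real classes, the constructor signature, and the exact string stored in timeout_type are assumptions.

```python
class BamlClientError(Exception):
    """Local stand-in for the base client error (not the real class)."""


class BamlTimeoutError(BamlClientError):
    """Local stand-in carrying the structured fields listed above."""

    def __init__(self, client, timeout_type, configured_value_ms, elapsed_ms, message):
        super().__init__(message)
        self.client = client
        self.timeout_type = timeout_type
        self.configured_value_ms = configured_value_ms
        self.elapsed_ms = elapsed_ms
        self.message = message


def call_and_report(fn):
    """Run fn and turn client errors into a short log line."""
    try:
        return fn()
    except BamlTimeoutError as err:
        # Catch the subclass first so timeouts get their own handling.
        return (
            f"{err.client}: {err.timeout_type} exceeded after "
            f"{err.elapsed_ms}ms (limit {err.configured_value_ms}ms)"
        )
    except BamlClientError as err:
        return f"client error: {err}"


def fake_call():
    # Simulated failure; a real call would go through the generated client.
    raise BamlTimeoutError(
        "FastClient", "request_timeout_ms", 20000, 20015, "Request timed out"
    )


report = call_and_report(fake_call)
print(report)
```

Because BamlTimeoutError is a subclass of BamlClientError, an existing handler for BamlClientError also catches timeouts; the more specific clause has to come first to treat them separately.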
Validation Rules
BAML validates timeout configurations at compile time:
- Positive values: All timeout values must be positive integers
- Logical constraints: request_timeout_ms must be ≥ time_to_first_token_timeout_ms (if both are specified)
Invalid configurations will cause BAML to raise validation errors with helpful messages.
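The two rules can be expressed as a small Python check. BAML performs this validation itself at compile time; the sketch below only restates the rules in executable form, and the function name validate_timeouts and the error message wording are made up for this example.

```python
from typing import Dict, List


def validate_timeouts(timeouts: Dict[str, object]) -> List[str]:
    """Hypothetical checker: return a list of rule violations (empty if valid)."""
    errors: List[str] = []

    # Rule 1: every timeout value must be a positive integer.
    for name, value in timeouts.items():
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or value <= 0:
            errors.append(f"{name} must be a positive integer, got {value!r}")

    # Rule 2: request_timeout_ms >= time_to_first_token_timeout_ms,
    # checked only when both are specified as integers.
    request = timeouts.get("request_timeout_ms")
    first_token = timeouts.get("time_to_first_token_timeout_ms")
    if isinstance(request, int) and isinstance(first_token, int):
        if request < first_token:
            errors.append(
                "request_timeout_ms must be >= time_to_first_token_timeout_ms"
            )
    return errors


valid = validate_timeouts(
    {"request_timeout_ms": 30000, "time_to_first_token_timeout_ms": 10000}
)
inverted = validate_timeouts(
    {"request_timeout_ms": 5000, "time_to_first_token_timeout_ms": 10000}
)
print(valid, inverted)
```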
See Also
- Configuring Timeouts Guide - User guide with examples
- Fallback Strategy - Using timeouts with fallback clients
- Retry Policies - Using timeouts with retries
- Error Handling - Handling timeout errors