vLLM exposes an OpenAI-compatible API, so you can use the openai-generic provider with an overridden base_url.

See https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html for more information.

```baml
client<llm> MyClient {
  provider "openai-generic"
  options {
    base_url "http://localhost:8000/v1"
    api_key "token-abc123"
    model "NousResearch/Meta-Llama-3-8B-Instruct"
    default_role "user" // Required for using vLLM
  }
}
```
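
Once defined, the client can be referenced by name from a BAML function. The sketch below is illustrative only: the function name, signature, and prompt are assumptions, not part of the vLLM setup; only the client name comes from the config above.

```baml
// Illustrative sketch: ExtractSummary and its prompt are hypothetical.
// The only connection to the config above is the `client MyClient` line.
function ExtractSummary(text: string) -> string {
  client MyClient
  prompt #"
    Summarize the following text in one sentence:
    {{ text }}
  "#
}
```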