Reduce Hallucinations

We recommend these simple ways to reduce hallucinations:

1. Set temperature to 0.0 (especially if extracting data verbatim)

This makes the model less creative and more likely to extract the data you want verbatim.

clients.baml
client<llm> MyClient {
  provider openai
  options {
    temperature 0.0
  }
}

2. Reduce the number of input tokens

Cut down the amount of data you give the model to process; the less it has to sift through, the less room there is for confusion.

Prune as much data as possible, or split your prompt into multiple prompts analyzing subsets of the data.
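For example, here’s a minimal sketch of the split-into-subsets approach in Python, assuming you’ve defined a BAML function ExtractFacts(text: string) -> Fact[] and generated the Python client; the function name and chunk size are placeholders:

from baml_client import b

def chunks(text: str, max_chars: int = 4000) -> list[str]:
    # Naive character-based chunking; split on whatever boundary fits your data
    # (pages, sections, records, ...).
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]

def extract_all(document: str) -> list:
    facts = []
    for piece in chunks(document):
        # Each call only sees a small subset of the document.
        facts.extend(b.ExtractFacts(piece))
    return facts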

If you’re processing images, try cropping out the parts of the image you don’t need. LLMs can only handle images up to a certain size, so every pixel counts. Make sure you resize images to the model’s input size (even if the provider does the resizing for you) so you can gauge how clear the image is at the model’s resolution. You’ll notice that the blurrier the image, the higher the hallucination rate.
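Here’s a minimal sketch of that preview step in Python, assuming Pillow is installed; the crop box and the 768x768 target are placeholders, so check your provider’s documented input resolution:

from PIL import Image

img = Image.open("receipt.jpg")

# Crop to the region you actually need: (left, upper, right, lower).
img = img.crop((0, 0, 1200, 900))

# Downscale in place, preserving aspect ratio, to the model's input size.
img.thumbnail((768, 768))

# Inspect this file: if you can't read it, the model probably can't either.
img.save("receipt_model_view.jpg")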

Let us know if you want more tips for processing images; we have some helper prompts we can share, and we’re happy to help debug your prompt.

3. Use reasoning or reflection prompting

Read our chain-of-thought guide for more.
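As a quick illustration (the exact wording is up to you, and the chain-of-thought guide goes deeper), the idea is to ask the model to write out its reasoning before the final answer:

Think step by step before answering.
First, write out your reasoning about the input.
Then answer in this JSON schema: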

4. Watch out for contradictions and word associations

Each word you add to the prompt causes the model to associate it with something it saw in its training data. This is why we have techniques like symbol tuning to help control this bias.

Let’s say you have a prompt that says:

Answer in this JSON schema:
But when you answer, add some comments in the JSON indicating your reasoning for the field like this:
Example:
---
{
  // I used the name "John" because it's the name of the person who wrote the prompt
  "name": "John"
}
JSON:

The LLM may not write the // comment inline, because it’s been trained to associate JSON with actual “valid” JSON.

You can get around this with some more coaxing like:

Answer in this JSON schema:
But when you answer, add some comments in the JSON indicating your reasoning for the field like this:
---
{
  // I used the name "John" because it's the name of the person who wrote the prompt
  "name": "John"
}
It's ok if this isn't fully valid JSON,
we will fix it afterwards and remove the comments.
JSON:

The LLM assumed that you want “JSON”, which doesn’t use comments, and our original instructions were not explicit enough to override that bias.
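If you do ask for comments, remember to actually strip them before parsing, as promised in the prompt above. A minimal sketch in Python, assuming the model only emits whole-line // comments like in the example:

import json
import re

def parse_commented_json(raw: str) -> dict:
    # Drop whole-line // comments, then parse the rest as normal JSON.
    cleaned = re.sub(r"^\s*//.*$", "", raw, flags=re.MULTILINE)
    return json.loads(cleaned)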

Keep on reading for more tips and tricks, or reach out in our Discord!