Developers can now include confidence scores in results from the structured endpoint of the Box Extract APIs. A confidence score approximates the probability that an extraction is correct and is expressed as a decimal (e.g., 0.875 means the agent is 87.5% confident in the extracted value for that field). Scores can be configured on a per-field basis, and developers can include confidence labels (i.e., Low, Medium, or High) to describe confidence levels in the extracted results. The scores are calibrated to approximate real-world correctness probabilities and are produced by aggregating multiple LLM responses and measuring their consistency, which enables automated decisioning and human-in-the-loop workflows based on estimated extraction accuracy.
Confidence scores, confidence levels, and recommended actions should be interpreted based on your risk tolerance, the criticality of the use case, and the degree to which you have tested and validated the extraction results. We recommend validating confidence score thresholds against your specific document types and accuracy requirements. Developers can use confidence scores to flag extracted values for human review if they fall below a specific numeric threshold.
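As a rough illustration of threshold-based flagging, the sketch below assumes the extraction result has already been parsed into a plain dictionary that maps each field name to its value and confidence score. That shape, the field names, the `fields_needing_review` helper, and the 0.80 threshold are all hypothetical and are not taken from the Box Extract API response format; pick a threshold by validating against your own documents.

```python
from typing import Any

# Hypothetical threshold for illustration only; tune it against your own data.
REVIEW_THRESHOLD = 0.80


def fields_needing_review(
    extraction: dict[str, dict[str, Any]],
    threshold: float = REVIEW_THRESHOLD,
) -> list[str]:
    """Return the names of fields whose confidence score is below the threshold."""
    return [
        name
        for name, field in extraction.items()
        if field["confidence"] < threshold
    ]


# Hypothetical, already-parsed extraction result (not the actual API shape).
extraction = {
    "invoice_number": {"value": "INV-1042", "confidence": 0.97},
    "total_amount": {"value": "1,250.00", "confidence": 0.875},
    "due_date": {"value": "2025-03-01", "confidence": 0.62},
}

# Fields listed here would be routed to a human reviewer.
print(fields_needing_review(extraction))
```

Raising the threshold sends more fields to review and lowers the risk of accepting an incorrect value; lowering it does the opposite.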
Confidence scores are estimated from multiple responses to the same request sent to an LLM. Consistent responses indicate high confidence. There are several methods of introducing variety into responses:
- Setting the model’s temperature
- Requesting multiple candidates in response
- Creating multiple independent requests with different prompts, for example:
  - Paraphrasing the system prompt or template
  - Shuffling the requested fields, possibly sending only a subset of fields in each request
Multiple responses are then aggregated into one result based on the frequency of distinct values returned for each field. Confidence scores are estimated based on that frequency.
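The frequency-based aggregation described above can be sketched as follows. This is a simplified illustration and not Box's implementation: the `aggregate_responses` helper and the sample data are hypothetical, each response is assumed to be a flat mapping of field name to extracted value, and the raw frequency is returned without the calibration step that the released scores apply.

```python
from collections import Counter


def aggregate_responses(
    responses: list[dict[str, str]],
) -> dict[str, tuple[str, float]]:
    """Pick the most frequent value per field and report its relative frequency."""
    field_names = {name for response in responses for name in response}
    result = {}
    for name in field_names:
        # Count only the responses that actually included this field, since
        # some requests may ask for a subset of fields (an assumption here).
        counts = Counter(r[name] for r in responses if name in r)
        value, votes = counts.most_common(1)[0]
        result[name] = (value, votes / sum(counts.values()))
    return result


# Four hypothetical responses to the same extraction request.
responses = [
    {"vendor": "Acme", "total": "100.00"},
    {"vendor": "Acme", "total": "100.00"},
    {"vendor": "Acme", "total": "100.0"},
    {"vendor": "Acme", "total": "100.00"},
]

result = aggregate_responses(responses)
print(result["vendor"])  # all four responses agree
print(result["total"])   # three of four responses agree
```

In this sketch a field on which every response agrees gets a raw score of 1.0, and disagreement lowers the score in proportion to how many responses differ from the winning value.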
Confidence scores are supported with the following Google Gemini LLMs:
- gemini-2.5-flash
- gemini-2.5-pro