AI Receipt Scanning. Photo in, structured data out.
A production-grade pattern for receipt OCR. Photo upload, LLM extraction, deterministic validation, human-in-the-loop approval, and a full audit trail. The LLM does the unstructured-to-structured part; everything else is deterministic and predictable.
When to use this
- →You receive lots of receipts (or invoices, or any structured-from-unstructured paperwork)
- →There is a human in the loop somewhere — a manager, an admin, the user themselves
- →You want to remove typing, not remove approval
- →Errors are recoverable (a wrong field can be fixed before approval)
When not to use this
- ×No human in the loop and the AI output triggers real-money actions automatically
- ×Volume is so low that a human typing it themselves is cheaper
- ×Receipts are in a tightly controlled, highly structured format (use a deterministic parser)
- ×Compliance requires deterministic, auditable extraction (LLM outputs vary, even with low temperature)
The flow
- →User opens the expense form, takes a photo of the receipt.
- →Photo uploaded directly to Supabase Storage from the client (signed-URL or RLS-gated bucket).
- →Server action invoked with the storage reference.
- →Server action fetches the image, sends it to Claude with a strict JSON schema in the system prompt.
- →LLM returns JSON. Zod validates. If invalid, retry once with a stricter prompt. If still invalid, return the raw image to the user for manual entry.
- →Validated fields written to the form. User confirms or edits. Submits.
- →Manager receives an email with the image inline and approve/reject links.
- →On approval, the claim transitions state and is included in the next payroll export.
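The last two steps of the flow can be modelled as a small state machine that appends an audit entry on every transition. This is a minimal sketch, not the original implementation: the state names (`draft`, `submitted`, `approved`, `rejected`, `exported`), the `Claim` shape and the `transition` helper are all assumptions made for illustration.

```typescript
// Hypothetical claim states; the flow above only says a claim "transitions state".
type ClaimState = 'draft' | 'submitted' | 'approved' | 'rejected' | 'exported'

interface AuditEntry {
  from: ClaimState
  to: ClaimState
  actor: string
  at: string // ISO 8601 timestamp
}

interface Claim {
  id: string
  state: ClaimState
  audit: AuditEntry[]
}

// Allowed transitions. Anything not listed here is refused.
const ALLOWED: Record<ClaimState, ClaimState[]> = {
  draft: ['submitted'],
  submitted: ['approved', 'rejected'],
  approved: ['exported'],
  rejected: [],
  exported: [],
}

// Returns a new claim in the target state with one audit entry appended.
// Throws on an illegal transition, so an invalid move is never persisted.
function transition(
  claim: Claim,
  to: ClaimState,
  actor: string,
  now: Date = new Date()
): Claim {
  if (!ALLOWED[claim.state].includes(to)) {
    throw new Error(`Illegal transition: ${claim.state} -> ${to}`)
  }
  const entry: AuditEntry = { from: claim.state, to, actor, at: now.toISOString() }
  return { ...claim, state: to, audit: [...claim.audit, entry] }
}
```

Returning a new object rather than mutating keeps the audit log append-only in memory. A real system would also persist the state change and the audit entry in a single database transaction.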
The schema
The Zod schema is the contract between the LLM and the rest of the system. Keep it tight — every field that is not strictly required is a place for hallucination to creep in.
import { z } from 'zod'
export const ReceiptExtraction = z.object({
  merchant: z.string().min(1).max(120),
  date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/),
  total: z.number().positive(),
  currency: z.enum(['GBP', 'EUR', 'USD']),
  category_guess: z.enum([
    'meals', 'travel', 'office', 'software', 'other'
  ]),
  confidence: z.number().min(0).max(1),
})
export type ReceiptExtraction = z.infer<typeof ReceiptExtraction>
The orchestrator
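'use server'
import Anthropic from '@anthropic-ai/sdk'
import { ReceiptExtraction } from './schema'
// The client reads ANTHROPIC_API_KEY from the environment by default.
const anthropic = new Anthropic()
// Field list mirroring the Zod schema, so the prompt actually supplies the schema it refers to.
const SCHEMA_HINT =
  '{"merchant": string, "date": "YYYY-MM-DD", "total": number, ' +
  '"currency": "GBP" | "EUR" | "USD", ' +
  '"category_guess": "meals" | "travel" | "office" | "software" | "other", ' +
  '"confidence": number between 0 and 1}'
export async function extractReceipt(
  imageBase64: string
): Promise<ReceiptExtraction | { error: string }> {
  const res = await anthropic.messages.create({
    model: 'claude-sonnet-4-7-20260301',
    max_tokens: 512,
    system:
      'You are a receipt extraction system. Output strict JSON ' +
      'conforming to this schema. No prose. Schema: ' + SCHEMA_HINT,
    messages: [{
      role: 'user',
      content: [
        // Assumes a JPEG upload; pass the real media type if other formats are allowed.
        { type: 'image', source: { type: 'base64', media_type: 'image/jpeg', data: imageBase64 } },
        { type: 'text', text: 'Extract the receipt as JSON.' },
      ],
    }],
  })
  // Use the first text block, if the model returned one.
  const block = res.content.find((b) => b.type === 'text')
  const text = block?.type === 'text' ? block.text : ''
  // JSON.parse throws on malformed output, so guard it before handing the value to Zod.
  let json: unknown
  try {
    json = JSON.parse(text)
  } catch {
    return { error: 'extraction_failed' }
  }
  const parsed = ReceiptExtraction.safeParse(json)
  if (!parsed.success) return { error: 'extraction_failed' }
  return parsed.data
}
Validation rules beyond the schema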
Schema validation gets you well-typed JSON. Business validation gets you sane data. Both run before the row is written.
- →Date must be within the last 12 months (older claims need manual review)
- →Total must be plausible (configurable per category — meals capped at £500, travel uncapped, etc.)
- →Currency must match the user’s region unless explicitly flagged as international
- →Confidence below 0.7 routes to a manual-review queue rather than auto-populating the form
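The four rules above could be expressed as one pure function that collects the reasons a receipt needs a human look. This is a sketch under assumptions: the `RuleContext` shape, the reason strings and the choice to route every failure to manual review are mine (the rules above only say so explicitly for old dates and low confidence), and the check that rejects future dates is an extra guard not listed above.

```typescript
// Local copy of the extracted shape so the sketch stands alone.
interface Extracted {
  date: string // YYYY-MM-DD
  total: number
  currency: 'GBP' | 'EUR' | 'USD'
  category_guess: 'meals' | 'travel' | 'office' | 'software' | 'other'
  confidence: number
}

// Hypothetical per-request context; none of these names come from the original.
interface RuleContext {
  today: Date
  userCurrency: Extracted['currency']
  international: boolean
  // Per-category caps. A missing key means the category is uncapped.
  caps: Partial<Record<Extracted['category_guess'], number>>
  minConfidence: number
}

type RuleResult =
  | { ok: true }
  | { ok: false; route: 'manual_review'; reasons: string[] }

function checkBusinessRules(r: Extracted, ctx: RuleContext): RuleResult {
  const reasons: string[] = []

  // Date must fall within the last 12 months (and, as an assumption, not in the future).
  const receiptDate = new Date(`${r.date}T00:00:00Z`)
  const cutoff = new Date(ctx.today)
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - 1)
  if (Number.isNaN(receiptDate.getTime()) || receiptDate < cutoff || receiptDate > ctx.today) {
    reasons.push('date_out_of_range')
  }

  // Total must be under the category cap, if one is configured.
  const cap = ctx.caps[r.category_guess]
  if (cap !== undefined && r.total > cap) reasons.push('total_over_cap')

  // Currency must match the user's region unless flagged as international.
  if (r.currency !== ctx.userCurrency && !ctx.international) reasons.push('currency_mismatch')

  // Low-confidence extractions go to manual review.
  if (r.confidence < ctx.minConfidence) reasons.push('low_confidence')

  return reasons.length === 0
    ? { ok: true }
    : { ok: false, route: 'manual_review', reasons }
}
```

Collecting every failed rule, rather than stopping at the first, gives the reviewer the full picture in one pass.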
Failure modes and how to handle them
The LLM will fail. Plan for it. The two common failure modes are extraction failures (the model returns malformed JSON or unrelated content) and silent extraction errors (the JSON is valid but a field is wrong).
For malformed output, retry once with a stricter prompt. For silent errors, the human approval step is the safety net — managers see the receipt image alongside the extracted total, and incorrect totals get caught visually.
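The retry-once rule for malformed output can be sketched as a small loop over two prompts. To keep the example runnable without a network call, the model is abstracted as a `ModelCall` function and validation as a `validate` callback; both names, the `Attempt` type and the wording of the stricter prompt are hypothetical.

```typescript
// A model call is abstracted as "system prompt in, raw text out".
type ModelCall = (systemPrompt: string) => Promise<string>

type Attempt<T> =
  | { ok: true; data: T; attempts: number }
  | { ok: false; error: 'extraction_failed'; attempts: number }

// Hypothetical prompts; the stricter one only tightens the wording.
const BASE_PROMPT = 'You are a receipt extraction system. Output strict JSON. No prose.'
const STRICT_PROMPT =
  BASE_PROMPT + ' Return a single JSON object only: no markdown fences, no commentary.'

// Tries the base prompt, then the stricter one, then gives up.
// `validate` returns the typed value, or null if the parsed JSON is not acceptable.
// Errors thrown by `call` itself (outage, rate limit) are not caught here.
async function extractWithRetry<T>(
  call: ModelCall,
  validate: (value: unknown) => T | null
): Promise<Attempt<T>> {
  let attempts = 0
  for (const prompt of [BASE_PROMPT, STRICT_PROMPT]) {
    attempts += 1
    const raw = await call(prompt)
    try {
      const data = validate(JSON.parse(raw))
      if (data !== null) return { ok: true, data, attempts }
    } catch {
      // Malformed JSON: fall through to the next prompt.
    }
  }
  return { ok: false, error: 'extraction_failed', attempts }
}
```

With Zod, `validate` could wrap `safeParse` and return `parsed.success ? parsed.data : null`. On the `extraction_failed` branch, the caller shows the raw image and an empty form for manual entry.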
If the model is unreachable (rate limit, outage), fall back to letting the user enter the data manually. Never block the form on the AI being available. The AI is an enhancement, not a dependency.
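One way to keep the form from blocking on the model is to race the extraction against a timeout and treat any failure as "manual entry". A minimal sketch follows; `prefillOrManual`, the `Prefill` type, the reason strings and the 8-second default are assumptions, not part of the original design.

```typescript
type Prefill<T> =
  | { mode: 'prefilled'; data: T }
  | { mode: 'manual'; reason: 'timeout' | 'unavailable' | 'extraction_failed' }

// True for the `{ error: string }` shape an extractor returns on failure.
function isExtractionError(v: unknown): v is { error: string } {
  return typeof v === 'object' && v !== null && 'error' in v
}

// Runs `extract`, but never waits longer than `timeoutMs` and never throws.
// Any outage, rate limit, timeout or failed extraction degrades to manual entry.
async function prefillOrManual<T>(
  extract: () => Promise<T | { error: string }>,
  timeoutMs = 8000 // assumed budget; tune to what users will tolerate
): Promise<Prefill<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timedOut = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), timeoutMs)
  })
  try {
    const result = await Promise.race([extract(), timedOut])
    if (isExtractionError(result)) {
      return { mode: 'manual', reason: 'extraction_failed' }
    }
    return { mode: 'prefilled', data: result as T }
  } catch (err) {
    const isTimeout = err instanceof Error && err.message === 'timeout'
    return { mode: 'manual', reason: isTimeout ? 'timeout' : 'unavailable' }
  } finally {
    // Always clear the timer so it cannot fire after the race is settled.
    clearTimeout(timer)
  }
}
```

The form code then branches on `mode` only: it pre-populates the fields when `prefilled` and renders them empty when `manual`, so the AI path and the manual path share one UI.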
Alternatives I considered
Cheap, fast, fully deterministic. Falls over on real-world messy receipts. Works for clean invoices.
Strong at structured documents (invoices, forms). Less flexible than an LLM for free-form receipts. Vendor-specific schemas.
Better accuracy on a specific receipt format if you have thousands of labelled examples. Almost never worth it for an internal platform.
Zero infra, zero error rate from the system. Costs the user thirty seconds per receipt, which compounds into real time at volume.