Designing an Exception Routing System That Ops Teams Actually Use

April 14, 2025 · 10 min read

Everyone talks about extraction accuracy. 99%, 98.5%, "industry-leading" — the number gets featured in demos and marketing copy. What gets far less attention is what happens to the documents that don't reach that threshold. Those are the documents that your ops team has to touch. If the exception routing system is poorly designed, those touches become a bottleneck that degrades the whole operation.

We've seen two distinct failure modes in exception routing. The first: the system routes everything to a single shared queue, and that queue becomes a pile nobody owns. The second: the system is so aggressive about flagging exceptions that nearly every document requires human review, eliminating most of the value of automation.

What follows is how we designed around both failure modes — the logic that governs what gets flagged, why it gets flagged, and where it goes.

What Triggers an Exception

Fieldiq's exception engine uses two distinct trigger categories: confidence-based and rule-based. They're evaluated independently and can both flag the same document.

Confidence-Based Triggers

Every field in our extraction output has a confidence score between 0 and 1. A field with confidence 0.92 means the model is highly certain. A field with confidence 0.47 means the model is essentially guessing — it found something that looks like the right field type, but the evidence is weak.

We let customers configure per-field confidence thresholds rather than using a single global threshold. The reason: not all fields have equal consequence when wrong. A low-confidence shipping_amount is a nuisance. A low-confidence total_due is a payment error waiting to happen.

A reasonable default threshold configuration looks like:

{
  "confidence_thresholds": {
    "total_due": 0.90,
    "vendor_name": 0.85,
    "invoice_number": 0.85,
    "invoice_date": 0.80,
    "line_items": 0.75,
    "tax_amount": 0.85,
    "payment_terms": 0.70,
    "vendor_address": 0.65
  }
}

The lower threshold for vendor_address isn't because the field matters less. Vendor addresses are often partially obscured, abbreviated, or formatted inconsistently, and a slightly imprecise address costs less than routing 15% of documents to exception review just for address cleanup.
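To make the per-field comparison concrete, here is a minimal Python sketch of how such a configuration could be applied. It is an illustration under assumptions, not Fieldiq's implementation: the ExtractedField shape, the low_confidence_fields helper, and the sample values are all hypothetical, and fields without a configured threshold are simply not flagged.

```python
from dataclasses import dataclass

# Thresholds copied from the default configuration above.
CONFIDENCE_THRESHOLDS = {
    "total_due": 0.90,
    "vendor_name": 0.85,
    "invoice_number": 0.85,
    "invoice_date": 0.80,
    "line_items": 0.75,
    "tax_amount": 0.85,
    "payment_terms": 0.70,
    "vendor_address": 0.65,
}


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical shape of one extracted field."""
    name: str
    value: str
    confidence: float  # between 0 and 1


def low_confidence_fields(fields, thresholds=CONFIDENCE_THRESHOLDS):
    """Return the names of fields whose confidence is below their threshold.

    Assumption: a field with no configured threshold is never flagged.
    """
    return [
        f.name
        for f in fields
        if f.name in thresholds and f.confidence < thresholds[f.name]
    ]


# Made-up sample values for illustration only.
fields = [
    ExtractedField("total_due", "1840.00", 0.88),
    ExtractedField("vendor_address", "12 Harbor Rd", 0.70),
    ExtractedField("invoice_date", "2025-03-02", 0.95),
]
print(low_confidence_fields(fields))  # ['total_due']
```

Note that total_due is flagged at 0.88 while vendor_address passes at 0.70, which is the asymmetry a single global threshold cannot express.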

Rule-Based Triggers

Rule-based triggers operate on the extracted values themselves, regardless of confidence score. Even a highly confident extraction can be flagged as an exception if it violates a business rule.

Common rule triggers we configure for customers:

Amount tolerance check: subtotal + tax_amount + shipping_amount differs from total_due by more than a configured tolerance (typically ±$0.02, to absorb rounding)
Duplicate invoice detection: Same vendor_name + invoice_number pair already processed within the last N days
Vendor whitelist mismatch: vendor_name not found in your approved vendor master — common when a known vendor invoices from a legal entity you haven't seen before
PO three-way match failure: Invoice total_due exceeds the linked PO amount by more than a configured variance threshold
Tax anomaly: Tax rate falls outside expected range for the vendor's jurisdiction (catches invoices with obvious tax calculation errors)
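The first two rules in this list could be sketched in Python roughly as follows. The function names, the seen lookup, and the 90-day default window are assumptions made for illustration (the window is configurable and not specified here); amounts use Decimal so that the ±$0.02 comparison is not distorted by float rounding.

```python
from datetime import date, timedelta
from decimal import Decimal


def amount_mismatch(subtotal, tax_amount, shipping_amount, total_due,
                    tolerance=Decimal("0.02")):
    """True when the summed components differ from total_due by more than tolerance."""
    computed = subtotal + tax_amount + shipping_amount
    return abs(computed - total_due) > tolerance


def is_duplicate(vendor_name, invoice_number, processed_on, seen, window_days=90):
    """True when the same vendor/invoice pair was processed within the window.

    `seen` is a hypothetical lookup mapping (vendor_name, invoice_number)
    to the date that pair was last processed.
    """
    previous = seen.get((vendor_name, invoice_number))
    if previous is None:
        return False
    return processed_on - previous <= timedelta(days=window_days)


# Made-up sample values for illustration only.
seen = {("Acme Supply", "INV-1001"): date(2025, 3, 1)}

print(amount_mismatch(Decimal("100.00"), Decimal("8.25"),
                      Decimal("12.00"), Decimal("120.30")))  # True (off by 0.05)
print(is_duplicate("Acme Supply", "INV-1001", date(2025, 4, 1), seen))  # True
```

Both checks look only at extracted values, so they run the same way whether the fields were extracted at 0.99 or 0.60 confidence.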

Routing Logic: Not Just a Queue

The worst exception routing design puts everything in a single shared inbox. In practice, this means the person who picks up the queue might be looking at an amount tolerance check failure (a 30-second fix) followed by a document-quality issue requiring a rescan request to the vendor (a 2-day process). The queue becomes unprioritized and unowned.

Our routing engine assigns exceptions to queues based on exception type and configurable rules about who owns each type:

Exception Type	Default Queue	Typical Resolution
Low confidence — critical field	AP Review	Reviewer corrects field inline
Amount tolerance failure	AP Review	Manual verify against original document
Duplicate detection	AP Review (high priority)	Confirm or reject as duplicate payment
Vendor not in master	Vendor Management	Add vendor entity or map to existing
PO match failure	Procurement	PO amendment or invoice dispute
Document quality (unreadable scan)	Vendor Outreach	Request re-send from vendor

The key design decision: PO match failures go to Procurement, not AP. AP can't resolve a PO variance — they don't own the PO. Sending it to an AP queue creates a hand-off delay that adds hours or days to resolution. The routing configuration maps exception types directly to the team that can actually resolve them.
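The default routing table can be read as a plain mapping from exception type to owning queue. The sketch below is one way to express it in Python; the enum names, the "normal" priority label, and the way a document with several exceptions is handled (one assignment per queue, with the highest priority winning) are assumptions for illustration rather than a description of the actual engine.

```python
from enum import Enum


class ExceptionType(Enum):
    LOW_CONFIDENCE_CRITICAL = "low_confidence_critical_field"
    AMOUNT_TOLERANCE = "amount_tolerance_failure"
    DUPLICATE = "duplicate_detection"
    VENDOR_NOT_IN_MASTER = "vendor_not_in_master"
    PO_MATCH_FAILURE = "po_match_failure"
    DOCUMENT_QUALITY = "document_quality"


# (queue, priority) per exception type, following the default table above.
DEFAULT_ROUTES = {
    ExceptionType.LOW_CONFIDENCE_CRITICAL: ("AP Review", "normal"),
    ExceptionType.AMOUNT_TOLERANCE: ("AP Review", "normal"),
    ExceptionType.DUPLICATE: ("AP Review", "high"),
    ExceptionType.VENDOR_NOT_IN_MASTER: ("Vendor Management", "normal"),
    ExceptionType.PO_MATCH_FAILURE: ("Procurement", "normal"),
    ExceptionType.DOCUMENT_QUALITY: ("Vendor Outreach", "normal"),
}


def route(exception_types, routes=DEFAULT_ROUTES):
    """Return a sorted list of (queue, priority), one entry per queue involved."""
    assignments = {}
    for exc in exception_types:
        queue, priority = routes[exc]
        # A high-priority exception upgrades the queue's existing assignment.
        if priority == "high" or queue not in assignments:
            assignments[queue] = priority
    return sorted(assignments.items())


print(route([ExceptionType.PO_MATCH_FAILURE,
             ExceptionType.DUPLICATE,
             ExceptionType.AMOUNT_TOLERANCE]))
# [('AP Review', 'high'), ('Procurement', 'normal')]
```

Keeping the mapping as data rather than branching logic means reassigning an exception type to a different team is a configuration change, not a code change.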

The Review Interface Problem

A routing system is only as good as the interface the reviewer actually uses. We've seen exception review workflows that require the reviewer to open the exception ticket, download the original document, open it in a separate PDF viewer, compare values manually, go back to the exception system, and make corrections field by field. That's a 4–7 minute process per exception.

Our review interface shows the original document and the extracted fields side by side. The specific field that triggered the exception is highlighted in the document image alongside the extracted value. The reviewer sees exactly what the model saw and where it's uncertain, and can correct the value inline. Resolution for a low-confidence field exception typically takes 45–90 seconds.

This matters for adoption. If exception review is painful, your team starts looking for workarounds — approving questionable extractions to clear the queue, or marking documents as "processed" before the correction is actually in the ERP. Both create downstream accuracy problems that negate the value of the extraction system.

Calibrating the Exception Rate Over Time

At launch, exception rates are typically higher than steady state. The model hasn't seen your specific vendor mix, and some of your rule thresholds may be set conservatively. Over the first 4–8 weeks, a few things happen:

The model improves on your document distribution as it processes more of your specific vendor formats
Your team accumulates vendor master data, reducing the "vendor not found" exception rate
You tune the rule thresholds based on actual exception patterns — if the tax anomaly rule triggers on invoices where the tax calculation is genuinely correct (edge cases in your vendor base), you widen the allowed range

We track exception rate over time as a primary operational metric — not as a "how good is the model" measure but as a "how healthy is the overall pipeline" measure. A rising exception rate after steady state is a signal something has changed: new vendor formats entering the mix, a rule threshold set incorrectly, or a systematic extraction issue worth investigating.
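A simple way to watch for that signal is to compare the latest week's exception rate against a trailing baseline. The sketch below assumes weekly counts of flagged and processed documents are available; the function names, the four-week baseline, the 2-point margin, and the sample counts are all illustrative assumptions, not tuned values.

```python
from statistics import mean


def exception_rate(flagged, processed):
    """Share of processed documents that were routed to an exception queue."""
    return flagged / processed if processed else 0.0


def rising_rate(weekly_rates, baseline_weeks=4, margin=0.02):
    """True when the latest week exceeds the trailing baseline mean by more than margin."""
    if len(weekly_rates) <= baseline_weeks:
        return False  # not enough history to form a baseline
    baseline = mean(weekly_rates[-(baseline_weeks + 1):-1])
    return weekly_rates[-1] - baseline > margin


# Made-up weekly (flagged, processed) counts for illustration only.
weekly_counts = [(120, 1000), (118, 1000), (125, 1000), (121, 1000), (160, 1000)]
weekly_rates = [exception_rate(f, p) for f, p in weekly_counts]

print(rising_rate(weekly_rates))  # True: 16% against a baseline of about 12%
```

A check like this says only that something changed. Breaking the rate down by exception type and by vendor is what points to the cause, whether that is a new vendor format, a misconfigured rule, or an extraction regression.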

What We'd Tell You to Avoid

We're not saying every exception routing system needs to be as configurable as ours. For smaller operations with relatively uniform document formats, a simpler system with a lower exception rate and a single reviewed queue may be entirely sufficient.

What we'd tell you to avoid regardless of scale: any system where exceptions are silently approved rather than explicitly reviewed, and any system where the exception rate isn't measured and tracked. Silent approvals erode accuracy without any visible signal. An unmeasured exception rate means you don't know whether your automation is actually working or just moving errors downstream.

The extraction accuracy number gets the headline. The exception routing system is what determines whether that accuracy number translates into actual operational reliability. Those are two different things, and both matter.

Published by the Fieldiq team
