
What Zero-Error Tolerance Actually Means for AI

When your AI handles 25% of U.S. mortgage closings, the error rate in the critical path has to be zero. Why that constraint is architectural, not a tuning problem.

When your AI is handling 25% of U.S. mortgage closings, you're not working on just another transaction. You're working on one that, if it goes wrong, could cause delays and losses on someone's home purchase. We're talking about million-dollar transactions where the chances of fraud are high, and you're introducing a probabilistic AI solution into the mix.

Error rates in the critical path need to be zero. Not "low." Not "acceptable." Zero.

That's a strange constraint to put on a technology that is, by its nature, probabilistic. Machine learning models don't guarantee outcomes. They produce predictions with confidence scores. So when the business requirement is zero errors in the critical path, you have to design the entire system around that tension rather than pretend it doesn't exist.

We had three AI models: TASHA, Doug, and Ann. They started as a single SageMaker model and evolved into specialized systems that reached 98% accuracy. Which sounds impressive until you remember what the other 2% means at scale. When you're processing tens of thousands of closings, 2% is hundreds of transactions where the AI got it wrong. Transactions tied to real people buying real homes, with real money on the line.

So that 2% didn't get a pass. It got re-verified by humans. Every time. Not as an afterthought, not as a "nice to have" quality check. It was built into the pipeline as a core architectural decision. The system knew what it didn't know, flagged it, and routed it to a human. That boundary between what the AI handles and what gets escalated wasn't a tuning parameter we optimized later. It was the first thing we designed.
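To make that boundary concrete, here is a minimal sketch of confidence-based routing in Python. It is an illustration, not our production code: the `Prediction` type, the `route` function, and the 0.95 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "auto"                  # the AI output proceeds without review
    HUMAN_REVIEW = "human_review"  # the output is escalated to a person


@dataclass(frozen=True)
class Prediction:
    document_id: str
    label: str
    confidence: float  # model-reported score, expected in [0.0, 1.0]


# Hypothetical threshold for illustration only; a real value would come
# from measuring the model against verified outcomes.
REVIEW_THRESHOLD = 0.95


def route(prediction: Prediction, threshold: float = REVIEW_THRESHOLD) -> Route:
    """Decide whether a prediction can proceed or must be checked by a human."""
    # A missing or out-of-range score (including NaN) is treated as
    # "the system doesn't know" and is always escalated.
    if not 0.0 <= prediction.confidence <= 1.0:
        return Route.HUMAN_REVIEW
    if prediction.confidence >= threshold:
        return Route.AUTO
    return Route.HUMAN_REVIEW
```

The design choice worth noticing is the default: anything the system cannot vouch for, including a malformed score, falls through to human review rather than to the customer.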

That's what people miss about deploying AI in high-stakes environments. The hard part isn't getting to 98% accuracy. The hard part is building the system around the other 2%. The traceability, the fallbacks, the confidence thresholds, the human-in-the-loop routing, all designed so that the 2% never reaches the customer as an unverified output.

Most enterprise AI today doesn't work this way. The system produces an output, the output goes to the user, and the user has no way to know whether this particular answer is in the 98% or the 2%. There's no confidence signal. No trace. No fallback. Just a well-written response that may or may not be right.

I think about this every time I see AI deployed in sales, customer success, and operations: environments where the outputs feed directly into customer-facing conversations and business decisions. The tolerance for error might be higher than mortgage closings, but it's not infinite. A rep who walks into a meeting with wrong account intelligence loses credibility. A manager who commits a forecast based on AI-surfaced pipeline data that missed a key risk loses trust with the board. The errors compound, and because there's no traceability, nobody can diagnose where things went wrong.

The fix is architectural. Not better models, better systems around the models. Traceability built in from the start, not bolted on. Confidence-aware routing so the system knows when to surface an answer and when to say "I don't have enough to be sure." Data provenance so every output connects back to the specific inputs that produced it.
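As a rough sketch of what that can look like, the Python below attaches provenance to every output and abstains when confidence is too low. Everything here is an assumption made for illustration: the `TracedOutput` record, the `surface` function, the SHA-256 digests as input fingerprints, and the 0.9 cutoff are hypothetical, not a description of any particular product.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TracedOutput:
    answer: Optional[str]            # None means the system abstained
    confidence: float
    model_version: str
    input_digests: Tuple[str, ...]   # fingerprints of the inputs used


def digest(text: str) -> str:
    """Fingerprint an input so an output can be traced back to it later."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def surface(
    answer: str,
    confidence: float,
    model_version: str,
    inputs: Sequence[str],
    min_confidence: float = 0.9,  # hypothetical cutoff, for illustration
) -> TracedOutput:
    """Return the answer with its provenance, or abstain if it can't be backed."""
    digests = tuple(digest(item) for item in inputs)
    # Abstain when there is nothing to trace the answer to, or when the
    # model is not confident enough to stand behind it.
    if not digests or confidence < min_confidence:
        return TracedOutput(None, confidence, model_version, digests)
    return TracedOutput(answer, confidence, model_version, digests)
```

Even when the system abstains, the record keeps the confidence, model version, and input fingerprints, so someone can later diagnose why it declined to answer.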

It's harder to build this way. It's slower. But it's the difference between AI that works in a demo and AI that works when the stakes are real.

That's the problem I spend most of my time on.

Anupreet Walia is CTO & Co-Founder of Brevian. Originally published on LinkedIn.