The failure mode that stalls “AI for data” or “AI on my APIs” efforts isn’t psychedelic hallucination; it’s confident inaccuracy: plausible answers that are wrong in subtle and costly ways.
Until we design systems that can signal uncertainty and learn from the context they were missing, the data democratization dream remains a demo.
At the end, I’ll share the two principles that have made this tractable for us.
We've spent the last two years building a platform to help people solve this, and yes, I run that company. But the problem is real whether you buy from us or not - so let's get into it.
The problem in one minute
Consider a few typical questions and the simple reasons confident inaccuracy becomes an issue:
"Where do we keep shipment status?"
Misses multiple sources; presents a partial list as complete.
The result looks plausible, so you accept it.
"Link vendor returns to support tickets for the last two promos."
Joins on the wrong key because the data needs to be transformed before joining; rows mismatch and get double‑counted.
The final result looks right but is actually 5% off (a minimal sketch of this failure follows below).
"From support emails, extract product information and assign severity"
Maps “chargeback” to “refund request”; misroutes high‑priority tickets.
Spot-checking a few items, you'd never notice this happened.
In each case, a general‑purpose model can produce something that looks correct. The trap is that it’s confidently wrong in ways that slip past casual review.
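To make the second example concrete, here is a minimal pandas sketch of how a wrong join goes quietly wrong; every table, column, and ID format here is hypothetical, and a real planner would operate on your systems, not toy DataFrames.

```python
import pandas as pd

# Hypothetical tables: vendor returns use raw IDs like "ORD-1042", support
# tickets store the normalized "1042", and one order can have several tickets.
returns = pd.DataFrame({
    "order_id": ["ORD-1042", "ORD-1077"],
    "return_amount": [120.0, 60.0],
})
tickets = pd.DataFrame({
    "order_id": ["1042", "1077", "1077"],
    "ticket_id": ["T-9", "T-11", "T-12"],
})

# Naive plan: join on the raw key. Nothing matches, so returns silently drop out.
naive = returns.merge(tickets, on="order_id", how="inner")
print(naive["return_amount"].sum())        # 0.0 -- mismatched keys

# "Fixed" plan that forgets tickets fan out per order: the join duplicates rows,
# so summing return_amount double-counts order 1077.
returns["order_key"] = returns["order_id"].str.replace("ORD-", "", regex=False)
fanned_out = returns.merge(tickets, left_on="order_key", right_on="order_id")
print(fanned_out["return_amount"].sum())   # 240.0 instead of 180.0

# Correct plan: normalize the key AND aggregate tickets per order before joining.
per_order = tickets.groupby("order_id").size().rename("ticket_count")
correct = returns.join(per_order, on="order_key")
print(correct["return_amount"].sum())      # 180.0
```

Neither broken version raises an error; only the final aggregate is off, which is exactly why a spot-check misses it.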
Why this blocks the data democratization dream
AI that democratizes data only works when non‑experts can either trust an answer or safely decline it without reverse‑engineering the entire pipeline.
Aside from the hilarious "Oh, I spent $10M on this campaign because our AI assistant told me to" first‑order problem, confident inaccuracy causes second‑ and third‑order problems that are far more insidious:
Imposes a universal verification tax. Every answer has to be replicated or forensically checked. Minutes turn into hours; the ROI disappears.
Erodes trust asymmetrically. One high‑confidence miss costs more credibility than ten successes earn. Users revert to old workflows.
Hidden failure modes prevent improvement. Without explicit plans and uncertainty signaling, you don’t know whether a result is wrong because of ambiguity, missing context, stale data, or a model mistake, so you can’t fix what you don’t know is broken.
If you lead a Data or AI initiative, you’ve seen the pattern: impressive pilot → hidden verification tax → dropping AI adoption → “we paused rollout.”
Net result: the ROI on the AI initiative is unclear, adoption is not trending well, and the system feels obsolete on arrival.
The accuracy flywheel (how trust compounds)
The irony here is that perfect accuracy is not actually required to have a usable AI system.
A 40%-accurate system that can signal uncertainty, and gets more accurate over time, is more valuable than a 90%-accurate system that can’t.
We don’t need perfection; we need a loop that tightens:
Explicit plan → the system proposes a domain-specific plan for retrieval, transformation and semantic AI tasks.
Native uncertainty → it attaches a confidence score and the top causes of uncertainty, and abstains below a threshold.
Human nudge → the user fills the planning gap that was causing the uncertainty.
Model improvement → that nudge updates the domain knowledge and the planning space (not just the answer).
Higher future accuracy & coverage → next time, the plan is safer, the confidence is calibrated, and fewer cases need review.
This is how you get genuine democratization: self-bounded autonomy today, shrinking review tomorrow.
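To make the loop concrete, here is a minimal sketch, assuming a toy planner with hypothetical names and a made-up confidence threshold; it illustrates the shape of the flywheel, not our implementation.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.8   # hypothetical abstention cutoff

@dataclass
class Plan:
    steps: list[str]               # retrieval / transformation / semantic steps
    confidence: float              # calibrated confidence in the plan
    uncertainty_causes: list[str]  # e.g. "ambiguous join key", "stale table"

@dataclass
class DomainKnowledge:
    facts: dict[str, str] = field(default_factory=dict)

def propose_plan(question: str, knowledge: DomainKnowledge) -> Plan:
    # Stand-in for a real planner: confidence rises once known gaps are filled.
    knows_sources = "shipment status sources" in knowledge.facts
    return Plan(
        steps=["retrieve shipment tables", "union sources", "summarize"],
        confidence=0.9 if knows_sources else 0.4,
        uncertainty_causes=[] if knows_sources
                           else ["missing context: shipment status sources"],
    )

def answer(question: str, knowledge: DomainKnowledge) -> str:
    plan = propose_plan(question, knowledge)
    if plan.confidence < CONFIDENCE_THRESHOLD:
        # Abstain and surface *why*, instead of guessing confidently.
        return f"Unsure ({', '.join(plan.uncertainty_causes)}); please add the missing context."
    return f"Executing plan {plan.steps} with confidence {plan.confidence:.2f}"

knowledge = DomainKnowledge()
print(answer("Where do we keep shipment status?", knowledge))   # abstains
# Human nudge: the correction updates domain knowledge, not just this one answer.
knowledge.facts["shipment status sources"] = "wms.shipments + 3pl_events"
print(answer("Where do we keep shipment status?", knowledge))   # now proceeds
```

The point is where the correction lands: in the domain knowledge the planner reuses, so the next question starts from a better plan rather than a re-prompted answer.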
A quick diagnostic for leaders
Before you fund another “AI for X” pilot, ask:
Will it tell me when it’s unsure, and why: ambiguity, missing context, data staleness, validation failure, and so on?
Does it learn from the correction I just gave it? Will the next user avoid the same trap without re‑prompting?
Our solution approach
I’m deliberately avoiding product details, but two principles have proven decisive across customers:
Generate plans in a DSL unique to your domain, not end answers. Plans in the DSL compile to deterministic actions with runtime validations and policy checks (a toy sketch of such a plan follows below).
Specialize the AI to your domain to drive planning accuracy and confidence. Bind it to your ontology and metrics, entity catalogs, and data systems; learn your naming collisions and edge‑case layouts; understand what your terms mean and how your domain works. Build a system that carries a calibrated confidence on every generated plan.
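To make the first principle concrete, here is a toy sketch of what “plan in a DSL, not end answers” can look like; every op name, validation string, and class below is invented for illustration, not a real product API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    op: str                  # e.g. "retrieve", "normalize_key", "join"
    args: dict
    validations: list[str]   # named runtime checks to run around the step

# A plan the model might propose for "link vendor returns to support tickets".
plan = [
    Step("retrieve", {"entity": "vendor_returns", "window": "last_2_promos"},
         ["table_freshness < 24h"]),
    Step("retrieve", {"entity": "support_tickets", "window": "last_2_promos"},
         ["table_freshness < 24h"]),
    Step("normalize_key", {"entity": "vendor_returns", "column": "order_id",
                           "rule": "strip_prefix:ORD-"},
         ["key_format_matches:support_tickets.order_id"]),
    Step("join", {"left": "vendor_returns", "right": "support_tickets",
                  "on": "order_id", "cardinality": "one_to_many"},
         ["no_fan_out_in_aggregates"]),
]

# The interpreter maps each op to vetted, deterministic code; the model never
# writes execution logic, it only proposes the plan.
REGISTRY: dict[str, Callable[[dict], None]] = {}

def execute(plan: list[Step]) -> None:
    for step in plan:
        for check in step.validations:
            print(f"validating: {check}")             # runtime validation / policy hook
        handler = REGISTRY.get(step.op)
        if handler is not None:
            handler(step.args)
        else:
            print(f"running {step.op}({step.args})")  # placeholder deterministic action

execute(plan)
```

The design choice that matters is the split: the model’s output is reviewable data, and the execution path is deterministic code with validation and policy hooks you already trust.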
The mechanics are open to debate, but these are the invariants we believe we need to optimize for to solve this fundamental problem with real-world adoption of AI.
Our approach is simple: specialize to the domain, plan in a domain DSL, and treat uncertainty as a first‑class output so that the accuracy flywheel spins.
I could pretend this is pure thought leadership, but we're a venture-backed company taking this into production with Fortune 100s and seven-figure deals. The difference is that we built this because the problem is real, not the other way around.
If you want to see what that looks like on your data—without a long‑winded demo—bring three tasks you don’t trust your AI with. We’ll walk you through the DSL plan, the ability to signal uncertainty, and how the system learns from your corrections.
If you lead your data/AI initiatives and want to chat, reach out with your three tasks and a preferred time at [email protected].