Fast When Certainty Is High.
Smart When It's Not.
Every entity comparison is routed through the fastest, cheapest, most accurate method first — escalating only when needed. 90%+ cost reduction vs. LLM-everywhere approaches.
Every Entity Gets the Right Treatment
70–80% of entity resolution is straightforward. The cascade solves easy problems cheaply and handles hard cases with full sophistication.
90% Cost Reduction. Same Accuracy. Better Auditability.
| Dimension | LLM-Everywhere | ioNova Cascade |
|---|---|---|
| Cost / 10M comparisons | $200K–$600K / month | $15K–$40K / month |
| Average Latency | 200–500ms | <50ms weighted average |
| Determinism | Varies by run | Stages 1–3 deterministic (90% of volume) |
| Auditability | Opaque model reasoning | Explicit rules/thresholds per stage |
| Provider Dependency | Fully dependent on LLM provider | LLM only for 10% — provider-agnostic |
| Throughput Ceiling | API rate limited | Effectively unlimited (Stages 1–3 run on your infrastructure) |
Cascade Intelligence — Your Questions Answered
What is cascade intelligence and how does it optimize entity resolution?
Cascade intelligence is a multi-stage entity resolution architecture that routes every comparison through the fastest, cheapest, most accurate method first, and only escalates to more expensive processing when needed. The four stages are:

- Stage 1 (Exact Match) resolves ~40% of comparisons in under 5ms using deterministic rules and identifier lookup.
- Stage 2 (Fuzzy Match) resolves ~30% in under 50ms using edit-distance, phonetic, and token similarity.
- Stage 3 (Semantic Match) resolves ~20% in under 200ms using vector embeddings.
- Stage 4 (LLM Escalation) handles the remaining ~10% using full-context reasoning.

This achieves 90%+ cost reduction versus LLM-everywhere approaches while maintaining equivalent accuracy.
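The routing logic can be pictured as a short escalation chain. This is a minimal illustrative sketch, not ioNova's actual implementation: the stage names, the token-overlap stand-in for Stage 2, and the 0.8 threshold are all assumptions, and Stages 3–4 are stubbed out.

```python
def exact_match(a: str, b: str) -> bool:
    """Stage 1 stand-in: normalized string equality (~40% of volume, <5ms)."""
    return a.strip().lower() == b.strip().lower()

def fuzzy_score(a: str, b: str) -> float:
    """Stage 2 stand-in: cheap token-overlap (Jaccard) similarity (~30%, <50ms)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def resolve(a: str, b: str, fuzzy_threshold: float = 0.8):
    """Route one comparison through the cascade, escalating only when needed."""
    if exact_match(a, b):
        return ("stage1_exact", True)
    if fuzzy_score(a, b) >= fuzzy_threshold:
        return ("stage2_fuzzy", True)
    # Stages 3 (embeddings) and 4 (LLM) would go here; stubbed in this sketch.
    return ("escalate", None)
```

A pair like `"Acme Corp"` / `"acme corp"` stops at Stage 1; clearly unrelated names fall through to escalation, so the expensive stages only ever see the ambiguous residue.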
How does fuzzy matching work for entity resolution?
Fuzzy matching in ioNova's cascade uses a combination of edit-distance algorithms (Levenshtein, Jaro-Winkler), phonetic matching (Soundex, Metaphone), and token-level similarity to handle the real-world messiness of enterprise entity data: typos, abbreviations, formatting differences, and transliterations. For example, "Acme Corp" vs. "ACME Corporation" or "John Smyth" vs. "Jon Smith" are resolved at this stage in under 50ms. This stage handles approximately 30% of all entity comparisons and is fully deterministic — every match decision produces an explicit similarity score with defined thresholds, making it completely auditable.
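As a concrete illustration of the techniques named above, here is a self-contained sketch combining Levenshtein edit distance with a simplified American Soundex. It is not ioNova's matcher: the 0.8 threshold and the surname-only Soundex comparison are illustrative assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def soundex(word: str) -> str:
    """Simplified American Soundex: first letter + three consonant-class digits."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    out, last = word[0].upper(), codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != last:
            out += code
        if ch not in "hw":  # h and w do not reset the previous code
            last = code
        if len(out) == 4:
            break
    return (out + "000")[:4]

def fuzzy_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Match if edit similarity clears the threshold or last tokens sound alike."""
    sim = 1 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))
    return sim >= threshold or soundex(a.split()[-1]) == soundex(b.split()[-1])
```

Under these assumptions, `"John Smyth"` and `"Jon Smith"` match both ways: their edit similarity is exactly 0.8, and both surnames encode to Soundex `S530`.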
Why is LLM-only entity resolution too expensive for enterprise scale?
At enterprise scale, entity resolution involves millions of comparisons monthly. Processing 10 million comparisons through an LLM costs $200K–$600K per month in inference fees alone, with average latencies of 200–500ms per comparison, non-deterministic results (the same inputs can produce different outputs across runs), and API rate limits that cap throughput. ioNova's cascade architecture reduces this to $15K–$40K per month by reserving LLMs for only the ~10% of genuinely ambiguous cases that require world knowledge. The first three stages (handling 90% of volume) are deterministic, sub-200ms, and run on your infrastructure with effectively unlimited throughput.
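The arithmetic behind these figures can be sketched as a back-of-envelope cost model. The per-comparison unit costs below are assumptions chosen to be roughly consistent with the ranges quoted above ($0.02/comparison corresponds to the $200K lower bound at 10M comparisons), not published ioNova or LLM-provider pricing.

```python
# stage: (assumed share of volume, assumed cost per comparison in USD)
STAGES = {
    "exact":    (0.40, 0.000001),
    "fuzzy":    (0.30, 0.000005),
    "semantic": (0.20, 0.0001),
    "llm":      (0.10, 0.02),
}

def cascade_cost(comparisons: int, stages=STAGES) -> float:
    """Blended monthly cost: each stage handles its share at its unit cost."""
    return sum(comparisons * share * unit for share, unit in stages.values())

def llm_everywhere_cost(comparisons: int, unit: float = 0.02) -> float:
    """Baseline: every comparison goes through the LLM."""
    return comparisons * unit
```

At 10M comparisons, the baseline is $200K while the cascade lands near $20K under these assumed unit costs, inside the $15K–$40K range quoted above; almost all of the residual spend is the 10% LLM slice.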
What is semantic matching and when is it needed for entity resolution?
Semantic matching uses vector embeddings to identify entities that are conceptually the same but textually different — cases where exact and fuzzy matching fail. This includes parent-subsidiary relationships (recognizing "YouTube" as part of "Alphabet Inc."), entity aliases and trade names, cross-language entity references, and entities known by multiple forms across different jurisdictions. Semantic matching processes comparisons in under 200ms and resolves approximately 20% of entity comparisons that would otherwise require expensive LLM escalation. The stage is still deterministic for a given embedding model version, preserving auditability.
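The core operation at this stage is comparing embedding vectors, typically by cosine similarity. The sketch below uses made-up 4-dimensional vectors and a 0.85 threshold for illustration; real systems use model-generated embeddings with hundreds of dimensions.

```python
import math

def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy embeddings, values invented for illustration only: related entities
# are placed near each other in the vector space, unrelated ones far apart.
EMBEDDINGS = {
    "YouTube":       [0.9, 0.8, 0.1, 0.2],
    "Alphabet Inc.": [0.8, 0.9, 0.2, 0.1],
    "Acme Corp":     [0.1, 0.2, 0.9, 0.8],
}

def semantic_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Stage 3 decision: embeddings close enough in vector space."""
    return cosine_similarity(EMBEDDINGS[a], EMBEDDINGS[b]) >= threshold
```

With these toy vectors, "YouTube" and "Alphabet Inc." clear the threshold even though they share no text, which is exactly the class of match that exact and fuzzy stages cannot catch. Determinism holds because the same embedding model version always produces the same vectors.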
How does the cascade improve over time with the data flywheel?
The cascade implements a continuous data flywheel: every resolution decision feeds back into the optimization loop. When an analyst overrides a score, that feedback tightens matching thresholds. When an LLM escalation resolves to a pattern the fuzzy matcher could have caught, the system promotes that pattern to an earlier (faster, cheaper) stage. Over time, more comparisons resolve at faster, cheaper stages — increasing the percentage handled by Stages 1 and 2 while reducing costly LLM escalations. This creates compounding cost and speed improvements: organizations typically see cascade efficiency increase by 10–15% in the first year as the system learns their specific entity landscape.
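The two feedback mechanisms described above can be sketched as a small state machine. This is a hypothetical illustration, not ioNova's flywheel: the class name, the 0.01 adjustment step, and the clamping bounds are all assumptions.

```python
class CascadeFlywheel:
    """Sketch of the feedback loop: analyst overrides nudge the fuzzy-stage
    threshold, and recurring LLM resolutions are promoted to a fast alias table."""

    def __init__(self, fuzzy_threshold: float = 0.80):
        self.fuzzy_threshold = fuzzy_threshold
        self.promoted_aliases = {}  # patterns promoted out of the LLM stage

    def record_override(self, was_false_positive: bool, step: float = 0.01):
        """Analyst feedback: false positives tighten the threshold,
        false negatives loosen it (clamped to a sane range)."""
        if was_false_positive:
            self.fuzzy_threshold = min(self.fuzzy_threshold + step, 0.99)
        else:
            self.fuzzy_threshold = max(self.fuzzy_threshold - step, 0.50)

    def promote_llm_resolution(self, name: str, canonical: str):
        """An LLM escalation resolved a recurring alias; future lookups
        for this pattern resolve at an earlier, cheaper stage."""
        self.promoted_aliases[name.lower()] = canonical

    def resolve_alias(self, name: str):
        """Cheap lookup that intercepts promoted patterns before escalation."""
        return self.promoted_aliases.get(name.lower())
```

Each promoted alias permanently moves a class of comparisons out of the LLM stage, which is the mechanism behind the compounding efficiency gains described above.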