Research
What we've learned building the determinism layer. Specific findings, real data, conservative claims.
~70% of agent tool-call work is non-creative
Across multiple real agent traces, roughly 70% of tool calls are tasks a deterministic catalog could answer in microseconds: format validation, ID parsing, content-type detection, secret-shape recognition. The model adds no value on these tasks and frequently hallucinates the answers. We treat this as the foundational case for the determinism layer.
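To make the idea concrete, here is a minimal sketch of two such deterministic checks, content-type detection and secret-shape recognition. The function names, the magic-byte table, and the key pattern are assumptions chosen for illustration; they are not entries from the actual catalog.

```python
import re
from typing import Optional

# Hypothetical catalog entries; names and patterns are illustrative only.
MAGIC_PREFIXES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
    b"GIF89a": "image/gif",
}

# Pattern commonly used by secret scanners for AWS access key IDs:
# "AKIA" followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def detect_content_type(data: bytes) -> Optional[str]:
    """Return a MIME type from the leading magic bytes, or None if unknown."""
    for prefix, mime in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return mime
    return None


def looks_like_aws_key_id(text: str) -> bool:
    """Recognize the shape of a secret; says nothing about whether it is live."""
    return AWS_ACCESS_KEY_ID.fullmatch(text) is not None
```

Both functions are pure lookups or pattern matches, so the same input always yields the same answer and no model call is involved.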
Schism Hunter: differential attestation finds CVE-class parser drift in milliseconds
Implementation differences in parsers for standard formats (IPv4, URL, JSON) routinely create CVE-class security bugs (e.g. CVE-2021-29921, where Python's ipaddress module mishandled leading zeros in IPv4 octets). By fuzzing two implementations against each other and recording every divergence, we can bound the equivalence rate cryptographically and pinpoint the exact diverging inputs in real time.
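A minimal sketch of differential testing follows, assuming two toy IPv4 parsers written for this example: one reads zero-padded octets as decimal, the other reads them as octal in the C tradition. Neither parser, nor the helper names, comes from Schism Hunter itself, and the cryptographic bounding step is not shown.

```python
import random
from typing import List, Optional, Tuple

Octets = Optional[Tuple[int, ...]]


def _split(text: str) -> Optional[List[str]]:
    """Split into four ASCII-digit fields, or None if the shape is wrong."""
    parts = text.split(".")
    if len(parts) != 4 or not all(p.isascii() and p.isdigit() for p in parts):
        return None
    return parts


def parse_decimal(text: str) -> Octets:
    """Toy parser A: leading zeros are ignored, every octet is decimal."""
    parts = _split(text)
    if parts is None:
        return None
    values = tuple(int(p, 10) for p in parts)
    return values if all(v <= 255 for v in values) else None


def parse_c_style(text: str) -> Octets:
    """Toy parser B: a leading zero marks an octal octet."""
    parts = _split(text)
    if parts is None:
        return None
    try:
        values = tuple(
            int(p, 8) if len(p) > 1 and p[0] == "0" else int(p, 10) for p in parts
        )
    except ValueError:  # e.g. "09" is not valid octal
        return None
    return values if all(v <= 255 for v in values) else None


def random_candidate(rng: random.Random) -> str:
    """Generate a dotted quad, zero-padding some octets to provoke drift."""
    parts = []
    for _ in range(4):
        octet = str(rng.randint(0, 255))
        if rng.random() < 0.3:
            octet = "0" + octet
        parts.append(octet)
    return ".".join(parts)


def find_divergences(corpus: List[str]) -> List[Tuple[str, Octets, Octets]]:
    """Return every input on which the two parsers disagree."""
    results = []
    for text in corpus:
        a, b = parse_decimal(text), parse_c_style(text)
        if a != b:
            results.append((text, a, b))
    return results
```

For example, `find_divergences(["1.2.3.4", "010.8.8.8"])` reports only the second input: parser A reads it as 10.8.8.8 and parser B as 8.8.8.8, which is the same class of disagreement behind CVE-2021-29921.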
A live LLM can be distilled to a perfect deterministic FSM in milliseconds
For sufficiently constrained capability surfaces, an LLM's observed behavior is exactly representable as a small finite-state machine. Live Claude Haiku 4.5 was distilled to a perfect 3-state minimal FSM in 8 ms, then signed and added to the catalog. The FSM matches the LLM bit-for-bit on the in-distribution surface and runs at deterministic-catalog speed.
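The distillation pipeline itself is private, so the sketch below swaps in a standard textbook technique, Moore partition refinement, to show what "minimal FSM" means in practice. The five-state machine, its labels, and all function names are hypothetical; they stand in for a behavior table that would, in the real system, be recorded from the model.

```python
from itertools import product


def minimize(states, alphabet, delta, output):
    """Moore partition refinement: map each state to a block of equivalent states.

    Assumes every state is reachable; unreachable states are not pruned here.
    """
    ids = {}
    # Start by grouping states that emit the same output.
    block_of = {s: ids.setdefault(output[s], len(ids)) for s in states}
    while True:
        ids = {}
        refined = {}
        for s in states:
            # Signature: current block plus the blocks of all successors.
            sig = (block_of[s], tuple(block_of[delta[s, a]] for a in alphabet))
            refined[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block_of.values())):
            return refined  # no block was split, so the partition is stable
        block_of = refined


def quotient(states, alphabet, delta, output, block_of):
    """Build the minimized machine over block ids."""
    q_delta = {(block_of[s], a): block_of[delta[s, a]] for s in states for a in alphabet}
    q_output = {block_of[s]: output[s] for s in states}
    return q_delta, q_output


def run(start, word, delta, output):
    """Feed a word to a machine and return the output of the final state."""
    state = start
    for symbol in word:
        state = delta[state, symbol]
    return output[state]


# Hypothetical observed behavior: five states, two of them redundant.
STATES = ["q0", "q1", "q2", "q3", "q4"]
ALPHABET = ["a", "b"]
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q3", ("q1", "b"): "q0",
    ("q2", "a"): "q4", ("q2", "b"): "q0",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
    ("q4", "a"): "q4", ("q4", "b"): "q4",
}
OUTPUT = {"q0": "idle", "q1": "busy", "q2": "busy", "q3": "done", "q4": "done"}

BLOCK_OF = minimize(STATES, ALPHABET, DELTA, OUTPUT)
Q_DELTA, Q_OUTPUT = quotient(STATES, ALPHABET, DELTA, OUTPUT, BLOCK_OF)

# Exhaustively compare the original and minimized machines on short inputs.
AGREES = all(
    run("q0", word, DELTA, OUTPUT) == run(BLOCK_OF["q0"], word, Q_DELTA, Q_OUTPUT)
    for length in range(6)
    for word in product(ALPHABET, repeat=length)
)
```

In this toy case the five states collapse to three blocks, and the minimized machine gives the same output as the original on every input word of length 0 through 5.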
Cross-validation evidence is binary, not statistical
755 hand-authored regex artifacts achieved a 100.0000% match rate against a Python reference oracle on adversarial corpora. This is not "pretty good"; every input in the corpus was checked against the oracle, and every one agreed. Any divergence is a bug, not a tradeoff.
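As a rough illustration of this kind of cross-validation, the sketch below checks one regex against a plain-Python oracle over a small corpus and returns every disagreement. The regex (a decimal octet, 0 to 255, no leading zeros), the oracle, and the corpus are assumptions made up for this example; they are not among the 755 artifacts.

```python
import re

# Candidate artifact: a decimal octet from 0 to 255 without leading zeros.
OCTET = re.compile(r"25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]")


def regex_accepts(text: str) -> bool:
    return OCTET.fullmatch(text) is not None


def oracle_accepts(text: str) -> bool:
    """Reference oracle written as ordinary Python instead of a pattern."""
    if not (text.isascii() and text.isdigit()):
        return False
    # Round-tripping through int rejects leading zeros such as "007".
    return int(text) <= 255 and str(int(text)) == text


def cross_validate(candidate, oracle, corpus):
    """Return every corpus input on which candidate and oracle disagree."""
    return [text for text in corpus if candidate(text) != oracle(text)]


# Every integer string from 0 to 999, plus a few hand-picked edge cases.
CORPUS = [str(n) for n in range(1000)] + [
    "", "00", "01", "007", "0255", "256", "-1", "+1",
    " 1", "1 ", "1\n", "\uff11", "1.0", "0x1",
]

DIVERGENCES = cross_validate(regex_accepts, oracle_accepts, CORPUS)
```

The result is a list, not a score: an empty list means the artifact agrees with the oracle on every input in the corpus, and any entry is a concrete input to debug. A looser pattern such as `[0-9]{1,3}` would show up immediately, since it accepts "256" and "00" while the oracle rejects both.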
A note for non-technical readers: each finding above is an empirical result from running our catalog against real agents and real models. The mechanisms behind them are described at a high level in the whitepaper; the implementation details are deliberately kept private.