Benchmarks
Eight headline numbers, twelve catalog tiers, six advanced capabilities shipped today. Every claim has a measurement behind it.
Catalog -- 12 tiers, 1,300+ signed artifacts
Each row below is a production tier with hand-authored cores plus auto-emitted Python / C / JavaScript / WebAssembly siblings. The full catalog list is discoverable via vaa_list_catalog.
| Tier | Cores | Examples (truncated) |
|---|---|---|
| CVE defense | 34 | IPv4 canonical, IPv6 hex, port range, URL scheme allowlist, DNS label, SSRF-safe IPv4, ... |
| Identifiers | 46 | UUID v4, ULID, MongoDB ObjectId, GitHub PAT, AWS access key shape, Anthropic key shape, ... |
| Encoding | 22 | base32 strict + loose, base64 strict + URL no-pad, hex lower / mixed, percent-encoded, ... |
| Finance | 52 | 50 ISO 4217 currency codes, IBAN per country, ACH routing, IBAN shape, Visa Luhn shape, ... |
| Auth | 24 | JWT structure + alg allowlist, OAuth grants, bearer token, basic auth b64, ... |
| HTTP | 38 | method strict, status code strict, header name, cookie name, etag quoted hex, ... |
| Agent safety | 47 | 47 OWASP CRS rules: header injection, response splitting, RFI, RCE bypass, ... |
| Source code lints | 60 | Python / Rust / Go / Java / JS keyword + identifier shapes, snake_case, ... |
| Structured (JSON/CSV) | 14 | JSON structure PDA, JSON pointer, JSON key, CSV field unquoted, scientific notation, ... |
| DNS / mail | 20 | DNS label, A / AAAA / MX / SOA / SRV / TXT / CNAME / PTR / NS / CAA, email full strict, ... |
| DevOps / cloud | 38 | k8s namespace + pod state, kubectl ops, docker ops, git ops, AWS regions, prometheus label, ... |
| PDA / TM (formal) | 17 | balanced parens, nested quotes, scope depth, palindrome, a^n b^n c^n, majority binary, ... |
Advanced capabilities shipping today
Six capabilities layered on top of the base catalog. Each one is in production and proven on real-model traffic.
Cross-implementation agreement proof
Differential: A sample-bounded statistical guarantee that two implementations of the same spec agree to within an epsilon, with the testing seed committed by the verifier so the prover cannot cherry-pick inputs.
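A minimal sketch of the idea, not the shipped implementation. It assumes the standard zero-failure bound: if n >= ln(1/delta) / epsilon seeded samples show no disagreement, the disagreement rate under the sampling distribution is at most epsilon with confidence 1 - delta. The function names, the input alphabet, and the two digit recognizers are hypothetical.

```python
import hashlib
import math
import random
import re


def commit_seed(seed: bytes) -> str:
    """Digest the verifier can publish before seeing the implementations."""
    return hashlib.sha256(seed).hexdigest()


def samples_needed(epsilon: float, delta: float) -> int:
    """Samples so that zero disagreements implies rate <= epsilon at confidence 1 - delta."""
    return math.ceil(math.log(1.0 / delta) / epsilon)


def agree(impl_a, impl_b, seed: bytes, epsilon: float, delta: float, max_len: int = 16) -> bool:
    """Compare both implementations on inputs derived only from the committed seed."""
    rng = random.Random(seed)
    for _ in range(samples_needed(epsilon, delta)):
        length = rng.randint(0, max_len)
        candidate = "".join(rng.choice("0123456789.") for _ in range(length))
        if impl_a(candidate) != impl_b(candidate):
            return False
    return True


# Two hypothetical implementations of the same "non-empty ASCII digits" spec.
def digits_loop(text: str) -> bool:
    return len(text) > 0 and all(ch in "0123456789" for ch in text)


def digits_regex(text: str) -> bool:
    return re.fullmatch(r"[0-9]+", text) is not None


seed = b"verifier-chosen-seed"  # hypothetical seed value
commitment = commit_seed(seed)
verdict = agree(digits_loop, digits_regex, seed, epsilon=0.01, delta=0.001)
```

Because every test input is derived from the seed, anyone holding the seed and the published commitment can replay the exact same sample set. The guarantee is only as strong as the sampling distribution is representative of real inputs.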
Streaming output safety gate with byte-level audit
Streaming: Banned-substring filtering at sub-microsecond per character on a model's streamed output, plus a per-byte chain hash that turns the entire stream into a single auditable receipt.
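The two mechanisms can be sketched together as follows. This is an illustration under assumptions, not the production gate: the class name, the `b"stream-start"` initial value, and the use of SHA-256 are hypothetical, and the naive suffix check stands in for whatever matcher achieves the stated per-character latency (a multi-pattern automaton such as Aho-Corasick is one common choice).

```python
import hashlib


class StreamGate:
    """Hypothetical gate: scans streamed bytes and folds each one into a hash chain."""

    def __init__(self, banned: list[bytes]):
        self.banned = [pattern for pattern in banned if pattern]
        # Keep only as many trailing bytes as the longest banned pattern needs.
        self.keep = max((len(pattern) for pattern in self.banned), default=1)
        self.tail = b""
        self.chain = hashlib.sha256(b"stream-start").digest()
        self.blocked = False

    def feed(self, chunk: bytes) -> bool:
        """Absorb a chunk; return False once a banned substring has appeared."""
        for value in chunk:
            if self.blocked:
                break
            byte = bytes([value])
            # Per-byte chain: each link commits to every byte seen so far.
            self.chain = hashlib.sha256(self.chain + byte).digest()
            self.tail = (self.tail + byte)[-self.keep:]
            if any(self.tail.endswith(pattern) for pattern in self.banned):
                self.blocked = True
        return not self.blocked

    def receipt(self) -> str:
        """Single digest covering every byte absorbed so far."""
        return self.chain.hex()


gate = StreamGate([b"rm -rf"])
first_ok = gate.feed(b"please rm")   # banned text not complete yet
second_ok = gate.feed(b" -rf /")     # completes the banned substring across chunks
```

Keeping a bounded tail buffer lets the gate catch a banned substring that is split across chunk boundaries, and hashing byte by byte makes the receipt independent of how the stream happened to be chunked.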
Live catalog auto-grow
Auto-grow: A nominated tool is shadowed; the system distills a minimal deterministic recognizer from the captured examples and proposes a new signed catalog entry for the next deploy.
Composed-tool proof receipts
Composition: When tools are chained together, each boundary is dispatched and verified by content hash. The composition produces a single proof receipt covering the whole pipeline.
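One way to picture boundary hashing is the sketch below. It is an assumption-laden illustration rather than the actual receipt format: the step record layout, the JSON serialization, and the function names are hypothetical, and a real receipt would presumably also carry a signature.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def run_pipeline(tools, payload: bytes):
    """Run (name, function) pairs in order, hashing the data at every boundary."""
    steps = []
    for name, tool in tools:
        result = tool(payload)
        steps.append({
            "tool": name,
            "input": content_hash(payload),
            "output": content_hash(result),
        })
        payload = result
    # One digest over all boundary records covers the whole pipeline.
    receipt = content_hash(json.dumps(steps, sort_keys=True).encode())
    return payload, steps, receipt


def verify(steps, receipt: str) -> bool:
    """Check that each output hash feeds the next input and the receipt matches."""
    linked = all(a["output"] == b["input"] for a, b in zip(steps, steps[1:]))
    return linked and receipt == content_hash(json.dumps(steps, sort_keys=True).encode())


tools = [("strip", bytes.strip), ("upper", bytes.upper)]
output, steps, receipt = run_pipeline(tools, b"  hello  ")
```

A verifier that holds only `steps` and `receipt` can confirm the chain is unbroken without re-running any tool; swapping, dropping, or altering a step changes either a boundary link or the final digest.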
Verifiable rejection guarantees
Safety: The catalog can prove not just what an artifact accepts but also what it rejects; this is useful for safety claims such as "this validator rejects ALL leading-zero IPv4 octets".
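For a claim like the leading-zero example, a bounded check can be sketched by enumeration. The regular expression and helper names below are hypothetical and are not the catalog's validator or its proof method; the check covers every leading-zero octet of two or three digits in each of the four positions, with fixed valid neighbours.

```python
import itertools
import re

# Hypothetical strict octet: 0-255 with no leading zeros.
OCTET = r"(?:0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")


def accepts(text: str) -> bool:
    return IPV4.fullmatch(text) is not None


def leading_zero_octets():
    """Every 2- or 3-digit string that starts with '0' (110 strings in total)."""
    for width in (2, 3):
        for digits in itertools.product("0123456789", repeat=width - 1):
            yield "0" + "".join(digits)


def counterexamples():
    """Addresses containing a leading-zero octet that the validator wrongly accepts."""
    found = []
    for bad in leading_zero_octets():
        for position in range(4):
            parts = ["192", "168", "1", "10"]
            parts[position] = bad
            candidate = ".".join(parts)
            if accepts(candidate):
                found.append(candidate)
    return found


violations = counterexamples()  # an empty list means no counterexample in the enumerated set
```

Because the set of leading-zero octets of bounded width is finite, the enumeration is exhaustive over that set; extending the claim to arbitrary neighbouring octets or longer digit runs would need either a larger enumeration or a structural argument over the recognizer itself.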
LLM-to-FSM distillation
Distillation: On a sufficiently constrained capability surface, a live LLM's behavior was reduced to a perfect 3-state finite-state machine in 8 ms, then signed and added to the catalog as a deterministic replacement.
All measurements run end-to-end on commodity hardware. Methodology summaries are available in the whitepaper. For reproducibility help on a specific number, email contact@veriops.io.