Designing Predictable LLM-Verifier Systems for Formal Method Guarantee
[Submitted on 30 Nov 2025 (v1), last revised 16 Dec 2025 (this version, v2)]
Abstract: The integration of Formal Verification tools with Large Language Models (LLMs) offers a path to scale software verification beyond manual workflows. However, current methods remain unreliable: without a solid theoretical footing, the refinement process acts as a black box that may oscillate, loop, or diverge. This work bridges this critical gap by developing an LLM-Verifier Convergence Theorem, providing the first formal framework with provable guarantees for termination in multi-stage verification pipelines. We model the interaction not as a generic loop, but as a sequential absorbing Markov Chain comprising four essential engineering stages: \texttt{CodeGen}, \texttt{Compilation}, \texttt{InvariantSynth}, and \texttt{SMTSolving}. We prove that for any non-zero stage success probability ($\delta > 0$), the system reaches the \texttt{Verified} state almost surely. Furthermore, because of the sequential nature of the pipeline, we derive a precise latency bound of $\mathbb{E}[n] \leq 4/\delta$. We stress-tested this prediction in an extensive empirical campaign comprising over 90,000 trials. The results match the theory with striking consistency: every run reached verification, and the empirical convergence factor clustered tightly around $C_f \approx 1.0$, confirming that the $4/\delta$ bound accurately mirrors system behavior rather than serving as a loose buffer. Based on this data, we identify three distinct operating zones (marginal, practical, and high-performance) and propose a dynamic calibration strategy to handle parameter drift in real-world environments. Together, these contributions replace heuristic guesswork with a rigorous architectural foundation, enabling predictable resource planning and performance budgeting for safety-critical software.
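The latency bound can be illustrated with a small Monte Carlo sketch. It is not the authors' experimental code. It assumes, hypothetically, that a failed stage is simply retried in place and that every stage succeeds independently with probability exactly $\delta$; the abstract does not specify the failure transitions. Under those assumptions each stage takes a geometric number of attempts with mean $1/\delta$, so the four stages together take $4/\delta$ attempts on average, which meets the bound with equality. The names `run_pipeline` and `mean_steps` are illustrative only.

```python
import random

# Stage names taken from the abstract; the retry-in-place failure model
# below is an assumption made for illustration only.
STAGES = ["CodeGen", "Compilation", "InvariantSynth", "SMTSolving"]


def run_pipeline(delta, rng):
    """Count attempts until all four stages have succeeded (Verified)."""
    steps = 0
    for _stage in STAGES:
        # Retry the current stage until it succeeds with probability delta.
        while True:
            steps += 1
            if rng.random() < delta:
                break
    return steps


def mean_steps(delta, trials, seed=0):
    """Average number of attempts over `trials` independent runs."""
    rng = random.Random(seed)
    total = sum(run_pipeline(delta, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for delta in (0.1, 0.5, 0.9):
        bound = 4 / delta
        observed = mean_steps(delta, trials=20_000)
        # The ratio observed/bound should sit near 1 in this simplified model.
        print(f"delta={delta}: mean={observed:.2f} "
              f"bound={bound:.2f} ratio={observed / bound:.3f}")
```

In this simplified model the observed-to-bound ratio stays close to 1 for each $\delta$, which is loosely comparable in spirit to the reported $C_f \approx 1.0$. How the paper defines $C_f$ is not stated in the abstract, so the comparison is only indicative.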
Submission history
From: Pierre Dantas
[v1] Sun, 30 Nov 2025 22:19:09 UTC (459 KB)
[v2] Tue, 16 Dec 2025 22:28:30 UTC (491 KB)
