Why Neural Network Verification?

Neural networks achieve impressive accuracy on test sets. They classify images, translate languages, and make predictions with human-level performance. Yet a disturbing pattern emerges: small, often imperceptible changes to inputs can cause catastrophic failures. A stop sign with carefully placed stickers becomes a speed limit sign [Eykholt et al., 2018]. A sentence with one word changed flips sentiment classification. These aren’t random errors—they’re vulnerabilities that can be systematically exploited.

This is why verification matters. Standard testing measures average-case performance. Verification asks: what’s the worst that can happen? Can we prove the network behaves correctly even under adversarial conditions? This fundamental shift—from empirical evaluation to formal guarantees—is what neural network verification provides.

The Adversarial Example Problem

The story begins with adversarial examples [Goodfellow et al., 2015, Szegedy et al., 2013]. Researchers discovered that adding carefully crafted noise to images—noise imperceptible to humans—causes neural networks to confidently misclassify inputs. A panda becomes a gibbon. A school bus becomes an ostrich. The network’s confidence remains high even as it produces nonsensical outputs.

Definition: Adversarial Example

An adversarial example is an input \(x'\) crafted to fool a neural network \(f\), where \(x'\) is perceptually similar to a natural input \(x\) but causes \(f(x') \neq f(x)\). Formally, for a perturbation budget \(\epsilon\):

\[x' = x + \delta \text{ where } \|\delta\|_p \leq \epsilon \text{ and } f(x') \neq f(x)\]

The perturbation \(\delta\) is chosen to maximize the probability of misclassification while remaining within the allowed budget.
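The definition translates directly into a check you can run on a candidate input. Below is a minimal PyTorch sketch, assuming model is an ordinary classifier that returns logits; the function name and signature are illustrative, not from any particular library.

```python
import torch

def is_adversarial(model, x, x_adv, eps, p=float("inf")):
    """Check the two conditions from the definition:
    (1) the perturbation stays inside the l_p ball of radius eps,
    (2) the predicted class changes."""
    delta = (x_adv - x).flatten()
    within_budget = torch.linalg.vector_norm(delta, ord=p) <= eps + 1e-8
    with torch.no_grad():
        pred_clean = model(x).argmax(dim=-1)
        pred_adv = model(x_adv).argmax(dim=-1)
    return bool(within_budget) and not torch.equal(pred_clean, pred_adv)
```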

Why adversarial examples exist: Neural networks learn decision boundaries in high-dimensional space. In these spaces, small perturbations can move inputs across decision boundaries even when the perturbations are imperceptible in the original input space. The network hasn’t learned the “right” features—it’s learned features that work well on average but fail catastrophically on adversarial inputs.

Attack methods reveal vulnerabilities: The Fast Gradient Sign Method (FGSM) [Goodfellow et al., 2015] computes the gradient of the loss with respect to the input and takes a single step in the direction that increases the loss. Projected Gradient Descent (PGD) [Madry et al., 2018] iterates smaller steps, projecting back into the allowed perturbation ball after each one, to produce a stronger attack. Carlini-Wagner (C&W) [Carlini and Wagner, 2017] formulates attack generation as an optimization problem, finding minimal perturbations that change predictions. These attacks aren't exotic; they are straightforward applications of gradient descent.
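As a concrete illustration, here is a minimal single-step FGSM sketch in PyTorch. The names model and loss_fn stand in for an ordinary classifier and its loss (for example cross-entropy), and inputs are assumed to lie in [0, 1]; this is a sketch of the idea, not a hardened attack implementation.

```python
import torch

def fgsm_attack(model, loss_fn, x, y, eps):
    """One-step FGSM: perturb x in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # Step by eps in the sign of the input gradient, then clamp to valid pixel range.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```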

The existence of adversarial examples demonstrates a fundamental gap: networks that perform well on test sets can fail completely on inputs just slightly different from the test distribution.

Real-World Consequences

Adversarial vulnerabilities aren’t just academic curiosities. They have serious real-world implications.

🚗 Autonomous Vehicles

Physical adversarial examples can cause misclassification of stop signs and traffic signals. Adversarial patches on roads could cause lane detection failures, with potentially fatal consequences.

🏥 Medical Diagnosis

Diagnostic networks might correctly identify tumors in test images but fail when images are slightly degraded or from different equipment. Without verification, robustness to realistic imaging variations is unknown.

🔒 Security Systems

Malware detectors using neural networks can be evaded through adversarial perturbations. Attackers who understand the detector can craft malware that evades detection while maintaining malicious functionality.

💰 Financial Systems

Credit scoring and fraud detection networks make high-stakes decisions. Small perturbations to input features could change decisions unfairly or through adversarial manipulation.

The Stakes Are High

When neural networks control safety-critical systems—vehicles, medical devices, security infrastructure—failures aren’t just embarrassing. They can be dangerous. Testing on average cases isn’t sufficient. We need guarantees about worst-case behavior.

Why Testing Isn’t Enough

Standard machine learning evaluation uses test sets: hold out data, measure accuracy, report results. This empirical approach has served the field well for measuring average-case performance. But it’s fundamentally inadequate for assessing robustness.

Coverage problem: A test set of 10,000 images samples an infinitesimal fraction of the input space. For an 8-bit 224×224×3 image, there are \(256^{150528}\) possible inputs. Testing touches only a tiny subset, and adversarial examples can exist in the vast unsampled regions.
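A quick back-of-the-envelope computation (Python) shows the scale of the gap:

```python
import math

pixels = 224 * 224 * 3                  # 150,528 input dimensions
digits = int(pixels * math.log10(256))  # decimal digits in 256**pixels
print(f"256^{pixels} has about {digits:,} decimal digits")
# A 10,000-image test set samples a vanishing fraction of this space.
```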

Adversarial search is cheap: Finding adversarial examples requires only gradient computation—a few backward passes. An attacker can search far more thoroughly than any test set covers. Testing against known attacks provides some confidence, but an adaptive attacker will find new attacks.

No guarantees from empirical testing: If testing finds no adversarial examples, what have we learned? Only that the specific attacks tried didn’t succeed. This doesn’t mean adversarial examples don’t exist—it means we didn’t find them yet. Testing can falsify (finding counterexamples proves vulnerability) but cannot verify (failing to find counterexamples doesn’t prove robustness).

Table 12: Testing vs. Verification

Approach         | What It Provides                             | What It Doesn't Provide        | Computational Cost
Testing/Attacks  | Evidence of vulnerabilities (falsification)  | Proof of robustness            | Low (gradient-based)
Verification     | Proof of robustness (or counterexamples)     | Scalability to all properties  | High (depends on method)
Verification fills the gap: Formal verification provides mathematical proofs that properties hold for all inputs in a region, not just the inputs we happened to test. This is the fundamental difference: verification provides guarantees, testing provides evidence.

What Verification Provides

Neural network verification offers several types of assurance:

Certified robustness: Prove that for all inputs within a perturbation ball around a given input, the network maintains the same classification. This is a sound guarantee—if verification succeeds, the property holds. If it finds a counterexample, you’ve identified a real vulnerability.

Worst-case analysis: Instead of measuring average accuracy, verification analyzes worst-case behavior. For safety-critical applications, worst-case guarantees matter more than average-case performance.

Formal specifications: Verification requires precisely specifying what “correct behavior” means. This specification process often reveals ambiguities in requirements, forcing clearer thinking about what the network should actually do.

Regulatory compliance: Many safety-critical domains (medical devices, aviation, autonomous vehicles) increasingly require formal evidence of robustness. Verification provides the mathematical proof that regulators demand.

Confidence for deployment: Combining verified components with tested components provides layered assurance. Critical decisions rely on verified networks; less critical decisions use faster but unverified networks. This risk-based approach balances assurance with computational constraints.

Verification Guarantees

When verification succeeds, you have a mathematical proof that the specified property holds for all inputs in the verified region. This is stronger than empirical testing, which only covers sampled inputs. The guarantee is formal and rigorous—assuming the verification method is sound and correctly implemented.
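To make the guarantee concrete: for local robustness, the verified property is \(\forall x' : \|x' - x\|_\infty \leq \epsilon \implies \arg\max_i f_i(x') = \arg\max_i f_i(x)\). The sketch below propagates interval bounds (IBP, one of the simplest incomplete methods) through a toy two-layer PyTorch network, chosen purely for illustration, and checks whether the worst-case logit of the predicted class still beats the best-case logit of every other class.

```python
import torch
import torch.nn as nn

def forward(layers, x):
    """Plain forward pass through a list of layers."""
    for layer in layers:
        x = layer(x)
    return x

def interval_bounds(layers, lb, ub):
    """Propagate elementwise lower/upper bounds through Linear and ReLU layers."""
    for layer in layers:
        if isinstance(layer, nn.Linear):
            center, radius = (lb + ub) / 2, (ub - lb) / 2
            new_center = center @ layer.weight.T + layer.bias
            new_radius = radius @ layer.weight.abs().T
            lb, ub = new_center - new_radius, new_center + new_radius
        elif isinstance(layer, nn.ReLU):
            lb, ub = lb.clamp(min=0), ub.clamp(min=0)
    return lb, ub

def certify(layers, x, y, eps):
    """True if class y provably wins for every input in the l_inf ball of radius eps."""
    lb, ub = interval_bounds(layers, x - eps, x + eps)
    others = [i for i in range(lb.shape[-1]) if i != y]
    # Sound but possibly loose: the worst-case logit of class y must exceed
    # the best-case logit of every other class.
    return bool((lb[..., y] > ub[..., others].max()).all())

net = [nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3)]   # toy network, illustrative only
x = torch.rand(1, 4)
y = forward(net, x).argmax(dim=-1).item()
print("certified at eps=0.01:", certify(net, x, y, eps=0.01))
```

If certify returns False, that does not prove an adversarial example exists: IBP bounds can be loose, and a tighter incomplete method or a complete verifier may still succeed on the same property.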

The Verification Challenge

Verification is powerful but not easy. Neural networks pose unique challenges:

🔢 Complexity

Networks have millions of parameters and billions of operations. Analyzing all possible behaviors requires reasoning about high-dimensional spaces with complex nonlinearities.

📈 Non-linearity

ReLU activations, softmax layers, and other nonlinear operations create complex decision boundaries that are difficult to analyze formally.

📊 Scalability

Verification methods that work for small networks (hundreds of neurons) often fail for large networks (millions of parameters). Computational cost grows rapidly with network size.

⚖️ Incompleteness Tradeoff

Complete verification methods are slow and often intractable for large networks. Incomplete methods are faster but may return "unknown" instead of a definitive answer.

These challenges are real and fundamental—some stem from the inherent complexity of the verification problem (NP-completeness [Katz et al., 2017, Weng et al., 2018]), others from engineering limitations. But progress continues: new methods, better tools, and specialized techniques make verification increasingly practical.

When to Use Verification

Verification isn’t always necessary or practical. Understanding when it matters helps allocate resources effectively:

Use verification when:

  • Deploying safety-critical systems (autonomous vehicles, medical devices, aviation)

  • Facing regulatory requirements for formal assurance

  • Protecting against adaptive adversaries (security applications)

  • Needing worst-case guarantees, not just average-case performance

  • Deploying in high-stakes domains where failures have serious consequences

Testing may suffice when:

  • Prototyping and rapid iteration (verification would slow development)

  • Working with very large networks where verification is intractable

  • Operating in low-stakes domains where failures are acceptable

  • Using verification tools not yet mature for your architecture

Hybrid approaches when:

  • You can verify critical components and test the rest

  • Computational budget allows both verification and comprehensive testing

  • Different parts of the system have different assurance requirements

Hint

Start with testing to find vulnerabilities quickly. Use verification for critical components where formal guarantees matter. Combine both for comprehensive assurance: testing finds bugs fast, verification proves their absence in critical regions.
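A minimal sketch of that workflow follows; run_pgd_attack and verify_region are hypothetical helpers standing in for whichever attack and verifier you actually use.

```python
def assess_robustness(model, x, y, eps, run_pgd_attack, verify_region):
    """Cheap falsification first, expensive verification only if the attack fails."""
    x_adv = run_pgd_attack(model, x, y, eps)       # hypothetical attack helper
    if x_adv is not None:
        return "falsified", x_adv                  # concrete counterexample found
    if verify_region(model, x, y, eps):            # hypothetical verifier call
        return "certified", None                   # proof: robust on the whole ball
    return "unknown", None                         # incomplete verifier gave up
```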

The Path Forward

Verification is maturing from research technique to practical tool. Modern verifiers handle networks with thousands of neurons. GPU acceleration [Xu et al., 2020] makes verification faster. Certified training [Gowal et al., 2019, Zhang et al., 2019] builds verifiability into networks from the start. Hybrid methods [Wang et al., 2021, Zhang et al., 2022] combine incomplete verification with selective refinement for better scalability.
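A rough sketch of the certified-training idea appears below; it reuses the forward() and interval_bounds() helpers from the certification sketch above and blends a clean loss with a worst-case loss, in the spirit of (but not identical to) the IBP training recipe of Gowal et al. [2019]. The mixing weight kappa is an illustrative hyperparameter.

```python
import torch
import torch.nn.functional as F

def ibp_loss(layers, x, y, eps, kappa=0.5):
    """Blend the clean loss with a loss on worst-case logits from interval bounds.
    Reuses forward() and interval_bounds() from the certification sketch above;
    y is a batch of integer class labels."""
    clean_logits = forward(layers, x)
    lb, ub = interval_bounds(layers, x - eps, x + eps)
    # Worst case for the margin: lower bound for the true class, upper bound elsewhere.
    worst = ub.scatter(-1, y.unsqueeze(-1), lb.gather(-1, y.unsqueeze(-1)))
    return kappa * F.cross_entropy(clean_logits, y) + (1 - kappa) * F.cross_entropy(worst, y)
```

In practice the perturbation radius eps is typically ramped up gradually during training and kappa is annealed, which helps keep optimization stable.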

The field is expanding beyond simple feedforward networks to transformers [Vaswani et al., 2017], graph neural networks, and other modern architectures. Specialized verifiers for specific domains (medical imaging, autonomous vehicles) incorporate domain knowledge for tighter bounds.

The future of trustworthy AI depends on verification. As neural networks control increasingly critical systems, empirical testing alone won’t provide sufficient assurance. Verification offers the rigorous guarantees that safety-critical applications demand.

This guide series teaches you how to apply verification practically: choosing methods, writing specifications, training robust networks, and interpreting results. Verification isn’t magic—it’s a tool. Understanding when and how to use it makes the difference between networks you hope are robust and networks you can prove are robust.

Further Reading

This guide provides motivation for neural network verification by examining adversarial vulnerabilities and the limitations of empirical testing. For readers interested in diving deeper, we recommend the following resources organized by topic:

Adversarial Examples:

The discovery of adversarial examples [Szegedy et al., 2013] revealed fundamental vulnerabilities in neural networks. FGSM [Goodfellow et al., 2015] demonstrated that simple gradient-based attacks can generate adversarial examples efficiently. Stronger iterative attacks like PGD [Madry et al., 2018] and optimization-based attacks like C&W [Carlini and Wagner, 2017] showed that adversarial examples are not artifacts of weak attacks but fundamental properties of neural networks. Physical adversarial examples [Eykholt et al., 2018] demonstrated that these vulnerabilities extend to real-world physical systems.

Why Verification Is Needed:

The NP-completeness of neural network verification [Katz et al., 2017, Weng et al., 2018] establishes fundamental computational barriers, explaining why verification is challenging. Despite these barriers, verification provides mathematical guarantees [Katz et al., 2017, Tjeng et al., 2019, Katz et al., 2019] that empirical testing cannot. Understanding these theoretical foundations clarifies what verification can and cannot achieve.

Verification Methods:

Modern verification approaches span complete methods [Katz et al., 2017, Tjeng et al., 2019, Katz et al., 2019] that provide definitive answers but face scalability limits, and incomplete methods [Gowal et al., 2019, Singh et al., 2019, Weng et al., 2018, Zhang et al., 2018] that scale to larger networks by accepting “unknown” results. Hybrid approaches [Wang et al., 2021, Zhang et al., 2022] combine both strategies. Probabilistic certification [Cohen et al., 2019] offers an alternative through randomized smoothing.

Building Robust Networks:

Certified training methods [Gowal et al., 2019, Xu et al., 2020, Zhang et al., 2019] integrate verification into the training process, building verifiable robustness from the start. Adversarial training [Madry et al., 2018] provides empirical robustness. The combination of robust training and formal verification creates networks that are both performant and provably robust.

Next Guide

Continue to Learn NNV in 3 Minutes for a quick introduction to neural network verification concepts.

[1] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39–57. IEEE, 2017.

[2] Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning. 2019.

[3] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1625–1634. 2018.

[4] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations. 2015.

[5] Sven Gowal, Krishnamurthy Dj Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. Scalable verified training for provably robust image classification. In Proceedings of the IEEE International Conference on Computer Vision, 4842–4851. 2019.

[6] Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: an efficient SMT solver for verifying deep neural networks. In International Conference on Computer Aided Verification, 97–117. Springer, 2017.

[7] Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, and others. The Marabou framework for verification and analysis of deep neural networks. In International Conference on Computer Aided Verification, 443–452. Springer, 2019.

[8] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. 2018.

[9] Gagandeep Singh, Rupanshu Ganvir, Markus Püschel, and Martin Vechev. Beyond the single neuron convex barrier for neural network certification. In Advances in Neural Information Processing Systems, 15072–15083. 2019.

[10] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.

[11] Vincent Tjeng, Kai Y. Xiao, and Russ Tedrake. Evaluating robustness of neural networks with mixed integer programming. In International Conference on Learning Representations. 2019.

[12] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, 5998–6008. 2017.

[13] Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, and J. Zico Kolter. Beta-CROWN: efficient bound propagation with per-neuron split constraints for neural network robustness verification. Advances in Neural Information Processing Systems, 2021.

[14] Lily Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for ReLU networks. In International Conference on Machine Learning, 5276–5285. 2018.

[15] Kaidi Xu, Zhouxing Shi, Huan Zhang, Yihan Wang, Kai-Wei Chang, Minlie Huang, Bhavya Kailkhura, Xue Lin, and Cho-Jui Hsieh. Automatic perturbation analysis for scalable certified robustness and beyond. Advances in Neural Information Processing Systems, 2020.

[16] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning, 7472–7482. PMLR, 2019.

[17] Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, and J. Zico Kolter. General cutting planes for bound-propagation-based neural network verification. arXiv preprint arXiv:2208.05740, 2022.

[18] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, 4939–4948. 2018.