Real-World Applications

Verification isn’t just academic—it’s being used in real systems today. From autonomous vehicles to medical devices, practitioners are applying verification to build confidence in critical neural networks. But verification looks different in each domain. What matters in autonomous vehicles differs from what matters in medical diagnosis. Understanding domain-specific verification needs helps you apply tools effectively.

Autonomous Vehicles and Perception Systems

Autonomous vehicles rely on neural networks for perception: object detection, lane detection, traffic sign recognition. These systems must handle diverse weather, lighting, and road conditions. Verification helps ensure robustness to sensor noise and unexpected conditions, including physical adversarial examples [Eykholt et al., 2018] like adversarial road signs.

What gets verified: Object detection networks’ robustness to small sensor perturbations. Lane detection networks’ behavior under rain or fog. Traffic sign classification robustness to occlusions or distortions. The specification is typically: “under realistic sensor noise, output classification remains correct.”
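
To make that specification concrete, here is a minimal sketch in Python of the \(\ell_\infty\) robustness property as a falsification check. The `model` callable (returning class logits) is a hypothetical stand-in; note that random sampling can only disprove the property, whereas a verifier proves it over the entire perturbation set.

```python
import numpy as np

def linf_robustness_property(model, x, true_label, epsilon, trials=1000):
    """Specification: for every x' with ||x' - x||_inf <= epsilon,
    argmax(model(x')) == true_label.

    Random sampling can only FALSIFY the property (one counterexample
    disproves it); a verifier proves it over the whole epsilon-ball.
    """
    for _ in range(trials):
        noise = np.random.uniform(-epsilon, epsilon, size=x.shape)
        x_perturbed = np.clip(x + noise, 0.0, 1.0)  # stay in valid pixel range
        if np.argmax(model(x_perturbed)) != true_label:
            return False  # counterexample found: property violated
    return True  # no counterexample found (not a proof)
```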

Challenges specific to AVs: Sensor characteristics vary (camera, lidar, radar each have different noise models). Networks must work across diverse weather and lighting conditions. Verification must account for this diversity without becoming overly conservative. Safety implications are extreme—a misclassified object could be fatal.

Notable work: The ACAS Xu benchmark [Katz et al., 2017] addresses collision avoidance—a critical autonomous system. Verification here is mature, with multiple solvers able to verify properties. ACAS Xu shows what’s possible when the domain is narrow and specifications are precise.

AV Component | Verification Goal | Key Challenge | Status
--- | --- | --- | ---
Object detector | Robustness to sensor noise | Maintaining detection quality | Emerging
Lane detection | Behavior under weather variation | Diverse condition handling | Research stage
Sign classification | Occlusion robustness | Physical realism | Lab demonstrations
Collision avoidance (ACAS Xu) | Loss of separation prevention | Formal specification rigor | Production-ready verification

Regulatory landscape: Aviation already mandates verification (FAA requirements). Ground vehicle regulations are developing but not yet strict. Medical devices are ahead of automotive in regulatory requirements.

Medical Diagnosis and Clinical Decision Support

Medical neural networks make high-stakes decisions: interpreting radiological images, suggesting diagnoses, predicting patient outcomes. Errors have serious consequences. Verification is part of building FDA-approved systems.

What gets verified: Image classification networks (detecting tumors, identifying lesions) under realistic imaging variations. Clinical prediction models’ robustness to measurement noise or data entry errors. The critical property: “under realistic perturbations to imaging data or inputs, the diagnostic output changes appropriately and doesn’t flip to dangerous alternatives.”
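
As a sketch of the "no dangerous flip" idea, the check below assumes a bound-propagation verifier has already produced element-wise lower and upper bounds on the output logits over the perturbation set (the `logit_lower` / `logit_upper` arrays are hypothetical); the property holds only if the malignant class cannot be overtaken in the worst case.

```python
def no_dangerous_flip(logit_lower, logit_upper, malignant_idx):
    """Property: over the whole perturbation set, the 'malignant' score
    cannot be overtaken by any other class, so a detected tumor cannot
    flip to a missed one.

    logit_lower / logit_upper are hypothetical element-wise output bounds
    (one entry per class) returned by a bound-propagation verifier.
    """
    worst_malignant = logit_lower[malignant_idx]
    best_competitor = max(
        upper for i, upper in enumerate(logit_upper) if i != malignant_idx
    )
    # Sound check: even the worst-case malignant score beats every
    # competitor's best case, so no input in the set can flip the label.
    return worst_malignant > best_competitor
```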

Domain-specific challenges: Medical datasets are small compared to ImageNet. Training data often comes from specific imaging equipment or institutions. Verification must account for domain shift when deploying to new facilities. Clinical decisions often require both the network’s output and supporting evidence—pure classification isn’t enough.

Regulatory requirements: The FDA requires evidence of robustness for approved medical devices. This evidence isn’t necessarily formal verification; it can include adversarial testing, tolerance analysis, and formal verification of critical properties. Formal verification complements traditional medical device validation.

Ethical Considerations in Medical Verification

Medical systems serve human patients. Over-optimism about verification can be dangerous. Use verification as one component of assurance alongside clinical validation, diverse testing, and human oversight. Don’t claim verified robustness beyond what you’ve actually verified.

Robotics and Autonomous Systems

Robotic control systems increasingly use neural networks for vision and decision-making. Verification helps ensure safe behavior.

What gets verified: Vision networks for grasping (correctly identifying graspable surfaces). Navigation networks (avoiding obstacles under perception uncertainty). Control networks (maintaining stability under unexpected disturbances).

Unique challenges: Robotics operates in continuous physical systems where small errors accumulate. A small perception mistake early in a manipulation task can cause cascade failures. Verification must account for this temporal accumulation.

Example domain: Robotic surgery combines perception and control. A verification framework might verify that perception networks maintain accuracy under the surgical lighting conditions, and that control networks maintain stability constraints even under realistic sensor noise.

Current state: Some robots use verified components (verified perception in simple grasping). Full verification of complex manipulation tasks remains research. Real-world deployment often combines verified components with redundancy and human oversight.

Cybersecurity and Malware Detection

Security applications use neural networks for anomaly detection, malware classification, and intrusion detection. These systems face an adversary who actively tries to evade detection.

What gets verified: Robustness to evasion attacks [Athalye et al., 2018] designed to bypass detection. Consistency in classification (same malware samples consistently classified as malicious). The challenge: an adversary might discover robustness violations that verification misses.

Unique aspect: In security, verification is an arms race. An adversary with knowledge of the verified properties can still search for other weaknesses. Verification proves robustness to specific perturbations, not robustness to all adversarial attacks.

Practical approach: Verification helps, but is combined with diversity (multiple detection models), sandboxing, and behavioral monitoring. No single verification or classifier is trusted absolutely in security-critical contexts.

Financial Systems and Algorithmic Decision-Making

Financial institutions use neural networks for credit scoring, fraud detection, and trading decisions. Regulatory and ethical requirements demand fairness and explainability.

What gets verified: Fairness properties (credit scoring isn’t biased by protected attributes). Robustness to input variations (fraud detection doesn’t flip on minor data entry errors). Output bounds (estimated risk scores stay in realistic ranges).
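
A minimal sketch of an individual-fairness probe, assuming a hypothetical `score_model` callable that returns a scalar risk score: flipping only the protected attribute should leave the score unchanged. A verifier would prove this over a whole input region; the probe below only checks individual samples.

```python
import numpy as np

def individual_fairness_probe(score_model, x, protected_idx, values, tol=1e-6):
    """Probe: changing only the protected attribute should not change the
    risk score beyond a small tolerance.

    score_model is a hypothetical callable returning a scalar score;
    protected_idx indexes the protected attribute in the feature vector.
    This is a per-sample probe; a verifier would prove the property for
    every input in a specified region.
    """
    baseline = score_model(x)
    for value in values:
        x_flipped = np.array(x, dtype=float)
        x_flipped[protected_idx] = value
        if abs(score_model(x_flipped) - baseline) > tol:
            return False  # score depends on the protected attribute
    return True
```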

Regulatory context: Fair lending laws and recent AI regulation increasingly require evidence of fairness. This is less about technical verification than about demonstrating robustness across demographic groups. But formal verification can support fairness evidence.

Challenge: Financial data distributions are complex. Fairness properties interact with accuracy. A perfectly fair model might have lower overall accuracy. Verification must navigate this tradeoff.

Natural Language Processing and Text Models

NLP networks process text, where meaningful perturbations are discrete and semantic rather than continuous pixel changes. What does robustness mean for text?

What gets verified: Text classification robustness to character substitutions, word substitutions, or paraphrasing. Sentiment analysis consistency. Named entity recognition stability.

Current limitations: Semantic perturbations in text are hard to formalize. “The movie was great” and “The movie was excellent” are semantically similar but different character strings. Defining robustness rigorously is harder than for images. Current verification tools have limited applicability to text.

Emerging approaches: Using word embeddings to define semantic distance. Verifying simpler text properties where perturbations are well-defined. Research is advancing, but NLP verification is less mature than computer vision verification.
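
One way to make word-substitution robustness concrete is to define, for each word, the set of allowed replacements within an embedding-distance radius. The sketch below (with hypothetical `embeddings` and `vocab` lookups) builds that perturbation set, which a verifier would then have to cover exhaustively.

```python
import numpy as np

def substitution_neighborhood(sentence, embeddings, vocab, radius):
    """Builds per-word candidate sets: each word may be replaced by any
    vocabulary word whose embedding lies within `radius` of the original.

    `embeddings` (word -> vector) and `vocab` are hypothetical lookups.
    A text verifier would then try to prove the classifier's output is
    constant over every sentence formed from these candidate sets.
    """
    candidate_sets = []
    for word in sentence.split():
        base = embeddings[word]
        candidates = [
            w for w in vocab
            if np.linalg.norm(embeddings[w] - base) <= radius
        ]
        candidate_sets.append(candidates)
    return candidate_sets
```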

Deployment Considerations

Getting verification into production requires thinking beyond the verification itself.

Runtime monitoring: Deploy verified networks alongside monitors that check if inputs remain in verified regions. If an input falls outside verified assumptions, trigger fallback behavior.

Fallback mechanisms: Verified networks handle the common case within verified boundaries. Out-of-distribution inputs trigger fallback to slower but more robust methods, or human review.
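
A minimal sketch of the monitor-plus-fallback pattern, assuming the verified region is an \(\ell_\infty\) ball around a reference input (a stand-in for whatever region was actually certified) and hypothetical `verified_model` and `fallback` callables:

```python
import numpy as np

def predict_with_fallback(x, verified_model, fallback, center, epsilon):
    """Route each input based on whether it lies in the verified region.

    The region is modeled here as an L-infinity ball of radius `epsilon`
    around `center`, a placeholder for whatever region was certified
    offline. `verified_model` and `fallback` are hypothetical callables.
    """
    inside_verified_region = np.max(np.abs(x - center)) <= epsilon
    if inside_verified_region:
        return verified_model(x)  # covered by the offline guarantee
    return fallback(x)  # outside verified assumptions: slower, more robust path
```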

Continuous verification: Networks may be updated frequently. Verification must be part of the update pipeline—any model change requires re-verification.
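
A sketch of such a gate, with `verify_fn` standing in for however the project invokes its verifier (command-line wrapper, Python API, or similar); the updated model is promoted only if every property is re-proved:

```python
def verification_gate(model_path, properties, verify_fn):
    """CI/CD gate: re-verify every property whenever the model changes.

    `verify_fn(model_path, prop)` is a placeholder for invoking whatever
    verifier the project uses; it should return True only if the property
    is proved. The pipeline promotes the new model only if all checks pass.
    """
    results = {prop["name"]: verify_fn(model_path, prop) for prop in properties}
    failed = [name for name, proved in results.items() if not proved]
    if failed:
        raise RuntimeError(f"Verification gate failed for: {failed}")
    return results
```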

Certification documentation: Regulatory compliance requires documenting what was verified, under what assumptions, with what limitations. “Verified network” is meaningless; “network verified to maintain classification under \(\ell_\infty\) norm perturbations with \(\epsilon = 1/255\)” is precise.
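
One lightweight way to keep that documentation precise is a structured record per verified property; the fields and values below are illustrative, not results from a real verification run:

```python
from dataclasses import dataclass

@dataclass
class VerificationRecord:
    """One certification entry: exactly what was proved, for which model,
    under which assumptions, with which tool. Values below are illustrative."""
    model_id: str      # e.g. content hash of the deployed weights
    property: str      # precise property statement
    perturbation: str  # norm and radius
    input_domain: str  # data over which the claim holds
    verifier: str      # tool name and version
    outcome: str       # "proved", "falsified", or "timeout"

record = VerificationRecord(
    model_id="sha256:<hash of weights>",
    property="argmax of logits unchanged for all perturbed inputs",
    perturbation="L_inf, epsilon = 1/255",
    input_domain="test images whose clean prediction is correct",
    verifier="<verifier name and version>",
    outcome="proved",
)
```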

Deployment Strategy | Use Case | Computational Cost | Confidence Level
--- | --- | --- | ---
Verified core with runtime monitoring | Production systems with bounded inputs | Low (single forward pass) | High within verified bounds
Fallback for out-of-distribution | Safety-critical systems | Higher (conditional computation) | High (fallback guarantees)
Continuous verification in CI/CD | Rapidly evolving models | High (offline verification) | Continuous assurance
Verified components + redundancy | High-stakes decisions | Medium (multiple networks) | High (independent verification)

Common Patterns Across Domains

Several patterns emerge across different application domains:

Start with critical components: Don’t verify everything. Identify the most critical decisions and verify those. Fall back to testing for less critical components.

Domain expertise is essential: Generic verification tools are useful, but domain experts must define what properties matter and what perturbations are realistic.

Combine verification with testing: Verification proves some properties, testing confirms others. Together they provide comprehensive confidence.

Iterate on specifications: Requirements are often clarified through verification attempts. A specification that times out might be refined to something more tractable.

Monitor after deployment: Even verified networks can encounter unexpected inputs. Runtime monitoring catches situations where verification assumptions don’t hold.

Research Frontiers

The field is developing domain-specific verification frameworks. Rather than generic tools, specialized verifiers for medical imaging, autonomous vehicles, and other critical domains are emerging. These incorporate domain knowledge about realistic perturbations and critical properties.

Hybrid human-machine verification is also emerging: combining formal verification with testing, human expertise, and simulation yields comprehensive assurance frameworks that are more practical than pure formal verification alone.

Final Thoughts

Verification is practical and being used today. But it looks different across domains. Medical devices, autonomous vehicles, and safety-critical systems all apply verification, but with different specifications, tools, and workflows adapted to domain-specific needs.

Domain expertise matters as much as verification technique. Understanding what your application requires, what perturbations are realistic, and what properties matter is half the battle. The other half is choosing appropriate verification tools and accepting the limitations of current techniques.

Real-world verification is not “perfect assurance.” It’s evidence of robustness under specific assumptions. Combined with testing, monitoring, and human oversight, it helps build systems you can reasonably trust.

Further Reading

This guide provides comprehensive coverage of neural network verification across real-world application domains. For readers interested in diving deeper, we recommend the following resources organized by topic:

Safety-Critical Systems:

The ACAS Xu collision avoidance benchmark [Katz et al., 2017] established rigorous formal specifications for aviation safety, demonstrating that production-ready verification is achievable when domains are narrow and specifications are precise. This work uses complete verification methods [Katz et al., 2017, Katz et al., 2019] to provide absolute guarantees for collision avoidance—a critical requirement in aviation where probabilistic guarantees are insufficient.

Adversarial Robustness in Real-World Contexts:

Physical adversarial examples [Eykholt et al., 2018] demonstrate that adversarial attacks extend beyond digital perturbations to physical-world manipulations like adversarial road signs that fool autonomous vehicle perception systems. Evasion attacks [Athalye et al., 2018] in cybersecurity contexts show that adversaries actively probe defenses, requiring verification to account for adaptive attackers. Decision-based attacks [Brendel et al., 2018] operate with minimal model access, representing realistic threat models for deployed systems.

Verification Methods for Applications:

Different application domains require different verification approaches. Complete verification methods [Katz et al., 2017, Tjeng et al., 2019] provide absolute guarantees suitable for safety-critical aviation and medical systems. Incomplete but scalable methods [Gowal et al., 2019, Singh et al., 2019, Weng et al., 2018, Zhang et al., 2018] enable verification of larger perception networks in autonomous vehicles. Randomized smoothing [Cohen et al., 2019] offers probabilistic certificates that scale to production-sized models where deterministic verification becomes intractable.

Training for Real-World Robustness:

Deploying verified networks requires training strategies that build in robustness. Certified training methods [Gowal et al., 2019, Zhang et al., 2019] integrate verification bounds into the training objective. Adversarial training [Madry et al., 2018] provides empirical robustness through training on strong attacks. The choice between certified and empirical training depends on regulatory requirements and acceptable assurance levels in each domain.

Related Topics:

For writing domain-specific formal specifications, see Formal Specifications. For training networks with robustness objectives suitable for deployment, see Training Robust Networks. For practical workflows combining verification with testing, see Robustness Testing Guide.

Congratulations!

You’ve completed all phases of the NNV Guides! To see these techniques in action, explore WraLU: Fast and Precise ReLU Hull Approximation and WraAct: Convex Hull Approximation for General Activation Functions for practical verification tools, or visit My Research Journey to learn about ongoing research directions.

[1] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning, 274–283. 2018.

[2] Wieland Brendel, Jonas Rauber, and Matthias Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In International Conference on Learning Representations. 2018.

[3] Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning. 2019.

[4] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1625–1634. 2018.

[5] Sven Gowal, Krishnamurthy Dj Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. Scalable verified training for provably robust image classification. In Proceedings of the IEEE International Conference on Computer Vision, 4842–4851. 2019.

[6] Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In International Conference on Computer Aided Verification, 97–117. Springer, 2017.

[7] Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, and others. The Marabou framework for verification and analysis of deep neural networks. In International Conference on Computer Aided Verification, 443–452. Springer, 2019.

[8] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. 2018.

[9] Gagandeep Singh, Rupanshu Ganvir, Markus Püschel, and Martin Vechev. Beyond the single neuron convex barrier for neural network certification. In Advances in Neural Information Processing Systems, 15072–15083. 2019.

[10] Vincent Tjeng, Kai Y. Xiao, and Russ Tedrake. Evaluating robustness of neural networks with mixed integer programming. In International Conference on Learning Representations. 2019.

[11] Lily Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for ReLU networks. In International Conference on Machine Learning, 5276–5285. 2018.

[12] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning, 7472–7482. PMLR, 2019.

[13] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, 4939–4948. 2018.