Beyond \(\ell_p\): Alternative Threat Models

Most neural network verification [Gowal et al., 2019, Weng et al., 2018, Zhang et al., 2018, Katz et al., 2019] focuses on \(\ell_p\) norm-bounded perturbations—small pixel changes invisible to humans. But real-world adversaries aren’t constrained to \(\ell_p\) balls. They can modify semantics (change object pose, lighting), add physical patches, or exploit distribution shift.

These alternative threat models require different verification approaches. Semantic perturbations need invariance to meaningful transformations. Physical adversarial examples demand robustness under real-world constraints. Distributional robustness requires guarantees across data distributions. Each threat model brings unique challenges and opportunities for verification.

This guide explores threat models beyond \(\ell_p\), how they differ from traditional adversarial perturbations, and what verification methods exist for each.

Limitations of \(\ell_p\) Threat Models

Why \(\ell_p\) Norms Dominate

Standard formulation: Adversarial robustness typically considers \(\ell_p\) balls:

\[\mathcal{B}_p^\epsilon(x_0) = \{ x : \|x - x_0\|_p \leq \epsilon \}\]
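
Checking membership in this ball is computationally trivial, which is part of its appeal. A minimal sketch in NumPy (the image shape and perturbation budget are illustrative assumptions):

    import numpy as np

    def in_lp_ball(x, x0, eps, p=np.inf):
        # True if x lies within the l_p ball of radius eps around x0.
        return np.linalg.norm((x - x0).ravel(), ord=p) <= eps

    x0 = np.random.rand(3, 32, 32)                        # clean image in [0, 1]
    x = np.clip(x0 + np.random.uniform(-0.01, 0.01, x0.shape), 0.0, 1.0)
    print(in_lp_ball(x, x0, eps=8 / 255))                 # True for this budget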

Advantages:

Widespread adoption: FGSM [Goodfellow et al., 2015], PGD [Madry et al., 2018], C&W attacks [Carlini and Wagner, 2017] all use \(\ell_p\) norms. Most verification benchmarks (VNN-COMP, certified accuracy metrics) use \(\ell_p\).

What \(\ell_p\) Misses

Semantic meaningfulness: \(\ell_p\) perturbations often aren’t semantically meaningful. Small \(\ell_2\) changes might move “airplane” to “nonsense image” rather than to another valid airplane.

Perceptual realism: Humans don’t perceive \(\ell_p\) distance. Two images with small \(\ell_2\) distance might look very different; two images with large \(\ell_2\) distance might look similar.

Physical realizability: \(\ell_p\) perturbations assume arbitrary pixel control. Physical attackers face constraints (lighting, viewpoint, printability).

Distribution shift: Real-world deployment faces natural distribution shift (different demographics, weather, hardware) not captured by \(\ell_p\) perturbations.

The \(\ell_p\) Paradox

\(\ell_p\) balls are:

  • Mathematically convenient: Easy to formulate, verify, optimize

  • Practically limited: Don’t capture many real-world threats

Result: Standard verification provides guarantees against threats that may not occur, while missing threats that do occur.

Semantic Perturbations

Semantic perturbations modify meaningful attributes (pose, lighting, color) rather than arbitrary pixels.

What Are Semantic Perturbations?

Definition: Perturbations that change semantic properties while maintaining object identity.

Examples:

  • Rotation: Rotate object by angle \(\theta\)

  • Translation: Shift object position

  • Scaling: Resize object

  • Illumination: Change brightness, contrast, color temperature

  • 3D pose: Change viewpoint in 3D space

  • Weather: Add rain, snow, fog effects

Key property: Semantic perturbations produce valid, realistic images. Unlike \(\ell_p\) noise, they correspond to real-world variations.

\[x' = T_{\text{semantic}}(x, \theta)\]

where \(T\) is a semantic transformation parameterized by \(\theta\) (e.g., rotation angle, brightness factor).
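
As a concrete illustration, here is a sketch of two such transformations in NumPy/SciPy (the image shape and parameter values are illustrative assumptions):

    import numpy as np
    from scipy.ndimage import rotate

    def rotate_image(x, angle_deg):
        # T_semantic(x, theta) with theta = rotation angle; keep the output size.
        return rotate(x, angle_deg, axes=(0, 1), reshape=False, mode="nearest")

    def adjust_brightness(x, factor):
        # T_semantic(x, theta) with theta = brightness factor; stay in [0, 1].
        return np.clip(x * factor, 0.0, 1.0)

    x = np.random.rand(32, 32, 3)             # placeholder H x W x C image
    x_rotated = rotate_image(x, 15.0)
    x_brighter = adjust_brightness(x, 1.2)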

Verification Approaches

Data augmentation bounds [Cohen et al., 2019]:

  • Define distribution over transformations \(T \sim \mathcal{D}\)

  • Use randomized smoothing with transformation distribution

  • Certified radius in transformation space (e.g., rotation up to \(\pm 30°\))

Geometric verification:

  • For specific transformations (rotation, translation), use geometric reasoning

  • Example: Verify rotation invariance by checking all rotations in a range

Abstraction refinement:

  • Abstract semantic transformations into over-approximations

  • Use standard verification [Singh et al., 2019] on abstracted space

  • Refine if bounds too loose

Challenge: Semantic transformations are non-linear (rotation involves trigonometry, scaling changes image structure). Standard linear verification [Weng et al., 2018, Zhang et al., 2018] doesn’t directly apply.

Current state: Verification for semantic perturbations is less mature than for \(\ell_p\). Randomized smoothing [Cohen et al., 2019] with transformation distributions is currently the most practical approach.
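
As a sketch of that approach, the smoothed classifier below takes a majority vote over randomly rotated copies of the input. `classify` is a placeholder for any image classifier (an assumption), and a real certificate would additionally require a statistical confidence bound on the vote, as in Cohen et al. [2019]:

    import numpy as np
    from scipy.ndimage import rotate

    def smoothed_predict(classify, x, sigma_deg=10.0, n_samples=1000, n_classes=10):
        # Majority vote over rotation angles drawn from N(0, sigma_deg^2).
        counts = np.zeros(n_classes, dtype=int)
        for _ in range(n_samples):
            angle = np.random.normal(0.0, sigma_deg)
            x_t = rotate(x, angle, axes=(0, 1), reshape=False, mode="nearest")
            counts[classify(x_t)] += 1
        return int(np.argmax(counts)), counts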

Physical Adversarial Examples

Physical adversarial examples exist in the physical world—patches, stickers, objects that fool cameras/sensors.

Physical Realizability Constraints

Printability: Perturbations must be printable with available printers (discrete color gamut, limited resolution).

Viewpoint robustness: Must fool the network from multiple viewing angles, distances.

Lighting robustness: Must work under various lighting conditions.

Environmental factors: Shadows, reflections, weather affect appearance.

Physical vs Digital Attacks

Digital (\(\ell_p\)):

  • Arbitrary pixel control

  • Perfect precision

  • Works in digital space only

Physical:

  • Limited to printable/fabricable perturbations

  • Subject to environmental variations

  • Works in real world (cameras, sensors)

Physical attacks are constrained but more dangerous: They work against deployed systems in the real world.

Adversarial Patches

Adversarial patch [Brown et al., 2017]: A localized region (sticker, poster) designed to fool a network when placed in the scene.

Formulation: Optimize a patch \(p\) so that, when applied to an image, it drives the prediction toward an attacker-chosen target class, in expectation over images, patch placements, and environmental conditions (a minimal optimization sketch follows the definitions below):

\[\min_p \mathbb{E}_{x, t, l} \left[ \mathcal{L}(f_\theta(A(x, p, t, l)), y_{\text{target}}) \right]\]

where:

  • \(p\): Patch content

  • \(x\): Base image

  • \(t\): Patch location/transformation

  • \(l\): Lighting/environmental conditions

  • \(A\): Applies patch to image
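
Here is the promised sketch of one optimization step in PyTorch. `model`, the batch \((x, y_{\text{target}})\), the patch size, and the learning rate are assumptions, and a realistic attack would also randomize patch scale, rotation, and lighting (the \(l\) term):

    import torch
    import torch.nn.functional as F

    def apply_patch(x, patch, top, left):
        # A(x, p, t, .): paste a C x k x k patch at location t = (top, left).
        x = x.clone()
        k = patch.shape[-1]
        x[:, :, top:top + k, left:left + k] = patch
        return x

    def patch_step(model, patch, x, y_target, lr=0.01):
        # One gradient step that *decreases* the loss on the target class.
        patch = patch.clone().detach().requires_grad_(True)
        k = patch.shape[-1]
        top = torch.randint(0, x.shape[-2] - k + 1, (1,)).item()
        left = torch.randint(0, x.shape[-1] - k + 1, (1,)).item()
        loss = F.cross_entropy(model(apply_patch(x, patch, top, left)), y_target)
        loss.backward()
        with torch.no_grad():
            patch = (patch - lr * patch.grad).clamp(0.0, 1.0)
        return patch.detach()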

Verification: Proving robustness to patches is hard—must show no patch of size \(k \times k\) anywhere in the image can fool the network.

Approaches:

  • Certified patch defense: For small patches (down to a single pixel), certify using masking and aggregation

  • Randomized ablation: Randomly mask portions of input, aggregate predictions

  • Segmentation-based: Use object segmentation to identify and ignore suspicious regions

Current state: Verification for general patch attacks is largely open. Most work focuses on specific patch sizes or defensive strategies rather than complete verification.

Physical World Attacks

Stop sign attacks: Stickers on stop signs causing misclassification [Eykholt et al., 2018].

3D-printed objects: Adversarial 3D objects (turtles, cars) designed to fool classifiers from multiple angles.

Wearable adversarial examples: Clothing, glasses designed to fool facial recognition.

Verification challenge: Must account for:

  • Camera optics (distortion, blur)

  • Lighting variations

  • Viewing angle changes

  • Environmental noise

No standard verification framework yet. Research explores:

  • Simulation-based testing (render scenes under various conditions)

  • Differentiable rendering for optimization

  • Statistical guarantees via sampling (probabilistic certification; a minimal sketch follows this list)
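
The sampling route can be sketched as follows: estimate the failure probability under randomly rendered conditions and attach a Hoeffding-style confidence bound. `render_variant` (which draws lighting/viewpoint conditions and returns an image) and `classify` are placeholders, not real APIs:

    import math

    def empirical_failure_bound(render_variant, classify, true_label, n=1000, delta=0.01):
        # Monte Carlo estimate of the misclassification rate over rendered variants.
        failures = sum(classify(render_variant()) != true_label for _ in range(n))
        p_hat = failures / n
        # One-sided Hoeffding bound: with probability >= 1 - delta,
        # the true failure rate lies below the returned bound.
        return p_hat, p_hat + math.sqrt(math.log(1.0 / delta) / (2.0 * n))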

Sparse Perturbations (\(\ell_0\) Attacks)

\(\ell_0\) attacks modify a limited number of pixels, potentially by large amounts.

\(\ell_0\) Threat Model

Definition: Adversarial example with at most \(k\) pixels changed:

\[\mathcal{B}_0^k(x_0) = \{ x : \|x - x_0\|_0 \leq k \}\]

where \(\|x - x_0\|_0\) counts non-zero entries (number of changed pixels).

Difference from \(\ell_p\):

  • \(\ell_p\): Small changes to many pixels

  • \(\ell_0\): Large changes to few pixels

Physical intuition: Corresponds to localized damage (dead pixels, sensor noise, occlusions).

Verification for \(\ell_0\)

Combinatorial challenge: With \(d\)-dimensional input and budget \(k\), there are \(\binom{d}{k}\) possible pixel sets to perturb. Even for a small image (\(d \approx 3000\) input dimensions, roughly CIFAR-10 size), \(k=10\) already gives on the order of \(10^{28}\) combinations.

Complete verification: Enumerate all \(\binom{d}{k}\) combinations, verify each. Intractable for realistic \(d, k\).
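
The size of that enumeration can be checked directly (a CIFAR-10-sized input is assumed for illustration):

    import math

    d, k = 3072, 10                  # 32 x 32 x 3 input, 10-coordinate budget
    print(math.comb(d, k))           # about 2e28 candidate coordinate sets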

Incomplete approaches:

  • Greedy search: Iteratively select most influential pixels to perturb

  • Heuristic bounds: Use pixel importance scores to focus verification

  • Randomized certification: Sample pixel subsets, aggregate results probabilistically

Specialized defenses:

  • Median filtering: Replacing pixels with neighborhood medians mitigates sparse attacks (both defenses are sketched after this list)

  • Randomized masking: Randomly mask pixels during inference, reducing attack success
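
A sketch of those two defenses in NumPy/SciPy; `classify` is a placeholder classifier (an assumption), and neither sketch by itself constitutes a certificate:

    import numpy as np
    from scipy.ndimage import median_filter

    def median_preprocess(x, size=3):
        # Replace each pixel by the median of its size x size neighborhood
        # (per channel), removing isolated, sparsely perturbed pixels.
        return median_filter(x, size=(size, size, 1))

    def masked_vote(classify, x, keep_prob=0.5, n_samples=100, n_classes=10):
        # Majority vote over predictions on randomly masked copies of x (H x W x C).
        counts = np.zeros(n_classes, dtype=int)
        for _ in range(n_samples):
            mask = np.random.rand(*x.shape[:2], 1) < keep_prob
            counts[classify(x * mask)] += 1
        return int(np.argmax(counts))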

Current state: Limited verification methods exist for \(\ell_0\). Most work focuses on attack generation and specific defenses rather than general certification.

Distributional Robustness

Distributional robustness ensures performance across related but distinct data distributions.

Distribution Shift Problem

Training distribution \(\mathcal{D}_{\text{train}}\) vs test distribution \(\mathcal{D}_{\text{test}}\):

  • Different demographics, ages, skin tones (facial recognition)

  • Different weather, lighting (autonomous driving)

  • Different hardware, cameras (medical imaging)

Standard ML assumption: \(\mathcal{D}_{\text{train}} = \mathcal{D}_{\text{test}}\). In practice, this often fails.

Robustness goal: Ensure good performance on \(\mathcal{D}_{\text{test}}\) even when different from \(\mathcal{D}_{\text{train}}\).

Formulations

Worst-case distribution in neighborhood of training distribution:

\[\max_{\mathcal{D} : d(\mathcal{D}, \mathcal{D}_{\text{train}}) \leq \epsilon} \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]\]

where \(d\) is a distribution distance (e.g., Wasserstein, KL divergence).

Distributionally robust optimization (DRO):

\[\min_\theta \max_{\mathcal{D} \in \mathcal{U}(\mathcal{D}_{\text{train}})} \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]\]

where \(\mathcal{U}\) is an uncertainty set around training distribution.
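
For intuition, with a KL-divergence uncertainty set around the empirical distribution, the inner maximization has a closed form that reweights samples in proportion to \(\exp(\ell/\eta)\), where the temperature \(\eta\) is tied to the set's radius. A minimal sketch of that inner step (the loss values and \(\eta\) are illustrative assumptions):

    import numpy as np

    def dro_weights(losses, eta=1.0):
        # Worst-case (exponentially tilted) weights for a KL uncertainty set.
        w = np.exp((losses - losses.max()) / eta)    # subtract max for stability
        return w / w.sum()

    def dro_objective(losses, eta=1.0):
        # Worst-case expected loss over the reweighted empirical distribution.
        return float(np.dot(dro_weights(losses, eta), losses))

    losses = np.array([0.2, 0.1, 2.5, 0.3])          # per-example losses
    print(dro_objective(losses, eta=0.5))            # upweights the hard example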

Verification and Certification

Challenge: Verifying distributional robustness requires reasoning about sets of distributions, not individual inputs.

Approaches:

  • DRO training: Train networks to be distributionally robust. Verification comes from training objective guarantees.

  • Statistical testing: Sample from candidate test distributions, check performance empirically.

  • Causal reasoning: Use causal models to bound performance under interventions/shifts.

Connection to \(\ell_p\): Individual \(\ell_p\) robustness can be viewed as distributional robustness under uniform perturbations. Distributional robustness generalizes this to structured shifts.

Current state: Distributional robustness is an active research area. Practical verification methods are limited; most work focuses on training for distributional robustness rather than verifying it post hoc.

Domain-Specific Threat Models

Different application domains have unique threat models:

Autonomous Driving

Threats:

  • Weather conditions (rain, fog, snow)

  • Lighting changes (day/night, shadows)

  • Sensor noise (camera, LiDAR)

  • Adversarial objects (misleading signs, fake lane markings)

Verification needs:

  • Safety properties (don’t hit pedestrians, stay in lane)

  • Continuous safety (properties hold over time, not just single frames)

  • Multi-modal robustness (camera + LiDAR consistency)

Approaches: Simulation-based testing, scenario coverage, formal methods for control systems.

Medical Imaging

Threats:

  • Scanner variations (different manufacturers, models)

  • Imaging artifacts (noise, contrast variations)

  • Patient variations (age, anatomy)

  • Adversarial manipulations (malicious image modification)

Verification needs:

  • Diagnostic accuracy across populations

  • Robustness to imaging protocols

  • Detection of manipulated images

Approaches: Cross-scanner validation, adversarial training for artifacts, uncertainty quantification.

Facial Recognition

Threats:

  • Demographic variations (age, race, gender)

  • Expression changes (smiling, frowning)

  • Accessories (glasses, hats, makeup)

  • Adversarial glasses/makeup

Verification needs:

  • Fairness across demographics

  • Robustness to accessories

  • Detection of adversarial examples

Approaches: Fairness-constrained training, adversarial robustness for accessories, liveness detection.

Composing Threat Models

Real-world robustness requires defending against multiple threat models simultaneously.

Multi-Perturbation Robustness

Union of threats: Robust to \(\ell_p\) AND semantic AND physical:

\[\forall x' \in \mathcal{T}_1 \cup \mathcal{T}_2 \cup \cdots \cup \mathcal{T}_k, \quad \phi(f_\theta(x')) = \text{true}\]

Challenge: Verifying the union is harder than verifying each threat model individually.

Approaches:

  • Modular verification: Verify each threat separately, compose results (sketched after this list)

  • Joint training: Train for multiple threats simultaneously

  • Hierarchical robustness: Ensure robustness to most general threat (implies robustness to specific ones)
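
The modular route is simple to express: an input is certified against the union only if every per-threat verifier succeeds. A minimal sketch, where each verifier is a hypothetical callable that returns True only when it proves robustness for its own threat model:

    def certified_against_union(verifiers, f, x):
        # Robust to T1 ∪ ... ∪ Tk  <=>  robust to each Ti individually.
        return all(verify(f, x) for verify in verifiers)

    # Hypothetical usage, composing three per-threat verifiers:
    # ok = certified_against_union([verify_linf, verify_rotation, verify_patch], f, x)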

Tradeoffs: Robustness to multiple threats may come with larger accuracy drop than single-threat robustness.

Verification Challenges for Alternative Threats

Why Alternative Threats Are Harder

Non-convexity: Semantic transformations (rotation, scaling) are non-linear. Standard linear verification [Weng et al., 2018, Zhang et al., 2018] doesn’t apply.

High dimensionality: Physical attacks involve many parameters (lighting, viewpoint, patch content). Verification must reason about this high-dimensional space.

Lack of structure: \(\ell_p\) balls have nice mathematical structure (convex, symmetric). Alternative threats often lack this, making verification harder.

Limited tooling: Most verification tools [Katz et al., 2019, Xu et al., 2020] assume \(\ell_p\). Adapting them to alternative threats requires significant modification.

Current State and Future Directions

Semantic perturbations: Randomized smoothing [Cohen et al., 2019] with transformation distributions is the most mature option. Geometric verification for specific transformations (e.g., rotation) is emerging.

Physical adversarial examples: Certified patch defenses exist for small patches. General physical verification remains a largely open problem.

Sparse perturbations: Limited verification methods. Mostly heuristic defenses and empirical evaluation.

Distributional robustness: DRO training provides some guarantees. Post-hoc verification largely unsolved.

Multi-threat robustness: Very early stage. Composing verification across threat models is an open challenge.

Practical Recommendations

For Researchers

Define threat model carefully: What perturbations can adversaries actually perform? \(\ell_p\) may not capture real threats.

Develop domain-specific verification: Generic \(\ell_p\) verification isn’t enough. Build verification methods for your domain’s threats.

Combine approaches: Use \(\ell_p\) verification as baseline, augment with domain-specific testing/verification.

For Practitioners

Understand deployment threats: What attacks will your system face in practice? Physical patches? Distribution shift?

Multi-layered defense:

  1. \(\ell_p\) adversarial training (baseline robustness)

  2. Data augmentation for semantic/physical robustness

  3. Runtime monitoring for anomalies

  4. Regular re-training for distribution shift

Don’t rely solely on \(\ell_p\) certification: Use it as one component, not a complete solution.

For Tool Builders

Extend beyond \(\ell_p\): Add support for semantic perturbations, sparse attacks, custom threat models.

Modular design: Allow users to define custom threat models, plug in verification methods.

Simulation support: Integrate with renderers, physics engines for physical adversarial example verification.

Final Thoughts

\(\ell_p\) threat models dominate neural network verification [Gowal et al., 2019, Weng et al., 2018, Zhang et al., 2018, Katz et al., 2019] for good reason: mathematical convenience, well-studied theory, and practical tools. But real-world threats extend far beyond \(\ell_p\).

Semantic perturbations, physical adversarial examples, sparse attacks, and distributional robustness each require different verification approaches. Some, like randomized smoothing for semantic perturbations [Cohen et al., 2019], are maturing. Others, like general physical verification, remain largely open.

The future of neural network verification lies in domain-aware methods: understanding the specific threats each application faces and building verification tailored to those threats. \(\ell_p\) verification provides a foundation, but comprehensive robustness demands going beyond it.

Understanding alternative threat models clarifies verification’s scope and limitations. \(\ell_p\) certification is valuable but incomplete. Building truly robust systems requires addressing the full spectrum of threats, each with its own verification challenges and opportunities.

Further Reading

This guide provides comprehensive coverage of threat models beyond \(\ell_p\) norms. For readers interested in diving deeper, we recommend the following resources organized by topic:

Semantic Perturbations:

Randomized smoothing [Cohen et al., 2019] extends naturally to semantic transformations by using transformation distributions instead of additive noise. This provides certified robustness to rotations, translations, and other meaningful perturbations with theoretical guarantees.

Physical Adversarial Examples:

Adversarial patch attacks [Brown et al., 2017] demonstrated that localized, printable perturbations can fool networks in the physical world. Real-world attacks on stop signs [Eykholt et al., 2018] showed practical deployment of these threats. Certified defenses remain limited to specific scenarios like small patch sizes.

Standard \(\ell_p\) Attacks for Context:

Understanding classical \(\ell_p\) attacks—FGSM [Goodfellow et al., 2015], PGD [Madry et al., 2018], C&W [Carlini and Wagner, 2017]—provides essential context for alternative threats. These represent the baseline against which alternative threat models are compared.

Verification Methods for \(\ell_p\):

Standard verification methods—CROWN [Weng et al., 2018, Zhang et al., 2018], DeepPoly [Singh et al., 2019], IBP [Gowal et al., 2019], Marabou [Katz et al., 2019]—primarily target \(\ell_p\) threats. Understanding their design helps clarify what modifications are needed for alternative threats.

Certified Training:

Training for robustness—whether adversarial [Madry et al., 2018], certified [Gowal et al., 2019, Zhang et al., 2020], or randomized smoothing-based [Cohen et al., 2019]—typically focuses on \(\ell_p\). Extending certified training to alternative threats is active research.

GPU Acceleration:

GPU-accelerated verification [Xu et al., 2020] has primarily benefited \(\ell_p\) verification. Adapting these techniques to alternative threat models could enable practical verification at scale for non-\(\ell_p\) threats.

Theoretical Barriers:

The NP-completeness of verification [Katz et al., 2017, Weng et al., 2018] applies regardless of threat model. Alternative threats don’t escape computational barriers but may present different practical challenges.

Related Topics:

For understanding \(\ell_p\) threat models that alternative threats extend, see Threat Models in Neural Network Verification. For verification methods primarily designed for \(\ell_p\), see Bound Propagation Approaches, Marabou and Reluplex: Extended Simplex for Verification, and Branch-and-Bound Verification. For certified defenses including randomized smoothing, see Certified Defenses and Randomized Smoothing. For practical robustness testing across threat models, see Robustness Testing Guide.

Next Guide

Continue to Verifying Diverse Architectures to explore unique verification challenges for Transformers, RNNs, Graph Neural Networks, and other modern architectures.

[1]

TB Brown, D Mané, A Roy, M Abadi, and J Gilmer. Adversarial patch. arXiv preprint arXiv:1712.09665, 2017.

[2]

Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39–57. IEEE, 2017.

[3]

Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning. 2019.

[4]

Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1625–1634. 2018.

[5]

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations. 2015.

[6]

Sven Gowal, Krishnamurthy Dj Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. Scalable verified training for provably robust image classification. In Proceedings of the IEEE International Conference on Computer Vision, 4842–4851. 2019.

[7]

Guy Katz, Clark Barrett, David L Dill, Kyle Julian, and Mykel J Kochenderfer. Reluplex: an efficient smt solver for verifying deep neural networks. In International Conference on Computer Aided Verification, 97–117. Springer, 2017.

[8]

Guy Katz, Derek A Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, and others. The marabou framework for verification and analysis of deep neural networks. In International Conference on Computer Aided Verification, 443–452. Springer, 2019.

[9]

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. 2018.

[10]

Gagandeep Singh, Rupanshu Ganvir, Markus Püschel, and Martin Vechev. Beyond the single neuron convex barrier for neural network certification. In Advances in Neural Information Processing Systems, 15072–15083. 2019.

[11]

Lily Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for relu networks. In International Conference on Machine Learning, 5276–5285. 2018.

[12]

Kaidi Xu, Zhouxing Shi, Huan Zhang, Yihan Wang, Kai-Wei Chang, Minlie Huang, Bhavya Kailkhura, Xue Lin, and Cho-Jui Hsieh. Automatic perturbation analysis for scalable certified robustness and beyond. Advances in Neural Information Processing Systems, 2020.

[13]

Huan Zhang, Hongge Chen, Chaowei Xiao, Sven Gowal, Robert Stanforth, Bo Li, Duane Boning, and Cho-Jui Hsieh. Towards stable and efficient training of verifiably robust neural networks. In International Conference on Learning Representations. 2020.

[14]

Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In Advances in neural information processing systems, 4939–4948. 2018.