Phase 4: Advanced Topics Guide 5

Beyond Lp: Alternative Threat Models

Extensions to threat models beyond Lp norms, including semantic adversaries, patch adversaries, sparse perturbations, and distributional attacks

Beyond \ell_p: Alternative Threat Models

Most neural network verification focuses on \ell_p norm-bounded perturbations—small pixel changes invisible to humans. But real-world adversaries aren’t constrained to \ell_p balls. They can modify semantics (change object pose, lighting), add physical patches, or exploit distribution shift.

These alternative threat models require different verification approaches. Semantic perturbations need invariance to meaningful transformations. Physical adversarial examples demand robustness under real-world constraints. Distributional robustness requires guarantees across data distributions. Each threat model brings unique challenges and opportunities for verification.

This guide explores threat models beyond \ell_p, how they differ from traditional adversarial perturbations, and what verification methods exist for each.

Limitations of \ell_p Threat Models

Why \ell_p Norms Dominate

Standard formulation: Adversarial robustness typically considers \ell_p balls:

\mathcal{B}_p^\epsilon(x_0) = \{ x : \|x - x_0\|_p \leq \epsilon \}

Advantages:

  • Simple mathematical formulation
  • Well-studied optimization theory
  • Efficient verification methods
  • Natural interpretability (maximum per-pixel change for \ell_\infty)

Widespread adoption: FGSM, PGD, C&W attacks all use \ell_p norms. Most verification benchmarks (VNN-COMP, certified accuracy metrics) use \ell_p.
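For context, here is a minimal sketch of a projected gradient descent (PGD) attack under the \ell_\infty threat model: each step ascends the loss and then clips back into \mathcal{B}_\infty^\epsilon(x_0). The PyTorch classifier `model` (returning logits) and the hyperparameter values are assumptions for illustration, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x0, y, eps=8/255, alpha=2/255, steps=10):
    """Gradient-ascent steps projected back into the l_inf ball around x0."""
    x = x0.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x + alpha * grad.sign()               # ascend the loss
            x = torch.clamp(x, x0 - eps, x0 + eps)    # project onto B_inf^eps(x0)
            x = torch.clamp(x, 0.0, 1.0)              # stay a valid image
    return x.detach()
```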

What \ell_p Misses

Semantic meaningfulness: \ell_p perturbations often aren’t semantically meaningful. Small \ell_2 changes might move “airplane” to “nonsense image” rather than to another valid airplane.

Perceptual realism: Humans don’t perceive \ell_p distance. Two images with small \ell_2 distance might look very different; two images with large \ell_2 distance might look similar.

Physical realizability: \ell_p perturbations assume arbitrary pixel control. Physical attackers face constraints (lighting, viewpoint, printability).

Distribution shift: Real-world deployment faces natural distribution shift (different demographics, weather, hardware) not captured by \ell_p perturbations.

The \ell_p Paradox

\ell_p balls are:

  • Mathematically convenient: Easy to formulate, verify, optimize
  • Practically limited: Don’t capture many real-world threats

Result: Standard verification provides guarantees against threats that may not occur, while missing threats that do occur.

Semantic Perturbations

Semantic perturbations modify meaningful attributes (pose, lighting, color) rather than arbitrary pixels.

What Are Semantic Perturbations?

Definition: Perturbations that change semantic properties while maintaining object identity.

Examples:

  • Rotation: Rotate object by angle \theta
  • Translation: Shift object position
  • Scaling: Resize object
  • Illumination: Change brightness, contrast, color temperature
  • 3D pose: Change viewpoint in 3D space
  • Weather: Add rain, snow, fog effects

Key property: Semantic perturbations produce valid, realistic images. Unlike \ell_p noise, they correspond to real-world variations.

x' = T_{\text{semantic}}(x, \theta)

where T is a semantic transformation parameterized by \theta (e.g., rotation angle, brightness factor).
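As a toy illustration, here is one possible T_semantic with \theta = (rotation angle in degrees, brightness factor). The parameterization is an assumption made for the example, not a standard API.

```python
import numpy as np
from scipy.ndimage import rotate

def t_semantic(x, theta):
    """Toy semantic transformation: rotate by theta[0] degrees, scale brightness by theta[1].
    `x` is an H x W x C image with values in [0, 1]."""
    angle_deg, brightness = theta
    x_rot = rotate(x, angle_deg, axes=(0, 1), reshape=False, order=1, mode="nearest")
    return np.clip(brightness * x_rot, 0.0, 1.0)

# Example: a 5-degree rotation with a 10% brightness increase
# x_prime = t_semantic(x, theta=(5.0, 1.1))
```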

Verification Approaches

Data augmentation bounds:

  • Define distribution over transformations T \sim \mathcal{D}
  • Use randomized smoothing with transformation distribution
  • Certified radius in transformation space (e.g., rotation up to \pm 30°)

Geometric verification:

  • For specific transformations (rotation, translation), use geometric reasoning
  • Example: Verify rotation invariance by checking all rotations in a range

Abstraction refinement:

  • Abstract semantic transformations into over-approximations
  • Use standard verification on abstracted space
  • Refine if bounds too loose

Challenge: Semantic transformations are non-linear (rotation involves trigonometry, scaling changes image structure). Standard linear verification doesn’t directly apply.

Current state: Verification for semantic perturbations is less mature than for \ell_p. Randomized smoothing with transformation distributions is the most practical approach.
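A minimal Monte Carlo sketch of that approach: sample \theta \sim \mathcal{D}, classify each transformed copy, and take a majority vote. A real certificate would additionally attach a statistical confidence bound to the vote counts; `classify` and `sample_transform` are assumed user-supplied functions.

```python
from collections import Counter

def smoothed_predict(classify, x, sample_transform, n_samples=1000):
    """Monte Carlo estimate of the smoothed classifier
    g(x) = argmax_c P_{theta ~ D}[ classify(T(x, theta)) = c ].
    `classify` maps an image to a label; `sample_transform` draws theta ~ D
    and returns T(x, theta)."""
    votes = Counter(classify(sample_transform(x)) for _ in range(n_samples))
    (top_label, top_count), = votes.most_common(1)
    return top_label, top_count / n_samples

# Example with a rotation distribution (reusing the toy t_semantic sketched above):
# import numpy as np
# smoothed_predict(classify, x,
#                  lambda img: t_semantic(img, (np.random.uniform(-30, 30), 1.0)))
```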

Physical Adversarial Examples

Physical adversarial examples exist in the physical world—patches, stickers, objects that fool cameras/sensors.

Physical Realizability Constraints

Printability: Perturbations must be printable with available printers (discrete color gamut, limited resolution).

Viewpoint robustness: Must fool the network from multiple viewing angles, distances.

Lighting robustness: Must work under various lighting conditions.

Environmental factors: Shadows, reflections, weather affect appearance.

Physical vs Digital Attacks

Digital (\ell_p):

  • Arbitrary pixel control
  • Perfect precision
  • Works in digital space only

Physical:

  • Limited to printable/fabricable perturbations
  • Subject to environmental variations
  • Works in real world (cameras, sensors)

Physical attacks are constrained but more dangerous: They work against deployed systems in the real world.

Adversarial Patches

Adversarial patch: A localized region (sticker, poster) designed to fool a network when placed in the scene.

Formulation: Optimize a patch p to cause misclassification when applied to an image:

\max_p \; \mathbb{E}_{x, t, l} \left[ \mathcal{L}(f_\theta(A(x, p, t, l)), y_{\text{target}}) \right]

where:

  • p: Patch content
  • x: Base image
  • t: Patch location/transformation
  • l: Lighting/environmental conditions
  • A: Applies patch to image
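As a deliberately simplified illustration, here is a sketch of the patch application operator A: it pastes the patch at a given location and scales its brightness. Real attacks additionally model perspective warping, printability, and sensor effects; all names and parameters here are illustrative.

```python
import numpy as np

def apply_patch(x, p, t, l=1.0):
    """Toy version of A(x, p, t, l): paste patch `p` into image `x` at location
    `t = (row, col)` and scale its brightness by `l`.
    Assumes the patch fits entirely inside the image at that location."""
    out = x.copy()
    r, c = t
    h, w = p.shape[:2]
    out[r:r + h, c:c + w] = np.clip(l * p, 0.0, 1.0)
    return out
```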

Verification: Proving robustness to patches is hard—must show no patch of size k \times k anywhere in the image can fool the network.

Approaches:

  • Certified patch defense: For small patches (single pixel), certify using masking and aggregation
  • Randomized ablation: Randomly mask portions of input, aggregate predictions
  • Segmentation-based: Use object segmentation to identify and ignore suspicious regions

Current state: Verification for general patch attacks is largely open. Most work focuses on specific patch sizes or defensive strategies rather than complete verification.

Physical World Attacks

Stop sign attacks: Stickers on stop signs causing misclassification.

3D-printed objects: Adversarial 3D objects (turtles, cars) designed to fool classifiers from multiple angles.

Wearable adversarial examples: Clothing, glasses designed to fool facial recognition.

Verification challenge: Must account for:

  • Camera optics (distortion, blur)
  • Lighting variations
  • Viewing angle changes
  • Environmental noise

No standard verification framework yet. Research explores:

  • Simulation-based testing (render scenes under various conditions)
  • Differentiable rendering for optimization
  • Statistical guarantees via sampling (probabilistic certification)

Sparse Perturbations (\ell_0 Attacks)

\ell_0 attacks modify a limited number of pixels, potentially by large amounts.

\ell_0 Threat Model

Definition: Adversarial example with at most k pixels changed:

\mathcal{B}_0^k(x_0) = \{ x : \|x - x_0\|_0 \leq k \}

where \|x - x_0\|_0 counts non-zero entries (number of changed pixels).

Difference from \ell_p:

  • \ell_p: Small changes to many pixels
  • \ell_0: Large changes to few pixels

Physical intuition: Corresponds to localized damage (dead pixels, sensor noise, occlusions).

Verification for \ell_0

Combinatorial challenge: With a d-dimensional input and budget k, there are \binom{d}{k} possible pixel sets to perturb. For images (d \sim 3000), even k=10 gives astronomically many combinations (on the order of 10^{28}).

Complete verification: Enumerate all \binom{d}{k} combinations, verify each. Intractable for realistic d, k.
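The count is easy to check directly; the sizes below assume a CIFAR-10-shaped input (32 x 32 x 3) purely for illustration.

```python
import math

# Number of candidate pixel sets for an l_0 budget of k in a d-pixel input.
d, k = 32 * 32 * 3, 10      # e.g., a CIFAR-10-sized input with budget k = 10
print(math.comb(d, k))      # roughly 2e28 subsets, far beyond exhaustive enumeration
```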

Incomplete approaches:

  • Greedy search: Iteratively select most influential pixels to perturb
  • Heuristic bounds: Use pixel importance scores to focus verification
  • Randomized certification: Sample pixel subsets, aggregate results probabilistically

Specialized defenses:

  • Median filtering: Replacing pixels with neighborhood medians mitigates sparse attacks
  • Randomized masking: Randomly mask pixels during inference, reducing attack success
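A sketch of the randomized masking idea, in the spirit of randomized-ablation defenses: classify many randomly masked copies of the input and aggregate by majority vote. An actual certificate would also bound how many votes an adversary controlling k pixels could flip; `classify` is an assumed user-supplied function.

```python
import numpy as np
from collections import Counter

def ablated_predict(classify, x, keep_frac=0.25, n_samples=100, rng=None):
    """Randomized-ablation-style inference: classify copies of `x` in which only a
    random fraction of pixels is kept (the rest zeroed), then majority-vote.
    A sparse attacker controlling few pixels rarely survives the random masking."""
    rng = rng or np.random.default_rng(0)
    h, w = x.shape[:2]
    votes = Counter()
    for _ in range(n_samples):
        mask = rng.random((h, w)) < keep_frac        # keep each pixel w.p. keep_frac
        votes[classify(x * mask[..., None])] += 1    # broadcast mask over channels
    (label, count), = votes.most_common(1)
    return label, count / n_samples
```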

Current state: Limited verification methods exist for \ell_0. Most work focuses on attack generation and specific defenses rather than general certification.

Distributional Robustness

Distributional robustness ensures performance across related but distinct data distributions.

Distribution Shift Problem

Training distribution \mathcal{D}_{\text{train}} vs test distribution \mathcal{D}_{\text{test}}:

  • Different demographics, ages, skin tones (facial recognition)
  • Different weather, lighting (autonomous driving)
  • Different hardware, cameras (medical imaging)

Standard ML assumption: \mathcal{D}_{\text{train}} = \mathcal{D}_{\text{test}}. In practice, this often fails.

Robustness goal: Ensure good performance on \mathcal{D}_{\text{test}} even when different from \mathcal{D}_{\text{train}}.

Formulations

Worst-case distribution in neighborhood of training distribution:

\max_{\mathcal{D} : d(\mathcal{D}, \mathcal{D}_{\text{train}}) \leq \epsilon} \; \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]

where d is a distribution distance (e.g., Wasserstein, KL divergence).

Distributionally robust optimization (DRO):

\min_\theta \max_{\mathcal{D} \in \mathcal{U}(\mathcal{D}_{\text{train}})} \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]

where \mathcal{U} is an uncertainty set around the training distribution.
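One common practical surrogate for the inner maximization is the penalized per-example formulation used in Wasserstein DRO training (in the spirit of Sinha et al.). The sketch below assumes a PyTorch classifier `model` and illustrative hyperparameters; it is not a definitive implementation.

```python
import torch
import torch.nn.functional as F

def wdro_inner_max(model, x, y, gamma=10.0, lr=0.05, steps=15):
    """Sketch of the penalized inner maximization for Wasserstein DRO:
    find z maximizing  L(f_theta(z), y) - gamma * ||z - x||^2,
    a surrogate for the worst-case distribution within a transport-cost ball."""
    z = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        obj = F.cross_entropy(model(z), y) - gamma * ((z - x) ** 2).sum()
        grad, = torch.autograd.grad(obj, z)
        z = (z + lr * grad).detach().requires_grad_(True)
    return z.detach()

# DRO-style training then minimizes the loss at these surrogate worst-case points:
# loss = F.cross_entropy(model(wdro_inner_max(model, x, y)), y)
```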

Verification and Certification

Challenge: Verifying distributional robustness requires reasoning about sets of distributions, not individual inputs.

Approaches:

  • DRO training: Train networks to be distributionally robust. Verification comes from training objective guarantees.
  • Statistical testing: Sample from candidate test distributions, check performance empirically.
  • Causal reasoning: Use causal models to bound performance under interventions/shifts.

Connection to \ell_p: Individual \ell_p robustness can be viewed as distributional robustness under uniform perturbations. Distributional robustness generalizes this to structured shifts.

Current state: Distributional robustness is an active research area. Practical verification methods are limited; most work focuses on training for distributional robustness rather than verifying it post-hoc.

Domain-Specific Threat Models

Different application domains have unique threat models:

Autonomous Driving

Threats:

  • Weather conditions (rain, fog, snow)
  • Lighting changes (day/night, shadows)
  • Sensor noise (camera, LiDAR)
  • Adversarial objects (misleading signs, fake lane markings)

Verification needs:

  • Safety properties (don’t hit pedestrians, stay in lane)
  • Continuous safety (properties hold over time, not just single frames)
  • Multi-modal robustness (camera + LiDAR consistency)

Approaches: Simulation-based testing, scenario coverage, formal methods for control systems.

Medical Imaging

Threats:

  • Scanner variations (different manufacturers, models)
  • Imaging artifacts (noise, contrast variations)
  • Patient variations (age, anatomy)
  • Adversarial manipulations (malicious image modification)

Verification needs:

  • Diagnostic accuracy across populations
  • Robustness to imaging protocols
  • Detection of manipulated images

Approaches: Cross-scanner validation, adversarial training for artifacts, uncertainty quantification.

Facial Recognition

Threats:

  • Demographic variations (age, race, gender)
  • Expression changes (smiling, frowning)
  • Accessories (glasses, hats, makeup)
  • Adversarial glasses/makeup

Verification needs:

  • Fairness across demographics
  • Robustness to accessories
  • Detection of adversarial examples

Approaches: Fairness-constrained training, adversarial robustness for accessories, liveness detection.

Composing Threat Models

Real-world robustness requires defending against multiple threat models simultaneously.

Multi-Perturbation Robustness

Union of threats: Robust to \ell_p AND semantic AND physical:

\forall x' \in \mathcal{T}_1 \cup \mathcal{T}_2 \cup \cdots \cup \mathcal{T}_k, \quad \phi(f_\theta(x')) = \text{true}

Challenge: Verifying union is harder than verifying each individually.

Approaches:

  • Modular verification: Verify each threat separately, compose results (see the sketch after this list)
  • Joint training: Train for multiple threats simultaneously
  • Hierarchical robustness: Ensure robustness to most general threat (implies robustness to specific ones)
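A minimal sketch of the modular idea, assuming each per-threat certifier is sound for its own threat set: the conjunction of sound certificates is then sound for the union. The certifier names in the usage comment (crown_certify, smoothing_certify, patch_certify) are hypothetical placeholders.

```python
def certify_union(x, certifiers):
    """Modular composition: the property holds on the union of threat models
    if every per-threat certifier succeeds on x. Each certifier is assumed sound
    (it returns True only if the property holds on its own threat set), so the
    conjunction is sound for T_1 ∪ ... ∪ T_k. Returns (verified, per-threat results)."""
    results = {name: certify(x) for name, certify in certifiers.items()}
    return all(results.values()), results

# Example with hypothetical certifiers:
# verified, detail = certify_union(x, {
#     "linf":     lambda x: crown_certify(model, x, eps=8 / 255),
#     "rotation": lambda x: smoothing_certify(model, x, max_angle=30.0),
#     "patch":    lambda x: patch_certify(model, x, patch_size=(5, 5)),
# })
```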

Tradeoffs: Robustness to multiple threats may come with a larger accuracy drop than single-threat robustness.

Verification Challenges for Alternative Threats

Why Alternative Threats Are Harder

Non-convexity: Semantic transformations (rotation, scaling) are non-linear. Standard linear verification doesn’t apply.

High dimensionality: Physical attacks involve many parameters (lighting, viewpoint, patch content). Verification must reason about this high-dimensional space.

Lack of structure: \ell_p balls have nice mathematical structure (convex, symmetric). Alternative threats often lack this, making verification harder.

Limited tooling: Most verification tools assume \ell_p. Adapting them to alternative threats requires significant modification.

Current State and Future Directions

Semantic perturbations: Randomized smoothing with transformation distributions is the most mature approach. Geometric verification for specific transformations (rotation) is emerging.

Physical adversarial examples: Certified patch defenses for small patches exist. General physical verification remains largely an open problem.

Sparse perturbations: Limited verification methods. Mostly heuristic defenses and empirical evaluation.

Distributional robustness: DRO training provides some guarantees. Post-hoc verification largely unsolved.

Multi-threat robustness: Very early stage. Composing verification across threat models is an open challenge.

Practical Recommendations

For Researchers

Define threat model carefully: What perturbations can adversaries actually perform? \ell_p may not capture real threats.

Develop domain-specific verification: Generic \ell_p verification isn’t enough. Build verification methods for your domain’s threats.

Combine approaches: Use \ell_p verification as a baseline, augment with domain-specific testing/verification.

For Practitioners

Understand deployment threats: What attacks will your system face in practice? Physical patches? Distribution shift?

Multi-layered defense:

  1. \ell_p adversarial training (baseline robustness)
  2. Data augmentation for semantic/physical robustness
  3. Runtime monitoring for anomalies
  4. Regular re-training for distribution shift

Don’t rely solely on \ell_p certification: Use it as one component, not a complete solution.

For Tool Builders

Extend beyond \ell_p: Add support for semantic perturbations, sparse attacks, custom threat models.

Modular design: Allow users to define custom threat models, plug in verification methods.

Simulation support: Integrate with renderers, physics engines for physical adversarial example verification.

Final Thoughts

\ell_p threat models dominate neural network verification for good reason: mathematical convenience, well-studied theory, and practical tools. But real-world threats extend far beyond \ell_p.

Semantic perturbations, physical adversarial examples, sparse attacks, and distributional robustness each require different verification approaches. Some, like randomized smoothing for semantic perturbations, are maturing. Others, like general physical verification, remain largely open.

The future of neural network verification lies in domain-aware methods: understanding the specific threats each application faces and building verification tailored to those threats. \ell_p verification provides a foundation, but comprehensive robustness demands going beyond it.

Understanding alternative threat models clarifies verification’s scope and limitations. \ell_p certification is valuable but incomplete. Building truly robust systems requires addressing the full spectrum of threats, each with its own verification challenges and opportunities.

Further Reading

This guide provides comprehensive coverage of threat models beyond \ell_p norms. For readers interested in diving deeper, we recommend the following resources organized by topic:

Semantic Perturbations:

Randomized smoothing extends naturally to semantic transformations by using transformation distributions instead of additive noise. This provides certified robustness to rotations, translations, and other meaningful perturbations with theoretical guarantees.

Physical Adversarial Examples:

Adversarial patch attacks demonstrated that localized, printable perturbations can fool networks in the physical world. Real-world attacks on stop signs showed practical deployment of these threats. Certified defenses remain limited to specific scenarios like small patch sizes.

Standard \ell_p Attacks for Context:

Understanding classical \ell_p attacks—FGSM, PGD, C&W—provides essential context for alternative threats. These represent the baseline against which alternative threat models are compared.

Verification Methods for \ell_p:

Standard verification methods—CROWN, DeepPoly, IBP, Marabou—primarily target \ell_p threats. Understanding their design helps clarify what modifications are needed for alternative threats.

Certified Training:

Training for robustness—whether adversarial, certified, or randomized smoothing-based—typically focuses on \ell_p. Extending certified training to alternative threats is active research.

GPU Acceleration:

GPU-accelerated verification has primarily benefited \ell_p verification. Adapting these techniques to alternative threat models could enable practical verification at scale for non-\ell_p threats.

Theoretical Barriers:

The NP-completeness of verification applies regardless of threat model. Alternative threats don’t escape computational barriers but may present different practical challenges.

Related Topics:

For certified defenses including randomized smoothing, see Certified Defenses. For practical robustness testing across threat models, see Robustness Testing Guide.

Next Guide

Continue to Verifying Diverse Architectures to explore unique verification challenges for Transformers, RNNs, Graph Neural Networks, and other modern architectures.