Phase 4: Advanced Topics Guide 5

Beyond Lp: Alternative Threat Models

Extensions to threat models beyond Lp norms, including semantic adversaries, patch adversaries, sparse perturbations, and distributional attacks

Beyond \ell_p: Alternative Threat Models

Most neural network verification focuses on \ell_p norm-bounded perturbations—small pixel changes invisible to humans. But real-world adversaries aren’t constrained to \ell_p balls. They can modify semantics (change object pose, lighting), add physical patches, or exploit distribution shift.

These alternative threat models require different verification approaches. Semantic perturbations need invariance to meaningful transformations. Physical adversarial examples demand robustness under real-world constraints. Distributional robustness requires guarantees across data distributions. Each threat model brings unique challenges and opportunities for verification.

This guide explores threat models beyond \ell_p, how they differ from traditional adversarial perturbations, and what verification methods exist for each.

Limitations of \ell_p Threat Models

Why \ell_p Norms Dominate

Standard formulation: Adversarial robustness typically considers \ell_p balls:

\mathcal{B}_p^\epsilon(x_0) = \{ x : \|x - x_0\|_p \leq \epsilon \}

Advantages:

  • Simple mathematical formulation
  • Well-studied optimization theory
  • Efficient verification methods
  • Natural interpretability (maximum per-pixel change for \ell_\infty)

Widespread adoption: FGSM, PGD, C&W attacks all use \ell_p norms. Most verification benchmarks (VNN-COMP, certified accuracy metrics) use \ell_p.
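For context, here is a minimal sketch of a projected gradient descent (PGD) attack under the \ell_\infty threat model: each step ascends the loss and then clips back into \mathcal{B}_\infty^\epsilon(x_0). The PyTorch classifier `model` (returning logits) and the hyperparameter values are assumptions for illustration, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x0, y, eps=8/255, alpha=2/255, steps=10):
    """Gradient-ascent steps projected back into the l_inf ball around x0."""
    x = x0.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x + alpha * grad.sign()               # ascend the loss
            x = torch.clamp(x, x0 - eps, x0 + eps)    # project onto B_inf^eps(x0)
            x = torch.clamp(x, 0.0, 1.0)              # stay a valid image
    return x.detach()
```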

What \ell_p Misses

Semantic meaningfulness: \ell_p perturbations often aren’t semantically meaningful. Small \ell_2 changes might move “airplane” to “nonsense image” rather than to another valid airplane.

Perceptual realism: Humans don’t perceive \ell_p distance. Two images with small \ell_2 distance might look very different; two images with large \ell_2 distance might look similar.

Physical realizability: \ell_p perturbations assume arbitrary pixel control. Physical attackers face constraints (lighting, viewpoint, printability).

Distribution shift: Real-world deployment faces natural distribution shift (different demographics, weather, hardware) not captured by \ell_p perturbations.

The \ell_p Paradox

\ell_p balls are:

  • Mathematically convenient: Easy to formulate, verify, optimize
  • Practically limited: Don’t capture many real-world threats

Result: Standard verification provides guarantees against threats that may not occur, while missing threats that do occur.

Semantic Perturbations

Semantic perturbations modify meaningful attributes (pose, lighting, color) rather than arbitrary pixels.

What Are Semantic Perturbations?

Definition: Perturbations that change semantic properties while maintaining object identity.

Examples:

  • Rotation: Rotate object by angle \theta
  • Translation: Shift object position
  • Scaling: Resize object
  • Illumination: Change brightness, contrast, color temperature
  • 3D pose: Change viewpoint in 3D space
  • Weather: Add rain, snow, fog effects

Key property: Semantic perturbations produce valid, realistic images. Unlike \ell_p noise, they correspond to real-world variations.

x' = T_{\text{semantic}}(x, \theta)

where T is a semantic transformation parameterized by \theta (e.g., rotation angle, brightness factor).
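As a toy illustration, here is one possible T_semantic with \theta = (rotation angle in degrees, brightness factor). The parameterization is an assumption made for the example, not a standard API.

```python
import numpy as np
from scipy.ndimage import rotate

def t_semantic(x, theta):
    """Toy semantic transformation: rotate by theta[0] degrees, scale brightness by theta[1].
    `x` is an H x W x C image with values in [0, 1]."""
    angle_deg, brightness = theta
    x_rot = rotate(x, angle_deg, axes=(0, 1), reshape=False, order=1, mode="nearest")
    return np.clip(brightness * x_rot, 0.0, 1.0)

# Example: a 5-degree rotation with a 10% brightness increase
# x_prime = t_semantic(x, theta=(5.0, 1.1))
```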

Verification Approaches

Data augmentation bounds:

  • Define distribution over transformations T \sim \mathcal{D}
  • Use randomized smoothing with transformation distribution
  • Certified radius in transformation space (e.g., rotation up to \pm 30°)

Geometric verification:

  • For specific transformations (rotation, translation), use geometric reasoning
  • Example: Verify rotation invariance by checking all rotations in a range

Abstraction refinement:

  • Abstract semantic transformations into over-approximations
  • Use standard verification on abstracted space
  • Refine if bounds too loose

Challenge: Semantic transformations are non-linear (rotation involves trigonometry, scaling changes image structure). Standard linear verification doesn’t directly apply.

Current state: Verification for semantic perturbations is less mature than for \ell_p. Randomized smoothing with transformation distributions is the most practical approach.
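A minimal Monte Carlo sketch of that approach: sample \theta \sim \mathcal{D}, classify each transformed copy, and take a majority vote. A real certificate would additionally attach a statistical confidence bound to the vote counts; `classify` and `sample_transform` are assumed user-supplied functions.

```python
from collections import Counter

def smoothed_predict(classify, x, sample_transform, n_samples=1000):
    """Monte Carlo estimate of the smoothed classifier
    g(x) = argmax_c P_{theta ~ D}[ classify(T(x, theta)) = c ].
    `classify` maps an image to a label; `sample_transform` draws theta ~ D
    and returns T(x, theta)."""
    votes = Counter(classify(sample_transform(x)) for _ in range(n_samples))
    (top_label, top_count), = votes.most_common(1)
    return top_label, top_count / n_samples

# Example with a rotation distribution (reusing the toy t_semantic sketched above):
# import numpy as np
# smoothed_predict(classify, x,
#                  lambda img: t_semantic(img, (np.random.uniform(-30, 30), 1.0)))
```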

Physical Adversarial Examples

Physical adversarial examples exist in the physical world—patches, stickers, objects that fool cameras/sensors.

Physical Realizability Constraints

Printability: Perturbations must be printable with available printers (discrete color gamut, limited resolution).

Viewpoint robustness: Must fool the network from multiple viewing angles, distances.

Lighting robustness: Must work under various lighting conditions.

Environmental factors: Shadows, reflections, weather affect appearance.

Physical vs Digital Attacks

Digital (\ell_p):

  • Arbitrary pixel control
  • Perfect precision
  • Works in digital space only

Physical:

  • Limited to printable/fabricable perturbations
  • Subject to environmental variations
  • Works in real world (cameras, sensors)

Physical attacks are constrained but more dangerous: They work against deployed systems in the real world.

Adversarial Patches

Adversarial patch: A localized region (sticker, poster) designed to fool a network when placed in the scene.

Formulation: Optimize a patch p to cause misclassification when applied to an image:

\max_p \; \mathbb{E}_{x, t, l} \left[ \mathcal{L}(f_\theta(A(x, p, t, l)), y_{\text{target}}) \right]

where:

  • p: Patch content
  • x: Base image
  • t: Patch location/transformation
  • l: Lighting/environmental conditions
  • A: Applies patch to image
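As a deliberately simplified illustration, here is a sketch of the patch application operator A: it pastes the patch at a given location and scales its brightness. Real attacks additionally model perspective warping, printability, and sensor effects; all names and parameters here are illustrative.

```python
import numpy as np

def apply_patch(x, p, t, l=1.0):
    """Toy version of A(x, p, t, l): paste patch `p` into image `x` at location
    `t = (row, col)` and scale its brightness by `l`.
    Assumes the patch fits entirely inside the image at that location."""
    out = x.copy()
    r, c = t
    h, w = p.shape[:2]
    out[r:r + h, c:c + w] = np.clip(l * p, 0.0, 1.0)
    return out
```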

Verification: Proving robustness to patches is hard—must show no patch of size k \times k anywhere in the image can fool the network.

Approaches:

  • Certified patch defense: For small patches (single pixel), certify using masking and aggregation
  • Randomized ablation: Randomly mask portions of input, aggregate predictions
  • Segmentation-based: Use object segmentation to identify and ignore suspicious regions

Current state: Verification for general patch attacks is largely open. Most work focuses on specific patch sizes or defensive strategies rather than complete verification.

Physical World Attacks

Stop sign attacks: Stickers on stop signs causing misclassification.

3D-printed objects: Adversarial 3D objects (turtles, cars) designed to fool classifiers from multiple angles.

Wearable adversarial examples: Clothing, glasses designed to fool facial recognition.

Verification challenge: Must account for:

  • Camera optics (distortion, blur)
  • Lighting variations
  • Viewing angle changes
  • Environmental noise

No standard verification framework yet. Research explores:

  • Simulation-based testing (render scenes under various conditions)
  • Differentiable rendering for optimization
  • Statistical guarantees via sampling (probabilistic certification)

Sparse Perturbations (\ell_0 Attacks)

\ell_0 attacks modify a limited number of pixels, potentially by large amounts.

\ell_0 Threat Model

Definition: Adversarial example with at most k pixels changed:

\mathcal{B}_0^k(x_0) = \{ x : \|x - x_0\|_0 \leq k \}

where \|x - x_0\|_0 counts non-zero entries (number of changed pixels).

Difference from \ell_p:

  • \ell_p: Small changes to many pixels
  • \ell_0: Large changes to few pixels

Physical intuition: Corresponds to localized damage (dead pixels, sensor noise, occlusions).

Verification for \ell_0

Combinatorial challenge: With a d-dimensional input and budget k, there are \binom{d}{k} possible pixel sets to perturb. For images (d \sim 3000), even k=10 gives astronomically many combinations (on the order of 10^{28}).

Complete verification: Enumerate all \binom{d}{k} combinations, verify each. Intractable for realistic d, k.
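The count is easy to check directly; the sizes below assume a CIFAR-10-shaped input (32 x 32 x 3) purely for illustration.

```python
import math

# Number of candidate pixel sets for an l_0 budget of k in a d-pixel input.
d, k = 32 * 32 * 3, 10      # e.g., a CIFAR-10-sized input with budget k = 10
print(math.comb(d, k))      # roughly 2e28 subsets, far beyond exhaustive enumeration
```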

Incomplete approaches:

  • Greedy search: Iteratively select most influential pixels to perturb
  • Heuristic bounds: Use pixel importance scores to focus verification
  • Randomized certification: Sample pixel subsets, aggregate results probabilistically

Specialized defenses:

  • Median filtering: Replacing pixels with neighborhood medians mitigates sparse attacks
  • Randomized masking: Randomly mask pixels during inference, reducing attack success
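A sketch of the randomized masking idea, in the spirit of randomized-ablation defenses: classify many randomly masked copies of the input and aggregate by majority vote. An actual certificate would also bound how many votes an adversary controlling k pixels could flip; `classify` is an assumed user-supplied function.

```python
import numpy as np
from collections import Counter

def ablated_predict(classify, x, keep_frac=0.25, n_samples=100, rng=None):
    """Randomized-ablation-style inference: classify copies of `x` in which only a
    random fraction of pixels is kept (the rest zeroed), then majority-vote.
    A sparse attacker controlling few pixels rarely survives the random masking."""
    rng = rng or np.random.default_rng(0)
    h, w = x.shape[:2]
    votes = Counter()
    for _ in range(n_samples):
        mask = rng.random((h, w)) < keep_frac        # keep each pixel w.p. keep_frac
        votes[classify(x * mask[..., None])] += 1    # broadcast mask over channels
    (label, count), = votes.most_common(1)
    return label, count / n_samples
```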

Current state: Limited verification methods exist for \ell_0. Most work focuses on attack generation and specific defenses rather than general certification.

Distributional Robustness

Distributional robustness ensures performance across related but distinct data distributions.

Distribution Shift Problem

Training distribution \mathcal{D}_{\text{train}} vs test distribution \mathcal{D}_{\text{test}}:

  • Different demographics, ages, skin tones (facial recognition)
  • Different weather, lighting (autonomous driving)
  • Different hardware, cameras (medical imaging)

Standard ML assumption: \mathcal{D}_{\text{train}} = \mathcal{D}_{\text{test}}. In practice, this often fails.

Robustness goal: Ensure good performance on \mathcal{D}_{\text{test}} even when different from \mathcal{D}_{\text{train}}.

Formulations

Worst-case distribution in neighborhood of training distribution:

\max_{\mathcal{D} : d(\mathcal{D}, \mathcal{D}_{\text{train}}) \leq \epsilon} \; \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]

where d is a distribution distance (e.g., Wasserstein, KL divergence).

Distributionally robust optimization (DRO):

\min_\theta \max_{\mathcal{D} \in \mathcal{U}(\mathcal{D}_{\text{train}})} \mathbb{E}_{x \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]

where \mathcal{U} is an uncertainty set around the training distribution.
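One common practical surrogate for the inner maximization is the penalized per-example formulation used in Wasserstein DRO training (in the spirit of Sinha et al.). The sketch below assumes a PyTorch classifier `model` and illustrative hyperparameters; it is not a definitive implementation.

```python
import torch
import torch.nn.functional as F

def wdro_inner_max(model, x, y, gamma=10.0, lr=0.05, steps=15):
    """Sketch of the penalized inner maximization for Wasserstein DRO:
    find z maximizing  L(f_theta(z), y) - gamma * ||z - x||^2,
    a surrogate for the worst-case distribution within a transport-cost ball."""
    z = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        obj = F.cross_entropy(model(z), y) - gamma * ((z - x) ** 2).sum()
        grad, = torch.autograd.grad(obj, z)
        z = (z + lr * grad).detach().requires_grad_(True)
    return z.detach()

# DRO-style training then minimizes the loss at these surrogate worst-case points:
# loss = F.cross_entropy(model(wdro_inner_max(model, x, y)), y)
```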

Verification and Certification

Challenge: Verifying distributional robustness requires reasoning about sets of distributions, not individual inputs.

Approaches:

  • DRO training: Train networks to be distributionally robust. Verification comes from training objective guarantees.
  • Statistical testing: Sample from candidate test distributions, check performance empirically.
  • Causal reasoning: Use causal models to bound performance under interventions/shifts.

Connection to \ell_p: Individual \ell_p robustness can be viewed as distributional robustness under uniform perturbations. Distributional robustness generalizes this to structured shifts.

Current state: Distributional robustness is an active research area. Practical verification methods are limited; most work focuses on training for distributional robustness rather than verifying it post-hoc.

Domain-Specific Threat Models

Different application domains have unique threat models:

Autonomous Driving

Threats:

  • Weather conditions (rain, fog, snow)
  • Lighting changes (day/night, shadows)
  • Sensor noise (camera, LiDAR)
  • Adversarial objects (misleading signs, fake lane markings)

Verification needs:

  • Safety properties (don’t hit pedestrians, stay in lane)
  • Continuous safety (properties hold over time, not just single frames)
  • Multi-modal robustness (camera + LiDAR consistency)

Approaches: Simulation-based testing, scenario coverage, formal methods for control systems.

Medical Imaging

Threats:

  • Scanner variations (different manufacturers, models)
  • Imaging artifacts (noise, contrast variations)
  • Patient variations (age, anatomy)
  • Adversarial manipulations (malicious image modification)

Verification needs:

  • Diagnostic accuracy across populations
  • Robustness to imaging protocols
  • Detection of manipulated images

Approaches: Cross-scanner validation, adversarial training for artifacts, uncertainty quantification.

Facial Recognition

Threats:

  • Demographic variations (age, race, gender)
  • Expression changes (smiling, frowning)
  • Accessories (glasses, hats, makeup)
  • Adversarial glasses/makeup

Verification needs:

  • Fairness across demographics
  • Robustness to accessories
  • Detection of adversarial examples

Approaches: Fairness-constrained training, adversarial robustness for accessories, liveness detection.

Composing Threat Models

Real-world robustness requires defending against multiple threat models simultaneously.

Multi-Perturbation Robustness

Union of threats: Robust to \ell_p AND semantic AND physical:

\forall x' \in \mathcal{T}_1 \cup \mathcal{T}_2 \cup \cdots \cup \mathcal{T}_k, \quad \phi(f_\theta(x')) = \text{true}

Challenge: Verifying union is harder than verifying each individually.

Approaches:

  • Modular verification: Verify each threat separately, compose results (see the sketch after this list)
  • Joint training: Train for multiple threats simultaneously
  • Hierarchical robustness: Ensure robustness to most general threat (implies robustness to specific ones)
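A minimal sketch of the modular idea, assuming each per-threat certifier is sound for its own threat set: the conjunction of sound certificates is then sound for the union. The certifier names in the usage comment (crown_certify, smoothing_certify, patch_certify) are hypothetical placeholders.

```python
def certify_union(x, certifiers):
    """Modular composition: the property holds on the union of threat models
    if every per-threat certifier succeeds on x. Each certifier is assumed sound
    (it returns True only if the property holds on its own threat set), so the
    conjunction is sound for T_1 ∪ ... ∪ T_k. Returns (verified, per-threat results)."""
    results = {name: certify(x) for name, certify in certifiers.items()}
    return all(results.values()), results

# Example with hypothetical certifiers:
# verified, detail = certify_union(x, {
#     "linf":     lambda x: crown_certify(model, x, eps=8 / 255),
#     "rotation": lambda x: smoothing_certify(model, x, max_angle=30.0),
#     "patch":    lambda x: patch_certify(model, x, patch_size=(5, 5)),
# })
```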

Tradeoffs: Robustness to multiple threats may come with a larger accuracy drop than single-threat robustness.

Verification Challenges for Alternative Threats

Why Alternative Threats Are Harder

Non-convexity: Semantic transformations (rotation, scaling) are non-linear. Standard linear verification doesn’t apply.

High dimensionality: Physical attacks involve many parameters (lighting, viewpoint, patch content). Verification must reason about this high-dimensional space.

Lack of structure: \ell_p balls have nice mathematical structure (convex, symmetric). Alternative threats often lack this, making verification harder.

Limited tooling: Most verification tools assume \ell_p. Adapting them to alternative threats requires significant modification.

Current State and Future Directions

Semantic perturbations: Randomized smoothing with transformation distributions is the most mature approach. Geometric verification for specific transformations (rotation) is emerging.

Physical adversarial examples: Certified patch defenses for small patches exist. General physical verification remains largely an open problem.

Sparse perturbations: Limited verification methods. Mostly heuristic defenses and empirical evaluation.

Distributional robustness: DRO training provides some guarantees. Post-hoc verification largely unsolved.

Multi-threat robustness: Very early stage. Composing verification across threat models is an open challenge.

Practical Recommendations

For Researchers

Define threat model carefully: What perturbations can adversaries actually perform? \ell_p may not capture real threats.

Develop domain-specific verification: Generic \ell_p verification isn’t enough. Build verification methods for your domain’s threats.

Combine approaches: Use \ell_p verification as a baseline, augment with domain-specific testing/verification.

For Practitioners

Understand deployment threats: What attacks will your system face in practice? Physical patches? Distribution shift?

Multi-layered defense:

  1. \ell_p adversarial training (baseline robustness)
  2. Data augmentation for semantic/physical robustness
  3. Runtime monitoring for anomalies
  4. Regular re-training for distribution shift

Don’t rely solely on \ell_p certification: Use it as one component, not a complete solution.

For Tool Builders

Extend beyond \ell_p: Add support for semantic perturbations, sparse attacks, custom threat models.

Modular design: Allow users to define custom threat models, plug in verification methods.

Simulation support: Integrate with renderers, physics engines for physical adversarial example verification.

Final Thoughts

\ell_p threat models dominate neural network verification for good reason: mathematical convenience, well-studied theory, and practical tools. But real-world threats extend far beyond \ell_p.

Semantic perturbations, physical adversarial examples, sparse attacks, and distributional robustness each require different verification approaches. Some, like randomized smoothing for semantic perturbations, are maturing. Others, like general physical verification, remain largely open.

The future of neural network verification lies in domain-aware methods: understanding the specific threats each application faces and building verification tailored to those threats. \ell_p verification provides a foundation, but comprehensive robustness demands going beyond it.

Understanding alternative threat models clarifies verification’s scope and limitations. \ell_p certification is valuable but incomplete. Building truly robust systems requires addressing the full spectrum of threats, each with its own verification challenges and opportunities.

Further Reading

This guide provides comprehensive coverage of threat models beyond \ell_p norms. For readers interested in diving deeper, we recommend the following resources organized by topic:

Semantic Perturbations:

Randomized smoothing extends naturally to semantic transformations by using transformation distributions instead of additive noise. This provides certified robustness to rotations, translations, and other meaningful perturbations with theoretical guarantees.

Physical Adversarial Examples:

Adversarial patch attacks demonstrated that localized, printable perturbations can fool networks in the physical world. Real-world attacks on stop signs showed practical deployment of these threats. Certified defenses remain limited to specific scenarios like small patch sizes.

Standard \ell_p Attacks for Context:

Understanding classical \ell_p attacks—FGSM, PGD, C&W—provides essential context for alternative threats. These represent the baseline against which alternative threat models are compared.

Verification Methods for \ell_p:

Standard verification methods—CROWN, DeepPoly, IBP, Marabou—primarily target \ell_p threats. Understanding their design helps clarify what modifications are needed for alternative threats.

Certified Training:

Training for robustness—whether adversarial, certified, or randomized smoothing-based—typically focuses on \ell_p. Extending certified training to alternative threats is active research.

GPU Acceleration:

GPU-accelerated verification has primarily benefited \ell_p verification. Adapting these techniques to alternative threat models could enable practical verification at scale for non-\ell_p threats.

Theoretical Barriers:

The NP-completeness of verification applies regardless of threat model. Alternative threats don’t escape computational barriers but may present different practical challenges.

Related Topics:

For certified defenses including randomized smoothing, see Certified Defenses. For practical robustness testing across threat models, see Robustness Testing Guide.

Next Guide

Continue to Verifying Diverse Architectures to explore unique verification challenges for Transformers, RNNs, Graph Neural Networks, and other modern architectures.