Regularization-Based Robust Training

Adversarial training [Madry et al., 2018] is the most common approach to building robust networks: train on adversarially perturbed examples to learn invariance. But adversarial training is expensive (requires solving inner maximization for each batch), can hurt clean accuracy, and doesn’t provide certified guarantees during training.

Regularization-based training offers an alternative: add penalty terms to the loss that encourage robustness properties—small Lipschitz constants, large margins, smooth decision boundaries. These regularizers don’t require generating adversarial examples during training, provide clearer optimization objectives, and often integrate naturally with certified verification methods.

This guide explores regularization-based approaches to robust training: what robustness regularizers exist, how they compare to adversarial training, and when they’re the right choice.

Why Regularization for Robustness?

Standard training minimizes empirical risk on clean examples:

\[\min_\theta \mathbb{E}_{(x,y) \sim \mathcal{D}} [\mathcal{L}(f_\theta(x), y)]\]

This says nothing about behavior on perturbed inputs. Two strategies add robustness:

Adversarial training [Madry et al., 2018]: Augment training with worst-case perturbations:

\[\min_\theta \mathbb{E}_{(x,y)} \left[ \max_{\|\delta\|_p \leq \epsilon} \mathcal{L}(f_\theta(x + \delta), y) \right]\]

Regularization-based: Add penalties encouraging robustness properties:

\[\min_\theta \mathbb{E}_{(x,y)} [\mathcal{L}(f_\theta(x), y)] + \lambda R(f_\theta)\]

where \(R(f_\theta)\) is a regularizer (e.g., Lipschitz constant, smoothness).

Key Differences

Adversarial Training:

  • Trains on worst-case perturbations

  • Requires solving inner maximization (expensive)

  • Implicit robustness (through data augmentation)

  • No certified guarantees during training

Regularization-Based:

  • Penalizes network properties directly

  • Closed-form or efficiently computable penalties

  • Explicit robustness (through regularization term)

  • Can integrate with certified bounds

Types of Robustness Regularizers

Lipschitz Regularization

Idea: Constrain the Lipschitz constant \(L_f\) to limit sensitivity to input perturbations.

Regularizer: Penalize large Lipschitz constant:

\[R_{\text{Lip}}(f_\theta) = L_f^2 = \left( \sup_x \|\nabla f_\theta(x)\|_2 \right)^2\]

Approximations: Computing the exact Lipschitz constant is hard. Practical approaches:

Spectral normalization: Normalize weight matrices by spectral norm:

\[W_{\text{normalized}} = \frac{W}{\|W\|_2}\]

This ensures each layer has Lipschitz constant ≤ 1, giving \(L_f \leq 1\) for the network.

Gradient penalty: Penalize large gradients at training points:

\[R_{\text{grad}}(f_\theta) = \mathbb{E}_x \|\nabla_x f_\theta(x)\|_2^2\]

This encourages small local Lipschitz constant near training data.

Benefits: Networks with small Lipschitz constants have [Fazlyab et al., 2019]:

  • Certified robustness radius \(\epsilon_{\text{cert}} = \text{margin} / L_f\)

  • Better generalization (bounded complexity)

  • Smoother decision boundaries

Implementation:

import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Spectral normalization (hard constraint): each linear layer is rescaled by its
# largest singular value, so every layer is 1-Lipschitz by construction.
class RobustNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = spectral_norm(nn.Linear(784, 256))
        self.fc2 = spectral_norm(nn.Linear(256, 10))

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

# Gradient penalty (soft constraint): penalize the input gradient of the summed
# logits at training points, encouraging a small local Lipschitz constant.
def gradient_penalty(model, x):
    x = x.detach().requires_grad_(True)
    output = model(x)
    gradients = torch.autograd.grad(
        outputs=output, inputs=x,
        grad_outputs=torch.ones_like(output),
        create_graph=True  # keep the graph so the penalty itself is differentiable
    )[0]
    penalty = (gradients.norm(2, dim=1) ** 2).mean()
    return penalty

# Training loop (model, criterion, optimizer, dataloader, and lambda_reg are
# assumed to be defined as usual)
for x, y in dataloader:
    loss_task = criterion(model(x), y)
    loss_reg = gradient_penalty(model, x)
    loss_total = loss_task + lambda_reg * loss_reg

    optimizer.zero_grad()
    loss_total.backward()
    optimizer.step()

Margin-Based Regularization

Idea: Maximize the classification margin—the distance from decision boundary to training examples.

Regularizer: Penalize small margins:

\[R_{\text{margin}}(f_\theta) = -\mathbb{E}_{(x,y)} \left[ f_\theta(x)_y - \max_{c \neq y} f_\theta(x)_c \right]\]

Larger margin → more confident correct classification → harder to flip with perturbations.

Connection to robustness: For a network with Lipschitz constant \(L\), certified radius is:

\[\epsilon_{\text{cert}} = \frac{\text{margin}}{L}\]

Maximizing margin while minimizing \(L\) improves certified robustness.

Cross-entropy vs margin: Standard cross-entropy doesn’t explicitly maximize the margin. Adding a margin regularizer or using a margin-based loss (e.g., multi-class hinge loss) encourages larger separation between the correct class and the runner-up, as in the sketch below.
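
As a concrete illustration, here is a minimal sketch of the margin penalty above together with the certified radius margin / L used in this guide. The helper names margin_regularizer and certified_radius are ours, assuming a standard multi-class classifier:

import torch

def margin_regularizer(logits, y):
    """Negative mean margin: correct-class logit minus the best competing logit."""
    correct = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    # Mask the correct class before taking the runner-up maximum
    masked = logits.clone()
    masked.scatter_(1, y.unsqueeze(1), float('-inf'))
    runner_up = masked.max(dim=1).values
    return -(correct - runner_up).mean()

def certified_radius(logits, y, lipschitz_bound):
    """Per-example certified radius margin / L (clamped to zero when misclassified)."""
    correct = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, y.unsqueeze(1), float('-inf'))
    margin = correct - masked.max(dim=1).values
    return margin.clamp(min=0.0) / lipschitz_bound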

Smoothness Regularization

Idea: Encourage smooth network outputs—small second derivatives (Hessian).

Curvature penalty: Penalize large curvature:

\[R_{\text{smooth}}(f_\theta) = \mathbb{E}_x \|\nabla^2 f_\theta(x)\|_F^2\]

where \(\nabla^2 f\) is the Hessian matrix.

Why it helps: Smooth functions change slowly, making them robust to small perturbations. High curvature means the gradient itself changes rapidly, so a small perturbation can produce a disproportionately large output change → vulnerability.

Computational challenge: Computing and penalizing the Hessian is expensive (quadratic in dimension). Approximations:

Finite differences: Approximate curvature via output changes:

\[R_{\text{smooth}} \approx \mathbb{E}_{x, \delta} \|f_\theta(x + \delta) - f_\theta(x) - \nabla f_\theta(x)^T \delta\|^2\]

Random projection: Project Hessian onto random directions for efficient estimation.
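
A minimal sketch of the finite-difference variant shown above, assuming torch.autograd.functional.jvp is available for the Jacobian-vector product; the helper name curvature_penalty and the step size h are illustrative:

import torch
from torch.autograd.functional import jvp

def curvature_penalty(model, x, h=1e-2, num_samples=1):
    """Finite-difference curvature proxy: deviation of f(x + delta) from its
    first-order Taylor expansion f(x) + J(x) delta along random directions delta."""
    penalty = 0.0
    for _ in range(num_samples):
        delta = h * torch.randn_like(x)
        # jvp returns (f(x), J(x) @ delta) without forming the full Jacobian;
        # create_graph=True keeps the result differentiable for training
        f_x, jvp_delta = jvp(model, x, delta, create_graph=True)
        f_pert = model(x + delta)
        residual = f_pert - f_x - jvp_delta
        penalty = penalty + (residual ** 2).flatten(1).sum(dim=1).mean()
    return penalty / num_samples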

Jacobian Regularization

Jacobian Frobenius norm: Penalize large Jacobian \(J = \nabla_x f_\theta(x)\):

\[R_{\text{Jac}}(f_\theta) = \mathbb{E}_x \|J(x)\|_F^2 = \mathbb{E}_x \sum_{i,j} \left( \frac{\partial f_\theta(x)_i}{\partial x_j} \right)^2\]

Relationship to Lipschitz: For the \(\ell_2\) norm, \(L_f = \sup_x \|J(x)\|_2\) (spectral norm of the Jacobian). The Frobenius norm is an upper bound: \(\|J\|_2 \leq \|J\|_F\).

Efficient computation: Jacobian regularization can be computed via backpropagation without explicit Jacobian construction:

def jacobian_regularization(model, x):
    """Squared Frobenius norm of the input-output Jacobian, averaged over the
    batch and the number of output dimensions."""
    x = x.detach().requires_grad_(True)
    output = model(x)

    jac_penalty = 0.0
    # One backward pass per output dimension: exact, but costs O(num_classes) backprops
    for i in range(output.shape[1]):
        grad = torch.autograd.grad(
            output[:, i].sum(), x,
            create_graph=True, retain_graph=True
        )[0]
        jac_penalty += (grad ** 2).sum()

    return jac_penalty / (output.shape[0] * output.shape[1])
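
The exact loop above costs one backward pass per class. A cheaper alternative, in the same spirit as the random-projection idea mentioned for Hessians, estimates the expected squared Frobenius norm with random output projections (one vector-Jacobian product per sample). This is a hedged sketch; the function name jacobian_frobenius_estimate is ours:

import torch

def jacobian_frobenius_estimate(model, x, num_projections=1):
    """Unbiased estimate of the mean squared Frobenius norm of the Jacobian.

    For v with E[v v^T] = I, E[||v^T J||^2] = ||J||_F^2, so one vector-Jacobian
    product per projection suffices.
    """
    x = x.detach().requires_grad_(True)
    output = model(x)                      # shape (batch, num_classes)

    estimate = 0.0
    for _ in range(num_projections):
        v = torch.randn_like(output)       # random projection direction per example
        vjp = torch.autograd.grad(
            output, x, grad_outputs=v,
            create_graph=True, retain_graph=True
        )[0]                               # v^T J, same shape as x
        estimate += (vjp ** 2).sum()

    return estimate / (num_projections * output.shape[0])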

Comparison with Adversarial Training

Adversarial Training (PGD-AT)

Standard PGD adversarial training [Madry et al., 2018]:

\[\min_\theta \mathbb{E}_{(x,y)} \left[ \max_{\|\delta\|_\infty \leq \epsilon} \mathcal{L}(f_\theta(x + \delta), y) \right]\]

Inner maximization: Solve for worst-case perturbation via PGD:

\[\delta^{(t+1)} = \Pi_{\|\delta\|_\infty \leq \epsilon} \left( \delta^{(t)} + \alpha \cdot \text{sign}(\nabla_\delta \mathcal{L}(f_\theta(x + \delta^{(t)}), y)) \right)\]
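
A minimal sketch of this inner maximization, assuming inputs in [0, 1] and an \(\ell_\infty\) budget; hyperparameter values are illustrative and the function name pgd_inner_max is ours:

import torch
import torch.nn.functional as F

def pgd_inner_max(model, x, y, epsilon=0.031, alpha=0.007, num_steps=10):
    """Approximate the inner max with projected gradient ascent on the loss."""
    delta = torch.zeros_like(x).uniform_(-epsilon, epsilon)  # random start
    for _ in range(num_steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Ascent step, then project back onto the l_inf ball and the valid image range
        delta = (delta.detach() + alpha * grad.sign()).clamp(-epsilon, epsilon)
        delta = (x + delta).clamp(0.0, 1.0) - x
    return (x + delta).detach()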

Pros: - Strong empirical robustness (state-of-the-art for adversarial accuracy) - Directly optimizes worst-case objective - Well-studied, mature techniques

Cons: - Expensive (7-10× training time due to inner PGD loop) - Accuracy-robustness tradeoff (often 10-15% clean accuracy drop) - No certified guarantees (only empirical robustness) - Unstable training (requires careful tuning)

Regularization-Based Training

Lipschitz/gradient regularization:

\[\min_\theta \mathbb{E}_{(x,y)} [\mathcal{L}(f_\theta(x), y)] + \lambda \mathbb{E}_x \|\nabla f_\theta(x)\|^2\]

Pros: - Fast (no inner maximization, just a gradient penalty) - Better clean accuracy (smaller accuracy-robustness tradeoff) - Certified robustness (via Lipschitz bounds) - More stable training (smooth, differentiable penalty with no adversarial inner loop)

Cons: - Weaker empirical robustness than PGD-AT (adversarial accuracy typically lower) - Requires tuning regularization strength \(\lambda\) - Certified bounds may be loose (conservative)

Table 29 Training Method Comparison

| Method        | Training Time | Clean Accuracy   | Adversarial Accuracy    | Certified Robustness   |
|---------------|---------------|------------------|-------------------------|------------------------|
| Standard      | 1× (baseline) | Highest          | 0% (no defense)         | None                   |
| PGD-AT        | 7-10×         | Medium (-10-15%) | High (state-of-the-art) | No (empirical only)    |
| Lipschitz Reg | 1.2-2×        | High (-3-8%)     | Medium                  | Yes (Lipschitz bounds) |
| Certified AT  | 3-5×          | Medium-High      | Medium-High             | Yes (tight bounds)     |

Hybrid Approaches: TRADES

TRADES [Zhang et al., 2019] (TRadeoff-inspired Adversarial DEfense via Surrogate-loss minimization) combines adversarial training with regularization:

\[\min_\theta \mathbb{E}_{(x,y)} \left[ \mathcal{L}(f_\theta(x), y) + \lambda \max_{\|\delta\| \leq \epsilon} D_{\text{KL}}(f_\theta(x) \| f_\theta(x + \delta)) \right]\]

Key insight: Separate natural accuracy (first term) from robustness (second term). The parameter \(\lambda\) controls the tradeoff.

Advantages over PGD-AT: - Better natural accuracy (explicit natural loss term) - More stable training (KL divergence smoother than cross-entropy) - Tunable robustness-accuracy tradeoff (via \(\lambda\))

Relationship to regularization: The KL divergence term \(D_{\text{KL}}(f(x) \| f(x + \delta))\) penalizes output sensitivity—similar spirit to Lipschitz/gradient regularization but applied to worst-case perturbations.

Implementation:

import torch
import torch.nn.functional as F

def trades_loss(model, x, y, epsilon=0.031, alpha=0.007, num_steps=10, beta=6.0):
    """TRADES loss combining natural and robust objectives."""
    # Natural loss on clean inputs
    logits_natural = model(x)
    loss_natural = F.cross_entropy(logits_natural, y)

    # Fixed target distribution for the attack (detached so the inner loop does
    # not backpropagate into the clean forward pass)
    p_natural = F.softmax(logits_natural, dim=1).detach()

    # Generate adversarial examples that maximize the KL divergence
    x_adv = x.detach() + 0.001 * torch.randn_like(x)
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        with torch.enable_grad():
            logits_adv = model(x_adv)
            loss_kl = F.kl_div(
                F.log_softmax(logits_adv, dim=1),
                p_natural,
                reduction='batchmean'
            )
        grad = torch.autograd.grad(loss_kl, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project onto l_inf ball
        x_adv = torch.clamp(x_adv, 0.0, 1.0)

    # Robust loss: KL divergence between clean and adversarial predictions
    logits_adv = model(x_adv)
    loss_robust = F.kl_div(
        F.log_softmax(logits_adv, dim=1),
        F.softmax(logits_natural, dim=1),
        reduction='batchmean'
    )

    return loss_natural + beta * loss_robust

Practical Considerations

Choosing Regularization Strength

The regularization parameter \(\lambda\) controls the robustness-accuracy tradeoff:

  • Too small: Insufficient robustness, close to standard training

  • Too large: Over-regularization, poor accuracy

Heuristics for tuning:

  1. Grid search: Try \(\lambda \in \{0.001, 0.01, 0.1, 1.0, 10.0\}\), select based on validation accuracy and certified radius

  2. Warmup: Start with \(\lambda = 0\), gradually increase during training (annealing schedule; see the sketch after this list)

  3. Adaptive: Adjust \(\lambda\) based on current margin/robustness (increase if under-regularized, decrease if over-regularized)
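
A minimal sketch of the warmup heuristic from item 2, using an illustrative linear ramp; the function name lambda_schedule and the default values are assumptions:

def lambda_schedule(epoch, lambda_max=0.1, warmup_epochs=20):
    """Linearly ramp the regularization weight from 0 to lambda_max."""
    return lambda_max * min(1.0, epoch / warmup_epochs)

# Usage inside the training loop:
#   lambda_reg = lambda_schedule(epoch)
#   loss_total = loss_task + lambda_reg * loss_reg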

Combining Multiple Regularizers

Different regularizers capture different robustness aspects. Combining them can improve overall robustness:

\[R_{\text{total}} = \lambda_1 R_{\text{Lip}} + \lambda_2 R_{\text{margin}} + \lambda_3 R_{\text{smooth}}\]

Example: Lipschitz regularization + margin maximization ensures both small sensitivity (Lipschitz) and large separation (margin).

Challenge: More hyperparameters to tune. Start with single regularizers, add others if needed.
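
For illustration, a single training step combining the task loss with the gradient_penalty and margin_regularizer helpers sketched earlier in this guide; the weights lambda_lip and lambda_margin are illustrative:

import torch.nn.functional as F

def combined_training_step(model, optimizer, x, y, lambda_lip=0.01, lambda_margin=0.1):
    """One step of training with combined Lipschitz (gradient) and margin penalties."""
    logits = model(x)
    loss = F.cross_entropy(logits, y)
    loss = loss + lambda_lip * gradient_penalty(model, x)         # sensitivity term
    loss = loss + lambda_margin * margin_regularizer(logits, y)   # separation term

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()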

Integration with Certified Training

Regularization integrates naturally with certified training methods [Gowal et al., 2019, Zhang et al., 2020]:

IBP training [Gowal et al., 2019]: Compute bounds via interval propagation, maximize worst-case correct margin:

\[\min_\theta \mathbb{E}_{(x,y)} \left[ \mathcal{L}_{\text{IBP}}(f_\theta, x, y, \epsilon) \right] + \lambda R_{\text{Lip}}(f_\theta)\]

The Lipschitz regularization tightens IBP bounds, improving certified accuracy.

CROWN training: Use CROWN bounds [Zhang et al., 2018] for certified loss + Lipschitz regularization.
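
To make the integration concrete, below is a minimal sketch of interval bound propagation through a single linear layer followed by ReLU, assuming an \(\ell_\infty\) input ball of radius \(\epsilon\) (center/radius formulation; the function name ibp_linear_relu is ours):

import torch

def ibp_linear_relu(W, b, x, eps):
    """Propagate the box [x - eps, x + eps] through y = relu(W x + b).

    If inputs lie in [mu - r, mu + r], then W x + b lies in
    [W mu + b - |W| r, W mu + b + |W| r].
    """
    mu = x                               # center of the input box, shape (batch, d_in)
    r = torch.full_like(x, eps)          # per-coordinate radius
    mu_out = mu @ W.t() + b              # center after the affine layer
    r_out = r @ W.abs().t()              # radius after the affine layer
    lower = torch.relu(mu_out - r_out)   # ReLU is monotone: apply it to both bounds
    upper = torch.relu(mu_out + r_out)
    return lower, upper

Chaining such bounds layer by layer yields output intervals; \(\mathcal{L}_{\text{IBP}}\) above is then evaluated on the worst-case logits implied by those intervals, with the Lipschitz regularizer keeping the intervals from growing too quickly.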

When to Use Regularization-Based Training

Use regularization when:

  • Clean accuracy is important (minimal accuracy drop acceptable)

  • Want certified robustness guarantees (Lipschitz-based certification)

  • Training budget is limited (faster than adversarial training)

  • Network architecture benefits from regularization (e.g., GANs with spectral normalization)

  • Deploying in settings where certified guarantees are valued

Use adversarial training when:

  • Empirical robustness is paramount (need highest adversarial accuracy)

  • Willing to accept accuracy-robustness tradeoff

  • Computational budget allows expensive training

  • Certification not required (only need empirical robustness)

Use hybrid (TRADES) when:

  • Want balance between natural and robust accuracy

  • Need tunability (adjust \(\lambda\) for desired tradeoff)

  • Willing to pay moderate training cost (between regularization and full PGD-AT)

Complementary Approaches

Regularization and adversarial training aren’t mutually exclusive:

  • Start with regularization (Lipschitz, margin) for good initialization

  • Fine-tune with adversarial training for empirical robustness

  • Use TRADES to balance both objectives

This combines certified guarantees (regularization) with strong empirical robustness (adversarial training).

Current Research Directions

Tighter regularization: Developing regularizers that more directly correspond to certified robustness (e.g., tightening Lipschitz constant estimation).

Architecture-aware regularization: Exploiting specific architectures (CNNs, transformers) for more efficient regularization.

Learned regularization: Using meta-learning to automatically tune regularization strengths or learn custom regularizers.

Scalable certified training: Combining regularization with scalable certified bounds (IBP, CROWN) for certified training on large networks.

Multi-task regularization: Regularizing for multiple robustness properties (adversarial, natural distribution shift, fairness) simultaneously.

Limitations

Weaker empirical robustness: Regularization-based methods typically achieve lower adversarial accuracy than PGD adversarial training [Madry et al., 2018] on strong attacks.

Loose certified bounds: Lipschitz-based certification can be conservative, especially for deep networks where product bounds accumulate.

Hyperparameter sensitivity: Performance depends on regularization strength \(\lambda\); requires careful tuning.

Not defense against all attacks: Regularization helps against \(\ell_p\) perturbations but may not defend against other attack types (e.g., semantic, physical).

Final Thoughts

Regularization-based training provides an elegant alternative to adversarial training: instead of augmenting data with adversarial examples, directly encourage robustness properties through regularization. This approach is faster, maintains better clean accuracy, and integrates naturally with certified verification methods [Fazlyab et al., 2019, Gowal et al., 2019, Zhang et al., 2018].

While regularization alone may not achieve the empirical robustness of intensive adversarial training [Madry et al., 2018], it offers a better accuracy-robustness-cost tradeoff for many applications. Methods like TRADES [Zhang et al., 2019] show that combining both philosophies—explicit regularization and adversarial examples—yields the best of both worlds.

Understanding regularization-based training clarifies the relationship between network properties (Lipschitz constant, margin, smoothness) and robustness. This perspective guides both training (what to regularize) and verification (what bounds to expect), providing a unified framework for building and certifying robust networks.

Further Reading

This guide provides comprehensive coverage of regularization-based robust training. For readers interested in diving deeper, we recommend the following resources organized by topic:

Lipschitz Regularization:

Lipschitz-based certification [Fazlyab et al., 2019] provides the theoretical foundation connecting Lipschitz constants to certified robustness. Spectral normalization and gradient penalties represent practical implementations of Lipschitz regularization, ensuring networks have bounded sensitivity to perturbations.

Adversarial Training for Comparison:

PGD adversarial training [Madry et al., 2018] remains the gold standard for empirical robustness, providing important context for evaluating regularization-based methods. Understanding the accuracy-robustness tradeoff in adversarial training helps clarify when regularization offers advantages.

TRADES - Hybrid Approach:

TRADES [Zhang et al., 2019] elegantly combines natural accuracy objectives with robustness regularization, demonstrating that explicit tradeoff control yields better results than pure adversarial training. This work shows how regularization principles can enhance adversarial training.

Certified Training Integration:

IBP training [Gowal et al., 2019] and CROWN-based training [Zhang et al., 2018, Zhang et al., 2020] demonstrate how regularization integrates with incomplete verification methods for certified robust training. These approaches show that regularization can tighten certified bounds during training, improving both certified and empirical robustness.

Margin-Based Methods:

The connection between classification margin and robustness is well-established in learning theory. Maximizing margins while controlling Lipschitz constants provides provable robustness guarantees, connecting classical machine learning principles to modern neural network robustness.

Smoothness and Curvature:

Curvature-based regularization explores higher-order smoothness properties beyond first-order Lipschitz bounds. Penalizing large Hessian norms encourages locally linear behavior, reducing vulnerability to small perturbations.

Comparison with Other Defenses:

For probabilistic robustness guarantees, randomized smoothing [Cohen et al., 2019] offers an alternative to deterministic regularization. For complete verification after training, methods like Marabou [Katz et al., 2019] and branch-and-bound [Wang et al., 2021] complement training-time regularization with deployment-time verification.

Related Topics:

For understanding Lipschitz bounds that regularization controls, see Lipschitz Bounds and Curvature-Based Verification. For certified training methods that integrate with regularization, see Certified Defenses and Randomized Smoothing. For adversarial training that regularization compares against, see Training Robust Networks. For verification methods that exploit regularized networks, see Bound Propagation Approaches.

Next Guide

Continue to Certified Adversarial Training to learn about training with verified bounds using IBP, CROWN, and relaxation-based methods for provable robustness guarantees.

[1] Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning. 2019.

[2] Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, and George Pappas. Efficient and accurate estimation of Lipschitz constants for deep neural networks. In Advances in Neural Information Processing Systems, 11423–11434. 2019.

[3] Sven Gowal, Krishnamurthy Dj Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. Scalable verified training for provably robust image classification. In Proceedings of the IEEE International Conference on Computer Vision, 4842–4851. 2019.

[4] Guy Katz, Derek A. Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, and others. The Marabou framework for verification and analysis of deep neural networks. In International Conference on Computer Aided Verification, 443–452. Springer, 2019.

[5] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations. 2018.

[6] Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, and J. Zico Kolter. Beta-CROWN: efficient bound propagation with per-neuron split constraints for neural network robustness verification. Advances in Neural Information Processing Systems, 2021.

[7] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning, 7472–7482. PMLR, 2019.

[8] Huan Zhang, Hongge Chen, Chaowei Xiao, Sven Gowal, Robert Stanforth, Bo Li, Duane Boning, and Cho-Jui Hsieh. Towards stable and efficient training of verifiably robust neural networks. In International Conference on Learning Representations. 2020.

[9] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, 4939–4948. 2018.