SDP-Based Verification: Semi-Definite Programming for Tighter Bounds¶
Linear programming (LP) relaxations [Salman et al., 2019] provide polynomial-time verification but sometimes yield loose bounds. Complete methods [Tjeng et al., 2019, Katz et al., 2019] provide exact answers but don’t scale. What if you want tighter bounds than LP without the exponential cost of complete methods?
Semi-definite programming (SDP) offers exactly this: relaxations tighter than LP while remaining tractable. SDP-based verification [Raghunathan et al., 2018] formulates bound computation as an SDP, exploiting quadratic relationships that LP misses. The tradeoff is computational cost—SDP solvers are slower than LP solvers—but for critical properties on moderately-sized networks, the tightness gain often justifies the expense.
This guide explores how SDP-based verification works, why it provides tighter bounds, and when the extra computational cost is worthwhile.
Why Semi-Definite Programming?¶
Linear programs optimize over polytopes (linear constraints). Semi-definite programs optimize over spectrahedra (positive semi-definite matrix constraints). This richer geometry captures relationships that linear constraints cannot.
Key advantage: SDP can represent quadratic constraints and products of variables naturally through positive semi-definite (PSD) matrix constraints. For neural networks, this captures neuron correlations and non-linear relationships more tightly than linear relaxations.
Semi-Definite Program (SDP)
A semi-definite program has the form:
\[\min_{X} \; \langle C, X \rangle \quad \text{subject to} \quad \langle A_i, X \rangle = b_i \;\; (i = 1, \dots, m), \qquad X \succeq 0\]
where:
\(X\) is a symmetric matrix variable
\(X \succeq 0\) means \(X\) is positive semi-definite (all eigenvalues non-negative)
\(\langle A, B \rangle = \text{trace}(A^T B)\) is the matrix inner product
Complexity: SDP can be solved in polynomial time using interior-point methods. Practical solvers (CVXPY, SCS, Mosek) handle SDPs with thousands of variables.
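To make the definition concrete, here is a minimal CVXPY sketch of an SDP in this form; the cost matrix and the unit-trace normalization are arbitrary illustrative choices, and the optimal value should equal the smallest eigenvalue of \(C\), which gives an easy correctness check:
import cvxpy as cp
import numpy as np
# Toy SDP in the standard form above: minimize <C, X> over PSD matrices with
# unit trace. The optimum equals the smallest eigenvalue of C.
n = 3
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                         # symmetric cost matrix
X = cp.Variable((n, n), symmetric=True)   # symmetric matrix variable
constraints = [X >> 0,                    # X is positive semi-definite
               cp.trace(X) == 1]          # <I, X> = 1
problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
problem.solve(solver=cp.SCS)
print("SDP optimum:   ", problem.value)
print("lambda_min(C): ", np.linalg.eigvalsh(C).min())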
SDP vs LP for Verification¶
LP relaxations [Salman et al., 2019] use linear inequalities to over-approximate the neural network’s behavior. They’re fast (polynomial time) but conservative.
SDP relaxations [Raghunathan et al., 2018] use positive semi-definite constraints to over-approximate behavior. They’re slower (higher-degree polynomial) but tighter.
Tightness hierarchy: For the same network, SDP relaxations provably provide tighter or equal bounds compared to LP relaxations. This theoretical guarantee comes from the fact that SDP can encode LP constraints (every LP is an SDP) plus additional quadratic constraints.
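To make the parenthetical concrete, here is the standard textbook embedding of a standard-form LP into an SDP (general convex-optimization background, not specific to any verification paper):
\[\min_{x} \; c^T x \;\; \text{s.t.} \;\; Ax = b,\; x \geq 0 \qquad\text{has the same optimal value as}\qquad \min_{X \succeq 0} \; \langle \mathrm{Diag}(c), X \rangle \;\; \text{s.t.} \;\; \langle \mathrm{Diag}(a_i), X \rangle = b_i \;\; \forall i,\]
where \(a_i\) is the \(i\)-th row of \(A\) and \(\mathrm{Diag}(\cdot)\) places a vector on the diagonal of a matrix. Only the diagonal of \(X\) enters the objective and constraints, and the diagonal of a PSD matrix is non-negative, so \(x = \mathrm{diag}(X)\) and \(X = \mathrm{Diag}(x)\) map feasible points back and forth without changing the objective value.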
| Aspect | LP-Based Verification | SDP-Based Verification |
|---|---|---|
| Constraints | Linear inequalities | PSD matrix constraints |
| Time Complexity | O(n³) interior-point | O(n^4.5) interior-point |
| Tightness | Moderate (single-neuron) | Tight (captures quadratic) |
| Scalability | Thousands of neurons | Hundreds to ~1000 neurons |
| Best Use | General verification | Critical properties, small networks |
SDP Formulation for Neural Networks¶
The SDP formulation [Raghunathan et al., 2018] for neural network verification extends the LP approach by introducing a matrix variable that captures second-order relationships between neurons.
Lifting to Quadratic Form¶
Standard approach: For a network with variables \(z\) (all neuron activations), verify bounds on outputs.
SDP approach: Introduce a lifted variable \(X = zz^T\), a matrix encoding all pairwise products of neuron activations.
Why this helps: ReLU and other non-linearities create quadratic relationships. By explicitly representing these products in \(X\), the SDP can encode tighter constraints.
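A quick NumPy illustration of the lifting (the activation vector here is arbitrary toy data): the lifted matrix is exactly rank one when it is a true outer product, and the block matrix used in the SDP formulation below is then positive semi-definite:
import numpy as np
# Illustrative activations for a tiny network (arbitrary toy values).
z = np.array([0.5, -1.2, 2.0])
# Lifted matrix: all pairwise products of activations.
X = np.outer(z, z)                               # X[i, j] = z[i] * z[j]
print("rank of X:", np.linalg.matrix_rank(X))    # 1 for a true outer product
# The block matrix [[X, z], [z^T, 1]] used in the SDP formulation below is
# positive semi-definite when X = z z^T:
M = np.block([[X, z[:, None]],
              [z[None, :], np.ones((1, 1))]])
print("eigenvalues:", np.linalg.eigvalsh(M))     # all >= 0 (up to round-off)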
Basic SDP Formulation¶
Given: Neural network \(f_\theta\), input region \(\mathcal{X}\), output neuron \(k\) to bound.
Variables:
\(z \in \mathbb{R}^n\): All neuron pre/post-activations
\(X \in \mathbb{R}^{n \times n}\): Lifted matrix with \(X_{ij} = z_i z_j\)
Objective: Minimize \(z_k\) (output neuron \(k\))
Constraints:
Linear layer constraints: For each affine layer, \(z_j = \sum_i W_{ji} z_i + b_j\), i.e.
\[z_j = \langle W_j, z \rangle + b_j\]
Input bounds: \(\underline{x} \leq x \leq \overline{x}\) for input \(x\)
ReLU constraints: For \(y = \text{ReLU}(z)\) with bounds \(\underline{z} \leq z \leq \overline{z}\):
If \(\overline{z} \leq 0\): \(y = 0\) (inactive)
If \(\underline{z} \geq 0\): \(y = z\) (active)
If \(\underline{z} < 0 < \overline{z}\): Use SDP relaxation
Lifted consistency: \(X_{ij} = z_i z_j\) (the exact, non-convex relationship)
PSD constraint: \(\begin{bmatrix} X & z \\ z^T & 1 \end{bmatrix} \succeq 0\)
Key insight: The PSD constraint \(\begin{bmatrix} X & z \\ z^T & 1 \end{bmatrix} \succeq 0\) is the convex relaxation of the lifted-consistency requirement: it forces \(X\) to behave like an outer product, so \(X_{ij} \approx z_i z_j\), without imposing the exact equality \(X = zz^T\), which would be non-convex.
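To spell this out (a standard Schur-complement argument, stated here for completeness rather than taken from the cited paper): since the lower-right block of the lifted matrix equals \(1 > 0\),
\[\begin{bmatrix} X & z \\ z^T & 1 \end{bmatrix} \succeq 0 \quad\Longleftrightarrow\quad X - zz^T \succeq 0 \quad\Longleftrightarrow\quad X \succeq zz^T.\]
The non-convex equality \(X = zz^T\) is thus relaxed to the convex one-sided condition \(X \succeq zz^T\); the original equality is recovered exactly when the \((n+1)\times(n+1)\) block matrix has rank one.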
ReLU Relaxation in SDP¶
For uncertain ReLUs (\(\underline{z} < 0 < \overline{z}\)), the SDP formulation adds tighter constraints than LP [Raghunathan et al., 2018]:
LP triangle relaxation:
\[y \geq 0, \qquad y \geq z, \qquad y \leq \frac{\overline{z}\,(z - \underline{z})}{\overline{z} - \underline{z}}\]
SDP quadratic relaxation (additional constraint beyond LP):
\[y \cdot z \geq 0\]
In the lifted formulation this product becomes a linear constraint on the corresponding entry of \(X\), i.e. \(X_{ij} \geq 0\), where \(i\) indexes the pre-activation and \(j\) the post-activation neuron.
This constraint captures the fact that when the ReLU is active (\(y = z\)), both \(y\) and \(z\) are non-negative, so their product is non-negative; when it is inactive (\(y = 0\)), the product is zero.
Why it’s tighter: The quadratic constraint \(y \cdot z \geq 0\) is non-linear, which LP cannot express. SDP encodes it via the lifted matrix \(X\) with the PSD constraint.
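The sketch below writes both relaxations for a single uncertain ReLU in CVXPY; the bounds l and u, the 2×2 lifted block P, and the toy objective are illustrative choices, not part of any library API or of the cited formulation:
import cvxpy as cp
import numpy as np
l, u = -1.0, 2.0                     # pre-activation bounds with l < 0 < u
z = cp.Variable()                    # pre-activation
y = cp.Variable()                    # post-activation, y = ReLU(z)
# LP triangle relaxation (plus the box on z)
triangle = [y >= 0, y >= z, y <= u * (z - l) / (u - l), z >= l, z <= u]
# SDP relaxation: lift (z, y) into a 2x2 block P ~ [z, y][z, y]^T and add the
# quadratic constraint y * z >= 0 as a linear constraint on the lifted entry.
P = cp.Variable((2, 2), symmetric=True)   # P[0,0]~z^2, P[0,1]~z*y, P[1,1]~y^2
v = cp.vstack([cp.reshape(z, (1, 1)), cp.reshape(y, (1, 1))])
sdp = triangle + [cp.bmat([[P, v], [v.T, np.ones((1, 1))]]) >> 0,  # P >= v v^T
                  P[0, 1] >= 0]                                    # y * z >= 0
# Sanity check that the relaxation is feasible and solvable. In a full network,
# one shared lifted matrix couples all neurons, and these per-ReLU constraints
# tighten the bound on the output.
print(cp.Problem(cp.Minimize(y - 0.5 * z), sdp).solve(solver=cp.SCS))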
Nuclear Norm and Rank Constraints¶
An alternative SDP formulation [Raghunathan et al., 2018] uses the nuclear norm (sum of singular values) to encourage low-rank structure in the lifted matrix \(X\).
Nuclear Norm Regularization¶
Motivation: The exact relationship \(X = zz^T\) implies \(X\) is rank-1. Relaxing this constraint makes the problem convex, but allowing arbitrary \(X\) can be loose.
Middle ground: Minimize the nuclear norm \(\|X\|_* = \sum_i \sigma_i(X)\) where \(\sigma_i\) are singular values.
SDP formulation with nuclear norm:
\[\min_{z,\,X} \; z_k + \lambda \, \|X\|_* \quad \text{subject to the linear, input, ReLU, and PSD constraints above}\]
Effect: The nuclear norm penalty encourages \(X\) to be low-rank, closer to the rank-1 structure of true outer products. The parameter \(\lambda\) balances tightness (higher \(\lambda\)) and conservativeness (lower \(\lambda\)).
Nuclear norm as SDP: The nuclear norm minimization can be reformulated as an SDP constraint:
\[\|X\|_* \leq t \;\;\Longleftrightarrow\;\; \exists\, W_1, W_2 \;\text{symmetric such that}\; \begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0, \quad \tfrac{1}{2}\big(\text{trace}(W_1) + \text{trace}(W_2)\big) \leq t\]
This makes nuclear norm optimization tractable within the SDP framework.
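As a sanity check on this reformulation, the sketch below compares CVXPY's built-in nuclear-norm atom normNuc with the explicit SDP form; the matrix M is arbitrary illustrative data, and both values should equal the sum of the singular values of M:
import cvxpy as cp
import numpy as np
n = 4
M = np.random.default_rng(0).standard_normal((n, n))   # illustrative data matrix
# Direct form: CVXPY's built-in nuclear-norm atom.
X = cp.Variable((n, n))
direct = cp.Problem(cp.Minimize(cp.normNuc(X)), [X == M])
direct.solve(solver=cp.SCS)
# Equivalent SDP form: ||X||_* <= t iff there exist symmetric W1, W2 with
# [[W1, X], [X^T, W2]] PSD and (trace(W1) + trace(W2)) / 2 <= t.
X2 = cp.Variable((n, n))
W1 = cp.Variable((n, n), symmetric=True)
W2 = cp.Variable((n, n), symmetric=True)
t = cp.Variable()
sdp = cp.Problem(cp.Minimize(t),
                 [X2 == M,
                  cp.bmat([[W1, X2], [X2.T, W2]]) >> 0,
                  (cp.trace(W1) + cp.trace(W2)) / 2 <= t])
sdp.solve(solver=cp.SCS)
print(direct.value, sdp.value, np.linalg.svd(M, compute_uv=False).sum())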
Tightness Improvements¶
Empirical results [Raghunathan et al., 2018]:
Nuclear norm regularization provides 10-30% tighter bounds than unregularized SDP
SDP provides 20-50% tighter bounds than LP relaxation for typical networks
Tightness gain is most pronounced for networks with many uncertain ReLUs
Comparison with Other Methods¶
SDP vs LP-Based Methods¶
LP-based methods [Salman et al., 2019]:
Pros: Fast, scalable to large networks
Cons: Loose bounds (misses quadratic relationships)
SDP-based methods [Raghunathan et al., 2018]:
Pros: Tighter bounds (captures products via PSD constraints)
Cons: Slower (cubic to quartic complexity)
When SDP wins: Networks with strong neuron correlations, properties requiring very tight bounds, moderate network size (hundreds to ~1000 neurons).
SDP vs Multi-Neuron Relaxations¶
Multi-neuron relaxations [Müller et al., 2022]:
Approach: Add pairwise product terms explicitly for selected neuron pairs
Complexity: Depends on number of pairs \(k\); roughly \(O(n + k)\) where \(k \ll n^2\)
Tightness: Tight for selected pairs, but greedy pair selection may miss important correlations
SDP-based methods:
Approach: Encode all pairwise products via lifted matrix \(X\)
Complexity: \(O(n^{4.5})\) for general SDP solvers
Tightness: Captures all pairwise products, potentially tighter than selective multi-neuron
Tradeoff: PRIMA [Müller et al., 2022] is often faster with comparable tightness due to selective pairing. SDP provides theoretical tightness guarantees but at higher computational cost.
SDP vs Complete Methods¶
Complete methods [Katz et al., 2019, Wang et al., 2021]:
Pros: Exact answers (verified or counterexample)
Cons: Exponential worst-case complexity, don’t scale to large networks
SDP methods:
Pros: Polynomial time, tighter than LP/CROWN
Cons: Still incomplete (might return “unknown”)
When to choose SDP over complete: When the network is too large for complete methods but LP/CROWN are too loose. SDP fills the gap between fast incomplete and slow complete methods.
| Method | Tightness | Time Complexity | Network Size Limit |
|---|---|---|---|
| IBP | Loosest | O(n) | Millions of neurons |
| CROWN | Moderate | O(n) | Millions of neurons |
| LP-based | Moderate-Tight | O(n³) | Thousands of neurons |
| PRIMA | Tight | O(n + k pairs) | Thousands of neurons |
| SDP-based | Very Tight | O(n^4.5) | Hundreds to ~1000 |
| Complete (SMT/MILP) | Exact | Exponential | Hundreds of neurons |
Practical Implementation¶
Using SDP Solvers¶
Available solvers:
CVXPY [authors, 2020]: High-level Python interface, supports multiple SDP solvers
SCS: Splitting Conic Solver, fast for large-scale problems
MOSEK: Commercial solver, highly optimized
SDPT3/SeDuMi: MATLAB-based solvers
Typical workflow:
import cvxpy as cp
import numpy as np
# Problem setup: network weights, input bounds, etc. The sizes and bounds
# below are illustrative placeholders.
n_neurons = 100                      # total neurons (all layers)
input_dim = 10                       # number of input neurons
x_lower = -0.1 * np.ones(input_dim)  # input lower bounds
x_upper = 0.1 * np.ones(input_dim)   # input upper bounds
# Variables
z = cp.Variable(n_neurons)                               # all neuron activations
X = cp.Variable((n_neurons, n_neurons), symmetric=True)  # lifted matrix, X ~ z z^T
# Objective: minimize the output neuron to obtain a lower bound on it
objective = cp.Minimize(z[-1])
constraints = []
# Linear layer constraints: z_next == W @ z_prev + b
# (add one equality per layer; without them the output is unconstrained and the
#  problem is unbounded, so this scaffold must be completed for a real network)
# Input bounds
constraints.append(z[:input_dim] >= x_lower)
constraints.append(z[:input_dim] <= x_upper)
# ReLU constraints (triangle + quadratic). For each uncertain ReLU with
# pre-activation index i and post-activation index j, the quadratic constraint
# y_j * z_i >= 0 is written linearly on the lifted matrix (CVXPY does not allow
# products of variables directly):
# constraints.append(X[j, i] >= 0)
# PSD constraint (lifted consistency): [[X, z], [z^T, 1]] is PSD
z_col = cp.reshape(z, (n_neurons, 1))
lifted = cp.bmat([[X, z_col],
                  [z_col.T, np.ones((1, 1))]])
constraints.append(lifted >> 0)
# Solve
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.SCS)
lower_bound = problem.value
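Interpreting the result: because the SDP relaxes the true constraint set, lower_bound is a sound lower bound on the output neuron over the input region. If that neuron encodes a margin (say, the correct-class logit minus a competitor’s) and the bound is positive, the property is certified; otherwise the outcome is inconclusive rather than a counterexample.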
Performance Considerations¶
Scalability challenges:
Matrix size: \(X \in \mathbb{R}^{n \times n}\) has \(O(n^2)\) entries. For \(n = 1000\), that’s 1 million variables.
Solver time: SDP solvers have \(O(n^{4.5})\) complexity. Doubling network size increases time by ~22×.
Optimizations:
Sparsity exploitation: Many entries of \(X\) may be irrelevant (neurons in different layers don’t interact directly). Exploit this sparsity to reduce variable count.
Layer-by-layer solving: Similar to LP [Salman et al., 2019], solve SDP layer-by-layer rather than for the entire network. This keeps \(n\) small (neurons per layer) rather than large (total neurons).
Warm-starting: When solving multiple similar SDPs (different output neurons, slightly different bounds), initialize the solver with the previous solution; a minimal sketch appears after this list.
GPU acceleration: Recent work explores GPU-accelerated verification [Xu et al., 2020]; traditional SDP solvers remain CPU-based, so structure-exploiting, GPU-friendly SDP solving is still an open direction.
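As a concrete illustration of the warm-starting idea referenced above, a minimal CVXPY sketch; the cost matrix, the trace budget, and the parameter name radius are illustrative stand-ins for whatever changes between verification queries:
import cvxpy as cp
import numpy as np
# Parameterize the quantity that changes between queries so CVXPY keeps the
# compiled problem and SCS can reuse the previous solution.
n = 10
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n))
C = (C + C.T) / 2
radius = cp.Parameter(nonneg=True)               # the changing quantity
X = cp.Variable((n, n), symmetric=True)
problem = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                     [X >> 0, cp.trace(X) <= radius])
radius.value = 1.0
problem.solve(solver=cp.SCS)
radius.value = 1.1                               # slightly different instance
problem.solve(solver=cp.SCS, warm_start=True)    # reuse the previous iterate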
When to Use SDP-Based Verification¶
Use SDP when:
LP or CROWN [Weng et al., 2018, Zhang et al., 2018] return “unknown” and you need tighter bounds
Network is small to medium (hundreds to ~1000 neurons)
Properties are critical—extra computational cost justified for tightness
Strong neuron correlations (convolutional layers, residual connections)
Computational budget allows hours for verification
Don’t use when:
Network is very large (thousands to millions of neurons)—LP/CROWN/PRIMA faster
LP already verifies the property—no need for tighter bounds
Need complete verification—use SMT [Katz et al., 2019] or branch-and-bound [Wang et al., 2021] instead
Rapid iteration required—SDP too slow for tight feedback loops
Sweet Spot for SDP
SDP-based verification occupies a niche: tighter than LP but slower, faster than complete but still incomplete. It’s most valuable when:
Simple incomplete methods fail to verify
Complete methods are too slow
Network size is moderate (not tiny, not huge)
Tightness matters more than speed
Current Research and Extensions¶
Active research directions:
Tighter SDP relaxations: Beyond basic nuclear norm, researchers explore additional constraints (e.g., exploiting network structure like convolution, adding higher-order tensor relaxations) to tighten bounds further.
Faster solvers: Developing specialized SDP solvers for neural network verification that exploit problem structure for faster solving.
Hybrid SDP-MILP: Combine SDP relaxations with integer programming for select neurons, balancing tightness and completeness.
Integration with training: Train networks that are easier to verify via SDP (e.g., encouraging low nuclear norm in activations).
Probabilistic SDP: Extend SDP to handle probabilistic guarantees, combining with randomized smoothing [Cohen et al., 2019].
Limitations¶
Computational cost: \(O(n^{4.5})\) complexity limits SDP to moderately-sized networks. Large-scale verification requires faster methods.
Still incomplete: SDP provides tighter bounds than LP but might still return “unknown.” For definitive answers, complete methods [Katz et al., 2019, Wang et al., 2021] are needed.
Solver numerical issues: SDP solvers can face numerical instability, especially for large problems or poorly-scaled constraints. Careful problem formulation and solver tuning are required.
Not always tighter in practice: While theoretically tighter than LP, in practice the improvement depends on network structure. For some networks, the gain is marginal, not worth the extra cost.
Final Thoughts¶
SDP-based verification [Raghunathan et al., 2018] represents a sophisticated approach to incomplete verification: encoding neural networks as semi-definite programs captures quadratic relationships that simpler methods miss. The positive semi-definite constraint—a subtle but powerful tool from convex optimization—enables reasoning about products and correlations without sacrificing polynomial-time solvability.
While SDP doesn’t scale to massive networks like CROWN [Weng et al., 2018, Zhang et al., 2018] or IBP [Gowal et al., 2019] do, it fills an important niche. For critical properties on moderately-sized networks where LP is too loose and complete methods [Katz et al., 2019] are too slow, SDP provides the right balance.
Understanding SDP-based verification illuminates the tightness hierarchy of incomplete methods: from simple intervals (IBP) to linear bounds (CROWN) to linear programs (LP [Salman et al., 2019]) to semi-definite programs (SDP) to complete enumeration (SMT/MILP [Tjeng et al., 2019]). Each step up the hierarchy trades speed for tightness. The art of verification is choosing the right point on this spectrum for your application.
Further Reading
This guide provides comprehensive coverage of SDP-based verification for neural networks. For readers interested in diving deeper, we recommend the following resources organized by topic:
SDP-Based Verification Foundations:
The seminal work on SDP relaxations for neural network verification [Raghunathan et al., 2018] introduced the lifted matrix formulation and nuclear norm regularization, demonstrating that semi-definite programming can provide significantly tighter bounds than linear programming while maintaining polynomial-time complexity. This foundational paper establishes the theoretical tightness hierarchy: SDP relaxations are provably at least as tight as LP relaxations.
Convex Optimization Background:
For readers unfamiliar with semi-definite programming, CVXPY [authors, 2020] provides an accessible introduction to convex optimization and SDP solving with practical Python implementations. Understanding SDP fundamentals—positive semi-definiteness, matrix inner products, duality theory—is essential for applying SDP-based verification effectively.
Comparison with LP-Based Methods:
LP-based verification [Salman et al., 2019] provides the baseline for comparison. SDP extends LP by adding PSD constraints that capture quadratic relationships. Understanding LP formulations helps clarify what additional tightness SDP provides and why it costs more computationally.
Comparison with Multi-Neuron Methods:
PRIMA [Müller et al., 2022] and other multi-neuron relaxations offer an alternative path to tighter bounds: selectively add product terms rather than lifting the entire problem to \(X = zz^T\). In practice, PRIMA with careful pair selection often matches SDP tightness at lower computational cost. The tradeoff is theoretical: SDP provides worst-case tightness guarantees, while PRIMA’s tightness depends on heuristic pair selection.
Bound Propagation Baselines:
Fast incomplete methods—CROWN [Weng et al., 2018, Zhang et al., 2018], DeepPoly [Singh et al., 2019], IBP [Gowal et al., 2019]—provide the lower bound on the tightness-speed spectrum. SDP trades their speed for tighter bounds. Understanding these baselines clarifies when SDP’s extra cost is justified.
Complete Verification for Context:
When SDP is still too loose or you need definitive answers, complete methods are necessary. Marabou [Katz et al., 2019] provides SMT-based complete verification. MILP approaches [Tjeng et al., 2019] formulate verification as mixed-integer programming. Branch-and-bound [Wang et al., 2021, Zhang et al., 2022] combines incomplete bounds (potentially from SDP) with systematic refinement for completeness.
GPU Acceleration:
Recent work on GPU-accelerated verification [Xu et al., 2020] explores parallelizing bound computation. While traditional SDP solvers are CPU-based, specialized implementations could enable faster SDP-based verification through hardware acceleration.
Probabilistic Alternatives:
When deterministic verification becomes intractable, randomized smoothing [Cohen et al., 2019] provides probabilistic certification guarantees. This represents a fundamentally different tradeoff: accepting probabilistic rather than deterministic guarantees for better scalability.
Related Topics:
For understanding the LP methods that SDP extends, see Linear Programming for Verification. For multi-neuron alternatives to SDP, see Multi-Neuron Relaxation and PRIMA. For complete methods that SDP might replace or feed into, see Marabou and Reluplex: Extended Simplex for Verification and Branch-and-Bound Verification. For faster incomplete methods, see Bound Propagation Approaches. For understanding Lipschitz-based approaches that complement SDP, see Lipschitz Bounds and Curvature-Based Verification.
Next Guide
Continue to Lipschitz Bounds and Curvature-Based Verification to learn about Lipschitz constant estimation and spectral normalization for robustness certification.
The CVXPY authors. CVXPY 1.1 documentation. 2020. https://www.cvxpy.org/.
Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning. 2019.
Sven Gowal, Krishnamurthy Dj Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, and Pushmeet Kohli. Scalable verified training for provably robust image classification. In Proceedings of the IEEE International Conference on Computer Vision, 4842–4851. 2019.
Guy Katz, Derek A Huang, Duligur Ibeling, Kyle Julian, Christopher Lazarus, Rachel Lim, Parth Shah, Shantanu Thakoor, Haoze Wu, Aleksandar Zeljić, and others. The marabou framework for verification and analysis of deep neural networks. In International Conference on Computer Aided Verification, 443–452. Springer, 2019.
Mark Niklas Müller, Gleb Makarchuk, Gagandeep Singh, Markus Püschel, and Martin Vechev. PRIMA: precise and general neural network certification via multi-neuron convex relaxations. Proceedings of the ACM on Programming Languages, 6(POPL):1–33, 2022.
Aditi Raghunathan, Jacob Steinhardt, and Percy S Liang. Semidefinite relaxations for certifying robustness to adversarial examples. In Advances in Neural Information Processing Systems, 10877–10887. 2018.
Hadi Salman, Greg Yang, Huan Zhang, Cho-Jui Hsieh, and Pengchuan Zhang. A convex relaxation barrier to tight robustness verification of neural networks. In Advances in Neural Information Processing Systems, 9832–9842. 2019.
Gagandeep Singh, Rupanshu Ganvir, Markus Püschel, and Martin Vechev. Beyond the single neuron convex barrier for neural network certification. In Advances in Neural Information Processing Systems, 15072–15083. 2019.
Vincent Tjeng, Kai Y. Xiao, and Russ Tedrake. Evaluating robustness of neural networks with mixed integer programming. In International Conference on Learning Representations. 2019.
Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, and J Zico Kolter. Beta-crown: efficient bound propagation with per-neuron split constraints for neural network robustness verification. Advances in Neural Information Processing Systems, 2021.
Lily Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for relu networks. In International Conference on Machine Learning, 5276–5285. 2018.
Kaidi Xu, Zhouxing Shi, Huan Zhang, Yihan Wang, Kai-Wei Chang, Minlie Huang, Bhavya Kailkhura, Xue Lin, and Cho-Jui Hsieh. Automatic perturbation analysis for scalable certified robustness and beyond. Advances in Neural Information Processing Systems, 2020.
Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, and J. Zico Kolter. General cutting planes for bound-propagation-based neural network verification. arXiv preprint arXiv:2208.05740, 2022.