
Singular Value Decomposition (SVD)

  • Used in deep learning for network compression, among other applications.
\[A = U \Sigma V^{\mathsf{T}}\]

\(A \in \mathbb{R}^{m \times n} \\ U \in \mathbb{R}^{m \times m} \\ \Sigma \in \mathbb{R}^{m \times n} \\ V \in \mathbb{R}^{n \times n}\)

  • \(U, V\) are orthogonal matrices;
  • \(\Sigma\) contains singular values (denoted by \(\sigma_{i}\)) of \(A\) along the diagonal.
\[A = \sum_{i=1}^{r}{\sigma_{i}u_{i}v_{i}^{\mathsf{T}}}\]
  • \(\Sigma\) is the \(m \times n\) diagonal matrix whose off-diagonal entries are all 0;

  • SVD is a matrix factorization technique. Given a training-set matrix \(X\), SVD decomposes it into the product of three matrices.

For example, the NumPy function np.linalg.svd() performs this factorization.
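A minimal sketch of the factorization in NumPy, using a small hypothetical matrix. It checks both the matrix form \(A = U \Sigma V^{\mathsf{T}}\) and the equivalent rank-one sum \(\sum_{i} \sigma_{i} u_{i} v_{i}^{\mathsf{T}}\) from above:

```python
import numpy as np

# Hypothetical 2 x 3 example matrix; any real m x n matrix works.
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])

# full_matrices=False returns the "thin" SVD:
# U is m x r, S holds the r singular values, Vt is r x n.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = U Σ Vᵀ (np.linalg.svd returns Vᵀ directly).
A_rec = U @ np.diag(S) @ Vt

# Equivalently, A is the sum of rank-one terms σᵢ uᵢ vᵢᵀ.
A_sum = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(S)))
```

Both `A_rec` and `A_sum` reconstruct `A` up to floating-point tolerance.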

PyTorch Implementations

  • With PyTorch, SVD is performed with the torch.linalg.svd() function.

    • Nota bene: torch.svd() has been deprecated.
  • Because torch.linalg.svd() returns \(V^{H}\) (the conjugate transpose of \(V\)), a transformation is required to recover \(V\).

U, S, Vh = torch.linalg.svd(A, full_matrices=False)
V = Vh.transpose(-2, -1).conj()
  • SVD applied to the weight matrices of a neural network serves to compress the network; the goal is to reduce parameters and streamline computations.

Some Definitions

Orthogonal matrices -- square matrices \(Q\) satisfying \(Q^{\mathsf{T}}Q = QQ^{\mathsf{T}} = I\); their columns form an orthonormal basis.
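The orthogonality of the SVD factors can be checked directly; a quick sketch on a small hypothetical matrix:

```python
import numpy as np

# U and V from the SVD of any real matrix are orthogonal: QᵀQ = QQᵀ = I.
A = np.random.default_rng(1).standard_normal((4, 4))
U, S, Vt = np.linalg.svd(A)

UtU = U.T @ U   # should be the 4 x 4 identity (up to rounding)
VtV = Vt @ Vt.T
```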

Resources

Steve Brunton on SVD w/ Python: