
Singular Value Decomposition (SVD)

  • Used in deep learning for network compression, among other applications.
\[A = U \Sigma V^{\mathsf{T}}\]

\(A \in \mathbb{R}^{m \times n} \\ U \in \mathbb{R}^{m \times m} \\ \Sigma \in \mathbb{R}^{m \times n} \\ V \in \mathbb{R}^{n \times n}\)

  • \(U, V\) are orthogonal matrices;
  • \(\Sigma\) contains singular values (denoted by \(\sigma_{i}\)) of \(A\) along the diagonal.
\[A = \sum_{i=1}^{r}{\sigma_{i}u_{i}v_{i}^{\mathsf{T}}}\]
  • \(\Sigma\) is the \(m \times n\) diagonal matrix whose off-diagonal entries are all 0;

  • SVD is a matrix factorization technique. Given a training-set matrix \(X\), SVD decomposes it into the product of three matrices.

For example, the NumPy function np.linalg.svd() performs this factorization.
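A minimal sketch of the factorization in NumPy, using a small hypothetical matrix. It checks both the matrix form \(A = U \Sigma V^{\mathsf{T}}\) and the equivalent rank-one sum \(\sum_{i} \sigma_{i} u_{i} v_{i}^{\mathsf{T}}\) from above:

```python
import numpy as np

# Hypothetical 2 x 3 example matrix; any real m x n matrix works.
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])

# full_matrices=False returns the "thin" SVD:
# U is m x r, S holds the r singular values, Vt is r x n.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = U Σ Vᵀ (np.linalg.svd returns Vᵀ directly).
A_rec = U @ np.diag(S) @ Vt

# Equivalently, A is the sum of rank-one terms σᵢ uᵢ vᵢᵀ.
A_sum = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(S)))
```

Both `A_rec` and `A_sum` reconstruct `A` up to floating-point tolerance.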

PyTorch Implementations

  • With PyTorch, SVD is performed with the torch.linalg.svd() function.

    • Nota bene: torch.svd() has been deprecated.
  • Because torch.linalg.svd() returns \(V^{H}\) (the conjugate transpose of \(V\)), a transformation is required to recover \(V\).

U, S, Vh = torch.linalg.svd(A, full_matrices=False)
V = Vh.transpose(-2, -1).conj()
  • SVD applied to the weight matrices of a neural network serves to compress the network; the goal is to reduce parameters and streamline computations.

Some Definitions

Orthogonal matrices -- square matrices \(Q\) satisfying \(Q^{\mathsf{T}}Q = QQ^{\mathsf{T}} = I\); their columns form an orthonormal basis.
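The orthogonality of the SVD factors can be checked directly; a quick sketch on a small hypothetical matrix:

```python
import numpy as np

# U and V from the SVD of any real matrix are orthogonal: QᵀQ = QQᵀ = I.
A = np.random.default_rng(1).standard_normal((4, 4))
U, S, Vt = np.linalg.svd(A)

UtU = U.T @ U   # should be the 4 x 4 identity (up to rounding)
VtV = Vt @ Vt.T
```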

Resources

Steve Brunton on SVD w/ Python: