Rylan Schaeffer

Logo
Resume
Research
Learning
Blog
Teaching
Jokes
Kernel Papers


Nystrom Approximation

The kernel (a.k.a. Gram) matrix grows quadratically with the number of data \(N\). Using it requires space \(O(N^2)\) and time \(O(N^3)\). The Nystrom approximation provides a way to create a rank-\(k\) approximation of the kernel matrix with space \(O(kN)\) and time \(O(k^2N)\).

The Nystrom approximation says that for an \(n \times n\) kernel matrix \(K\), we can approximate it with low error in the following manner: first, randomly shuffle columns and rewrite \(K = \begin{bmatrix}W & K_{12}^T\\K_{12} & K_22 \end{bmatrix}\)
where \(W\) is \(m \times m\).

Properties

\[\lvert\lvert K - \Tilde{K}_k \lvert \lvert_F \leq \lvert\lvert K - K_k \lvert \lvert_F + \epsilon\]

where \(\epsilon = \mathcal{O}(m^{-1/4}) \lvert \lvert K \lvert_F\).