Randomized algorithm for trace estimation

Posted March 8, 2025

Trace

The trace of a matrix \(A \in \mathbb{R}^{n \times n}\) is the sum of its diagonal entries:

\[\text{tr}(A) = \sum_{i=1}^n a_{ii} = \sum_{i=1}^n e_i^T A e_i, \]

where \(e_i\) is the \(i\)-th column of the identity matrix. Many scientific computing and machine learning applications require estimating the trace of a square linear operator \(A\) that is available only implicitly, through matrix-vector products. For example, the map \(u \mapsto Au\) might represent the solution of a linear differential equation with initial condition \(u\). Since each product can be expensive, we want to estimate the trace using far fewer than \(n\) applications of \(A\).
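As a baseline, the identity above can be evaluated exactly with \(n\) matrix-vector products, one per coordinate vector. A minimal Python sketch, where the `matvec` closure stands in for any implicit operator:

```python
import numpy as np

def trace_exact(matvec, n):
    """Exact trace via n matrix-vector products: tr(A) = sum_i e_i^T A e_i."""
    total = 0.0
    for i in range(n):
        e_i = np.zeros(n)
        e_i[i] = 1.0
        total += matvec(e_i)[i]  # e_i^T A e_i picks out a_ii
    return total

A = np.arange(9.0).reshape(3, 3)
print(trace_exact(lambda u: A @ u, 3))  # matches np.trace(A) = 12.0
```

This is exactly the cost we want to avoid when \(n\) is large, which motivates the randomized approach below.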

Trace Estimation by Random Sampling

Randomized trace estimation is based on the insight that it is straightforward to construct a random variable whose expectation equals the trace of the input matrix.

Lemma. Let \(\omega \in \mathbb{C}^n\) be a random test vector that is isotropic, i.e., \(\mathbb{E}[\omega\omega^*] = I\). Then,

\[\mathbb{E}[X] = \text{tr}(A) \quad \text{where } X = \omega^*(A\omega). \]

Proof.
Since \(\mathbb{E}[\omega\omega^*] = I\), the entries satisfy \(\mathbb{E}[\omega_i \overline{\omega}_j] = \delta_{ij}\), where \(\delta_{ij}\) is the Kronecker delta; taking conjugates, \(\mathbb{E}[\overline{\omega}_i \omega_j] = \delta_{ij}\) as well. Expanding the quadratic form \(X = \omega^* A \omega\) and applying the linearity of expectation,

\[\mathbb{E}[X] = \mathbb{E}\left[\sum_{i,j} \overline{\omega}_i\, a_{ij}\, \omega_j\right] = \sum_{i,j} a_{ij}\, \mathbb{E}[\overline{\omega}_i \omega_j] = \sum_{i=1}^n a_{ii} = \text{tr}(A). \quad \Box \]

In other words, the random variable \(X\) is an unbiased estimator of the trace. This idea dates back to Girard, who suggested drawing \(\omega\) from a uniform distribution over the \(\ell_2\) hypersphere with radius \(\sqrt{n}\). Later, Hutchinson proposed using Rademacher random vectors, whose entries are independent and equal to \(\pm 1\) with equal probability.
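A single Hutchinson sample can be drawn as follows. This is a minimal sketch assuming a real matrix accessed only through a matvec; Rademacher vectors satisfy \(\mathbb{E}[\omega\omega^T] = I\), so the lemma applies.

```python
import numpy as np

rng = np.random.default_rng(0)

def hutchinson_sample(matvec, n):
    """One draw of X = w^T (A w) with a Rademacher test vector w.
    Since E[w w^T] = I, X is an unbiased estimate of tr(A)."""
    w = rng.choice([-1.0, 1.0], size=n)
    return w @ matvec(w)

# For a diagonal matrix the off-diagonal noise terms vanish (w_i^2 = 1),
# so a single Rademacher sample recovers the trace exactly.
A = np.diag([1.0, 2.0, 3.0])
x = hutchinson_sample(lambda u: A @ u, 3)  # equals tr(A) = 6.0
```

For a general matrix, a single sample carries noise from the off-diagonal entries, which is why the averaging of the next section is needed.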

Monte Carlo Sampling

A single sample of \(X\) is rarely enough because its variance, \(\mathbb{V}[X]\), tends to be large. The most common mechanism for reducing the variance is to average \(k\) independent copies of \(X\), a technique known as the Monte Carlo method. For \(k \in \mathbb{N}\), define

\[\overline{X}_k = \frac{1}{k} \sum_{i=1}^k X_i, \]

where the \(X_i\) are independent and identically distributed (i.i.d.) copies of \(X\). Averaging preserves unbiasedness while shrinking the variance by a factor of \(k\):

\[\mathbb{E}[\overline{X}_k] = \text{tr}(A), \quad \mathbb{V}[\overline{X}_k] = \frac{\mathbb{V}[X]}{k}. \]
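The averaged estimator is a small extension of the single-sample version; this sketch draws \(k\) i.i.d. Rademacher samples and averages them:

```python
import numpy as np

rng = np.random.default_rng(1)

def trace_mc(matvec, n, k):
    """Monte Carlo trace estimate: mean of k i.i.d. samples w^T (A w).
    Unbiased for tr(A), with variance V[X] / k."""
    samples = []
    for _ in range(k):
        w = rng.choice([-1.0, 1.0], size=n)
        samples.append(w @ matvec(w))
    return float(np.mean(samples))

n = 100
A = rng.standard_normal((n, n))
est = trace_mc(lambda u: A @ u, n, k=1000)
# est concentrates around np.trace(A) as k grows
```

Since the standard deviation decays only like \(1/\sqrt{k}\), each extra digit of accuracy costs a factor of 100 in matrix-vector products, which motivates the variance-reduced methods cited below.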

More Algorithms

The plain Monte Carlo estimator converges slowly. Variance-reduced schemes such as Hutch++ and XTrace combine random sampling with low-rank approximation and can be far more accurate for the same number of matrix-vector products; see the references below.

References

  1. Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michał Dereziński, Miles E. Lopes, Tianyu Liang, Hengrui Luo, and Jack J. Dongarra. Randomized numerical linear algebra: A perspective on the field with an eye to software. arXiv preprint arXiv:2302.11474, April 2023. 202 pp.
  2. Per-Gunnar Martinsson and Joel A. Tropp. Randomized numerical linear algebra: Foundations and algorithms. Acta Numerica, 29:403–572, 2020.
  3. Didier Girard. Un algorithme simple et rapide pour la validation croisée généralisée sur des problèmes de grande taille. Research Report 669, Institut d'Informatique et de Mathématiques Appliquées de Grenoble, Grenoble, France, March 1987. 32 pp.
  4. Arvind K. Saibaba, Alen Alexanderian, and Ilse C. F. Ipsen. Randomized matrix-free trace and log-determinant estimators. Numer. Math., 137(2):353–395, 2017.
  5. Raphael A. Meyer, Cameron Musco, Christopher Musco, and David P. Woodruff. Hutch++: Optimal stochastic trace estimation. In Symposium on Simplicity in Algorithms (SOSA), pages 142–155. Society for Industrial and Applied Mathematics, 2021.
  6. Ethan N. Epperly, Joel A. Tropp, and Robert J. Webber. XTrace: Making the most of every sample in stochastic trace estimation. SIAM J. Matrix Anal. Appl., 45(1):1–23, 2024.