Higgledy-piggledy notes on the theme of exploiting sparsity to recover signals from few non-local measurements, given that we know they are nearly *sparse*, in a sense that will be made clear soon.

See also matrix factorisations, restricted isometry properties, Riesz bases…

## Basic Compressed Sensing

I'll follow the intro of (E. J. Candès et al. 2011), which tries to unify many variants.

We attempt to recover a signal \(x\in \mathbb{R}^d\) from \(m\ll d\) measurements \(y_k\) of the form

\[y_k =\langle a_k, x\rangle + z_k,\, 1\leq k \leq m,\]

or, as a matrix equation,

\[ y = Ax + z \]

where \(A\) is the \(m \times d\) matrix of stacked measurement vectors \(a_k\), and the \(z_k\) terms denote i.i.d. measurement noise.

Now, if \(x\) is a sparse vector and \(A\) satisfies a restricted isometry property (defined below), then we can construct an estimate \(\hat{x}\) with small error by minimising

\[ \hat{x}=\argmin_{\dot{x}} \|\dot{x}\|_1 \text{ subject to } \|A\dot{x}-y\|_2 \leq \varepsilon, \]

where \(\varepsilon\) bounds the size of the noise, \(\varepsilon \geq \|z\|_2.\)
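As a toy illustration (my own, not from the references), here is a minimal numpy sketch that attacks the closely related unconstrained LASSO form \(\min_{\dot{x}} \tfrac{1}{2}\|A\dot{x}-y\|_2^2 + \lambda\|\dot{x}\|_1\) by iterative soft-thresholding (ISTA), rather than the constrained problem above; the function and parameter names are mine:

```python
import numpy as np

def ista(A, y, lam=0.01, n_iter=2000):
    """Minimise 0.5*||A x - y||_2^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        v = x - A.T @ (A @ x - y) / L             # gradient step on the quadratic part
        x = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)  # soft-threshold
    return x

# Recover a 5-sparse signal in R^200 from 60 noisy random measurements.
rng = np.random.default_rng(0)
d, m, s = 200, 60, 5
A = rng.standard_normal((m, d)) / np.sqrt(m)
x_true = np.zeros(d)
x_true[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
y = A @ x_true + 0.01 * rng.standard_normal(m)
x_hat = ista(A, y)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

The penalty weight \(\lambda\) plays roughly the role of \(\varepsilon\): larger values tolerate more data misfit in exchange for sparser \(\hat{x}\).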

In the lecture notes on restricted isometry properties, Candès and Tao talk not about vectors \(x\in \mathbb{R}^d\) but about functions \(f:G \to \mathbb{C}\) on Abelian groups like \(G=\mathbb{Z}/d\mathbb{Z},\) which is convenient for some phrasing: saying my signal is \(s\)-sparse then means that its support \(\operatorname{supp} \tilde{f}=S\subset G\) satisfies \(|S|=s\).

In the finite-dimensional vector framing, we can talk about the best \(s\)-sparse approximation \(x_s\) to a non-sparse vector \(x\):

\[ x_s = \argmin_{\|\dot{x}\|_0\leq s} \|x-\dot{x}\|_2 \]

where all the coefficients apart from the \(s\) largest in magnitude are zeroed.
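This truncation is trivial to compute; a tiny numpy sketch (helper name mine):

```python
import numpy as np

def best_s_term(x, s):
    """Best s-sparse approximation in the l2 norm: keep the s
    largest-magnitude entries and zero the rest."""
    x_s = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]   # indices of the s largest coefficients
    x_s[keep] = x[keep]
    return x_s

print(best_s_term(np.array([0.1, -3.0, 0.02, 2.0, -0.5]), 2))  # keeps -3.0 and 2.0
```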

The basic results show that, with high probability, an attractive convex problem has the same solution as the nasty combinatorial \(\ell_0\) problem it relaxes. There are also greedy optimisation versions, formulated as above but no longer solved by convex optimisation; instead, we talk about Orthogonal Matching Pursuit, Iterative Thresholding and some other stuff whose details I do not yet know, which I think pops up in wavelets and sparse coding.
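A sketch of one such greedy method, Orthogonal Matching Pursuit, in numpy (my own toy implementation, not taken from any of the references):

```python
import numpy as np

def omp(A, y, s):
    """Orthogonal Matching Pursuit: greedily add the column most correlated
    with the residual, then re-fit by least squares on the chosen support."""
    residual, support = y.copy(), []
    for _ in range(s):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef     # orthogonal to chosen columns
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# Noiseless recovery of a 4-sparse signal from 40 Gaussian measurements.
rng = np.random.default_rng(1)
d, m, s = 100, 40, 4
A = rng.standard_normal((m, d)) / np.sqrt(m)
x = np.zeros(d)
x[rng.choice(d, s, replace=False)] = rng.choice([-1.0, 1.0], s) * rng.uniform(0.5, 2.0, s)
x_hat = omp(A, A @ x, s)
```

Because the residual is re-orthogonalised against the chosen columns at every step, the same column is never selected twice.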

For all of these, the results tend to be something like:

with data \(y\), the difference between my estimate \(\hat{x}\) and the oracle estimate \(\hat{x}_\text{oracle}\) is bounded by something-or-other, where the oracle estimate is the one where you know ahead of time the support \(S=\operatorname{supp}(x)\).

Candès gives an example result:

\[ \|\hat{x}-x\|_2 \leq C_0\frac{\|x-x_s\|_1}{\sqrt{s}} + C_1\varepsilon \]

conditional upon

\[ \delta_{2s}(A) < \sqrt{2} -1 \]

where \(\delta_s(\cdot)\) gives the restricted isometry constant of a matrix, defined as the smallest constant such that \((1-\delta_s(A))\|x\|_2^2\leq \|Ax\|_2^2\leq (1+\delta_s(A))\|x\|_2^2\) for all \(s\)-sparse \(x\). That is, the measurement matrix does not change the norm of sparse signals “much”, and in particular, does not null them when \(\delta_s(A) < 1.\)

This is apparently not the strongest bound out there, but in any bound of that form those constants look frustrating.

Measuring the restricted isometry constant of a given measurement matrix is presumably hard, although I haven't tried yet. But generating random matrices that have a given RIC with high probability is easy; that's a neat trick in this area.
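A rough Monte Carlo sketch of both points (my own illustration): sampling random supports only lower-bounds \(\delta_s(A)\), since the true constant is a maximum over all \(\binom{d}{s}\) supports, but drawing \(A\) with i.i.d. Gaussian entries scaled by \(1/\sqrt{m}\) is the standard easy construction:

```python
import numpy as np

def ric_lower_bound(A, s, n_trials=2000, rng=None):
    """Monte Carlo lower bound on the restricted isometry constant delta_s(A):
    over random supports S with |S| = s, take the worst deviation of the
    squared singular values of the submatrix A[:, S] from 1."""
    rng = rng or np.random.default_rng()
    d = A.shape[1]
    delta = 0.0
    for _ in range(n_trials):
        S = rng.choice(d, size=s, replace=False)
        sv = np.linalg.svd(A[:, S], compute_uv=False)
        delta = max(delta, float(np.max(np.abs(sv**2 - 1.0))))
    return delta

rng = np.random.default_rng(2)
m, d = 80, 200
A = rng.standard_normal((m, d)) / np.sqrt(m)   # columns have unit norm in expectation
delta = ric_lower_bound(A, s=4, rng=rng)
print(delta)
```

An exhaustive version would iterate over `itertools.combinations(range(d), s)` instead of random supports, which is exactly why computing \(\delta_s\) exactly is combinatorial.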

## Redundant compressed sensing

For now, see frame theory.

## Introductory texts

Aside: see the rather good practical Python notebook in Numerical Tours.

Terry Tao's exposition is great as usual. See, e.g.

> […] we can at least give an informal geometric argument as to why \(\ell^1\) minimisation is more likely to recover a sparse solution than \(\ell^2\) minimisation. The set of all \(f\) whose Fourier coefficients match the observed data \(c_\xi\) forms an affine subspace of the space of all functions. The \(\ell^2\) minimiser can then be viewed geometrically by taking \(\ell^2\) balls (i.e. Euclidean balls) centred at the origin, and gradually increasing the radius of the ball until the first point of contact with the affine subspace. In general, there is no reason to expect this point of contact to be sparse (i.e. to lie on a high-codimension coordinate subspace). If however we replace \(\ell^2\) with \(\ell^1\), then the Euclidean balls are replaced by octahedra, which are much “pointier” (especially in high dimensions) and whose corners lie on coordinate subspaces. So the point of first contact is now much more likely to be sparse. The idea of using \(\ell^1\) as a “convex relaxation” of \(\ell^0\) is a powerful one in applied mathematics; see for instance (J. A. Tropp 2006) on the topic.

Hegde, Baraniuk, Davenport and Duarte have an open-source textbook.

Gabriel Peyré's Compressed Sensing of Images

Igor Carron's Sunday Morning Insight: A Quick Panorama of Sensing from Direct Imaging to Machine Learning

## β¦Using random projections

Classic. Notes under low dimensional projections

## β¦Using deterministic projections

Surely this is close to quasi-Monte Carlo?

Dustin G. Mixon, Achieving the Welch bound with difference sets:

> I blogged about constructing harmonic frames using difference sets. We proved that such harmonic frames are equiangular tight frames, thereby having minimal coherence between columns. I concluded the entry by conjecturing that incoherent harmonic frames are as good for compressed sensing as harmonic frames whose rows were randomly drawn from the discrete Fourier transform (DFT) matrix.

Recent work of Yves Meyer might be relevant:

A variant on the compressed sensing of Emmanuel Candès, Basarab Matei and Yves Meyer

Simple quasicrystals are sets of stable sampling, Basarab Matei and Yves Meyer

These papers are interesting because their approach to compressed sensing is very different. Specifically, their sparse vectors are actually functions of compact support with sufficiently small Lebesgue measure. As such, concepts like conditioning are replaced with that of stable sampling, and the results must be interpreted in the context of functional analysis. The papers demonstrate that sampling frequencies according to a (deterministic) simple quasicrystal will uniquely determine sufficiently sparse functions, and furthermore, the sparsest function in the preimage can be recovered by L1-minimization provided it's nonnegative.

## Bayesian

Sparse Bayes can be tricky. See, perhaps, Bayesian Compressive Sensing.

## Phase transitions

How well can you recover a matrix from a given number of measurements? In the obvious error metrics, there is a sudden jump in recovery quality as the number of measurements increases past a threshold depending on the rank. This looks a lot like a physical phase transition, which is a known phenomenon in ML. Hmm.

## Weird things to be classified

CSGM (Bora et al. 2017), compressed sensing using generative models, replaces the sparsity assumption with the requirement that the signal lie near the range of a trained generative model; that is, it seeks a reconstruction \(\hat{x}=G(\hat{z})\) by optimising over the latent variables \(z\) of the generator \(G\).

## References

*Journal of Computer and System Sciences*, Special Issue on PODS 2001, 66 (4): 671–87.

*arXiv:1506.00898 [Cs, Math, Stat]*, June.

*Optimization With Sparsity-Inducing Penalties*. Foundations and Trends® in Machine Learning 1.0. Now Publishers Inc.

*IEEE Signal Processing Magazine* 24 (4).

*IEEE Signal Processing Magazine* 25 (2): 83–91.

*IEEE Transactions on Information Theory* 56 (4): 1982–2001.

*Constructive Approximation* 28 (3): 253–63.

*IEEE Transactions on Signal Processing* 58 (1): 269–80.

*IEEE Transactions on Information Theory* 57 (2): 764–85.

*Communications of the ACM* 59 (8): 72–80.

*SIAM Journal on Optimization* 23 (3): 1718–41.

*Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, 245–50. KDD ’01. New York, NY, USA: ACM.

*Proceedings of the National Academy of Sciences* 110 (4): 1146–47.

*International Conference on Machine Learning*, 537–46.

*arXiv:1612.01183 [Cs, Math]*, December.

*3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008*, 762–67.

*IEEE Transactions on Information Theory* 54 (11): 4813–20.

*arXiv:0805.0149 [Cs]*, May.

*The Annals of Statistics* 43 (1): 102–38.

*ICM 2014 Proceedings, to Appear*.

*arXiv:1104.5246 [Cs, Math, Stat]*, April.

*Applied and Computational Harmonic Analysis* 31 (1): 59–73.

*Foundations of Computational Mathematics* 9 (6): 717–72.

*IEEE Transactions on Information Theory* 52 (2): 489–509.

*Communications on Pure and Applied Mathematics* 59 (8): 1207–23.

*IEEE Transactions on Information Theory* 52 (12): 5406–25.

*IEEE Signal Processing Magazine* 25 (2): 21–30.

*IEEE Transactions on Information Theory* 51 (12): 4203–15.

*Digital Signal Processing* 23 (3): 751–70.

*Compressed Sensing & Sparse Filtering*, edited by Avishy Y. Carmi, Lyudmila Mihaylova, and Simon J. Godsill, 281–324. Signals and Communication Technology. Springer Berlin Heidelberg.

*Advances in Neural Information Processing Systems*, 257–64. Curran Associates, Inc.

*IEEE International Conference on Acoustics, Speech and Signal Processing, 2008. ICASSP 2008*, 3869–72.

*Mathematical Programming* 134 (1): 71–99.

*Computational Optimization and Applications* 59 (1–2): 47–61.

*arXiv:0809.0660 [Stat]*, September.

*Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence*, 143–51. UAI ’00. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

*Random Structures & Algorithms* 22 (1): 60–65.

*Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence*, 114–21. UAI ’06. Arlington, Virginia, USA: AUAI Press.

*Communications on Pure and Applied Mathematics* 57 (11): 1413–57.

*Communications on Pure and Applied Mathematics* 63 (1): 1–38.

*Acta Numerica* 7 (January): 51–150.

*The Annals of Statistics* 12 (3): 793–815.

*IEEE Transactions on Information Theory* 52 (1): 6–18.

*IEEE Transactions on Information Theory* 52 (4): 1289–1306.

*Proceedings of the National Academy of Sciences* 100 (5): 2197–2202.

*2010 IEEE Information Theory Workshop (ITW)*, 1–5.

*Proceedings of the National Academy of Sciences* 106 (45): 18914–19.

*2010 IEEE Information Theory Workshop (ITW)*, 1–5.

*Applied and Computational Harmonic Analysis* 35 (1): 111–29.

*New Journal of Physics* 14 (9): 095022.

*arXiv:1108.0373 [Math, Stat]*, August.

*Advances in Neural Information Processing Systems*, 473–80.

*IEEE Transactions on Signal Processing* 64 (13): 3444–57.

*Applied Optics* 54 (8): C23–44.

*The Annals of Statistics* 21 (2): 867–89.

*Mathematical Programming* 152 (1–2): 75–112.

*Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing*, 563–78. STOC ’12. New York, NY, USA: ACM.

*Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms*, 1183–94. Proceedings. Kyoto, Japan: Society for Industrial and Applied Mathematics.

*IEEE Transactions on Information Theory* 58 (12): 7204–14.

*IEEE Transactions on Signal Processing* 58 (3): 1095–1109.

*Journal of Machine Learning Research* 5 (9): 1457–69.

*Journal of Machine Learning Research*, 427–35.

*arXiv:1311.3257 [Stat]*, November.

*Journal of Machine Learning Research*, 448–56.

*Signal Processing* 125 (August): 274–89.

*arXiv:1607.04331 [Cs, q-Bio, Stat]*, July.

*Advances in Neural Information Processing Systems*, 33:15.

*Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, 287–96. KDD ’06. New York, NY, USA: ACM.

*Inverse Problems and Imaging* 3 (3): 487–503.

*Complex Variables and Elliptic Equations* 55 (8–10): 947–64.

*IEEE Journal of Selected Topics in Signal Processing* 4 (2): 375–91.

*Compressed Sensing: Theory and Applications*, 394–438.

*ICASSP*.

*arXiv:0803.2392 [Cs, Math]*, March.

*5th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2008. SAM 2008*, 257–60.

*Network (Bristol, England)* 7 (2): 333–39.

*Current Opinion in Neurobiology* 14 (4): 481–87.

*arXiv:1501.00320 [Cs, Math]*, January.

*IEEE Transactions on Signal Processing* 60 (5): 2286–2303.

*IEEE Signal Processing Letters* 25 (12): 1835–39.

*2011 IEEE International Symposium on Information Theory Proceedings*, 2168–72. St. Petersburg, Russia: IEEE.

*arXiv:1501.02923 [Cs, Stat]*, January.

*IEEE Transactions on Signal Processing* 63 (9): 2389–2404.

*Compressed Sensing & Sparse Filtering*, edited by Avishy Y. Carmi, Lyudmila Mihaylova, and Simon J. Godsill, 77–93. Signals and Communication Technology. Springer Berlin Heidelberg.

*Sparse Modeling: Theory, Algorithms, and Applications*. Chapman & Hall/CRC Machine Learning & Pattern Recognition Series. Boca Raton, FL: CRC Press, Taylor & Francis Group.

*IEEE Signal Processing Magazine* 25 (2): 14–20.

*The Annals of Statistics* 35 (3): 1012–30.

*IEEE Transactions on Signal Processing* 61 (3): 661–77.

*In Proc. Allerton Conf. On Comm., Control, and Computing*.

*2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)*, 815–22.

*Journal of Machine Learning Research* 12 (July): 1865–92.

*arXiv:1512.04011 [Cs]*, December.

*arXiv:1509.00130 [Cs, Math, Stat]*, August.

*Proceedings of the IEEE* 98 (6): 948–58.

*IEEE Transactions on Information Theory* 52 (3): 1030–51.

*AeroSense ’99*, 3723:28–31. International Society for Optics and Photonics.

*IEEE Transactions on Information Theory* 58 (8): 4969–92.

*Microsoft Research*, July.

*IEEE Signal Processing Letters* 20 (4): 403–6.

*International Conference on Machine Learning*, 6850–60.

*2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, 5409–12.

*Journal of Machine Learning Research*, 494–503.

*Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, 615–23. KDD ’17. New York, NY, USA: ACM.
