by Jon Kujawa
Emmanuel Candes of Stanford University was a plenary speaker at the recent International Congress of Mathematicians. By chance I saw Candes speak several years ago at an American Mathematical Society meeting. Truthfully, Candes's work is miles away from my own and I might not have seen him speak if it weren't for the wine reception scheduled immediately after his talk! That was my great fortune: Candes is an excellent speaker and is doing truly remarkable mathematics.
Indeed, Candes and his collaborators are doing something very exciting, with immediate practical applications and endless future ones.
Here is the essence of the problem:
Imagine you have an array of data. That is, imagine you have a spreadsheet in which you have a series of rows of numbers with one number per column. Now imagine that some of those numbers are missing. Perhaps the data has been lost or perhaps you never had the data in the first place. Can you recover the missing data?
Put like that, the answer is certainly no. If the numbers are random, then even if you know all but one of them, it is impossible to recover the one that is missing. Candes and his collaborators have shown, however, that the real world isn't random and in fact we can often reconstruct the missing data.
Let me give an example where you can see that recovering real-world data is sometimes possible. Imagine that you have a huge data set in which each row of numbers is the biometric data for a single person. The first number is their weight, the second their height, the third their age, the fourth their blood cholesterol level, the fifth their shoe size, and so on. If you accidentally delete the weight of the 2,381,773rd person in your data set, you can certainly make an accurate estimate of the missing weight by comparing person number 2,381,773 with others who have similar heights, ages, etc.
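To make the idea of "comparing with similar people" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the five toy records, the choice to use only height and age, the function name `estimate_weight`, and the rule of averaging the two closest people. It is one simple way to fill in a single gap, not a description of Candes's methods.

```python
import math

# Hypothetical toy records: (weight_kg, height_cm, age_years).
# None marks the weight that was accidentally deleted.
people = [
    (62.0, 165.0, 30.0),
    (80.0, 180.0, 45.0),
    (58.0, 160.0, 28.0),
    (85.0, 183.0, 50.0),
    (None, 166.0, 31.0),
]


def estimate_weight(target, others, k=2):
    """Average the weights of the k people closest to target in (height, age)."""

    def distance(person):
        # Straight-line distance using height and age only.
        return math.hypot(person[1] - target[1], person[2] - target[2])

    known = [p for p in others if p[0] is not None]
    nearest = sorted(known, key=distance)[:k]
    return sum(p[0] for p in nearest) / len(nearest)


estimate = estimate_weight(people[-1], people[:-1])
print(estimate)
```

With real data you would want to rescale the columns first, since a difference of one centimeter and a difference of one year are not obviously comparable.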
That was easy, of course, because you have nearly all the data at hand for working out the missing bit. Candes does something much more remarkable, though. He shows that under reasonable assumptions you can usually recover the entire array of data even if nearly half of it is missing! Not only this, but he and his collaborators give us the tools to calculate how much data can be missing and how close we can get to a perfect reconstruction.
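Here is a toy illustration of how structure can make up for missing entries. It assumes the most extreme kind of structure, which the discussion above does not spell out: every row of the array is a multiple of a single row (a "rank one" array) and no entry is zero. Under that assumption any four entries forming a rectangle satisfy a simple product rule, so three known corners determine the fourth. The function names and the 4-by-4 example are hypothetical, and this is a deliberately simple stand-in: Candes and his collaborators use convex optimization, which copes with far messier data.

```python
def infer_entry(grid, i, j):
    """Try to deduce grid[i][j] from three known corners of some rectangle."""
    # In a rank-one grid: grid[i][j] * grid[l][k] == grid[i][k] * grid[l][j].
    for other_row in grid:
        for k in range(len(other_row)):
            a_ik, a_lj, a_lk = grid[i][k], other_row[j], other_row[k]
            if None not in (a_ik, a_lj, a_lk) and a_lk != 0:
                return a_ik * a_lj / a_lk
    return None


def complete_rank_one(grid):
    """Fill None entries of a grid assumed to be exactly rank one."""
    filled = [row[:] for row in grid]
    progress = True
    while progress:
        progress = False
        for i, row in enumerate(filled):
            for j, value in enumerate(row):
                if value is None:
                    guess = infer_entry(filled, i, j)
                    if guess is not None:
                        filled[i][j] = guess
                        progress = True
    return filled


# The hidden truth: entry (i, j) is row_factor[i] * col_factor[j].
truth = [[r * c for c in (1, 2, 4, 5)] for r in (1, 2, 3, 4)]

# Seven of the sixteen entries are missing.
observed = [
    [1, 2, None, None],
    [2, 4, 8, None],
    [3, None, None, 15],
    [None, 8, None, 20],
]

recovered = complete_rank_one(observed)
for row in recovered:
    print(row)
```

Every one of the seven missing numbers comes back, because the nine surviving entries still link every row to every column. If a whole row or column were wiped out, no method could recover it, and the loop above would simply leave those entries as None.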
Does this so-far imaginary problem actually occur in real life? You bet! Once you start looking, you'll find it everywhere. Let me mention a couple of examples.