This week, TCS+ hosted a talk by Greg Valiant via a Google+ hangout. Valiant gave a talk on his work with brother, Paul, on an efficient estimator for entropy and support size of an unknown probability distribution requiring only O(n/log n) samples, where n is a bound on the support size of the distribution. This work diverges from the existing literature by demonstrating that the estimate can be obtained with a concrete linear program; an algorithm which outputs a distribution very similar to the unknown distribution with respect to certain statistical properties.
A second problem tackled algorithmically by the Valiants is the lower bound on sample size. Namely, they show that no algorithm on o(n/log n) samples can estimate these properties up to a small additive constant. The lower bound relies on two multivariate central limit theorems; one with respect to the earthmover distance and which lies foundation for the second, under L1 distance. (Earthmover distance is also known as the Wasserstein distance. This work on estimating unknown distributions is my first encounter with earthmover distance, and I must say I hope this metric – not only for the nomenclature – catches on!) The earthmover distance CLT is proved with Stein’s method. Using Stein’s method for central limit theorems has gained recent popularity as a powerful tool for showing convergence to a particular distribution. As Valiant himself explains, it is a way to “deal with strange settings gracefully.” Stein’s method is noteworthy in itself, and I would like to devote a few words on its behalf.
The Stein continuity theorem states that a sequence of real random variables with uniformly bounded second moment, Xn, converges to the standard normal distribution N(0,1) exactly when E[f'(Xn) – Xnf(Xn)] converges to 0. Here, f : R → R is continuously differentiable and both f, f’ are bounded. In its bare form, this theorem can be used to give Berry-Esséen-type bounds, and in fact, as noted in the wonderful blog post of Terry Tao on the central limit theorem, it can be used to prove the full Berry-Esséen theorem. In his 1991 article, Götze uses Stein’s method to derive such a multivariate CLT bound for the class of Borel convex subsets of Rk. His basic strategy for estimating the “distance” in expectation of two probability measures (nicely outlined in a technical report by Rabi Bhattacharya and Susan Holmes) is:
- find an invertible map L which maps functions in the space to the null space of the mean of one of the distributions
- find a perturbation of L
- estimate the distance of distributions by the expected change of L to its perturbation
This algorithmic approach is immediately appealing to me as a computer scientist, and introduces some new ways to think about central limit theorems. Valiant, for example, uses an inspired process to show that a distribution is (nearly) Gaussian. The basic idea behind this approach is to transform unusual or unfamiliar random variables into the multivariate Gaussian by a process of adding noise and rescaling a well-behaved set of test functions. At a high level, Valiant presents this procedure for showing a distribution is close to Gaussian (or any other distribution):
- find a transformation for which the Gaussian is a fixed point
- apply this transformation to your test functions
- look at the change incurred: if no change, the distribution is close to Gaussian/if big change, the distribution is far from Gaussian
We avoid working with the strange distribution by simulating a transformation on carefully chosen test functions.
Valiant’s application of the approach illustrates the versatility of Stein’s method, and its beauty in an algorithmic form. In my opinion, this is a tool that will be of great use to the computer science community because of its elegance and simplicity. And the fact that it prevents fussing around with messy random variables cannot be ignored.
As a closing note, I wanted to comment on the format of the talk. Promptly at 10am PST on Wednesday morning, I and a few other students and faculty met in a conference room where a laptop was hooked up to the projector and lights were dimmed to watch Greg Valiant and a number of other faculty, groups, and perhaps just enthusiasts appropriately outfitted with headsets and coffee mugs, projected onto tiny cells on the screen before us. These cells surrounded the slides accompanying the talk so that the observer had an up-close view of the speaker and attendees. Those who were added to the hangout could interact with verbal questions, and those who missed out on G+ seats were able to contribute with comments. While the scenario was initially jarring, as the talk progressed it became evident that an air of casualness was cast. Something about being seated at your own desk, behind the security of a webcam and a screen, really encourages relaxation and interaction at a comfortable pace. The talk, now posted on YouTube, for instance, spanned a full hour and a half, while the G+ event was scheduled to last only an hour. The talk in no way felt dragged out or verbose, merely paced and relaxed. Further, the recorded stream is now immortalized (for the time being) on YouTube, so the viewer has the freedom to revisit any part of the talk later on. I think this is a trend that will gain some momentum in the future. Following in the footsteps of online open courseware, online venues such as these make important talks available to the general public, and yet maintains the caliber (sanctity) of the talk. I look forward to tuning into some TCS+ streams in the coming weeks!