## Sunday, February 12, 2012

### Spectral learning

Spectral learners are those that deal with the eigenvectors of a space.

Why spectra? Well, according to Wikipedia:

• A spectrum (plural spectra or spectrums[1]) is a condition that is not limited to a specific set of values but can vary infinitely within a continuum. The word saw its first scientific use within the field of optics to describe the rainbow of colors in visible light when separated using a prism; it has since been applied by analogy to many fields other than optics. Thus, one might talk about the spectrum of political opinion, or the spectrum of activity of a drug, or the autism spectrum. In these uses, values within a spectrum may not be associated with precisely quantifiable numbers or definitions. Such uses imply a broad range of conditions or behaviors grouped together and studied under a single title for ease of discussion.
• In most modern usages of spectrum there is a unifying theme between extremes at either end. Some older usages of the word did not have a unifying theme, but they led to modern ones through a sequence of events set out below. Modern usages in mathematics did evolve from a unifying theme, but this may be difficult to recognize.
• In mathematics, the spectrum of a (finite-dimensional) matrix is the set of its eigenvalues. This notion can be extended to the spectrum of an operator in the infinite-dimensional case.
This idea of the spectrum actually sums up years of my own thinking. SE theory has treated a software project  as "one thing" when actually it is a spectrum of things- and we can best study it if we study the data's spectrum).

So, there is a relatively recent newer class of learners that base their reasoning on the eigenvectors.  If you don't know what is an eigenvector, they are the invariant properties of a matrix. Its direction, if you will. For example, if we take the eigenvectors of a co-variance matrix, we get the major directions of that data

For more, see http://www.slideshare.net/guest9006ab/eigenvalues-in-a-nutshell or http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf (the first is better for visuals, the second is better for explanatory text).

Now I raise this snince Andrew's learner and the IDEA that Raymond is working with spectra since these are based on Fastmap and...
First published in 1995, FastMap algorithm has been significantly elaborated since then. Wang et al.’s MetricMap algo- rithm tries to replace the iterative projections of Figure 2 with a single multi-profject method [43]. See also the theoretical analysis of Platt that shows that FastMap and MetricMap are really instances of a broader class of what is known as Nystrom algorithms [44]; i.e. they are based on the Nystrom approximation of the eigenvectors of a large matrix.
[43] T. L. Wang, X. Wang, D. Shasha, and K. Zhang, “Metricmap: An embedding technique for processing distance-based queries in metric spaces,” IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 35, p. 2005.
[44] J. C. Platt, “Fastmap, metricmap, and landmark mds are all nystro ̈m algorithms,” in In Proceedings of 10th International Workshop on Artificial Intelligence and Statistics, 2005, pp. 261–268.
So we are doing spectral learning  which means that, in theory, our ASE'11 paper could be greatly improved a learner that uses the eigenvectors directly rather than the FASTMAP approximation.

For yet more on spectral learning, see (in stuff/doc):

• 08spectralLearning.pdf
• 03spectralLearning.pdf
• 05spectralDivisvePartitioning.pdf
• 02_part2_principalDirectionDivisvePartitioning.pdf
• 08farthestCentroidsDivisiveClustering.pdf
• 98principalDirectionDivisvePartitioning.pdf

Note the last paper. The clustering method andrew and me "invented" in 2010 actually dates back to 1998. sigh. our only innovation was to (1) approximate the eigenvectors with fastmap and (2) try it on SE data and (3) compare it to results from the global spalce.