Monday, March 8, 2010

Fastmap: Mapping objects into a k-dimensional space

The Fastmap algorithm turns data-set instances into a k-dimensional coordinate which can be primarily used for visualization and clustering purposes. Given a set of n instances from a dataset, Fastmap operates by doing the following. Assume we're using euclidean distance as the dissimilarity function and that we're operating in a 2-D space.

1. Arbitrarily choose an instance from the set and then find the farthest instance from it. These instances will serve as pivots to map the rest of our instances. Call these objects Oa and Ob. Each incoming instance Oi will be mapped using the line that exists between Oa and Ob.


2. As each new instance enters the space, apply geometric rules to extrapolate the coordinates of the new point.

Using the Cosine Law we can find Xi and then use the Pythagorean theorem to determine the coordinates of Oi as it relates to the line formed between Oa and Ob.



3. From there we can easily turn our coordinates into a visualization or use the coordinates to aid in clustering.

To move from a 2-D space to a k-D space, see http://portal.acm.org/citation.cfm?id=223812

1 comment:

  1. With Euclidean distance this works fine... However if I use the cosine distance this doesn't work at all. Is there anything special I need to take care of? Thanks in advance for your help.

    ReplyDelete