Tuesday, August 31, 2010

Results from Splitting Data // Oracle

Results can be found HERE.

The System

1. Randomize the order of the data (repeated 20x).

2. Divide data into two halves.
a. Separate first half into eras of 10 instances.
b. Use the second half to build an oracle compass tree.

3. Using the eras from (2a), build an initial tree with the first several eras.

4. Incrementally insert the instances from eras (2a) into the compass tree formed in (3).
a. After each era, find the most interesting instances (the "naughty list") by finding high-variance pairs of children and removing the center of their union.
b. Classify instances in the naughty list using the oracle from (2b).
c. Insert the naughty-list instances back into the incremental tree from (3) and (4), using their classifications from (4b).
d. To keep the tree growing, re-compass areas of high variance (leaves whose variance is in the top X% among all leaves).

5. Compare with a standard tree.
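
As a rough illustration of the data handling in steps 1, 2 and 4d, here is a minimal Python sketch. It does not build a compass tree. The function names, the dictionary-of-leaves representation, the use of population variance over numeric class values, and the reading of "20x" as twenty independently shuffled repeats are all assumptions made for the example, not details taken from the actual system.

```python
import random
from statistics import pvariance


def split_into_eras(rows, era_size=10, seed=0):
    """Shuffle rows, cut them into two halves, and chunk the first half into eras.

    Returns (eras, second_half). The second half is what an oracle tree
    would be built from; the eras feed the incremental tree.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    first_half, second_half = shuffled[:mid], shuffled[mid:]
    eras = [first_half[i:i + era_size]
            for i in range(0, len(first_half), era_size)]
    return eras, second_half


def high_variance_leaves(leaves, top_fraction=0.2):
    """Pick the leaves whose variance is in the top fraction among all leaves.

    `leaves` maps a (hypothetical) leaf name to the numeric class values
    of the instances stored in that leaf.
    """
    ranked = sorted(leaves.items(),
                    key=lambda item: pvariance(item[1]),
                    reverse=True)
    keep = max(1, round(len(ranked) * top_fraction))
    return [name for name, _ in ranked[:keep]]


if __name__ == "__main__":
    data = list(range(100))  # stand-in for real instances
    for repeat in range(20):  # assumed meaning of "Randomize 20x"
        eras, oracle_rows = split_into_eras(data, era_size=10, seed=repeat)
    toy_leaves = {"a": [1, 1, 1], "b": [0, 10, 20], "c": [5, 6, 7]}
    print(high_variance_leaves(toy_leaves, top_fraction=0.34))
```

Seeding each repeat separately keeps the 20 runs reproducible. The leaves returned by `high_variance_leaves` would be the candidates for re-compassing in (4d), with `top_fraction` playing the role of X%.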
