Monday, June 24, 2013

Fayola: Okay, here's the story

Prototype Learning

During my master's, I worked on prototype learning. I created the CLIFF algorithm and applied it to the forensic science domain as a forensic interpretation model.

Prototype learning algorithms are designed to eliminate the drawbacks of the K-nearest neighbor algorithm:
  1. High computation cost: each test sample must compute its distance to every training sample.
  2. Large storage requirements: the entire dataset needs to be stored in memory.
  3. Sensitivity to outliers, which can negatively affect the accuracy of the classifier.
  4. Poor handling of data sets with non-separable and/or overlapping classes.
  5. Low tolerance to noise.
The CLIFF algorithm reduced the effects of these drawbacks significantly better than state-of-the-art prototype learning algorithms.
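
To make drawbacks 1 and 2 concrete, here is a minimal brute-force K-nearest neighbor sketch in Python. It is only an illustration, not code from my thesis: the function name `knn_predict`, the Euclidean distance, and the toy data are all assumptions. Notice that every prediction scans the whole training set, which therefore has to stay in memory.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    `train` is a list of (features, label) pairs (hypothetical layout).
    """
    # One distance per training row: O(n * d) work for every single query,
    # and the full training set must be available at prediction time.
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data, made up for this example.
train = [((1.0, 1.0), "a"), ((1.2, 0.9), "a"),
         ((5.0, 5.1), "b"), ((5.2, 4.9), "b")]
print(knn_predict(train, (1.1, 1.0), k=3))
```

A prototype learner shrinks `train` to a few representative rows, which cuts both the scan time and the storage.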

CLIFF takes a dataset and, for each class, ranks each attribute sub-range by its power using BORE. For each row, it then multiplies the ranks of the sub-ranges that the row's values fall into, and selects the most powerful rows of each class.
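
The steps above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the reference implementation: the equal-frequency binning, the exact form of the BORE-style score (`like_best ** 2 / (like_best + like_rest)`), the `keep` fraction, and all function names are choices made for this example.

```python
def equal_frequency_cuts(values, n_bins):
    """Cut points that split `values` into roughly equal-sized sub-ranges."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]

def bin_index(value, cuts):
    """Index of the sub-range that `value` falls into."""
    return sum(value >= c for c in cuts)

def cliff_select(rows, labels, n_bins=3, keep=0.5):
    """Return sorted indices of the most powerful rows of each class."""
    n_attrs = len(rows[0])
    cuts = [equal_frequency_cuts([r[a] for r in rows], n_bins)
            for a in range(n_attrs)]
    binned = [[bin_index(r[a], cuts[a]) for a in range(n_attrs)] for r in rows]

    selected = []
    for cls in sorted(set(labels)):
        # "best" is the current class, "rest" is everything else.
        best = [b for b, l in zip(binned, labels) if l == cls]
        rest = [b for b, l in zip(binned, labels) if l != cls]
        p_best, p_rest = len(best) / len(rows), len(rest) / len(rows)

        def power(attr, rng):
            # Assumed BORE-style score: high when the sub-range is common
            # in "best" and rare in "rest".
            f_best = sum(b[attr] == rng for b in best) / len(best)
            f_rest = sum(b[attr] == rng for b in rest) / len(rest) if rest else 0.0
            like_best, like_rest = f_best * p_best, f_rest * p_rest
            total = like_best + like_rest
            return like_best ** 2 / total if total else 0.0

        scored = []
        for i, (b, l) in enumerate(zip(binned, labels)):
            if l != cls:
                continue
            score = 1.0
            for a in range(n_attrs):
                score *= power(a, b[a])  # multiply the ranks along the row
            scored.append((score, i))
        scored.sort(reverse=True)
        n_keep = max(1, round(keep * len(scored)))
        selected.extend(i for _, i in scored[:n_keep])
    return sorted(selected)

# Toy one-attribute data: row 3 (class "a") and row 4 (class "b") sit in
# the other class's region.
rows = [(1,), (2,), (3,), (9,), (8,), (10,), (11,), (12,)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(cliff_select(rows, labels, n_bins=2, keep=0.75))
```

Rows whose values fall in sub-ranges shared with other classes get low scores, so they are the first to be dropped.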


Now I use CLIFF along with MORPH to privatize defect datasets. CLIFF removes the overlap between classes through instance reduction, while MORPH moves the remaining data to low-density areas and avoids overlap.

y = x ± (x − z) ∗ r
  • x ∈ D, the original instance;
  • z ∈ D, the NUN (nearest unlike neighbor) of x, i.e., its closest instance from a different class;
  • y, the resulting MORPHed instance.
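
Here is a minimal Python sketch of the MORPH formula, given under stated assumptions rather than as the definitive implementation. The post does not say what r is, so the code treats it as a random factor drawn per instance from an assumed range of 0.15 to 0.35. It also assumes Euclidean distance for finding the NUN and one random sign per instance for the ±. The function names are made up for this example.

```python
import math
import random

def nearest_unlike_neighbor(i, rows, labels):
    """Closest row (Euclidean, an assumption) with a different label than rows[i]."""
    candidates = [r for r, l in zip(rows, labels) if l != labels[i]]
    return min(candidates, key=lambda r: math.dist(rows[i], r))

def morph(rows, labels, r_low=0.15, r_high=0.35, rng=None):
    """Apply y = x ± (x − z) ∗ r to every row; r_low and r_high are assumed."""
    if rng is None:
        rng = random.Random()
    morphed = []
    for i, x in enumerate(rows):
        z = nearest_unlike_neighbor(i, rows, labels)
        r = rng.uniform(r_low, r_high)    # assumed: one r per instance
        sign = rng.choice((-1.0, 1.0))    # assumed: one sign per instance
        morphed.append(tuple(xa + sign * (xa - za) * r for xa, za in zip(x, z)))
    return morphed

# Toy data, made up for this example.
rows = [(1.0, 1.0), (2.0, 2.0), (6.0, 6.0)]
labels = ["a", "a", "b"]
print(morph(rows, labels, rng=random.Random(0)))
```

Because r stays well below 1, each instance moves only a fraction of the distance to its NUN.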
