Monday, February 28, 2011

More on Business Intelligence and Other Things

Quadrant Neighborhood

Mann-Whitney reports the following regarding the distributions in the chart below.

Rank Dataset
0 albrecht
0 china
0 coc81
1 cocomo_sdr
1 desharnais_1_1
1 finnish
1 kemerer
1 maxwell
1 nasa93
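
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a Mann-Whitney U test in plain Python. It assumes a two-sided test with the normal approximation and no tie or continuity correction; the function name `mann_whitney_u` and the toy numbers in the usage note are hypothetical and are not the data behind the ranks above.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U, two-sided p) for two samples.

    Assumption: normal approximation without tie or continuity
    correction, which is only reasonable for moderately large samples.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, vals in ((0, xs), (1, ys)) for v in vals)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(xs), len(ys)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p
```

Calling `mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])` on these made-up values gives U = 0 and a p-value below 0.05, so the two samples would be treated as different distributions.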

More on the Bussman9

Nomograms have been implemented for Summarization (and arguably Goals & Benchmarking). The nomogram accepts target classes such as "<> 20000" as an example of bad effort. This allows us to show two parts of the space... The nomogram also indicates where "you are here" lies on the scale.



Trends (needs to be splined)



Are there favourite subsets?

Noise injection experiments showed that it is difficult to introduce noise into effort datasets. This raised the question: are certain subsets of the datasets preferable to others? To answer that, we can look at neighbor-ordering matrices. Two forms of neighbor-ordering are:
* Absolute : here.
* Percentage : here.

Tuesday, February 15, 2011

Business Intelligence & Quadrant Differences Graphed

All of the above can be found HERE.

Results from Noise Generation in Effort Sets

Ran kNN and kNN with Cliff instance selection for k = 1,3,5.
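
As a rough illustration of the plain kNN baseline, here is a minimal sketch. It assumes Euclidean distance over numeric features and takes the mean effort of the k nearest neighbors as the estimate; the function name `knn_estimate` and the data layout are hypothetical, and the Cliff instance selection step is not shown.

```python
import math

def knn_estimate(train, query, k):
    """Estimate effort for `query` from its k nearest training rows.

    train: list of (features, effort) pairs, features being a list of numbers.
    Assumption: Euclidean distance and the mean of the neighbors' efforts.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(effort for _, effort in nearest) / len(nearest)
```

Running this for k = 1, 3, 5 on the same training set gives the three kNN variants mentioned above.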

Noise Injection method:

for 10%, 20%, ..., 80% of the training data:
  Replace the effort score with a randomly selected effort score that is not equal to the original value.
  Perform a similar transform on the top 3 attributes as selected by BORE, also ensuring that the replaced value is not equal to the original value.
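
The replacement step above can be sketched as follows. This is only an illustration under stated assumptions: the replacement value is drawn from the other values observed in the same column, rows are plain lists, and the name `inject_noise` is hypothetical. The BORE attribute ranking is not shown; the same function would simply be called once per selected column.

```python
import random

def inject_noise(rows, fraction, column, rng):
    """Return a copy of `rows` with `fraction` of them perturbed in `column`.

    Assumption: each replacement is drawn from values seen in the same
    column and is never equal to the value it replaces.
    """
    noisy = [list(r) for r in rows]          # leave the input untouched
    pool = [r[column] for r in rows]         # candidate replacement values
    n_noisy = int(round(fraction * len(rows)))
    for i in rng.sample(range(len(rows)), n_noisy):
        candidates = [v for v in pool if v != noisy[i][column]]
        if candidates:                       # skip if the column is constant
            noisy[i][column] = rng.choice(candidates)
    return noisy
```

For example, `inject_noise(train, 0.1, effort_col, random.Random(1))` would perturb 10% of the training rows, where `train` and `effort_col` stand for whatever table and column index are in use.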

Results are reported as MDMRE (median magnitude of relative error) from 5x5 cross-validation.
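
A minimal sketch of the evaluation, assuming MDMRE is the median of |actual - predicted| / actual and that "5x5" means 5 repeats of 5-fold cross-validation with a fresh shuffle per repeat. The names `mdmre` and `repeated_kfold` are hypothetical.

```python
import random
from statistics import median

def mdmre(actuals, predictions):
    """Median magnitude of relative error over paired actual/predicted efforts."""
    return median(abs(a - p) / a for a, p in zip(actuals, predictions))

def repeated_kfold(n, k=5, repeats=5, seed=1):
    """Yield (train_indices, test_indices) for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for fold in range(k):
            test = idx[fold::k]              # every k-th index forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test
```

Each of the 25 splits would produce one MDMRE value, and those values are what gets summarized per treatment.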


Tuesday, February 8, 2011

Monday, February 7, 2011

Initial BAMBOO Results

Initial results from running BAMBOO are provided in the above file.

Tuesday, February 1, 2011

Results from Active Learning Projects This Week

Comparison between NB, NB+Cliff, and 1NN+Cliff:

Comparison between NB treatments:

Secret mixture?

Is there a secret percentage mixture when selecting instances from your own company and from cross-company data? Find out here.