Thursday, November 20, 2014

New Param Ranges

Table of Rank Sums Across All Datasets (Random Forest)
66  : Default Cur -> Cur
41  : Tuned Prev -> Cur
28  : Tuned Cur -> Cur
0  : Default Prev -> Cur

Table of Rank Sums Across All Datasets (Logistic Regression)
66 : Default Cur -> Cur
41 : Tuned Prev -> Cur
28 : Tuned Cur -> Cur
0 : Default Prev -> Cur

As it turns out, I had a flawed assumption about Bernoulli NB. I thought that the binarize parameter was a percentage threshold, when in fact it is a value threshold.

In other words, I was under the impression that a binarize of 0.3 meant that the bottom 30% of values got translated to 'low' and the top 70% got translated to 'high'. This is not the case. A binarize of 0.3 means that all values at or below 0.3 become 'low' and all values greater than 0.3 become 'high'.
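To make the difference concrete, here is a minimal pure-Python sketch of the two interpretations. The function names and the sample values are made up for illustration, and the "strictly greater than" rule assumes scikit-learn-style binarization.

```python
def binarize_by_value(values, threshold):
    """Value threshold: 1 ('high') if strictly above threshold, else 0 ('low')."""
    return [1 if v > threshold else 0 for v in values]


def binarize_by_fraction(values, fraction):
    """Percentage threshold: the lowest `fraction` of the values become 0, the rest 1."""
    k = round(len(values) * fraction)  # how many values count as 'low'
    order = sorted(range(len(values)), key=lambda i: values[i])
    low = set(order[:k])  # indices of the k smallest values
    return [0 if i in low else 1 for i in range(len(values))]


# Hypothetical feature column with a skewed, unnormalized range.
values = [0.05, 0.1, 0.2, 0.25, 0.5, 2.0, 8.0, 9.0, 10.0, 12.0]

by_value = binarize_by_value(values, 0.3)        # what binarize=0.3 actually does
by_fraction = binarize_by_fraction(values, 0.3)  # what I had assumed it did

print(by_value)
print(by_fraction)
```

On this column the value threshold marks four entries as 'low', while the percentage interpretation marks exactly three. On features with larger raw ranges the gap can be much wider, since almost everything sits above 0.3.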

Below are the results generated from Bernoulli NB before correcting how the binarize parameter is set. For the next set of results, I intend to pre-process the inputs by normalizing them to 0-100 and then providing a binarize param in the range 0-100.
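The planned pre-processing could look roughly like the sketch below. It assumes a simple per-feature min-max rescaling. The helper name `scale_0_100` and the sample column are hypothetical, not the code used for these results.

```python
def scale_0_100(column):
    """Min-max rescale one feature column to the range 0-100."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in column]
    return [100.0 * (x - lo) / (hi - lo) for x in column]


# Hypothetical raw metric values for one feature.
raw = [2, 4, 6, 12]
scaled = scale_0_100(raw)

# After scaling, a binarize value of 30 is comparable across features.
binarized = [1 if x > 30 else 0 for x in scaled]

print(scaled)
print(binarized)
```

Note that this is still a value threshold: 30 means "30% of the way from the minimum to the maximum", not "the bottom 30% of instances".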

Table of Rank Sums Across All Datasets (Bernoulli NB)
41  : Tuned Cur -> Cur
28  : Tuned Prev -> Cur
20  : Default Cur -> Cur
6  : Default Prev -> Cur

Wednesday, October 8, 2014

XOMO PAPER v2.0

```
Techniques              -effort              -months             -defects               -risks    #
Base Line xomoal500 m                    3                    4                    1                   63    #
CT0*_xomoal500 m                    2                    4                    0                   63    #
CT0_xomoal500 m                    1                    4                    0                   23    #
CT1*_xomoal500 m                    2                    3                    3                   24    #
CT1_xomoal500 m                    1                    3                    3                   23    #
NSGA xomoal500 m                    7                   44                    1                    0    #
Base Line xomoal500 q                    1                    0                    6                   11    #
CT0*_xomoal500 q                    2                    0                    6                   15    #
CT0_xomoal500 q                    3                    1                    3                    5    #
CT1*_xomoal500 q                    0                    0                    6                    1    #
CT1_xomoal500 q                    0                    0                    6                    1    #
NSGA xomoal500 q                    8                   22                    1                    2    #
Base Line xomoal500 w                   22                    4                   39                   87    #
CT0*_xomoal500 w                   36                    4                  100                  100    #
CT0_xomoal500 w                  100                    4                   62                   49    #
CT1*_xomoal500 w                   42                    3                   65                   44    #
CT1_xomoal500 w                   40                    3                   65                   42    #
NSGA xomoal500 w                   94                  100                   24                   33    #
100               6809.1                65.27             229326.0                 25.7    #
0               544.09                  0.0              3752.55                 1.69    #
```

```
Techniques              -effort              -months             -defects               -risks    #
Base Line xomofl500 m                   10                    4                    3                   84    #
CT0*_xomofl500 m                   10                    4                    3                   85    #
CT0_xomofl500 m                    0                    3                    0                   17    #
CT1*_xomofl500 m                    3                    3                    4                    3    #
CT1_xomofl500 m                    4                    3                    6                    4    #
NSGA xomofl500 m                    6                   41                    2                    4    #
Base Line xomofl500 q                    8                    0                    6                    5    #
CT0*_xomofl500 q                   10                    0                    8                    6    #
CT0_xomofl500 q                    6                    0                    4                    0    #
CT1*_xomofl500 q                    3                    0                    8                    0    #
CT1_xomofl500 q                    4                    0                    9                    0    #
NSGA xomofl500 q                    8                   21                    3                    7    #
Base Line xomofl500 w                   53                    4                   37                   98    #
CT0*_xomofl500 w                  100                    4                   60                  100    #
CT0_xomofl500 w                   71                    3                   41                   27    #
CT1*_xomofl500 w                   56                    3                   81                   21    #
CT1_xomofl500 w                   64                    3                  100                   22    #
NSGA xomofl500 w                   96                  100                   34                   46    #
100               7557.7                68.97            184550.15                 21.4    #
0               560.08                  0.0              2178.01                 0.93    #
```

```
Techniques              -effort              -months             -defects               -risks    #
Base Line xomogr500 m                    6                    4                    2                   64    #
CT0*_xomogr500 m                    6                    4                    2                   63    #
CT0_xomogr500 m                    0                    2                    0                   11    #
CT1*_xomogr500 m                    6                    3                    8                   18    #
CT1_xomogr500 m                    6                    3                    7                   18    #
NSGA xomogr500 m                   11                   41                    2                    1    #
Base Line xomogr500 q                    4                    0                    7                   11    #
CT0*_xomogr500 q                    5                    0                    8                   12    #
CT0_xomogr500 q                    0                    0                    2                    0    #
CT1*_xomogr500 q                    3                    0                    9                    6    #
CT1_xomogr500 q                    3                    0                    9                    7    #
NSGA xomogr500 q                   13                   22                    3                    5    #
Base Line xomogr500 w                   23                    4                   34                   86    #
CT0*_xomogr500 w                   36                    4                  100                  100    #
CT0_xomogr500 w                   18                    2                   27                   23    #
CT1*_xomogr500 w                   37                    3                   73                   42    #
CT1_xomogr500 w                   39                    3                   80                   43    #
NSGA xomogr500 w                  100                  100                   34                   41    #
100               6319.6                67.75             182979.6                 23.6    #
0               248.43                  0.0              1640.18                  1.3    #
```

```
Techniques              -effort              -months             -defects               -risks    #
Base Line xomoo2500 m                    6                    4                   31                   78    #
CT0*_xomoo2500 m                    5                    4                   16                   42    #
CT0_xomoo2500 m                    2                    2                    5                   14    #
CT1*_xomoo2500 m                    8                    3                    2                   25    #
CT1_xomoo2500 m                    7                    2                    2                   20    #
NSGA xomoo2500 m                   14                   42                    5                    5    #
Base Line xomoo2500 q                    1                    0                   14                   17    #
CT0*_xomoo2500 q                    1                    0                    8                    9    #
CT0_xomoo2500 q                    0                    0                    1                    2    #
CT1*_xomoo2500 q                    2                    0                    0                    1    #
CT1_xomoo2500 q                    1                    0                    0                    0    #
NSGA xomoo2500 q                   15                   21                    8                    9    #
Base Line xomoo2500 w                   12                    4                   65                  100    #
CT0*_xomoo2500 w                   16                    4                   47                   57    #
CT0_xomoo2500 w                    9                    2                   17                   21    #
CT1*_xomoo2500 w                   24                    3                   15                   33    #
CT1_xomoo2500 w                   19                    2                   13                   26    #
NSGA xomoo2500 w                  100                  100                  100                   51    #
100              6163.92                67.25             65267.84                18.53    #
0                111.2                  0.0              1703.33                 0.71    #
```

```
Techniques              -effort              -months             -defects               -risks    #
Base Line xomoos500 m                    4                    4                    0                   88    #
CT0*_xomoos500 m                    3                    4                    0                   88    #
CT0_xomoos500 m                    6                    3                    4                   36    #
CT1*_xomoos500 m                    4                    2                    2                   25    #
CT1_xomoos500 m                    4                    2                    2                   26    #
NSGA xomoos500 m                   15                   43                    8                    8    #
Base Line xomoos500 q                    0                    0                    1                    4    #
CT0*_xomoos500 q                    0                    0                    1                    3    #
CT0_xomoos500 q                    2                    0                    4                    4    #
CT1*_xomoos500 q                    0                    0                    2                    1    #
CT1_xomoos500 q                    0                    0                    2                    0    #
NSGA xomoos500 q                   16                   21                   11                   13    #
Base Line xomoos500 w                    8                    4                    6                   98    #
CT0*_xomoos500 w                   14                    4                   11                  100    #
CT0_xomoos500 w                   26                    3                   29                   47    #
CT1*_xomoos500 w                   13                    2                   16                   32    #
CT1_xomoos500 w                   13                    2                   17                   31    #
NSGA xomoos500 w                  100                  100                  100                   73    #
100              5905.91                65.63             60206.72                 13.9    #
0               119.84                  0.0               197.22                 0.72    #
```

Friday, October 3, 2014

XOMO and POM PAPER

XOMO

Base Line xomofl500 m                   13                   12                   10                    6    #
CT0*_xomofl500 m                   12                   11                    9                    7    #
CT0_xomofl500 m                    0                    1                    0                    0    #
CT1*_xomofl500 m                    3                    7                    1                    4    #
CT1_xomofl500 m                   11                    7                    4                    3    #
NSGA xomofl500 m                    0                   48                    0                    0    #
-------------------------------------------------------------------------------------
Base Line xomofl500 q                   11                    0                   10                   14    #
CT0*_xomofl500 q                   12                    0                   11                   13    #
CT0_xomofl500 q                    0                    0                    0                    1    #
CT1*_xomofl500 q                    3                    0                    1                    8    #
CT1_xomofl500 q                    7                    0                    4                    9    #
NSGA xomofl500 q                    1                   21                    0                   13    #
-------------------------------------------------------------------------------------
Base Line xomofl500 w                   49                   12                   47                   41    #
CT0*_xomofl500 w                  100                   11                  100                   60    #
CT0_xomofl500 w                    8                    1                    8                    6    #
CT1*_xomofl500 w                   21                    7                   15                   33    #
CT1_xomofl500 w                   51                    8                   45                   33    #
NSGA xomofl500 w                    6                  100                    3                  100    #
-------------------------------------------------------------------------------------
Techniques              -effort              -months             -defects               -risks    #
100               6845.6                 33.3              66229.2                  4.0    #
0                 81.3                  0.0               259.29                  0.0    #

Base Line xomogr500 m                   18                   10                    8                   14    #
CT0*_xomogr500 m                   17                   10                    7                   14    #
CT0_xomogr500 m                   17                   10                    8                   14    #
CT1*_xomogr500 m                    8                    5                    4                    2    #
CT1_xomogr500 m                   17                   10                    8                   13    #
NSGA xomogr500 m                    0                   42                    0                    0    #
-------------------------------------------------------------------------------------
Base Line xomogr500 q                   17                    0                   11                   20    #
CT0*_xomogr500 q                   19                    0                   11                   20    #
CT0_xomogr500 q                   19                    0                   11                   19    #
CT1*_xomogr500 q                    9                    0                    5                    5    #
CT1_xomogr500 q                   17                    0                   11                   19    #
NSGA xomogr500 q                    1                   19                    0                    7    #
-------------------------------------------------------------------------------------
Base Line xomogr500 w                   65                   10                   40                   62    #
CT0*_xomogr500 w                  100                   10                  100                  100    #
CT0_xomogr500 w                   99                   10                   91                   93    #
CT1*_xomogr500 w                   58                    5                   37                   42    #
CT1_xomogr500 w                   64                   10                   39                   60    #
NSGA xomogr500 w                   22                  100                    1                   92    #
-------------------------------------------------------------------------------------
Techniques              -effort              -months             -defects               -risks    #
100               3145.8                 37.0              68144.5                  4.3    #
0                61.16                  0.0               174.68                 0.54    #

Base Line xomoos500 m                   28                   10                   21                   46    #
CT0*_xomoos500 m                   27                   10                   19                   33    #
CT0_xomoos500 m                   33                   10                   27                   31    #
CT1*_xomoos500 m                   13                    5                   12                   26    #
CT1_xomoos500 m                   13                    4                   11                   22    #
NSGA xomoos500 m                    0                   39                    0                    0    #
-------------------------------------------------------------------------------------
Base Line xomoos500 q                    9                    0                   10                   33    #
CT0*_xomoos500 q                   10                    1                   11                   42    #
CT0_xomoos500 q                   10                    1                   13                   49    #
CT1*_xomoos500 q                    3                    0                    6                    6    #
CT1_xomoos500 q                    3                    0                    6                    5    #
NSGA xomoos500 q                    1                   10                    0                   12    #
-------------------------------------------------------------------------------------
Base Line xomoos500 w                   52                   11                   45                  100    #
CT0*_xomoos500 w                   93                   10                   65                   99    #
CT0_xomoos500 w                  100                   10                  100                   92    #
CT1*_xomoos500 w                   32                    5                   39                   27    #
CT1_xomoos500 w                   33                    5                   35                   26    #
NSGA xomoos500 w                    9                  100                   10                   93    #
-------------------------------------------------------------------------------------
Techniques              -effort              -months             -defects               -risks    #
100              3479.32                 29.4             26699.79                 5.95    #
0                67.25                  0.1               191.75                 0.67    #

Base Line xomoo2500 m                   57                   16                   31                    7    #
CT0*_xomoo2500 m                   27                    8                   16                    7    #
CT0_xomoo2500 m                    6                    3                    2                    2    #
CT1*_xomoo2500 m                   20                    7                   15                    1    #
CT1_xomoo2500 m                   21                    6                   15                    2    #
NSGA xomoo2500 m                    2                   59                    1                   23    #
-------------------------------------------------------------------------------------
Base Line xomoo2500 q                   16                    2                   11                   13    #
CT0*_xomoo2500 q                    8                    1                   10                    8    #
CT0_xomoo2500 q                    0                    1                    0                    3    #
CT1*_xomoo2500 q                    3                    0                    6                    0    #
CT1_xomoo2500 q                    4                    0                    5                    0    #
NSGA xomoo2500 q                    2                   45                    2                   34    #
-------------------------------------------------------------------------------------
Base Line xomoo2500 w                  100                   18                   66                   16    #
CT0*_xomoo2500 w                   78                    9                  100                    8    #
CT0_xomoo2500 w                   18                    3                   21                    3    #
CT1*_xomoo2500 w                   41                    7                   48                    1    #
CT1_xomoo2500 w                   46                    6                   49                    2    #
NSGA xomoo2500 w                   20                  100                   18                  100    #

Base Line xomoal500 m                   12                   11                    6                   10    #
CT0*_xomoal500 m                   12                   12                    6                    9    #
CT0_xomoal500 m                   11                   11                    6                    9    #
CT1*_xomoal500 m                    6                    5                    4                    0    #
CT1_xomoal500 m                   12                   10                    5                   13    #
NSGA xomoal500 m                    0                   39                    0                    1    #
-------------------------------------------------------------------------------------
Base Line xomoal500 q                   12                    2                    9                   14    #
CT0*_xomoal500 q                   14                    5                   10                   15    #
CT0_xomoal500 q                   14                    5                   10                   17    #
CT1*_xomoal500 q                    7                    0                    5                    4    #
CT1_xomoal500 q                   14                    1                    8                   16    #
NSGA xomoal500 q                    0                   21                    0                   21    #
-------------------------------------------------------------------------------------
Base Line xomoal500 w                   52                   16                   40                   59    #
CT0*_xomoal500 w                   99                   16                  100                  100    #
CT0_xomoal500 w                   83                   16                   90                   91    #
CT1*_xomoal500 w                   52                    8                   62                   49    #
CT1_xomoal500 w                  100                   16                   73                   96    #
NSGA xomoal500 w                    3                  100                    2                   56    #
-------------------------------------------------------------------------------------
Techniques              -effort              -months             -defects               -risks    #
100               6069.8                 27.8              91852.0                  9.7    #
0                47.89                 0.52               164.88                  1.1    #

POM

Base Line pom3A500 m                   25                   88                   14    #
CT0*_pom3A500 m                   22                   84                   14    #
CT0_pom3A500 m                   22                   84                   14    #
CT1*_pom3A500 m                   14                   49                    4    #
CT1_pom3A500 m                   10                   35                    0    #
NSGA pom3A500 m                    1                   85                   16    #
-------------------------------------------------------------------------------------
Base Line pom3A500 q                   16                    8                   30    #
CT0*_pom3A500 q                   21                    5                   42    #
CT0_pom3A500 q                   21                    5                   41    #
CT1*_pom3A500 q                    8                    1                   12    #
CT1_pom3A500 q                    4                    0                    4    #
NSGA pom3A500 q                    0                   15                   42    #
-------------------------------------------------------------------------------------
Base Line pom3A500 w                   60                  100                   71    #
CT0*_pom3A500 w                  100                   96                   86    #
CT0_pom3A500 w                   99                   96                   84    #
CT1*_pom3A500 w                   41                   56                   48    #
CT1_pom3A500 w                   26                   40                   32    #
NSGA pom3A500 w                    4                   96                  100    #
-------------------------------------------------------------------------------------
Techniques                -cost          +completion                -idle    #
100               2694.4                 1.04                  0.8    #
0                 37.6                 0.05                 0.11    #

Base Line pom3B500 m                   24                   87                   22    #
CT0*_pom3B500 m                   21                   83                   20    #
CT0_pom3B500 m                   21                   83                   20    #
CT1*_pom3B500 m                   13                   39                    0    #
CT1_pom3B500 m                   10                   38                    3    #
NSGA pom3B500 m                    3                   85                    3    #
-------------------------------------------------------------------------------------
Base Line pom3B500 q                   19                    9                   28    #
CT0*_pom3B500 q                   21                    5                   23    #
CT0_pom3B500 q                   21                    5                   23    #
CT1*_pom3B500 q                    9                    0                    8    #
CT1_pom3B500 q                    9                    1                    2    #
NSGA pom3B500 q                    0                    5                   23    #
-------------------------------------------------------------------------------------
Base Line pom3B500 w                   72                  100                   72    #
CT0*_pom3B500 w                  100                   96                  100    #
CT0_pom3B500 w                   97                   96                   97    #
CT1*_pom3B500 w                   45                   45                   35    #
CT1_pom3B500 w                   47                   45                   38    #
NSGA pom3B500 w                    9                   96                  100    #
-------------------------------------------------------------------------------------
Techniques                -cost          +completion                -idle    #
100              31762.3                 1.04                  0.8    #
0                494.8                 0.05                 0.15    #

10/30/14 - Rank Sums, NSGAII-style Selection

Random Forest

Table of Rank Sums Across All Data-sets
60  : Default Cur -> Cur
35  : Tuned Prev -> Cur
28  : Tuned Cur -> Cur
0  : Default Prev -> Cur

Overall Rankings

Params (387 permutations)
('bootstrap', ['values', True])
('min_samples_leaf', ['values', 1])
('n_estimators', ['values', 8, 16, 32])
('min_samples_split', ['values', 2])
('criterion', ['values', 'gini'])
('max_features', ['values', 2, 4, 8, 16])
('max_depth', ['values', 2, 4, 6, 8, 10, 12, 14, 16, 18])

Bernoulli Bayes

Table of Rank Sums Across All Data-sets
29  : Tuned Cur -> Cur
28  : Tuned Prev -> Cur
21  : Default Cur -> Cur
9  : Default Prev -> Cur

Overall Rankings

Params (50 permutations)
('binarize', ['values', 0.0, 0.2, 0.4, 0.6, 0.8])
('alpha', ['values', 0.0, 0.2, 0.4, 0.6, 0.8])
('fit_prior', ['values', True, False])

Logistic Regression

Table of Rank Sums Across All Data-sets
31  : Default Cur -> Cur
27  : Tuned Cur -> Cur
24  : Tuned Prev -> Cur
5  : Default Prev -> Cur

Overall Rankings

Params (36 permutations)
('penalty', ['values', 'l1', 'l2'])
('C', ['values', 0.5, 1, 2])
('class_weight', ['values', None])
('intercept_scaling', ['values', 0.5, 1, 2])
('fit_intercept', ['values', True, False])
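As a sanity check on the permutation counts, the grids can be expanded with a Cartesian product. This is a small sketch using only the standard library. The `expand` helper is hypothetical, and it assumes the leading 'values' entry in each list is just a tag rather than a candidate value.

```python
from itertools import product

bernoulli_grid = [
    ('binarize', ['values', 0.0, 0.2, 0.4, 0.6, 0.8]),
    ('alpha', ['values', 0.0, 0.2, 0.4, 0.6, 0.8]),
    ('fit_prior', ['values', True, False]),
]

logistic_grid = [
    ('penalty', ['values', 'l1', 'l2']),
    ('C', ['values', 0.5, 1, 2]),
    ('class_weight', ['values', None]),
    ('intercept_scaling', ['values', 0.5, 1, 2]),
    ('fit_intercept', ['values', True, False]),
]


def expand(grid):
    """Return every parameter combination in the grid as a dict."""
    names = [name for name, _ in grid]
    choices = [spec[1:] for _, spec in grid]  # drop the leading 'values' tag
    return [dict(zip(names, combo)) for combo in product(*choices)]


bernoulli_combos = expand(bernoulli_grid)
logistic_combos = expand(logistic_grid)

print(len(bernoulli_combos), len(logistic_combos))
```

This reproduces the 50 (5 x 5 x 2) and 36 (2 x 3 x 1 x 3 x 2) permutations listed for Bernoulli Bayes and Logistic Regression. Applied to the Random Forest values as listed, the same product gives 3 x 4 x 9 = 108 rather than the stated 387, so that list may not show the full grid.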

New Results Format

Link to the results: (I think landscape is easiest to see)
https://www.dropbox.com/sh/iowdac5ki9hyyw4/AAAzRAdOf0pI571udSY0qf-ya?dl=0

In this format, the top item is a comparison of the prev and current version dataset stats. These are the usual suspects plus "overlapping instances" where the software module name is the same in both versions and "identical instances" where the software module's metrics are unchanged from one version to the next.

The rest of the chart shows the results of parameter tuning on both the previous and current versions. There is a table for each learner which lists all of its explored parameter values and the frequency with which they were selected by the grid search in both the previous and current versions. For example, a parameter value of "False" appearing in 90% of the selected combinations in the previous version and 43% of the current version selections would be represented as "False: (90/43)".
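The "value: (prev/current)" entries can be produced along the following lines. This is an illustrative sketch rather than the script that generated the charts. The function names and the sample selections are invented, and percentages are rounded to whole numbers.

```python
from collections import Counter


def selection_frequency(selected, param):
    """Percent of selected combinations in which each value of `param` appears."""
    counts = Counter(combo[param] for combo in selected)
    return {value: round(100 * n / len(selected)) for value, n in counts.items()}


def frequency_entries(prev_selected, cur_selected, param):
    """Format one 'value: (prev/cur)' entry per explored value of `param`."""
    prev = selection_frequency(prev_selected, param)
    cur = selection_frequency(cur_selected, param)
    values = sorted(set(prev) | set(cur), key=str)
    return [f"{v}: ({prev.get(v, 0)}/{cur.get(v, 0)})" for v in values]


# Hypothetical grid-search selections: 9 of 10 use False in prev, 3 of 7 in current.
prev_selected = [{'fit_prior': False}] * 9 + [{'fit_prior': True}]
cur_selected = [{'fit_prior': False}] * 3 + [{'fit_prior': True}] * 4

entries = frequency_entries(prev_selected, cur_selected, 'fit_prior')
print(entries)
```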

It also shows the pD/pF performance of each learner's non-dominated tunings applied in-set and out-of-set. In this case we have four combinations:

• tune on prev -> apply in-version
• tune on prev -> apply out-of-version (current)
• tune on current -> apply in-version
• tune on current -> apply out-of-version (prev)

In this case, the major effect that we see is that the green sticks with the blue and the red sticks with the purple. This scenario arises when one dataset is more difficult to perform well on than the other. Beyond that, the performance in-version and out-of-version seems pretty comparable. There are occasional exceptions, but no real trend towards in-version or out-of-version doing better.
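For reference, here is a minimal sketch of what "non-dominated tunings" means in pD/pF terms. It assumes the usual definitions (pD = recall, pF = false alarm rate) and a plain pairwise dominance check. The helper names and sample points are illustrative, not the selection code used for these results.

```python
def pd_pf(tp, fn, fp, tn):
    """Probability of detection and probability of false alarm from confusion counts."""
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    return pd, pf


def dominates(a, b):
    """True if point a = (pd, pf) is at least as good as b on both goals and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    better = a[0] > b[0] or a[1] < b[1]
    return no_worse and better


def non_dominated(points):
    """Keep only the tunings that no other tuning dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (pD, pF) results for four tunings of one learner.
tunings = [(0.8, 0.3), (0.6, 0.1), (0.5, 0.2), (0.8, 0.4)]
front = non_dominated(tunings)
print(front)
```

Each of the four tune/apply combinations above would get its own set of points, with the front selected on the tuning version and then re-scored on the version it is applied to.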