Thursday, November 20, 2014

New Param Ranges

Table of Rank Sums Across All Datasets (Random Forest)
66  : Default Cur -> Cur
41  : Tuned Prev -> Cur
28  : Tuned Cur -> Cur
 0  : Default Prev -> Cur




Table of Rank Sums Across All Datasets (Logistic Regression)
66  : Default Cur -> Cur
41  : Tuned Prev -> Cur
28  : Tuned Cur -> Cur
 0  : Default Prev -> Cur


As it turns out, I had a flawed assumption about Bernoulli NB. I thought that the binarize parameter was a percentage threshold, when in fact it is a value threshold.

In other words, I was under the impression that a binarize of 0.3 meant that the bottom 30% of values got translated to 'low' and the top 70% got translated to 'high'. This is not the case. A binarize of 0.3 means that all values less than or equal to 0.3 become 'low' and all values greater than 0.3 become 'high'.
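To make the difference concrete, here is a minimal sketch of the two interpretations using NumPy. The feature values and both helper function names are hypothetical, chosen only for illustration; the value-threshold helper uses a strict greater-than comparison, which is how I understand scikit-learn's binarization to behave.

```python
import numpy as np

# Hypothetical feature column, chosen only for illustration.
values = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0])


def binarize_by_value(x, threshold):
    """Value threshold: 1 ('high') where x > threshold, else 0 ('low')."""
    return (x > threshold).astype(int)


def binarize_by_percentile(x, pct):
    """Percentage threshold: 1 ('high') where x is above the pct-th percentile."""
    cutoff = np.percentile(x, pct)
    return (x > cutoff).astype(int)


# What binarize=0.3 actually does: compare each value against 0.3.
by_value = binarize_by_value(values, 0.3)

# What I had assumed: the bottom 30% of values become 'low'.
by_percentile = binarize_by_percentile(values, 30)

print("value threshold 0.3  :", by_value)
print("percentile threshold :", by_percentile)
```

With these made-up values, the value threshold marks only the two entries at or below 0.3 as 'low', while the percentile threshold marks the bottom three entries as 'low'. On features whose values mostly sit far above 0.3, a value threshold of 0.3 ends up labelling nearly everything 'high'.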

Below are the results generated from Bernoulli NB without the binarize parameter being corrected. For the next set of results, I intend to pre-process the inputs by normalizing them to the range 0-100 and then providing a binarize parameter in the range 0-100.


Table of Rank Sums Across All Datasets (Bernoulli NB)
41  : Tuned Cur -> Cur
28  : Tuned Prev -> Cur
20  : Default Cur -> Cur
 6  : Default Prev -> Cur
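
The pre-processing planned for the next set of results might look roughly like the sketch below. It assumes scikit-learn's MinMaxScaler and BernoulliNB; the synthetic data, the labels, and the binarize value of 30.0 are all placeholders for illustration, not the datasets or settings used for the tables in this post.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): 200 samples, 4 skewed features.
rng = np.random.RandomState(0)
X = rng.exponential(scale=5.0, size=(200, 4))
y = (X[:, 0] > np.median(X[:, 0])).astype(int)

# Rescale every feature to the range 0-100, then binarize at a value
# threshold on that same 0-100 scale. 30.0 is just one candidate value.
model = make_pipeline(
    MinMaxScaler(feature_range=(0, 100)),
    BernoulliNB(binarize=30.0),
)
model.fit(X, y)

# Inspect the rescaled features that BernoulliNB receives.
scaled = model.named_steps["minmaxscaler"].transform(X)
predictions = model.predict(X)
print("scaled min per feature:", scaled.min(axis=0))
print("scaled max per feature:", scaled.max(axis=0))
```

One thing to keep in mind with this approach: a threshold of 30 on min-max scaled data cuts at 30% of each feature's range, which is not necessarily the same as the 30th percentile of its values.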