- Wilcoxon rank-sum test: compares 2 arrays of MREs
- Kruskal-Wallis test: compares n arrays of MREs
- Step 1: Sort the solo-methods by their MdMREs.
- Step 2: Start with the j'th group (initially j=1) and place the i'th solo-method into this group.
- Step 3: Compare the MREs of the i'th and (i+1)'th solo-methods using the Wilcoxon test.
- Step 4: Compare the MREs of all solo-methods in the j'th group with the MREs of the (i+1)'th solo-method using the Kruskal-Wallis test.
- Step 5: If Step 3 and Step 4 both find the (i+1)'th method statistically the same, place it into the j'th group.
- Step 6: Else (if either Step 3 or Step 4 finds the (i+1)'th method statistically different), increment j by 1, i.e. form a new group.
- Step 7: Set i = i+1.
- Step 8: Go to Step 2.
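The grouping procedure above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the paper. The function name `group_methods`, the ascending sort (assuming a lower MdMRE is better), and the significance level `alpha=0.05` are assumptions, since the notes do not state them. The two tests are passed in as functions that return a p-value, so the sketch does not depend on any particular statistics library.

```python
from statistics import median


def group_methods(mres, pairwise_p, groupwise_p, alpha=0.05):
    """Group solo-methods whose MREs are statistically indistinguishable.

    mres        : dict mapping method name -> list of MREs (hypothetical layout)
    pairwise_p  : function(a, b) -> p-value, e.g. a Wilcoxon rank-sum test
    groupwise_p : function(*samples) -> p-value, e.g. a Kruskal-Wallis test
    alpha       : significance level (assumed; not given in the notes)
    """
    # Step 1: sort methods by MdMRE (ascending is an assumption).
    ordered = sorted(mres, key=lambda name: median(mres[name]))
    if not ordered:
        return []

    # Step 2: the first method opens the first group.
    groups = [[ordered[0]]]

    # Steps 3-8: walk through consecutive pairs (i, i+1).
    for prev, cand in zip(ordered, ordered[1:]):
        current = groups[-1]
        # Step 3: i'th vs (i+1)'th method.
        p_pair = pairwise_p(mres[prev], mres[cand])
        # Step 4: every method in the current group plus the candidate.
        p_group = groupwise_p(*[mres[m] for m in current], mres[cand])
        if p_pair >= alpha and p_group >= alpha:
            # Step 5: both tests see no difference, so join the group.
            current.append(cand)
        else:
            # Step 6: at least one test sees a difference, so open a new group.
            groups.append([cand])
    return groups


# With SciPy installed, the two tests could be supplied like this:
#   from scipy.stats import ranksums, kruskal
#   groups = group_methods(
#       mres,
#       pairwise_p=lambda a, b: ranksums(a, b).pvalue,
#       groupwise_p=lambda *s: kruskal(*s).pvalue,
#   )
```

A dataset where no test ever reports a difference yields a single group, which corresponds to the "flat" datasets mentioned in the summary below.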
The results of the above procedure are incorporated into the jiggle paper on pp. 16-17.
Concise summary of results:
- Some datasets are completely flat, with no second group forming at all: telecom and kemerer.
- Around half of the datasets show a good distribution of the solo-methods across groups: cocomo81e, nasa93_center_5, desharnais, nasa_93_center_2, sdr, finnish, cocomo81, miyazaki94, nasa93
- The occurrence of the top13 solo-methods within these groups still supports our point, i.e. they are mostly in the top group. However, for certain datasets (desharnaisL3, cocomo81o) the generality does not hold, and the top13 solo-methods occur in lower-performance groups.
- The above result shows that the generality we found from the aggregate of 7 error measures and 20 datasets via win-tie-loss values still holds when we look at the specifics (using only the MREs, with a different procedure based on multiple statistical tests). However, datasets like desharnaisL3 and cocomo81o mean we still cannot rule out exceptions.