Wednesday, March 20, 2013

Playing with DIMACS feature models -- updated 4/10/2013

Running IBEA for 1 hour over the available DIMACS feature models.
Parameters: population = 300, constrained mutation at a rate of 0.001, NO crossover.
Technique: before evolution, TWO rounds of rule checking are performed:
1) First round: every feature standing by itself in a line (a unit rule) is fixed: some are selected (mandatory) and some are deselected ("prohibited"!!!).
2) Second round: features that share a line with the features fixed in the first round become fixed as well, wherever the rule forces their value.
Mutation is then constrained so that it never messes with the fixed features. A sketch of the whole procedure follows.
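
For concreteness, here is a minimal Python sketch of this pre-processing plus the constrained mutation. The helper names and the precise round-2 rule (skip any rule already satisfied by a fixed feature; fix a feature left as the only undecided literal of a rule) are assumptions of the sketch, not the code used in the experiment.

    import random

    def parse_dimacs(path):
        """Read a DIMACS CNF file: one rule (clause) per line, literals are
        signed 1-indexed feature numbers, each line terminated by 0."""
        clauses = []
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line[0] in ('c', 'p'):   # skip comments and header
                    continue
                lits = [int(tok) for tok in line.split() if tok != '0']
                if lits:
                    clauses.append(lits)
        return clauses

    def fix_features(clauses):
        """Two rounds of rule checking, returning (fixed, skipped).
        fixed maps feature -> True (selected) / False (deselected);
        skipped holds the indices of rules that no longer need checking."""
        fixed = {}
        # Round 1: a feature standing by itself in a line is a unit rule;
        # a positive literal selects it, a negative one deselects it.
        for clause in clauses:
            if len(clause) == 1:
                fixed[abs(clause[0])] = clause[0] > 0
        # Round 2: look at rules that share a line with a fixed feature.
        # A rule already satisfied by a fixed feature can be skipped; a rule
        # whose other literals are all falsified forces its last feature.
        # (Repeating this loop until nothing changes would implement the
        # "further rounds" of observation 1 below.)
        skipped = set()
        for i, clause in enumerate(clauses):
            if any(fixed.get(abs(l)) == (l > 0) for l in clause):
                skipped.add(i)
                continue
            undecided = [l for l in clause if abs(l) not in fixed]
            if len(undecided) == 1:
                fixed[abs(undecided[0])] = undecided[0] > 0
                skipped.add(i)
        return fixed, skipped

    def constrained_mutation(genome, fixed, rate=0.001):
        """Bit-flip mutation that never touches fixed features.
        genome[i] is the selection bit of feature i+1."""
        for var in range(1, len(genome) + 1):
            if var not in fixed and random.random() < rate:
                genome[var - 1] = 1 - genome[var - 1]

Calling fix_features on each model and printing len(fixed) and len(skipped) is, roughly, how the "skipped features" and "skipped rules" columns in the table below could be reproduced.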


FM               Features    Rules  %correct  Skipped features (R1)  Skipped features (total)  Skipped rules (total)
toybox                544     1020      100%        361  (66%)             363  (67%)               394  (39%)
axTLS                 684     2155      100%        382  (56%)             384  (56%)               259  (12%)
ecos                 1244     3146      100%         10   (1%)              19   (2%)                11   (0%)
freebsd              1396    62183      100%          3   (0%)               3   (0%)                20   (0%)
fiasco               1638     5228      100%        995  (61%)             995  (61%)               553  (11%)
uClinux              1850     2468      100%       1234  (67%)            1244  (67%)              1850  (75%)
busybox              6796    17836      100%       3947  (58%)            3949  (58%)              2644  (15%)
uClinuxconfig       11254    31637        4%       6025  (54%)            6027  (54%)              4641  (15%)
coreboot            12268    47091        1%       4592  (37%)            4672  (38%)              2060   (4%)
buildroot           14910    45603        0%       6755  (45%)            6759  (45%)              3534   (8%)
embtoolkit          23516   180511        0%       6370  (27%)            6619  (28%)               657   (0%)
freetz              31012   102705        0%      14444  (47%)           14493  (47%)              3911   (4%)
Linux 2.6.32        60072   268223        0%      32329  (54%)           32479  (54%)             18734   (7%)
Linux 2.6.33        62482   273799        0%      33597  (54%)           33766  (54%)             19394   (7%)
Linux 2.6.28.6       6888   343944        0%          -                      -                        -

(Percentages are relative to the Features and Rules columns; "R1" = after round 1 only.)

Questions and next steps:
1) Verify how many features are fixed in the first and second rounds of rule checking. Do we benefit from further rounds? Done: no benefit is expected from further rounds.
2) The first 7 feature models are "easy"... why? Do they have a lot of "fixed" features, or are they just smaller? Done: ecos and freebsd have very few "fixed" features, yet they are solved easily... Size is one factor within the larger notion of "hardness" or "complexity".
3) These 7 could be used in experiments comparing IBEA with other algorithms (NSGA-II, etc.)... We already know the others will suck. This could be our "scale-up" verification. No.
4) We could also compare this "rule detection" method with complete randomness (ICSE'13 style), but we've already learned not to rely on complete randomness. No.
5) We could optimize two objectives only, then plug the result into the 5-objective optimization... This could save a lot of wasted "number crunching". We could run the 2-objective version, identify the rules that are most frequently violated, make recommendations, take feedback, fix a bunch of features... and run again... Sound familiar? (A rough sketch of this loop follows.)
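
A rough sketch of idea 5's loop, reusing fix_features from the sketch above. The optimizer and violation counter are passed in as callables because both are hypothetical placeholders here, and the "make recommendations, take feedback" step is reduced to simply fixing every feature of the worst rule.

    def two_then_five(clauses, run_ibea, count_violations, rounds=3):
        """Hypothetical outer loop: cheap 2-objective runs fix more
        features before one expensive 5-objective run."""
        fixed, _ = fix_features(clauses)
        for _ in range(rounds):
            # Cheap run: optimize only 2 objectives with the current fixes.
            front = run_ibea(clauses, fixed, objectives=2)
            # Hypothetical counter: maps rule index -> how often that rule
            # is violated across the returned front.
            violations = count_violations(front, clauses)
            worst = max(violations, key=violations.get)
            # Placeholder for "make recommendations, take feedback": fix
            # every feature of the most-violated rule so it is satisfied.
            for lit in clauses[worst]:
                fixed.setdefault(abs(lit), lit > 0)
        # Final expensive run with the accumulated fixes.
        return run_ibea(clauses, fixed, objectives=5)

Taking run_ibea and count_violations as parameters keeps the sketch independent of whatever optimizer implementation is actually used.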
