Monday, January 24, 2011

popularity is (almost) a perfect predictor for defects



One surprising observation from the Helix study [50] was that (a) most classes are not popular and (b) patterns of the most popular classes emerged very early in the lifetime of an open-source project. Specifically, popularity and age of a class maintain a positive monotonic relationship throughout the lifetime of a system [50]; that is, as a system matures, popular classes tend to become even more popular.

This contradicts our reading of the work of Fowler [20] and Beck [6], namely that object-oriented quality assurance involves constant refactoring (e.g., Beck’s TDD loop of write tests, run tests, refactor). We cannot see evidence of constant, widespread refactoring in the open-source projects studied by Helix [50].

The implications of this lack of refactoring for the theory of object-oriented development are discussed elsewhere [50]. This paper focuses on class popularity since (a) it is a stable concept for the lifetime of an object-oriented system and (b) popularity can lead to defects via:

  • Defect injection: As developers work with the popular classes, they make the occasional mistake. Some of these mistakes result in code defects. Since developers work on popular classes more than on other classes, most developer defects accumulate in the popular classes.
  • Defect discovery: Since developers work mostly on popular classes, they are most likely to uncover those classes’ defects.

We demonstrate that, in 33 open-source Java projects, popularity-based defect predictors work within 4% of a theoretical upper bound on predictor performance (this is the basis for our claim that such predictors are “nearly perfect”).
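
To make the comparison against an upper bound concrete, here is a minimal sketch in Python. It is not the paper's actual method: the excerpt above does not say how popularity is measured or how the bound is computed. The sketch assumes popularity is some per-class count (for example, how many other classes depend on a class) and that the upper bound is an oracle that inspects classes in order of their true defect counts. The class names, the numbers, and the `recall_at_k` helper are all hypothetical.

```python
def recall_at_k(ranking, defects, k):
    """Fraction of all known defects found in the first k classes of a ranking."""
    total = sum(defects.values())
    if total == 0:
        return 0.0
    return sum(defects[name] for name in ranking[:k]) / total


# Hypothetical per-class data (illustrative only, not taken from the paper).
popularity = {"Util": 30, "Cache": 15, "Parser": 12, "Config": 3, "Main": 1}
defects = {"Util": 9, "Cache": 2, "Parser": 4, "Config": 0, "Main": 1}

# Popularity-based predictor: inspect the most popular classes first.
by_popularity = sorted(popularity, key=popularity.get, reverse=True)

# Assumed upper bound: an oracle that already knows the true defect counts.
oracle = sorted(defects, key=defects.get, reverse=True)

k = 2  # inspection budget: how many classes we can afford to review
predicted = recall_at_k(by_popularity, defects, k)
best = recall_at_k(oracle, defects, k)

print(f"popularity ranking finds {predicted:.1%} of defects in the top {k}")
print(f"oracle ranking finds {best:.1%} of defects in the top {k}")
print(f"gap to the upper bound: {best - predicted:.1%}")
```

With this toy data the popularity ranking misses some defects that the oracle finds. The "within 4%" claim above says that, on the 33 projects studied, the corresponding gap was small.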
