ai @ wvu: Problem with Stopping Rule in Compass Defect Prediction

Tuesday, August 31, 2010

I've found that the stopping rule for walking the Compass tree fires too early. Compass currently stops descending when a node's variance is less than the weighted variance of that node's children:

    (if (< (node-variance c-node)
           (weighted-variance c-node))

(defun weighted-variance (c-node)
  "Size-weighted mean of the two children's variances. With a single
child, return that child's variance; for a leaf, the node's own variance."
  (let ((left (node-left c-node))
        (right (node-right c-node)))
    (cond ((and (null left) (null right))
           (node-variance c-node))
          ((null right) (node-variance left))
          ((null left) (node-variance right))
          (t (let ((n-left (length (node-contents left)))
                   (n-right (length (node-contents right))))
               (/ (+ (* (node-variance right) n-right)
                     (* (node-variance left) n-left))
                  (+ n-right n-left)))))))

http://github.com/abutcher/compass/raw/master/trunk/src/lisp/variance.lisp
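To make the arithmetic concrete, here is a minimal sketch of the stopping test on a single hand-built node. The `node` struct layout, the `stop-here-p` wrapper, and the variance values are all assumptions made up for illustration; only the accessor names and the weighting logic come from the code above.

```lisp
;; Assumed node layout for illustration; the real Compass struct may differ.
(defstruct node variance contents left right)

(defun weighted-variance (c-node)
  "Same logic as the function above, written with COND."
  (let ((left (node-left c-node))
        (right (node-right c-node)))
    (cond ((and (null left) (null right)) (node-variance c-node))
          ((null right) (node-variance left))
          ((null left) (node-variance right))
          (t (let ((n-left (length (node-contents left)))
                   (n-right (length (node-contents right))))
               (/ (+ (* (node-variance right) n-right)
                     (* (node-variance left) n-left))
                  (+ n-right n-left)))))))

(defun stop-here-p (c-node)
  "Hypothetical wrapper around the stopping test quoted in the post."
  (< (node-variance c-node) (weighted-variance c-node)))

;; Made-up numbers: a small, fairly pure left child (2 instances)
;; and a large, noisy right child (8 instances).
(defparameter *example-parent*
  (make-node :variance 1/5
             :contents '(a b c d e f g h i j)
             :left  (make-node :variance 1/20 :contents '(a b))
             :right (make-node :variance 3/10 :contents '(c d e f g h i j))))

(weighted-variance *example-parent*)  ; => 1/4, i.e. (3/10*8 + 1/20*2) / 10
(stop-here-p *example-parent*)        ; => T, because 1/5 < 1/4
```

In this toy case the larger child dominates the weighted value, so the test fires even though the left child has much lower variance than its parent.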

In the defect prediction data sets, this condition is usually met near the top of the tree (typically at level 1 or 2). As a result, the cluster used for majority voting often contains most of the data set rather than a small group of similar instances.

Example:

http://github.com/abutcher/compass/raw/master/doc/dot/defect/pruned/jm1.dot.png
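The early stop can also be reproduced on a toy tree. The sketch below is not the Compass traversal: the `node` struct, `stop-level`, the `larger-child` chooser, the root-is-level-0 convention, and every number in `*toy-tree*` are assumptions chosen only to show how the rule can halt one level below the root with most of the data still in the cluster.

```lisp
;; Assumed node layout for illustration; the real Compass struct may differ.
(defstruct node variance contents left right)

(defun weighted-variance (c-node)
  "Same logic as the function in the post, written with COND."
  (let ((left (node-left c-node))
        (right (node-right c-node)))
    (cond ((and (null left) (null right)) (node-variance c-node))
          ((null right) (node-variance left))
          ((null left) (node-variance right))
          (t (let ((n-left (length (node-contents left)))
                   (n-right (length (node-contents right))))
               (/ (+ (* (node-variance right) n-right)
                     (* (node-variance left) n-left))
                  (+ n-right n-left)))))))

(defun larger-child (c-node)
  "Hypothetical chooser: follow the child that holds more instances."
  (let ((left (node-left c-node))
        (right (node-right c-node)))
    (cond ((null left) right)
          ((null right) left)
          ((>= (length (node-contents left))
               (length (node-contents right)))
           left)
          (t right))))

(defun stop-level (c-node choose-child &optional (level 0))
  "Descend until the stopping test fires or a leaf is reached.
Returns the level (root = 0, an assumption) and the node stopped at."
  (let ((child (funcall choose-child c-node)))
    (if (or (null child)
            (< (node-variance c-node) (weighted-variance c-node)))
        (values level c-node)
        (stop-level child choose-child (1+ level)))))

;; Made-up tree: 10 instances at the root, split 6 / 4, then 3 / 3.
(defparameter *toy-tree*
  (flet ((n (variance size &optional left right)
           (make-node :variance variance
                      :contents (make-list size :initial-element :row)
                      :left left :right right)))
    (n 1/2 10
       (n 1/5 6 (n 1/4 3) (n 1/4 3))
       (n 2/5 4))))

(multiple-value-bind (level stop-node) (stop-level *toy-tree* #'larger-child)
  (format t "stopped at level ~D with ~D of ~D instances~%"
          level
          (length (node-contents stop-node))
          (length (node-contents *toy-tree*))))
;; prints: stopped at level 1 with 6 of 10 instances
```

At the root the test does not fire (1/2 is not below the weighted 7/25), but at level 1 it does (1/5 is below 1/4), so the voting cluster would be 6 of the 10 instances.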
