Monday, August 26, 2013

Yet another data mining toolkit


E.g., a Naive Bayes classifier, which finds the class with the highest likelihood:

function likelihood(row,total,hypotheses,l,_Tables,k,m,
      like,h,nh,prior,tmp,c,x,y,best) {
   like  = NINF ;    # smaller than any log
   total = total + k * length(hypotheses)
   for(h in hypotheses) {
      nh    = length(datas[h])     # number of rows seen for class h
      prior = (nh+k)/total         # smoothed prior for class h
      tmp   = log(prior)
      for(c in terms[h]) {         # symbolic columns
         x = row[c]
         if (x == "?") continue    # skip missing values
         y = counts[h][c][x]
         tmp += log((y + m*prior) / (nh + m))   # m-estimate
      }
      for(c in nums[h]) {          # numeric columns
         x = row[c]
         if (x == "?") continue    # skip missing values
         y = norm(x, mus[h][c], sds[h][c])   # norm: assumed to be a Gaussian density
         tmp += log(y)
      }
      l[h] = tmp                   # log-likelihood of each class, for the caller
      if ( tmp >= like ) {like = tmp; best=h}
   }
   return best
}
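
As a reading aid, the score that the function accumulates for each hypothesis can be written out as an equation. This is a sketch of what the code above appears to compute, not a definitive statement of the toolkit's method. The symbols are assumptions introduced here: N is the `total` passed in, |H| is the number of hypotheses, n_h is `nh`, and x_c is the row's value in column c. Columns whose value is "?" are left out of both sums.

\[
P(h) = \frac{n_h + k}{N + k\,|H|}
\]

\[
\mathrm{score}(h) = \log P(h)
  + \sum_{c \in \mathrm{terms}_h} \log \frac{\mathrm{counts}(h,c,x_c) + m\,P(h)}{n_h + m}
  + \sum_{c \in \mathrm{nums}_h} \log \mathrm{norm}(x_c;\ \mu_{h,c},\ \sigma_{h,c})
\]

The function returns the h with the largest score. The listing does not show `norm`; if it is the usual Gaussian density, then

\[
\mathrm{norm}(x;\ \mu,\ \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
\]

For example, with made-up numbers n_h = 4, N = 10, |H| = 2, k = 1 and m = 2, the prior is (4+1)/(10+2) = 5/12, or about 0.417. A symbolic value seen 3 times in class h then contributes log((3 + 2(5/12)) / (4 + 2)), or about log(0.639).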

Lit review on transfer learning

Transfer Learning for Software Engineering:
A Literature Review

Tim Menzies, Fayola Peters 
Lane Department of CS & EE, WVU, USA

Forrest Shull, Lucas Layman
Fraunhofer Center for Experimental SE, College Park, MD, USA

For decades, empirical methods in SE have focused on mostly manual analysis of project data. That approach has often suffered from a lack of transfer, since it was hard to migrate lessons learned from one project to another.

Recently, there has been much success with automatic transfer learning between SE projects. This short paper reviews that work and observes that (1) transfer learning has been far more successful at transferring lessons learned than traditional SE methods; (2) past research on transfer learning has uncovered numerous open issues that need exploring.

Download (380K, pdf)

Two articles in TSE

WVU grad students rule

Wednesday, August 21, 2013

A Summer in Rear-view - Joe

NASA Ames Report:

I am developing a framework for NextGen aviation technology studies. The framework will investigate how to integrate much-needed NextGen technology into current aviation policies.


Modeling for Optimization:

[min] false alarm rate
[min] distance travelled
[max] granularity
[min] look-ahead time
[min] safe spacing distance
[min] crashes
[min] conflicts

num planes
homogeneous airspace {true, false}   /    num types of planes
num runways
percentage RNP
look-ahead time
safe spacing distance
[psych] scenario level {SC1, SC2, SC3, SC4, SC5}
[psych] function allocation {FA1, FA2, FA3, FA4, FA5}
[psych] cognitive control model     {OPP, STR, TAC}

California Trip:

In addition to the NASA Ames campus, I visited several other sites:
 - Google
 - Stanford
 - Vegas
 - Los Angeles
 - San Francisco
 - Golden Gate Bridge
 - Napa Valley Area

GALE Paper Preview:

[culture] [crit] [crit mod] [init kn] [inter-D] [dyna] [size] [plan] [team size]

"New" Decision Bin Charts GALE vs NSGA-II:

Sans Completion: (just look at the stuff on the left)

Looking Ahead:

Before end of 2013:
- Finish NASA Aviation Project
- "Upgrade" GALE into GALE2. That means identifying the current quirks in GALE1 and cleaning them out to improve performance.
- Integrate NASA project into GALE2 for testing.

Spring 2014: 
- Finish Ph.D.