Wednesday, October 21, 2009
Binary/Trinary/etc Trees in LaTeX
For all you LaTeX fans out there, thought you might find this interesting. There is a package out there that lets you represent a decision tree in LaTeX using relatively simple code. You can download it here.
Now for a quick example.
\Tree [.A [.true [.B [.true $+$ ] [.false $-$ ] ] ] [.false [.C [.true [.D [.true $+$ ] [.false $-$ ] ] ] [.false $-$ ] ] ] ]
This turns into the image at the left. The only limitations are that each node can have no more than 5 children and the whole tree must be less than 20 levels deep.
Tuesday, October 6, 2009
MESO
In their paper, "MESO: Supporting Online Decision Making in Autonomic Computing Systems", Kasten and McKinley showcase their novel approach to pattern classification. Called MESO (Multi-Element Self-Organising tree), this pattern classifier is designed to support online, incremental learning and decision making in autonomic systems.
Using supervised training data, MESO takes a novel approach to clustering. It also introduces a new tree structure to organise the resulting clusters, which the authors call sensitivity spheres.
To create their sensitivity spheres, Kasten and McKinley improved on the long-standing leader-follower algorithm, which creates small clusters of patterns: a training pattern that falls within a specified distance of an existing cluster is assigned to that cluster; otherwise a new cluster is created.
The problem with that algorithm is that the distance threshold which determines the size of the clusters is fixed throughout the clustering process.
In their paper, the authors propose a growth function to remedy this problem.
(The equation image for the growth function is omitted here. From the accompanying notes: the function uses the distance between the new pattern and the nearest sensitivity sphere, scales the result relative to the difference between the current and new values, and its denominator serves to limit the growth rate based on how far the current value is from the new one.)
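To make the clustering step concrete, here is a minimal Python sketch of leader-follower clustering with an adjustable sphere size. The Euclidean distance, the starting value of delta, and the simple growth step below are illustrative assumptions, not the exact growth function from Kasten and McKinley's paper.

import math

def euclidean(a, b):
    # Distance between two feature vectors (Euclidean is an assumption here).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def leader_follower(patterns, delta=1.0, growth_rate=0.1):
    # Leader-follower clustering with an adjustable sphere size.
    # Each "sensitivity sphere" keeps a center and its member patterns.
    # A pattern within delta of the nearest center joins that sphere;
    # otherwise it starts a new sphere. The growth step is a stand-in
    # for the paper's growth function: it nudges delta toward the
    # observed distance so the threshold is not fixed for the whole run.
    spheres = []
    for p in patterns:
        if not spheres:
            spheres.append({"center": list(p), "members": [p]})
            continue
        nearest = min(spheres, key=lambda s: euclidean(s["center"], p))
        d = euclidean(nearest["center"], p)
        if d <= delta:
            # Assign to the nearest sphere and update its center (running mean).
            nearest["members"].append(p)
            n = len(nearest["members"])
            nearest["center"] = [c + (x - c) / n
                                 for c, x in zip(nearest["center"], p)]
        else:
            # Too far from every existing sphere: create a new one.
            spheres.append({"center": list(p), "members": [p]})
        # Illustrative growth step: move delta a little toward d.
        delta += growth_rate * (d - delta)
    return spheres

if __name__ == "__main__":
    data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 0.2)]
    for i, s in enumerate(leader_follower(data)):
        print(i, s["center"], len(s["members"]))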
Once the data has been assigned to these clusters, or sensitivity spheres, it is organised into a novel tree structure. Rather than focusing on putting individual patterns into large clusters close to the root of the tree, Kasten instead places the focus on the sensitivity spheres. He builds a MESO tree starting at the root node, which is home to all of the sensitivity spheres. He further explains:
The root node is then split into subsets of similar spheres, which produces child nodes. Each child node is further split into subsets until each child contains only one sphere.
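Below is a rough Python sketch of that splitting process, reusing the sphere dictionaries from the sketch above. The two-way split around the two most distant sphere centers is an assumption made for illustration; the paper's actual criterion for grouping similar spheres into subsets is more involved.

import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_meso_tree(spheres):
    # Recursively split a list of sensitivity spheres into a tree.
    # The root holds every sphere; each node is split into subsets of
    # similar spheres (here, a simple two-way split around the two most
    # distant centers) until each leaf holds a single sphere.
    node = {"spheres": spheres, "children": []}
    if len(spheres) <= 1:
        return node  # leaf: one sphere
    # Use the two centers that are farthest apart as split seeds.
    seed_a, seed_b = max(
        ((a, b) for i, a in enumerate(spheres) for b in spheres[i + 1:]),
        key=lambda pair: euclidean(pair[0]["center"], pair[1]["center"]))
    group_a, group_b = [], []
    for s in spheres:
        if euclidean(s["center"], seed_a["center"]) <= euclidean(s["center"], seed_b["center"]):
            group_a.append(s)
        else:
            group_b.append(s)
    if not group_a or not group_b:
        # Degenerate split (e.g. identical centers): fall back to an arbitrary cut.
        group_a, group_b = spheres[:1], spheres[1:]
    node["children"] = [build_meso_tree(group_a), build_meso_tree(group_b)]
    return node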
Results
Using the eight datasets in the table below, MESO demonstrates its superiority in speed and accuracy over the other classifiers tested.
POM2
In 2008, Dr. Menzies authored a paper titled "Using Simulation to Investigate Requirements Prioritization Strategies". This paper showed the effects of project and requirement dynamism on the software development process. In the Spring of 2008, he tasked his Artificial Intelligence class with expanding on this model.
Thus, POM2 was born. One of the main differences between the two is that POM focused on small teams and small projects, while POM2 built its model around the five branches proposed by Dr. Turner and Dr. Boehm. This varied the project size and introduced other factors:
- Personnel - The number and skill level of the staff
- Dynamism - The amount of project change
- Criticality - The impact of failure after deployment
- Culture - How accepting the staff is of change
- Size - The total number of staff across all teams
POM2 was then put through a Monte Carlo simulator while a KEYS search engine isolated the interesting areas in the simulator.
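As a loose illustration of that setup, here is a Python sketch that Monte Carlo samples the five POM2 branches and keeps the highest-scoring configurations. The run_pom2 scoring stub, the input ranges, and the final ranking step are all placeholders: they only gesture at what the simulator and a KEYS-style search do, and are not the actual POM2 model or the KEYS algorithm.

import random

def run_pom2(config):
    # Placeholder for the POM2 model: map a project configuration to a score.
    # The real model is far richer; this stub just rewards a change-tolerant
    # culture in dynamic projects and penalises criticality.
    return config["culture"] * config["dynamism"] - 0.1 * config["criticality"]

def sample_config():
    # Draw one random setting of the five POM2 branches (illustrative ranges).
    return {
        "personnel":   random.uniform(0.0, 1.0),  # number and skill level of the staff
        "dynamism":    random.uniform(0.0, 1.0),  # amount of project change
        "criticality": random.uniform(0.0, 1.0),  # impact of failure after deployment
        "culture":     random.uniform(0.0, 1.0),  # staff acceptance of change
        "size":        random.randint(3, 300),    # total staff across all teams
    }

def monte_carlo(trials=1000, keep=0.1):
    # Run many random configurations and keep the best-scoring fraction,
    # roughly the "interesting areas" a search engine like KEYS would try
    # to isolate (the real KEYS procedure works differently).
    runs = [(run_pom2(c), c) for c in (sample_config() for _ in range(trials))]
    runs.sort(key=lambda r: r[0], reverse=True)
    return runs[:max(1, int(trials * keep))]

if __name__ == "__main__":
    for score, config in monte_carlo(trials=2000)[:5]:
        print(round(score, 3), config)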
The main discovery is that Agile development methods performed as well as or better than plan-based development methods in all areas, excelling especially in dynamic environments. Further research into the Personnel and Criticality branches of POM2 is needed to fully flesh out the model.
POM2 was accepted as a short paper to ASE'09. The full version of the paper can be found here, and the short version can be found here.
Bryan Lemon
Bryan is currently a Computer Science Master's student at West Virginia University. He graduated with his BS in Computer Science from Bluefield State College in spring of 2008.
He, along with a team of other students at Bluefield State College, competed in the Intelligent Ground Vehicle Competition in the summer of 2008. They took first place in the Autonomous Challenge and 4th overall.
In the Fall of 2009, he, his advisor, and his classmates submitted a paper to ASE'09 detailing a software process development model. It will be featured as a short paper at November's conference. He is currently developing a decision-tree-based clusterer called Ripple Down Rules.
Bryan plans on going on to the PhD program upon completing his Master's degree. Once all is said and done, maybe he will finally have the time to develop a hobby.
Monday, October 5, 2009
PhotoSketch Combines Art, AI, and Voodoo Magic
AI research is a great field and all, but I occasionally look at our colleagues over in computer graphics and marvel at what they are doing. SIGGRAPH Asia is coming up later this year, and while I'm sure that we'll see all sorts of mind-blowing things, Tao Chen et al from Tsinghua University have decided to get the party started right now.
Their development, PhotoSketch, is a cloud-based app (I hear they're all the rage these days). It takes a quick, labeled sketch from the user and turns it into a beautiful photograph containing all of the requested elements. This video demonstrates the process:
PhotoSketch: Internet Image Montage from tao chen on Vimeo.
Basically, the app lets you enter any rough sketch, label it, and press the big "go" button. Their algorithm will find images in its database that match the given labels and decide on the most appropriate match by looking at the labels and sizes of the other objects in the sketch (hey - there's the data mining connection). It will then seamlessly merge all of these disparate elements into a single image, match them with a background, and adjust the lighting and shadows into something vaguely realistic. The results look awesome. There is still something a little... off... about the examples, but they will let the layperson compete with the gods in internet Photoshop contests.
PhotoSketch
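To give a flavour of the kind of matching described above, here is a toy Python sketch that scores candidate images for a single sketched element by label agreement and by how closely their aspect ratio matches the sketched box. This is purely an assumed illustration; PhotoSketch's real matching and blending pipeline is far more sophisticated.

def score_candidate(sketch_item, candidate):
    # Toy scoring of one database image against one sketched, labeled box.
    # Real PhotoSketch matching also considers the other objects in the
    # scene, content filtering, and blending cost; this only illustrates
    # combining label agreement with rough shape agreement.
    if candidate["label"] != sketch_item["label"]:
        return 0.0
    sketch_ratio = sketch_item["width"] / sketch_item["height"]
    cand_ratio = candidate["width"] / candidate["height"]
    # Closer aspect ratios give a score closer to 1.
    return 1.0 / (1.0 + abs(sketch_ratio - cand_ratio))

def best_match(sketch_item, database):
    # Pick the highest-scoring candidate image for one sketched element.
    return max(database, key=lambda c: score_candidate(sketch_item, c))

if __name__ == "__main__":
    sketch = {"label": "boat", "width": 120, "height": 60}
    db = [
        {"label": "boat", "width": 400, "height": 220},
        {"label": "boat", "width": 200, "height": 400},
        {"label": "tree", "width": 300, "height": 500},
    ]
    print(best_match(sketch, db))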
Clojure: Lisp's Next Evolution
Clojure is a new, dynamic programming language built on the rock-solid Java Virtual Machine, the industry-standard platform underlying Java, one of the world's most popular and powerful multi-platform languages. But why write in Clojure, rather than Java, if you want to target the JVM? Wouldn't it make more sense to write your project in the language for which the virtual machine was designed?
Enter functional programming at its finest; Clojure doesn't just target the JVM. No, it's much more than that -- it's an implementation of Lisp. It's not your standard Common Lisp, however; it's a highly specialized form of the language. It's designed with everything modern functional programmers have in mind: concurrency, immutability, and perhaps most importantly, portability. That's right: all your current Java classes are compatible with Clojure.
Clojure doesn't stop there. The core data structures are immutable and extensible. Code-as-data has been extended to maps and vectors. Practically everything is abstract. Multimethods foster polymorphic programming. It really is an amazing thing.
To see what I mean, you should really have a look for yourself over at Clojure's website. The MIL already has a project that is built upon Clojure, CLIFF, which uses an embedded version of the language to generate forensics models dynamically.
As a functional programmer, writing in Clojure has been a dream come true. Do yourself a favor and hack something up today. :D