Lee Wilkinson sends along this cool paper demonstrating a simple but effective automatic classifier:
Linf is a classier that was designed to address the curse of dimensionality and polynomial complexity by using projection, binning, and covering in a sequential framework. For class-labeled points in high-dimensional space, Linf employs computationally-efficient methods to construct 2D projections and sets of rectangular regions on those projections that contain points from only one class. Linf organizes these sets of projections and regions into a decision list for scoring new data points.
Linf is not a hybrid or modication of existing classifiers; it employs a new covering algorithm. The accuracy of Linf on widely-used benchmark datasets is comparable to the accuracy of competitive classifiers and, in some important cases, exceeds the accuracy of competitors. Its computational complexity is sub-linear in number of instances and number of variables and quadratic in number of classes.
I also like the article’s delightfully understated conclusion. After a page of bullet points on the virtues of their method, the authors write:
Given these distinctive features and its fundamental differences from other classifiers, Linf is a candidate for inclusion in portfolios of classifiers.
If only all of us could be so modest.
P.S. Table 1 and Figure 5 shouldn’t be in alphabetical order, and I think Figure 6 would work better as a parallel coordinates plot. These are pretty minor comments, but Lee is an authority on statistical graphics so I hold him to a high standard.