Toward a framework for automatic model building

Patrick Caldon writes:

I saw your recent blog post where you discussed, in passing, an iterative chain-of-models approach to AI.

I essentially built such a thing for my PhD thesis – not in a Bayesian context, but in a logic programming context – and proved it had a few properties and showed how you could solve some toy problems. The important bit of my framework was that at various points you also go and get more data in the process – in a statistical context this might be seen as building a little univariate model on a subset of the data, then iteratively extending into a better model with more data and more independent variables – a generalized forward stepwise regression if you like. It wrapped a proper computational framework around E.M. Gold’s identification/learning in the limit based on a logic my advisor (Eric Martin) had invented.
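(Editorial aside, not from the thesis: the statistical reading above could be sketched roughly as follows. Each round reveals another batch of rows, then considers adding one more predictor to an ordinary least-squares fit. The function names, the batch size, and the `min_gain` threshold are all assumptions made for illustration; the logic-programming framework itself is not shown.)

```python
import numpy as np

def fit_ols(X, y, cols):
    """Least-squares fit on the chosen columns plus an intercept.
    Returns (coefficients, residual sum of squares)."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)

def iterative_stepwise(X, y, batch=50, min_gain=0.1):
    """Hypothetical 'generalized forward stepwise' loop.

    Each round (1) fetches `batch` more rows and (2) adds the single
    unused predictor that most reduces the residual sum of squares,
    but only if the relative reduction exceeds `min_gain`.
    Returns the selected column indices in order of inclusion."""
    n, p = X.shape
    selected = []
    seen = 0
    while seen < n:
        seen = min(n, seen + batch)           # go and get more data
        Xs, ys = X[:seen], y[:seen]
        _, base_rss = fit_ols(Xs, ys, selected)
        best = None
        for c in range(p):
            if c in selected:
                continue
            _, rss = fit_ols(Xs, ys, selected + [c])
            if best is None or rss < best[1]:
                best = (c, rss)
        if best is not None and base_rss > 0:
            if (base_rss - best[1]) / base_rss > min_gain:
                selected.append(best[0])      # extend the model
    return selected
```

Calling `iterative_stepwise(X, y)` on an `(n, p)` array returns the columns picked up along the way. The relative-gain rule is only a stand-in for whatever model-comparison criterion one would really use.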

What’s not written up in the thesis is a few months of failed struggle trying to shoehorn some simple statistical inference into this framework with decent computational properties! I had a good crack with a few different ideas and didn’t really get anywhere, and worse I couldn’t say much in the end about why it seemed to be hard. That said I think it’s straightforward in Gold’s original framework to show something along the lines that an integer-leaf valued CART tree is identifiable in the limit iff such a tree describes a collection of data, and my framework should give a straightforward (if probably computationally terrible) way of actually implementing such a thing.
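(Editorial aside: a toy version of identification in the limit by enumeration, to make the tree claim concrete. It is heavily simplified and is not the framework from the thesis. The hypothesis class is depth-1 trees over a single integer feature with integer leaves, and the `max_bound` cap is an assumption added so the search terminates; in Gold's setting the enumeration is unbounded and simply never settles when no tree describes the data.)

```python
from itertools import count, product

def stumps(max_bound=None):
    """Enumerate depth-1 trees (t, a, b): predict a if x < t else b.
    Triples are listed by increasing max(|t|, |a|, |b|), so every
    integer-valued stump appears at some finite position."""
    bounds = count(0) if max_bound is None else range(max_bound + 1)
    for k in bounds:
        for t, a, b in product(range(-k, k + 1), repeat=3):
            if max(abs(t), abs(a), abs(b)) == k:  # first seen at bound k
                yield (t, a, b)

def predict(h, x):
    t, a, b = h
    return a if x < t else b

def consistent(h, data):
    return all(predict(h, x) == y for x, y in data)

def learn_in_the_limit(stream, max_bound=20):
    """Yield the learner's conjecture after each (x, y) example.

    The conjecture changes only when the current one is contradicted,
    and then moves to the next stump in the enumeration that fits
    everything seen so far."""
    data = []
    hyps = stumps(max_bound)
    current = next(hyps)
    for example in stream:
        data.append(example)
        while not consistent(current, data):
            current = next(hyps, None)
            if current is None:
                raise ValueError("no stump within the bound fits the data")
        yield current
```

If the examples really come from a stump in the class, the sequence of conjectures changes only finitely often and then stays put, which is the sense of "identifiable in the limit". The brute-force enumeration also shows why this is computationally terrible.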

I’ve now moved on to different things (indeed, moved on from logic in academia into statistics in finance) but I thought you might find it interesting to see this problem analysed from a different perspective.


  1. Kaiser says:

    A friend just sent over a blog post hosted at a startup that has the tagline “building predictive models in minutes!” (paraphrasing). It really is sad that people would buy this crap. No one who actually uses models to make real decisions (and sticks around to see the results) would even spend 30 seconds on such “products”.
    I just put up a post on Junk Charts, in which I discussed a much simpler problem, which also defies any current AI paradigm. That is the problem of transforming employment data that arrives as a time series into a dataset that can be directly plotted into one of these cohort-style curves. This requires the AI to recognize the starting and ending points of recessions, chop up the time series into multiple non-uniform segments, build indices that use data from consecutive segments, etc. It’s one thing to literally program those steps; it’s another thing to tell the AI this is the input, this is the output, get me from A to Z. And there isn’t even any prediction involved here!

  2. Matt says:

    I’m reminded of some recent research in cognitive science that aims to learn an optimal representation for given data (as well as classifying data within that representation). The idea is that we (humans) have this immensely useful capability to organize and reorganize our internal models, while in AI the models are typically predefined by the engineer.