Add one feature to your model and test and debug with fake data before going on.
Don’t try to add two features at once.
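To make the fake-data step concrete, here is a minimal sketch in Python rather than Stan. Everything in it is an assumption for illustration: the model is a plain straight-line regression, and the names `simulate` and `fit_line` and the "true" parameter values are made up. The point is only the workflow: pick known parameters, simulate data from them, fit, and check that the fit recovers what you put in before adding the next feature.

```python
import random

def simulate(a, b, sigma, n, seed=0):
    """Simulate fake data from y = a + b * x + Normal(0, sigma) noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0, sigma) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical "true" values chosen for the check.
true_a, true_b = 1.5, -0.7
xs, ys = simulate(true_a, true_b, sigma=0.5, n=2000)
a_hat, b_hat = fit_line(xs, ys)
print(f"intercept: true={true_a}, estimated={a_hat:.3f}")
print(f"slope:     true={true_b}, estimated={b_hat:.3f}")
```

If the estimates land far from the values you simulated with, the bug is in the most recent feature, which is the reason to add them one at a time.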
A more general piece of advice is to take one step at a time. Or as Daniel Lee likes to say, “baby steps.” Especially if we’re talking heavily templated C++ in a system as complex as Stan. And set up a way to trace what your program does.
We have diagnostics like divergences and we have print statements. Not ideal, but way better than nothing. So-called “debug by printf” (the name comes from C) is a tried-and-true method of debugging. If you have a stepwise debugger and are better than me at navigating a complex stack, by all means use that.
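As a rough illustration of tracing, here is a small Python sketch that uses the standard `logging` module as a slightly tidier printf. The function `normal_log_density` is a made-up example, not anything from Stan; the idea is just to log every intermediate value so that when a number looks wrong you can see where it first went wrong.

```python
import logging
import math

# Send DEBUG-level trace messages to stderr.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("trace_demo")

def normal_log_density(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y, tracing each intermediate value."""
    log.debug("inputs: y=%r mu=%r sigma=%r", y, mu, sigma)
    if sigma <= 0:
        log.debug("rejecting: sigma must be positive")
        return -math.inf
    z = (y - mu) / sigma
    log.debug("standardized residual z=%r", z)
    lp = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    log.debug("log density lp=%r", lp)
    return lp

normal_log_density(1.2, 0.0, 1.0)
normal_log_density(1.2, 0.0, -1.0)  # the trace shows why this returns -inf
```

Once the bug is found, raising the level to `logging.INFO` silences the trace without deleting it.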
It’s no coincidence that I’m up debugging after 11 PM. I think one of the main reasons programmers stay up late is that we can’t stand to go to bed on a bug. The other is that we can’t stop programming when things are going well and we’re in the zone. Those stem from what I think of as the two main drives for programming: hating to be beaten by a problem and loving to build stuff. The third reason is that there are no interruptions. Maybe that’d also work at 6 AM—I’ve never tried that approach.
But the first is a very bad habit, because going to bed on a bug is one of the best things you can do to solve it, and really, solving a bug at 3 AM vs. 11 AM doesn’t usually matter to anyone. I can’t even begin to count the number of times “sleeping on it” has worked for me in 35 years of math and programming problems. If you can’t go to bed as early as 11 PM, get up and walk away from the problem and come back. I don’t mean just checking your texts, email, or blog, I mean something enough to really pull you out of what you’re doing and completely clear your stack. I almost never have the discipline to do that. Maybe it’s the false presumption that I’m really close to finding the bug.
> Maybe it’s the false presumption that I’m really close to finding the bug.
That’s what it always is for me. And not really internalizing the idea of sunk costs, so if it’s 2:50 AM already, how silly would it be to go to bed now that I’m just 5 minutes from killing this bug!?
This is great advice. In my personal experience, when I’m stuck on a complex problem, the solution will *only* present itself to me once I take a moment to really step away. Actually, sometimes the solution becomes clear at a very inconvenient time (ok, so I went to bed already, but now I suddenly know the solution to my problem – if I don’t get up again to write it down I might forget it again).
Just keep a notebook by your bed.
And then spend a lot of time in the morning trying to figure out what those unintelligible scrawls are trying to say.
Better than trying to figure out in the morning why you are missing half your files and three months of git history.
Keeping a notebook by the bed is a good idea.
Invariably, I discover in the morning that my idea is terrible/wrong/impossible — but once I wrote it down, I could sleep.
Similarly, I always pack the night before, and use that notebook in case I remember something I might have forgotten.
Another reason to keep pressing on as the clock keeps going round is that it can take so long to load up the mental overlays to deal with each area of work. You seem to have to “learn” the subroutine or whatever you’re working on and that can take hours.
I think the reason leaving and then returning to a problem helps is that every time you open a concept in your mind you have to analyse it in its setting. That often only happens when you first start concentrating on it.
One annoying result of concentrating on debugging or the actual encoding process is that you can’t remember words very well for a day or two afterwards. Let’s hope this staves off Alzheimer’s rather than encouraging it!! :-S
I agree that “sleeping on it” can be helpful or even necessary in many puzzle-like situations. This particular debugging problem is different: it’s not so much a puzzle (“find the bug”) as a slog. There’s no one particular tricky thing, just a lot of moving parts, each one of which needs to be tested.
So, in a similar vein, in troubleshooting engineering problems the age-old wisdom was to “change only one variable at a time”.
The same goes for doing a chemistry experiment to optimize reaction conditions. But I remember recently reading advice to the contrary. I don’t remember it exactly, but it said something to this effect:
“You can change multiple parameters at a time so as to cover the problem search space rapidly and then use math to deconvolute the effects. Especially when working in high dimensional problem spaces where the change-one-variable-at-a-time approach might be too slow.”
Yup, Stephen Stigler discussed this in his _The Seven Pillars of Statistical Wisdom_, attributing the idea to Fisher.
Thanks Keith! What was the “idea”? Can you elaborate more? What is your opinion as a statistician re experiments / troubleshooting? Should we change more than one parameter at once? Or not?
For a more recent explanation than Fisher, see Box / Hunter / Hunter: Statistics for Experimenters: Design, Innovation, and Discovery.
> and then use math to deconvolute the effects
I’m pretty sure you need a true theory about what the changes do and how they interact for this to work.
This is really akin to causal inference from observational data where a lot of things change at the same time all the time, instead of just the one thing you want to know the effect of.
To give an actual example: Suppose you are probing how to make a baseline reaction faster. You know what can matter: Temperature, rpm of stirring, ratio of reactants, dosage of catalyst.
So do you conduct runs where you vary variables one at a time or use some other approach that involves changing other combinations?
Most bugs do not vary but are systematic.
It is when we encounter noisy variation that varying combinations of variables brings real advantages.
Basically, if you vary the right variables together and analyse them properly, not only do you do more at once, you also drive down the noise that would have been introduced into each comparison by the other variables’ uncontrolled variation (but I think you would get much more out of reading Stigler).
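A small simulation can make the noise argument concrete. This is only a sketch under loud assumptions: the four factor names come from the reaction example above, but the effect sizes, the baseline of 10, the noise level, and the purely additive response (no interactions) are all invented for illustration. It compares a full two-level factorial design (16 runs) against a one-factor-at-a-time plan with 4 replicates per condition (20 runs) and looks at how much the estimated temperature effect bounces around.

```python
import itertools
import random
import statistics

FACTORS = ["temperature", "stir_rpm", "reactant_ratio", "catalyst_dose"]
# Hypothetical true effects of moving each factor from low (-1) to high (+1).
TRUE_EFFECT = {"temperature": 2.0, "stir_rpm": 0.5,
               "reactant_ratio": -1.0, "catalyst_dose": 1.5}
SIGMA = 1.0  # assumed run-to-run noise

def run(levels, rng):
    """One noisy run; levels maps factor -> -1 or +1 (additive model assumed)."""
    mean = 10.0 + sum(TRUE_EFFECT[f] * (levels[f] + 1) / 2 for f in FACTORS)
    return mean + rng.gauss(0, SIGMA)

def factorial_estimate(rng):
    """Full 2^4 design: 16 runs, and every run informs every main effect."""
    runs = []
    for combo in itertools.product([-1, 1], repeat=len(FACTORS)):
        levels = dict(zip(FACTORS, combo))
        runs.append((levels, run(levels, rng)))
    est = {}
    for f in FACTORS:
        high = [y for lv, y in runs if lv[f] == 1]
        low = [y for lv, y in runs if lv[f] == -1]
        est[f] = statistics.fmean(high) - statistics.fmean(low)
    return est

def ofat_estimate(rng, reps=4):
    """One factor at a time: baseline plus one changed factor each, 20 runs."""
    base = {f: -1 for f in FACTORS}
    base_mean = statistics.fmean([run(base, rng) for _ in range(reps)])
    est = {}
    for f in FACTORS:
        changed = dict(base, **{f: 1})
        changed_mean = statistics.fmean([run(changed, rng) for _ in range(reps)])
        est[f] = changed_mean - base_mean
    return est

def spread(estimator, n_sim=2000, seed=1):
    """Mean and standard deviation of the temperature estimate over repeats."""
    rng = random.Random(seed)
    temps = [estimator(rng)["temperature"] for _ in range(n_sim)]
    return statistics.fmean(temps), statistics.stdev(temps)

fact_mean, fact_sd = spread(factorial_estimate)
ofat_mean, ofat_sd = spread(ofat_estimate)
print(f"factorial (16 runs): mean={fact_mean:.2f} sd={fact_sd:.2f}")
print(f"one at a time (20 runs): mean={ofat_mean:.2f} sd={ofat_sd:.2f}")
```

Under these assumptions both plans are centered on the true effect, but the factorial estimate is steadier even with fewer runs, because each effect is a difference between two averages of 8 runs instead of two averages of 4. If the factors interact, the additive analysis here is no longer enough, which is where the references above come in.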
Good insight! Thanks!
Rahul: Google “simultaneous perturbation stochastic approximation” (SPSA) optimization. It’s a great way to optimize real-world, high-dimensional engineering problems.
It’s called Simultaneous Perturbation (Spall 1987). You can read about it on Wikipedia (for instance).
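For readers who want to see the shape of the idea, here is a minimal SPSA sketch in plain Python. It is an illustration, not a tuned implementation: the function name `spsa_minimize`, the gain constants `a`, `c`, and `A`, and the noisy quadratic test problem are all assumptions made up for this example, and the decay exponents 0.602 and 0.101 are values commonly suggested in the SPSA literature. The key point is that every coordinate is perturbed at once by a random ±1, so each iteration needs only two noisy measurements regardless of the number of dimensions.

```python
import random

def spsa_minimize(loss, theta0, n_iter=2000, a=0.1, c=0.1, A=100.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize a noisy loss using two evaluations per iteration."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb all coordinates simultaneously with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # The same two measurements give a gradient estimate for every coordinate.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta

# Hypothetical test problem: a quadratic bowl measured with noise.
TARGET = [1.0, -2.0, 0.5, 3.0]
noise_rng = random.Random(42)

def noisy_loss(theta):
    exact = sum((t - s) ** 2 for t, s in zip(theta, TARGET))
    return exact + noise_rng.gauss(0, 0.1)

theta_hat = spsa_minimize(noisy_loss, [0.0, 0.0, 0.0, 0.0])
print([round(t, 2) for t in theta_hat])
```

A finite-difference gradient would need two measurements per dimension per step; here the count stays at two, which is the attraction when each measurement is an expensive real-world run.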
That’s where genetic algorithms are strong – they slightly sample little aspects of the problem (“hyper-planes”, talking posh) – but don’t expect your mind to thank you for working it like that!
What’s the connection of Genetic Algorithms with all this?
“Changing multiple parameters at a time so as to cover the problem search space rapidly.”
I think I’ve posted this Poincare quote elsewhere on this blog. It’s his understanding of why “sleep on it” works.
It is certain that the combinations which present themselves to the mind in a kind of sudden illumination after a somewhat prolonged period of unconscious work are generally useful and fruitful combinations… all the combinations are formed as a result of the automatic action of the subliminal ego, but those only which are interesting find their way into the field of consciousness… A few only are harmonious, and consequently at once useful and beautiful, and they will be capable of affecting the geometrician’s special sensibility I have been speaking of; which, once aroused, will direct our attention upon them, and will thus give them the opportunity of becoming conscious… In the subliminal ego, on the contrary, there reigns what I would call liberty, if one could give this name to the mere absence of discipline and to disorder born of chance.
From _Science and Method_, Chapter 3, Mathematical Discovery, 1914, pp. 58
During grad school, I realized that I was most productive at writing code around 1 AM. I think this was because at that hour, I was mentally unable to keep more than the current line of code in my head.
Alternatively, the statistical ideas I came up with during those hours were all garbage.
I’m glad that Andrew Gelman has this problem too.