Lorin H. writes:
One big question in the world of software engineering is: how much variation is there in productivity across programmers? (If you google for “10x programmer” you’ll see lots of hits).
Let’s say I wanted to explore this research question with a simple study: choose participants at random from a population of programmers, have each participant write a computer program according to a specification, and measure how long it takes each of them to complete a correct program.
(Let’s put aside for now the difficulties of using task completion time alone as a measure of productivity, or the difficulty of verifying that a program is “correct”).
I know that I can estimate the population variance from the sample variance. But the problem is that there’s some “noise” in this measurement: factors other than underlying ability can affect how the participants perform. Maybe some of them skipped lunch, or they’re just having a bad day.
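To make the worry concrete, here is a minimal simulation (all effect sizes are made-up numbers, not from any real study) showing that the sample variance of one measurement per programmer mixes the ability variance with the day-to-day noise variance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000                                # hypothetical number of programmers
ability = rng.normal(0, 0.5, n)         # "true" per-programmer effects (SD 0.5, assumed)
noise = rng.normal(0, 0.3, n)           # skipped lunches, bad days, etc. (SD 0.3, assumed)
observed = ability + noise              # one noisy measurement per programmer

# The sample variance of the single measurements estimates the SUM of the
# two variances (0.5**2 + 0.3**2 = 0.34), not the ability variance (0.25):
print(observed.var(ddof=1))
```

With a single measurement per person there is no way to split the observed variance into its two components, which is exactly the design question being asked.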
My question is: how do I design my study and analysis to try to identify the amount of variation that represents “individual ability” rather than these other factors that I can’t control for?
My reply: I’d think the thing to do would be to give multiple tasks to each programmer and then fit a model allowing for varying task difficulties, varying programmer abilities, and a task × programmer interaction. An additive model (on the log scale) with interactions should work just fine.
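As a rough sketch of the idea: simulate log completion times with additive programmer and task effects, then recover the programmer variance component from the two-way ANOVA mean squares. (In practice you’d fit this with a multilevel-modeling tool such as lme4 or Stan; the sizes and standard deviations below are invented for illustration, and with one observation per programmer-task cell the interaction is confounded with the residual, so you’d need replicate measurements to estimate it separately.)

```python
import numpy as np

rng = np.random.default_rng(0)
P, T = 50, 8                                # hypothetical counts: programmers, tasks
sigma_a, sigma_b, sigma_e = 0.5, 0.8, 0.3   # assumed true SDs on the log scale

a = rng.normal(0, sigma_a, P)               # programmer ability effects
b = rng.normal(0, sigma_b, T)               # task difficulty effects
e = rng.normal(0, sigma_e, (P, T))          # everything else (bad days, etc.)
y = 2.0 + a[:, None] + b[None, :] + e       # log completion times, additive model

# Two-way ANOVA decomposition (one observation per cell):
row_means = y.mean(axis=1)                  # per-programmer means
col_means = y.mean(axis=0)                  # per-task means
grand = y.mean()
ms_prog = T * np.sum((row_means - grand) ** 2) / (P - 1)
ms_resid = (np.sum((y - row_means[:, None] - col_means[None, :] + grand) ** 2)
            / ((P - 1) * (T - 1)))

# E[ms_prog] = T * sigma_a**2 + sigma_e**2, so the ability variance is:
var_ability = (ms_prog - ms_resid) / T
print(f"estimated programmer SD: {np.sqrt(var_ability):.2f} (true {sigma_a})")
```

The key point is that averaging over several tasks per programmer lets the noise be estimated from the residual and subtracted out, so the recovered SD is close to the ability SD rather than to the inflated single-measurement SD.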