Kaveh sends along this, from a recent talk at Berkeley by Katherine Casey:
Paul Alper sends this in, from the article, “Ovarian cancer screening and mortality in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial,” by Ian J Jacobs, Usha Menon, Andy Ryan, Aleksandra Gentry-Maharaj, Matthew Burnell, Jatinderpal K Kalsi, Nazar N Amso, Sophia Apostolidou, Elizabeth Benjamin, Derek Cruickshank, Danielle N Crump, Susan K Davies, Anne Dawnay, Stephen Dobbs, Gwendolen Fletcher, Jeremy Ford, Keith Godfrey, Richard Gunu, Mariam Habib, Rachel Hallett, Jonathan Herod, Howard Jenkins, Chloe Karpinskyj, Simon Leeson, Sara J Lewis, William R Liston, Alberto Lopes, Tim Mould, John Murdoch, David Oram, Dustin J Rabideau, Karina Reynolds, Ian Scott, Mourad W Seif, Aarti Sharma, Naveena Singh, Julie Taylor, Fiona Warburton, Martin Widschwendter, Karin Williamson, Robert Woolas, Lesley Fallowfield, Alistair J McGuire, Stuart Campbell, Mahesh Parmar, and Steven J Skates:
Declaration of interests IJJ reports personal fees from and stock ownership in Abcodia as the non-executive director and consultant. He reports personal fees from Women’s Health Specialists as the director. He has a patent for the Risk of Ovarian Cancer algorithm and an institutional licence to Abcodia with royalty agreement. He is a trustee (2012–14) and Emeritus Trustee (2015 to present) for The Eve Appeal. He has received grants from the Medical Research Council (MRC), Cancer Research UK, the National Institute for Health Research, and The Eve Appeal. UM has stock ownership in and research funding from Abcodia. She has received grants from the MRC, Cancer Research UK, the National Institute for Health Research, and The Eve Appeal. NNA is the founder of, owns stock in, and is a board member of MedaPhor, a spin-off company at Cardiff University. He has a patent for the ultrasound simulation training system MedaPhor. SA is funded by a research grant from Abcodia. AD reports personal fees from Abcodia. AL reports personal fees from Roche as a panel member and advisory board member and fees from Sanofi Pasteur Merck Sharp & Dohme (Gardasil) as an advisory board member. JM is involved in a private ovarian cancer screening programme after closure of this trial. MWS reports personal fees from Abcodia as a consultant. LF reports funding from the MRC for the UK Collaborative Trial of Ovarian Cancer Screening psychosocial study. She reports personal fees from GlaxoSmithKline, Amgen, AstraZeneca, Roche, Pfzier, Teva, Bristol-Myers Squibb, and Sanofi and grants from Boehringer Ingelheim and Roche. SJS reports personal fees from the LUNGevity Foundation and SISCAPA Assay Technologies as a member of their Scientific Advisory Boards. He reports personal fees from Abcodia as a consultant and AstraZeneca as a speaker honorarium. He has a patent for the Risk of Ovarian Cancer algorithm and an institutional license to Abcodia. All other authors declare no competing interests.
All right, then.
Oh, in case you were wondering, here’s the last part of the paper’s summary:
Although the mortality reduction was not significant in the primary analysis, we noted a significant mortality reduction with MMS when prevalent cases were excluded. We noted encouraging evidence of a mortality reduction in years 7–14, but further follow-up is needed before firm conclusions can be reached on the efficacy and cost-effectiveness of ovarian cancer screening.
N = 202638 and the effect wasn’t statistically significant. No problem, says the non-executive director and consultant of Abcodia, the director of Women’s Health Specialists, the trustee of the Eve Appeal, the owner of stock in Abcodia, the owner of stock in MedaPhor, the patenter of MedaPhor, the receiver of personal fees from Abcodia, the panel member of Roche and advisory board member of Sanofi Pasteur Merck Sharp & Dohme, the receiver of personal fees from GlaxoSmithKline, Amgen, AstraZeneca, Roche, Pfzier, Teva, Bristol-Myers Squibb, and Sanofi and grants from Boehringer Ingelheim and Roche, the receiver of personal fees from SISCAPA Assay Technologies, Abcodia, and AstraZeneca. No problem, says the patent holder for the Risk of Ovarian Cancer algorithm. No problem at all. Encouraging evidence, they say.
P.S. I get funding from Novartis and have also been paid by Merck, Procter & Gamble, Google, and lots of other companies that I can’t remember right now.
See footnote 10 on page 5 of this GAO report.
(The above graphs are just for age 45-54, which demonstrates an important thing about statistical graphics: They should be as self-contained as possible. Otherwise when the graph is separated from its caption, it requires additional words of explanation, as you are seeing here.)
Wagenmakers et al. write:
A single experiment cannot overturn a large body of work. . . . An empirical debate is best organized around a series of preregistered replications, and perhaps the authors whose work we did not replicate will feel inspired to conduct their own preregistered studies. In our opinion, science is best served by ruthless theoretical and empirical critique, such that the surviving ideas can be relied upon as the basis for future endeavors. A strong anvil need not fear the hammer, and accordingly we hope that preregistered replications will soon become accepted as a vital component of a psychological science that is both thought-provoking and reproducible.
I don’t feel quite so strongly as E.J. regarding preregistered replications, but I agree strongly with his anvil/hammer quote, which comes at the end of a recent paper, “Turning the hands of time again: a purely confirmatory replication study and a Bayesian analysis,” by Eric-Jan Wagenmakers, Titia Beek, Mark Rotteveel, Alex Gierholz, Dora Matzke, Helen Steingroever, Alexander Ly, Josine Verhagen, Ravi Selker, Adam Sasiadek, Quentin Gronau, Jonathon Love, and Yair Pinto, which begins:
In a series of four experiments, Topolinski and Sparenberg (2012) found support for the conjecture that clockwise movements induce psychological states of temporal progression and an orientation toward the future and novelty.
OK, before we go on, let’s just see where we stand here. This is a Psychological Science or PPNAS-style result: it’s kinda cool, it’s worth a headline, and it could be true. Just as it could be that college men with fat arms have different political attitudes, or that your time of the month could affect how you vote or how you dress, or that being primed with elderly-related words could make you walk slower. Or just as any of these effects could exist but go in the opposite direction. Or, as the authors of those notorious earlier papers claimed, such effects could exist but only in the presence of interactions with socioeconomic class, relationship status, outdoor temperature, and attitudes toward the elderly. Or just as any of these could exist, interacted with any number of other possible moderators such as age, education, religiosity, number of older siblings, number of younger siblings, etc etc etc.
Topolinski and Sparenberg (2012) wandered through the garden of forking paths and picked some pretty flowers.
What happened when Wagenmakers et al. tried to replicate?
Here we report the results of a preregistered replication attempt of Experiment 2 from Topolinski and Sparenberg (2012). Participants turned kitchen rolls either clockwise or counterclockwise while answering items from a questionnaire assessing openness to experience. Data from 102 participants showed that the effect went slightly in the direction opposite to that predicted by Topolinski and Sparenberg (2012) . . .
No surprise. If the original study is basically pure noise, a replication could go in any direction.
Wagenmakers et al. also report a Bayes factor, but I hate that sort of thing so I won’t spend any more time discussing it here. Perhaps I’ll cover it in a separate post but for now I want to focus on the psychology experiments.
And the point I want to make is how routine this now is:
1. A study is published somewhere, it has p less than .05, but we know now that this says little to nothing at all.
2. The statistically significant p-value comes with a story, but through long experience we know that these sorts of just-so stories can go in either direction.
3. Someone goes to the trouble of replicating. The result does not replicate.
Let’s just hope that we can bypass the next step:
4. The original authors start spinnin and splainin.
And instead we can move to the end of this story:
5. All parties agree that any effect or interaction will be so small that it can’t be detected with this sort of crude experimental setup.
And, ultimately, to a realization that noisy studies and forking paths are not a great way to learn about the world.
Let me clarify just one thing about preregistered studies. Preregistration is fine, but it helps to have a realistic sense of what might happen. That’s one reason I did not recommend that those ovulation-and-clothing researchers do a preregistered replication. Sure, they could, but given their noise level, it’s doomed to fail (indeed, they did do a replication and it did fail in the sense of not reproducing their original result, and then they salvaged it by discovering an interaction with outdoor temperature). Instead, I usually recommend people work on reliability and validity, that is, on reducing the variance and bias of their measurements. It seems kinda mean to suggest someone do a preregistered replication, if I think they’re probably gonna fail. And, if they do succeed, it’s likely to be a type S error, which is its own sort of bummer.
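To make the type S point concrete, here is a rough sketch of how often a statistically significant estimate has the wrong sign when the true effect is small relative to the noise. The effect size and standard error are made-up numbers for illustration, not taken from any of the studies discussed here, and the estimate is assumed to be normally distributed around the true effect.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def type_s_rate(true_effect, se, z_crit=1.96):
    """P(estimate has the wrong sign | estimate is statistically significant).

    Assumes estimate ~ Normal(true_effect, se) and a nonzero true effect.
    """
    shift = true_effect / se
    p_sig_pos = 1.0 - norm_cdf(z_crit - shift)   # significant and positive
    p_sig_neg = norm_cdf(-z_crit - shift)        # significant and negative
    wrong = p_sig_neg if true_effect > 0 else p_sig_pos
    return wrong / (p_sig_pos + p_sig_neg)

# Hypothetical inputs: a true effect one tenth the size of the standard error.
noisy = type_s_rate(true_effect=0.1, se=1.0)
# Hypothetical inputs: a true effect three standard errors from zero.
clean = type_s_rate(true_effect=3.0, se=1.0)
print(f"type S rate, noisy design: {noisy:.3f}")
print(f"type S rate, clean design: {clean:.2e}")
```

Under these assumed inputs, a bit more than a third of the significant results in the noisy design point the wrong way, while the wrong-sign rate in the cleaner design is negligible. That is the sense in which reducing measurement noise matters more than the preregistration itself.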
I guess what I’m saying is:
– Short-term, a preregistered replication is a clean way to shoot down a lot of forking-paths-type studies.
– Medium-term, I’m hoping (and maybe EJ and his collaborators are, too) that the prospect of preregistered replication will cause researchers to moderate their claims and think twice about publishing and promoting the exciting statistically-significant patterns that show up.
– Long term, maybe people will do more careful experiments in the first place. Or, when people do want to trawl through data to find interesting patterns (not that there’s anything wrong with that, I do it all the time), that they will use multilevel models and do partial pooling to get more conservative, less excitable inference.
Mon: “A strong anvil need not fear the hammer”
Tues: Best Disclaimer Ever
Wed: These celebrity photos are incredible: Type S errors in use!
Thurs: Selection bias, or, some things are better off left unsaid
Fri: John Yoo blogging
Sat: You won’t be able to forget this one: Alleged data manipulation in NIH-funded Alzheimer’s study
Sun: Should I be upset that the NYT credulously reviewed a book promoting iffy science?
Michael Smith writes:
I have a research challenge and I was hoping you could spare a minute of your time. I hope it isn’t a bother—I first came across you when I saw your post on how psychology researchers can learn from statisticians. I figure even if you don’t know the answer to this question, you might know someone who would. My colleagues and I want to explore implicit biases using the trolley problem as the mechanism for discovering these biases. The problem we have is we have very specific needs for our survey software. I don’t know enough about what is out there to make an informed decision. What we need is a salient timer on the screen, counting down from 5 seconds. There will be two images, one on the left and one on the right. In one condition, when those 5 seconds are up, we want the software to read “Dead” on one of those images, the default being the people in the image on the right will “die”. In the other condition, we want the software to read “Saved” on one of those images. I’m not sure we can have the effect we want without that added touch. Thanks in advance for any ideas you have!
My reply: I have no idea, but I’m sure this is possible. Maybe one of our readers has a suggestion?
From John Lardner:
A young ex-paratrooper visited Ebbets Field, Brooklyn, one day, and addressed some language, as ball fans will, to Mr. Leo Durocher, the Brooklyn manager, himself the most polite and clean-tongued gentleman in the national pastime when his mouth is shut, which is a hypothetical situation.
I should really stop here because this is perfection, but the continuation isn’t bad either:
After the game the fan was beaten up with a blackjack and hospitalized by two men whom he identified as Mr. Durocher and a house cop. He must have been confused, because Mr. Durocher and the house cop say they didn’t do it.
It has been argued that female-named hurricanes are deadlier because people do not take them seriously. However, this conclusion is based on a questionable statistical analysis of a narrowly defined data set. The reported relationship is not robust in that it is not confirmed by a straightforward analysis of more inclusive data or different data.
Hurricanes; Data grubbing; Sexism
Ha ha. I’m just bummed he didn’t use the term “himmicane,” which I think treats this topic with the seriousness it deserves.
To the best of my knowledge, the authors of that joke paper, and the editors at the journal that published it (PPNAS), have refused to acknowledge the flaws in the paper, nor have they apologized for wasting all of our time with it. I probably should not be surprised by this never-back-down attitude. But I don’t have to like it.
Seth Green writes:
I thought you might enjoy this update from the STATA team:
. . . suppose we wish to know the effect on employment status of a job training program. Further suppose that motivation affects employment status and motivation affects participation. We do not observe motivation. We have an endogeneity problem.
Stata 14’s new eteffects eliminates the confounding effects of unobserved variables and allows us to test for endogeneity. In return, you must model both the treatment and the outcome.
Well ok then! Glad we can all retire!
I was shocked. I already emailed the support staff with a quote from Judea Pearl about how the correctness of the model is, even in principle, unverifiable. Whom do you think they hire to write these updates?
To be fair, if you have 2 natural experiments you should be able to estimate 2 separate causal effects and then get what you want. The trouble is with any implication that this can be automatically done from observational data. “You must model,” sure, but a statistical model without some real-world identification won’t get you far!
To which Green responded:
I wish that that was what they were claiming. In the example on the page, however, “eteffects” models “wages as being determined by job tenure and age” and “college attainment by age and the number of parents who attended college.” So the actual implementation is “independent conditional on observables.” The post then gives a test of “the correlation between the unobservables that affect treatment and outcome. If these correlations are zero, we have no endogeneity.” The test detects endogeneity, the model was correct because it was simulated data, and therefore endogeneity has been addressed (!).
The deeper I peer in the less meaning there is.
All I can say is, what an amazing accomplishment. Whoever came up with it is the most extraordinary collection of talent, of human knowledge, that has ever been gathered in the field of statistics, with the possible exception of when Stephen Wolfram dined alone.
A cognitive scientist writes:
You’ll be interested to see a comment from one of my students, who’s trying to follow all your advice:
It’s hard to see all this bullshit in top journals, while I see that if I do things right, it takes a long time, and I don’t have the beautiful results these top journals want, even that I did a ton of experiments…
It’s an interesting situation, we either have to essentially fake our results or be doomed to taking back-seats in the scientific debates because our papers won’t come out in top journals because they don’t have crystal clear results. It doesn’t matter to me any more, but it will matter to young people starting out.
Almost every data set I have reanalyzed from other people’s published work has not panned out; in all cases, there was p-value hacking and forking paths of one sort or another.
Indeed, if people can get published in a top journal by conducting a two-month exercise involving a Mechanical Turk survey, a burst of data analysis, and some well-placed theory, then why conduct a long research project involving careful data collection?
The answer has to be that you have to have a deeper motivation. Your aim has to be to do the best possible work. If you get published in Psychological Science, fine—indeed, a lot of excellent work does get published in these journals—but publication in those journals can’t be your goal.
My colleague continues:
If you have any suggestions on how these people starting out their careers can move forward without doing all this unethical stuff with their data, it’d be a big step forward.
To me, the hard part about doing things right is not the analysis, it’s the data collection. When it comes to design and analyses of studies, I recommend moving to a paradigm in which researchers seek to push their models hard and find problems with their theories, rather than the currently standard approach in which researchers try to find confirming evidence for their theories by rejecting straw-man null hypotheses.
This post is by Phil.
The “Affordable Care Act” a.k.a. “Obamacare” was passed in 2010, with its various pieces coming into play over the following few years. One of those pieces is penalties for hospitals that see high readmission rates. The theory here, or at least one of the theories here, was that hospitals could reduce readmission rates if they wanted to, but they didn’t have a strong incentive to do so, and indeed there was a moral hazard: if a hospital sends a patient home for good, they’re done collecting money from them, but if the patient has to come back for more treatment…cha-ching.
I have to admit I didn’t think this was going to be a big deal. I know doctors, I’ve seen doctors, some of my good friends are doctors, and I know they’re not scheming to make more money by providing bad treatment so the patients have to come back for more.
But…well, check out this plot, from the Department of Health and Human Services. The plot does us all a disservice by not starting its y-axis at 0, but still…wow. If the data are real and the plot is real, this is pretty stunning: a 20% reduction (or a 3.5 percentage point reduction) in readmissions for “HRRP”, and a similar scale reduction in all other readmissions.
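The two ways of stating the reduction can be reconciled with a line of arithmetic. This is only a back-of-the-envelope check, assuming the 20% and the 3.5 percentage points describe the same series; the baseline it produces is implied by those two figures, not read off the plot.

```python
# Figures quoted in the post for the same drop in readmissions.
abs_drop_points = 3.5   # percentage points
rel_drop = 0.20         # relative reduction

# relative = absolute / baseline, so baseline = absolute / relative
implied_baseline = abs_drop_points / rel_drop
implied_after = implied_baseline - abs_drop_points

print(f"implied starting rate: about {implied_baseline:.1f}%")
print(f"implied ending rate:   about {implied_after:.1f}%")
```

So the two descriptions are consistent if readmission rates started somewhere around 17.5% and ended around 14%.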
My first thought was that the hospitals are gaming the system somehow by readmitting patients but not reporting them, but I’m not the first person to suggest this and supposedly “The new research shows that this isn’t the case. The number of observation stays are very small compared to readmissions and have increased steadily since at least 2008, with no acceleration after the Affordable Care Act was enacted.”
Of course, there are other possibilities, like maybe the hospitals are refusing to readmit patients even if they really should, or maybe they put them off a bit and readmit them on day 31 instead of day 28 or 29 or 30. But something like this would make their fatality numbers go up, and I assume someone tracks those.
One of the interesting things about this is that you really don’t need a statistician: the signal is so clear that the only questions are related to the definitions of things like “readmission.” A sixth-grader can look at the numbers and come up with a good estimate of the effect of the law on readmissions.
Paul Alper sends in this news article by Ryan Foley:
The former security chief for a national group that operates state lotteries personally bought two prize-winning tickets in Kansas worth $44,000, investigators said Monday, bringing to five the number of states where he may have fixed games to enrich himself and associates.
Investigators recently linked the winning 2010 Kansas tickets to Eddie Tipton, former security director of the Multi-State Lottery Association . . . In his job at the association managing lotteries for 37 states and territories, Tipton managed random number generators that pick winning numbers for some national games . . . Since Tipton’s conviction, Iowa prosecutors have charged Tipton with ongoing criminal conduct and money laundering for allegedly fixing jackpots valued at $8 million in Colorado, Wisconsin and Oklahoma. . . .
Tipton’s attorney, Dean Stowers, laughed out loud when told of the latest allegations against his client. . . . Stowers said that arguing the games were rigged was a risky strategy for state lotteries.
“If that’s their claim, what is their obligation to the players? Obviously they were running games that weren’t legitimate and collected all this money from people and spent it,” he said.
Indeed, lotteries are evil even when they’re not rigged. See Clotfelter and Cook.
Stephen Senn writes, “Bayesians (quite rightly so according to the theory) have every right to disagree with each other.”
He could also add, “Non-Bayesians (quite rightly so according to the theory) have every right to disagree with each other.”
Non-Bayesian statistics, like Bayesian statistics, uses models (or, if you prefer, methods). Different researchers will use different models (methods) and thus, quite rightly so according to the theory, have every right to disagree with each other.
Maurits Van Wagenberg writes:
Coming from the traditional side, started to use Bayes, quickly limiting it to models with less variables, notwithstanding the lure. Am not in academics but have for many years researched design processes of complex objects such as engineering complex process plants. These processes have a lead-time from 12 to 18 months.
Aim was to check on development of variables that could indicate derailment of process. Felt comfortable in using posterior to update new prior, a week later, especially two months into the process.
This winter, was asked to look into a new group of design projects where my concept failed. My previous body of knowledge was limited as were empirical data at hand.
Started to look at your approach (in your Bayesian Data Analysis 3rd and presentations, incl. your French presentations).
My question is: could I find more time-series related examples?
Any suggestions? Our forthcoming Bayesian Econometrics in Stan book should have a few such examples, although this person’s application area seems a bit different. I know that some of you out there work on engineering problems so maybe you have some thoughts for him.
Byron Gajewski writes in with a good question:
My seven year old daughter asked us “what is a Republican?” We struggled. Do you have a working definition? Democrat too?
There are different answers to this one. Simplest is party registration (that is public record), or party identification (which is a survey response). It’s kinda like being a Christian: On one hand, you are a Democrat if you say you are, similarly to how you are a Christian if that’s how you identify. On the other hand, others might not accept your identification. For example, Mormons consider themselves Christian, but I think most Christian churches don’t consider Mormons to be a Christian denomination. Similarly, Jeb Bush etc have argued about whether Donald Trump can really be considered a Republican. And remember what Ronald Reagan said: “I didn’t leave the Democratic party, the Democratic Party left me.”
P.S. Some commenters didn’t like my answer! Here’s what Gajewski wrote to me a few months ago in response:
This is what I was thinking too, kind of a fluid definition of republican/democrat throughout history. I might give examples of democrat and republican presidents to my daughter to help her understand. Reagan and Truman will be where I begin, probably. We have the Truman library in town here and I took her to that which may help build an answer to her fundamental question.
Mon: What is a Republican?
Tues: “Bayesians (quite rightly so according to the theory) . . .”
Wed: Lottery is evil
Thurs: Gresham’s Law of experimental methods
Fri: In the biggest advance in applied mathematics since the most recent theorem that Stephen Wolfram paid for . . .
Sat: Himmicanes and hurricanes update
Sun: What they’re saying about “blended learning”: “Perhaps the most reasonable explanation is that no one watched the video or did the textbook reading . . .”
And, the week after:
“A strong anvil need not fear the hammer”
Best Disclaimer Ever
These celebrity photos are incredible: Type S errors in use!
Selection bias, or, some things are better off left unsaid
John Yoo blogging
You won’t be able to forget this one: Alleged data manipulation in NIH-funded Alzheimer’s study
Should I be upset that the NYT credulously reviewed a book promoting iffy science?
Allen and Michael pointed us on the Stan list to these amusing documents by Oliver Keyes:
Rbitrary Standards: “This is an alternate FAQ for R. Specifically, it’s an FAQ that tries to answer all the questions about R’s weird standards, formatting and persnicketiness that you’re afraid to ask.”
Rick Desper writes:
I face some tough career choices.
I have a background in mathematical modeling (got my Ph.D. in math from Rutgers back in the late ’90s) and spent several years working in the field of bioinformatics/computational biology (its name varies from place to place). I’ve worked on problems in modeling cancer progression and also on (mathematically similar) problems in phylogeny estimation. But after several years of doing that, I decided that I don’t really have the passion for biology to spend the rest of my life as a full-time researcher there.
Basically I found myself more and more interested in issues of public policy, economics, and the like. But most of my contacts from computational biology have little idea about what’s going on in the social sciences.
I was wondering if you had any advice regarding resources or people to talk to regarding a potential mid-career switch. I’m currently in the DC area – I moved here several years ago for a job at the NIH, though currently I have a different job for a different HHS agency. I would prefer to stay local, but that’s not an absolute – it’s just that I figure somebody in this city must be doing this kind of work.
I’m not sure what to say here. I do think there’s a recognized need for mathematical modelers in social sciences. Yes, there are a lot of economists and even some political scientists and sociologists floating around, but each of these groups tends to have its own data-analytic tools they focus on, so I’m pretty sure that someone with experience in mathematical modeling in biology could have something useful to add.
My guess is that Rick’s way into the policy world will be by way of his existing biology expertise. There must be groups and organizations working on biology-related policy; this could be a pharmaceutical company or a government regulatory agency, for example. Health care policy is huge, and I’m guessing it’s dominated by economists, but there could be more technical areas, for example allocating resources within cancer research or performing cost-benefit analysis of some biology projects, where a model of the biology itself could be relevant. Then, once you get involved in policy modeling in a specific context, maybe you can move on.
But maybe I’m missing some other ideas. Do you readers have any suggestions?
Andy Solow writes:
I have a question about Bayesian statistics. Why is it wrong to use the same data to formulate the prior and to update it to the posterior? I am having a hard time coming up with – or finding in the literature – a formal reason.
I asked him to elaborate and he wrote:
Here’s an example. Independent observations X1, X2, …, Xn from the uniform distribution on (0, U). Interested in the posterior distribution of U. Likelihood is 1/U**n for U > xmax, 0 otherwise. Exponential prior for U with parameter log 2/xmax so that prob(U < xmax) = 1/2. Apart from lacking justification, the prior depends on the data.
I replied: Yes, in general you don’t want the prior to depend on the data. When a prior is expressed as depending on the data, we think of it as an approximation to a prior distribution that depends on some unknown parameter that is estimated from the data. So the idea is that there is a coherent model that is being approximated for convenience.
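Solow’s example is easy to check numerically. The sketch below evaluates his posterior on a grid: likelihood 1/U**n for U > xmax, times an exponential prior whose rate log 2/xmax is set from the data. The simulated sample, the true upper bound of 10, the sample size of 20, and the grid truncation at 20 times xmax are all assumptions made for illustration.

```python
import math
import random

random.seed(1)
true_U = 10.0                                          # hypothetical true upper bound
xs = [random.uniform(0, true_U) for _ in range(20)]    # simulated observations

n = len(xs)
xmax = max(xs)
lam = math.log(2) / xmax   # prior rate set from the data, so prior P(U < xmax) = 1/2

# Grid over (xmax, 20 * xmax); the upper truncation is a numerical convenience.
grid_size = 20000
step = (20 * xmax - xmax) / grid_size
us = [xmax + i * step for i in range(grid_size + 1)]

# Log of the unnormalized posterior: -n*log(U) from the likelihood, -lam*U from the prior.
log_post = [-n * math.log(u) - lam * u for u in us]
peak = max(log_post)
weights = [math.exp(v - peak) for v in log_post]

def trapezoid(vals, h):
    """Trapezoid rule on an evenly spaced grid."""
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

norm = trapezoid(weights, step)
dens = [w / norm for w in weights]
post_mean = trapezoid([u * d for u, d in zip(us, dens)], step)

print(f"xmax = {xmax:.3f}, prior P(U < xmax) = {1 - math.exp(-lam * xmax):.3f}")
print(f"posterior mean of U = {post_mean:.3f}")
```

The computation runs without complaint, which is part of the point: nothing in the arithmetic flags that the prior was built from the same data. The trouble is in the interpretation, since half of the prior mass sits on values of U that the data have already ruled out.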
For another example, consider regression models with standardized predictors, as discussed in my 2008 paper, “Scaling regression inputs by dividing by two standard deviations.” If you rescale predictors based on the data, then put informative priors on their coefficients (as we recommend in our other 2008 paper), then you have a prior that depends on data.
I’d like to think of this as an approximation to a prior that depends on a hyperparameter—the population sd of the predictor in question—which in turn is estimated from data using a simple point estimate. Thus, an approximation to a hierarchical model.
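Here is a minimal sketch of the rescaling step, to show where the data dependence enters. The predictor values are hypothetical, and centering at the sample mean before dividing is an assumption of this sketch.

```python
import statistics

def rescale_two_sd(x):
    """Center a predictor and divide by two sample standard deviations."""
    center = statistics.fmean(x)
    sd_hat = statistics.stdev(x)   # point estimate standing in for the population sd
    return [(v - center) / (2 * sd_hat) for v in x]

ages = [23, 35, 41, 52, 58, 64, 70]   # hypothetical predictor values
z = rescale_two_sd(ages)

# After rescaling, the predictor has mean 0 and sd 0.5, so a prior on its
# coefficient is implicitly stated in units of 2 * sd_hat of the original variable.
print(f"mean = {statistics.fmean(z):.3f}, sd = {statistics.stdev(z):.3f}")
```

The estimate `sd_hat` is the simple point estimate referred to above: a prior on the rescaled coefficient is a prior on the original coefficient that depends on the data through that one number.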
Now that Stan is available it would make sense to set up the full hierarchical model, partly for practical purposes when sample sizes are small, partly because the full model will be easier to extend if we want to go to other scalings, and partly as a demonstration so that if we do use the approximate approach, we’ll know what we’re approximating.
Maybe we can put this in our Bayesian econometrics text. I’m thinking of an intro chapter, or maybe an appendix, with all the basic regression models with different priors.
Mike Hughes writes:
I have been looking at your blog entries from about 8 years ago in which you comment on the number of groups that is appropriate in multilevel regression. I have a research problem in which I have 6 groups and would like to use multilevel regression.
Here is the situation. I have racial attitudes data from samples of University of Alabama students drawn in 1963, 1966, 1969, 1972, 1982, and 1988. (I also have data from 2013, but I am not including them in this analysis).
I have run a number of analyses predicting social distance attitudes in which the level-2 group variable is year. The main predictor is endorsement of racial stereotypes. I am interested in whether the association between endorsement of racial stereotypes and social distance declines over time (it does), so I specify a random slope for racial stereotypes, and I look at the cross level interaction between the racial stereotypes coefficient and year, included in the fixed part of the equation as number of years since 1963 (1963=0, 1966=3, etc.). I use xtmixed in Stata14.
The models run, and the likelihood ratio test vs the linear model is significant. AIC and BIC indicate that the xtmixed model is better than the OLS model that I run in Stata. Also, the OLS model predicts values of the dependent variable that are beyond its range (1 to 5), but the xtmixed model does not.
So my question is, is it ok to present my multi-level model, with 6 groups, rather than the OLS model? If so, is there something I can cite to provide some backing for using the model with 6 groups?
My reply: I agree that it makes sense to include a linear predictor for time and also allow intercepts and slopes to vary by discrete survey (so that you have 6 values for each coefficient). And once you do this, there’s no problem using the model to extrapolate. Whether that’s a good idea is another story.
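To see what “6 values for each coefficient” looks like, here is a crude two-stage stand-in: fit a separate least-squares slope within each survey, then regress those six slopes on years since 1963. This is a no-pooling sketch, not the multilevel model itself (which would partially pool the six slopes toward the trend line), and the data are simulated with made-up parameters rather than taken from the Alabama surveys.

```python
import random

random.seed(0)
years_since_1963 = [0, 3, 6, 9, 19, 25]   # the six survey waves named in the question

def ols_slope(x, y):
    """Least-squares slope of y on x with an intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

slopes = []
for t in years_since_1963:
    true_slope = 0.8 - 0.02 * t            # made-up declining association
    stereotype = [random.gauss(0, 1) for _ in range(200)]
    distance = [3.0 + true_slope * s + random.gauss(0, 0.5) for s in stereotype]
    slopes.append(ols_slope(stereotype, distance))

# Second stage: how does the within-survey slope change per year?
trend = ols_slope(years_since_1963, slopes)
for t, b in zip(years_since_1963, slopes):
    print(f"year {1963 + t}: slope = {b:.3f}")
print(f"change in slope per year = {trend:.4f}")
```

With only six points in the second stage, the trend is pinned down mostly by the spacing of the survey years, which is one reason to be cautious about extrapolating beyond 1988.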