**Postdoctoral research opportunity: Columbia University, Departments of Epidemiology and Statistics**

Supervisors: Ezra Susser (epidemiology) and Andrew Gelman (statistics)

We have an NIH-funded postdoctoral position (1 or 2 years) available for what is essentially statistical research applied to some important problems in psychiatric epidemiology. One project on which we are working is the Jerusalem Perinatal Study of Schizophrenia, a birth cohort of about 90,000 (born 1966-1974) followed for schizophrenia in adulthood. Another project is a California birth cohort study of schizophrenia: a cohort of 20,000 collected in 1959-1966, for which we have ascertained/diagnosed 71 cases of schizophrenia spectrum disorders. The data set already exists and has produced several important findings. The statistical methods involve fitting and understanding multilevel models; see below. The position can also involve some teaching in the Statistics Department if desired.

**Statistical Project 1: Tools for understanding and display of regressions and multilevel models**

Modern statistical packages allow us to fit ever-more-complicated models, but there is a lag in the ability of applied researchers (and of statisticians!) to understand these models and check their fit to data. We are in the midst of developing several tools for summarizing regressions, generalized linear models, and multilevel models. These tools include graphical summaries of predictive comparisons, numerical summaries of average predictive comparisons, measures of explained variance (R-squared) and partial pooling, and analysis of variance. To move this work to the next stage we need to program the methods for general use (writing them as packages in the popular open-source statistical language R) and further develop them in the context of ongoing applied research projects.
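To give a rough sense of what a numerical summary of an average predictive comparison can look like, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the method or the R package described in this post: it assumes an already-fitted logistic regression with one input of interest `u` and one other input `v`, and the function names and coefficient values are hypothetical.

```python
import math

def inv_logit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(u, v, coefs):
    """Predicted probability from a hypothetical fitted logistic regression
    with intercept b0, coefficient bu on the input of interest u,
    and coefficient bv on the other input v."""
    b0, bu, bv = coefs
    return inv_logit(b0 + bu * u + bv * v)

def average_predictive_difference(v_values, coefs, u_lo=0.0, u_hi=1.0):
    """Average change in predicted probability per unit change in u,
    moving u from u_lo to u_hi while holding each observed v fixed,
    then averaging over the observed values of v."""
    diffs = [predict(u_hi, v, coefs) - predict(u_lo, v, coefs) for v in v_values]
    return sum(diffs) / len(diffs) / (u_hi - u_lo)

# Hypothetical coefficients and observed values of the other input.
coefs = (-1.0, 0.8, 0.5)
v_observed = [-2.0, -0.5, 0.0, 0.7, 1.5]
apc = average_predictive_difference(v_observed, coefs)
print(round(apc, 3))
```

The point of the summary is that it is on the probability scale, so it can be read directly as an expected difference in outcome, which a raw logistic regression coefficient cannot.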

**Statistical Project 2: Deep interactions in multilevel regression**

In regressions and generalized linear models, factors with large effects commonly have large interactions. But in a multilevel context in which factors can have many levels, this can imply many, many potential interaction coefficients. How can these be estimated in a stable manner? We are exploring a doubly-hierarchical Bayes approach, in which the first level of the hierarchy is the usual units-within-groups (for example, patients within hospitals) in which coefficients are partially pooled, and the second level is a hierarchical model of the variance components (so that the different amounts of partial pooling are themselves modeled). The goal is to be able to include a large number of predictors and interactions without the worry that lack-of-statistical-significance will make the estimates too noisy to be useful. We plan to develop these methods in the context of ongoing applied research projects.
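As a small illustration of the first level only, the following Python sketch shows the standard precision-weighted form of partial pooling for group means in a simple normal model. It assumes the variance components `sigma_y` (within groups) and `sigma_alpha` (between groups) and the overall mean `mu` are known, which is exactly the part the doubly-hierarchical approach would model rather than fix; the function name and numbers are hypothetical.

```python
def partial_pool(group_means, group_sizes, sigma_y, sigma_alpha, mu):
    """Partially pooled estimates of group means in a normal multilevel model.

    Each estimate is a precision-weighted average of the group's own mean
    (weight n / sigma_y^2) and the overall mean mu (weight 1 / sigma_alpha^2).
    The variance components are treated as known here, an assumption made
    only to keep the sketch short.
    """
    pooled = []
    for ybar, n in zip(group_means, group_sizes):
        w_data = n / sigma_y ** 2          # precision of the group's own mean
        w_prior = 1.0 / sigma_alpha ** 2   # precision of the group-level distribution
        pooled.append((w_data * ybar + w_prior * mu) / (w_data + w_prior))
    return pooled

# Two hypothetical hospitals with the same raw mean but very different sample sizes.
estimates = partial_pool(
    group_means=[2.0, 2.0],
    group_sizes=[1, 100],
    sigma_y=1.0,
    sigma_alpha=1.0,
    mu=0.0,
)
print(estimates)
```

The small group is pulled much more strongly toward the overall mean than the large one. A smaller `sigma_alpha` would increase the pooling for every group, which is why modeling the variance components themselves controls how much the interaction coefficients are shrunk.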

**If you are interested . . .**

Please send a letter to Prof. Andrew Gelman (Dept of Statistics, Columbia University, New York, N.Y. 10027, gelman@stat.columbia.edu), along with c.v., copies of any relevant papers of yours, and three letters of recommendation.

Dear Prof. Gelman:

Is this postdoctoral position still open?

Regards, D Roy