Jeff Lax points to this post from Nate Silver and asks for my thoughts.
In his post, Nate talks about data quality issues in national and state polls. It’s a good discussion, but the one thing he unfortunately doesn’t talk about is multilevel regression and poststratification (or see here for more). What you want to do is fit a multilevel regression to your raw data so as to estimate your outcome of interest (for example, support for Hillary Clinton) in demographic/geographic slices of the population, characterized by age, sex, ethnicity, education, state of residence, maybe some other variables, maybe party identification as well. Then you poststratify: take a population-weighted average of those slice-level estimates within each state, using some combination of census and poll data that gives you the number of people in each category within each state.
That’s the way to go.
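To make the two steps concrete, here is a minimal sketch in Python. It is not the model from our papers: in place of a real multilevel regression it uses a crude stand-in that shrinks each cell's raw proportion toward the overall mean, with a made-up `prior_strength` controlling how much. The function names, the two-category demographic variable, and all the numbers are hypothetical, chosen only to show the mechanics.

```python
from collections import defaultdict


def partial_pool(survey, cells, prior_strength=10.0):
    """Crude stand-in for a multilevel model (an assumption of this sketch).

    survey: {cell: (n_yes, n_respondents)} for cells that have survey data.
    cells:  every cell we need an estimate for, including empty ones.
    Each cell's proportion is shrunk toward the overall mean; cells with
    few respondents are shrunk more, and empty cells get the overall mean.
    """
    total_yes = sum(yes for yes, n in survey.values())
    total_n = sum(n for yes, n in survey.values())
    overall = total_yes / total_n
    estimates = {}
    for cell in cells:
        yes, n = survey.get(cell, (0, 0))
        estimates[cell] = (yes + prior_strength * overall) / (n + prior_strength)
    return estimates


def poststratify(cell_estimates, population):
    """Population-weighted average of cell estimates within each state.

    population: {(state, group): number of people in that cell}.
    """
    weighted_sum = defaultdict(float)
    state_total = defaultdict(float)
    for (state, group), n_people in population.items():
        weighted_sum[state] += n_people * cell_estimates[(state, group)]
        state_total[state] += n_people
    return {state: weighted_sum[state] / state_total[state] for state in state_total}


# Hypothetical survey counts: (supporters, respondents) per (state, age group).
# Note that ("TX", "old") has no respondents at all.
survey = {
    ("NY", "young"): (8, 10),
    ("NY", "old"): (3, 10),
    ("TX", "young"): (5, 10),
}

# Hypothetical census-style counts of people in each cell.
population = {
    ("NY", "young"): 300,
    ("NY", "old"): 700,
    ("TX", "young"): 500,
    ("TX", "old"): 500,
}

cell_estimates = partial_pool(survey, population.keys())
estimates = poststratify(cell_estimates, population)

for state, value in sorted(estimates.items()):
    print(f"{state}: {value:.3f}")
```

The point to notice is that the Texas estimate exists even though one of its cells has no respondents, and that the New York estimate is weighted by who lives there rather than by who happened to answer the poll. A real analysis would replace `partial_pool` with a fitted multilevel regression using the predictors listed above.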
See here for further discussion, particularly on the subject of state-level opinion.
And, hey! We wrote this paper in 2005 on state-level opinion from national surveys.
And here’s the original paper, “Poststratification into many categories using hierarchical logistic regression.” It was published in 1997 in the journal Survey Methodology.
It takes a long, long time for research to make its way from academia to journalism. Slowly but surely, though, it will happen.