“Dear Major Textbook Publisher”: A Rant

Dear Major Academic Publisher,

You just sent me, unsolicited, an introductory statistics textbook that is 800 pages and weighs about 5 pounds. It’s the 3rd edition of a book by someone I’ve never heard of. That’s fine—a newcomer can write a good book. The real problem is that the book is crap. It’s just the usual conventional intro stat stuff. The book even has a table of the normal distribution on the inside cover! How retro is that?

The book is bad in so many ways, I don’t really feel like going into it. There’s nothing interesting here at all, the examples are uniformly fake, and I really can’t imagine this is a good way to teach this material to anybody. None of it makes sense, and a lot of the advice is out-and-out bad (for example, a table saying that a p-value between 0.05 and 0.10 is “moderate evidence” and that a p-value between 0.10 and 0.15 is “slight evidence”). This is not at all the worst thing I saw; I’m just mentioning it here to give a sense of the book’s horrible mixture of ignorance and sloppiness.

I could go on and on. But, again, I don’t want to do so.

I can’t blame the author, who, I’m sure, has no idea what he is doing in any case. It would be as if someone hired me to write a book about, ummm, I dunno, football. Or maybe rugby would be an even better analogy, since I don’t even know the rules to that one.

Who do I blame, then? I blame you, the publisher.

You bastards.

Out of some goal of making a buck, you inflict this pile of crap on students, charging them $200—that’s right, the list price is just about two hundred dollars—for the privilege of ingesting some material that is both boring and false.

And, the worst thing is, this isn’t even your only introductory statistics book! You publish others that are better than this one. I guess you figure there’s a market for anything. It’s free money, right?

And then you go the extra environment-destroying step of printing a copy just for me and mailing it over here, just so that I can throw it out.

Please do me a favor. Shut your business down and go into something more productive to the world. For example, you could run a three-card monte game on the street somewhere. Three-card monte, that’s still a thing, right?

81 Comments

  1. John Hall says:

    I’m thanking you on behalf of all students who have had crappy introductory statistics textbooks.

  2. Rahul says:

    >>>I can’t blame the author,<<<

    Why not? Did the publisher go and hire a rugby player to write a book on Stat?!

    I mean sure the publishers are evil but these crappy, ignorant authors are equally complicit in producing these cesspools. Please don't absolve the author.

  3. Jonathan says:

    I hated intro to statistics in college and don’t remember a thing from it. Twenty years later, I was able to learn a more practical approach and am in love. So, I can’t agree with this post enough, on the substance alone.

  4. A.P. Salverda says:

    Is the first author A.F.?

  5. Dave says:

    Andrew – So this one is bad, but which introductory stats books do you rate? My students read your blog and they will be asking me…

    • Andrew says:

      Dave:

      I like Dick DeVeaux’s book. Actually, I’m not thrilled with the material it covers—it’s the standard stuff, p-values and all those other things I don’t like—but, conditional on a book covering the standard material, I think DeVeaux does a good job. And I don’t remember seeing any conceptual errors in it, of the sort that were in the book discussed in the above post.

      • Rob says:

        Andrew – Is there an intro stats book that you recommend that covers the right material? Thanks for your great blog.

        • Willem says:

          A 2016 version could start with the books by Allen Downey: first some programming in Think Python, then an application to Bayesian inference in Think Bayes. That can easily fit in a semester course. Easy, accessible, and it outright skips all the stuff we learn only to forget.

          The linear model and then quickly more advanced stuff can be introduced later on, when the student has a feeling for working-with-data and working-with-inferences. It’s all about inference nowadays in both stat and machine learning.

          • Mark says:

            I cannot speak to ‘Think Bayes’ or ‘Think Stats’ (I have them both but haven’t gotten around to them yet) but ‘Think Python’ seems to me very much like the book you are critiquing. Just look at the first dozen or so chapters:

            Variables, Expressions, Statements
            Functions
            Conditionals & Recursion
            Iteration
            Strings
            Lists
            Dictionaries

            It is a fine book as it is, but very much a ‘traditional’ walk through a language. It is not much different from my ancient copy of “Programming and Problem Solving in Pascal”. How many programming texts have been written in the same way, with the same progression, entire chapters devoted to “types”, then “statements”, then “functions”, etc.? Most of them!

            Guttag’s “Introduction to Computation and Programming Using Python” is much better. It uses the language only as a tool to teach computational thinking, and as such brings in language elements only as needed. Also, very affordable.

            • Willem says:

              On the one hand: you got me there :) There might be a pretty strong preference bias going on.

              On the other hand: you learn a language differently than you learn ideas. Old stat is filled with (old) ideas in the language of algebra.

              My suggestion is learning ideas of statistics by practicing the language of programming. That starts with learning to program. Stat 101 presumes algebra as well. I only made it explicit.

              I dare say things like GLM or boosting are more easily understood as algorithms than in algebra. Again, that might be personal preference.

              • Mark says:

                I think I see what you mean. Speaking from a student’s perspective, I like the Guttag text for how quickly the student is doing something meaningful. By about page 30, the student is already working with code for simple bisection search, Newton-Raphson, etc. The text is focused on ‘computational thinking’ combined with stats, data analytics, machine learning. Highly recommended … but then I am not a prof so YMMV!

              • Keith O'Rourke says:

                > Old stat is filled with (old) ideas in the language of algebra … learning ideas of statistics by practicing the language of programming
                Agree. When I proposed, at a JSM panel on teaching (2014 or 15), a shift in balance from statistical concepts to programming in the intro course, most of the panel and many in the audience pushed back on that.

                I think it’s starting to happen now.

      • Dick De Veaux says:

        Thanks Andrew. I both appreciate your comments — and agree with the criticism of the material. We’ve been trying to pull the Intro Stats market into the 21st (20th?) century with our books, so in each new edition, instead of just rearranging the deck chairs, we’ve actually tried to steer it toward more relevant stuff. We’re working on the 5th edition right now, which I think is getting closer. But progress is slow! There’s a lot of inertia out there.

        • Brad Stiritz says:

          Hi Professor De Veaux, I just wanted to give an actual student testimonial for your book (I used the AP “Stats” version).

          Overall, I think your textbook is outstanding. I appreciated the obvious effort you and your co-authors made to carefully think through the narrative flow. I also enjoyed how much personality and good sense of humor comes through! It really makes a huge difference in reader engagement.

          The only caveat I would offer : please consider rethinking and rewriting the chapter “More About Tests and Intervals”, in which you cover topics that have been so controversial here on Andrew’s blog : p-values, NHST, Type I/II errors, etc. My novice view is that students definitely need to be exposed to all these concepts. They can certainly be misused, but also have important, real-world, practical relevance.

          I think this chapter might be helpfully expanded, and even split into two. Some of the material feels way too compressed and, frankly, a bit rushed. Given that you’re covering NHST, I feel there should be a chapter title which directly conveys that, rather than lumping it into a vaguely titled hodgepodge chapter, as at present.

          FWIW, I’ve gotten into long debates here on the blog with other readers, who insist that NHST is completely unfounded, misleading, and harmful to the student, and that only Bayesian inference should be taught. Perhaps your next edition could take that bull (so to speak ;) more fully by the horns, please? It’s very confusing to hear so forcefully, that some of the foundational material you teach is considered by some to be fundamentally wrong.

          Thank you for your consideration, respectfully submitted.

        • Andrea says:

          Dick,

          what is the difference between “Intro Stats” and “Stats: Data and Models”? They both seem introductory statistics books: are they different in terms of intended audience?

      • Martha (Smith) says:

        I agree — DeVeaux’s text is the best of the intro textbooks I have looked at. (But some people don’t like it — in some cases, precisely for the reasons that make it stand out above the competition.)

      • Joel says:

        Which one are we talking about: the “Intro Stats” or the “Stats: Data and Models”?
        Joel (student)

      • Andrea says:

        Andrew, are you referring to the “Intro Stats” book or to the “Stats: Data and Models” book?

  6. Sean Mackinnon says:

    I had a book seller come to my office, and try to sell me an intro stats book for my 2nd year psychology students. It was ok, but was $300 CAN a pop. I said there was no way I could ask my students to buy something that expensive. Then she said that they could lower the price to like $150 … which is sketchy as hell. If your profit margins are so good you can cut the cost in half and not blink an eye, why the hell is it so expensive to begin with? (btw, that hasn’t stopped them from sending me *5* copies of the book to my mailbox in hopes that I’ll get my 200 students to buy it).

    This year, I decided to go with an open-access textbook: https://www.openintro.org/

    I like it, but I suspect it has a lot of the things you hate about statistics textbooks, because it’s pretty traditional. But at least then you can say you get what you pay for! Notably, they sell print copies of the book which (after our university store markup) were only $11.22 CAN. Really puts into perspective how the textbook companies are gouging poor students.

    • Andrew says:

      Sean:

      Yes, I’ve used that Open intro book too. It’s not bad. Not great, but not bad, and it’s an advantage that it’s free.

      • Sean Mackinnon says:

        Yes, now that I’ve spent a semester with it I’d say that’s a good review: Not great, but not bad. I haven’t found a book that has been better enough to be worth the $100+ extra in costs passed on to each student though, so being free is a big deal for me.

        • Anoneuoid says:

          It teaches NHST though… Unsurprisingly (since that is the *only* use for NHST), the book teaches rejecting precise strawman hypothesis A to accept favorite vague hypothesis B:

          “A drug called sulphinpyrazone was under consideration for use in reducing the death rate in heart attack patients. To determine whether the drug was effective, a set of 1,475 patients were recruited into an experiment and randomly split into two groups: a control group that received a placebo and a treatment group that received the new drug. What would be an appropriate null hypothesis? And the alternative?35

          […]

          We want to evaluate the hypothesis setup from Guided Practice 4.48 using data from the actual study.36

          […]

          Because the p-value is less than the significance level (α = 0.05), we say the null hypothesis is implausible. That is, we reject the null hypothesis in favor of the alternative and conclude that the drug is effective at reducing deaths in heart attack patients.

          35. The skeptic’s perspective is that the drug does not work at reducing deaths in heart attack patients (H0), while the alternative is that the drug does work (HA).

          36. Anturane Reinfarction Trial Research Group. 1980. Sulfinpyrazone in the prevention of sudden death after myocardial infarction. New England Journal of Medicine 302(5):250-256.”
          https://www.openintro.org/stat/textbook.php?stat_book=os

          It is a good idea to follow up on these real life examples found in stats books. I have always found a stark contrast between the conclusion the student is told to draw and the reality. In this case we discover the usual: before studied with NHST no one knew what was going on, after NHST still no one knows.

          1980: “Those who do not remember history are forced to relive it, and the Anturane trial and the forthcoming results from the studies of aspirin and dipyridamole need to be assessed against a backcloth of the past.
          […]
          These unresolved questions have been reiterated to remind us why history must be remembered if it is not to be relived. Confusion and weariness ended the anticoagulant era: we were left with no results on which we could base our practice. This must not be allowed to happen with the three first-heat runners in the new antithrombotic stakes. We must remember that we are testing our hypotheses about causation as well as testing the efficacy of treatment. We must ensure that trials of dipyridamole, aspirin, and sulphinpyrazone, both current and planned, are brought to a firm conclusion, and we must insist that the results are presented with every scrap of evidence.””
          https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1601166/

          1980: “Our review of this clinical trial of sulfinpyrazone indicates that the cause-of-death classification and all conclusions based on it are unreliable”
          https://www.ncbi.nlm.nih.gov/pubmed/7432418

          1980: “On the basis of their selective audit of the case records of about half the 163 deaths that occurred during the study, they believe that the assignment of the cause of death (“sudden death,” “acute myocardial infarction,” or “other cardiac”) frequently failed to conform to criteria set forth at the outset of the study, and that the criteria themselves were ambiguous and illogical. Furthermore, say the FDA team, even the claimed effect on total cardiac mortality is not convincing because it depends heavily on after-the-fact exclusion from analysis of certain patients who died while receiving sulfinpyrazone. According to Temple and Pledger, there were insufficient grounds and no clearly stated policy for declaring these patients “ineligible” for analysis. They also say that any plan to exclude patients after entry into a study such as this one is a dubious epidemiologic practice that must be carefully considered and justified in advance.”
          https://www.ncbi.nlm.nih.gov/pubmed/7432408

          1982: “However, the Food and Drug Administration refused to approve the claim by the pharmaceutical company that sulfinpyrazone was effective in preventing sudden death in the first six months after infarction. The FDA took the unusual step of justifying its actions in the pages of the Journal. 7 About half the case records of the deaths included in the study were reanalyzed by the FDA. They concluded that the classification scheme for sudden death, myocardial infarction, and “other cardiac death” (the three categories of causes of death listed in the study) was imprecise and that some classification errors had been made. The FDA also reanalyzed the charts of all patients who died and who had been excluded from the study because they were “ineligible” or “nonanalyzable.” The ineligible patients were those who, although randomized, were ultimately considered not to have met the study criteria. The nonanalyzable patients were those who died within one week after therapy was started or beyond one week after therapy was stopped, or those who died of clearly unrelated causes, such as surgical procedures. The FDA took the position that those exclusions may have biased the study in favor of the sulfinpyrazone-treated group.”
          https://www.ncbi.nlm.nih.gov/pubmed/7038499

          1985: “The use of sulfinpyrazone (15) is difficult to assess because the United States trial suggests a significant decrease in sudden cardiac death and no change in nonfatal myocardial infarction and the Italian trial (16) showed opposite effects. Criticism of the design and conduct of the United States sulfinpyrazone trial has been considerable.”
          https://www.ncbi.nlm.nih.gov/pubmed/3889107

          1993: “The role of sulfinpyrazone remains undefined.”
          https://www.ncbi.nlm.nih.gov/pubmed/8321905

          • Martha (Smith) says:

            “Because the p-value is less than the significance level (α = 0.05), we say the null hypothesis is implausible. That is, we reject the null hypothesis in favor of the alternative and conclude that the drug is effective at reducing deaths in heart attack patients.”

            When teaching hypothesis testing, I make an effort to avoid simplistic language such as that and say things like the following:

            “If we obtain an unusually small p-value, then (at least) one of the
            following must be true:
            I. At least one of the model assumptions is not true (in which
            case the test may be inappropriate).
            II. The null hypothesis is false.
            III. The sample we’ve obtained happens to be one of the small
            percentage (of suitable samples from the same population and
            of the same size as ours) that result in an unusually small p-value.

            Thus, if the p-value is small enough and all the model assumptions
            are met, then rejecting the null hypothesis in favor of the alternate
            hypothesis can be considered a rational decision, based on the
            evidence of the data used.

            However:
            1. How small is “small enough” is a judgment call.
            2. “Rejecting the null hypothesis” does not mean the null
            hypothesis is false or that the alternate hypothesis is true. (Why?)
            3. The alternate hypothesis is not the same as the scientific
            hypothesis being tested.

            For example, the scientific hypothesis might be “This reading
            program increases reading comprehension,” but the statistical
            null and alternate hypotheses would be expressed in terms of
            a specific measure of reading comprehension.
            • Different measures (AKA different outcome variables)
            would give different statistical tests (that is, different
            statistical hypotheses).
            • These different tests of the same research hypothesis
            might lead to different conclusions about the
            effectiveness of the program.”
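
            Martha’s points I–III can be made concrete. Below is a minimal sketch, in Python, of the pooled two-proportion z-test that sits behind the quoted drug example. The death counts are hypothetical, invented purely for illustration (they are not the actual Anturane trial data), and the p-value it prints is only a measure of discrepancy under the stated model assumptions.

```python
# Two-sided p-value for H0: equal death rates in two arms, using the
# standard pooled two-proportion z-test with a normal approximation.
# All counts below are hypothetical, for illustration only.
from math import erf, sqrt

def two_prop_z(deaths_a, n_a, deaths_b, n_b):
    """Return the two-sided p-value for H0: the two death rates are equal."""
    p_a, p_b = deaths_a / n_a, deaths_b / n_b
    p_pool = (deaths_a + deaths_b) / (n_a + n_b)           # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))                # standard normal CDF at |z|
    return 2 * (1 - phi)                                   # two-sided tail probability

# Hypothetical counts: 60/740 deaths on placebo vs. 40/735 on the drug.
p = two_prop_z(60, 740, 40, 735)
print(round(p, 4))
```

            Even when such a p-value falls below 0.05, points I and III above still apply: the rejection is a decision under the model, not a demonstration that the drug works.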

            • Andrew says:

              Martha:

              I’ve come to the conclusion that it’s not enough to avoid simplistic language. I think it’s necessary to directly address the problem by stating the simplistic version, explaining that’s what most people do, and then explaining why it’s wrong.

            • Anoneuoid says:

              The problem here seems to be that the null hypothesis was false, and rightly detected to be false, but factors other than the drug could plausibly account for the deviation. The correct conclusion to draw from the rejected null hypothesis here is that “mean mortality rates of the treatment and placebo groups are different for some reason”. That is it.

              Stats textbook authors can’t get away with that though, too many people would ask what the point of the test is then.

          • Keith O'Rourke says:

            > before studied with NHST no one knew what was going on, after NHST still no one knows.
            Most of this disaster seems to have been the result of poor trial conduct and misreporting.

            I am not aware of a statistical approach that can adequately deal with this especially given only regulatory agencies can get access to data and records to learn about these real problems.

            • Anoneuoid says:

              >”I am not aware of a statistical approach that can adequately deal with this…”

              Their “hypothesis” that “the drug works” is too vague. If instead they had a model about how it worked and deduced that “the mortality rate should drop by 15% for this dose” (or more likely, predicted some functional relationship between dose, platelet levels, and mortality), then messing up the study isn’t likely to give you results consistent with the model. That is not the case when half the possible results are consistent with your theory (eg “drug leads to lower mortality rate”).

              In other words, these clinical trials that just look for the existence of “an effect” are poorly designed to begin with. The problems encountered here are inherent to that design and the goal it is based on, and will not be solved by stats. So why do stats textbooks claim they can tell you whether “the drug works” from such studies? That is either a lie or result of confusion.
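
              As a toy illustration of this vague-versus-precise point (a sketch with invented numbers, not real trial data): simulate many trials in which the drug does nothing, and count how often the result happens to be consistent with the directional claim “drug leads to lower mortality” versus a precise prediction of roughly a 15% relative drop.

```python
# Under a null of "no effect", both arms are drawn from the same
# distribution. Roughly half of such trials still show a death rate
# in the "right" direction, while fewer land near a precise
# quantitative prediction (here, a hypothetical 10-20% relative drop).
import random

random.seed(1)
n, base_rate, sims = 700, 0.08, 4_000
directional = precise = 0
for _ in range(sims):
    d_ctrl = sum(random.random() < base_rate for _ in range(n))  # control deaths
    d_trt = sum(random.random() < base_rate for _ in range(n))   # treatment deaths
    if d_trt < d_ctrl:                          # consistent with "drug lowers mortality"
        directional += 1
    if d_ctrl > 0 and 0.10 < 1 - d_trt / d_ctrl < 0.20:  # near a predicted 15% drop
        precise += 1

print(directional / sims)   # roughly half
print(precise / sims)       # noticeably smaller
```

              A vague directional hypothesis is therefore easy to “confirm” by noise or bias alone; a sharp quantitative prediction is not.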

              • Martha (Smith) says:

                +1
                I’d say it’s not usually a deliberate lie, but confusion — with the result being de facto a lie.

        • Ian Fellows says:

          The nice thing is that OpenIntro is, well, open. I’m sure that they would love some additional points of view on how best to present the material.

      • Jan says:

        Did you ever try the “Randomization and Simulation” version? I am going to give it a try this Winter semester.

        • Andrew says:

          Jan:

          That might be the one we used. It was OK. I liked it a lot at first look, but then when we actually had to put together the course I realized there was a lot that we needed that wasn’t there. My collaborators and I wanted to then write our own book but that takes a lot of work. In retrospect maybe we should’ve just whipped something out and gotten it out there. At some point I plan to get back to this project.

        • elin says:

          I feel it is somewhat half-hearted. As though they did the first one, then people said they wanted a simulation-based book because that’s what people recommend (and is in the Common Core), but it did not cause them to really rethink. I don’t blame them; it is a lot of work to put together a book like that, and I think only one of the authors is an academic. That said, anyone can take the files and make their own version … I looked at that idea but it’s still a huge amount of time.

      • elin says:

        I want to like OpenIntro so much, but I find aspects of that book really frustrating; it’s as if they didn’t think at all about how intro students learn, not to mention the graphs. I do think the book isn’t sure who its audience is.

  7. allan says:

    ..quite a chore ahead of you if you’re gonna take on the eternally corrupt textbook-publisher-academia mafia. The body count of student victims is in the hundreds of millions.

  8. John Goodwin says:

    To balance this, let’s hear from people who *like* statistics how they first encountered the subject in book form. I was given a copy of Bevington’s book on Error Analysis (how to write your own Fortran programs as a scientist), and later discovered Box and Hunter’s Experimental Design book in the library, somewhat randomly.

    That’s when I started to care.

  9. Dale Lehman says:

    The problem is larger than statistics. I was trained as an economist – and I detest most economics texts. Fictional data made up to make a political statement; superficial “analysis” aimed to convince people that government is bad and markets are good (a conclusion that I often agree with, but prefer to reach that view after reasoned analysis and not contrived misleading simplified propaganda). As a non-statistician who has been teaching statistics for many years, I have always found most statistics books to be poor – small unrealistic examples where rich real data should be used; rote mechanical exercises designed to make teaching and grading mindless and easy rather than stimulating examples that require students and instructors to think carefully and humbly.

    So, my point is that the fault lies squarely in academia. These texts exist and survive because we are not doing our jobs teaching. I don’t blame the publishers (nor do they earn any respect for catering to our worst habits). I do blame the authors as they are the epitome of what I think is the worst of our teaching. But mostly I blame academics who are willing to use books that cater to teaching mechanics rather than thinking; who care more about grading as a means to ensure students are studying rather than grading as a means to provide meaningful feedback to further develop students’ minds. And I blame academics who defend themselves with the belief that they are overworked and underpaid, so this is what you should expect to get.

  10. Bill B says:

    Normally, I put books like that on my shelf so that I can sell them for beer money next time one of those shady used textbook buyers knocks on my office door. But in this case, that might be unethical, since getting one more lower-priced copy into circulation might induce some naive student to buy it who might otherwise decide to get by on class notes alone.

  11. Jack says:

    The publishers do not force teachers to use this book.

    • Andrew says:

      Jack:

      No, but they’re knowingly flooding the market with crap. Or, perhaps I should say, flooding the market with material that, as far as they can tell, is crap. And then ripping off students. It’s a classic bust-out maneuver, just like in Goodfellas: the market for textbooks is collapsing so they’re trying to squeeze out every dollar they can, before the whole system crumbles.

      I call that immoral, just as I call three-card monte immoral.

      Nobody’s forcing people to play three-card monte either. But it’s still fraud.

      • Jimmy Two-Times says:

        I’m gonna go get the t-tables get the t-tables.

      • Rahul says:

        Andrew:

        Speaking of flooding the market with crap: Aren’t the Journals doing the same thing? When we incentivize quantity over quality?

        Do we need so many sub-standard, marginal-impact articles flooding the marketplace of ideas?

        Knowing that you so prolifically *review* articles, aren’t you part of the problem there?

        • Andrew says:

          Rahul:

          Huh? I’m part of the problem because I review articles that journal editors send to me? I guess, sure, you could say that if I and other reviewers just said no, the journals would go out of business. So, maybe so. I’ve done my part by always saying no when people ask me to be a journal editor.

      • John Goodwin says:

        tldr: (sqrt(-textbooks))^2

      • Martha (Smith) says:

        “No, but they’re knowingly flooding the market with crap. Or, perhaps I should say, flooding the market with material that, as far as they can tell, is crap.”

        I doubt that the publishers realize that it’s crap.

        • Andrew says:

          Martha:

          I agree; I too doubt that the publishers realize that it’s crap. But they do send books out for review, so my point is that they don’t care if the book is crap. Just like the manufacturers of other “lemons”: they carefully avoid assessing the quality of the product that they’re selling. They don’t care as long as they’re getting their 200 bucks. It’s not like the editor’s or publisher’s reputation is on the line.

          Say what you want about Susan Fiske, at least she attached her name to those PPNAS papers. For better or for worse, she’s standing by himmicanes and the rest: those papers got approved on the benefit of her reputation, and now her reputation is tied in part to those papers that she approved. For the book editors and publishers, though, the reputational link seems very weak—indeed, I didn’t mention the book or publisher names in my above post because it seems like just about every textbook publisher puts out books of this quality. It’s their bread and butter.

    • Anonymous Coward says:

      If the author teaches 500 students a year, and each of them has to pay $200, that’s $100,000/year. If the author’s colleagues and friends also require it, that’s a lot of students getting screwed over. Maybe the publisher can’t force teachers to adopt it, but sometimes factors other than what is best for students enter into the decisions.

  12. Mike Maltz says:

    Like Dale Lehman (above), I was not trained in statistics, but engineering. So when I started teaching I made do with what I thought I was supposed to teach. From one of my recent papers on data visualization:

    “First, a confession. I taught statistics for 30 years, and for most of that time, I stuck pretty close to the topics covered in standard social science and statistics textbooks, explaining correlation, regression, statistical significance, t-tests, ANOVA, etc. In other words, I ended up teaching the same-old, same-old statistical methods (sorry, students), primarily inferential statistics, without stopping to consider how well they filled the students’ needs. Of course, students need to know these methods, since they have to be able to interpret the findings of papers written by researchers who learned, and were applying, the same-old, same-old methods. But they also need to know what assumptions are implicit in these methods; many are based on random sampling, which is often not the case, and on linearity, normality, independence, and other idealizations that are rarely found in real data – which does not stop researchers from applying them.”

    • Martha (Smith) says:

      I was lucky in some sense: I was not trained in statistics, but in mathematics. So when I first started teaching statistics (because I was interested in it and there was a severe shortage of statisticians at my university), I found it really frustrating that the textbooks were so “authoritarian” — i.e., “this is how you do it” with little if any “this is why we do it.” (All the more so because in teaching math I emphasized “explain your reasoning” rather than using algorithms to come up with an answer.)

      Fortunately, there were a couple of statisticians around who were glad to answer my questions and point me to things to fill in gaps in my background. Also, an NSF funded summer workshop for mathematicians-teaching-statistics was helpful in pointing me toward textbooks which were better-than-average.

      • Keith O'Rourke says:

        > math I emphasized “explain your reasoning” rather than using algorithms to come up with an answer
        Interesting.

        I once had a summer student who had just completed their PhD in math and thought they might want to switch to statistics.

        When I tried to get them to explore/grasp the reasoning in a statistical approach (e.g., what do you think the motivation was for this novel method proposed by Efron?), they retorted, “isn’t there just an algorithm you look up somewhere to do that?”

        Whether that was a reflection of how they were taught math or their expectations of what statistics was I am not sure.

        p.s. They went into statistics and successfully published there but the last time I reviewed one of their drafts I raised a fair number of criticisms. The draft was accepted by the journal by the time they got these, so they responded with “I guess I don’t have to respond to these” and they didn’t until they had subsequently published another 2 or 3 papers that only addressed a subset of them.

  13. anon says:

    Mind-boggling that bad and boring textbooks are sold for $200 while something like Cosma Shalizi’s terrific and entertainingly written (seriously, best prose in any textbook I’ve looked at) textbook Advanced Data Analysis from an Elementary Point of View is available for free on his website.

  14. Vladimir says:

    You mentioned OpenStats is okay and Dick DeVeaux’s book is good for the standard stuff, but what textbook should I read to learn the *correct* way for different levels of skill (intro, intermediate, advanced)?

    • Andrew says:

      Vladimir:

      I like my books! As well as others that have been discussed on this blog such as Box/Hunter/Hunter, Lohr, McElreath, . . .

      I can’t really heartily recommend anything at the intro level, though.

    • Ben Goodrich says:

      If you can wait until March 2017, I would recommend Kosuke Imai’s book for the undergrad or master’s level
      http://press.princeton.edu/titles/11025.html
      even though my undergrads did not like it any more than they would have liked a traditional textbook.

      • elin says:

        No table of contents; that’s frustrating. It looks interesting, but it sounds like it (as with most of these books) covers much more than it is possible to meaningfully cover in a one-semester course with beginners. Having taught my last course of the semester tonight, I was thinking about this. First, I’m glad my beginners learn some programming via R, even though that took time. I’m fine with “losing” a few hours to having groups present research results during one class. I’m frustrated that I didn’t cover everything I wanted to cover. But my students are real beginners.

        • Martha (Smith) says:

          Yes, this points out a big problem in teaching intro statistics — it’s just not possible to do a good job (except possibly with really exceptional students) in one course.

      • Keith O'Rourke says:

        Ben:

        This endorsement seems overly hopeful in the extreme for someone at the intro level: “It provides a seamless path from ignorance to insight in a few hundred clear and enlightening pages.” –Gary King, Harvard University

        Do you have a sense of what the students actually learned in their one or two semesters using this book?

    • elin says:

      I do think that Franklin’s book for high school students would be very useful for a first undergraduate course if it didn’t rely on TI calculators instead of statistical software.

  15. Stephen Martin says:

    Just a question. Is there a textbook that teaches introductory statistics from a PURELY Bayesian perspective?
    I love BDA and the doggie book, but it’s not like I could hand those to 0-experience intro-stats undergrads.

    So: is there a book that covers many of the topics that most intro books do, but approaches them from a purely Bayesian perspective, such that I could hand it to an intro student and it would be useful? No NHST whatsoever, no fixes to frequentist problems, no frequentist logic, literally *just* Bayesian modeling?

  16. What about this one? Also free, has a focus on effect sizes, and covers Bayesian stuff as well. The author actually prefers the Bayesian approach but has to cover the NHST stuff because it’s so commonly used (a positive feedback loop for methods).

    http://health.adelaide.edu.au/psychology/ccs/teaching/lsr/

  17. Laws, Andrew. Rugby doesn’t have rules. It has laws of the game. :)

  18. Seriously, Andrew, if I could Kickstarter (or use some other funding system) the production of a full series of educational course materials for the physical and social sciences with real-world Bayesian statistics, for release under Creative Commons licenses, I’d take it on over the next few years and produce a fully Bayesian course on experimentation, modeling, and data analysis, with asides on frequentist perspectives and real-world examples from biology, engineering, physics, economics, epidemiology, sociology, criminology, and various other places… using a combination of physical experimentation, analysis of government datasets, analysis of public datasets, etc. etc.

    The thing is, although there’s a market for textbook publishers to produce that crap book you wrote about, because they get a monopolistic captive market, is there a market for Kickstartering the production of a full-fledged, good, open-access resource?

    To release it Creative Commons at the back end requires getting paid up front for the full cost of production (or at least getting some money up front to fund production, and the rest at release). I estimate maybe $200,000 – $500,000 depending on what’s produced, I’m talking books, worked examples, homework problems, code, downloadable datasets, a USB-key image that boots to a modified Ubuntu with all the tools you could want: R, RStudio, Stan, Emacs, ESS, Maxima, Octave, MySQL, built-in datasets, the whole works.

    I’m not an academic, so my salary isn’t going to come from a University while I write and produce and test the thing over a multi year period in my intro-class etc.

    It doesn’t seem reasonable to expect college students to Kickstarter this kind of thing… after all, the ones who will use it are currently high school students with no knowledge that they even need such a thing, and the ones who wish they had it now will be well past their first couple years of undergrad by the time it’s produced.

    So, you’ve got a big audience here, what do you think? Is there a way to raise the funds needed?

    • Louis says:

      Cool idea, by the way. I’d be happy to contribute pro bono.

    • Rahul says:

      You’d have stiff competition from academics with their time already paid for. Besides, isn’t writing books a big part of what they are already paid for?

      The other point is that full-time book writing (unsubsidized by a primary job) is, from a strictly economic perspective, a hard activity to justify: there’s already tons of stuff on Bayesian statistics out there, some even in the open domain. So first, one would have to be reasonably sure of producing something better than the baseline. Now, I’m not doubting your competence, but it’s a risky proposition for any author. It’s not as if you could know you’d end up with a “War and Peace” the day you commissioned someone to write it for you.

      Then again, even if you are better than all known sources, the question is how much better. If it’s only a marginal improvement over all the stuff out there, would people be willing to fund it?

      Et cetera. I think there’s a reason why book writing has largely been a by-product of the academic enterprise and the itch to write and be known for your work, rather than a full-time, directly paid-for activity.

  19. Louis says:

    I have sent emails to the representatives of the major publishers informing them that if I ever received another unsolicited copy, I would go out of my way never to use one of their books in any of my courses.
    I explained that I do this solely out of environmental considerations.

    So far this “threat” has worked.

    I do not get so upset about receiving yet another intro-to-xxxx book that says exactly the same thing as all the other books, in way too many pages with way too many pictures. What really bothers me is the waste AND the outrageous price.
