Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs. In the meantime, I keep posting the stuff they send me, as part of my desperate effort to empty my inbox.

**1.** From Lewis:

“Should Students Assessed as Needing Remedial Mathematics Take College-Level Quantitative Courses Instead? A Randomized Controlled Trial,” by A. W. Logue, Mari Watanabe-Rose, and Daniel Douglas, which begins:

Many college students never take, or do not pass, required remedial mathematics courses theorized to increase college-level performance. Some colleges and states are therefore instituting policies allowing students to take college-level courses without first taking remedial courses. However, no experiments have compared the effectiveness of these approaches, and other data are mixed. We randomly assigned 907 students to (a) remedial elementary algebra, (b) that course with workshops, or (c) college-level statistics with workshops (corequisite remediation). Students assigned to statistics passed at a rate 16 percentage points higher than those assigned to algebra (p < .001), and subsequently accumulated more credits. A majority of enrolled statistics students passed. Policies allowing students to take college-level instead of remedial quantitative courses can increase student success.

I like the idea of teaching statistics instead of boring algebra. That said, I think if algebra were taught well, it would be as useful as statistics. I think the most important parts of statistics are not the probabilistic parts so much as the quantitative reasoning. You can use algebra to solve lots of problems. For example, this age adjustment story is just a bunch of algebra. Algebra + data. But there’s no reason algebra has to be data-free, right?

Meanwhile, intro stat can be all about p-values, and then I hate it.

So what I’d really like to see is good intro quantitative classes. Call it algebra or call it real-world math or call it statistics or call it data science, I don’t really care.

**2.** Also from Lewis:

“Less Is More: Psychologists Can Learn More by Studying Fewer People,” by Matthew Normand, who writes:

Psychology has been embroiled in a professional crisis as of late. . . . one problem has received little or no attention: the reliance on between-subjects research designs. The reliance on group comparisons is arguably the most fundamental problem at hand . . .

But there is an alternative. Single-case designs involve the intensive study of individual subjects using repeated measures of performance, with each subject exposed to the independent variable(s) and each subject serving as their own control. . . .

Normand talks about “single-case designs,” which we also call “within-subject designs.” (Here we’re using experimental jargon in which the people participating in a study are called “subjects.”) Whatever terminology is being used, I agree with Normand. This is something Eric Loken and I have talked about a lot, that many of the horrible Psychological Science-style papers we’ve discussed use between-subject designs to study within-subject phenomena.

A notorious example was that study of ovulation and clothing, which posited hormonally correlated sartorial changes within each woman over the month, but estimated these using a purely between-person design, with only a single observation per woman in the survey.

Why use between-subject designs for studying within-subject phenomena? I see a bunch of reasons. In no particular order:

1. The between-subject design is easier, both for the experimenter and for any participant in the study. You just perform one measurement per person. No need to ask people a question twice, or follow them up, or ask them to keep a diary.

2. Analysis is simpler for the between-subject design. No need to worry about longitudinal data analysis or within-subject correlation or anything like that.

3. Concerns about poisoning the well. Ask the same question twice and you might be concerned that people are remembering their earlier responses. This can be an issue, and it’s worth testing for such possibilities and doing your measurements in a way to limit these concerns. But it should not be the deciding factor. Better a within-subject study with some measurement issues than a between-subject study that’s basically pure noise.

4. The confirmation fallacy. Lots of researchers think that if they’ve rejected a null hypothesis at a 5% level with some data, then they’ve proved the truth of their preferred alternative hypothesis. Statistically significant, so case closed, is the thinking. Then all concerns about measurements get swept aside: After all, who cares if the measurements are noisy, if you got significance? Such reasoning is wrong wrong wrong, but lots of people don’t understand this.
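The tradeoff in point 3 can be seen in a minimal simulation (the numbers here are made up for illustration): a small within-person effect sits under large, stable person-to-person differences. With one measurement per person, the between-subject comparison is close to pure noise; measuring each person twice lets the stable baseline cancel out.

```python
# Hypothetical numbers: a small treatment effect (0.5) buried under large
# stable person-to-person variation (SD 3) plus measurement noise (SD 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, effect, person_sd, noise_sd, reps = 50, 0.5, 3.0, 1.0, 2000
power_between = power_within = 0

for _ in range(reps):
    # Between-subject: one measurement per person, two separate groups.
    control = rng.normal(0, person_sd, n) + rng.normal(0, noise_sd, n)
    treated = rng.normal(0, person_sd, n) + effect + rng.normal(0, noise_sd, n)
    power_between += stats.ttest_ind(treated, control).pvalue < 0.05

    # Within-subject: the same people measured twice; each person's stable
    # baseline cancels out of the paired comparison.
    baseline = rng.normal(0, person_sd, n)
    before = baseline + rng.normal(0, noise_sd, n)
    after = baseline + effect + rng.normal(0, noise_sd, n)
    power_within += stats.ttest_rel(after, before).pvalue < 0.05

print(f"between-subject power: {power_between / reps:.2f}")
print(f"within-subject power:  {power_within / reps:.2f}")
```

With these invented numbers the within-subject design detects the effect far more often than the between-subject design, despite using exactly the same people.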

Also relevant to this reduce-N-and-instead-learn-more-from-each-individual-person’s-trajectory perspective is this conversation I had with Seth about ten years ago.

Andrew: “Less Is More: Psychologists Can Learn More by Studying Fewer People”

This title reminds me of Meehl’s paper “Theory-testing in Psychology and Physics”:

“Because physical theories typically predict numerical values, an improvement in experimental precision reduces the tolerance range and hence increases corroborability. In most psychological research, improved power of a statistical design leads to a prior probability approaching 1/2 of finding a significant difference in the theoretically predicted direction. Hence the corroboration yielded by “success” is very weak, and becomes weaker with increased precision. “Statistical significance” plays a logical role in psychology precisely the reverse of its role in physics.”

The subject is not quite the same, yet the underlying notion of more study leading to worse results is there in each work.

The exercise for the reader is to draw a connection between the two topics.
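Meehl’s asymmetry can be illustrated with a toy simulation (my sketch; the “crud” numbers are invented): suppose every null hypothesis is slightly false, with the sign of the true effect unrelated to the theory under test. Increasing precision then drives the significance rate toward 1, while the direction comes out as predicted only about half the time.

```python
# Toy model of Meehl's "crud factor": every effect is tiny but nonzero,
# with a sign unrelated to the theory's directional prediction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
theories, crud = 1000, 0.05  # tiny true effects of random sign

results = {}
for n in (50, 500, 5000):
    sig = in_predicted_direction = 0
    for _ in range(theories):
        true_effect = crud * rng.choice([-1.0, 1.0])
        sample = rng.normal(true_effect, 1.0, n)
        test = stats.ttest_1samp(sample, 0.0)
        if test.pvalue < 0.05:
            sig += 1
            # each theory predicts a positive effect
            in_predicted_direction += test.statistic > 0
    results[n] = (sig / theories, in_predicted_direction / max(sig, 1))
    print(f"n={n}: P(significant)={results[n][0]:.2f}, "
          f"predicted direction among significant={results[n][1]:.2f}")
```

So a “successful” directional prediction corroborates the theory less and less as precision grows, the reverse of the physics case Meehl describes.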

Shameless plug – skip if you like.

I’ve (co)written Common Sense Mathematics, for college students who aren’t going on in science and so shouldn’t redo high school algebra even when it’s called College Algebra.

The thrust is real problems (the text and the exercises all start with direct quotes from newspapers and the web) used to think about estimation, percentage change, elementary descriptive statistics (teaching Excel), regression and regression nonsense, linear and exponential growth, credit card debt and mortgages, and probability.

The last chapter on the prosecutor’s fallacy and dealing with false positives quotes this blog (with Andrew’s permission) in several places. (Use contingency tables and natural frequencies, not Bayes’ Theorem).

http://www.commonsensemathematics.net/

http://www.maa.org/press/books/common-sense-mathematics
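The natural-frequencies approach that last chapter recommends can be sketched in a few lines (the prevalence and error rates here are invented for illustration, not taken from the book):

```python
# Invented numbers: a rare condition plus an imperfect test, counted out
# over a concrete population instead of via conditional probabilities.
population = 10_000
prevalence = 0.001          # 10 in 10,000 actually have the condition
sensitivity = 0.99          # positives among the sick
false_positive_rate = 0.05  # positives among the healthy

sick = population * prevalence
healthy = population - sick
true_positives = sick * sensitivity              # ~10 people
false_positives = healthy * false_positive_rate  # ~500 people

# Of everyone who tests positive, what fraction is actually sick?
ppv = true_positives / (true_positives + false_positives)
print(f"positive tests: {true_positives + false_positives:.0f}, "
      f"of whom actually sick: {ppv:.1%}")
```

Counting people makes the punch line concrete: roughly 500 of the 510 or so positives are false, so with these numbers a positive test means under a 2% chance of actually having the condition.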

Sounds like the kind of thing we need more of.

Ethan:

From your first link, I agree this is a common (mis)perception of both students and faculty – “a math course like any other — formal skills to master with no thought as to their usefulness or meaning.”

“Students rarely believe us. Many complain half way through that this math course isn’t like any other they’ve taken. Where are the formulas? As instructors we often find that hard to remember. Since we’re mathematicians, we’re tempted to think (semi)formal mathematics is both more useful and more important than it really is.”

One of the many hurdles – “Unfortunately, many quantitative reasoning instructors are underpaid adjuncts stitching together multiple jobs to eke out a living. Perhaps it’s unfair to ask them to spend the extra time it can take to teach from our text.”

The context of University finances, realities and politics is probably a large part of the problem.

Hopefully efforts like yours will help reduce the costs/risks for others.

>”The confirmation fallacy. Lots of researchers think that if they’ve rejected a null hypothesis at a 5% level with some data, that they’ve proved the truth of their preferred alternative hypothesis. Statistically significant, so case closed, is the thinking.”

I’ve been seeing that what researchers want to do is simply stop mentioning p-values, significance, null hypotheses, etc., but keep on with the same problems. For example, in this vaccine trial they briefly mention giving the vaccinees ibuprofen, but the possibility that this accounts for the decrease in diagnoses is then completely ignored (fever is the primary sign of Ebola). It is interesting to see extended discussion and concern about statistical issues like biased samples, but none about the scientific issues with the study: http://www.thelancet.com/journals/lancet/article/PIIS0140-6736(16)32621-6/fulltext

I’m becoming concerned that by complaining about NHST, we are losing simple heuristics to gauge the competence of the researchers, without an improvement in reliability.

I didn’t follow on to the Lancet story, but your broader comment resonates. What will the next fallacies and logic traps be beyond NHST? Does the machinery of NHST induce the poor logic in many scientific papers? Or does poor reasoning find an outlet in NHST type arguments? I guess we will find out in time…

The “From Lewis” and “Also from Lewis” links go to the same place.

fixed; thanks.

Hi Andrew,

Re the issue of between-subjects versus within-subject design and analysis in psychology, see Molenaar, P. C. M. (2004). A manifesto on psychology as idiographic science: Bringing the person back into psychology, this time forever. Measurement: Interdisciplinary Research and Perspectives, 2, 201–218, with comments by Tuerlinckx, von Eye, Thum, Rogosa, Nesselroade, Curran & Wirth, a discussion by Molenaar, and a separate reply to Rogosa.

I think the best way to teach is by case, and for math you need statistical and mechanical examples and the like. For example: trig is hated, but children on boats learned it because they had to figure out their latitude. We’ve abstracted away that basic reason and instead think that teaching using symbols is faster. Yes, symbols are faster for a small subset of people, including a subset of mathematicians and physicists, etc., but not the entire set of really, really good people in those fields. But if we set the group to “nearly every person in school,” then we need to teach them by example, and I don’t mean those shit examples we get in those awful books. Those seem to me to fall into two categories: those which begin from childishly simple notions and then shift into “so by lemma 6.b.2, we see the factors are coterminous” without anything in between, as though they’re writing for either 1st graders or grad students, and those which are simply incapable of explaining using a meaningful set of examples but instead have problem sets of equations, as though someone trying to learn can tell what the equations mean by reading symbols, and as though all the translation into concepts, which could be transmitted better with good examples, is that learner’s own bleeping problem.

As I like to say, imagine you are given some stuff and told to make a glove. Or you’re dumped in a cold shack and you have to cut wood to heat it. What kind of lousy bleep glove are you going to make without training, without people showing you how to make a glove? And you may manage to cut some wood, but you also might kill yourself unless someone has shown you how to do it properly – and how to maintain your tools, etc.

So my position is that we teach badly because a) we don’t teach using the natural method by which people learn but instead rely on substituted abstractions as though they adequately convey the stories that numbers and symbols actually tell and b) we don’t even try to come up with examples that matter. (Your papers about examples are an obvious exception.) You’d think that after decades teachers could manage to share examples of how they see this or that or how they learned this or that … except they do, but only in lower grades. It’s as though we forget how much freaking learning is actually going on in young kids! We act as though they’re given simple stuff when it’s not the simplicity of the material but the amount they have to put together to make sense of the world, to read and write and even to walk and talk at all. So we abandon the method we use when we’re doing the most important and difficult learning, meaning that learning which occurs when the roadmap is in our heads, when we have to traverse the great divide between meaning and abstraction, between distance and effect, between objects here and gone and permanence versus impermanence.

You should watch the MIT OpenCourseWare course by the author of Street-Fighting Mathematics.

A bit off-topic but related to Andrew’s comment that he hates first year statistics courses that spend too much time with p-values.

Earlier this year I taught computer lab sessions on basic statistics to first-year medicine students. Amusingly, in the past the 2nd-year students always told the first-years not to bother taking the course: “It’s only worth 5% of your overall mark, the assignment is kind of hard, and who needs statistics anyway?” After a few years it seems the message filtered down that the 3rd- and 4th-year work actually included a fair bit of interpreting medical studies, so maybe the statistics course was worth doing after all.

Anyway, we were given 4 weeks with one hour of lecture and one hour in the lab each week.

The labs were

Week 1- Hey this is Excel, can you make a new variable? How about a scatter plot? Calculate the average height of all the males.

Week 2- Let’s create a pivot table. Please don’t put continuous variables in as a factor- now you have 100 rows. How about you put something simple like gender in instead?

Week 3- How to do a t-test in Excel. Don’t worry if this doesn’t really make sense now- it probably never will.

Week 4- Let’s brainstorm about running a state wide study looking at a new medical treatment.

The lectures covered similar material but also looked at making inferences, sampling from populations, random allocation to treatment groups, correlations, statistical power, and observational vs. controlled experiments. I don’t think they got up to ANOVA.

My question is: given that the reason we are teaching young med students any statistics is the hope that they will one day be able to interpret a medical study (the majority of which are going to have p-values everywhere) and have some ability to pick good studies from poor ones, does it still make sense to spend 1/4 of the lessons talking about t-tests and p-values? What would be more pressing subject matter with only 8 hours to play with? Do we have a duty to explain concepts like the garden of forking paths to all prospective doctors?

I think it can and should be done. I have worked out a 5×90-minute course sequence in which I walk students through many of the basic issues of frequentist statistics. I teach it every few years at the European Summer School of Logic, Language and Information. I think this course works now. If you have only eight hours, I would focus on the ideas and forget about the labs.

BTW, next year, there is a Bayesian course at ESSLLI in Toulouse, taught by Mark Andrews (who is an outstanding teacher).

The answer to your last question is an emphatic “Yes!”

First, I think that p-values should continue being taught to students. I know, I know. Many readers of this blog vehemently oppose the framework of hypothesis testing and would prefer a Bayesian approach to scientific discovery. Yet, aspiring doctors will almost certainly come across research couched within the frequentist framework of hypothesis testing, p-values, and the like. So you would be doing medical students a disservice to ignore these topics.

Yet, p-values can and should be discussed in the context of what they really tell us about medical evidence. And here the emphasis should be put on other concepts that are more important than just p<.05 = truth. Not only do p-values treat uncertainty as a binary decision (rather than a continuum), but they mask more important considerations like measurement error, effect sizes, sample representativeness, etc. (it's actually quite a long list!).

Second, I think it's most important to teach students how to evaluate good research. That is, how doctors can become astute consumers (rather than producers) of empirical evidence (and statistics). Thus, students would be better served to learn about causal inference and uncertainty, among other things. I could go on…

+1 to Todd’s comments.

“Yet, aspiring XXX will almost certainly come across research couched within the Eleusinian mysteries of sulphur steam breathing… So you would be doing medical students a disservice to ignore these topics.”

No. It’s a disservice to give some semblance of pretending that this stuff is science.

Huge swaths of medical research are just holding coconut husks to your head and waving flags and waiting for the planes to land.

On second reading, I’d modify Todd’s sentence “Yet, p-values can and should be discussed in the context of what they really tell us about medical evidence,” to read “Yet, p-values can and should be discussed in the context of what they do NOT tell us about medical evidence.”

Martha: Yes. I originally had written “what they do (and do not) tell us about medical evidence” but edited it out. I agree with you that it should be put back in.

Daniel: I see your point. And that’s what I was trying to make the focus of my comment — that it’s better to train doctors to evaluate evidence than to perform a few statistical tests. I completely agree that just because a study has a significant finding (p<.05) doesn't mean it provides strong evidence. But it also doesn't mean that the study is meaningless. There's a lot that goes into evaluating what the available data tell us, and I would hope that doctors would be trained to properly weigh these considerations (so that they can make informed decisions and help their patients).

All good suggestions- I don’t design the course, but I might get to one day if I keep running the labs. I think they also do a separate course looking at medical research and reviewing some of the literature, but that is run by someone else in the medical school. They also all get the option to take an extra year of study and get a double degree in medicine and medical research.

I suspect that there is a bit of a political problem: the school just wants someone from the Maths department to come teach “statistics,” not to mess with the students’ heads by telling them that the way everyone sitting upstairs does research is probably flawed.

I just went to a noise-chasing middle-aged GP who doesn’t understand what a positive test result means (or doesn’t) when the false positive rate exceeds the incidence rate in the general population.

I’ve looked at that study a few times because it’s causing CUNY to basically rethink all of the math requirements, and all the campuses will be forced to offer a statistics class that does not require students to have college algebra.

However, there is a basic problem with the framing of the study. What is it that they are doing in those statistics classes that these students are passing? My impression is that they are basically teaching statistics as an arithmetic class. It’s not all about p-values (I’m thinking that’s a couple of weeks at the end, maybe) but about using formulas to calculate the mean and percents, defining the median as “the middle number” in a list of numbers that is no longer than half a line. Students may learn summation notation and permutations and combinations, and they may or may not use a fancy calculator, but definitely not a computer or Excel; they are doing a tremendous amount of calculation by hand. They learn to use a normal table from the back of the book to look up probabilities. So … I would actually call this largely a variation on college algebra with a statistics wrapper. Not a bad thing, probably more useful than factoring polynomials. And what they are finding is that, sad but true, you really don’t need a lot of math to succeed in a lot of majors, and the majors that need stats will offer their own courses.

This is math 5 (the EA) http://fsw01.bcc.cuny.edu/mathdepartment/Courses/Math/MTH05/math05.htm

This is the stats

http://www.bcc.cuny.edu/MathematicsComputerScience/math23.1.htm

I’m not saying that it is a bad thing to learn college algebra in the applied context of this kind of statistics course. I think it is a good thing, especially done with intentionality. If students have never mastered slope and intercept, how can they know anything about regression? I just don’t buy it. Are they passing? Sure, they can plug and chug the homework problems. Are they learning the same amount as in EA? Probably. But it really seems like a sleight of hand to say that EA is remedial and the basic stats course is not.