A client tried to stiff me for $5000. I got my money, but should I do something?

This post is by Phil Price, not Andrew.

A few months ago I finished a small consulting contract — it would have been less than three weeks if I had worked on it full time — and I find it has given me some things to think about, concerning statistical modeling (no surprise there) but also ethics. There’s no particular reason anyone would be interested in hearing me ramble on about what was involved in the job itself, but I’m going to do that anyway for a few paragraphs. Maybe it will be of interest to others who are considering going into consulting. If you are here for the ethical question then you can skip the next several paragraphs; pick up the story at the line of XXXX, far below.

Continue reading

ChatGPT4 writes Stan code so I don’t have to.

Several months ago I (Phil Price) wrote a Stan model to do some time series forecasting. It took me almost a full day to get it running and debugged. Today I decided to test ChatGPT4 by seeing if it could write a Stan model to give me the same functionality. I have been using ChatGPT to help me with coding, but in the past I’ve used it for little stuff, such as “I have a data frame with column names that all start with mm, and some of the names have ( and ) in them; write a function to remove all of that”. I have just started using it to write entire programs for me, and it looks like it is going to be enormously useful. Below, I present a verbatim transcript of the start of a ChatGPT4 session. At the end I got tired of cutting and pasting — and you would have gotten tired of reading it — but I’ll tell you what happened. And at the very end, I’ll summarize my thoughts on the ChatGPT4 programming experience thus far.


Phil: I have monthly time series data. I want to fit a model to the data (typically from the past five to eight years) and use the model to forecast future values. Let’s call the data y[t]. Here are some things I want to include:
1. Seasonality. If the value in March is high in one year it is probably also high in the following year, and so on for other months. So I want ‘month effects’.
2. Autocorrelation in month effects. The month effect, for a given month of the year, is not iid random. If the March effect is high for several years and then drops a lot, it is more likely to stay low than to jump back up.
3. Local trend. If the value has increased for the past several months it will probably keep increasing, although it might not.
4. Soft constraints. The value of y[t] is very unlikely to fall below a value y_low, or to go above y_high. I don’t want to impose the constraints after the fact. Instead, I want to repel the forecasts from these limits. I want a fairly short-range repulsion, so the forecasts aren’t affected unless they get close to the limits.

Please write a Stan program to do this, and include comments that show where each of those four requirements is implemented.

ChatGPT4: Here’s a Stan program that incorporates seasonality, autocorrelation in month effects, local trend, and soft constraints:

Continue reading
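
To give a flavor of what a model meeting those four requirements can look like, here is a minimal sketch of my own (not ChatGPT4’s actual output), written as a Stan program wrapped in Python via cmdstanpy. The priors, the assumption that y is roughly standardized, and the 0.1 scale of the repulsion term are all placeholders.

```python
# A minimal sketch (not ChatGPT4's program): the four requirements, in Stan,
# wrapped in Python via cmdstanpy. Priors and the repulsion scale (0.1) are
# placeholder assumptions, and y is assumed to be roughly standardized.
from pathlib import Path
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=1> N;                          // number of observed months
  vector[N] y;                             // the monthly series
  array[N] int<lower=1, upper=12> month;   // calendar month of each observation
  int<lower=1> Y;                          // number of years spanned
  array[N] int<lower=1, upper=Y> year;     // year index (1, 2, ...) of each observation
  real y_low;                              // soft lower limit    (requirement 4)
  real y_high;                             // soft upper limit    (requirement 4)
}
parameters {
  matrix[Y, 12] m;                         // month effects, by year   (requirements 1 and 2)
  vector[N] level;                         // local level              (requirement 3)
  vector[N] trend;                         // local slope              (requirement 3)
  real<lower=0> sigma_m;                   // year-to-year drift of month effects
  real<lower=0> sigma_level;
  real<lower=0> sigma_trend;
  real<lower=0> sigma_obs;
}
model {
  // (1) Seasonality: each calendar month gets its own effect.
  m[1] ~ normal(0, 1);
  // (2) Autocorrelation in month effects: each year's effects are a random walk
  //     step away from the previous year's, so a persistent change tends to persist.
  for (k in 2:Y)
    m[k] ~ normal(m[k - 1], sigma_m);

  // (3) Local trend: level and slope both evolve smoothly over time.
  level[1] ~ normal(y[1], 1);
  trend[1] ~ normal(0, 0.5);
  for (t in 2:N) {
    trend[t] ~ normal(trend[t - 1], sigma_trend);
    level[t] ~ normal(level[t - 1] + trend[t - 1], sigma_level);
  }

  sigma_m ~ normal(0, 0.5);
  sigma_level ~ normal(0, 0.5);
  sigma_trend ~ normal(0, 0.1);
  sigma_obs ~ normal(0, 0.5);

  for (t in 1:N) {
    real mu = level[t] + m[year[t], month[t]];
    y[t] ~ normal(mu, sigma_obs);
    // (4) Soft constraints: a short-range repulsion that only matters when the
    //     fitted value gets close to y_low or y_high (0.1 sets the range).
    target += -exp(-(mu - y_low) / 0.1);
    target += -exp(-(y_high - mu) / 0.1);
  }
  // Forecasts could be produced in a generated quantities block, or by treating
  // future months as parameters so the repulsion applies to them too.
}
"""

Path("seasonal_trend.stan").write_text(stan_code)
model = CmdStanModel(stan_file="seasonal_trend.stan")
# fit = model.sample(data=dict(N=..., y=..., month=..., Y=..., year=..., y_low=..., y_high=...))
```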

The cleantech job market: Every modeler is supposed to be a great Python programmer.

This post is by Phil Price, not Andrew.

I’ve had a run of luck ever since I left my staff scientist position at Lawrence Berkeley Laboratory to become a freelance consultant doing statistical modeling and forecasting, mostly related to electricity consumption and prices: every time I finished a contract, another one would fall into my lap. A lot of work came my way through my de facto partner Sam, my friend Clay brought me into a project, every now and then my friend Aeneas has had something that he needs for his company, and I’ve had a couple of clients who found me just by having heard about me, without any personal connection.

One lesson is: even in today’s world, with LinkedIn and websites and blogs and other ways of making ourselves known to the world, personal contacts matter a lot in getting consulting work. Or at least that has been the case for me. That’s been good for me because I’ve had good contacts, but it’s not necessarily good for society. If you’re younger and don’t have a lot of work experience, and you don’t have many friends doing the same sort of work you’re doing, you won’t have the advantages I’ve had.

So, for seven years everything was great. But this year has not gone so perfectly: I’m down to two clients at the moment, and one of them only needs a little bit of work from me each month. I’m looking for work but, never having had to do it before, I don’t really know how. But one thing I know is that people use LinkedIn to look for jobs and for people to fill those jobs, so I updated my long-moribund LinkedIn profile and clicked a few buttons to indicate that I’m looking for work. Several recruiters have contacted me about specific jobs, and I’ve also been looking through the job listings, looking for either more consulting work or for a permanent job.

Three things really stand out. Here’s the TLDR version:
1. There’s a lot of demand for time series forecasting of electricity consumption and prices.
2. The modeler has to write the production code to implement the model.
3. It’s gotta be Python.

That’s pretty much it for factual content in this post, but then I have some thoughts about why one aspect of this doesn’t make much sense to me, so read on if this general topic is of interest to you.


I. Modeling and Forecasting of Electricity Supply, Demand, and Price.

There are quite a few jobs for electricity time series modeling, and for optimization based on that modeling. Some companies want to predict regional electricity demand and/or price and use this to decide when to do things like charge electric vehicles or operate water pumps or do other things that need to be done within a fairly narrow time window but not necessarily right now. And then there are other forecasting and optimization problems like whether to buy a giant battery to use when the electricity price is high, and if so how big, and how do you decide when to use it or recharge it. All of this stuff is right up my alley: I’m good at this and I have lots of relevant experience. To give an example of a job in this space, here’s something from a job description I just looked at (for a company called Geli): “Your primary responsibility will be to lead the development of our time series forecasting models for solar and energy consumption using machine learning techniques, but you will also help develop new forecasting models as various needs arise (eg: prototyping forecasting wholesale prices for a new market).” This is extremely similar to work I have been doing off and on for one of my clients for the past eighteen months or so. Sounds great.

And there’s a bullet list for that same job listing:
* Feature engineering
* Prototyping new algorithms
* Benchmarking performance across various load profiles
* Integrating new forecasting algorithms into our production code base with robust test coverage
* Collaborate with the rest of the team to assess how forecasts can be adjusted for various economic objectives.
* Proactively identify opportunities within [our company] that can benefit from data science analysis and present those findings.
* Work collaboratively in a diverse environment. We commit to reaching better decisions by respecting opinions and working through disagreements.
* Gain in depth experience in an exciting industry as you work with storage sizing, energy financial models, energy tariffs, storage controls & monitoring.

Continue reading

Time Series Forecasting: futile but necessary. An example using electricity prices.

This post is by Phil Price, not Andrew.

I have a client company that owns refrigerated warehouses around the world. A refrigerated warehouse is a Costco-sized building that is kept very cold; 0 F is a common setting in the U.S. (I should ask what they do in Europe). As you might expect, they have an enormous electric bill — the company as a whole spends around a billion dollars per year on electricity — so they are very interested in the cost of electricity. One decision they have to make is: how much electricity, if any, should they purchase in advance? The alternative to purchasing in advance is paying the “real-time” electricity price. On average, if you buy in advance you pay a premium…but you greatly reduce the risk of something crazy happening. What do I mean by ‘crazy’? Take a look at the figure below. This is the monthly-average price per Megawatt-hour (MWh) for electricity purchased during the peak period (weekday afternoons and evenings) in the area around Houston, Texas. That big spike in February 2021 is an ice storm that froze a bunch of wind turbines and also froze gas pipelines — and brought down some transmission lines, I think — thus leading to extremely high electricity prices. And this plot understates things, in a way, by averaging over a month: there were a couple of weeks of fairly normal prices that month, and a few days when the price was over $6000/MWh.

Monthly-mean peak-period (weekday afternoon) electricity price, in dollars per megawatt-hour, in Texas.

If you buy a lot of electricity, a month when it costs 20x as much as you expected can wreak havoc with your budget and your profits. One way to avoid that is to buy in advance: a year ahead of time, or even a month ahead of time, you could have bought your February 2021 electricity for only a bit more than electricity typically costs in Texas in February. But events that extreme are very rare — indeed I think this is the most extreme spike on record in the whole country in at least the past thirty years — so maybe it’s not worth paying the premium that would be involved if you buy in advance, month after month and year after year, for all of your facilities in the U.S. and Europe. To decide how much electricity to buy in advance (if any) you need at least a general understanding of quite a few issues: how much electricity do you expect to need next month, or in six months, or in a year; how much will it cost to buy in advance; how much is it likely to cost if you just wait and buy it at the real-time rate; what’s the chance that something crazy will happen, and, if it does, how crazy will the price be; and so on.
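
To make the tradeoff concrete, here is a toy simulation with entirely made-up numbers (these are not the client’s figures): buying forward costs a premium in a typical month, but paying the real-time price exposes you to rare, enormous spikes.

```python
# Toy simulation with made-up numbers (not real client data): compare locking in
# a forward price that carries a premium versus paying a spot price that is
# usually cheaper but occasionally spikes.
import numpy as np

rng = np.random.default_rng(0)
n_months = 100_000                                    # simulated months
load_mwh = 10_000                                     # monthly consumption (made up)

forward_price = 45.0                                  # $/MWh, includes a premium (made up)
typical_spot = rng.normal(40.0, 5.0, n_months)        # ordinary months (made up)
spike = rng.random(n_months) < 0.001                  # rare extreme events (made up)
spot_price = np.where(spike, rng.uniform(1000, 6000, n_months), typical_spot)

forward_cost = forward_price * load_mwh
spot_cost = spot_price * load_mwh

print(f"forward cost, every month:        ${forward_cost:,.0f}")
print(f"mean spot cost:                   ${spot_cost.mean():,.0f}")
print(f"worst simulated month on spot:    ${spot_cost.max():,.0f}")
print(f"P(spot month costs > 2x forward): {np.mean(spot_cost > 2 * forward_cost):.4f}")
```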

Continue reading

Chess cheating: how to detect it (other than catching someone with a shoe phone)

This post is by Phil Price, not Andrew.

Some of you have surely heard about the cheating scandal that has recently rocked the chess world (or perhaps it’s more correct to say the ‘cheating-accusation scandal.’) The whole kerfuffle started when World Champion Magnus Carlsen withdrew from a tournament after losing a game to a guy named Hans Niemann. Carlsen didn’t say at the time why he withdrew; in fact he said “I really prefer not to speak. If I speak, I am in big trouble.” Most people correctly guessed that Carlsen suspected that Niemann had cheated to win. Carlsen later confirmed that suspicion. Perhaps he didn’t say so at the start because he was afraid of being sued for slander.

Carlsen faced Niemann again in a tournament just a week or two after the initial one, and Carlsen resigned on move 2.

In both of those cases, Carlsen and Niemann were playing “over the board” or “OTB”, i.e. sitting across a chess board from each other and moving the pieces by hand. That’s in contrast to “online” chess, in which players compete by moving pieces on a virtual board. Cheating in online chess is very easy: you just run a “chess engine” (a chess-playing program) and enter the moves from your game into the engine as you play, and let it tell you what move to make next. Cheating in OTB chess is not so simple: at high-level tournaments players go through a metal detector before playing and are not allowed to carry a phone or other device. (A chess engine running on a phone can easily beat the best human players. A chess commentator once responded to the claim “my phone can beat the world chess champion” by saying “that’s nothing, my microwave can beat the world chess champion.”). But if the incentives are high enough, some people will take difficult steps in order to win. In at least one tournament it seems that a player was using a chess computer (or perhaps a communication device) concealed in his shoe.

I don’t know if there are specific allegations related to how Niemann might have cheated in OTB games. A shoe device again, which Niemann could use both to enter the moves as they occur and to receive the engine’s suggestions through vibration? A confederate who enters the moves and signals Niemann somehow (a suppository that vibrates?). I’m not really sure what the options are. It would be very hard to “prove” cheating simply by looking at the moves that are made in a single game: at the highest levels both players can be expected to play almost perfectly, usually making one of the top two or three moves on every move (as evaluated by the computer), so simply playing very very well is not enough to prove anything.
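
For what it’s worth, the kind of statistic people compute in these debates is an “engine match rate”: the fraction of a player’s moves that match one of an engine’s top choices. Here is a rough sketch using the python-chess library and any UCI engine; the engine path, PGN filename, and player name are placeholders, and as noted above, a high match rate by itself proves nothing at the top level.

```python
# Sketch of an "engine match rate" statistic, using python-chess and a UCI engine.
# The "stockfish" path, PGN filename, and player name below are assumptions.
import chess
import chess.engine
import chess.pgn

def engine_match_rate(pgn_path, player, engine_path="stockfish", top_n=3, depth=18):
    """Fraction of `player`'s moves that match one of the engine's top-N choices."""
    matches = total = 0
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    with open(pgn_path) as f:
        while (game := chess.pgn.read_game(f)) is not None:
            plays_white = game.headers.get("White") == player
            board = game.board()
            for move in game.mainline_moves():
                if (board.turn == chess.WHITE) == plays_white:
                    infos = engine.analyse(board, chess.engine.Limit(depth=depth),
                                           multipv=top_n)
                    top_moves = {info["pv"][0] for info in infos}
                    matches += move in top_moves
                    total += 1
                board.push(move)
    engine.quit()
    return matches / total if total else float("nan")

# print(engine_match_rate("niemann_games.pgn", "Niemann, Hans Moke"))  # hypothetical file
```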

Continue reading

Praising with Faint Damnation

This post is by Phil Price, not Andrew.

A friend and I were discussing a route for a bike ride. I was pretty tired and unmotivated, so I said sure, but let’s do a really easy ride. I suggested taking the Bay Bridge bike path from Berkeley (California) to Treasure Island. My friend had never done that, and said “is that pretty nice?” I replied “No it’s not nice at all…it’s probably the least-pleasant ride I do with any regularity.”

One might think “jeez, why would you want to do that ride, then?” But consider: _something_ has to be “the worst ride I do with some regularity.” I do this one every couple of months, when I want a change from my usual rides and when I don’t want to exert myself too hard. If it really sucked I wouldn’t do it at all! It’s sort of the opposite of “damning with faint praise”: I’m praising with faint damnation.

This is peripherally related to the Reebok Principle, which came up here about eighteen months ago in a post that I think is worth re-visiting because of its comment section, which went off into pandemic-related stuff a bit and provides an interesting reminder of what people were thinking at that point.

This post is by Phil.

Is Martha (Smith) still with us?

This post is by Phil Price, not Andrew.

It occurred to me a few weeks ago that I haven’t seen a comment by Martha (Smith) in quite a while…several months, possibly many months? She’s a long-time reader and commenter and often had interesting things to say. At times she also alluded to the fact that she was getting on in years. Perhaps she has simply lost interest in the blog, or in commenting on the blog…I hope that’s what it is. Martha, if you’re still out there, please let us know!

The Course of the Pandemic: What’s the story with Excess Deaths?

This post is by Phil Price, not Andrew.

A commenter who goes by “Anoneuoid” has pointed out that ‘excess deaths’ in the U.S. have been about as high in the past year as they were in the year before that. If vaccines work, shouldn’t excess deaths decrease?

Well, maybe not. Anoneuoid seems to think vaccines offer protection against COVID but increase the risk of deaths from other causes. Or something. I don’t much care about Anon’s belief system, but I do think it’s interesting to take a look at excess deaths. So let’s do that.

I went to https://stats.oecd.org and searched for ‘excess’ in the search field, which led me to a downloadable table of ‘excess deaths by week’ for OECD countries. “Excess deaths” means the number of deaths above a baseline (which I believe is the average over the previous ten years or something, perhaps adjusted for population; I don’t know the exact definition used for these data). “Excess deaths” over the past couple of years have been dominated by COVID deaths, but that’s not the only effect: at least in the first year of the pandemic people were avoiding doctors and hospitals and thus missing out on being diagnosed or treated for cancer, heart disease, and other conditions; and suicide and car-accident numbers have changed as well.

Below is a plot of excess deaths, by week since the beginning of 2020, in nine OECD countries that I selected somewhat haphazardly. You can download the data yourself and make more plots if you like.

“Excess Deaths” by week, as a percent of baseline deaths, in nine OECD countries, including the US.
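
If you want to make a plot like this yourself, here is a rough sketch. It assumes the OECD table has been saved as a CSV with columns named COUNTRY, WEEK, and VALUE; the filename, the column names, and the list of country codes are my assumptions, so check them against whatever the download actually contains.

```python
# Rough sketch of the nine-panel plot. The filename, column names, and country
# codes are assumptions; the actual OECD export may label things differently.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("oecd_excess_deaths.csv")            # hypothetical filename
countries = ["USA", "CAN", "GBR", "FRA", "ITA", "BEL", "SWE", "DNK", "DEU"]

fig, axes = plt.subplots(3, 3, figsize=(10, 8), sharex=True, sharey=True)
for ax, c in zip(axes.flat, countries):
    sub = df[df["COUNTRY"] == c].sort_values("WEEK")
    ax.plot(sub["WEEK"], sub["VALUE"])
    ax.axhline(0, color="gray", lw=0.5)               # zero excess deaths
    ax.set_title(c)
fig.supxlabel("Week since start of 2020")
fig.supylabel("Excess deaths (% of baseline)")
plt.tight_layout()
plt.show()
```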

If you had asked me a year or so ago, “what do you think will happen with US COVID deaths now that we have vaccines” I probably would have guessed something like what has happened in Italy or the UK or Belgium or France: there would be some ups and downs, but at substantially decreased magnitude. Instead, the US really stands out as being the only country that had high excess mortality prior to the vaccines and also has high mortality now.

But then, I also expected that just about everyone in the US would get vaccinated, which isn’t even close to being the case (about 20% of US residents haven’t gotten any COVID vaccination, and about 30% are not ‘fully vaccinated’…a term that is a bit misleading, perhaps, as the effects of the vaccines wear off for those of us who got our booster several months ago).

Also, there are competing factors — competing in the sense that some tend to make excess deaths increase while others tend to make them decrease. Vaccines provide substantial protection, and doctors have gotten better at treating COVID, so those tend to lead to lower COVID deaths. But most people seem to have resumed normal life without many COVID precautions, presumably leading to higher infection rates than there would otherwise be. And of course there are still traffic accidents and suicides and drug overdoses and so on that could either increase or decrease compared to baseline.

I find the figure above really interesting. Here are a few things that stand out to me, in no particular order:

  • Denmark had no excess mortality through early 2021! That’s remarkable, they saved a lot of lives compared to the other countries.
  • Canada looks like the US in temporal pattern, which kinda makes sense, but with mortality at about half the US level.
  • I knew Italy got hit very hard early on, northern Italy especially, but hadn’t realized Belgium had it so bad. Jeez they had a terrible first year.
  • The time series in the U.S. is much smoother than in the other countries. Belgium, France, Sweden, the UK, Italy…they all had a big initial spike and then dropped all the way back to 0 excess deaths for a few weeks before the next spike. The U.S. went up and never came all the way back down, even briefly, until a few weeks ago. The U.S. has a much larger population and much larger geographic area than any single European country; perhaps if we looked at the data for some small part of the U.S., like just New England or just Florida, it would look more like one of these other countries.
  • If the U.S. had matched the average excess mortality of the rest of the OECD countries, hundreds of thousands of Americans would be alive who are now dead.

I guess I’ll leave it to commenters to provide insights on all of this. Go to it!

This post is by Phil.

High-intensity exercise, some new news


This post is by Phil Price, not Andrew.

Several months ago I noticed something interesting (to me!) about my heart rate, and I thought about blogging about it…but I didn’t feel like it would be interesting (to you!) so I’ve been hesitant. But then the NYT published something that is kinda related and I thought OK, what the hell, maybe it’s time for an update about this stuff. So here I am.

The story starts way back in 2010, when I wrote a blog article called “Exercise and Weight Loss: Shouldn’t Somebody See if there’s a Relationship?” In that article I pointed out that there had been many claims in the medical / physiology literature that exercise doesn’t lead to weight loss in most people, but that those studies seemed to be overwhelmingly looking at low- and medium-intensity exercise, really not much (or at all) above warmup intensity. When I wrote that article I had just lost about twelve pounds in twelve weeks when I started doing high-intensity exercise again after a gap of years, and I was making the point that before claiming that exercise doesn’t lead to weight loss, maybe someone should test whether the claim is actually true, rather than assuming that just because low-intensity exercise doesn’t lead to weight loss, no other type of exercise would either.

Eight years later, four years ago, I wrote a follow-up post along the same lines. I had gained some weight when an injury stopped me from getting exercise. As I wrote at the time, “Already this experience would seem to contradict the suggestion that exercise doesn’t control weight: if I wasn’t gaining weight due to lack of exercise, why was I gaining it?” And then I resumed exercise, in particular exercise that had some maximum short-term efforts as I tried to get in shape for a bike trip in the Alps, and I quickly lost the weight again. Even though I wasn’t conducting a formal experiment, this is still an example of what one can learn through “self-experimentation,” which has a rich history in medical research.

Well, it’s not like I’ve kept up with research on this in the mean time, but I did just see a New York Times article called “Why Does a Hard Workout Make You Less Hungry” that summarizes a study published in Nature that implicates a newly-discovered “molecule — a mix of lactate and the amino acid phenylalanine — [that] was created apparently in response to the high levels of lactate released during exercise. The scientists named it lac-phe.” As described in the article, the evidence seems pretty convincing that high-intensity exercise helps mice lose weight or keep it off, although the evidence is a lot weaker for humans. That said, the humans they tested do generate the same molecule, and a lot more of it after high-intensity exercise than lower-intensity exercise. So maybe lac-phe does help suppress appetite in humans too.

As for the interesting-to-me (but not to you!) thing that I noticed about my heart rate, that’s only tangentially related but here’s the story anyway. For most of the past dozen years a friend and I have done bike trips in the Alps, Pyrenees, or Dolomites. Not wanting a climb up Mont Ventoux or Stelvio to turn into a death march due to under-training, I always train hard for a few months in the spring, before the trip. That training includes some high-intensity intervals, in which I go all-out for twenty or thirty seconds, repeatedly within a few minutes, and my heart rate gets to within a few beats per minute of my maximum. While I’m doing this training I lose the several pounds I gained during the winter.

Unfortunately, as you may recall we have had a pandemic since early 2020. My friend and I did not do bike trips. With nothing to train for, I didn’t do my high-intensity intervals. I still did plenty of bike riding, but didn’t get my heart rate up to its maximum. I gained a few pounds, not a big deal. But a few months ago I decided to get back in shape, thinking I might try to do a big ride in the fall if not the summer. My first high-intensity interval, I couldn’t get to within 8 beats per minute of my usual standard, which had been nearly unchanged over the previous 12 years! Prior to 2020, I wouldn’t give myself credit for an interval if my heart rate hadn’t hit at least 180 bpm; now I maxed out at 172. My first thought: blame the equipment. Maybe my heart rate monitor isn’t working right, maybe a software update has changed it to average over a longer time interval, maybe something else is wrong. But trying two monitors, and checking against my self-timed pulse rate, I confirmed that it was working correctly: I really was maxing out at 172 instead of 180. Holy cow.

I decided to discuss this with my doctor the next time I have a physical, but in the mean time I kept doing occasional maximum-intensity intervals…and my max heart rate started creeping up. A few days ago I hit 178, so it’s up about 6 bpm in the past four months. And I’ve lost those few extra pounds and now I’m pretty much back to my regular weight for my bike trips. The whole experience has (1) reinforced my already-strong belief that high-intensity exercise makes me lose weight if I’m carrying a few extra pounds, and (2) made me question the conventional wisdom that everyone’s max heart rate decreases with age: maybe if you keep exercising at or very near your maximum heart rate, your maximum heart rate doesn’t decrease, or at least not much? (Of course at some point your maximum heart rate goes to 0 bpm. Whaddyagonnado.)

So, to summarize: (1) Finally someone is taking seriously the possibility that high-intensity exercise might lead to weight loss, and even looking for a mechanism, and (2) when I stopped high-intensity exercise for a couple years, my maximum heart rate dropped…a lot.

Sorry those are not more closely related, but I was already thinking about item 2 when I encountered item 1, so they seem connected to me.

 

An Easy Layup for Stan

This post is by Phil Price, not Andrew.

The tldr version of this is: I had a statistical problem that ended up calling for a Bayesian hierarchical model. I decided to implement it in Stan. Even though it’s a pretty simple model and I’ve done some Stan modeling before, I thought it would take at least several hours for me to get a model I was happy with, but that wasn’t the case. Right tool for the job. Thanks Stan team!

Longer version follows.

I have a friend who routinely plays me for a chump. Fool me five times, shame on me. The guy is in finance, and every few years he calls me up and says “Phil, I have a problem, I need your help. It’s really easy” — and then he explains it so it really does seem easy — “but I need an answer in just a few days and I don’t want to get way down in the weeds, just something quick and dirty. Can you give me this estimate in…let’s say under five hours of work, by next Monday?” Five hours? I can hardly do anything in five hours. But still, it really does seem like an easy problem. I say OK and quote a slight premium over my regular consulting rate. And then…as always (always!) it ends up being more complicated than it seemed. That’s not a phenomenon that is unique to him: just about every project I’ve ever worked on turns out to be more complicated than it seems. The world is complicated! And people do the easy stuff themselves, so if someone comes to me it’s because it’s not trivial. But I never seem to learn.

Anyway, what my friend does is “valuation”: how much should someone be willing to pay for this thing? The ‘thing’ in this case is a program for improving the treatment of patients being treated for severe kidney disease. Patients do dialysis, they take medications, they’re on special diets, they have health monitoring to do, they have doctor appointments to attend, but many of them fail to do everything. That’s especially true as they get sicker: it gets harder for them to keep track of what they’re supposed to do, and physically and mentally harder to actually do the stuff.

For several years someone ran a trial program to see what happens if these people get a lot more help: what if there’s someone at the dialysis center whose job is to follow up with people and make sure they’re taking their meds, showing up to their appointments, getting their blood tests, and so on? One would hope that the main metrics of interest would involve patient health and wellbeing, and maybe that’s true for somebody, but for my friend (or rather his client) the question is: how much money, if any, does this program save? That is, what happens to the cost per patient per year if you have this program compared to doing stuff the way it has been done in the past?

As is usually the case, the data suck. What you would want is a random selection of pilot clinics where they tried the program, and the ones where they didn’t, and you’d want the cost data from the past ten years or something for every clinic; you could do some sort of difference-in-differences approach, maybe matching cases and controls by relevant parameters like region of the country and urban/rural and whatever else seems important. Unfortunately my friend had none of that. The clinics were semi-haphazardly selected by a few health care providers, probably slightly biased towards the ones where the administrators were most willing to give the program a try. The only clinic-specific data are from the first year of the program onward; other than that all we have is the nationwide average for similar clinics.

The fact that no before/after comparison is possible seemed like a dealbreaker to me, and I said so, but my friend said the experts think the effect of the program wouldn’t likely show up in the form of a step change from before to after, but rather in a lower rate of inflation, at least for the first several years. Relative to business as usual you expect to see a slight decline in cost every year for a while. I don’t understand why but OK, if that’s what people expect then maybe we can look for that: we expect to see costs at the participating clinics increase more slowly than the nationwide average. I told my friend that’s _all_ we can look for, given the data constraints, and he said fine. I gave him all of the other caveats too and he said that’s all fine as well. He needs some kind of estimate, and, well, you go to war with the data you have, not the data you want.

The first thing I did was to divide out the nationwide inflation rate for similar clinics that lack the program, in order to standardize on current dollars. Then I fit a linear regression model to the whole dataset of (dollars per patient) as a function of year, giving each clinic its own intercept but giving them all a common slope. And sure enough, there’s a slight decline! The clinics with the program had a slightly lower rate of inflation than the other clinics, and it’s in line with what my friend said the experts consider a plausible rate. All those caveats I mentioned above still apply but so far things look OK.

If that was all my friend needed then hey, job done and it took a lot less than five hours. But no: my friend doesn’t just need to know the average rate of decrease, he needs to know the approximate statistical distribution across clinics. If the average is, say, a 1% decline per year relative to the benchmark, are some clinics at 2% per year? What about 3%? And maybe some don’t decline at all, or maybe the program makes them cost more money instead? (After all, you have to pay someone to help all those patients, so if the help isn’t very successful you are going to lose out). You’d like to just fit a different regression for each clinic and look at the statistical distribution of slopes, but that won’t work: there’s a lot of year-to-year ‘noise’ at any individual clinic. One reason is that you can get unlucky in suddenly having a disproportionate number of patients who need very expensive care, or lucky in not having that happen, but there are other reasons too. And you only have three or four years of data per clinic. Even if all of the clinics had programs of equal effectiveness, you’d get a wide variation in the observed slopes. It’s very much like the “eight schools problem”. It’s really tailor-made for a Bayesian hierarchical model. Observed costs are distributed around “true costs” with a standard error we can estimate; true inflation-adjusted cost at a clinic declines linearly; slopes are distributed around some mean slope, with some standard deviation we are trying to estimate. We even have useful prior estimates of what slope might be plausible. Even simple models usually take me a while to code and to check so I sort of dreaded going that route — it’s not something I would normally mind, but given the time constraints I thought it would be hard — but in fact it was super easy. I coded an initial version that was slightly simpler than I really wanted and it ran fine and generated reasonable parameter values. Then I modified it to turn it into the model I wanted and…well, it mostly worked. The results all looked good and some model-checking turned out fine, but I got an error that there were a lot of “divergent transitions.” I’ve run into that before and I knew the trick for eliminating them, which is described here. (It seems like that method should be described, or at least that page should be linked, in the “divergent transitions” section of the Stan Reference Manual but it isn’t. I suppose I ought to volunteer to help improve the documentation. Hey Stan team, if it’s OK to add that link, and if you give me edit privileges for the manual, I’ll add it.) I made the necessary modification to fix that problem and then everything was hunky dory.
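
The core of that kind of model is only a couple of dozen lines of Stan. Here is a stripped-down sketch (illustrative only; the variable names, priors, and scales are placeholders, not the actual client model), run from Python via cmdstanpy. The non-centered parameterization in the transformed parameters block is the usual trick for getting rid of those divergent transitions.

```python
# Stripped-down sketch of this kind of hierarchical model (illustrative, not the
# exact client model). y is cost per patient relative to the national benchmark,
# so a slope of -0.01 means a 1%-per-year relative decline; priors are assumptions.
from pathlib import Path
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=1> N;                          // clinic-year observations
  int<lower=1> J;                          // number of clinics
  array[N] int<lower=1, upper=J> clinic;   // clinic index for each observation
  vector[N] year;                          // years since program start (0, 1, 2, ...)
  vector[N] y;                             // cost relative to national benchmark
}
parameters {
  vector[J] alpha;                         // clinic-specific intercepts
  vector[J] beta_raw;                      // non-centered clinic slopes
  real mu_beta;                            // mean slope across clinics
  real<lower=0> tau;                       // between-clinic sd of slopes
  real<lower=0> sigma;                     // year-to-year noise within a clinic
}
transformed parameters {
  // Non-centered parameterization: the usual fix for divergent transitions
  // in hierarchical models like this one.
  vector[J] beta = mu_beta + tau * beta_raw;
}
model {
  alpha ~ normal(1, 0.5);
  beta_raw ~ std_normal();
  mu_beta ~ normal(-0.01, 0.02);           // roughly a 1%/yr decline is plausible (assumption)
  tau ~ normal(0, 0.02);
  sigma ~ normal(0, 0.1);
  y ~ normal(alpha[clinic] + beta[clinic] .* year, sigma);
}
"""

Path("clinic_slopes.stan").write_text(stan_code)
model = CmdStanModel(stan_file="clinic_slopes.stan")
# fit = model.sample(data=dict(N=..., J=..., clinic=..., year=..., y=...))
# fit.stan_variable("beta") then gives the posterior distribution of clinic slopes.
```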

From starting the model to finishing with it — by which I mean I had done some model checks and looked at the outputs and generated the estimates I needed — was only about two hours. I still didn’t quite get the entire project done in the allotted time, but I was very close. Oh, and in spite of the slight overrun I billed for all of my hours, so my chump factor didn’t end up being all that high.

Thank you Stan team!

Phil.

Is “choosing your favorite” an optimization problem?

This post is by Phil Price, not Andrew.

A week or so ago, a post involving an economic topic had a comment thread about what is involved in choosing a “favorite” (or a “preferred choice” or “the best”…terms like these were used more or less interchangeably in the comments). The thread got pretty long and, to me, a bit frustrating. Here’s what started it off: I wrote that “you need a consistent dimension to compare things because you need to be able to put choices in order if you want to choose the best one. You need a utility function that puts everything in one dimension.” (I shouldn’t have used the term “utility function”, with its strong connection to economics and rational choice. I think the actual situation is more general.)

Much to my surprise, several thoughtful, intelligent readers disagreed (and still disagree) with that statement.

In order to decide which you prefer among A, B, and C, you need to evaluate your preference for A, B, and C so that you can compare them. You need to be able to put them in order, e.g. pref(B) > pref(A) > pref(C)…or at least, pref(B) > pref(A or C). So far, so tautological.

You need to be able to evaluate the choices even when your preference depends on multiple parameters. If I’m choosing a phone, for example, each option is characterized by (speed, cost, battery life), and of course many other parameters too. So you might have a choice between:

A: (fast, expensive, long-lived)
B: (fast, cheap, short-lived)
C: (slow, cheap, long-lived)

How can you choose among these? Well, go back to the tautology above: to decide which one you prefer, you need to be able to evaluate your preference for each. The tautology is still a tautology.

You get nowhere by saying “I prefer the speed of A, the expense of B, and the battery life of either A or C”…you need to boil it down to one ‘preference function’, although you may not think of it that way. As Daniel Lakeland put it: “Phil, here’s a more mathy way to say what I think you’re saying. All complete totally ordered fields are isomorphic to Real Numbers. This is a known mathematical fact.” (And this is also where the Wikipedia article on optimization takes you.)
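
To make the tautology concrete, here is a toy example with arbitrary numbers and arbitrary weights: once each phone is mapped to a single number, “choose your favorite” is just taking the maximum.

```python
# Toy illustration: once each option maps to one number, the favorite is argmax.
# The attribute values and weights below are arbitrary, for illustration only.
options = {
    "A": {"speed": 9, "cost": 200, "battery_hours": 20},   # fast, expensive, long-lived
    "B": {"speed": 9, "cost": 80,  "battery_hours": 8},    # fast, cheap, short-lived
    "C": {"speed": 4, "cost": 80,  "battery_hours": 20},   # slow, cheap, long-lived
}

def preference(phone):
    """One possible scalar preference function; any other weighting would do."""
    return 2.0 * phone["speed"] - 0.03 * phone["cost"] + 0.5 * phone["battery_hours"]

favorite = max(options, key=lambda name: preference(options[name]))
print(favorite, {name: round(preference(p), 1) for name, p in options.items()})
```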

So what do people object to? Different things. At first I thought the objections were specious, but the discussion changed my mind; I think there’s some stuff worth thinking about. Hence this post.

1. One respondent said “You say ‘a consistent dimension’ but surely you know that it is possible to think more multidimensional than a single one.”

2. A couple of respondents challenged the claim that you _always_ need a single preference function. Suppose, for the cell phone, I prefer fast to slow, cheap to expensive, and long-lived to short-lived. If there were an option D (in addition to A, B, and C above) that is characterized by (fast, cheap, long-lived) then D would be the dominant choice, preferred in every dimension, so there would be no need for a single preference function. You might have one, sure, but you don’t need one.


3. Responding to my claim that to come up with a favorite “ultimately you have to be able to put these in order, or at least to have one of them bubble to the top”, another respondent said “I guess you do, but I certainly don’t. In fact I sometimes specifically avoid ranking them by “value” so as to pretend I don’t have to deal with the consequences of my choices.”


I’ll give my take on these, below, but would also be interested in reading what others have to say.

Continue reading

Objectively worse, but practically better: an example from the World Chess Championship

A position from Game 2 of the 2021 World Chess Championship match. White has just played e4.

This post is by Phil Price, not Andrew.

The World Chess Championship is going on right now. There have been some really good games and some really lousy ones — the challenger, Ian Nepomniachtchi (universally known as ‘Nepo’) has played far below his capabilities in a few games. The reigning champ, Magnus Carlsen, is almost certain to retain his title (I’ll offer 12:1 if anyone is interested!).

It would take some real commitment to watch the games in real time in their entirety, but if you choose to do so there is excellent coverage in which strong grandmasters discuss the positions and speculate on what might be played next. They are aided in this by computers that can evaluate the positions “objectively”, and occasionally they will indeed mention what the computer suggests, but much of the time the commentators ignore the computer and discuss their own evaluations.

I suppose it’s worth mentioning that computers are by far the strongest chess-playing entities, easily capable of beating the best human players even if the computer is given a significant disadvantage at the start (such as being down a pawn). Even the best computer programs don’t play perfect chess, but for practical purposes the evaluation of a position by a top computer program can be thought of as the objective truth.

I watched a fair amount of the live commentary on Game 2, provided by Judit Polgar and Anish Giri…just sort of got caught up in it and spent way more time watching than I had intended. At the point in the commentary shown in the image (1:21 into the YouTube video), the computer evaluation says the players are dead even, but both Polgar and Giri felt that in practice White has a significant advantage. As Giri put it, “disharmonious positions [like the one black is in] require weird solutions…Ian has great pattern recognition, but where has he seen a pattern of pawns [on] f6, e6, c6, queen [on] d7? Which pattern is he trying to recognize? The pattern recognition is, like, it’s broken… I don’t know how I’m playing with black, I’ve never seen such a position before. Fortunately. I don’t want to see it anymore, either.”

In the end, Nepo — the player with Black — managed to draw the position, but I don’t think anyone (including Nepo) would disagree with their assessment that at this point in the game it is much easier to play for White than for Black.

Interestingly, this position was reached following a decision much earlier in the game, in which Carlsen played a move that, according to the computer, gave Nepo a slight edge. This was quite early in the game, when both players were still “in their preparation”, meaning that they were playing moves that they had memorized. (At this level, each player knows the types of openings that the other likes to play, so they can anticipate that they will likely play one of a manageable number of sequences, or ‘lines’, for the first eight to fifteen moves. When I say “manageable number” I mean a few hundred.) At that earlier point in the game, when Carlsen made that “bad” move, Giri pointed out that this might take Nepo out of his preparation, since you don’t usually bother looking into lines that assume the other player is going to deliberately give away his advantage.

So: Carlsen deliberately played in a way that was “objectively” worse than his alternatives, but that gave him better practical chances to win. It’s an interesting phenomenon.


Why are goods stacking up at U.S. ports?

This post is by Phil Price, not Andrew.

I keep seeing articles that say U.S. ports are all backed up, hundreds of ships can’t even offload because there’s no place to put their cargo, etc. And then the news articles will quote some people saying ‘this is a global problem’, ‘there is no single solution’, and so on. I find this a bit perplexing, although I feel like my perplexification could be cleared up with some simple data. How many containers per day typically arrived at U.S. ports pre-pandemic, and how many are arriving now? How many truck drivers were on the road on a typical day in the U.S. pre-pandemic, and how many are on the road now? How many freight train employees were at work on a typical day pre-pandemic, and how many are at work now?

I understand that there are problems all over the place: various cities and countries go in and out of lockdown, companies have gone out of business, factories have closed, there are shortages of raw materials and machine parts etc. due to previous and current pandemic-related shutdowns…that’s all fine, but it does nothing to explain why goods that are sitting at US ports are not moving. Have all of the U.S. truck drivers died of COVID or something? Inquiring minds want to know!

The COVID wager: results are in

This post is by Phil Price, not Andrew.

Frequent readers of this blog will already know about the wager between me and a commenter called Anoneuoid. Would the number of new COVID cases in the U.S. in the seven days ending 10/5 be lower than 500,000? If yes, I pay him $34. If no, he pays me $100. The number of new cases in those seven days was around 590K, so he owes me $100.

Anoneuoid, please send the money to Andrew (in small, unmarked bills); Andrew, please post your preferred address in the comments. Please use the money to buy donuts for the Stan team or for some other Stan-supporting activity that isn’t too much hassle for you.

Wanna bet? A COVID-19 example.

This post is by Phil Price, not Andrew.

Andrew wrote a post back on September 2 that plugged a piece by Jon Zelner, Nina Masters, Ramya Naraharisetti, Sanyu Mojola, and Merlin Chowkwanyun about complexities of pandemic modeling. In the comments, someone who calls himself Anoneuoid said (referring to model projections compiled and summarized by the CDC, shown below; the link is to the most recent projections, not the ones we were talking about) “…it is a near certainty cases will be below 500k per week on Oct 1st, yet that looks like it will be below the lower bound of the ensemble 95% interval. If anyone disagrees, lets bet!”

Historic COVID cases, and model projections


I love this kind of thing. Indeed, if a friend says something is “a lock” or “almost certain” or whatever, I will often propose a bet: “if you’re so sure, you should be willing to offer 10:1 odds, so let’s do it! I’ll wager $100 to your $1000. Deal?” Most of the time, people back off on their “almost certain” claim. It seems that usually when people say they’re “sure” or “almost certain” about something they aren’t speaking literally, but instead feel that the outcome has a probability of, say, 75% or 85% or something. In any case I appreciate someone who is willing to put his money where his mouth is.

We had some back-and-forth about the betting. Perhaps I could have shamed him into offering 12:1 or 10:1 or something, which was indeed my initial proposal, but upon reflection that seemed a bit one-sided. If he thinks the probability of a specific thing happening is, say, 8%, and I think it’s 66%, it seems a bit unfair to make him bet on his odds. Why not bet on my odds instead, in which case I should be offering him odds the other way? (I said at the time: “I know little about this issue or about how much to trust the models. Just looking at the historical peaks we have had thus far, I’d guess this week or next week will be the nationwide peak and that things will fall off about half as quickly as they climbed. My central estimate for the week in question would be something like 650K new cases, but with wide enough error bars that I’d put maybe 30 or 35% of the probability below 500,000. But I’d also have a substantial chunk of probability over 1M, which you must think is pretty much nuts.”)

In the end we decided to sort of split the difference: if the number of new cases is over 500K that week, he’ll pay me $100; otherwise I pay him $34. We’ll settle about a week and a half after October 1, in case there are later updates to the numbers (due to reporting issues or whatever). Andrew asked me to do a short post about this, to “have it all in one place”, so here it is. 
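
For anyone who wants the arithmetic: the $100-versus-$34 stakes imply a breakeven probability of about 25% for the “over 500K” outcome, and each side’s expected value follows directly from whatever probability they assign to it. The probabilities plugged in below are just illustrative round numbers in the spirit of the discussion above, not precise statements of anyone’s beliefs.

```python
# Arithmetic behind the bet: Phil wins $100 if weekly cases exceed 500K and
# loses $34 otherwise. The probabilities below are illustrative round numbers.
def phil_expected_value(p_over_500k, win=100, lose=34):
    return p_over_500k * win - (1 - p_over_500k) * lose

breakeven = 34 / (100 + 34)          # ~0.254: Phil profits if P(>500K) exceeds this
print(f"breakeven P(>500K) = {breakeven:.3f}")
print(f"Phil's EV at a ~67% belief:       ${phil_expected_value(0.67):.2f}")
print(f"Phil's EV at a skeptic's ~8%:     ${phil_expected_value(0.08):.2f}")
```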

This post is by Phil.

 

Tokyo Track revisited: no, I don’t think the track surface is “1-2% faster”

This post is by Phil Price, not Andrew.

A few weeks ago I posted about the claim — by the company that made the running track for the Tokyo Olympics — that the bounciness of the track makes it “1-2% faster” than other professional tracks. The claim isn’t absurd: certainly the track surface can make a difference. If today’s athletes had to run on the cinder tracks of yesteryear their speed would surely be slower.

At the time I wrote that post the 400m finals had not yet taken place, but of course they’re done by now so I went ahead and took another quick look at the whole issue…and the bottom line is that I don’t think the track in Tokyo let the runners run noticeably faster than the tracks used in recent Olympics and World Championships. Here’s the story in four plots. All show average speed rather than time: the 200m takes about twice as long as the 100m,  so they have comparable average speed. Men are faster, so in each panel (except the bottom right) the curve(s) for men are closer to the top, women are closer to the bottom.  Andrew, thanks for pointing out that this is better than having separate rows of plots for women and men, which would add a lot of visual confusion to this display. 

The top left plot shows the average speed for the 1st-, 2nd-, 3rd-, and 4th-place finishers in the 100, 200, and 400m, for men and women.  Each of the subsequent plots represents a further aggregation of these data. The upper right just adds the times together and the distances together, so, for instance, the top line is (100 + 200 + 400 meters) / (finishing time of the fastest man in the 100m + finishing time of the fastest man in the 200m + finishing time of the fastest man in the 400 m).  The bottom left aggregates even farther: the total distance run by all of the male finishers divided by the total time of all the male finishers, in all of the races; and the same for the women.

And finally, taking it to an almost ludicrous level of aggregation, the bottom right shows the mean speed — the total distance run by all of the competitors in all of the races, divided by the total of all of the times — divided by the mean of all of the mean speeds, averaged over all of the years. A point at a y-value of 1.01 on this plot would mean that the athletes that year averaged 1% faster than in an average year.
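
For anyone who wants to redo this kind of aggregation, here is a rough sketch assuming a data frame with one row per finisher; the filename and column names (year, sex, place, distance_m, time_s) are my assumptions rather than an actual dataset.

```python
# Sketch of the aggregations described above. The file and column names are
# assumptions, not an actual dataset.
import pandas as pd

df = pd.read_csv("sprint_finals.csv")       # hypothetical file
df = df[df["place"] <= 4]                   # top four finishers in each final

def agg_speed(g):
    # total distance divided by total time, i.e. an aggregate average speed
    return g["distance_m"].sum() / g["time_s"].sum()

# Bottom-left panel: one aggregate speed per year for men and one for women
by_year_sex = df.groupby(["year", "sex"]).apply(agg_speed)

# Bottom-right panel: everyone pooled, each year expressed relative to the mean
# over all years (1.004 would mean about 0.4% faster than an average year)
by_year = df.groupby("year").apply(agg_speed)
relative = by_year / by_year.mean()
print(by_year_sex)
print(relative)
```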

If someone wants to claim the track allows performances that are 1-2% faster than on previous tracks, they’re going to have to explain why the competitors in the sprints this year were only about 0.4% faster than the average over the past several Olympics and World Championships.

Even that 0.4% looks a bit iffy, considering the men weren’t faster at all. You can make up a ‘just so’ story about the track being better tuned towards women’s lighter bodies and lower forces exerted on the track, but I won’t believe it. 

There’s year-to-year and event-to-event variation in results, depending on exactly what athletes are competing, where they are in their careers, what performance-enhancing drugs they are taking (if any), and other factors too (wind, temperature on race day, etc.).  It’s not inconceivable that the sprint speeds would have been 1-2% slower this year if not for the magical track, which just happened to bring them back up to around the usual values. But that’s sure not the way to bet.

This post is by Phil. 

 

How much faster is the Tokyo track?


Speed (meters per second) in Olympic and World Championship finals in track sprinting.

This post is by Phil Price, not Andrew.

The guy whose company made the track for the Tokyo Olympic stadium says it’s “1-2% faster” than the track used at the Rio Olympics (which is the same material used at many other tracks), due to a surface that returns more energy to the runners. I’d be interested in an estimate based on empirical data.  Fortunately the Olympics are providing us with plenty of data to work with, but what’s the best approach to doing the analysis?

One obvious possibility is to compare athletes’ performances in Tokyo to their previous performances. For instance, Karsten Warholm just set a world record in the men’s 400m hurdles with a time of 45.94 seconds, which is indeed 1.6% faster than his previous best time. Sydney McLaughlin set a world record in the women’s 400m hurdles at 51.46 seconds, 0.8% faster than her previous time.  So that 1-2% guesstimate looks pretty reasonable.

On the other hand, it’s common for new records to be set at the Olympics: athletes are training to peak at that time, and their effort and adrenaline are never higher.

I can imagine various models that could be fit, such as a model that predicts an athlete’s time based on their previous performances, with ‘athlete effects’ as well as indicator variables for major events such as World Championships and the Olympics, and with indicator variables for track surfaces themselves. But getting all the data would be a huge pain, I think.

Another possibility is to look at the first-place times for each event: instead of comparing Karsten Warholm’s Olympic time to his other most recent competition times, we could compare (the first place time in the 400m hurdles at the Olympics) to (the first place time in the 400m hurdles at a previous major competition). We might not be comparing McLaughlin to McLaughlin this way, we’d be comparing McLaughlin to whoever won the last World Championship in the event, but maybe this approach would help remove the influence of the time-dependence of a single person’s training fitness and such. There are some problems with this approach too, though, with the most obvious one being that some athletes are simply faster than others and that is going to add a lot of noise to the system. Usain Bolt sure made that Beijing track look fast, didn’t he?

A technology-based solution would be to use some sort of running robot that can run at a fixed power output. You could run it on different tracks and quantify the speed difference. But as far as I know such a robot does not exist, and even if it did, it would have to use almost the same biomechanics as a human runner if the results are to be applicable.

Everything I’ve listed above seems like a huge pain. But there’s something that would be easier, that I think would be almost as good: compare the third- or fourth-fastest times in Tokyo with the third- or fourth-fastest times at other competitions. The idea is that the third-fastest time should be more stable than the fastest time, since a single freak performance or exceptional athlete won’t matter…basically the same reason for using a trimmed mean in some applications. For instance, in the men’s 400m hurdles at the World Championships in 2019, Kyron McMaster finished third in 48.10 seconds. In the 400m hurdles at the Tokyo Olympics, Alison dos Santos finished third in 46.72 s. That’s 2.7% faster.  For women, the 2019 World Championship time was Rushell Clayton’s 53.74 s, compared to Femke Bol’s third-place time of 52.03s in Tokyo; that’s 3.2% faster. 

Anyone got any other ideas for the best way to quantify the effect of the track surface?

[Added later: I got data (from Wikipedia) from recent Olympics and World Championships, and generated the plot that I have now included. The columns are distances (100, 200, and 400m), rows are sex.]

 

This post is by Phil.

 

 

What is a “woman”?

This post is by Phil Price, not Andrew.

As we approach the Olympic Games, this seems like a good time to think about the rules for deciding whether a person is a “woman” when it comes to athletic competition. As I was doing some searches to find some information for the post, I found an excellent piece that puts everything together much better than I would have. Go ahead and read that, then come back here. (The piece is by Christie Aschwanden,  of whom I think very highly; she wrote a book I reviewed here two years ago).

The issue is: if you’re going to have separate divisions for men and women, then you need a way to define “woman.” A good way of defining this might seem obvious: if the person has a vagina, she’s a she. That’s the way the international sports governing bodies used to do it, but then that was rejected for reasons mentioned in Aschwanden’s piece. Well, how about “does the person have two X chromosomes?” After nixing genitalia as the criterion, this is what they switched to…but then that was rejected for reasons also mentioned in the piece. Currently, according to the article, “Female athletes who [have] functional testosterone (in other words, not just high testosterone levels, but also functioning receptors that allowed their bodies to respond to the hormone) above a threshold number [are] not eligible to compete unless they [do] something to reduce their testosterone below the threshold.” (The original sentence is in the past tense, but this is pretty much the current situation too.)

I think it’s safe to say that most people with an interest in how “woman” should be defined for sporting purposes are unhappy with any of the past standards and with the current one. An issue with the current standard is, as the article says: “People who go through male puberty are taller, have bigger bones and develop greater muscle mass than those who go through female puberty, said William Briner, a sports medicine physician at the Hospital for Special Surgery in Uniondale, New York, during a session on transgender athletes at the American College of Sports Medicine meeting in early June. Men also have more red blood cells than women, and their hearts and lungs are bigger too. Some of these advantages are irreversible [even if testosterone levels are later reduced].”

Furthermore, although the Olympics focuses only on the best athletes in the world, there’s a need for competition rules that apply at other levels too. High school basketball, for example, needs a way to determine who is eligible to play on the girls’ team. Is it fair for a 17-year-old 6’4″ high school athlete who went through puberty as a male, but then transitioned to female, to be allowed to compete against girls who went through puberty as girls? Viewed one way, being a woman who went through puberty as a male is just another genetic advantage, and we let athletes use their genetic advantages, so there’s no problem with such a person competing on the girls’ team. Viewed another way, it’s not fair to let someone compete as a girl if they got their body by growing up as a boy. 

So what do I think the rule should be? I have no idea. Indeed, I’m not even sure how to think about what the rules are trying to achieve. We have separate competitions for men and women because it would be “unfair” for women to have to compete against men: in most sports the best women wouldn’t stand a chance against the best men, and the average woman wouldn’t stand a chance against the average man. A female sprinter, for instance, would have no chance of reaching the elite level if competing against men. OK, fine…but what about me? I’m a man but I would also have no chance of reaching the elite level if sprinting against other men. Indeed, I would have no chance of reaching the elite level if I were sprinting against women! The fact is, very few people have the genes (and other characteristics) to compete at the elite level. If the point of the rules is to give everyone a reasonable chance to be among the best in the world in a given category, well, that’s not gonna happen, because no matter how you define the categories there will be only a small fraction of people who have what it takes.  By having separate competitions for men and women we can’t really be trying to give “everyone” a chance. So what are we trying to do?

All of this puts me in mind of a statistical principle or worldview that Andrew has mentioned before, that I think he attributes to Don Rubin: most things that we think of as categorical are really continuous. For some purposes (and with some definitions) male/female is indeed categorical — no male can bear a child, for example — but when it comes to innate ability in a sport, what we have is one statistical distribution for men and another statistical distribution for women, and (for most sports) for any reasonable definition of “man” and “woman” the high tail for men will be higher than the high tail for women, but the bulk of the distributions will overlap. If Caster Semenya is a woman for sporting purposes, in spite of her XY chromosomes and typically-male testosterone level, then she is at the very top of the sport. If she is a male for sporting purposes, then she is not remotely competitive at the elite level. (Welcome to the club!) Sporting ability is continuous but we have to somehow force people into two categories, assuming we want to continue the current male/female division in competitions.
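To make that two-overlapping-distributions picture concrete, here’s a minimal simulation sketch. Every number in it (the means, the spread, the tail cutoff) is made up purely for illustration; nothing here is an estimate of any real ability gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two purely hypothetical "ability" distributions: same spread, modestly
# different means. These parameters are invented for illustration only.
n = 1_000_000
group_a = rng.normal(loc=0.8, scale=1.0, size=n)   # hypothetical group A
group_b = rng.normal(loc=0.0, scale=1.0, size=n)   # hypothetical group B

# The bulk of the distributions overlaps: a random member of group B
# still beats a random member of group A a substantial fraction of the time.
print("P(random B beats random A):", np.mean(group_b > group_a))

# But the extreme upper tail is dominated by the group with the higher mean.
cutoff = np.quantile(np.concatenate([group_a, group_b]), 1 - 1e-4)  # top 0.01% overall
tail_a = np.mean(group_a > cutoff)
tail_b = np.mean(group_b > cutoff)
print("Share of the top 0.01% from group A:", tail_a / (tail_a + tail_b))
```

The qualitative pattern is the point: lots of overlap in the middle of the distributions, lopsided representation way out in the tail where elite competition happens.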

It’s said that “hard cases make bad law” but this seems like a sphere in which all of the cases are going to be hard.



This post is by Phil.

If a value is “less than 10%”, you can bet it’s not 0.1%. Usually.

This post is by Phil Price, not Andrew.

Many years ago I saw an ad for a running shoe (maybe it was Reebok?) that said something like “At the New York Marathon, three of the five fastest runners were wearing our shoes.” I’m sure I’m not the first or last person to have realized that there’s more information there than it seems at first. For one thing, you can be sure that one of those three runners finished fifth: otherwise the ad would have said “three of the four fastest.” Also, it seems almost certain that the two fastest runners were not wearing the shoes, and indeed it probably wasn’t runners 1 and 3, or 2 and 3, either: “The two fastest” and “two of the three fastest” both sound better than “three of the top five.” The principle here is that if you’re trying to make the result sound as impressive as possible, an unintended consequence is that you reveal the upper limit of what you could truthfully claim. Maybe Andrew can give this principle a clever name and add it to the lexicon. (If it isn’t already in there: I didn’t have the patience to read through them all. I’m a busy man!)
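Just for fun, here’s a minimal sketch of that reasoning. The scoring rule it uses for how impressive a claim sounds (the fraction k/m, with ties broken in favor of mentioning fewer runners) is my own assumption about how ad copywriters think, nothing more.

```python
from itertools import combinations

def best_claim(places, max_m=5):
    """Most impressive claim of the form 'k of the m fastest' available to
    an advertiser whose runners finished in the given places. Impressiveness
    is scored as k/m, ties broken by smaller m -- an assumed rule."""
    candidates = []
    for m in range(1, max_m + 1):
        k = sum(1 for p in places if p <= m)
        if k > 0:
            candidates.append((k / m, -m, f"{k} of the {m} fastest"))
    return max(candidates)[2]

# Which sets of three finishing places among the top five would leave
# "3 of the 5 fastest" as the best available boast?
for places in combinations(range(1, 6), 3):
    if best_claim(places) == "3 of the 5 fastest":
        print(places)
```

Under that (entirely debatable) scoring rule, the only combinations that survive are places 2-4-5 and 3-4-5, which is even more restrictive than the argument above; of course, the conclusion depends completely on the assumed scoring rule.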

This came to mind recently because this usually-reliable principle has been violated in spectacular fashion by the Centers for Disease Control (CDC), as pointed out in a New York Times article by David Leonhardt. The key quote from the CDC press conference is “DR. WALENSKY: … There’s increasing data that suggests that most of transmission is happening indoors rather than outdoors; less than 10 percent of documented transmission, in many studies, have occurred outdoors.” Less than 10%…as Leonhardt points out, that is true but extremely misleading. Leonhardt says “That benchmark ‘seems to be a huge exaggeration,’ as Dr. Muge Cevik, a virologist at the University of St. Andrews, said. In truth, the share of transmission that has occurred outdoors seems to be below 1 percent and may be below 0.1 percent, multiple epidemiologists told me. The rare outdoor transmission that has happened almost all seems to have involved crowded places or close conversation.”

This doesn’t necessarily violate the Reebok principle because it’s not clear what the CDC was trying to achieve. With the running shoes, the ad was trying to make Reeboks seem as performance-boosting as possible, but what was the CDC trying to do? Once they decided to give a number that is almost completely divorced from the data, why not go all the way? They could say “less than 30% of the documented transmissions have occurred outdoors”, or “less than 50%”, or anything they want…it’s all true! 

The Pandemic: how bad is it really?

This post is by Phil Price, not Andrew.

Andrew’s recent post about questionable pandemic death rate statistics has reminded me that I have not yet posted about a paper Troy Quast sent me. Quast is from the University of South Florida College of Public Health. Quast, Ross Andel, Sean Gregory, and Eric Storch have written a paper: “Years of Life Lost Associated with COVID-19 Deaths in the United States During the First Year of the Pandemic.” [Note added later at Quast’s request: Here’s the published version in Journal of Public Health.]

Here’s the money quote: “We estimated roughly 3.9 million YLLs due to COVID-19 deaths, which corresponds to roughly 9.2 YLLs per death. We observed a large range across states in YLLs per 10,000 capita, with New York City at 298 and Vermont at 12.”

This is a follow-up to a similar paper from 2020 that appeared in the Journal of Public Health…similar in approach, although that paper covered only the first six or seven months of the pandemic. Quast very kindly credits me with inspiring this work via a post on this blog last May. That post generated lots of interesting discussion, which some of you may want to revisit.

[Yikes! I would have thought I could at least get the order of magnitude right when I divide in my head, but evidently not, or at least not reliably. As several early commenters noted, I bungled the numbers badly. The numbers below have been fixed, and the commentary to match.]

3.9 million years of life lost (YLL): how does that help us gauge the magnitude of the pandemic? Well, we could compare it to U.S. military losses in WWII, for example: about 400,000 people killed, at roughly 50 years each, so 20 million YLL. In the first year of the pandemic, the U.S. lost about 1/5 as many years to COVID as we lost in all of WWII.

Daniel Lakeland has suggested that YLL is most easily interpreted if you divide by 80 to get ‘equivalent lifetimes’. 3.9 million years of life lost corresponds to about 50,000 lifetimes. 
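Having already demonstrated that I shouldn’t do this arithmetic in my head, here it is spelled out, using only the rough numbers quoted above:

```python
# All inputs are the approximate figures quoted in the text above.
covid_yll = 3.9e6           # years of life lost to COVID in the first year (from the paper)
ww2_deaths = 400_000        # approximate U.S. military deaths in WWII
ww2_years_per_death = 50    # rough years of life lost per WWII death

ww2_yll = ww2_deaths * ww2_years_per_death
print(ww2_yll)                   # 20,000,000 -> "20 million YLL"
print(covid_yll / ww2_yll)       # 0.195      -> "about 1/5 as many years"

# Daniel Lakeland's suggestion: divide by 80 to get 'equivalent lifetimes'.
print(covid_yll / 80)            # 48,750     -> "about 50,000 lifetimes"
```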

In 2015 the National Transportation Safety Board published a table of leading causes of death, ordered by years of life lost. At 3.9 million YLL in a year, COVID would be third on that list (after cancer and heart disease), and more than double the next contender (chronic lower respiratory disease).

A bit of a tangent, but in case you’re wondering, the 2015 list, in order by YLL, was: Cancer; Heart disease; Chronic lower respiratory disease; Accidental poisoning; Suicide; Stroke; Motor vehicle crashes; Diabetes; Chronic liver disease; Infant death. This shocks the hell out of me. I knew there are lots of suicides, but I would have bet traffic accidents were responsible for way more YLL than suicide; but no (it was quite close, though, at least in 2015). And I would never have put ‘accidental poisoning’ at number four, probably not even in the top 10.

But I digress. 

Check out the paper; there’s some pretty interesting stuff in there, such as: “In every jurisdiction, the male value is greater than the female value, which is consistent with YLLs per 10,000 capita for males being roughly a third greater than for females at the national level (136.3 versus 102.3). However, the divergence between the two genders varies considerably by state. In New York City, the male value was nearly 75% greater than the female value, while in Mississippi the male value was only 7% greater.”

 

This post is by Phil