What are the x and y-axes here?
P.S. Popeye nails it (see comments).
X-axis is ln() of something…
What jumped up from 1902 to 1903, stayed high until 1917, then dropped back down in 1918?
I have no idea of the answer to this one, by the way. I’m guessing it’s some sort of social statistics but it really could be any two variables that are measured each year.
Probably not anything economic, since there is no great spike or drop during the Great Depression…
In fact, no irregular behavior during the Great Depression, WWI, or WWII, but “phase shifts” of some sort in 1919-21, 1949-57, 1993-94, and what looks like an ongoing phase shift that started in 2008.
I’m mostly stumped, but I suspect that it could be some sort of obscure sports statistics, like average home runs hit per major league baseball player per year vs. average attendance, or something like that. Baseball is a likely candidate: the game changed so significantly between 1918 and 1921 that the pre-1920 and post-1920 eras even have their own names (“dead-ball era” and “live-ball era”). But I can’t identify any parameters that would fit either axis.
Not to mention log scale or linear?
Some relation to Okun’s law perhaps?
In particular: https://skitch-img.s3.amazonaws.com/20100213-kbus3a3mf61ncax25qi4fckc7x.gif
In the end, we’re probably just looking for patterns in the realizations of a 2D RW, as in the following:
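A minimal sketch of such a realization, assuming independent Gaussian steps in each coordinate (the function name, the seed, and the choice of 112 steps, roughly one per year, are illustrative assumptions, not taken from the chart). Connecting the points in order gives a path that can easily appear to have "eras" and "phase shifts":

```python
import random

def random_walk_2d(n_steps, seed=0):
    """Return the path [(x0, y0), (x1, y1), ...] of a 2D random walk.

    Each step adds an independent standard-normal increment to x and to y.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, 1.0)
        y += rng.gauss(0.0, 1.0)
        path.append((x, y))
    return path

# Hypothetical example: one point per "year" for 112 years.
path = random_walk_2d(112, seed=1)
for year_index, (x, y) in enumerate(path[:5]):
    print(year_index, round(x, 2), round(y, 2))
```

Plotting several seeds side by side is a quick way to see how often a pure random walk produces clusters and sharp turns that look meaningful.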
Home runs per game on the y-axis, strikeouts per game on the x-axis…
See also http://www.nytimes.com/interactive/2013/03/29/sports/baseball/Strikeouts-Are-Still-Soaring.html and http://michaelbein.com/baseball.html
Cool. Oddly enough, I came close to this combination but didn’t put it together. When the earlier commenter suggested baseball and 1919, I first thought of home runs but realized it wouldn’t work with that time frame (because there was no pre-1903 home run era), then I thought of strikeouts (since these go with home runs), but I didn’t think to put them together.
very impressive – I actually figured it’d be sport because as people say above, economic stats would show the wars & great depression. Also the blogger at Chartsnthings who posted this does a lot of sports. Given long-running stats I thought baseball – but I know little about baseball and couldn’t believe there were such consistent upward trends, so I rejected it.
The x-axis is litres of displacement and the y-axis is horsepower / cubic centimetre. It’s naturally aspirated engines from a particular company or kind of vehicle… Ford trucks?
Strikeouts per game and homers per game?
You can figure out what this is a graph of from famous high and low points:
- There is a stand-alone spike in hitting-crazed 1930, when Hack Wilson drove in 190 runs, and the National League is rumored to have juiced the ball.
- 1987 was another offensive year (allegedly due to changing the ball for one year).
- The year of the pitcher, 1968, is below other years of its era.
- 2000 was the peak of the steroids era of offense.
- 1918 is a low homer year — there was something wrong with baseballs due, somehow, to the War Effort (maybe umpires were told to keep dirty, spat-upon baseballs in play longer)
- 1943 is a low homer year — a lot of sluggers like Williams and DiMaggio were in the Army by then, and they were once again cutting corners on the quality and quantity of baseballs
Once you figure out that this is a graph of homers and strikeouts, you can use it to see longer term trends:
- The most vertical section of the graph is the era around 1920 when Babe Ruth revolutionized the game by demonstrating swinging for the fences made sense.
- The 1949-1956 era was dominated by players following Ted Williams’ example of waiting for a good pitch to drive, even if they struck out more than Williams did.
- We currently are in an era of rapidly increasing strikeouts despite homers being stable since players got the message that steroids testing was semi-serious.
Re “there was something wrong with baseballs due, somehow, to the War Effort”.
I don’t know much about baseball, but Wikipedia informs me that, in the early 20th century, “part of every pitcher’s job was to dirty up a new ball the moment it was thrown onto the field. By turns, they smeared it with dirt, licorice, tobacco juice; it was deliberately scuffed, sandpapered, scarred, cut, even spiked. The result was a misshapen, earth-colored ball that traveled through the air erratically, tended to soften in the later innings, and as it came over the plate, was very hard to see”.
This went on till 1920, when a Major League player (Ray Chapman) got hit in the head and killed by a pitch. After that, the league adopted some rule changes, including a ban on spitting on balls, which made the balls more predictable and easier to see, and the permanent increase in the number of home runs past 1920 was one of the side effects.
One big question in baseball history was whether Ruth’s giant homerun stats were, as is widely assumed, the result of the end of the Deadball Era. Was Ruth a product of his times? Or did Ruth personally change his times?
A counter-theory is that Ruth made his breakthrough during the Deadball Era and he personally played a huge role in moving baseball into its golden age. Despite his lack of education, Ruth was the original Moneyball theorist who figured out that the reigning intellectual orthodoxy of singles-hitting was wrong, that it made more sense to accept higher risk (in terms of striking out) to garner more reward (in terms of home runs, which had been considered flashy and gauche in Ty Cobb’s preceding era).
Not only was Ruth a role model to younger sluggers, but he generated so much revenue for baseball owners that when Ray Chapman was killed on August 17, 1920 by a dirty ball that he apparently never saw, the owners found they could easily afford to do the right thing and instruct umpires to replace worn balls with fresh ones multiple times in each game. This safety innovation also made hitting easier. New baseballs were easier to see and easier to hit for home runs.
However, it’s clear that Ruth’s big breakthrough came before this post-Chapman reform. Ruth hit a record 29 homers in 1919, which everybody agrees was part of the Deadball Era. And he did it in only 130 games and playing in Fenway, which was then a pitcher’s park. Ruth hit 20 of his 29 homers on the road in 1919. If he’d been playing in an average ballpark, that would project out to about 42 or 43 homers in 1919.
Then in 1920, Ruth already had 42 of his 54 homers when Chapman was killed about 3/4ths of the way through the season. All this suggests to me that the usual interpretation is backward: that Ruth’s breakthrough largely happened during the Deadball Era.
To give some perspective, the 20th Century homer record before Ruth in 1919 had been Gavvy Cravath’s 24 in 1915. Cravath is an interesting figure in that he’s the closest thing to a proto-Ruth. But Cravath’s homer-hitting was largely a function of playing in small Baker Bowl in Philadelphia. He hit 19 of his 24 homers in 1915 at home, and 94 of his 119 career homers at home.
But Cravath had been stuck in the minors during his peak years in his late 20s, so he didn’t make much of an impression on baseball orthodoxy. Ruth almost singlehandedly took on and changed the traditions of how the game ought to be played.
I remember Bill James making that point about Babe Ruth, that in addition to his athletic abilities he was an innovator, a man who didn’t believe that ordinary rules applied to him. Ruth broke the unwritten rules by swinging really hard even at the risk of striking out; that’s a lot cooler than breaking the written rules by throwing World Series games. One of the challenges of being an innovator is knowing what rules to break.
A partial ban on spitballs went into effect in the winter of 1919/20. As the chart indicates, 1920 already saw more home runs per game than any season in the history of major league baseball.
If we assume that the average number of home runs per game through 1921 is a good indicator of the “ease” of hitting the ball (Ruth’s influence on game play came later), and we renormalize Ruth’s home runs to the 1919 ball, we get 29 home runs in 1919, 42 home runs in 1920, 31 home runs in 1921, 18 home runs in 1922 (short season), and 21 home runs in 1923.
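One way to read this renormalization, as a rough sketch: scale each season's total by the ratio of the league home-run rate in 1919 to the league rate in that season. The function names are hypothetical, and no actual league rates are used here. Instead, the second helper backs out the rate ratio implied by the adjusted totals above, taking Ruth's actual totals of 54 (1920) and 59 (1921) as given.

```python
def renormalize(home_runs, league_rate_that_year, league_rate_baseline):
    """Scale a season total to the baseline year's league home-run rate."""
    return home_runs * league_rate_baseline / league_rate_that_year

def implied_rate_ratio(actual_home_runs, adjusted_home_runs):
    """League rate that year divided by the baseline rate, as implied by an adjustment."""
    return actual_home_runs / adjusted_home_runs

# Ratios implied by the adjusted totals quoted above (not measured league data).
ratio_1920 = implied_rate_ratio(54, 42)
ratio_1921 = implied_rate_ratio(59, 31)
print(round(ratio_1920, 2), round(ratio_1921, 2))

# Round trip: applying the implied ratio recovers the adjusted total.
print(round(renormalize(54, ratio_1920, 1.0), 1))
```

Under this reading, the adjusted figures imply a league home-run rate in 1921 that was nearly double the 1919 rate.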
Without a doubt, Ruth was an exceptional player, and it appears that his best year was 1920 (when, despite the spitball still being partially legal, he set a slugging average record that stood for 80 years). If we dig deeper into his performance, it is probable that we will find evidence that 1920 was a statistical fluke.
“renormalize Ruth’s home runs to the 1919 ball”
A. The 1919 ball was very much the Dead Ball.
B. Ruth’s record setting homerun total of 29 in 1919 was depressed by playing in what was then a pitcher’s park in Boston. His Red Sox teammates only hit four homers all season, home or away! Similarly, when Ruth led the league in homers in 1918 with 11 while going 13-7 as a pitcher, his teammates only hit 4 homers.
C. The 1919 season, like the 1918 season, was shortened (I believe due to the Great War). The Red Sox only played 137 games in 1919, compared to the normal 154 games of that era. Ruth played in only 130 games (he was still pitching, going 9-5, so he had a few games off).
So, give Ruth a normal ballpark and a full season in 1919 and he’d have hit close to 45 homers in the Dead Ball Era! Ruth’s 1919 season came as a revelation to the rest of baseball of what his offensive strategy could accomplish. So, the increase in offense after 1919-20, while undoubtedly related to better balls, was no doubt sizably driven simply by Ruth’s example.
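The arithmetic behind a figure like that can be sketched as follows. The assumptions are simple ones, not a real park-factor model: half of Ruth's games were on the road, a neutral park plays like his road parks, and his playing time scales with the length of the team's schedule. The 20 road homers are the figure quoted earlier in the thread; the variable names are illustrative.

```python
road_home_runs = 20      # Ruth's 1919 homers away from Fenway, as quoted in the thread
team_games_1919 = 137    # Red Sox games in the shortened 1919 season
full_season_games = 154  # normal schedule length of that era

# Neutral park: assume home games would have produced homers at the road rate.
neutral_park_home_runs = 2 * road_home_runs

# Full season: scale by the ratio of a normal schedule to the shortened one.
projected = neutral_park_home_runs * full_season_games / team_games_1919

print(round(projected, 1))
```

Doubling the road total gives 40, and stretching the schedule from 137 to 154 games brings that to just under 45.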
Other standout years that are visible on the chart:
- Roger Maris’s 1961 (the first Expansion Year since 1901) was the culmination of the long growth in homers and walks from the Ruth Era. The strikezone was expanded after that.
- 1972 was a mini-Year of the Pitcher, which led the American League to adopt the Designated Hitter rule the next year.
- I hadn’t realized 1976 was as much of a pitcher-friendly year, but, now that I think about it, the next year, 1977, showed big gains in offensive numbers, presumably due to expansion.
- 1911 was the liveliest offense year of the Dead Ball Era.
As for the overall secular trend toward more strikeouts and more homeruns, it was set in motion in 1918 when Babe Ruth shifted away from pitching to playing the outfield. He led the league with 11 homeruns in 1918, 29 the next, 54 the next, and 59 in 1921. Ty Cobb pointed out that Ruth was only allowed to practice his uppercut because he was a pitcher, so nobody cared about his batting style. If he’d started out as a hitter, somebody in authority would have told him to stop showing off and making a fool of himself by striking out so much.
Ruth’s fame is well-deserved.
For data of this sort, do people prefer a scatter plot or twin time series?
Wonder what’s easier to interpret.
Definitely twin time series. This graph is horrendous to make sense of, even now that I know what it shows and with Steve Sailer’s helpful commentary. And I even know a bit about baseball. Two time series would be much better.
Maybe one could colour-code the time, so that the beginning of the time series was red and the end blue: then one could place any year visually by looking at the colour blend.
(I’d prefer twin time series as well).
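The colour-coding idea can be sketched as a linear blend from red to blue. This is only one possible mapping; the function name and the 1900-2012 year range are assumptions for illustration, not read off the chart.

```python
def year_to_rgb(year, first_year, last_year):
    """Blend linearly from red (first_year) to blue (last_year).

    Returns an (r, g, b) tuple with each channel in [0, 1].
    """
    if last_year == first_year:
        return (1.0, 0.0, 0.0)
    t = (year - first_year) / (last_year - first_year)
    t = min(1.0, max(0.0, t))  # clamp years outside the range
    return (1.0 - t, 0.0, t)

# Hypothetical range: 1900 through 2012.
for year in (1900, 1920, 1956, 2012):
    print(year, year_to_rgb(year, 1900, 2012))
```

Each segment or point of the scatter plot would then be drawn in the colour for its year, so purple points fall near the middle of the series.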
The celebration of strikeouts in baseball seems to be a bit perverse, along the lines of the celebration of aces in tennis. In each case, it’s an impressive achievement by the pitcher/server but makes the game itself more boring. I’ve long thought they should stop giving tennis players two tries on the serve, but I suspect one reason this rule change hasn’t been done is that people somehow think of an ace as a positive good in itself.
A home run is not nearly as fun to watch as a triple, but at least you get to see someone hit a ball really really hard, which is something.
Right. The Ruthian Revolution did much to make baseball a better spectator sport, but 90+ years later, with the trends set in motion by Ruth still going on, we’re seeing too much of a good thing as the game continues to head toward a strikeout vs. homerun end state.