On the Upswing?
Examining the surge of home runs in Major League Baseball
by Julia E. Seaman and I. Elaine Allen
For the last three Major League Baseball (MLB) seasons, the number and rate of home runs (HR) have increased, especially in the seasons’ first months (March and April).1 Analyses of these data have concentrated on the batters, their swings and the baseball itself as potential influencing factors of this surge.
But there are many other variables in pitching matchups that can lead to HRs, including the pitchers, pitch type, ballparks and weather.
The increase in HRs has been great enough that in 2018, the baseball commissioner’s office created a committee to conduct primary and secondary research to identify potential causes.2 But no definitive source was found. The committee concluded that there may be changes in the baseball’s aerodynamics, as well as how the baseballs are being stored (in varying degrees of temperature and humidity) that may influence the HR rate.
Similarly, author Robert Arthur focused on the baseball’s physical attributes, specifically the drag coefficient, which measures the ball’s loss in speed from the pitcher’s release until it crosses the plate.3 The analysis found that the drag coefficient eliminated the effect of weather on the baseball, such as cold temperatures in April games, and a Bayesian random walk model showed much smaller drag coefficients in the first month of the 2019 regular season.
Interestingly, many pitchers have their own opinions about the baseballs in 2019. Author Matthew Cerrone interviewed seven pitchers in early May, and they all said they believed that the baseballs have a different feel this year (smoother, difficult to grip). Former New York Yankees Manager Joe Girardi attributed the HR surge to batters getting used to hard-throwing pitchers.4
In this column, we investigate the trend of increased HRs for the first month of the MLB season for the five-year period between 2015 and 2019. We use quality control methods and statistical modeling to determine whether there is evidence for an increase in HR rates and the role that pitchers play in the trend.
Evidence for an increase?
The number of HRs per game in the first month of the MLB season has increased numerically, as shown in Table 1. From 2015 to 2019, the total number of HRs almost doubled, while the number of games played during that time increased by only one-third. This corresponds to an increase of 0.81 HRs per game played in the first month of the season.
However, while the HR rates appear to increase year over year, the only statistically significant differences (adjusted for multiple comparisons) are between the 2015 and 2019 HR rates and between the 2016 and 2019 HR rates (p = 0.024 and p = 0.043, respectively). Therefore, there is evidence that HR rates have increased, though the increase may not be as statistically significant as the raw numbers suggest.
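To illustrate what adjusting for multiple comparisons involves, here is a minimal sketch in Python. The column does not say which adjustment was used, so the Holm step-down method shown here is an assumption, and the raw p-values are invented purely for illustration (they are not the values behind Table 1). With five seasons there are 10 pairwise comparisons, and each raw p-value is scaled up according to its rank before being compared with 0.05.

```python
from itertools import combinations

def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

seasons = [2015, 2016, 2017, 2018, 2019]
pairs = list(combinations(seasons, 2))  # 10 pairwise season comparisons

# Hypothetical raw p-values, one per pair, invented for illustration only.
raw_p = [0.60, 0.20, 0.08, 0.002, 0.45, 0.15, 0.004, 0.70, 0.03, 0.30]
adj_p = holm_adjust(raw_p)

for pair, raw, adj in zip(pairs, raw_p, adj_p):
    flag = "significant" if adj < 0.05 else "not significant"
    print(pair, raw, round(adj, 3), flag)
```

In this made-up example, the 2017 versus 2019 comparison has a raw p-value below 0.05 but is no longer significant after adjustment, which is the kind of gap between raw and adjusted results the paragraph above describes.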
The pitcher’s role
While the baseball physics and batter strategies are key components to hitting HRs, the pitcher is an indispensable part of the matchup. One of the most basic statistics of a pitcher-hitter duel is the number of pitches thrown. Each pitch is a new opportunity for the hitter to put the baseball into play, such as an HR, or for the pitcher to strike out the player and record an out.
For HRs in the first month of the season, it takes an average of 1.01 to 1.26 pitches for the batter to hit an HR, with a median of two pitches per HR for every year, as shown in Table 1. Therefore, most of the HRs during this period were hit on the first or second pitch of the at bat. The number of pitches per HR trends upward from 2015 to 2019, with significant differences (adjusted for multiple comparisons) between 2015 and 2019 and between 2016 and 2019 (p = 0.006 and p = 0.009, respectively).
In contrast, the average number of pitches per at bat did not significantly change over this same period, remaining at about 3.9 pitches. Therefore, it does appear that while pitching battles have been the same length during the last five seasons, the chance that a baseball in play will be an HR has (slightly) increased.
A Kruskal-Wallis nonparametric analysis (Table 2) shows a slightly different result, with 2019’s data statistically different from 2015, 2016 and 2018 (p = 0.01), but not from 2017. The advantage of this test is that it examines the ranks, so large outliers do not have extreme weights in the analysis.
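The rank-based behavior of the Kruskal-Wallis test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis behind Table 2: the per-pitcher HR counts are hypothetical, the helper names are made up for this example, and the p-value uses the usual large-sample chi-square approximation (written in a closed form that only works for an even number of degrees of freedom, which holds for five seasons).

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

# Hypothetical HRs allowed per pitcher, invented for illustration only.
hr_by_season = {
    2015: [1, 1, 1, 2, 2, 3],
    2016: [1, 1, 2, 2, 3, 3],
    2017: [1, 2, 2, 3, 4, 5],
    2018: [1, 1, 2, 3, 3, 4],
    2019: [2, 3, 3, 4, 5, 11],
}
groups = list(hr_by_season.values())
h = kruskal_wallis_h(groups)
p = chi2_sf_even_df(h, len(groups) - 1)

# Make the largest outlier far more extreme: the ranks do not change.
inflated = groups[:-1] + [[2, 3, 3, 4, 5, 40]]
h_inflated = kruskal_wallis_h(inflated)
print(f"H = {h:.3f}, approximate p = {p:.3f}, H with inflated outlier = {h_inflated:.3f}")
```

Because only the ordering of the values enters the statistic, replacing the largest count with a much larger one leaves H unchanged, whereas a comparison of means would shift noticeably.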
Looking at details of pitching matchups adds another layer to this story. Over these seasons, more than one-third of the pitchers gave up only one HR in the first month, and 90% gave up fewer than five HRs. This distribution did not change from 2015 to 2019. Figure 1 shows the overall distribution of HRs per pitcher.
In addition, Figure 2 shows box plots by season. The figure shows a large number of outliers, and the outliers grow larger in successive seasons. The number of outliers5 (values more than 1.5 times the interquartile range above the third quartile) ranges from a low of 15 in 2015 to a high of 30 in 2019.
From Figures 1 and 2, it’s clear that standard statistical descriptors such as means miss the nonnormality of the distributions and give excessive weight to the outliers. Therefore, while there are more HRs on average each season, there is much less change season to season than the single statistic shows.
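The box-plot outlier rule can be made concrete with a short Python sketch. The HR counts below are hypothetical and the function name is made up for this example. Quartile conventions differ between software packages, so the "inclusive" method chosen here is an assumption and the resulting fence may not match the one used for Figure 2.

```python
import statistics

def upper_outliers(values):
    """Return the upper fence (Q3 + 1.5 * IQR) and the values above it."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    return fence, [v for v in values if v > fence]

# Hypothetical HRs allowed per pitcher in one month, for illustration only.
hr_allowed = [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 9, 12]

fence, outliers = upper_outliers(hr_allowed)
mean = statistics.mean(hr_allowed)
median = statistics.median(hr_allowed)
print(f"upper fence = {fence}, outliers = {outliers}")
print(f"mean = {mean:.2f}, median = {median}")
```

In this made-up sample, the two pitchers above the fence pull the mean well above the median, which is why a per-season mean can overstate how much the typical pitcher's results have changed.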
The best statistics to use
Overall, there has been an increase in the total number of HRs and mean HRs per game for the past five seasons for the first month of the MLB regular season. This also corresponds to a slight increase in the average number of pitches per HR. However, the increases are not significant in the recent years (2017-2019), which have been the focus of much conversation and analysis about these changes.
Baseball is a game of numbers, and often many statistics are chosen for their ease of calculation and comprehension. For example, average rates per at bat or per game are widely used to compare and rank all players.
Despite their prevalence and importance, these averages may not be the best statistics and often can hide the bigger picture. Specifically, for comparing the HRs per game by season, we have shown that while the raw values have increased, the increase does not necessarily correspond to statistical significance and can mask more important factors, such as individual outliers.
Additionally, the distribution of these average statistics often is not symmetric and, therefore, nonparametric tests may be more appropriate for the analyses.
References and Note
1. David Waldstein, “MLB Hired Scientists to Explain Why Home Runs Have Surged. They Couldn’t,” New York Times, May 24, 2018, https://tinyurl.com/nyt-HRs-increase.
2. Jim Albert, Jay Bartroff, Roger Blandford, Dan Brooks, Josh Derenski, Larry Goldstein, Anette (Peko) Hosoi, Gary Lorden, Alan Nathan and Lloyd Smith, “Report of the Committee Studying Home Run Rates in Major League Baseball,” Office of the Commissioner of Baseball, May 24, 2018, https://tinyurl.com/HR-committee-report.
3. Robert Arthur, “Moonshot: The Baseball Is Juiced (Again),” Baseball Prospectus, April 5, 2019.
4. Matthew Cerrone, “‘These Baseballs Suck’: Pitchers Weigh in and We Examine What’s Going on With Baseballs in MLB,” MetsBlog, May 5, 2019, https://tinyurl.com/mets-blog-HRs.
5. Values larger than the third quartile by 1.5 times the interquartile range.
Albert, Jim, “Home Runs Are Up in 2019,” Exploring Baseball Data With R blog, April 18, 2019, https://tinyurl.com/baseball-R-blog.
Julia E. Seaman is research director of the Quahog Research Group and a statistical consultant for the Babson Survey Research Group at Babson College in Wellesley, MA. She earned her doctorate in pharmaceutical chemistry and pharmacogenomics from the University of California, San Francisco.
I. Elaine Allen is professor of biostatistics at the University of California, San Francisco, and emeritus professor of statistics at Babson College. She is also director of the Babson Survey Research Group. She earned a doctorate in statistics from Cornell University in Ithaca, NY. Allen is a member of ASQ.