Saturday, February 17, 2007

Alternative Hall of Fame -- Methods (Normalization)

General

Most of the measures I use to determine the PHoM center on normalizing a player's stats so I can compare them across different eras and ballparks. The basic method I use is what Bill James called the "Willie Davis" method in his book "Win Shares."

If you are reading this blog, you probably know what park adjustments are. I use the park adjustment factors found on Baseball-Reference.com, which are derived from Pete Palmer's method in "Total Baseball."

Hitters

For hitter normalization, I took a slightly different approach to account for different run scoring environments. First, I calculated the value of an out under Palmer's linear weights method, for each year and each league in baseball history. The values range from a low of .185 in the 1875 National Association, to a high of .335 in the 1894 National League. I tried to find a relatively modern year in which both the NL and AL had the same out figure. The closest year was 1974, where the NL had a .271 and the AL had a .273. I wanted a more round number, so my baseline for the Willie Davis method is an out value of .270.
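For the curious, here is a rough Python sketch of how that out value falls out of a Palmer-style framework: the out coefficient is whatever makes the league's total batting runs net to zero. The event weights below are the commonly published Palmer values and the function is purely illustrative; the exact coefficients vary a bit from edition to edition of "Total Baseball."

    def league_out_value(lg):
        # lg: dict of league-wide totals -- singles, doubles, triples,
        # homers, walks_hbp, at_bats, hits.  The out value is the positive
        # run value the league generates, spread over its outs, so that
        # league batting runs sum to zero.
        positive = (0.47 * lg["singles"] + 0.78 * lg["doubles"]
                    + 1.09 * lg["triples"] + 1.40 * lg["homers"]
                    + 0.33 * lg["walks_hbp"])
        outs = lg["at_bats"] - lg["hits"]
        return positive / outs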

I apply this to runs created, and then work backward into the raw stats. In short, I normalize the hitting stats by: (1) park-adjusting runs created, (2) calibrating the actual season's out value to the baseline value of .270 and applying that to runs created, and (3) using the calibrated runs created figure to proportionally adjust the actual raw stats. The goal is to leave at-bats roughly the same as in the actual season, but normalize the other stats.
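In rough Python, and assuming the calibration in step (2) is a simple ratio of the .270 baseline to the season's actual out value (and that the park factor is applied as a straight divisor), it looks something like this:

    BASELINE_OUT_VALUE = 0.270

    def normalize_hitter(raw_stats, runs_created, park_factor, season_out_value):
        # (1) park-adjust runs created
        rc = runs_created / park_factor
        # (2) calibrate the season's out value to the .270 baseline
        rc = rc * (BASELINE_OUT_VALUE / season_out_value)
        # (3) scale the counting stats in proportion to the change in runs
        #     created, leaving at-bats where they were
        scale = rc / runs_created
        normalized = {stat: val * scale for stat, val in raw_stats.items()
                      if stat != "at_bats"}
        normalized["at_bats"] = raw_stats.get("at_bats", 0)
        return rc, normalized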

There's another problem, and that has to do with season length. Comparing the stats of someone who played 80 games in an 81-game major league season to someone who played 150 games in a 162-game major league season has some obvious disadvantages for the short-season player. Rather than just multiply the first player's stats by two, I use an exponential method to calibrate the seasons to a 162-game schedule. I divide 162 by the average number of games that a team played in the subject season, and raise the result to the 2/3 power to get an adjustment figure. I use that figure to make season-length adjustments.
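In code, that adjustment figure is just:

    def season_length_factor(avg_team_games, target=162):
        # (162 / average team games) raised to the 2/3 power
        return (target / avg_team_games) ** (2.0 / 3.0)

    # An 81-game season gets a factor of (162/81) ** (2/3), about 1.59,
    # rather than a flat doubling.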

Pitchers

What about pitchers? I normalize their linear weights by comparing the league-average ERA for the year to a baseline ERA of 3.75. That helps with normalizing ERA, but it does nothing for Wins.
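As a sketch, assuming the comparison is a simple ratio of the 3.75 baseline to the league-average ERA:

    BASELINE_ERA = 3.75

    def normalize_era(pitcher_era, league_era):
        # rescale the pitcher's ERA into the 3.75 baseline environment
        return pitcher_era * (BASELINE_ERA / league_era)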

Here's the method for Wins. First, I calculate the pythagorean win pct based on the normalized ERA, using an exponent of 1.83. I compare the number of wins under this method to the number of pythagorean wins the player would have with his un-normalized ERA. I then add (or subtract) the difference in pythagorean wins to the actual wins to get a new wins figure. Thus, I'm not using normalized pythagorean wins as the number of wins; I'm using actual wins, plus the difference in pythagorean wins between the normalized ERA and the actual ERA. I also add to those wins 1/2 of the pitcher's Wins Above Team.
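Here is a rough sketch of that bookkeeping. The 1.83-exponent pythagorean win pct is computed from an ERA and a run environment; since the paragraph above doesn't pin down which environment each win pct is measured against, those are left as inputs, and the assumption that the win percentages are spread over the pitcher's actual decisions is mine as well.

    PYTH_EXPONENT = 1.83

    def pyth_win_pct(era, environment_era, exponent=PYTH_EXPONENT):
        # pythagorean win pct for a pitcher with `era` against a run
        # environment summarized by `environment_era`
        return environment_era ** exponent / (
            environment_era ** exponent + era ** exponent)

    def adjusted_wins(wins, losses, era_normalized, era_actual,
                      env_normalized, env_actual, wins_above_team):
        # actual wins, plus the swing in pythagorean wins between the
        # normalized and un-normalized ERA, plus half of Wins Above Team
        decisions = wins + losses
        swing = (pyth_win_pct(era_normalized, env_normalized)
                 - pyth_win_pct(era_actual, env_actual)) * decisions
        return wins + swing + 0.5 * wins_above_team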

Finally, for pitchers, there is a separate kind of season-length disparity that is not based on the number of games a team plays. With current five-man rotations, the opportunities for wins are far fewer than when a team had two starters. It is difficult to compare Greg Maddux to Mickey Welch when it comes to wins. So I developed a separate season-length adjustment for pitchers.

This adjustment is based on the average number of starts earned by starters with at least 24 starts during a season. The baseline for this is right around 1974-1975, when the average "regular" starter started 34 games. Pitchers in early baseball who started a lot more games have their wins ratcheted down. For instance, in the 1873 National Association, the average regular starter started 47 games. Normalizing that to 34 games means I multiply the pitcher's totals by .723 in both the wins and losses columns. By contrast, in the 2005 American League, the average regular starter started 31 games, so those pitchers get their wins and losses boosted by a factor of 1.0968.
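Assuming the factor is a straight ratio of the 34-start baseline to the season's average (which matches both examples above):

    BASELINE_STARTS = 34  # average "regular" starter, circa 1974-75

    def starts_adjustment(avg_regular_starts):
        # factor applied to a pitcher's wins and losses
        return BASELINE_STARTS / avg_regular_starts

    # 1873 National Association: 34 / 47 -> about 0.723
    # 2005 American League:      34 / 31 -> about 1.097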

League Quality

One last normalization point. Some leagues are weaker than others, and that must be accounted for. After much discussion, I have determined there is no way to accurately measure weakness. At best we can make well-reasoned guesses. I have decided to ignore small indications of weakness, where one league is less than 5% weaker than another. Also, I ignore era-related weaknesses. I do not "timeline" -- that is, adjust for the fact that every player in major league baseball today is probably a better athlete than 95% of the players in 1900. I treat all eras equally. Accordingly, players during WWII are not dinged for the lower quality competition. They played against the best players available.

Still, there are some obvious instances of league weakness, primarily the National Association, American Association, Federal League, and Union Association. I ignore subtler weaknesses, like the American League's weakness relative to the National League in its first few years. With no real way to peg a number, and because there were quality players in the league, such an adjustment would be as likely to distort as to elucidate.

Here are my league quality adjustment factors.

Union Association: 0.65
Federal League: 0.76 (both years)
American Association: 0.78 (1882), 0.84 (1883), 0.89 (1884), 0.90 (1885 & 1889), 0.95 (1886-1888), 0.79 (1890), 0.76 (1891)
National Association: 0.90 (1871), 0.97 (1874), 0.72 (1875)
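These factors are applied as simple multipliers on a player's normalized numbers. In code form, a lookup keyed by league and season does the job; the two-letter abbreviations are just shorthand (the Federal League's two seasons are 1914-15 and the Union Association's lone season is 1884), and anything not listed defaults to no adjustment.

    # League quality multipliers, keyed by (league, year).
    LEAGUE_QUALITY = {("UA", 1884): 0.65}
    LEAGUE_QUALITY.update({("FL", y): 0.76 for y in (1914, 1915)})
    LEAGUE_QUALITY.update({("NA", 1871): 0.90, ("NA", 1874): 0.97,
                           ("NA", 1875): 0.72})
    LEAGUE_QUALITY.update({
        ("AA", 1882): 0.78, ("AA", 1883): 0.84, ("AA", 1884): 0.89,
        ("AA", 1885): 0.90, ("AA", 1886): 0.95, ("AA", 1887): 0.95,
        ("AA", 1888): 0.95, ("AA", 1889): 0.90, ("AA", 1890): 0.79,
        ("AA", 1891): 0.76,
    })

    def quality_factor(league, year):
        # seasons not listed above get no adjustment
        return LEAGUE_QUALITY.get((league, year), 1.0)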