Tuesday, February 26, 2008

Boone Factor for Jockeys


I couldn't resist calculating the Boone Factor for a jockey. The only jockey's name I know is Willie Shoemaker, and Wikipedia says he weighed 95 pounds (and only 2 1/2 pounds at birth).

His Boone Factor? 7.92. Nearly 8 beers a night!

The Boone Factor

Bret Boone was a productive player for a while, and then he fell off a cliff and retired in a surprise announcement in Spring Training 2006. No explanation.

Yesterday he revealed that the problem was alcohol. He's clean now, and all that stuff, and trying to make a comeback. We'll see.

What struck me about the story on MLB.com was this sentence: "Boone's problems started in a more subtle matter[sic], but it got to a point where he would drink 12 to 15 beers after a game."

Twelve to 15 beers? Maybe it's just me, but that seems like a lot of beer. That's a party.

Was it a lot for Bret? According to Baseball-Reference.com, Boone's playing weight was 180 pounds, which is about 81,647 grams (using Google's conversion feature). Fifteen beers at 12 ounces per beer is 180 ounces, or about 5,103 grams.

So, bathroom breaks aside, Boone may have added another 6.25% to his body weight after games. He may have weighed as much as 11 pounds more leaving the bar than when he arrived. I think I know why he lost some bat speed.

The sciences always have names for their formulae, so I'm going to refer to this as the Boone Factor. The Boone Factor has many applications. For instance, how many beers would that be for, say, Tony Siragusa? Pro-Football-Reference.com reports his weight at 330 pounds. The exchange rate would be about 27.5 beers.

To do your own calculations of the Boone Factor:

Step One: Multiply the player's weight (in pounds) times 453.59 to find his weight in grams

Step Two: Multiply the figure in Step One by 0.0625 (or 6.25%) -- call this the Boone Intake Parameter

Step Three: Divide the figure in Step Two by 28.35 to find the number of beer ounces imbibed

Step Four: Divide the figure in Step Three by 12 to derive the number of beers. (Substitute for the 12 if you are calculating tall boys, cans of malt liquor, or, appropriately, Boone's Farm.)
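
For anyone who would rather not do the arithmetic by hand, the four steps can be sketched in a few lines of Python. The function and constant names are my own invention; the conversion factors are the ones used in the steps above.

```python
GRAMS_PER_POUND = 453.59
GRAMS_PER_OUNCE = 28.35
BOONE_RATIO = 0.0625  # 15 twelve-ounce beers relative to Boone's 180 pounds


def boone_factor(weight_lb, ounces_per_drink=12):
    """Return the number of drinks equivalent to Boone's 15 beers."""
    weight_g = weight_lb * GRAMS_PER_POUND      # Step One: weight in grams
    intake_g = weight_g * BOONE_RATIO           # Step Two: Boone Intake Parameter
    intake_oz = intake_g / GRAMS_PER_OUNCE      # Step Three: ounces imbibed
    return intake_oz / ounces_per_drink         # Step Four: number of drinks


for name, pounds in [("Bret Boone", 180), ("Willie Shoemaker", 95), ("Tony Siragusa", 330)]:
    print(f"{name}: {boone_factor(pounds):.2f} beers")
```

Pass a different `ounces_per_drink` (16 for tall boys, say) to change the container. Because 6.25% is exactly one-sixteenth, the whole thing collapses to roughly the player's weight in pounds divided by 12.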

Monday, February 04, 2008

Anyone Can Write A Baseball Book

Tonight at Borders I read the Introduction and first chapter of a book called "Stat One" by Craig Messmer. The book consists of a ranking of historical players, by position, with some commentary.

Basically the facts and figures associated with the players have been seen in countless other places. The alleged innovation is a new stat that the author claims best represents hitting ability, or value, or both...I'm not sure. Best I can tell, he does not actually use the new stat to rank the players, so the rankings end up being just another guy's opinion.

The new stat is this: (Net Runs + Net Runs + Complete Bases)/Plate Appearances.

Net Runs are: RBI + R - HR
Complete Bases are: Total Bases + BB + HBP + SB - CS

That's it. That's the Holy Grail stat the book considers "A New System For Rating Baseball's All-Time Greatest Players." He calls it the P/E. The "P" is for production and the "E" is for efficiency.
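
To show how little machinery is involved, here is a minimal sketch of the stat as I understand it from the definitions above. The function names and the sample stat line are hypothetical; the numbers are made up for illustration and do not belong to any real player.

```python
def net_runs(rbi, runs, hr):
    # Net Runs: RBI + R - HR
    return rbi + runs - hr


def complete_bases(total_bases, bb, hbp, sb, cs):
    # Complete Bases: Total Bases + BB + HBP + SB - CS
    return total_bases + bb + hbp + sb - cs


def p_e(rbi, runs, hr, total_bases, bb, hbp, sb, cs, pa):
    # P/E: (Net Runs + Net Runs + Complete Bases) / Plate Appearances
    nr = net_runs(rbi, runs, hr)
    cb = complete_bases(total_bases, bb, hbp, sb, cs)
    return (nr + nr + cb) / pa


# Hypothetical season, invented for illustration only
season = dict(rbi=100, runs=100, hr=30, total_bases=300,
              bb=70, hbp=5, sb=10, cs=5, pa=650)
print(round(p_e(**season), 3))
```

In this made-up line, Net Runs is 170 and Complete Bases is 380, so counting Net Runs twice puts the teammate-dependent part (340) roughly on par with the individual part.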

He does almost nothing to justify the stat. He first criticizes production stats because they are dependent on other players. He then criticizes efficiency stats because they do not translate to runs. Fair criticisms, but nothing new.

How does he solve the problem? He creates a new stat that is two-thirds dependent on other players and divides it by plate appearances to produce an efficiency stat that does not translate to actual runs produced. In other words, his stat is composed of exactly the parts that he says don't work in existing statistical methods. Then he makes it worse by combining them all.

Sabermetricians might ask (among many other things) "Why count net runs twice?" His response? To even things out so that Complete Bases does not dominate the outcome.

P/E seems to have no mathematical or theoretical justification. It's just a guy taking a bunch of traditional stats and combining them in a way that hasn't been published before. There's a good reason a stat this simple hasn't already been published.

Add, subtract, divide. Mix and match. See how many combinations you can come up with. It doesn't have to mean anything. It will sell.

I'm thinking of (Height (inches) + Weight (pounds) + Total Bases)/(Plate Appearances - Strikeouts). Perhaps I'll publish a list of the greatest players of all time.

Sunday, February 03, 2008

Santana Trade Part III

We've got four Mets prospects...three pitchers and one hitter. I don't have Baseball America's Prospects book for this year, so I'll go with Kevin Goldstein's Top 100 list.

Two of the pitchers, Humber and Mulvey, are not listed in Goldstein's Top 100. They were both in Double A last year. Neither seems likely to crack the list of Top 25 pitchers next year.

The other pitcher, Deolis Guerra, is a kid. He is at the bottom of the Top 100 list. It is conceivable he could crack the Top 25 pitchers. Let's assume he does. Under Wang's system, his expected value is 11.2 WARP over six years, and the savings over a free agent add another 8.4 WARP for six years, so that's 19.6 in expected WARP value for Guerra.

Carlos Gomez, the young center fielder, is higher on Goldstein's list. It seems likely he could crack the Top 25 hitting prospect list next year, but he does not look like he'll be projected as a Top 10 guy. That's another 31.7 in expected WARP value.

So, if Gomez and Guerra are #11-#25 prospects as hitter and pitcher, respectively, and Humber and Mulvey do not crack the Top 25, Santana has to be worth 51.3 WARP over the next six years.

BP has not released PECOTA projections for this year, but last year's remain available. BP projected only five years, 2007-2011. For 2008-2011, Santana was expected to generate WARP of 23.3, as follows: 7.1, 6.2, 5.0 & 5.0. For 2012, let's assume another 5.0, and for 2013, let's assume a little decline, since he'll be 34 years old. Call it 4.5.

That makes Santana worth 32.8 over the six years. That seems to give the Twins a significant edge. It does, however, assume that Gomez and Guerra end up as top 25 prospects. Let's say Gomez has a 50% chance of being a Top 25 prospect, and Guerra has a 25% chance. The expected WARP for those two would now be 20.75, and the Mets win the trade (subject to the additional caveat below about Humber and Mulvey).
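
The arithmetic behind those figures is a plain expected-value calculation, sketched below. The variable and function names are mine. The WARP values are the ones quoted in this post, and the sketch assumes, as the paragraph above does, that a prospect who misses the Top 25 is counted as zero.

```python
# Santana: PECOTA for 2008-2011, plus my assumed 2012 and 2013
santana_pecota = [7.1, 6.2, 5.0, 5.0]
santana_assumed = [5.0, 4.5]
santana_total = sum(santana_pecota) + sum(santana_assumed)

# Six-year value (expected WARP plus salary-savings WARP) if each
# prospect lands in the #11-#25 tier
GOMEZ_WARP = 31.7
GUERRA_WARP = 19.6


def expected_return(p_gomez, p_guerra):
    """Probability-weighted WARP the Twins receive from Gomez and Guerra."""
    return p_gomez * GOMEZ_WARP + p_guerra * GUERRA_WARP


print(f"Santana:             {santana_total:.1f}")
print(f"Both certain:        {expected_return(1.0, 1.0):.2f}")
print(f"50% Gomez/25% Guerra: {expected_return(0.5, 0.25):.2f}")
```

Changing the two probabilities is the quickest way to see how sensitive the verdict is to one's opinion of Gomez and Guerra.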

Maybe PECOTA is off and Santana will be better. But keep in mind that in the last six years, Santana has provided 54.0 WARP, and that's during his prime! If he performs at that same level, the Mets win the trade.

Keep in mind that the scenarios showing the Mets winning are without any data about what prospects outside the Top 25 are worth. They are worth something, even if it is a long shot. Mulvey and Humber have some expected value that we'd need to add onto the Twins side of the equation.

Long story short, if Gomez and Guerra become Top 25 prospects next year, it looks like the Twins win. If Gomez does and Guerra doesn't, it's close to break even for the Twins, subject to whatever value Mulvey and Humber have. If Guerra does and Gomez doesn't, the Twins lose.
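
Those scenarios, and how much Humber and Mulvey would have to chip in for the Twins to break even, can be laid out with a short sketch. The names are my own, the WARP figures come from this post, and prospects who miss the Top 25 are treated as zero, which understates the Twins' side since they are surely worth something.

```python
SANTANA_WARP = 32.8   # six-year estimate from this post
GOMEZ_WARP = 31.7     # if Gomez is a #11-#25 hitting prospect
GUERRA_WARP = 19.6    # if Guerra is a #11-#25 pitching prospect


def twins_margin(gomez_top25, guerra_top25):
    """Twins' WARP received minus Santana's WARP (positive means Twins win)."""
    twins = GOMEZ_WARP * gomez_top25 + GUERRA_WARP * guerra_top25
    return twins - SANTANA_WARP


def break_even_extra(gomez_top25, guerra_top25):
    """WARP Humber and Mulvey must add, combined, for the Twins to break even."""
    return max(0.0, -twins_margin(gomez_top25, guerra_top25))


for gomez in (True, False):
    for guerra in (True, False):
        print(f"Gomez={gomez!s:5} Guerra={guerra!s:5} "
              f"margin={twins_margin(gomez, guerra):+6.1f} "
              f"needed from Humber/Mulvey={break_even_extra(gomez, guerra):.1f}")
```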

It's not as obvious a win for the Mets as I would have thought...at least using Wang's analysis.

One other point is required, and Wang mentions it too. Despite all the math, Santana is not just a star...he's a big star. He can win the Mets a couple of championships. All of the Twins guys can contribute to a team, but they cannot carry a team. If the Twins got three minor contributors and an everyday player, or even two of each, they may add up to a star's value. However, the star brings something more than raw value to the equation. The star pushes you over the edge towards a championship.

Santana Trade Part II

Top 10 hitting prospects have an expected 6-year WARP of 23.1, while those ranked #11-#25 are at 18.0. Top 10 pitching prospects have an expected 6-year WARP of 12.9, while those ranked #11-#25 are at 11.2.

The most innovative thought in Wang's analysis is to calculate how much salary a team saves by keeping a prospect, over what the team would spend to get the same WARP from a free agent. That salary savings can be used to then purchase or develop another player.

For example, a Top 10 hitting prospect can be expected to produce a 6-year WARP of 23.1 at a cost of approximately $2.0 million per year. A free agent producing the same WARP would cost about $8.8 million per year. The savings of $6.8 million per year could be used to purchase a free agent who would provide another 18.28 in WARP. Therefore the prospect is worth about 41.4 WARP, and any trade in which that prospect is shipped to another team requires that the "selling" team get at least that much WARP in return over the next 6 years to break even. (Again, these are figures from my spreadsheet using what I interpret Wang's methodology to be. They differ slightly from those reported in his tables.)
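
Here is a rough sketch of that salary-savings step. It reflects my reading of the method, not necessarily Wang's exact calculation: I assume the savings buy WARP at the same price per WARP that the equivalent free agent would cost. The function names are mine. Note that the rounded dollar figures quoted above give about 17.85 extra WARP rather than the 18.28 from my spreadsheet, which works from unrounded costs.

```python
def surplus_warp(expected_warp, prospect_cost, free_agent_cost):
    """Extra WARP purchasable with the money saved by keeping the prospect.

    Assumes the savings buy WARP at the free agent's own price per WARP.
    Costs are in $ millions per year.
    """
    savings = free_agent_cost - prospect_cost
    return expected_warp * savings / free_agent_cost


def trade_value(expected_warp, prospect_cost, free_agent_cost):
    """Total WARP a team should demand in return for the prospect."""
    return expected_warp + surplus_warp(expected_warp, prospect_cost, free_agent_cost)


# Top 10 hitting prospect, using the rounded figures from the text
print(round(surplus_warp(23.1, 2.0, 8.8), 2))
print(round(trade_value(23.1, 2.0, 8.8), 2))
```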

It works out something like this:

1. A top 10 hitting prospect is odds-on to be a contributor, and the money saved by keeping him and getting a free agent nets you another contributor. If you trade the prospect, you trade two contributors. You have to get a star in return.

2. An 11-25 hitting prospect is odds-on to be a minor contributor, and the money saved by keeping him and getting a free agent nets you another minor contributor. If you trade the prospect, you trade two minor contributors. You have to get an everyday player in return.

3. A top 10 pitching prospect is odds-on to be a minor contributor, and the money saved by keeping him and getting a free agent nets you a fairly useless pitcher. If you trade the prospect, you trade a minor contributor and a bust. You have to get a contributor in return.

4. An 11-25 pitching prospect is odds-on to be a bust, and the money saved by keeping him and getting a free agent nets you another fairly useless pitcher. If you trade the prospect, you trade two busts. You still have to get a contributor in return.


That's the concept I want to apply to the Santana deal.

(continued in Part III)

Santana Trade Part I

I have yet to see a publication, journalist or fan who thinks the Twins "won" the Santana trade. Today I read an article by Victor Wang in the most recent issue of The Baseball Research Journal that makes me wonder if the consensus is correct.

Wang looked at Top 10 prospects listed in Baseball America in the late 90s, and calculated how many of them were busts, how many were contributors, how many were everyday players and how many were stars. He used BP's WARP system to make those determinations, and I won't rehash them here. He did the same for prospects ranked 11-25 (and who never cracked the Top 10).

He then calculated what those prospects are expected to be worth, and what they are expected to cost, over a 6-year period. The first three years they make the major league minimum. The next three years they make $0.64 million, $0.83 million and $1.29 million per WARP. The expected figures are a straight "expected value" calculation, multiplying the historical probabilities from his Baseball America analysis by the salary and WARP figures he found.

He also calculated how much a free agent would cost assuming the same expected WARP, using $1.69 million per WARP in year one, and then increasing the cost by 10.87% each year for "inflation." (This last point is not entirely clear in the article. He mentions 10.87% escalation, but then seems to use a flat $1.69 million in his calculation.)
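
Here is a sketch of the two cost calculations as I interpret them. Several things in it are my assumptions and not Wang's: the WARP is spread evenly across the six years, the league-minimum salary is a parameter the caller supplies (the 0.38 in the usage lines is only a placeholder), and the escalation can be switched off to test the flat-rate reading. All function and constant names are hypothetical.

```python
YEARS = 6
PROSPECT_RATES = [0.64, 0.83, 1.29]  # $M per WARP in years four to six
FA_BASE_RATE = 1.69                  # $M per WARP in year one
FA_ESCALATION = 0.1087               # annual increase ("inflation")


def free_agent_cost(total_warp, escalation=FA_ESCALATION):
    """Six-year cost ($M) of buying total_warp on the free agent market."""
    warp_per_year = total_warp / YEARS  # assumption: even spread
    return sum(warp_per_year * FA_BASE_RATE * (1 + escalation) ** year
               for year in range(YEARS))


def prospect_cost(total_warp, min_salary):
    """Six-year cost ($M) of a prospect producing total_warp."""
    warp_per_year = total_warp / YEARS  # assumption: even spread
    early_years = 3 * min_salary        # three years at the minimum
    later_years = sum(warp_per_year * rate for rate in PROSPECT_RATES)
    return early_years + later_years


# Top 10 hitting prospect (23.1 WARP); 0.38 is a placeholder minimum salary
print(round(free_agent_cost(23.1) / YEARS, 2))
print(round(free_agent_cost(23.1, escalation=0.0) / YEARS, 2))
print(round(prospect_cost(23.1, min_salary=0.38) / YEARS, 2))
```

The gap between the escalated and flat free agent figures shows how much the ambiguity in the article matters to the final answer.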

The article includes tables that are totally non-intuitive, and unfortunately, his peer reviewers did not flag this for the author. He provides a text description of the tables that is scarcely more helpful. I had to read the article four times, and then do my own spreadsheet, despite the fact the calculations are pretty straightforward once you have Wang's data about the prospects.

(In the rest of this topic, I'll use the figures from my spreadsheet, based on what I interpret Wang's methodology to be. They differ slightly from those reported in his tables.)

Many of the conclusions support what we already know. Pitching prospects are a significantly riskier crop than hitting prospects. Top pitching prospects have a 54-60% chance of being busts and only a 3-4% chance of being stars. Top hitting prospects have a 20-33% chance of being busts, and a 14-16% chance of being stars.

(continued in Part II)