MondoLinks: Weekend Recap

MondoLinks: A recap of baseball happenings over this past weekend...and...this week's One Big Idea: Putting the Pork into Projections.


Well, the only real news worth reading if you are an LAA fan is that, as has been well documented here on HH, Mike Trout signed a fantastic extension for both himself AND the team. Then he went here to celebrate. And he ate this. For the record, those are "The Cowboy Ribeye" entrees. "Bone-In Rib Chop Grafton Farms White Cheddar Scalloped Potatoes / Sauce Bordelaise Average 36 oz. of Meat, 8 oz. of Fat and 14 oz. of Bone = Total of 58 oz." It's only $89 each. Mikey can afford it.

As the 2014 Spring Training period concluded, the Halos took 2 of 3 from the Doyers, and the Doyers' Super-Duper-Star wannabe was left trying to avoid becoming an all-out train wreck.

Rosters are set throughout the organization. Here, here and here.


[3/29/2014 Change of Art Answers: 1-Umpire loses patch on left sleeve...2-Bengie's right sleeve is lengthened...3-Snow's helmet loses logo...4-Baserunner in background loses a hand...5-Card number changes from 1 to 11...6-Molina acquires a modern logo patch on his left sleeve...7-Molina's chest protector is extended...8-Man in stands behind Bengie has extra leg.]



So here we are, on the verge of a new season (ignoring the few games played by LAD so far...). You have seen, and/or could see with more Googling, quite a few projections as to where the various teams will end up in the Win column and, hence, in the playoffs or not. Some of these are "finger in the wind" kind of stuff. You know, just taking a look at where things ended up last year, figure things will end up pretty much the same, but make a few modifications based on some teams not suffering massive injuries or flops, and other teams not over-achieving twice in a row. This is what baseball writers, "experts", had been doing for decades. And I would bet that many still do this today.

But there are other ways to do these projections. Ways that use a lot of math. Math that uses lots of symbols. Symbols that scare off the meek. I am not among the meek.

I had visited these roads before here on HH. (Just trust me. It was long ago.) I spoke my piece and moved along, and much of my dismay was obscured behind our ongoing chortle at the annual PECOTA miss aimed at the Mike Scioscia managed LA Angels. I mean, after all, why worry about somebody complaining about the ugly abuse of PECOTA when there was a gargantuan zit already there at the tip of the PECOTA nose?

And I was going to ignore it again this year, and had a different essay lined up for today, when I got a call out of the blue from a person who might be considered near the pinnacle of scientific use of statistical analysis. And this person was extremely frustrated by all these published "projections". Why, for God's sake, don't they state their margins of error? Someplace there must be this information, so what is it? And why is it not used every single time the projections are proclaimed?

I knew where he was going with this, and confessed to him that there are rare cases where those margins have been discussed. I also told him that the reason the margins of error on PECOTA projections tend to be omitted just might be because they are embarrassing. And one does not try to embarrass a guy like a Nate Silver unless one is, uh, a professional, published, astronomer or something. The kind of person who gets invited to speak to other scientists and present his research concerning the influence of active galactic nuclei on the life-cycle of star formations throughout extremely distant cluster galaxies. I'm not that kind of guy. I'm the kind of guy who gets called on a lovely Sunday afternoon by that kind of guy.

But I see his point.

Here is why this is a problem: the vast majority of seasons played by baseball teams throughout history tend to lump within a rather narrow set of outcomes. And the broader the margin of error for any projection, the greater the likelihood that this set of outcomes is pretty well covered anyway. It's one thing to claim that the Angels will win precisely 89 games in 2014. It's quite another thing to claim that the Angels will win 89 games, +-5.

Why is this?

Well, let's stick with that 89 wins for our example. 89 wins represents almost 55% of a 162-game season. Of the 1389 team seasons played throughout Major League Baseball since both leagues went to a 162-game season in 1962, a team has won 55% of its games just 3.7708% of the time. If you can project something to happen, and that something only happens about 3.8% of the time, and you get it right, that's pretty damned good.

On the other hand, 89 +-5 wins means that the subject team could win anything from 84 wins to 94 wins. In a 162 game season, that would represent from ~52% to 58% of their games. And, in that same set of 1389 team seasons since the start of 1962, that range represents ~46.27% of all recorded outcomes. Nearly one out of two. Which is almost a 50/50 chance of being right. A coin flip.

Even 89 +-3 wins yields a subset that represents ~31% of all recorded outcomes. One out of 3.
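For anybody who wants to check this kind of arithmetic at home, here is a minimal sketch in Python. It assumes you have a list of historical team win totals to feed it. The `sample_wins` list below is a made-up placeholder, NOT the 1389 real team seasons, and the function names are mine, so the coverage figures it prints only show the mechanics and will not match the percentages quoted above.

```python
GAMES = 162  # length of a full regular season

def win_pct(wins, games=GAMES):
    """Share of games won, as a percentage."""
    return 100.0 * wins / games

def coverage(win_totals, predicted, margin):
    """Percentage of seasons whose win total lands within predicted +/- margin."""
    hits = sum(1 for w in win_totals if abs(w - predicted) <= margin)
    return 100.0 * hits / len(win_totals)

# Placeholder sample for illustration only. These are NOT real MLB seasons.
sample_wins = [62, 70, 75, 79, 81, 84, 86, 89, 92, 97]

print(round(win_pct(89), 1))                         # 54.9
print(round(win_pct(84), 1), round(win_pct(94), 1))  # 51.9 58.0

# How often would "89 wins" be called right, at different margins?
for margin in (0, 3, 5):
    print(f"89 +-{margin}: {coverage(sample_wins, 89, margin):.1f}% of sample seasons")
```

Swap in real win totals for `sample_wins` and the same `coverage` call gives you the share of seasons an 89 +-5 projection would cover. One caveat: this counts raw wins, so strike-shortened seasons would need to be converted to a 162-game pace first.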

Now, for the uninformed, PECOTA (my example here) is intended primarily for individual players. Team projections are gathered by applying individual projections through a team's depth chart. It was created by the famous Nate Silver, but tweaked into what it is today by Colin Wyers. Wyers himself told Yahoo! Sports back in February that "PECOTA has margin of error of roughly eight wins." The article claims (I have no ability to verify this) that even the most perfect projection system would have a 7-win margin of error.

I know what you are wondering. I don't know if by "8 games" Wyers means +-8 or +-4, but even giving him the benefit of the doubt @ +-4 we still can show that, based on all team seasons played since 1962, a 4 game margin of error either way on an 89-win prediction means that you have a ~41% chance of being right. A +-8 game margin of error would mean that you have a 66.2% chance of being right. And a "perfect" system with a +-7 game margin of error on that 89-win prediction still means that you have a 60.8% chance of being right.

And this is why that "professional, published, astronomer" who called me up all frustrated about all these projections for this season went directly to the heart of the matter and explained to me that anybody who makes such a projection and fails to state the actual margin of error can be considered to be (in a scientific sense) "...lying". And when I call him back and let him know that the actual margin of error for PECOTA is 8, he is going to say to me "I told you so." As he should.

Don't let the people with the fancy math symbols lie to you, people. It's really not that hard to figure this stuff to that level, using nothing but your fingers and toes.

UPDATE: The astronomer reached out to me and requested a few edits, to tone things down and remove some identifying personal elements. This I have done. In that same exchange, he clarified that Wyers's statement of PECOTA having "...a margin of error of roughly eight wins" does mean, scientifically, +-8. Which does mean that a PECOTA prediction of 89 wins should be correct 66.2% of the time. Which, I contend, is really not all that magical. For the sake of curiosity, I am going to follow up this essay with my data, and include the ease of being right using +-8 wins for a whole slew of predicted PECOTA team win values. You should see that it's pretty easy to figure this out for yourself, and pretty easy to see that projecting baseball win totals should not be considered one of the more rigorous statistical arenas of modern day baseball.
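In the meantime, the +-8 exercise across a slew of predicted win values could be sketched like this in Python. Same warning as any toy example: `sample_wins` is a made-up placeholder list rather than my actual data, and `coverage_table` is just a name I picked, so the numbers it prints show the method and nothing more.

```python
def coverage(win_totals, predicted, margin):
    """Percentage of seasons whose win total lands within predicted +/- margin."""
    hits = sum(1 for w in win_totals if abs(w - predicted) <= margin)
    return 100.0 * hits / len(win_totals)

def coverage_table(win_totals, predictions, margin=8):
    """Map each predicted win total to its coverage at the given margin."""
    return {p: coverage(win_totals, p, margin) for p in predictions}

# Placeholder sample for illustration only. These are NOT real MLB seasons.
sample_wins = [62, 70, 75, 79, 81, 84, 86, 89, 92, 97]

# Predicted win totals from 70 to 95, in steps of 5.
table = coverage_table(sample_wins, range(70, 100, 5))
for predicted, pct in table.items():
    print(f"{predicted} wins +-8: {pct:.1f}% of sample seasons")
```

With real team seasons in place of the placeholder, each row tells you how often a projection of that win total, give or take 8, would have been "right" by history alone.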