Baseball is a crazy game. ANYTHING can happen... but it usually doesn't.
When I first joined this site some ten months ago, this was (and still is) one community member’s signature line, and it truly spoke to me. It’s why I was drawn to sports and, more specifically, to baseball: the sheer specialization the game requires.
Each position requires a unique skill set not shared by other positions: a shortstop needs range and a strong arm, whereas a third baseman’s most important attribute might be quick reflexes. Every outfielder needs to take good routes to the ball, but a center fielder should also have good range and closing speed, and preferably a good arm. A right fielder, however, needs a good arm to prevent runners from taking the extra base.
And unlike other sports, physical appearance doesn’t matter nearly as much as innate ability.
I always get goosebumps thinking of that play in Moneyball, when pinch hitter Scott Hatteberg crushes the walk-off homer to give the A’s their twentieth consecutive victory: the satisfying sound when the ball hits the bat just right, the crowd erupting as the ball lands in the right field bleachers. It’s beautiful.
Do enjoy the game, but don’t let a long winning stretch or losing streak drive you to overreact. After all, baseball is 162 games’ worth of gut-wrenching patience. Let’s not rush to judgment based on partial, limited information when we’re not even 20 games in.
Recency bias is the error of focusing on what has happened recently and projecting a small sample over a much longer period of time. The Angels won 5 out of 6 games (an .833 winning percentage) in one stretch of 2017. Did anyone actually believe they would continue to win 83% of their games? Conversely, the Angels also lost 6 games in a row. Did anyone actually believe they would lose all of their games?
Resisting recency bias means remembering that Tyler Skaggs won’t pitch 7 innings every time out, that Ricky Nolasco won’t give up 3 home runs each start, and that Albert Pujols isn’t going to ground into an inning-ending double play every single at-bat (although he is closing in on Cal Ripken for the most GIDPs in MLB history).
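To put a number on how silly those projections get, here is a quick back-of-the-envelope sketch in Python. It simply stretches the 5-of-6 run and the 6-game skid over a full 162-game schedule; the function name `projected_wins` is made up for illustration, and the naive extrapolation is exactly the mistake being described, not a forecasting method.

```python
GAMES_IN_SEASON = 162  # length of an MLB regular season


def projected_wins(wins, games, season_games=GAMES_IN_SEASON):
    """Naively extrapolate a short stretch of games over a full season."""
    return round(wins / games * season_games)


hot_streak = projected_wins(5, 6)   # the 5-of-6 stretch
cold_streak = projected_wins(0, 6)  # the 6-game losing streak

print(f"Pace after winning 5 of 6: {hot_streak} wins")
print(f"Pace after losing 6 straight: {cold_streak} wins")
```

The hot stretch works out to a 135-win pace and the cold one to a winless season. Neither is remotely plausible, which is the whole point: six games tell you almost nothing about the other 156.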
If you work in data science, finance, engineering, psychology, or even baseball, you have almost certainly heard of regression to the mean. Per Wikipedia:
In statistics, regression toward (or to) the mean is the phenomenon that if a variable is extreme on its first measurement, it will tend to be closer to the average on its second measurement—and if it is extreme on its second measurement, it will tend to have been closer to the average on its first. To avoid making incorrect inferences, regression toward the mean must be considered when designing scientific experiments and interpreting data.
Essentially, regression to the mean (and similarly, reversion to the mean) smooths out statistical outliers over time. For example, five games ago, Yunel Escobar was the best hitter on the team. Today, he no longer is, because he has regressed toward his averages (he’s slashing .300/.364/.400). Why? Because most statistics reflect in part a player’s underlying ability but also an element of luck. When the sample size is extremely small, one cannot tell which data is insightful until said data stabilizes and can be used for comparison and analysis.
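For anyone who likes to see the luck part in action, here is a small simulation sketch. Everything in it is an assumption for illustration: 1,000 imaginary hitters who all share the same true talent (a .270 average), a 50 at-bat "hot start" window, and 500 at-bats for the rest of the year. It is not modeled on any real player. It picks out the hottest 10% of starters and checks how that same group hits afterward.

```python
import random

TRUE_AVG = 0.270   # hypothetical true talent, identical for every simulated hitter
EARLY_AB = 50      # assumed size of the early-season sample
LATER_AB = 500     # assumed number of at-bats over the rest of the season
N_HITTERS = 1000   # number of simulated hitters


def batting_avg(rng, at_bats, true_avg=TRUE_AVG):
    """Simulate at-bats as independent coin flips and return the batting average."""
    hits = sum(rng.random() < true_avg for _ in range(at_bats))
    return hits / at_bats


def simulate(seed=2017):
    """Return (early average, later average) for the hottest 10% of starters."""
    rng = random.Random(seed)
    seasons = [
        (batting_avg(rng, EARLY_AB), batting_avg(rng, LATER_AB))
        for _ in range(N_HITTERS)
    ]
    # Sort by early-season average, best first, and keep the top tenth.
    seasons.sort(key=lambda s: s[0], reverse=True)
    hot = seasons[: N_HITTERS // 10]
    early = sum(e for e, _ in hot) / len(hot)
    later = sum(l for _, l in hot) / len(hot)
    return early, later


early, later = simulate()
print(f"Hottest starters, first {EARLY_AB} AB: {early:.3f}")
print(f"Same hitters, next {LATER_AB} AB:      {later:.3f}")
```

Because every simulated hitter has identical talent, the hot starts here are pure luck, and the group's average over the following 500 at-bats lands back near .270. Real players differ in ability, so the pull toward the mean is weaker in practice, but the same mechanism applies.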
At this point in the season, most data is meaningless. Martin Maldonado has a .385 OBP, Albert Pujols and Andrelton Simmons have hit the same number of home runs, and Blake Parker has a 0.07 FIP. Would you bet on this staying the same? I wouldn’t.
So take a deep breath and enjoy the games! I’ve heard the ones in April don’t count, anyways.