Weird Things about ERA
One of my planned projects during my newfound unemployment involves sabermetrics. As a first step, I’m writing scripts to parse Retrosheet files and calculate advanced pitching metrics like FIP and WAR. Along with these, I’m also calculating more traditional stats like ERA.
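For reference, here is a minimal sketch of two of those calculations, using the standard published formulas for ERA and FIP. The function and parameter names are made up for illustration, and it assumes outs recorded have already been tallied from the event files. The FIP constant changes from season to season, so it has to be passed in rather than hard-coded.

```python
def innings_pitched(outs):
    """Convert outs recorded into innings pitched (3 outs per inning)."""
    return outs / 3


def era(earned_runs, outs):
    """Earned run average: earned runs allowed per nine innings."""
    if outs == 0:
        return None  # undefined when the pitcher recorded no outs
    return 9 * earned_runs / innings_pitched(outs)


def fip(home_runs, walks, hit_by_pitch, strikeouts, outs, constant):
    """Fielding independent pitching.

    `constant` is the season-specific FIP constant; it is an input
    here because its value is not fixed across seasons.
    """
    if outs == 0:
        return None
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings_pitched(outs) + constant
```

Working from outs instead of a decimal innings figure avoids the usual trap where "6.1 innings" really means six and a third.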
When some of the resulting ERAs in my database did not match those found on the back of a player’s baseball card, I did a bit of debugging and found out that some weird shit happens with plays involving a fielder’s choice. For example, take this game between the White Sox and Angels on May 6, 2007. The top of the 8th inning starts with Bartolo Colon on the mound for the Angels. Here’s the play-by-play for that half inning:
Colon put Crede and Cintron on base (that’s a lot of players whose last names begin with the letter ‘C’). Shields then relieved Colon and finished the inning. Crede, Cintron, and Pierzynski all scored in the inning. I had assumed that Colon would be responsible for Crede and Cintron, and Shields would be responsible for Pierzynski. Turns out that all three runs are charged to Colon. The reason is rule 10.16(g):
The official rules go on to give examples using fictitious players with names like Peter, Roger, Abel, Baker, and Charlie.
Anyway, to compensate, I had to transfer base runner responsibility between pitchers in these situations. I’ll provide details of the solution after cleaning up the code a bit. Looking through Chadwick, an open source suite of tools for parsing Retrosheet files, helped a lot.
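In the meantime, here is a rough sketch of the idea, not the cleaned-up code and not how Chadwick does it. It assumes each occupied base is tagged with the pitcher who gets charged if that runner scores, and that on a fielder’s choice the batter inherits the tag of the runner who was put out. The class, method names, and the pitchers "A" and "B" are all hypothetical.

```python
class InningState:
    """Tracks which pitcher is charged for each runner on base."""

    def __init__(self, pitcher):
        self.pitcher = pitcher
        self.responsible = {}   # base (1, 2, 3) -> pitcher charged for that runner
        self.runs_charged = {}  # pitcher -> runs charged so far

    def change_pitcher(self, new_pitcher):
        # Runners already on base keep their original responsible pitcher.
        self.pitcher = new_pitcher

    def batter_reaches(self):
        # Ordinary case: the current pitcher is charged for the batter.
        self.responsible[1] = self.pitcher

    def advance(self, src, dst):
        # Move a runner; the responsibility tag travels with him.
        self.responsible[dst] = self.responsible.pop(src)

    def score(self, src):
        pitcher = self.responsible.pop(src)
        self.runs_charged[pitcher] = self.runs_charged.get(pitcher, 0) + 1

    def fielders_choice(self, out_base, advances=()):
        # The runner on out_base is retired and the batter takes over
        # that runner's responsible pitcher.
        inherited = self.responsible.pop(out_base)
        # Move lead runners first so no tag gets overwritten.
        for src, dst in sorted(advances, reverse=True):
            self.advance(src, dst)
        self.responsible[1] = inherited


# Made-up half inning: A leaves one runner on first, then B relieves.
state = InningState("A")
state.batter_reaches()
state.change_pitcher("B")
state.fielders_choice(out_base=1)  # A's runner forced out; batter is still A's
state.advance(1, 2)
state.batter_reaches()             # this runner belongs to B
state.score(2)
state.score(1)
charged = state.runs_charged       # {"A": 1, "B": 1}
```

Without the transfer in `fielders_choice`, both runs in this made-up inning would land on B, which is exactly the kind of mismatch I was seeing against the official numbers.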