Tangotiger Blog

Friday, March 13, 2020

Statcast Lab: xERA

By Tangotiger 06:20 PM

We’ve rolled out xERA, which simply takes xwOBA and puts it on the ERA scale. To show you that relationship, here’s xwOBA on the x-axis, and xERA on the y-axis. The idea is simply to make the stat more accessible. So, whichever scale you prefer, the OBP scale or the ERA scale, the same information is conveyed.

And if you square xwOBA, that curved line you see there will be a straight line. That’s the translation.


#1    jgf704 2020/03/13 (Fri) @ 23:31

A few months ago, a poster in a fantasy baseball forum I participate in came up with a linear model for xERA from xwOBA (he simply plotted 3 years’ worth of wOBA vs. ERA with some minimum innings threshold).

I thought something more basic and fundamental was possible.  In particular, I thought xERA should reflect the “expected” out rate, i.e. xBA should figure in the equation for xERA.  My expression is:

The first term in square brackets is runs per plate appearance for the pitcher.

The second term in square brackets is an estimate of plate appearances per 9 innings.  Batting outs per 9 is somewhere between 25.5 and 26.  “bbfrac” is the pitcher’s actual walk rate (i.e. walks per plate appearance… I suppose one could include HB with walks).  And xBA is expected batting average, of course.

Finally, I multiply by the league’s fraction of runs that are earned (0.924 in 2019).


#2    Tangotiger 2020/03/14 (Sat) @ 00:03

Create an xERA that is proportionate to xwOBA^2.

Then compare that to what you have, and let us know how it looks.
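One way to act on that suggestion is to fit the single proportionality constant by least squares through the origin. This is only a sketch: the name `fit_squared_scale` and the sample numbers are made up for illustration, and nothing here is the constant Baseball Savant actually uses.

```python
def fit_squared_scale(xwoba, era):
    """Return k minimizing sum((era - k * xwoba**2)**2), i.e. no intercept term."""
    if len(xwoba) != len(era) or not xwoba:
        raise ValueError("need two non-empty sequences of equal length")
    # Closed-form least squares for y = k * z with z = xwoba**2:
    # k = sum(z * y) / sum(z * z)
    num = sum((x * x) * y for x, y in zip(xwoba, era))
    den = sum((x * x) ** 2 for x in xwoba)
    return num / den


# Made-up illustrative values, NOT real pitcher data.
sample_xwoba = [0.260, 0.300, 0.320, 0.350]
sample_era = [2.90, 3.80, 4.35, 5.20]

k = fit_squared_scale(sample_xwoba, sample_era)
print(f"fitted k = {k:.1f}")
for x in sample_xwoba:
    print(f"xwOBA {x:.3f} -> k * xwOBA^2 = {k * x * x:.2f}")
```

With real data, the two input lists would be each pitcher's xwOBA and ERA (or RA9 times an earned-run factor).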


#3    Detroit Michael 2020/03/14 (Sat) @ 20:31

jgf704 might be referring to me.  (I post as Michael@HQ at the BaseballHQ forums.)  I have been using xwERA = xwOBA * 27 - 4.3.  Many of us, not just fantasy players, find it easier to see it in an ERA scale.


#4    jgf704 2020/03/14 (Sat) @ 21:39

Working on it. 😊


#5    jgf704 2020/03/15 (Sun) @ 09:37

Argh.  Dumb typo in my equation in post #1 (wrote the earned run factor backwards).  Should be:


#6    jgf704 2020/03/15 (Sun) @ 09:56

Yep, Detroit Michael, I was referring to your BBHQ work. 😊

For Tom… So digging around the internet, I found a 2017 Reddit thread where you advocated a simple formula for converting wOBA to runs per 9 innings (I call it R9 below):

... which is basically what you are saying in this post without the scaling factors (and ERA would need an earned run factor, of course).

So I calculated R9 by the formula in my post #1 (leaving out the earned run factor) for 2019 pitchers with 50+ IP, along with the simple formula above (in this post).  And, to my surprise (and probably not yours 😊), the simple version works better.  More precisely… there were 341 pitchers with 50+ IP.  The root mean square error in R9 was 0.65 by my formula, and 0.56 by the simple equation.
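For anyone who wants to repeat that comparison, root mean square error can be computed as in the sketch below. The function name `rmse` and the toy numbers are assumptions for illustration; the actual inputs would be each pitcher's estimated R9 and observed R9.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy numbers only, NOT the 341-pitcher sample described in the post.
estimated_r9 = [3.0, 4.0, 5.5]
observed_r9 = [3.5, 4.0, 5.0]
print(f"RMSE = {rmse(estimated_r9, observed_r9):.3f}")
```

Running it once per estimator against the same observed values gives two numbers that can be compared directly, as with the 0.65 and 0.56 figures.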

As you know, my equation in post #1 is based on wRC (along with an estimate of PA9 from xBA and walk rate).  So I view it as more fundamental (in that it is based on something else which we know works).  And this is why I was surprised that the simple equation worked better.

My intuition says that there must be something fundamental in the simple equation as well.  Is there?


#7    Tangotiger 2020/03/15 (Sun) @ 16:10

Yes, for pitchers, the “interdependence” makes more sense.  As it does for teams.

For batters, you want the linear/additive approach.


#8    jgf704 2020/03/15 (Sun) @ 18:21

By “interdependence”, I assume you mean the fact that RA9 is proportional to wOBA*wOBA.

Is it just serendipitous that the simple formula of yours that I posted in #6 works so well?  Or is there some underlying fundamental principle at work (and that I’m missing)?


#9    Tangotiger 2020/03/21 (Sat) @ 08:52

There is a fundamental principle.

At a VERY rough level: the % of times you are on base = the % of times that those runners will score.

Obviously at the low end, approaching .000 OBP, it would be natural that 0% of those runners will score.

Similarly at the high end, as your OBP approaches 1.000, then 100% of those runners will score.

More to the point, think of the 3 bases as the transition point between getting on base, and scoring.  Every inning you will have 0 to 3 runners left on base.

So, if you are not left on base, you will score. (Setting aside DP, etc.)

Let’s assume there’s always 1 runner left on base each inning.  So, you will end up with something like this, starting with 10 runners reaching base for the game:

Batters Put Out   Runners Reaching Base   Left on Base   Runs Scored   OBP     % runners scoring
27                10                      9              1             0.270   10%
27                11                      9              2             0.289   18%
27                12                      9              3             0.308   25%
27                13                      9              4             0.325   31%
27                14                      9              5             0.341   36%
27                15                      9              6             0.357   40%
27                16                      9              7             0.372   44%
27                17                      9              8             0.386   47%
27                18                      9              9             0.400   50%

Each of the columns matches to the data in order.

Focus on the last two.  Notice how, roughly, the % of runners that score is close to the % of runners reaching base?

Runs scored, by definition
= % of runners reaching base
times
% of runners that score

And so, that’s why runs is proportional to OBP squared.

And wOBA is, basically, OBP.
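The toy model in the table can be sketched in a few lines. It assumes, as the post does, 27 batters put out and exactly one runner left on base in each of nine innings; the names `toy_game`, `OUTS_PER_GAME`, and `LEFT_ON_BASE` are made up for this example.

```python
OUTS_PER_GAME = 27
LEFT_ON_BASE = 9  # assumption from the post: one runner stranded per inning


def toy_game(runners_reaching):
    """Return (runs, OBP, share of runners scoring) for the toy model."""
    runs = runners_reaching - LEFT_ON_BASE  # everyone not stranded scores
    obp = runners_reaching / (runners_reaching + OUTS_PER_GAME)
    pct_scoring = runs / runners_reaching
    return runs, obp, pct_scoring


print("reach  runs  OBP    %score")
for reaching in range(10, 19):
    runs, obp, pct = toy_game(reaching)
    print(f"{reaching:5d}  {runs:4d}  {obp:.3f}  {pct:.0%}")
```

The printed OBP and share-scoring columns should track each other in the same rough way as the last two columns of the table, which is the whole argument for runs being roughly proportional to OBP squared.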


#10    jgf704 2020/03/23 (Mon) @ 23:02

Thank you.  And, dang, I should have been able to figure that out.  FWIW, I was re-immersing myself in Base Runs after seeing how well the (wOBA)^2 thing worked.  But I was thinking about it in an overly complicated fashion; I should have backed off to a simpler model that still captured the salient features, as you did here.

Thanks again for coming back to this.


#11    jgf704 2020/03/24 (Tue) @ 10:57

And yeah, you sum up a lot of this stuff (i.e. linear vs. non-linear / dynamic run estimators) on your Base Runs wiki page.


#12    Detroit Michael 2020/03/24 (Tue) @ 13:28

I like the end of post #9.  It clearly, succinctly (in my view anyway) explains why it is not a linear relationship.

And yet, look at the graph again.  It’s not linear, but it’s not very far away from linear either within the relevant portion of the graph.  I don’t feel badly at all for using the linear formula from post #3 to estimate xwERA or xERA during the interim before Baseball Savant started computing it for us.


#13    jgf704 2020/03/24 (Tue) @ 22:59

FWIW, Michael, I agree with you that the linear formula based on least squares minimization likely provides at least as good (and possibly a better) fit to the data.  OTOH, I really like the simplicity of:


#14    Tangotiger 2020/03/25 (Wed) @ 18:00

Where it saves us is with those great relief seasons where a linear formula would get you into negative runs.

Especially if you think of it, say, 20 or 30 innings into the season, since we are showing the data after every game.
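A small sketch of that point, using the linear formula from post #3 (xwOBA * 27 - 4.3) next to a squared form. The scale `K` is a hypothetical value chosen here only so the two forms agree at a .320 xwOBA; it is not the constant Baseball Savant uses.

```python
def xera_linear(xwoba):
    """Linear formula quoted in post #3: xwOBA * 27 - 4.3."""
    return xwoba * 27 - 4.3


# Hypothetical scale: picked so both forms match at ANCHOR_XWOBA.
ANCHOR_XWOBA = 0.320
K = xera_linear(ANCHOR_XWOBA) / ANCHOR_XWOBA ** 2


def xera_squared(xwoba):
    """Squared form k * xwOBA^2, which can never go below zero."""
    return K * xwoba ** 2


# The linear form crosses zero at 4.3 / 27, a little under a .160 xwOBA.
print(f"linear form hits zero at xwOBA = {4.3 / 27:.3f}")
for xwoba in (0.120, 0.150, 0.200, 0.320):
    print(f"xwOBA {xwoba:.3f}: linear {xera_linear(xwoba):6.2f}, "
          f"squared {xera_squared(xwoba):5.2f}")
```

Any xwOBA below that crossing point gives a negative linear estimate, while the squared form stays at or above zero for every input.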

