I have had a great time teaching a sports and math class to Kelley School of Business students at Indiana University. One of my ace students, Paul Aynilian, did a study trying to estimate John Hollinger’s famous ESPN PER ratings from box score statistics. We found that

45.75*(Points/Minute) + 22.55*(Rebounds/Minute) + 32.8*(Assists/Minute) + 58.2*(Steals/Minute) - 48.65*(Turnovers/Minute) - 39.73*(Missed FGs/Minute) - 20.6*(Missed FTs/Minute) + 38.37*(Blocked Shots/Minute) - 18.68*(Personal Fouls/Minute)

explains over 99% of the variation in this season’s PER ratings and is off by an average of 0.37 in estimating the PER of the top 200 NBA players whose stats are on Yahoo.com. So basically our simple formula virtually duplicates the PER rating without a lot of mumbo jumbo.
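As a rough illustration, the formula above can be written as a small function. This is only a sketch: the weights are the ones quoted in the post, while the function name, the dictionary keys, and the sample stat line are assumptions made for the example and do not describe a real player.

```python
# Weights quoted in the post, applied to per-minute box score rates.
WEIGHTS = {
    "points": 45.75,
    "rebounds": 22.55,
    "assists": 32.8,
    "steals": 58.2,
    "turnovers": -48.65,
    "missed_fg": -39.73,
    "missed_ft": -20.6,
    "blocks": 38.37,
    "fouls": -18.68,
}


def estimate_per(totals, minutes):
    """Approximate PER from season totals and total minutes played.

    `totals` maps the (hypothetical) keys in WEIGHTS to season totals;
    any missing stat is treated as zero.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return sum(w * totals.get(stat, 0) / minutes for stat, w in WEIGHTS.items())


# Made-up stat line, for illustration only.
sample = {
    "points": 1200, "rebounds": 400, "assists": 300, "steals": 80,
    "turnovers": 150, "missed_fg": 500, "missed_ft": 60,
    "blocks": 40, "fouls": 160,
}
print(round(estimate_per(sample, 2000), 2))
```

Because every term is a per-minute rate times a fixed weight, the estimate ignores pace and treats offensive and defensive rebounds alike.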

**The disturbing thing about these weights is that if an NBA player shoots 33.33% (1 for 3), then the more shots he takes, the higher his PER, because shooting 1 for 3 gives a net contribution of 2(45.75)-2(39.73)>0!! Clearly this is bad: a 33% shooter is not a good shooter, yet with these weights the more shots a bad shooter takes, the higher his PER rating.**

“shooting 1 for 3 gives you a net contribution of 2(45.75)-2(39.73)>0”

It looks to me that shooting 1 for 3 on 2pt fg’s should translate to 2(45.75)-3(39.73)<0. By my math, a player needs to shoot 43.42% on 2pt fg’s to break even.

Comment by JT — March 1, 2012 @ 6:53 pm

No, it is -2 because -39.73 multiplies missed FGs, not FGA. 1-3 is 2 missed FGs.

Comment by wwinston — March 1, 2012 @ 8:52 pm

Ok, my mistake. In that case, a player needs to shoot 30.28% on 2pt fg’s to break even.

Comment by JT — March 1, 2012 @ 10:05 pm
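The break-even arithmetic in this exchange can be checked with a short sketch. It uses only the points weight and the missed-FG weight quoted in the post and considers two-point attempts in isolation (no free throws, rebounds, or turnovers); the function names are made up for the example.

```python
POINT_WEIGHT = 45.75      # credit per point scored, from the post
MISSED_FG_WEIGHT = 39.73  # penalty per missed field goal, from the post


def net_two_point_value(makes, attempts):
    """Net contribution of a set of two-point attempts under the post's weights."""
    misses = attempts - makes
    return 2 * makes * POINT_WEIGHT - misses * MISSED_FG_WEIGHT


def break_even_if_penalty_per_miss():
    # Solve 2*w_p*p - w_m*(1 - p) = 0 for the make probability p.
    return MISSED_FG_WEIGHT / (2 * POINT_WEIGHT + MISSED_FG_WEIGHT)


def break_even_if_penalty_per_attempt():
    # The first reading in the thread: solve 2*w_p*p - w_m = 0 for p.
    return MISSED_FG_WEIGHT / (2 * POINT_WEIGHT)


print(round(net_two_point_value(1, 3), 2))            # 12.04
print(round(break_even_if_penalty_per_miss(), 4))     # 0.3028
print(round(break_even_if_penalty_per_attempt(), 4))  # 0.4342
```

With the penalty applied to misses only, as the formula specifies, the break-even two-point percentage comes out at roughly 30.3% under the weights as quoted, which is why a 1-for-3 shooter still nets a positive contribution.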

How big is the correlation between an offensive adjusted +/- and FG% for this season?

Comment by mystic — March 2, 2012 @ 7:08 am

IMO the correct “break even” point for efficiency is the Holy Grail of player value analysis.

At one “reasonable” extreme is the league average for eFG% and TS% and saying you have to be better than that to create value scoring (approximately 50% and 54%).

At another “reasonable” extreme is to look at the average eFG% and TS% for 2P jumpers (approximately 40%). The reason for this standard is that most players are forced to take at least some 2 point jumpers. Hitting 40% would create no value, but it would also not punish them relative to some of the very low usage big men that only score at the rim.

Comment by lovethoseknicks — March 2, 2012 @ 9:57 pm

[...] Hollinger’s PER Ratings Demystified [...]

Pingback by Wayne Winston Simplifies PERs | The Wages of Wins Journal — March 4, 2012 @ 12:26 pm

Wayne, is that .99 r-square for 2012 stats, and is that the same sample that was used to generate the equation? Has it been tried out of sample?

Comment by J. Cross — March 4, 2012 @ 7:15 pm

It is for 2012. If I knew where to get an easy set of data to download, I would try it out of sample. I am heading to Microsoft to teach Excel modeling this week.

Comment by wwinston — March 4, 2012 @ 8:07 pm

The model does not accurately capture PER. Please stop pretending it does.

I tried this on my PER spreadsheet, giving several players an additional 1 for 3 and 2 more points (and giving the same to their team, along with appropriate other adjustments – 2 more rebounds to the opponents, etc.)

The player’s PER went down in each case. This was my 2007 spreadsheet.

Comment by Craig Burley — March 4, 2012 @ 10:41 pm

Craig, the formula is not so far off. A regression run on the data from 2004/05 to 2011/12 (using basketball-reference.com as the source for PER; Wayne, that would be the source for easy-to-download raw data as well) gives me:

PER = 3301.2*(3.498*FGM-1.097*FGA+1.412*3PM+1.861*FTM-0.538*FTA+1.062*ORB+0.422*DRB+0.968*AST+1.559*STL+1.062*BLK-1.529*TO-0.479*PF)/(Pace*Min)

For each season the league average is a bit different. That can be captured by using a multiplier 106.8/league average ORtg. That should give you an estimate RMSE of about 0.3.

So, I would give that “ace student” at best a B for that, because he didn’t find that the coefficients for offensive rebounds and defensive rebounds are different, and he missed the adjustment for pace and league average.

Anyway, the issue with volume shooting is still there. And to get an idea of how much of a problem that is, I ask again: what is the correlation between FG% and an offensive APM?

If I use ridge regression on this season’s matchup data and compare the offensive component of the results with FG%, I get an R of 0.19 for the whole data set; for all players with 200+ minutes played I get 0.34. For eFG% it is 0.42, for TS% 0.49. Well, for points per minute it is 0.42. A model built with points per minute and TS% gives me an R of 0.54.

Comment by mystic — March 5, 2012 @ 9:17 am
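For readers who want to try the pace-adjusted regression from the comment above, here is a minimal sketch. The coefficients, the 3301.2 scale factor, and the 106.8 reference rating come from that comment. The function name and dictionary keys are hypothetical, and applying the league adjustment as a straight multiplier on the estimate is an assumption about how the comment intends it to be used.

```python
# Coefficients from the regression quoted in the comment above.
COEFFS = {
    "FGM": 3.498, "FGA": -1.097, "3PM": 1.412,
    "FTM": 1.861, "FTA": -0.538,
    "ORB": 1.062, "DRB": 0.422,
    "AST": 0.968, "STL": 1.559, "BLK": 1.062,
    "TO": -1.529, "PF": -0.479,
}


def estimate_per_pace_adjusted(box, minutes, pace, league_ortg=106.8):
    """Approximate PER from box score totals, minutes, and team pace.

    `league_ortg` is the league average offensive rating for the season.
    Assumption: the 106.8 / league_ortg factor scales the whole estimate.
    """
    if minutes <= 0 or pace <= 0 or league_ortg <= 0:
        raise ValueError("minutes, pace and league_ortg must be positive")
    linear = sum(c * box.get(stat, 0) for stat, c in COEFFS.items())
    return 3301.2 * linear / (pace * minutes) * (106.8 / league_ortg)
```

Unlike the per-minute formula in the post, this version separates offensive from defensive rebounds and divides by pace, which are the two adjustments the comment says the original study missed.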

Thanks for this great analysis. We ignored pace to show that the basic box score stats are enough to come close to predicting a player’s PER. We should have looked at Offensive and Defensive Rebound data. Even without pace and OR/DR we are off by around .37 and your improvement brings things down to .3. Of course, seeing the situation validated for a larger data set is very important. Thanks again for the great work!!

Comment by wwinston — March 5, 2012 @ 9:30 am

Wayne,

Do you have any view on what the “break even” point should be for value creation when it comes to scoring?

I think everyone agrees that 33% (give or take) is way too low and gives way too much credit to low-efficiency volume scorers, but there doesn’t seem to be much consensus on where the break-even point actually is.

Berri uses approximately 50%. That seems reasonable, but I think my argument for 40% also makes a lot of sense.

40% avoids giving too much credit to hyper efficient low usage scorers like Tyson Chandler relative to guys that have a broader game and take more shots. It also produces results that are more in line with perceptions without unduly rewarding the low efficiency volume scorers.

Comment by lovethoseknicks — March 5, 2012 @ 8:21 pm

His weights are off, definitely.

Using Z-scores, I converted 8-year regularized plus-minus into PER (so that the top player would have 30 and the league average would be 15). So we can call this stat PERapm.

By using all the same inputs minus player fouls, per-100 possessions, we find the following regression (0.5 R^2 against PERapm) for a 1-3 shooter (holding all else constant):

2 Points * 0.83 – 2 Missed Field Goals * 0.89 = -0.11

Comment by Nathan Walker — March 6, 2012 @ 12:59 pm

Odd that the regression has missed FT’s, but not made ones.

This is important because the average FTA/FGA in the NBA is around 0.30. And the average FT% is around 75%. Even a poor shooter (33%) will get many FTA as he racks up the FGA. The regression penalizes him for missing FT, but what about the ones that he makes?

Another thing that the regression doesn’t capture is that every shot that is taken (even a bad one) means that a turnover was “prevented”. Where is the credit that a high-volume player gets for avoiding (team) turnovers? This is similar to the problem I have with TOV metrics in general, which don’t take into account the role that a player has (see Jeremy Lin for the latest example).

All this is not to say that a 33% shooter is a good one. But the actual break-even point for such a player may be higher than that, when these other factors are taken into account.

(BTW, I approve of mystic’s message.)

Comment by EvanZ — March 12, 2012 @ 2:06 pm

It seems to me that when a player has possession of the ball there are basically 3 things he can do to either create or destroy value that are captured in the box score.

1. Assist

2. TO

3. Shoot

The average for the NBA is approximately 3 assists for every 2 TOs. If you set the average NBA player to zero, then one could argue that the value of an assist should be “approximately” .67 and a TO 1. (Setting assists that high may not seem consistent with changes in efficiency between assisted and unassisted shots, but it probably captures the more general value of “high quality passing, hockey assists, etc.”)

That would net to zero and scoring would be captured alone depending on the break even FG% used.

I repeat myself: the break-even FG% is the Holy Grail of player evaluation. No one seems to know or agree on it.

Comment by lovethoseknicks — March 18, 2012 @ 8:56 pm

Sorry this response is kind of late, but I think that the “Holy Grail” of shooting percentages can’t be the same for every position. For instance, centers can shoot higher than 60%, while good guards are usually limited to about 45% or so. I think the actual league average percentage for any given year is about 45% to 48%. I did a few calculations from this year and past years and got all of the averages to be between these two numbers, with the majority being about 46%.

Comment by B and G — May 3, 2012 @ 3:59 pm