National Efficiency
12.27.04
[Be advised, I wrote this a long time ago. -kp]
Here's a page where I have computed efficiency stats for the 330 D1 teams.
http://www.kenpom.com/stats.php http://kenpom.com/summary.php
It's kind of bulky and awkward, but I think it does give some unique insight into a few teams out there. The columns are sortable nationally. You can sort the adjusted numbers by conference using the T/O/D selector next to the 'Conf' heading. Unlike the ratings page, this won't be updated daily in the foreseeable future. It will probably be a weekly update until I streamline the process. I'm also thinking of displaying a different "bonus stat" every week.
Any feedback you have would be great, so e-mail me. Also, if you have an idea for a bonus stat, I'd love to hear it.
I'll throw out some observations tomorrow. Today, I will do my best to explain these figures. You read on at your own risk. There is no lifeguard on duty.
There are two columns for each statistic: raw and adjusted. First I'll explain the raw numbers.
Tempo/Pace - The number of possessions per 40 minutes. Possessions is not an official NCAA statistic, so it must be estimated. The formula I am using is:
Possessions = FGA - OR + TO + .42*FTA (I use .475 for the free throw multiplier now.)
This is a pretty standard computation that accounts for when possession is lost by a team. The only bit of uncertainty is the free throw portion, because we don't always know how a team got to the line. If they are shooting two, then the two FTAs account for one possession. But if they go to the line for one after making a shot, then the one FTA has no possession attached to it, because the previous FGA accounts for it.
The .42 multiplier estimates what fraction of a possession each FTA accounts for. It has to be slightly less than one half. John Hollinger in Pro Basketball Prospectus uses .44. Dean Oliver in Basketball on Paper and other work uses .40. I'm splitting the difference. The difference between .4 and .44 means about a 1% change in the efficiency/tempo calculations.
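Here's a rough sketch of the possession estimate as a Python function. The function name, argument names, and the box-score line are made up for illustration; only the formula and the .40/.42/.44 multipliers come from the text.

```python
def estimate_possessions(fga, oreb, to, fta, ft_mult=0.42):
    """Estimate possessions as FGA - OR + TO + ft_mult * FTA."""
    return fga - oreb + to + ft_mult * fta

# Hypothetical box-score totals (not from a real game).
box = {"fga": 58, "oreb": 12, "to": 14, "fta": 20}

low = estimate_possessions(**box, ft_mult=0.40)   # Oliver's multiplier
mid = estimate_possessions(**box, ft_mult=0.42)   # the value used here
high = estimate_possessions(**box, ft_mult=0.44)  # Hollinger's multiplier

# Relative gap between the .40 and .44 estimates.
spread = (high - low) / mid
print(low, mid, high, round(spread, 4))
```

For this made-up line the .40 and .44 estimates differ by a little over 1%, which is in line with the figure quoted above.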
I do a tempo calculation for each team in a game, average those two numbers and apply it to each team for the game, since each team's total possessions should be nearly equal. Then I average the tempo for every team's games-to-date to come up with the figures shown.
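A minimal sketch of that two-step averaging, under some assumptions: the function names and sample numbers are hypothetical, and scaling overtime games back to 40 minutes is my guess at how "per 40 minutes" is handled, not something spelled out here.

```python
def game_tempo(poss_a, poss_b, minutes=40):
    """Average the two teams' possession estimates, scaled to 40 minutes.

    The minutes scaling (e.g. 45 for one overtime) is an assumption.
    """
    shared = (poss_a + poss_b) / 2
    return shared * 40 / minutes

def season_tempo(game_tempos):
    """Simple average of a team's game tempos to date."""
    return sum(game_tempos) / len(game_tempos)

# Hypothetical possession estimates for the two teams in one game.
regulation = game_tempo(66.4, 68.0)
overtime = game_tempo(75.0, 77.0, minutes=45)

# Hypothetical list of one team's game tempos.
season = season_tempo([67.2, 70.0, 64.4])
print(regulation, overtime, season)
```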
Offensive/Defensive Efficiency - This is the number of points scored or allowed per 100 possessions. There are only about 70 possessions for each team in the average college basketball game, so these numbers are higher than the points-per-game statistics you see used by the media.
Like tempo, I average each team's efficiency by game. The other way to do this would be to take a team's total points on the season and divide it by total possessions. But this gives some games more weight than others depending on the number of possessions in a particular contest. Also, I only use games involving two D1 teams.
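To show how the two approaches can disagree, here's a small sketch with two made-up games. The names and numbers are hypothetical; the point is only that pooling season totals weights the higher-possession game more heavily.

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100 * points / possessions

# Hypothetical (points, possessions) for two games.
games = [(90, 80), (54, 60)]

# Per-game approach: every game counts equally.
per_game_avg = sum(efficiency(p, n) for p, n in games) / len(games)

# Pooled approach: season points over season possessions.
pooled = efficiency(sum(p for p, _ in games), sum(n for _, n in games))

print(per_game_avg, round(pooled, 2))
```

Here the per-game average is 101.25, while the pooled figure comes out higher because the 80-possession game carries more weight.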
The raw numbers are computed from the data contained in a box score. Over the course of the season, this gives some unintuitive results, such as West Virginia currently having the 2nd most efficient offense in the nation and Texas A&M - College Station having the 2nd stingiest defense. So there's the matter of adjusting for competition - the "adjusted" numbers.
Say team A averages a pace of 62 possessions per game and team B averages 68 possessions per game. And for the sake of this example, let's say the average college game has 70 possessions, a nice round number. For the model I use, the expected possessions in a game involving teams A and B would be 60. This results from the fact that team A averages eight possessions slower than normal and team B averages two possessions slower than normal. The sum is ten possessions slower than normal, or 60.
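The arithmetic in that example can be sketched as a one-line function. The function name is hypothetical; the additive rule and the numbers are the ones from the example.

```python
def expected_pace(pace_a, pace_b, national_avg):
    """National average plus each team's deviation from it."""
    return national_avg + (pace_a - national_avg) + (pace_b - national_avg)

# Team A at 62, team B at 68, round-number national average of 70.
example = expected_pace(62, 68, 70)
print(example)  # 60
```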
Why would the game end up being played at a slower pace than either team's average? A team's average pace is a product of how they like to play and how their opponents like to play. A team playing much slower than average, like team A, is more than likely playing opponents that prefer to play faster than them. So team A's average pace on the season is faster than they would really like if they were totally in control.
The adjusted numbers are computed based on this principle. In every game, each team really wanted to play at a certain pace, and my model tries to dig this out of the data. For an example of how this works, take the Georgia Tech-Air Force game from December 11th.
Georgia Tech: season average pace = 71.5
Air Force: season average pace = 53.9
The pace of that game: 62.2
Based on the average national pace of 69.2, we would have expected the game to result in 56.2 possessions. So an adjustment has to be made in the way each team wanted to play this particular game.
Each team's season average is adjusted upward by the same percentage to produce numbers that would predict the actual game pace of 62.2. In this case, Air Force's pace for the game becomes 56.5 and Georgia Tech's becomes 74.9. That's how each team wanted to play, and that combination produced the game pace of 62.2. (Considering Air Force was playing from behind most of that game, it makes sense that they wanted to play faster than usual.)
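One way to reproduce those numbers, as a sketch: solve for the single scale factor that makes the additive prediction match the actual game pace. The function name is hypothetical, and this is my reading of "adjusted upward by the same percentage", not necessarily the exact code behind the page.

```python
def adjust_game_paces(pace_a, pace_b, actual_pace, national_avg):
    """Scale both season paces by one factor so the prediction hits actual_pace.

    Prediction: f*pace_a + f*pace_b - national_avg == actual_pace
    so f = (actual_pace + national_avg) / (pace_a + pace_b).
    """
    factor = (actual_pace + national_avg) / (pace_a + pace_b)
    return pace_a * factor, pace_b * factor

# Georgia Tech 71.5, Air Force 53.9, game pace 62.2, national average 69.2.
gt, af = adjust_game_paces(71.5, 53.9, 62.2, 69.2)
print(round(gt, 1), round(af, 1))  # 74.9 56.5
```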
All games are examined like this, and a season-long adjusted pace results from averaging a team's adjusted pace for each game played. The computations are repeated until the numbers stabilize.
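The "repeat until the numbers stabilize" step might look something like the sketch below. Everything about it is an assumption: the function name, the data layout, the stopping tolerance, and the choice to hold the national average fixed. It follows the description above loosely and is not the actual procedure behind the page.

```python
def adjusted_paces(raw_paces, games, national_avg, tol=1e-6, max_iter=1000):
    """Iteratively re-estimate each team's adjusted pace.

    raw_paces: {team: season average pace} used as the starting guess.
    games: list of (team_a, team_b, actual_game_pace).
    """
    adj = dict(raw_paces)
    for _ in range(max_iter):
        per_game = {team: [] for team in adj}
        for team_a, team_b, actual in games:
            # One shared factor so the additive prediction matches the game.
            factor = (actual + national_avg) / (adj[team_a] + adj[team_b])
            per_game[team_a].append(adj[team_a] * factor)
            per_game[team_b].append(adj[team_b] * factor)
        # New estimate: average of a team's per-game adjusted paces.
        new = {t: (sum(v) / len(v) if v else adj[t]) for t, v in per_game.items()}
        change = max(abs(new[t] - adj[t]) for t in adj)
        adj = new
        if change < tol:
            break
    return adj

# Toy input: just the one game discussed above.
result = adjusted_paces(
    {"Georgia Tech": 71.5, "Air Force": 53.9},
    [("Georgia Tech", "Air Force", 62.2)],
    national_avg=69.2,
)
print({team: round(pace, 1) for team, pace in result.items()})
```

With only one game the numbers settle after a single pass. A real season has many games per team, and this sketch makes no promise about how quickly (or whether) it settles there, which is why it caps the number of passes.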
The efficiency numbers are computed by a similar principle. For example, let's say that team A has an offensive efficiency (OE) of 120 and team B has a defensive efficiency (DE) of 120. Keeping our round number principle, I'll use a national average for OE of 100. For a game between A and B, A's offensive efficiency is expected to be 140. This is arrived at because both teams deviate from the norm by +20. So the sum of the deviations is 40, and that gets added to the nationwide average of 100. This concept was exhibited when Washington State played Oklahoma State. WSU's anemic offense against OSU's renowned defense produced a historically pathetic 29-point outing for Wazzu.
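The same additive idea for efficiency, as a short sketch. The function name is hypothetical, and the second matchup uses made-up ratings to show the weak-offense-versus-strong-defense direction; they are not WSU's or OSU's actual numbers.

```python
def expected_efficiency(offense_oe, defense_de, national_avg=100.0):
    """National average plus the offense's and the defense's deviations."""
    return national_avg + (offense_oe - national_avg) + (defense_de - national_avg)

# The example from the text: OE of 120 against DE of 120.
strong_vs_leaky = expected_efficiency(120, 120)

# Hypothetical weak offense (85) against a hypothetical stingy defense (85).
weak_vs_stingy = expected_efficiency(85, 85)

print(strong_vs_leaky, weak_vs_stingy)  # 140.0 70.0
```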
Season-long adjusted numbers are computed in the same manner described for pace above, with each team getting assigned a game OE and DE.
This was very confusing to write, so I am sure it was confusing to read. If you have questions, just e-mail me and I will answer them on the blog.
The Possession
04.19.04
During tourney-time, I played with some stats on how efficient teams were on offense and defense. In order to best evaluate this, one should look at how well a team does per possession, as opposed to per game or per minute. For instance, Billy Tubbs' Lamar team averaged 79.1 points per game last season, good for 15th nationally. But they used around 80 possessions per 40 minutes, the fastest pace in the nation. They averaged .963 points per possession, which ranked 242nd nationally and partially exposes why they lost 18 of 29 games this year.
Across college basketball, teams average .994 points per possession. So Lamar's offense was 3% less efficient than the average college team. And the average college team isn't getting close to thinking about getting to the NCAA Tournament with something other than an automatic bid.
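A quick check of the Lamar arithmetic, as a sketch. The function name is hypothetical; the inputs are the figures quoted above.

```python
def relative_efficiency(team_ppp, national_ppp):
    """Fractional difference from the national points-per-possession average."""
    return team_ppp / national_ppp - 1

# Lamar at .963 points per possession against a national average of .994.
lamar_vs_avg = relative_efficiency(0.963, 0.994)

# Possessions per game implied by 79.1 points per game at .963 per possession.
implied_possessions = 79.1 / 0.963

print(round(lamar_vs_avg * 100, 1), round(implied_possessions, 1))  # -3.1 82.1
```

The implied figure is per game rather than per 40 minutes, so it need not match the pace number exactly.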
So how does one compute possessions?
Before I go any further, I have to recommend reading Dean Oliver's Basketball on Paper. He does some creative stuff with basketball statistics. The only problem is that he deals with NBA statistics, but the concepts apply to the college game. You'll learn a lot about the game after having read it. I'll be referring to this book more in the future.
Commercial over.
The most common formula for estimating possessions is (FGA - OR) + TO + (Y * FTA),
where FGA = field goal attempts, OR = offensive rebounds, TO = turnovers, Y = some number between zero and 1, and FTA = free throw attempts.
Going through the three terms in the formula, a possession can end by:
1) a shot not rebounded by the offense. An offensive rebound would continue the possession. This is captured by the term FGA-OR.
2) a turnover. (duh.)
3) Free throws - sometimes.
The only mystery here is what Y should be. First off, I'll clear up why Y needs to be there. We don't know how many possessions are used up by free throws, that's why. In the ideal situation, if every trip to the line resulted in two free throws, then we could multiply free throws by one half and be done with it. However, technical fouls, the "and one" situation, missing the front end of the 1 and 1, and shooting 3 shots resulting from a 3-point attempt all deviate from the ideal situation. Oliver estimates that Y should be .4. John Hollinger in Pro Basketball Prospectus estimates it to be .44. From the data I've seen for college hoops, .44 is more accurate so I've been using that.
So that's how possessions can be estimated, and using possessions, folks can get a better understanding of which teams do good things on offense and defense.
You might wonder why offensive rebounds are treated as continuing a possession, rather than starting a new one. I've seen two good reasons. First, by including them, each team's possessions can reasonably be assumed to come out equal for each game. Second, getting and preventing offensive rebounds are skills. So if some teams do those skills better than others, it makes sense to attach those skills to a team's offensive or defensive ability.
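To see why Y lands a bit under one half, here's a toy calculation. The mix of trip types is entirely made up, the names are hypothetical, and technical fouls are left out to keep it simple. Treating a missed front end as ending the possession is also a simplification.

```python
# (free throw attempts on the trip, possessions the trip accounts for)
TRIP_TYPES = {
    "two_shots": (2, 1),
    "and_one": (1, 0),           # the made FGA already accounts for the possession
    "three_shots": (3, 1),       # fouled on a 3-point attempt
    "missed_front_end": (1, 1),  # 1-and-1, first shot missed
}

# Hypothetical number of each kind of trip over some stretch of games.
trip_counts = {"two_shots": 70, "and_one": 20, "three_shots": 4, "missed_front_end": 6}

total_fta = sum(TRIP_TYPES[t][0] * n for t, n in trip_counts.items())
total_poss = sum(TRIP_TYPES[t][1] * n for t, n in trip_counts.items())

# Implied Y: possessions accounted for per free throw attempt.
implied_y = total_poss / total_fta
print(total_fta, total_poss, round(implied_y, 3))
```

If every trip were two shots, the ratio would be exactly one half. Each and-one adds an attempt without adding a possession, which pulls it lower.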
