A FREE E-ZINE
FROM THE PERENNIAL CHAMPIONS AT
"The Great Myths of Projective Accuracy"
by Ron Shandler

Ashley-Perry Statistical Axiom #1: Skill in manipulating numbers is a talent, not evidence of divine guidance.

The quest continues for the most accurate baseball forecasting system. I've been publishing player projections for nearly two decades. During that time, I have been made privy to the work of many fine analysts and many fine forecasting systems. But through all their fine efforts at attempting to predict the future, there have been certain constants. The core of every system has been composed of pretty much the same elements:

These are the elements that keep all projections within a range of believability. This is what prevents us from predicting a 40-HR season out of Juan Pierre or 40 SBs for David Ortiz. However, within this range of believability is a great black hole where any semblance of precision seems to disappear. Yes, we know that Albert Pujols is a leading power hitter, but whether he is going to hit 40 HRs, or 45, or 35, or 50, is a mystery.

You see, while all these systems are built upon the same basic elements, they are also constrained by the same global limitations. We are all still trying to project...

As much as we all acknowledge these limitations as being intuitive, we continue to resist them because the game is so darned measurable. The problem is that we do have some success at predicting the future, and that limited success whets our desire, luring us into believing that a better, more accurate system awaits just beyond the next revelation. So we work feverishly to try to find the missing link to success, creating vast, complex models that track obscure trends and relationships and attempt to bring us ever closer to perfection. But for many of us fine analysts, all that work only takes us deeper and deeper into the abyss. Why? Because perfection is impossible and nobody seems to have a real clear vision of what success is.
Measuring success

Is reasonable predictive accuracy even an attainable goal? Most agree that, given external variables such as injuries, managerial decisions and the like, only about 65-70% of the player population is even marginally predictable in any given year. But even within that group, you cannot get two analysts to agree about what it means to be accurate. In truth, the only completely accurate projection would be one that looks like this:

        AB   HR  RBI   SB    BA   OBA   SLG   OPS
       ===  ===  ===  ===  ====  ====  ====  ====
PROJ   500   25   95   15  .280  .330  .450  .780
ACT    500   25   95   15  .280  .330  .450  .780
Clearly, you would be overjoyed if all of our projections yielded perfect results. But it is impossible to be on target with all of these individual categories, each moving more or less independently for 180 days each baseball season.

An alternative might be to focus only on the most important statistical gauges. After all, each raw data category measures only an isolated element, and some stats like batting average are flawed. Perhaps a better measure of accuracy can be gleaned by using a gauge of overall talent, like OPS. It sounds reasonable in theory. However, if I projected a player to have an OPS of about .868, for instance, he could post any of the following 2004 stat lines and my projection would still be considered a success:

     AB   HR  RBI   SB    BA   OBA   SLG    OPS
    ===  ===  ===  ===  ====  ====  ====  =====
A   240   13   40    0  .296  .365  .504  .8688
B   573   32   67   13  .255  .371  .497  .8685
C   467   19   76   10  .298  .380  .488  .8682
D   438   26   71    0  .279  .329  .539  .8679
E   704    8   60   36  .372  .413  .455  .8676
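The point of the table above can be checked with a little arithmetic. The following is a minimal sketch, assuming OPS is simply OBA plus SLG; the names `STAT_LINES`, `ops` and `spread` are illustrative and not part of any forecasting system. Because it recomputes OPS from the three-decimal OBA and SLG shown, the last digit can differ slightly from the four-decimal OPS column.

```python
# Stat lines A-E copied from the table above: (AB, HR, RBI, SB, BA, OBA, SLG).
STAT_LINES = {
    "A": (240, 13, 40, 0, 0.296, 0.365, 0.504),
    "B": (573, 32, 67, 13, 0.255, 0.371, 0.497),
    "C": (467, 19, 76, 10, 0.298, 0.380, 0.488),
    "D": (438, 26, 71, 0, 0.279, 0.329, 0.539),
    "E": (704, 8, 60, 36, 0.372, 0.413, 0.455),
}


def ops(oba, slg):
    """OPS is on-base average plus slugging average, rounded to three places."""
    return round(oba + slg, 3)


def spread(column):
    """Gap between the largest and smallest value in one stat column (by index)."""
    values = [line[column] for line in STAT_LINES.values()]
    return max(values) - min(values)


# Recompute OPS for every stat line from its rounded OBA and SLG.
ops_by_line = {label: ops(line[5], line[6]) for label, line in STAT_LINES.items()}

if __name__ == "__main__":
    for label, value in ops_by_line.items():
        print(label, f"{value:.3f}")
    # The aggregate gauge barely moves while the raw categories swing widely.
    print("AB spread:", spread(0), "HR spread:", spread(1), "SB spread:", spread(3))
```

All five lines land within a point or so of .868, yet the same five lines differ by hundreds of at-bats and dozens of home runs and steals.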
I suppose, for simulation gamers and pure scientists, these players are all comparable. And with my .868 OPS projection, any of these results would have provided for a perfect success story. But if I projected Brad Wilkerson (B) to have Jason Varitek's stats (C), I hardly think you'd consider me much of a prognosticator. Kevin Mench (D) and Ichiro Suzuki (E) are hardly comparable either, even though OPS says they are.

Admittedly, John Mabry (A) should not be in this group, but aggregate gauges like OPS make no distinction for playing time. Even if we were to separate out full-timers from bench players, OPS still can't reflect the impact of Brad Wilkerson's 100-plus additional ABs over Varitek or Mench. Despite the similarities under a gauge that measures aggregate performance, these are very different skill sets for most fantasy applications.

One way to resolve this issue might be to use a more fantasy-friendly gauge. Rotisserie dollar values can serve a dual purpose here. First, they measure only those categories that we are interested in. A second benefit is that they incorporate the importance of playing time, which OPS does not, and eliminate the problem of a John Mabry being included in this group. And in fact...

              AB   HR  RBI   SB    BA  5x5
             ===  ===  ===  ===  ====  ===
Mabry,J      240   13   40    0  .296   $7
Wilkerson,B  573   32   67   13  .255  $22
Varitek,J    467   19   76   10  .298  $18
Mench,K      438   26   71    0  .279  $15
Suzuki,I     704    8   60   36  .372  $35
...now this group is no longer cut from the same cloth. But Rotisserie values still do not negate the underlying problem with comparing sets of numbers. Varitek is a nice $18 player, but $18 doesn't always buy you the same type of statistics:

              AB   HR  RBI   SB    BA  5x5
             ===  ===  ===  ===  ====  ===
Varitek,J    467   19   76   10  .298  $18
Grissom,M    562   22   90    3  .279  $18
Wilson,J     652   11   59    8  .308  $18
Lugo,J       581    7   75   21  .275  $18
So, using dollar values doesn't work either. The last thing that a power-rich, speed-starved team needed was Marquis Grissom's numbers when you thought you were paying for Julio Lugo's.

With all these obstacles to using aggregate performance gauges, perhaps we need to refocus on projecting individual stat categories. Can this provide any better hope for defining prognosticating success?

Here is where it gets “personal.” If I were to project that Albert Pujols is going to hit 45 HRs this year and he only hits 44, you will probably accept that level of inaccuracy. But what if he hits only 43? Or 42? Or 40? Or 39? At what point do we cross that imaginary line where the projection is “officially” deemed a failure? You might say “40.” I might say, “Okay, so if Pujols has 39 HRs on the final day of the season, and he hits a long fly ball that Corey Patterson robs with an amazing over-the-wall leap, denying him #40, has that one event been the difference between success and failure?”

We have to draw a line between success and failure somewhere, but there is always going to be a grey area where it can go either way. You might consider the grey area as representing “inaccuracy.” But more important is the fact that the size of this grey area is different for everybody. In early 2003, we asked this type of question in two online polls at BaseballHQ.com. Here were the results:

If I were to project 35 HRs for Hideki Matsui this year, what is the threshold of actual HRs at which you would perceive that my projection had failed?

  HR  Share
  ==  =====
  34     2%
  32     3%
  30    18%
  28    31%
  26    24%
  24    14%
  22     5%
  20     3%

There is no clear consensus in either poll. That's why this is “personal.” Accuracy can only be assessed based on your own subjective tolerance for error. But you might say, “Shandler, there has to be some type of benchmark I can use. There has to be some way to gauge accuracy.” I'm not so sure.
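The Matsui poll results can also be read cumulatively, which makes the lack of consensus easier to see. What follows is only a sketch under one assumption that the poll itself does not spell out: a respondent who chose threshold T is taken to call the 35-HR projection a failure whenever the actual total is T or lower. The names `POLL` and `share_calling_failure` are illustrative.

```python
# Poll shares copied from the text: failure threshold (actual HRs) -> % of respondents.
POLL = {34: 2, 32: 3, 30: 18, 28: 31, 26: 24, 24: 14, 22: 5, 20: 3}


def share_calling_failure(actual_hr, poll=POLL):
    """Percent of respondents who would call the 35-HR projection a failure.

    Assumption: a respondent with threshold T sees failure when actual HRs <= T,
    so we add up every share whose threshold is at or above the actual total.
    """
    return sum(pct for threshold, pct in poll.items() if threshold >= actual_hr)


if __name__ == "__main__":
    for actual in (34, 30, 28, 26, 20):
        print(f"{actual} HRs: {share_calling_failure(actual)}% call it a failure")
```

Under that reading, the same 28-HR season is a failed projection to roughly half the respondents and an acceptable one to the other half.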
There are some people who might consider a broad-stroke approach to be sufficient, using a flat percentage benchmark across all categories. For instance, you might be satisfied if a projection was off by only 10% across the board. Doesn't that seem reasonable? But a casual “eyeball test” can be deceiving. To wit:

        AB    R    H   HR  RBI   SB    BA
       ===   ==  ===   ==  ===   ==  ====
PROJ   550   79  169   29  113   13  .307
ACT    599   70  169   26  100   10  .282
At first glance, this looks like a pretty good projection, at least one that you wouldn't be too unhappy with had you expected to purchase that first set of stats. Our eyeball test says that his overall productivity was pretty much on target. In reality, four of the seven statistics (runs, HRs, RBIs and SBs) were mis-projected by over 10%, and at-bats and batting average each missed by 8-9%. Based on the “acceptable” 10% tolerance, this projection failed in most categories.

Of course, I could just loosen that tolerance, perhaps to 15% or 20%, which will boost our perceived success rate, but the eyeball test will get much fuzzier. Here is the above example with actual results missing the projection by roughly 15-20% in most categories:

        AB    R    H   HR  RBI   SB    BA
       ===   ==  ===   ==  ===   ==  ====
PROJ   550   79  169   29  113   13  .307
ACT    632   62  169   23   87    7  .267
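The flat-tolerance benchmark applied to the two tables above can be sketched in a few lines. This is an illustration only, and it assumes the error is measured as a percentage of the projected figure, a choice the text does not specify; `pct_errors` and `misses` are hypothetical helper names.

```python
# Projected and actual lines copied from the two tables above.
PROJ = {"AB": 550, "R": 79, "H": 169, "HR": 29, "RBI": 113, "SB": 13, "BA": 0.307}
ACT_1 = {"AB": 599, "R": 70, "H": 169, "HR": 26, "RBI": 100, "SB": 10, "BA": 0.282}
ACT_2 = {"AB": 632, "R": 62, "H": 169, "HR": 23, "RBI": 87, "SB": 7, "BA": 0.267}


def pct_errors(proj, act):
    """Absolute error in each category as a percentage of the projection (assumed basis)."""
    return {cat: abs(act[cat] - proj[cat]) / proj[cat] * 100 for cat in proj}


def misses(proj, act, tolerance_pct):
    """Categories whose percentage error exceeds the flat tolerance."""
    return [cat for cat, err in pct_errors(proj, act).items() if err > tolerance_pct]


if __name__ == "__main__":
    for label, act, tol in (("first table", ACT_1, 10), ("second table", ACT_2, 20)):
        errors = pct_errors(PROJ, act)
        print(label, {cat: round(err, 1) for cat, err in errors.items()})
        print(f"  outside {tol}%:", misses(PROJ, act, tol))
```

Measuring the error against the actual figure instead of the projection would shift the percentages slightly, which is one more way two readers can score the same projection differently.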
My own eyeball test says that, while the second projection was marginally in the ballpark, perhaps a 20% error is beyond the limits of my comfort level. But again, you might look at the above results and think these are perfectly fine within your tolerance for error. Can we agree on anything? Not likely.

The irony with the above examples is that, despite the shortfalls in batting average, both projections nailed this player's hit total. All of which begets other questions... If a slugging average projection is dead on, but the player hits 10 fewer HRs than expected (and likely, 20 more doubles), is that a success or a failure? Many questions, but all rhetorical.

When it comes down to it, perhaps the only thing we can really trust is the eyeball test and our own personal tolerance for error. A fantasy leaguer with a loaded bullpen doesn't care whether his third closer puts up 40 saves or 30. When you are leading your league in home runs by 25, it doesn't matter whether Jeff Bagwell hits 39 HRs or 27. And when all the aggregates wash out come October, the fact that your $25 Barry Zito saw his ERA rise by over a run will only affect your team's bottom line by 0.15, which in most leagues is a loss of maybe 2-3 points at worst.
SHANDLER ENTERPRISES, P.O. Box 20303, Roanoke, VA 24018, 540-772-6315