SPECIAL REPORT
"The Great Myths of Projective Accuracy"
by Ron Shandler

Ashley-Perry Statistical Axiom #1: Skill in manipulating numbers is a talent, not evidence of divine guidance.

Ashley-Perry Statistical Axiom #2: The product of an arithmetical computation is the answer to an equation; it is not the solution to a problem.

Merkin's Maxim: When in doubt, predict that the present trend will continue.

The quest continues for the most accurate baseball forecasting system.

I've been publishing player projections for the better part of two decades. During that time, I have been made privy to the work of many fine analysts and many fine forecasting systems. But through all their fine efforts at attempting to predict the future, there have been certain constants. The core of every system has been composed of pretty much the same elements:

  • Players will perform within the framework of their past history and/or trends.
  • Skills will develop and decline according to age.
  • Statistics will be shaped by a player's health, expected role and home ballpark.

    These are the elements that keep all projections within a range of believability. This is what prevents us from predicting a 40-HR season out of Juan Pierre or 40 SBs for David Ortiz. However, within this range of believability is a great black hole where any semblance of precision seems to disappear. Yes, we know that Albert Pujols is a leading power hitter, but whether he is going to hit 40 HRs, or 45, or 35, or 50, is a mystery.

    You see, while all these systems are built upon the same basic elements, they also are constrained by the same global limitations. We are all still trying to project...

  • a bunch of human beings
  • each with their own individual skill sets
  • each with their own individual rates of growth and decline
  • each with different abilities to resist and recover from injury
  • each limited to opportunities determined by other people
  • and each generating a group of statistics largely affected by tons of external noise.

    As much as we all acknowledge these limitations as being intuitive, we continue to resist them because the game is so darned measurable. The problem is that we do have some success at predicting the future and that limited success whets our desire, luring us into believing that a better, more accurate system awaits just beyond the next revelation. So we work feverishly to try to find the missing link to success, creating vast, complex models that track obscure trends and relationships, and attempt to bring us ever closer to perfection. But for many of us fine analysts, all that work only takes us deeper and deeper into the abyss.

    Why? Because perfection is impossible and nobody seems to have a real clear vision of what success is.

    Measuring success

    Is reasonable predictive accuracy even an attainable goal? Most agree that, given external variables such as injuries, managerial decisions and the like, only about 65-70% of the player population is even marginally predictable in any given year. But even within that group, you cannot get two analysts to agree about what it means to be accurate.

    In truth, the only completely accurate projection would be one that looks like this:

           AB    HR    RBI    SB    BA    OBA    SLG    OPS
          ===   ===    ===   ===   ===   ====   ====   ====
    PROJ  500    25     95    15   .280  .330   .450   .780
    ACT   500    25     95    15   .280  .330   .450   .780

    Clearly, you would be overjoyed if all of our projections yielded perfect results. But it is impossible to be on target with all of these individual categories, each moving more or less independently for 180 days each baseball season.

    An alternative might be to focus only on the most important statistical gauges. After all, each raw data category measures only an isolated element, and some stats like batting average are flawed. Perhaps a better measure of accuracy can be gleaned by using a gauge of overall talent, like OPS.

    It sounds reasonable in theory. However, if I projected a player to have an OPS of about .868, for instance, he could post any of the following 2004 stat lines and my projection would still be considered a success:

           AB    HR    RBI    SB    BA    OBA    SLG    OPS
          ===   ===    ===   ===   ===   ====   ====   =====
    A     240    13     40     0  .296   .365   .504   .8688
    B     573    32     67    13  .255   .371   .497   .8685
    C     467    19     76    10  .298   .380   .488   .8682
    D     438    26     71     0  .279   .329   .539   .8679
    E     704     8     60    36  .372   .413   .455   .8676

    I suppose, for simulation gamers and pure scientists, these players are all comparable. And with my .868 OPS projection, any of these results would have counted as a perfect success story. But if I projected Brad Wilkerson (B) to have Jason Varitek's stats (C), I hardly think you'd consider me a heck of a prognosticator. Kevin Mench (D) and Ichiro Suzuki (E) are hardly comparable either, even though OPS says they are.

    Admittedly, John Mabry (A) should not be in this group, but aggregate gauges like OPS make no distinction for playing time. Even if we were to separate out full-timers from bench players, OPS still can't reflect the impact that Brad Wilkerson's additional 100-plus ABs have over Varitek or Mench.

    Despite the similarities using a gauge that measures aggregate performance, these are very different skill sets for most fantasy applications.
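
    To make the OPS comparison concrete, here is a minimal Python sketch that recomputes OPS as OBA plus SLG for the five stat lines above and contrasts how tightly OPS clusters with how widely the counting stats spread. The StatLine type and the spread helper are illustrative names, not part of any library. Because the inputs are the rounded three-decimal figures from the table, the last digit can differ from the four-decimal OPS shown there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatLine:
    """One row of the 2004 table (illustrative container, not a library type)."""
    label: str
    ab: int
    hr: int
    sb: int
    oba: float
    slg: float

    @property
    def ops(self) -> float:
        # OPS is simply on-base average plus slugging average.
        return self.oba + self.slg


LINES = [
    StatLine("A", 240, 13, 0, 0.365, 0.504),
    StatLine("B", 573, 32, 13, 0.371, 0.497),
    StatLine("C", 467, 19, 10, 0.380, 0.488),
    StatLine("D", 438, 26, 0, 0.329, 0.539),
    StatLine("E", 704, 8, 36, 0.413, 0.455),
]


def spread(values):
    """Gap between the largest and smallest value in a category."""
    return max(values) - min(values)


if __name__ == "__main__":
    for line in LINES:
        print(f"{line.label}: OPS {line.ops:.3f}  AB {line.ab}  HR {line.hr}  SB {line.sb}")
    print("OPS spread:", round(spread([line.ops for line in LINES]), 3))
    print("AB spread: ", spread([line.ab for line in LINES]))
    print("HR spread: ", spread([line.hr for line in LINES]))
    print("SB spread: ", spread([line.sb for line in LINES]))
```

    The OPS values sit within about a point of each other, while at-bats, home runs and stolen bases differ by hundreds, dozens and dozens respectively.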

    One way to resolve this issue might be to use a more fantasy-friendly gauge. Rotisserie dollar values can serve a dual purpose here. First, they measure only those categories that we are interested in. A second benefit is that they incorporate the importance of playing time – which OPS does not – and eliminate the problem of a John Mabry being included in this group. And in fact...

                   AB    HR    RBI    SB     BA    5x5
                  ===   ===    ===    ==    ===    ===
    Mabry,J       240    13     40     0   .296     $7
    Wilkerson,B   573    32     67    13   .255    $22
    Varitek,J     467    19     76    10   .298    $18
    Mench,K       438    26     71     0   .279    $15
    Suzuki,I      704     8     60    36   .372    $35

    ...now this group is no longer cut from the same cloth. But Rotisserie values still do not negate the underlying problem with comparing sets of numbers. Varitek is a nice $18 player, but $18 doesn't always buy you the same type of statistics:

                   AB    HR    RBI    SB     BA    5x5
                  ===   ===    ===    ==    ===    ===
    Varitek,J     467    19     76    10   .298    $18
    Grissom,M     562    22     90     3   .279    $18
    Wilson,J      652    11     59     8   .308    $18
    Lugo,J        581     7     75    21   .275    $18

    So, using dollar values doesn't work either. The last thing that a power-rich, speed-starved team needed was Marquis Grissom's numbers when you thought you were paying for Julio Lugo's. With all these obstacles to using aggregate performance gauges, perhaps we need to refocus on projecting individual stat categories. Can this provide any better hope for defining prognosticating success?

    Here is where it gets “personal.”

    If I were to project that Albert Pujols is going to hit 45 HRs this year and he only hits 44, you will probably accept that level of inaccuracy. But what if he hits only 43? Or 42? Or 40? Or 39? At what point do we cross that imaginary line where the projection is “officially” deemed a failure?

    You might say “40.” I might say, “Okay, so if Pujols has 39 HRs on the final day of the season, and he hits a long fly ball that Corey Patterson makes an amazing over-the-wall leap to rob him of #40, has that one event been the difference between success and failure?” We have to draw a line between success and failure somewhere, but there is always going to be a grey area where it can go either way. You might consider the grey area as representing "inaccuracy." But more important is the fact that the size of this grey area is different for everybody.

    In early 2003, we asked this type of question in two online polls at BaseballHQ.com. Here were the results:

    If I were to project 35 HRs for Hideki Matsui this year, what is the threshold of actual HRs at which you would perceive that my projection had failed?

            HR    Votes
           ===    =====
            34       2%
            32       3%
            30      18%
            28      31%
            26      24%
            24      14%
            22       5%
            20       3%

    If I were to project 15 wins for Tom Glavine this year, what is the threshold of actual wins at which you would perceive that my projection had failed?

          Wins    Votes
          ====    =====
            14       4%
            13      10%
            12      33%
            11      27%
            10      17%
             9       3%
             8       2%
             7       3%

    There is no clear consensus in either poll. That's why this is “personal.” Accuracy can only be assessed based on your own subjective tolerance for error.
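
    One way to read the two polls is sketched below in Python. It rests on an interpretation that is an assumption here, not something the polls stated: a respondent who names a threshold is taken to judge the projection a failure at that total or anything lower. The function names are illustrative.

```python
# Poll results from the article: threshold -> percent of votes.
MATSUI_HR_POLL = {34: 2, 32: 3, 30: 18, 28: 31, 26: 24, 24: 14, 22: 5, 20: 3}
GLAVINE_WINS_POLL = {14: 4, 13: 10, 12: 33, 11: 27, 10: 17, 9: 3, 8: 2, 7: 3}


def share_calling_failure(poll, actual):
    """Percent of respondents whose threshold is at or above `actual`.

    Assumption: naming threshold t means "failure at t or fewer".
    """
    return sum(pct for threshold, pct in poll.items() if threshold >= actual)


def most_common_threshold(poll):
    """The single answer that drew the largest share of votes."""
    return max(poll, key=poll.get)


if __name__ == "__main__":
    for name, poll in (("Matsui HR", MATSUI_HR_POLL), ("Glavine wins", GLAVINE_WINS_POLL)):
        top = most_common_threshold(poll)
        print(f"{name}: most common threshold {top} ({poll[top]}% of votes)")
        for actual in sorted(poll, reverse=True):
            print(f"  actual {actual}: {share_calling_failure(poll, actual)}% call it a failure")
```

    Even the most popular answer in each poll drew only about a third of the votes, which is the lack of consensus described above.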

    But you might say, “Shandler, there has to be some type of benchmark I can use. There has to be some way to gauge accuracy.”

    I'm not so sure. There are some people who might consider a broad stroke approach to be sufficient, using a flat percentage benchmark across all categories. For instance, you might be satisfied if a projection was off by only 10% across-the-board. Doesn't that seem reasonable? But a casual “eyeball test” can be deceiving. To wit:

            AB    R    H   HR   RBI   SB    BA
           ===   ==  ===   ==   ===   ==   ====
    PROJ   550   79  169   29   113   13   .307
    ACT    599   70  169   26   100   10   .282

    At first glance, this looks like a pretty good projection, at least one that you wouldn't be too unhappy with had you expected to purchase that first set of stats. Our eyeball test says that his overall productivity was pretty much on target. In reality, his runs, home runs, RBIs and stolen bases were each mis-projected by over 10%, and his at-bats and batting average missed by 8-9%. Based on the “acceptable” 10% tolerance, this projection failed in most categories. Of course, I could just loosen that tolerance, perhaps to 15% or 20%, which will boost our perceived success rate, but the eyeball test will get much fuzzier.
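
    A flat tolerance test of this kind can be sketched in a few lines of Python. The sketch assumes error is measured as the absolute miss divided by the projected value, which is only one possible definition, and the function names are illustrative.

```python
# The projected and actual lines from the example above.
PROJECTED = {"AB": 550, "R": 79, "H": 169, "HR": 29, "RBI": 113, "SB": 13, "BA": 0.307}
ACTUAL = {"AB": 599, "R": 70, "H": 169, "HR": 26, "RBI": 100, "SB": 10, "BA": 0.282}


def percent_error(projected, actual):
    """Absolute miss as a percentage of the projected value (assumed definition)."""
    return abs(actual - projected) / projected * 100


def misses(projected, actual, tolerance_pct):
    """Categories whose percent error exceeds the given tolerance."""
    return [
        category
        for category in projected
        if percent_error(projected[category], actual[category]) > tolerance_pct
    ]


if __name__ == "__main__":
    for category in PROJECTED:
        err = percent_error(PROJECTED[category], ACTUAL[category])
        print(f"{category:>3}: {err:5.1f}% off")
    for tolerance in (10, 15, 20):
        print(f"Outside {tolerance}%:", misses(PROJECTED, ACTUAL, tolerance))
```

    Measuring the miss against the actual value instead of the projected one would shift the percentages slightly, which is one more place where two analysts can disagree.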

    Here is the same projection with actual results missing by roughly 15-20% or more in most categories:

            AB    R    H   HR   RBI   SB    BA
           ===   ==  ===   ==   ===   ==   ====
    PROJ   550   79  169   29   113   13   .307
    ACT    632   62  169   23    87    7   .267

    My own eyeball test says that, while this projection was marginally in the ballpark, perhaps a 20% error is beyond the limits of my comfort level. But again, you might look at the above results and think these are perfectly fine within your tolerance for error. Can we agree on anything? Not likely.

    The irony with the above examples is that, despite the shortfalls in batting average, both projections nailed this player's hit total. All of which begets other questions...

    If a slugging average projection is dead on, but the player hits 10 fewer HRs than expected (and likely, 20 more doubles), is that a success or a failure?

    If a projection of hits and walks allowed by a pitcher is on the mark, but the bullpen and defense implode and inflate his ERA by a run, is that a success or a failure?

    If the projection of a speedster's rate of stolen base success is perfect, but his team replaces the manager in May with one who doesn't run, and the player ends up with half as many SBs as expected, is that a success or a failure?

    If a batter is traded to Colorado and all the touts project an increase in production, but he posts a statistical line exactly matching what would have been projected had he not been traded, is that a success or a failure?

    If the projection for a bullpen closer's ERA, WHIP and peripheral numbers is perfect, but he saves 20 games instead of 40 because the GM decided to bring in a high-priced free agent at the trading deadline, is that a success or a failure?

    If I project a .272 batting average in 550 AB and the player only hits .249, is that a success or a failure? Most will say "failure." But, wait a minute! The real difference is only two hits per month. That shortfall of 23 points in batting average is because a fielder might have made a spectacular play, or a screaming liner might have been hit right at someone, or a long shot to the outfield might have been held up by the wind... once every 14 games. Does that constitute "failure"?
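
    The arithmetic behind that batting average example can be checked with a short Python sketch. The six-month season is an assumption made here; the 180-day figure is the season length cited earlier in this article. The function name is illustrative.

```python
SEASON_MONTHS = 6    # assumption: a six-month regular season
SEASON_DAYS = 180    # season length cited earlier in the article


def hits_gap(at_bats: int, projected_ba: float, actual_ba: float) -> float:
    """Hits separating two batting averages over the same number of at-bats."""
    return (projected_ba - actual_ba) * at_bats


if __name__ == "__main__":
    gap = hits_gap(550, 0.272, 0.249)
    print(f"Hits separating .272 from .249 in 550 AB: {gap:.2f}")
    print(f"Per month: {gap / SEASON_MONTHS:.1f}")
    print(f"Days between each missing hit: {SEASON_DAYS / gap:.1f}")
```

    Twenty-three points over 550 at-bats works out to fewer than 13 hits, or about two per month, spread across the whole season.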

    Many questions, but all rhetorical.

    When it comes down to it, perhaps the only thing we can really trust is the eyeball test and our own personal tolerance for error. A fantasy leaguer with a loaded bullpen doesn't care whether his third closer puts up 40 saves or 30. When you are leading your league in home runs by 25, it doesn't matter whether Jeff Bagwell hits 39 HR or 27. And when all the aggregates wash out come October, the fact that your $25 Barry Zito saw his ERA rise by over a run will only affect your team's ERA by about 0.15 – in most leagues, a loss of maybe 2-3 points at worst.
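
    The Zito figure follows from the fact that a team's ERA is weighted by innings pitched. The Python sketch below shows the calculation under stated assumptions: the 210 innings for the pitcher and 1,400 innings for the fantasy staff are hypothetical round numbers chosen for illustration, not figures from this article, and the function name is illustrative.

```python
def team_era_shift(pitcher_era_change: float, pitcher_ip: float, team_ip: float) -> float:
    """Change in team ERA when one pitcher's ERA moves, weighted by his share of innings."""
    return pitcher_era_change * pitcher_ip / team_ip


if __name__ == "__main__":
    # Hypothetical inputs: a one-run ERA rise, 210 IP for the pitcher,
    # 1,400 IP for the whole fantasy staff.
    shift = team_era_shift(1.0, 210, 1400)
    print(f"Team ERA rises by about {shift:.2f}")
```

    A starter who throws roughly one-seventh of the staff's innings passes along only about one-seventh of his own ERA change to the team total.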






    SHANDLER ENTERPRISES, P.O. Box 20303, Roanoke, VA 24018, 540-772-6315


  • E-mail your comments to: comments@fantasybaseballfriday.com