Tuesday, March 13, 2007

It's All About Context

When you don't understand stats, it's easy to rely on the hoary old quote about "lies, damn lies, and statistics." Like most old saws, there's a grain of truth in that. And that's certainly true with baseball, which really is a game of statistics.

You can't talk baseball without talking numbers. How many homers did Ryan Zimmerman hit? That's a stat. What's John Patterson's ERA? That's a stat. Wow, Cordero saved a game today! That's a stat. Hell, even the linescore gives us three different statistics. Even the most grizzled old-timer would acknowledge that those stats aren't worse than those "damn lies."

The key to stats is to understand what they're trying to tell you. I've written previously about my belief that stats are adjectives. They're numbers that describe what's happening on the field in a shorthand way. If I tell you that someone's a 40 home run hitter, you get an immediate picture in your mind of a slugger, likely a first baseman or an outfielder. Maybe you even picture a specific player. When used right, and in the right combinations, stats can describe a player's abilities as well as the best writer can.

The catch, though, is that just as the writer needs to choose his words carefully to paint the proper picture, the person using the stats needs to use the right ones for an accurate description. Unfortunately, all too often in sports writing, you get junk stats, especially when people are trying to support an argument with stats (cough, Boswell, cough) rather than making an argument based on what the stats tell you.

That's why it's important to consider the context of the stats, and understand what they're describing so you can think critically about them.

In another recurring series that I'll never actually get around to finishing, I'll walk through some of the basic everyday stats, what they mean, and why they're important.

Baseball stats basically come in two flavors: rates and counting stats. The latter tell you how many of something someone did, the former how often.

For example, hits are a counting stat. Juan Pierre led the NL last year with 204 hits. Batting average is a rate stat. It's simply the number of hits divided by ABs. In Pierre's case, despite leading the league with 204 hits, he only batted .292, which was 27th best. Which of those approaches is better?

Well, they both have their biases.

A counting stat is typically a function of opportunity, so relying on it alone can mislead you. Pierre led the league in hits not because he was particularly skillful at getting hits, but because he had more opportunities than anyone else, leading the league in Games and At Bats. A player who plays more is simply going to have more time to accumulate counting stats than someone who plays less frequently. That doesn't mean he's necessarily better though.

Conversely, a sole reliance on rate stats won't get it done. To pick a silly example, Nats RP Travis Hughes led the team in batting average. He was a perfect 1 for 1. For a more realistic comparison, Daryle Ward batted .309 as a Nat. But he did that in 104 ABs, and he fell about 170 hits short of Pierre's total.
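If you want to check that arithmetic yourself, here's a rough sketch in Python. It only uses the numbers quoted above. The function name batting_average is my own invention, and Ward's hit total is an estimate backed out of his .309 average and 104 ABs, not an official figure.

```python
def batting_average(hits, at_bats):
    """Rate stat: hits divided by at-bats."""
    if at_bats == 0:
        raise ValueError("batting average is undefined with zero at-bats")
    return hits / at_bats


# Travis Hughes went 1 for 1: a perfect rate on almost no opportunity.
hughes_avg = batting_average(1, 1)

# Daryle Ward batted .309 in 104 ABs. Working backwards from the rate
# gives an estimated hit count (an assumption, rounded to a whole hit).
ward_hits = round(0.309 * 104)

# Juan Pierre's league-leading counting stat, as quoted in the post.
pierre_hits = 204
gap = pierre_hits - ward_hits

print(f"Hughes: {hughes_avg:.3f}")
print(f"Ward: roughly {ward_hits} hits, {gap} short of Pierre")
```

Same three players, and the rate and the count tell you completely different stories depending on which one you look at.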

Which approach is more valuable? Neither, really. It all depends on the context of what you're looking for.

It's just one thing that you need to keep in mind when looking at a stat. Always ask yourself what the stat is describing and what biases there might be in its presentation.

More at some point... looking at Pythagoras and his relationship with runs, but also our good friends batting average, on-base percentage and slugging. I know you can't wait!


  • I'm no stats geek. I can tell Austin Kearns has heart because of all his game-winning RBI.

    By Blogger Ryan, at 3/13/2007 3:37 PM  

  • http://sports.espn.go.com/mlb/news/story?id=2797237

    I mean this seriously. Hooray for innovation!

    By Blogger Rage, at 3/13/2007 4:01 PM  
