Cubs 2007 Pitch Tracking: Pictures Worth a Thousand Curves
One of the latest and most exciting developments in baseball research is the measurement and analysis of individual pitches. For instance, the Pitch f/x system created by the company Sportvision tracks the in-flight movement of pitches from two different cameras, thereby assessing a pitch's velocity and its horizontal and vertical movement. A bit less than a quarter of all pitches from last year were so assessed, and MLB has made the raw contents of that data available at this location. Better yet, there are several bloggers who, unlike me, have the talent and dedication to transform that heaping mess of data into meaningful findings. Most notably, Josh Kalk has been developing player cards, a la what's available at Baseball-Reference or Fan Graphs or Baseball Cube, except with graphs incorporating this incredible new source of information on pitch selection and pitch behavior. He also has developed a remarkable application where you can select any player and any pitch with just about any limiting parameter you could want - say, Bob Howry fastballs to right-handed hitters on 0-2 counts with a velocity above 93 MPH that resulted in swinging strikes - and then view the results on a handy X/Y graph.
As if that's not enough, there's the more user-friendly if less revolutionary pitch data commercially available at Baseball Info Solutions, which is being applied by the talented folks at Fan Graphs. Fan Graphs now offers data on individual players' pitch selections and velocity, all thoroughly sortable. For instance, Tim Wakefield and Chad Bradford feature the two slowest average fastballs in the majors at 74.2 and 78.6 MPH, respectively, while no one threw a changeup with greater frequency last year than Matt Wise, at 54%.
There's a gold mine of potential information available at our fingertips, with The Baseball Analysts and The Hardball Times leading the way in this sort of analysis. With far less sophistication than what those guys can offer, let's see what it can tell us about the Cubs' staff.
First, the most basic stuff, drawing from the Fan Graphs info: who has bragging rights for biggest fastball on the Cubs staff? Who wants to conceal the smallness of their fastball in shame? From 2007, Fan Graphs says it's Marmol, at an average of 93.3 MPH, followed by Wood at 92.9. Zambrano doesn't even medal, coming in at fifth. The full results being:
Marmol 93.3
Wood 92.9
Howry 92.3
Dempster 92.0
Zambrano 91.6
Hart 91.5
Eyre 91.0
Wuertz 90.5
Marquis 90.4
Hill 89.4
Lieber 88.5
Lilly 88.4
Marshall 86.8
You could win a lot of bar bets on the question of who throws the faster average fastball, between Dempster and Z.
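If you'd rather settle that bet with a few lines of code than with a bar napkin, here's a minimal Python sketch. The velocities are the Fan Graphs averages listed above; the variable names are just my own labels, not anything Fan Graphs provides.

```python
# Average 2007 fastball velocity (MPH), copied from the list above.
fastball_mph = {
    "Marmol": 93.3, "Wood": 92.9, "Howry": 92.3, "Dempster": 92.0,
    "Zambrano": 91.6, "Hart": 91.5, "Eyre": 91.0, "Wuertz": 90.5,
    "Marquis": 90.4, "Hill": 89.4, "Lieber": 88.5, "Lilly": 88.4,
    "Marshall": 86.8,
}

# Sort fastest to slowest and record each pitcher's rank (1 = fastest).
ranked = sorted(fastball_mph.items(), key=lambda item: item[1], reverse=True)
rank_of = {name: place for place, (name, _) in enumerate(ranked, start=1)}

# The bar-bet question: Dempster versus Zambrano.
gap = round(fastball_mph["Dempster"] - fastball_mph["Zambrano"], 1)

for place, (name, mph) in enumerate(ranked, start=1):
    print(f"{place:2d}. {name:<9} {mph:.1f}")
print(f"Zambrano ranks #{rank_of['Zambrano']}; Dempster leads him by {gap} MPH")
```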
Ok, who throws the fastball with the greatest frequency? This is interesting in that Howry throws his fastball a whopping 18 percentage points more often than the second most frequent gas-passer, Zambrano. Howry comes in at 86.2%, to Zambrano's 68.2%.
Five pitchers threw the fastball less than half the time last year - Marmol, Dempster, Marshall, Wuertz and Hart, with Hart at just 31.3%. Wuertz and Marmol both throw sliders slightly more than 50% of the time, with Hart and Dempster throwing them a third of the time. To no surprise, Hill throws more curves than anyone, at 27.3%, with Marshall, Lilly and Hart then following, all in the mid to upper teens. No one else breaks ten percent. Dempster is the only changeup artist of the group, at 20% frequency, with Lilly, Marquis and Marshall just cracking ten percent.
That's all fun and good, but more interesting are observations like the one made at The Hardball Times, which points out that Marmol's use of his slider jumped from 7.1% of all his pitches in 2006 to 51% in 2007. Looking at it further, Fan Graphs' data indicate that Marmol did that at the expense of a changeup, which he stopped throwing entirely after using it 11.6% of the time in 2006, and a curveball percentage that fell off the table, if you will, from 19.7% to 1.3%. It's also interesting to see that he gained about a mile and a half per hour of velocity on both the fastball and slider, compared to the prior year. How far does this change in approach go towards explaining his breakout season? How badly does that throw off the legitimacy of the very pessimistic projections listed at Fan Graphs, all of which see last year as an aberration, and have Marmol falling back to earth in 2008?
The change in Marmol's numbers is by far the most pronounced, but there are some other interesting year-to-year differences in pitch compositions.
- Marshall had the next most dramatic change, as he threw his slider 20.1% of the time in 2007 as opposed to just 2.4% of the time in 2006, while his percentage of change ups and fastballs dropped by about 10 and 9 percent respectively.
- Wood almost completely abandoned his changeup last year, after throwing it just under ten percent of the time in 2006.
- Zambrano's velocity on the fastball has dropped from 92.8 to 92.2 to 91.6 over the three years of available data, while the average speeds on the curveball and changeup have both increased by one mile per hour, up to 72.6 and 83.5, respectively. That's 2.2 MPH less difference between the fastball and changeup than where he was at in 2005. (In one early demonstration of the analytical opportunities that these new data offer, Lookout Landing has just posted a list of the ten pitchers who gained and lost the most velocity over that time span. At least Z isn't in Jason Jennings territory. Yet.)
- Hill threw his fastball 10% less frequently than the previous year, with all his secondary pitches showing slight upward ticks.
- Marquis almost doubled his use of the slider, from 8.8 to 16.4%.
- Hart is the anti-Howry, throwing 31.3% fastballs, 33.7% sliders, 15.7% curveballs, 17.5% cutters, and a handful of changeups.
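The Zambrano arithmetic in the list above is worth spelling out. Here's a quick Python sketch using only the numbers quoted there. The 2005 changeup speed isn't listed directly, so I back it out from the one mile per hour increase, which is an assumption on my part.

```python
# Zambrano's average velocities (MPH), from the bullet above.
fastball_2005 = 92.8
fastball_2007 = 91.6
changeup_2007 = 83.5

# Assumption: "increased by one mile per hour" means exactly 1.0 MPH since 2005.
changeup_2005 = changeup_2007 - 1.0

# Separation between fastball and changeup in each season.
gap_2005 = fastball_2005 - changeup_2005
gap_2007 = fastball_2007 - changeup_2007
lost_separation = round(gap_2005 - gap_2007, 1)

print(f"2005 fastball-changeup gap: {gap_2005:.1f} MPH")
print(f"2007 fastball-changeup gap: {gap_2007:.1f} MPH")
print(f"Separation lost: {lost_separation} MPH")
```

Part of the loss comes from the fastball slowing down and the rest from the changeup speeding up, which is why the two small changes add up to a larger one.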
Fun fun fun. But then, there's the Pitch f/x device, which is good for even more hours of wasted time. It looks like the application still has some serious bugs in it (as does my ability to use it correctly, no doubt) and user-friendliness issues (again, not unlike yours truly), but take a look at this image as an example:
These are the data on 254 sliders tracked by Pitch f/x that Marmol threw to right-handed batters last year. A few things to note: the service is not yet available at all parks at all times, so it's not a complete sample. You're looking at this chart as if from behind home plate, with a right-handed batter standing to the left. And again, there are several hiccups in the system, the most problematic one being that when I enter "lefty" for batter it spits out the graph and charts for "righty" and vice versa.
Let's clean that graph up a bit, and look only at Marmol's called strikes on sliders to righties:
Not much of a pattern here; he's getting called strikes all over the zone. But then, take a look at their data on the swinging strikes off the slider thrown to right-handed hitters:
Intuitively, it's exactly what you'd expect - low and away out of the zone. But visually, I still find this a very striking demonstration.
As you might have noticed from some of my game recaps, I became sort of transfixed last season with the hypnotic quality of Howry's relief appearances, just drilling one 94 mph fastball after another on the low outside corner to right-handed hitters, until their eventual demise.
Here's the graph of 275 Howry fastballs to right-handed hitters. Again, none of the images below are complete data sets:
It seems to show a tendency towards the outer half, but let's clear that up and show just the 28 available swinging strikes on fastballs to righties:
Not quite what I'd anticipated, but not surprising either: lots of swings and misses at high fastballs. But what about my low and outside fastballs? Let's try 44 called-strike fastballs to righties.
There we go. It's still not as vivid as what I thought I was seeing in person, or as that Marmol graph, but there we have a bunch of outside fastballs to righties. I always like it when my subjective viewing of the game matches up with the data.
That's still relatively simple stuff, but the possibilities of this are just mind-boggling, and are keeping me up past my bedtime. Let's flip this, and look at it from the hitter's perspective. Here are right-handed pitchers throwing sliders to Alfonso Soriano:
Cleaning that up, here are the 12 base hits Pitch f/x has for Soriano on sliders from righties:
That's some good bad-ball hitting, there. Notice, too, that the two home runs are on the sliders that likely were the biggest mistakes - the one furthest inside and the one highest in the zone. Of course, there's also this chart of Soriano's swinging-strikes on sliders from righties.
Also about what you would expect. But for me, the most interesting thing I've found, and that I'm almost capable of understanding and applying, deals with pitch movement. For instance, who has bigger curves, Hill or Lilly? You can go to the application (or just look at Hill's player card, which I'll do in a bit) and set it to show you the results of all Hill curveballs by "break" instead of by "location." You're then informed by a chart that Hill's curve averaged 73.85 mph, with a horizontal break of -6.84 inches (the negative meaning it is breaking towards right-handed hitters, or away from lefties) and a vertical break of -8.34 inches. One of the more difficult things to grasp (well, at least for me) is that these numbers are standardized against a theoretical pitch thrown without spin under idealized conditions. So that means that Hill's curve is breaking downward 8.34 inches more than a baseball thrown under these theoretical conditions would. A pitch like Zambrano's fastball, which has a positive value for its vertical break, is not actually rising; it's just sinking less than the normalized pitch would.
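To convince myself of that last point, here's a back-of-the-envelope sketch in Python. This is not how Pitch f/x computes anything. It assumes a spinless pitch that travels about 55 feet at constant speed with no air resistance, so gravity is the only thing pulling it down. The 55-foot distance and the function names are my own assumptions.

```python
GRAVITY_FT_PER_S2 = 32.174   # standard gravity
FLIGHT_DISTANCE_FT = 55.0    # assumed release-to-plate distance (hypothetical)

def gravity_drop_inches(speed_mph: float) -> float:
    """Drop of an idealized spinless pitch, in inches, from gravity alone."""
    speed_ft_per_s = speed_mph * 5280 / 3600
    flight_time_s = FLIGHT_DISTANCE_FT / speed_ft_per_s
    drop_ft = 0.5 * GRAVITY_FT_PER_S2 * flight_time_s ** 2
    return drop_ft * 12

def total_drop_inches(speed_mph: float, vertical_break_in: float) -> float:
    """Gravity drop minus the reported vertical break.

    A negative break (like a curveball's) adds to the drop;
    a positive break (like a fastball's) subtracts from it.
    """
    return gravity_drop_inches(speed_mph) - vertical_break_in

# Hill's curve: 73.85 mph with a vertical break of -8.34 inches.
hill_curve_drop = total_drop_inches(73.85, -8.34)

# Gravity-only drop at Zambrano's 91.6 MPH average fastball speed.
zambrano_gravity_drop = gravity_drop_inches(91.6)

print(f"Hill curve, total drop: {hill_curve_drop:.1f} in")
print(f"91.6 MPH pitch, gravity-only drop: {zambrano_gravity_drop:.1f} in")
```

Under these simplified assumptions, a pitch at Zambrano's fastball speed falls well over two feet from gravity alone, so a positive vertical break of a few inches just means it falls a bit less than that.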
If I've lost you (I may have lost myself), let's go to the image showing the movement on Hill's curve.
Again, don't confuse this graph with a depiction of the strike zone. 0-0 coordinates are where that theoretical pitch would travel. You can see that Hill's curve has a sharp down and in break. Now, how does it compare to Lilly's curve?
Lilly's chart, also available on his player card, tells us that he threw his curve at an average of 71.57 mph, with a horizontal break of -3.03 inches, or 3.03 inches in to a right-handed batter, and a vertical break of -7.88 inches. Sorry, Ted Lilly Fan Club, but it looks like the answer is that Hill has the bigger curve. Note the cluster of pitches near the center of the chart - they most likely aren't curveballs, but some other pitch that Josh Kalk's system still struggles to classify properly. Either that, or Lilly throws more hangers. Either way, the result is a curveball with significantly less horizontal movement, and a bit less vertical movement.
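For the numerically inclined, here's a small Python sketch of that comparison, using the average break values quoted above. Combining the horizontal and vertical numbers into a single "total break" is my own summary measure, not a stat that Pitch f/x or the player cards report.

```python
import math

# Average curveball break in inches (horizontal, vertical), from the charts above.
curves = {
    "Hill": (-6.84, -8.34),
    "Lilly": (-3.03, -7.88),
}

def break_size(horizontal_in: float, vertical_in: float) -> float:
    """Straight-line size of the break, combining both directions."""
    return math.hypot(horizontal_in, vertical_in)

sizes = {name: break_size(h, v) for name, (h, v) in curves.items()}
bigger = max(sizes, key=sizes.get)

# How much more Hill's curve moves than Lilly's in each direction.
extra_horizontal = round(abs(curves["Hill"][0]) - abs(curves["Lilly"][0]), 2)
extra_vertical = round(abs(curves["Hill"][1]) - abs(curves["Lilly"][1]), 2)

for name, size in sizes.items():
    print(f"{name}: total break {size:.2f} in")
print(f"Bigger curve: {bigger}")
print(f"Hill's extra break: {extra_horizontal} in horizontal, {extra_vertical} in vertical")
```

Most of the difference between the two curves is side to side; the downward break is nearly the same.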
If you go to the player cards section, you get all of this and more, but let me take one more graph that you can either create on your own through Pitch f/x or see pre-made on the player cards. The first one is from Rich Hill, showing the relative movement of all his pitches:
And the same for Ted Lilly:
I particularly like how it shows that Lilly's and Hill's changeups and fastballs share similar vertical and horizontal movement.
Ok, I can't get enough, so one more for your viewing consideration: Is Carlos Zambrano tipping his pitches? You tell me....
I wonder how the hell he threw that one sinker that's off by itself; it looks like he must have shot-putted it outward from between his eyeballs. Most likely, it's just a reminder that there are still some serious bugs in the data. But there does seem to be a slight but noticeable difference between where he releases the sinker and where he releases the slider.
This piece started out as a "hey, you guys have GOT to check out the cool things I just found!" piece, and evolved from there. I'd like to hear what sort of studies you can dream up, if you have any requests or thoughts about directions I could take this in future articles. Things I'm missing? Burning issues to address? Fun comparisons to make? Particular players you'd like to see highlighted? How much confidence do you give their data when compared to your own real-world observations? Part of my interest in this stuff relates to my broader real-world academic interests: we have a new series of related technologies being invented, and do not yet quite know exactly how they will be put to use. What future do you see for this?