Baseball seasons as sparklines

Baseball's 2016 season is underway, so I decided to write up a little project I did a couple of months ago: baseball sparklines (source repo).

As noted in the repo's readme, this is a re-creation of a chart that appears in Tufte's The Visual Display of Quantitative Information. I was skimming that book as I was preparing to attend Tufte's Presenting Data and Information workshop (my notes). When I saw the chart in print I immediately wanted to see it for more than just the 2004 baseball season.

To build the charts, I scraped win-loss data for all baseball seasons back to 1919, AKA the end of the dead-ball era, AKA the beginning of the modern era. Go back 100+ years and, unless you're a baseball historian, the charts lose meaning because you probably won't recognize some of the teams in them. If you found yourself looking at a chart featuring the Boston Beaneaters or Louisville Colonels, would that mean much?
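For a rough idea of what happens to scraped win-loss data before it gets charted, here's a minimal sketch. It assumes each team's season arrives as an array of 'W' and 'L' strings and that each line plots a running count of games above or below .500. Both the input format and the helper name `runningDifferential` are illustrative assumptions, not taken from the repo.

```javascript
// Hypothetical helper (not from the repo): convert a season's results into
// a running count of games above (+) or below (-) .500.
// Assumes one 'W' or 'L' string per game; any other value (a tie, say)
// leaves the count unchanged.
function runningDifferential(results) {
  let diff = 0;
  return results.map(result => {
    if (result === 'W') diff += 1;
    else if (result === 'L') diff -= 1;
    return diff;
  });
}

// Made-up sample data, not a real team's season.
const sampleResults = ['W', 'W', 'L', 'W', 'L', 'L', 'L'];
const series = runningDifferential(sampleResults);
// series is [1, 2, 1, 2, 1, 0, -1]
```

An array like `series` maps directly onto a line: the game number is the x value and the differential is the y value.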

The seasons with dominant teams deserve a look. Check out the Mariners in 2001 and the legendary '27 Yankees. Big streaks are interesting too: the A's in 2002, made famous by Moneyball; the '35 Cubs, who had a huge win streak to end that season; and there's always Boston's collapse at the end of the 2011 season. Teams that had horrible seasons jump out in a depressing way. See the 2003 Tigers or the 1962 Mets.

On the tech side, unsurprisingly, it's D3. I'm using some ES6 via Babel. D3 comes from a CDN and everything I wrote gets bundled using webpack.

Most of the code is straightforward D3 charting. The most challenging piece was placing the text for team abbreviations so that everything was labeled correctly without the labels overlapping. The code I used for jiggling labels around isn't anything too robust but it gets the desired result.
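To give a flavor of the label problem, here's a minimal sketch of one common way to keep end-of-line labels from overlapping: sort them by their desired vertical position, then push each one down until it clears the label above it. This is not the code from the repo. The function name `spreadLabels`, the `{ team, y }` shape and the `minGap` parameter are all made up for illustration.

```javascript
// Hypothetical helper (not the repo's code): nudge label y positions apart
// so that no two labels sit closer together than minGap pixels.
function spreadLabels(labels, minGap) {
  // Copy the labels so the input is left alone, then sort top to bottom.
  const placed = labels
    .map(d => ({ team: d.team, y: d.y }))
    .sort((a, b) => a.y - b.y);

  // Sweep downward, pushing each label below the previous one if needed.
  for (let i = 1; i < placed.length; i++) {
    const minY = placed[i - 1].y + minGap;
    if (placed[i].y < minY) {
      placed[i].y = minY;
    }
  }
  return placed;
}

// Made-up positions: three labels bunched together and one far away.
const desired = [
  { team: 'NYY', y: 10 },
  { team: 'BOS', y: 12 },
  { team: 'TBR', y: 40 },
  { team: 'BAL', y: 13 },
];
const spaced = spreadLabels(desired, 10);
// spaced y values, top to bottom: 10, 20, 30, 40 (NYY, BOS, BAL, TBR)
```

A one-way sweep like this only ever pushes labels down, so a crowded chart can run them off the bottom edge. That's the sort of corner a more robust approach would need to handle.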

If you're viewing this on a big-ish screen (if you see two columns of charts instead of one), you can tap a chart to enlarge it. The .gif in the readme shows this as well.

Finally, if you're interested in how I gathered the data, take a look at my scraping scripts in a separate repo: baseball-scrape-win-loss-data. They will work as long as baseball-reference.com doesn't change its markup and table structures.