Notes from Edward Tufte's "Presenting Data and Information"

Edward Tufte came through San Diego a few weeks ago (February 8, 2016) and I attended his one-day "Presenting Data and Information" course. Below is a summary of the interesting and important points.

Two recurring themes resonated with me. The first was the resolution of paper and the importance of digital screens approaching it. This goes beyond an aesthetic argument for high-DPI content and gets to something more useful: higher DPI means higher information density, which means communicating more effectively and more efficiently. The second was slides vs. canvas, also described as page turning vs. scrolling. Tufte didn't cite this, but I immediately thought of Maciej Cegłowski's talks on idlewords.com. The UI is all scroll: no nonsense with carousels or slide shows, and no pagination (unlike the majority of content around the web today).

There were tons of other gems throughout the day, usually in the context of analyzing some visualization. Here's a list of some stuff I thought was worth sharing:

  • started with Stephen Malinowski's Music Animation Machine
  • "information is the interface"
  • reason about content, not decoding design, maximize time with content
  • little data: use numbers and words, not graphics (no more pies, bars, etc.), weather.gov forecast as example of word-number thingies
  • information interpretation: scan, scroll, leave
  • spatial adjacency is possible with high resolution, digital resolution is at that of paper so data vis is possible without resorting to stacking (slides)
  • "reason about content"
  • "don't require decoding design"
  • "don't get it original, get it right"
  • NYT article on potential doctor shortages in Riverside, CA
    • turns complicated knowledge into approachable story
    • byline: personal responsibility, reputation, integrity
    • quote from expert with credential: get story out of your voice into voices of others/experts
    • reasoning on a flat space: lets viewer control navigation, lets viewer be active, slides make people wait, fewer decks, more documents
    • cite source for every paragraph, chart, table to build trust, create transparency
  • ability to evaluate evidence and conclusions applies to all fields/studies
  • 2nd NYT article on price of echocardiograms
    • chart with ~500 bills showing price billed to Medicare and what Medicare paid
    • "just about every data graphic should have annotation"
    • annotations are a gesture of explanation
  • tables: never alphabetize, be substantive, order by performance
  • dashboards: 1950s design, show 4-8 numbers and do it in a poor way, see thread on dashboards on edwardtufte.com
  • no PowerPoint because it is audience-hostile, content-hostile but speaker-friendly
  • think about how you talk to a doctor, that's how you should talk during a presentation
  • time is in short supply, don't waste it
  • preparation required is directly proportional to importance of a meeting
  • "think complex, speak simple"
  • rules for inference haven't changed because of Big Data
  • scientific/technical presentations litmus test, write an abstract that covers:
    1. problem
    2. who cares (relevance)
    3. what to do (solution)
  • if you can't do the three points above, re-assess
  • linking lines need verbs, less focus on nouns
  • understand how things work by understanding causality
  • heart of graphics: make smart comparisons, show causality
  • do not pre-specify a data display technique, do whatever it takes, approach with an open mind
  • do not de-quantify
  • Google Maps as example of most successful graphic ever, measure success by number of consumers that use it
  • all linking lines should be annotated, they are the verbs
  • bad maps: too much bright color
  • good: lots of light colors e.g. Swiss topo
  • best example of linking lines in the 20th century: Tim Berners-Lee @ CERN in 1989, the web, a hierarchy of nouns, moving to network of verbs
  • good idea vs. great idea: great ideas get implemented
  • hierarchy of nouns: wrong, mentioned by Brooks (The Mythical Man-Month), same idea as experiences are more valuable than things, example is xkcd on university websites
  • novel idea: network of nouns is not what people want, use network of verbs
  • 2-D sentences: powerful way to convey a narrative, example was Sheldon Silver taking kickbacks
  • statistical graphics can be anywhere letters or numbers can be, graphics have resolution of typography
  • Tukey: "better to be approximately right than precisely wrong"
  • typeface for tables: Gill Sans or Trebuchet
  • best visualizations are in the journal Nature
  • winning soccer goal stop-motion, better than slo-mo
  • use variety of graphics for same data: stop action, slo-mo, real-time
  • presenting: content and credibility, for the latter, reputation but also use sources, demonstrate mastery of detail, minimize jargon
  • consuming presentations: presenter's credibility, coherent argument? past performance?
  • demonstrate mastery of detail:
    • first-hand account
    • are verbs being used
    • credibility hazard: cherry-picking, is presenter building a case with selective evidence?
  • credibility:
    • detecting cherry-picking: no sources? access to raw data? evasive or arrogant?
    • as a presenter, provide links to data, establishes credibility and keeps you honest
    • humans are natural pattern matchers, overzealously so, results in pressure to come up with a result, which can harm credibility
  • thinking about the audience
    • know your content
    • endless generic respect for audience (no audience research required), this works if you use ordinary language, avoid jargon, quote experts
    • lack of respect leaks through, cannot be effective if you condescend, patronize, pander
    • always assume and think the best of your colleagues
    • sign of deep intelligence: entertain ideas contrary to your own
  • great skill: look at a wealth of information and pull out the diamonds, breaks down to two skills
    • scanning
    • drilling in selectively
  • "walk around and see directly what you are seeking to understand" (same as user observation and testing in UX world)
  • "people and institutions cannot keep their own score accurately": LIBOR rates, seemingly all quarterly estimates, terrorist attacks prevented, Twitter followers -> metrics get gamed
  • Feynman, Galileo and Euclid were all given the highest praise
  • small multiples should be your default choice, especially when unsure how to visualize something, "probably 1/3 of all data vis works as small multiples"
  • make comparisons in the span of the eye
  • different modes of vis are not competitors, they are colleagues
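
To make two of the points above concrete (order tables by performance rather than alphabetically, and graphics can sit anywhere letters or numbers can), here's a minimal Python sketch. It's my own illustration, not something shown in the course: the names and numbers are made up, and `sparkline` and `performance_table` are hypothetical helpers I named for this example.

```python
# Word-sized graphics built from Unicode block characters, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a typography-sized graphic."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has nothing to scale against; draw it at the baseline.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value into an index between 0 and 7, then pick its bar.
    return "".join(BARS[int((v - lo) * top / span)] for v in values)


def performance_table(rows):
    """Return table lines ordered by latest value, best first (never alphabetically)."""
    ordered = sorted(rows, key=lambda row: row[1][-1], reverse=True)
    width = max(len(name) for name, _ in ordered)
    return [
        f"{name:<{width}}  {sparkline(values)}  {values[-1]:>5.1f}"
        for name, values in ordered
    ]


if __name__ == "__main__":
    # Hypothetical data, invented purely for illustration.
    data = [
        ("Alpha", [3.0, 4.0, 2.0, 5.0]),
        ("Beta", [1.0, 6.0, 7.0, 9.0]),
        ("Gamma", [8.0, 6.0, 5.0, 4.0]),
    ]
    for line in performance_table(data):
        print(line)
```

Each row keeps the number and the graphic side by side, so the comparison happens within the span of the eye. Each sparkline is scaled to its own row's range, so it shows the shape of a series, not its magnitude relative to other rows.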

The course was an opportunity to focus, observe and think. This is more important than ever because, in the name of productivity, we all reach for our smart devices when there's any lull or break in what we're doing.

Tufte's day-long seminar is not going to transform beginners into info vis geniuses. It's a thorough, fun and well-delivered walkthrough of basic principles and best practices, conveyed through discussion and dissection of world-class data visualizations.

If you get a chance, attend the course. You'll walk away with fresh copies of Tufte's four books, but more importantly, you get to learn from a pioneer and legend of data vis first-hand. And you'll probably learn a thing or two about doing presentations too.