
Grammar of Graphics: A New Approach To Visualization

Karanbir Singh Gujral


IBM, India Software Labs

Why visualization?
The Visualization & Data Discovery market is the fastest-growing segment of Business Analytics. Our customers need a solution that provides:
High-definition (HD) visualizations
In-market flexibility: novel visualizations without a new release
Portable (across the full mobile landscape)
Scalable
Extensible
Interactive

Data Rich & Information Poor (DRIP)

Dealing with DRIP: Visualization


The human visual system has evolved over time to spot patterns, outliers and trends
Gain insight by visually assessing data first, then performing deeper analysis afterward
Visualization is not just about reporting and business graphics
Visualization is the face of analysis & knowledge
Visualization is a force multiplier, not a stand-alone technology
Analytics Visualization

Anscombe's Quartet

A great visualization is worth a million data points
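The point behind Anscombe's Quartet can be checked numerically. The sketch below uses the published quartet values (they are not on the slide itself) and hand-written `mean` and `pearson` helpers, which are illustrative names. It shows four datasets whose summary statistics are close to identical, even though their plots look nothing alike.

```python
from math import sqrt

# Anscombe's quartet (published values; not reproduced on the slide).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# One (mean x, mean y, correlation) triple per dataset.
SUMMARY = {
    name: (mean(xs), mean(ys), pearson(xs, ys))
    for name, (xs, ys) in QUARTET.items()
}

for name, (mx, my, r) in SUMMARY.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The numbers alone cannot separate the four datasets; only plotting them reveals the curve, the outlier and the single high-leverage point.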

Visualization by example
DATA: The basic functionality of any system for analyzing data is to filter, slice and dice to create the view of the data you want
TABLE: Presenting the data in the simplest form
CHART: Standard recommendation: to compare two categories of counts, use a clustered bar chart

Month   Y1998   Y2008
0       13880   17308
1       10484   20596
2        9847   16183
3        6952   10355
4        9393    6229
5       12870   10931
6        9330   10598
7       14726    9835
8       11893    9913
9        7815    3249
10       6419    4458
11       9900   17779

[Chart: clustered bar chart comparing Y1998 and Y2008 by month; vertical axis from 0 to 25000]

Visualization by example
Adapt the layout to the data: Months are cyclical; use a polar axis. This allows the user to spot seasonal effects more easily
Bars are not good for comparisons: Change to aligned points. This allows the years to be compared directly
Engage the user: Use a custom symbol appropriate for the domain
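As a rough illustration of the polar-axis idea, the sketch below places each month of the example table at an angle around a circle, with the count as the radius. The function name `to_polar_points` and the convention of starting at the top and moving clockwise are assumptions made for this example, not something the slides specify.

```python
import math

# Monthly counts from the example table (months 0 to 11).
Y1998 = [13880, 10484, 9847, 6952, 9393, 12870, 9330, 14726, 11893, 7815, 6419, 9900]
Y2008 = [17308, 20596, 16183, 10355, 6229, 10931, 10598, 9835, 9913, 3249, 4458, 17779]


def to_polar_points(values, periods=12):
    """Map each value to an (x, y) point: month sets the angle, value sets the radius.

    Assumed convention: month 0 points straight up and months advance clockwise.
    """
    points = []
    for month, value in enumerate(values):
        theta = 2 * math.pi * month / periods
        points.append((value * math.sin(theta), value * math.cos(theta)))
    return points


points_1998 = to_polar_points(Y1998)
points_2008 = to_polar_points(Y2008)
```

Because both years share the same angle for a given month, their points line up along the same spoke, which is what makes the year-to-year comparison direct.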

Grammar of Graphics
Grammar, not Types
Not a prescribed library of charts
A highly adaptive framework that allows each integrator to quickly create and customize their own library of interactive visualizations
The language is flexible enough to:
  describe our known chart types
  describe unknown chart types
Visualization Description

Common Visualization Framework

Platform native visualizations

Old Way: Charts are Types


Fixed set of supported charts
  If it isn't in the list, you can't have it
Expensive and slow to innovate
  Each new chart is a new development effort
Ad hoc features tightly coupled to type
  E.g. animation only implemented for Hans Rosling-style bubble charts, not for all charts
  Adding a new feature to 20 charts is a large effort
Kills creativity

New Way: Grammar of Graphics


A language-based specification of a chart

In terms of features, not types, e.g.


bar chart = basic 2D coordinates, categorical x numeric, displayed with intervals dropped from locations
line chart = basic 2D coordinates, any x numeric, displayed with lines connecting locations
histogram = basic 2D coordinates, numeric x statistic binned counts, displayed with

Orthogonal set of features describes all common charts, virtually all uncommon charts, and most cutting-edge research charts
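One way to picture "features, not types" is to treat a chart as a record of independent feature choices. The sketch below is only an illustration: the `ChartSpec` class and its field names are invented here and are not part of any toolkit named in this deck. It encodes the bar and line descriptions above and shows that the two differ in just two features.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical feature record; every field name is illustrative."""
    coordinates: str  # e.g. "2d"
    x: str            # kind of data on the x dimension
    y: str            # kind of data on the y dimension
    element: str      # how values are drawn


BAR = ChartSpec(coordinates="2d", x="categorical", y="numeric", element="interval")
LINE = ChartSpec(coordinates="2d", x="any", y="numeric", element="line")


def differing_features(a, b):
    """Return {feature: (value in a, value in b)} for features that differ."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


print(differing_features(BAR, LINE))

# Switching "chart type" is just changing features on an existing spec.
line_from_bar = replace(BAR, x="any", element="line")
```

Under this view there is no separate code path per chart type; a new chart is a new combination of values.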

Art of the possible

Visual Analytics

Switching chart types

Workload Analytics

Live Social Analytics


Extensible

Where is it available?
Books
Grammar of Graphics by Leland Wilkinson

Open Source
JavaScript libraries: Protovis and D3
ggplot2 in R: Statistical computing
Bokeh in Python

Commercial

IBM RAVE (Rapidly Adaptive Visualization Engine)


Tableau software

GoG: Composable Set of Chart Features


Element Type
Point, line, area, interval (bar), polygon, schema, text
Each element can be used with any data (numeric, category, time)
As many elements on a chart as you like

Guides
Simple Axis
Nested Axis
Facet Axis
Legend

Aesthetics
Map Data to Graphic Attributes. Works on all elements
Color (exterior, interior, works with gradients)
Size (width/height/both)
Symbol
Dashing, General Styles
Label, Tooltip, Meta

Coordinates
Any number of dimensions can be defined, with a chain of transformations
Clustering and stacking
Polar
Transpose
Map projections

Layouts
Graph Layouts (Network, Treelike)
Treemaps
Custom Layouts

Faceting
Chart-in-chart
Paneling
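To make the Aesthetics feature ("Map Data to Graphic Attributes", with color working with gradients) more concrete, here is a minimal sketch of a color aesthetic that maps a numeric field onto a two-color gradient by linear interpolation. The function names and the default gradient endpoints are assumptions chosen for illustration; a real toolkit would likely offer richer scales.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB triples for t in [0, 1]."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def color_aesthetic(values, start=(187, 187, 255), end=(0, 0, 128)):
    """Map each numeric value to a hex color along a start-to-end gradient.

    The smallest value gets `start`, the largest gets `end`.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return [
        "#%02x%02x%02x" % lerp_color(start, end, (v - lo) / span)
        for v in values
    ]


colors = color_aesthetic([0, 10])
```

The same function applies to any element that has a fill or stroke, which is the sense in which an aesthetic "works on all elements".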
This CFO Dashboard Visualization uses chart-in-chart faceting:

The outer chart uses a graph layout, with an integrator-designed schema element for the nodes and standard edge element links. The schema element has multiple parts and five different aesthetics set symbol type and color for each part. The inner chart uses an interval element with 2D coordinates and two axes: a standard bar chart.

The Grammar of a Bar Chart

"grammar": [
  {
    "coordinates": {
      "dimensions": [ {"axis": {}}, {"axis": {}} ],
      "transforms": [ {"type": "transpose"} ]
    },
    "elements": [
      {
        "type": "interval",
        "position": [
          {"field": {"$ref": "pop2010"}},
          {"field": {"$ref": "name"}}
        ],
        "color": [ {"field": {"$ref": "pop1960"}} ],
        "style": {"stroke": {"width": 0.25}}
      }
    ],
    "style": {"fill": "#bbf", "padding": 5}
  }
]
This is the complete grammar (VizJSON).

Coordinates: 2D chart flipped (transpose), so bars run horizontally
Guides: "axis" for both x and y dimensions
Element Types (go inside the data area): uses a single interval (e.g. bar)
Style: a thin border (e.g. 0.25 width)
Position (how we place elements in the coordinate system): shows state names by current population
Aesthetics (how to color it): color uses the population data for 1960
Layouts, Faceting: none
pop2010, pop1960 and name are references to parts of the data
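Because the grammar is plain JSON, ordinary tooling can inspect it. The sketch below parses the bar chart grammar and collects every `$ref` it contains, which is one way to see which data fields a specification depends on. Wrapping the fragment in outer braces and the helper name `referenced_fields` are assumptions made for this example.

```python
import json

# The bar chart grammar from the slide, wrapped in an outer object (an assumption).
SPEC_JSON = """
{
  "grammar": [
    {
      "coordinates": {
        "dimensions": [ {"axis": {}}, {"axis": {}} ],
        "transforms": [ {"type": "transpose"} ]
      },
      "elements": [
        {
          "type": "interval",
          "position": [
            {"field": {"$ref": "pop2010"}},
            {"field": {"$ref": "name"}}
          ],
          "color": [ {"field": {"$ref": "pop1960"}} ],
          "style": {"stroke": {"width": 0.25}}
        }
      ],
      "style": {"fill": "#bbf", "padding": 5}
    }
  ]
}
"""


def referenced_fields(node):
    """Walk the spec and return every "$ref" value in document order."""
    found = []
    if isinstance(node, dict):
        if "$ref" in node:
            found.append(node["$ref"])
        for value in node.values():
            found.extend(referenced_fields(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(referenced_fields(item))
    return found


spec = json.loads(SPEC_JSON)
refs = referenced_fields(spec)
print(refs)
```

The collected names match the three data references called out above: two for position and one for color.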

Simple Changes: Power of Composition


Add a position field to make it a range chart with start at 1960, end at 2010.

Before:
  {
    "type": "interval",
    "position": [
      {"field": {"$ref": "pop2010"}},
      {"field": {"$ref": "name"}}
    ],

After:
  {
    "type": "interval",
    "position": [
      {"field": {"$ref": "pop1960"}},
      {"field": {"$ref": "pop2010"}},
      {"field": {"$ref": "name"}}
    ],
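The before/after change is small enough to express as a single operation on the element. The following sketch treats the element as a Python dict and inserts a start field at the front of its position list; `to_range` is a hypothetical helper name, not an API of any toolkit mentioned here.

```python
import copy


def to_range(element, start_field):
    """Return a copy of `element` with `start_field` added as the first position.

    The original element is left untouched.
    """
    updated = copy.deepcopy(element)
    updated["position"].insert(0, {"field": {"$ref": start_field}})
    return updated


bar = {
    "type": "interval",
    "position": [
        {"field": {"$ref": "pop2010"}},
        {"field": {"$ref": "name"}},
    ],
}

range_bar = to_range(bar, "pop1960")
range_refs = [p["field"]["$ref"] for p in range_bar["position"]]
```

Nothing about the element type changes; the interval simply gains a start value, so it is drawn as a range instead of a bar from the baseline.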

Simple Changes: Power of Composition


Add a point element for 1980 populations:

  {
    "type": "interval",
    "position": [
      {"field": {"$ref": "pop1960"}},
      {"field": {"$ref": "pop2010"}},
      {"field": {"$ref": "name"}}
    ],
    "color": [ {"field": {"$ref": "pop1960"}} ],
    "style": {"stroke": {"width": 0.25}}
  },
  {
    "type": "point",
    "position": [
      {"field": {"$ref": "pop1980"}},
      {"field": {"$ref": "name"}}
    ],
    "style": {"fill": "navy", "size": 8}
  }

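Adding the 1980 point layer amounts to appending one more entry to the elements list. A minimal sketch, assuming the elements are held as a Python list of dicts and using a hypothetical helper named `add_element`:

```python
def add_element(elements, element_type, fields, style=None):
    """Return a new elements list with one extra element appended.

    `fields` are data field names used for the element's position.
    """
    new_element = {
        "type": element_type,
        "position": [{"field": {"$ref": name}} for name in fields],
    }
    if style:
        new_element["style"] = dict(style)
    return elements + [new_element]


elements = [
    {
        "type": "interval",
        "position": [
            {"field": {"$ref": "pop1960"}},
            {"field": {"$ref": "pop2010"}},
            {"field": {"$ref": "name"}},
        ],
        "color": [{"field": {"$ref": "pop1960"}}],
        "style": {"stroke": {"width": 0.25}},
    }
]

with_point = add_element(
    elements, "point", ["pop1980", "name"], {"fill": "navy", "size": 8}
)
```

The interval element is untouched, which reflects the compositional idea: layers are added side by side instead of being baked into a chart type.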
Maps are just another element


{
  "coordinates": {
    "dimensions": [ {}, {} ],
    "transforms": [
      { "type": "projection", "projectionParams": {"name": "mercator"} }
    ]
  },
  "elements": [
    {
      "type": "polygon",
      "label": [ {"content": [ {"$ref": "abbr"} ]} ],
      "color": [ {"field": {"$ref": "pop2010"}} ],
      "style": {"stroke": {"width": 0.25}}
    }
  ],
  "style": {"fill": "#bbf", "padding": 5}
}

Use a projection type of transform, specifically with a Mercator coordinate system.
Data set already has the geographic
Aesthetics, labels, style are all as usual.
Map layers correspond well to elements.

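For readers curious what a "mercator" projection transform does to coordinates, the sketch below applies the standard spherical Mercator formulas to a longitude/latitude pair. It is a generic illustration of the projection on a unit sphere, not the implementation used by any particular engine, and the function name `mercator` is chosen for this example.

```python
import math


def mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Mercator (x, y) on a unit sphere.

    x = lambda, y = ln(tan(pi/4 + phi/2)), with lambda and phi in radians.
    Latitudes at or near +/-90 degrees are not representable.
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    return lam, math.log(math.tan(math.pi / 4 + phi / 2))


x, y = mercator(0.0, 45.0)
print(x, y)
```

Seen this way, a map is an ordinary polygon element drawn in a coordinate system whose transform chain happens to end in a projection.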
Demo: Showcase each feature


Guides
Aesthetics

Element Type
Faceting

Layouts Coordinates

Anything is possible

Rangoli built with a GoG toolkit


- Nitin Chaturvedi

Thank You

Slide authors: Greg Adams (IBM), Graham Wills (IBM), Karan Gujral (IBM)
