
GeoSense

An open publishing platform for visualization,
social sharing, and analysis of geospatial data.
Anthony DeVincenzi


B.F.A. Visual Communication, Seattle Art Institute 2007

Submitted to the Program in Media Arts and Sciences,
School of Architecture and Planning,
in partial fulfillment of the requirements for the degree of Master of Science
in Media Arts and Sciences at the
Massachusetts Institute of Technology
June 2012
© 2012 Massachusetts Institute of Technology. All rights reserved.

Author
Anthony DeVincenzi
Program in Media Arts and Sciences
May 11, 2012

Certified by
Dr. Hiroshi Ishii
Jerome B. Wiesner Professor of Media Arts and Sciences
Associate Director, MIT Media Lab
Program in Media Arts and Sciences

Accepted by
Dr. Mitchel Resnick
Chairperson, Departmental Committee on Graduate Students
Program in Media Arts and Sciences

Thesis Supervisor
Dr. Hiroshi Ishii
Jerome B. Wiesner Professor of Media Arts and Sciences
Associate Director, MIT Media Lab

Program in Media Arts and Sciences

Thesis Reader
Cesar A. Hidalgo
Assistant Professor, MIT Media Lab

Thesis Reader
Joi Ito
Director, MIT Media Lab

Acknowledgments
THANK YOU,
Hiroshi, my advisor, for allowing me to diverge greatly from our group's primary area of research to investigate an area I believe to be strikingly meaningful; for no holds barred in critique, and providing endless insight.
The Tangible Media Group, my second family, who adopted me as a designer
and allowed me to play pretend engineer.
Samuel Luescher, for co-authoring GeoSense alongside me.
My thesis readers Joichi Ito and Cesar Hidalgo for providing feedback, inspiration, and guidance over the course of this work.
The people of Safecast, who support an idea larger than what any one man
could accomplish. You are truly inspiring.
Dávid Lakatos and Matthew Blackshaw, for our many adventurous projects
to date, and for those to come in the near future.
Mom and Dad, for allowing me to explore my passions despite how inapplicable they may have seemed at times.
My family, and Jessica for loving me. I learn from your patience.
My friends in Seattle, and around the world.

TABLE OF CONTENTS

Introduction
Related Work
    Contemporaries
Safecast
    A call for help
    Keeping quarters
Application Design
    Balancing simplicity and complexity
    Data mobility
    Summary of system
    Second order observation
    Data features
    Development timeline
Design Theory
    Geovisualization
    Aesthetics
    Spatial-temporal narratives
Process
    Concept
    Safecast worldmap (V1)
    Generalizing the platform (V2)
    User interface for data management
    GeoSense (V3)
    Spatial comments and chat
    Continued: Beyond the screen
Technical Design
    Server structure
    Amazon EC2
    Ubuntu
    Satellite & satellite API
    Architecture
    GeoSense Database
    Data import
    Aggregation and reduction through MapReduce
    Spatial indexing and grid queries
    Teamdata Database
    Application Structure
    Views
    Models
    Collections
    External Libraries
Challenges
    Data purity
    Performance
    Scale
    Custom instances
Use Cases
    Safecast
    Sourcemap
    The Lace Race
Results
Future Work
    Tile servers
    Expanded visualization types
    Models & mechanistic explanations
    Boolean conditions and spatially bound alerts
Conclusion
References
Appendix
    Tablet AR installation

Introduction
Throughout this document we refer to two projects: GeoSense, a visualization
platform, and Safecast [1], a sensing and data collection organization. Their
differences will be described at length as well as their commonality and
shared resources.

Geovisualization is a common form of information visualization, or scientific
data visualization, that, when combined with visual pattern recognition, allows
for increased human understanding in an effort to enhance the decision-making
process around a given view of data [2].
Geospatial data has become abundant, and so have the many sensors
that we use to collect it. With over 1.2 billion web and GPS enabled devices in
our pockets [3], the amount of geotagged metadata ranging from tweets to
photos has skyrocketed to enormous proportions. As more data becomes
coupled with geospatial coordinates the intrinsic relationship between the
meaning of the data and the place-in-space from where it came can be visualized, observed, and analyzed to inform decision making processes. However,
this poses a problem as growing amounts of data can become more and more
difficult to parse and understand.


Today, the tools available for geospatial mapping remain highly specialized with significant technical overhead often outweighing the capabilities of the user. We use maps to codify the physical existence of immaterial
media and without accessible tools for visualization, the meaning of data is
lost in the columns and rows of spreadsheets. Further, the inability to quickly
and simply create and share geovisualizations in a lightweight manner has
slowed the evolution of sharing and collaboration in GIS [4].
How could a community, a university, or an entire industry benefit
from having the complexity of geospatial data visualization reduced to that of
email, or a single tweet? To be more specific, what if we could seamlessly
share and engage with social features such as comments and live interaction
around geospatial data? We believe that empowering users with the tools
necessary to construct visual and social narratives around contextual data will
enhance their collective ability to respond to current events while simultaneously planning for the future.
To achieve this we must first build a platform that can interpret the
many disparate forms of data and enable them to co-exist in a single unified
visualization. Without this tool, our data and voices are left in singular silos, never able to engage and interact with the voices of many. The visualization
may take a number of forms, two- or three-dimensional, varied in aesthetics
per the author's discretion yet constrained within a sandbox so as to guide the
user - in short, not too much control, but not too little. A simple interface for
sharing and socializing the new geovisualization invites multiuser collaboration, where each user may contribute and discuss the current datasets, supporting the claim that the shared knowledge of many users is often more
valuable than that of one [5]. Finally, data pertaining to user interaction involving comments, tweets, and physical location may be aggregated to create
a second order dataset which in turn may be incorporated into the visualization for communal behavior analysis.
GeoSense aims to provide such a tool, where the user can perform
tasks of both the visual artist and data analyst all while contributing to the
shared cognition and collective intelligence of a broader community. GeoSense is an easy-to-use web based platform for the organization and upload of
multiple datasets, a framework platform for 2D and 3D visualization, as well
as a suite of social and analysis tools. GeoSense explores generating visual
correlation models based on data layering and the aggregate of community
analysis in lieu of unified theories, or known mechanistic explanations.
After the 311 disasters in Japan involving the Tōhoku earthquake,
tsunami, and Fukushima Daiichi reactor meltdown, the community was left
with little information around the outcome of the crisis. The public struggled
to obtain answers to even the most basic questions: "Is it safe for me to stay in
my home?" and "Is my food safe to eat?" Thousands rose to aid, and amongst
the responders was Safecast, an independently organized crowd-sourced
mapping network. Despite the great amount of information and data that was
collected, there was no clear path towards displaying, juxtaposing, and discussing the multivariate sources of critical information. GeoSense was
founded to support the efforts of Safecast and the many communities of Japan.


Related Work
We have, for hundreds of years, refined our use of visual language in the art of
data visualization. As early as the 18th century, men such as Joseph Priestley,
an English theologian and academic, had begun exploring the graphical representation of statistical methods through what is believed to be one of the earliest implementations of a timeline, designed to illustrate the contemporaneity of ancient philosophers and statesmen [44]. Around the same time, William
Playfair, a contemporary of Priestley, debuted what are believed to be the first
known instances of bar and pie charts in his two books The Commercial and
Political Atlas and Statistical Breviary, respectively. These early explorations
laid a foundation upon which nearly three hundred years of related work has
been conducted.
In more contemporary times, an innumerable amount of work has
been done in the field of data visualization, much of which stems from the
foundational work of Edward Tufte and his many visual definitions described
in "Visual Explanations" [6]. Tufte's seminal work in visual explanation and
analysis has provided the foundation for an even wider field of informational
graphic design: a notable trend covering a massive spectrum of content ranging from visualizations for geospatial data [7] to social and emotional observations through data analysis [8].

One of the first time series graphs: William Playfair's trade-balance time-series chart, published in Political Atlas, 1786

In the area of visualization for geospatial applications much work
has been done by the GIS community to provide tools which allow for the
exploration and visualization of location based datasets. Of these, Google
Earth [9] and NASA World Wind [10] have been widely adopted as platforms
for plotting sets of data ranging from tracking glacier footprints [11] to the
displacement and distribution of refugees located in remote areas of the
world [12]. This wide application space is evidence towards the versatility of
utilizing a three dimensional globe to display both context and meaning of
data.
Among tools built specifically for spatial data analysis, both ArcGIS
[13] and ESRI MapIt [14] claim to provide "easy online discovery, access, visualization, and dissemination of geospatial information." Both services offer an extensive suite of data visualization and analysis
tools, though neither provides a suitable framework for control over dataset
comparison beyond basic layering, and both are constrained to two-dimensional viewports. Similarly, community crisis mapping tools such as Pachube
[15] and Ushahidi [16] allow users to take much of the foundational mapping
work done by the aforementioned sources, and add specific additions related
to disaster relief.
In the space of tools and research for geospatial data comparison,
analysis and theoretical model generation, significant work has been done by
Floraine Grabler et al. with Automatic Generation of Tourist Maps, where the
salience of map elements is determined by using bottom-up vision-based
image analysis and top-down web-based information extraction methods [17].
The technique of selective visualization with respect to geography and locational data is an important accomplishment towards identifying how to present visual data based on the user specified variables of interest within the
data. Further, work by Jeffrey Heer and Michael Bostock of Stanford University has explored how to leverage crowd sourcing to generate a visual analysis
of raw data in "Crowdsourcing Graphical Perception: Using Mechanical Turk
to Assess Visualization Design" [18].

Contemporaries
Web-based authoring tools for generating geovisualizations have become
more prominent in recent years, offering an assortment of services towards
helping online visitors create custom visualizations. Of them, the following
are most related to GeoSense:

GEOCOMMONS
Most notable is GeoCommons, a public community of GeoIQ users who are
building an open repository of data and maps [19]. GeoCommons has a number of similarities to GeoSense, namely in that users are given an interface to
assist in the upload and treatment of geospatial data, as well as a shared data
repository amongst users. While many of GeoCommons' features are thoroughly implemented, including an impressive level of control over data layering through boolean operations, there remains little to no social infrastructure beyond the ability to share on Facebook or Twitter.

HARVARD UNIVERSITY'S WORLDMAP


Similar to GeoCommons, yet slightly smaller in scale, is Harvard University's
WorldMap project [20], which invites its users to "build [their] own mapping
portal and publish it to the world or to just a few collaborators." WorldMap
offers a complex and configurable user experience, giving users the ability
to handle multiple sets of layered data atop an assortment of base map tiles. As
with GeoCommons, Harvard's WorldMap has no true infrastructure for
communal dialog and analysis.

MAPBOX
MapBox [21] is a simplified toolkit for publishing static geovisualizations.
Its clean aesthetic and well-designed native authoring platform, TileMill [22], stand out as best in class regarding user interface and experience.
The MapBox tools are less suited to community map building and better
suited to creating attractive visualizations as an embed or standalone
site.

MANY EYES
Finally, the democratization and socialization of data visualization has been
explored by Fernanda Viegas et al., in "Your Place or Mine? Visualization as a
Community Component" [23] where a number of studies were conducted in
order to "enable the use of visualization technology by lay users and to facilitate communication around the visualizations via tools for annotation, sharing and discussion." Many Eyes does not focus on geovisualization and instead
explores community dialog around common data graphs such as bar and pie
charts.


Safecast
GeoSense serves as the visualization engine for
Safecast.org: a non-profit collective of hackers
and humanitarians who are actively crowdsourcing
radiation mapping from the 3-11-11
Daiichi reactor meltdown in the Fukushima prefecture, Japan.

A call for help


On Friday, the 11th of March 2011, Japan suffered a national catastrophe
now known as the 311 Earthquake. At a staggering magnitude of 9.0 (Mw) [24],
the off-coast earthquake was the most powerful ever to affect Japan and
amongst the most powerful ever recorded [25]. As a result of the undersea
epicenter, a series of tsunamis was triggered, generating waves which were
seen to reach as high as 130 feet. Amongst the tragic and catastrophic loss of
life (~15,000), injury (~26,000), and property destruction (~129,000 buildings)
[26], the damage caused by the tsunamis put into motion a chain of events
which would lead to the eventual equipment failures, nuclear meltdown, and
subsequent radioactive material leakage from the Fukushima I Nuclear Power
Plant (referred to as Daiichi). Rated as a level 7 catastrophe on the International Nuclear Event Scale (INES) [27], the Fukushima I meltdown was the
largest nuclear incident since the 1986 Chernobyl disaster [28].
Estimated economic losses skyrocketed into the tens of billions [29].
While no factor could outweigh the tragic loss of life, a full recovery and ensured healthy future for the country and its inhabitants quickly became Japan's main focus. It was during this time, seemingly moments after the beginning of this tragedy, that Safecast was formed.
Safecast is a global organization working to empower people with
data, primarily through building sensor networks that enable both contribution and free use of the data collected. After the 311 earthquake and resulting
nuclear situation at Fukushima Daiichi it became clear that people needed
more data than what was available. Since the post 311 formation of Safecast,
the team has grown to a dedicated core team and over 150 supporting volunteers. It has recently received grants from the John S. and James L. Knight
Foundation and has, to date, deployed over 150 handmade radiation sensors
with a measurement aggregation of over 2 million individual readings [30].
Safecast is almost certainly the single largest source of radiation data
in Japan, if not the world, all of which is open and available under a CC0 dedication [31]. GeoSense, as a project and platform, was born out of the necessity
for Safecast to make visible its growing collection of data, and quickly evolved
into a larger study which aims to redefine the relationship between community driven datasets and the democratization of geovisualization and analysis.

Keeping quarters
On March 22nd, 2012, we held a meeting at the Tokyo Hacker Space to discuss the current state and future needs of GeoSense as it pertained to Safecast. The following day, a demonstration of the data and its visualization was
given at the Roppongi Hills art night, part of the Mori Art Museum, in Roppongi, Japan. During this event, numerous members of the audience shouted
out, uncharacteristically for Japanese culture, and declared their need for unfettered access to this critical data.

"They tell lies," one woman exclaimed from the audience, "they don't
want us to know what's really happening and you're the only ones who know the
truth!" We can only assume "they" refers to the local government or TEPCO,
the power company responsible for the Fukushima reactors [32]. Regardless
of political or conspiracy beliefs, one year past the 311 incident the cry for
help was clear as ever. During the event we presented a recap of the previous
12 months, announced that at least 2 million data points had been collected,
demonstrated the GeoSense visualization platform, and presented a musical
synthesizer which generated interpretive music related to the ambient radiation around it.
The following day a press conference was held at The Fab Cafe in
Shibuya, Tokyo. Members of the press were invited to attend and learn about
the achievements of Safecast to date. We again announced the 2 million data
points collected, the GeoSense platform, as well as an exciting new Safecast
Geiger Counter which was built entirely by Safecast team members. The
press, many of whom represented major Japanese outlets like NHK and TBS,
had inquiries around the mapping platform:
Questions such as "What do the colors mean? Is red dangerous? Is green
safe? How can I tell who collected the data? What about data that is incorrect or malicious?" were the most common. The answer, of course, was
that much like our data our visualization engine would be as agnostic as possible - meaning that all variables from data type to data display would be fully
customizable.
Our answer in short -

"We are not presenting conclusions,
only an observational platform from
which you may draw your own."


Application Design
Balancing simplicity and complexity
The most fundamental design principle behind GeoSense is to procure simplicity and legibility where complexity and confusion exist. In order to produce a usable platform with the greatest amount of user coverage and rich
feature depth, it has been carefully designed to promote ease-of-use from the
API to the UI. However, this does not discredit the need for a tool which provides even the most seasoned data analysts with new and actionable insights.
To address this, GeoSense scales gracefully depending upon its user's specific
needs; a simple geovisualization can quickly grow into a deeply insightful tool
for analysis through a series of complex spatial-temporal queries across any number of datasets.
We believe that there exists value in large data analysis in place of
known data models, as was philosophically described by Nobel Prize winner
Philip Anderson in "More Is Different" [33], and further explored by the
entirety of the contemporary big data movement. Rather than incorporate
complex, computationally expensive algorithms to understand, interpolate, or
predict model behaviors, GeoSense instead invites the community as a source
of analysis, utilizing human intuition and natural pattern recognition to detect
occurring phenomena. This is not to say there isn't inherent value in known
models; it is, however, a different approach which lends itself to a level of accessibility and friendliness which may in turn better serve a large community.
Finally, GeoSense makes use of multiple open technologies, all of
which contribute greatly to the usability of the platform. Only 5 years ago the
requirements to offer a service at this scale would have come with astronomical cost, requiring dedicated physical hardware servers, a team of engineers,
and client side computing power that just did not exist. Open source software
efforts and blossoming internet communities cannot be thanked enough.

Data mobility
All data brought into or authored within GeoSense is stored, managed, and
appropriated by the GeoSense Satellite RESTful API. The GeoSense application invites users to explore different dimensions and parameters of their datasets, both spatial and temporal, providing a suite of tools which acquire
their parameters via the API. In fact, any map or source of data may be used
outside the GeoSense application ecosystem. For example, should a user wish
to develop their own front end application or integrate dataset(s) into another
service, the satellite API provides sufficient scaffolding and endpoints to do
so.

Summary of system
GeoSense is an open platform for the comparative and cooperative visualization of geospatial data. It is fundamentally different from similar platforms
that aim to provide complex mapping GIS tools and as a result are often
weighed down by a cumbersome feature set.
GeoSense aims to provide the highest level of simplicity by
carefully considering the average ability and limited prior knowledge of users
with regard to GIS. In order to build such a system, special considerations have been made in developing the UX. A vast majority of first-world
internet users are equipped with geospatially aware devices and platforms such as Google Maps and Bing, which have bolstered awareness of cartographic interaction. GeoSense therefore comes at a time when the user has already
acquired familiarity with mapping concepts and is in prime condition to begin authoring.
The system manifests as a web application available publicly at
http://geo.media.mit.edu where any user can, within seconds, acquire a boilerplate visualization template to which they can upload or link geospatial
datasets.
We believe that geospatial data is best understood collaboratively, as
was explored by Viegas et al. in 2007 with Many Eyes [5]. To promote social
behavior, a single user's map is easy to share, as each map has its own
unique URL. Maps can be shared through integrated social outlets
such as Twitter, Facebook, or more traditionally through email or text link. To
promote multi-user collaboration, all maps are generated with a public and
private short URL (public view and administer respectively) which can be
used to access the visualization platform. A map accessed through a specific
URL allows for user annotation and commenting, both on specific data points
and general location coordinates.
Users are also made aware of other current collaborators and their
general whereabouts in the context of the map. To elaborate, the entire map
is a chat room and message board in which invited users may co-author and
analyze data. These features are explained in greater detail throughout this
document.
GeoSense provides an insanely simple platform for visualizing multiple disparate sources of geospatial data. In parallel, it also provides a suite of
tools for collaboration and data insight which have, to date, not existed in well
executed form. GeoSense is built specifically to serve users whose main skills
are not computer science or design, but who have curiosity around geospatial
analysis and appreciate beautiful presentation design.


Second order observation


By exposing user behavior in the context of the geography from which users
originate, alongside their areas of interaction, a second order observation can be
described. Specifically, for geospatial data and geovisualization the place in
space where the viewer or author exists may have special relevance to the
data they are investigating - both at the individual and community level. To
explore this concept, each instance of GeoSense keeps track of where its users
originate from, where (and if) they leave geospatial comments, as well as how
they interact within the integrated chat room.

Data features
Data representation is highly variable within GeoSense. It is left up to the
map's author to select the visual style, though GeoSense maintains predefined data point aggregates for large or extremely dense datasets. Data may
be explored interactively by clicking on either a cluster of aggregated data or
an individual datum. Meta information associated with the specific data is
then revealed in geospatial context, assisting the user in better understanding
the information with which they are interacting. We discuss in great depth
the visual and computational considerations of visualizing data features in the
Design Theory chapter.


Development timeline
GeoSense was developed over a six-month period, all of which was spent in
close collaboration with Safecast. To serve both the active Safecast community and prepare GeoSense for growth into additional communities, milestones vary from summit meetings in Japan to periods of presentation at the
MIT Media Lab. This timeline reflects GeoSense's development, as well as
its future plans and iterations:

[Timeline figure, Oct 2011 to May. Milestones shown: Conception; V1 Safecast Worldmap; Research & Meetings; V2 Development; Tokyo visit; V3 GeoSense Deployment]

Design Theory
"The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to
represent the rich visual world of experience and
measurement on mere flatland?"
Edward Tufte [34]

Geovisualization
Producing effective visual representation of multi-layered information atop a
map or any cartographic medium poses a torrent of potential complications.
For every condition that produces a desirable result one hundred new complications may reveal themselves generating information-less patterns as a
byproduct of their presence. As explained first by Josef Albers and underscored later by Tufte, the conundrum is that 1 + 1 may often equal 3 [35],
where the byproduct of the initial variables produces an additional, distinct
condition - adding to the visual complexity.


As described by Albers, the combination of one or more shapes may produce a third shape (shown in
red) as their byproduct

To address this, we employ a number of techniques, both aesthetic and computational, that address the needs of user generated geovisualization. The key
features we consider are:
1. Mindful representation of multivariate information layers drawn across
both two and three dimensional planes.
2. Dynamic data densities where the application state (or UI) informs the
visual output.
GeoSense is faced with a number of challenges when representing geographic
data within the user interface. Aside from standard complexities that arise
from visualizing large data, such as information density, other conditions
must be considered when we investigate the user's interaction with the data.
It is blunt and inefficient to show all data, as visual comprehension begins to
suffer as the amount, or more specifically the density, of visualized data increases. Overabundant or incomprehensible arrangements stem from failures
in design rather than from the information itself, regardless of magnitude.
To address this complication, we employ a well-known tactic of fitting a grid of bounding boxes against the map, to which data is aggregated in
relation to the user's visible viewport. The grid is dynamically generated and
sized. Many geospatial visualizations have addressed this, either for visual or
computational simplicity, by averaging the number of occurrences into a known
cultural boundary. For instance, population density is often visualized as a
choropleth map [figure below] where a polygon shape defines the state
boundaries and all data within the given bounds is displayed as a single hue
across the entire shape. This technique often misleads the viewer, as the data
within the bounded areas is not nearly as uniform as the visualization suggests.

Left: A computationally generated interpolation of radiation levels. Right: A choropleth map showing
population density by prefecture near Tokyo, Japan. Neither image produced from GeoSense

A similarly misleading tactic is to attempt averaging information over a
given space. Computational interpolations [figure above], while often making
the visualization seem denser and perhaps more visually compelling, do little more than generate an unqualified visual representation and, in the case
of Safecast's radiation dataset, produce extremely misleading conceptions
regarding the data's meaning. Interpolations are effective when attempting to
predict or model the behavior or future state of a dataset, especially in the
case of trajectory over time and space.
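
The viewport-relative grid aggregation described in this section can be sketched in a few lines. The following is a minimal illustration only, not the GeoSense implementation: the names `Viewport` and `aggregate_to_grid`, the fixed `cells_per_side` parameter, and the choice to report a count and a mean per cell are all assumptions made for the example. It also ignores viewports that cross the antimeridian.

```python
from collections import defaultdict
from typing import NamedTuple


class Viewport(NamedTuple):
    """Visible map bounds in degrees (hypothetical helper type)."""
    west: float
    south: float
    east: float
    north: float


def aggregate_to_grid(points, viewport, cells_per_side=4):
    """Bin (lon, lat, value) points into a grid fitted to the viewport.

    Returns {(col, row): (count, mean_value)} for non-empty cells only.
    """
    cell_w = (viewport.east - viewport.west) / cells_per_side
    cell_h = (viewport.north - viewport.south) / cells_per_side
    sums = defaultdict(lambda: [0, 0.0])  # cell -> [count, running total]
    for lon, lat, value in points:
        # Points outside the visible viewport are not aggregated.
        inside = (viewport.west <= lon <= viewport.east
                  and viewport.south <= lat <= viewport.north)
        if not inside:
            continue
        # Clamp so points exactly on the east/north edge land in the last cell.
        col = min(int((lon - viewport.west) / cell_w), cells_per_side - 1)
        row = min(int((lat - viewport.south) / cell_h), cells_per_side - 1)
        sums[(col, row)][0] += 1
        sums[(col, row)][1] += value
    return {cell: (n, total / n) for cell, (n, total) in sums.items()}
```

Because the grid is derived from the viewport rather than from fixed boundaries, calling the function again after a pan or zoom yields cells sized to the new view, which is the property the text relies on.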


Aesthetics
The shape, color, and size of visualized objects are carefully considered, as the
shape of an object is optically tethered to the geography on which it rests.
For example, a single data point may represent one particular point in space,
but to show it as a single pixel on a map is sometimes misleading. Instead,
showing the data point as a 10x10 pixel box suggests that the data point
corresponds with an area of space on the map rather than a single point on
the map. Likewise, the visual change of data must coincide with adjustments
in the map zoom level; if a datum does not change its size parameter as zoom
is adjusted, the user will perceive the shape size to have no geographic binding in relation to the geo coordinates of the map. This is well illustrated
in modern mapping tools such as Google Maps or OpenStreetMap, where
the map tiles change resolution with respect to the user's perceived distance
from earth (zoom).
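
One way to bind a datum's on-screen size to the zoom level is to fix its footprint in ground meters and convert to pixels at render time. The sketch below assumes a Web Mercator tile scheme with 256-pixel tiles, as commonly used by web map tile services; the function names and the `min_px` floor are hypothetical, and this is not presented as the method GeoSense uses.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius used by Web Mercator
TILE_SIZE_PX = 256          # assumed tile size


def meters_per_pixel(lat_deg, zoom):
    """Ground resolution of a Web Mercator map at a latitude and zoom level."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    return circumference * math.cos(math.radians(lat_deg)) / (TILE_SIZE_PX * 2 ** zoom)


def marker_size_px(ground_meters, lat_deg, zoom, min_px=1.0):
    """Pixel width of a marker meant to cover `ground_meters` on the ground.

    The size doubles with each zoom step, so the marker stays tied to the
    same patch of geography; `min_px` keeps it visible when zoomed far out.
    """
    return max(min_px, ground_meters / meters_per_pixel(lat_deg, zoom))
```

For instance, a marker given a 1,000 meter footprint near Tokyo doubles in pixel width between zoom 11 and zoom 12, matching the change in tile resolution.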
Additionally, the color of data plays a critical role in
both the visual legibility of each point of data as well as the intent expressed
by the visualization. For example, the question continually arises whether or
not certain types of data, radiation in our case, should be colored or have a
fixed color scale. The most common example is a linear hue shift from green
to red. In western culture green is widely accepted as safe, versus red,
which is understood as being dangerous. Ironically, in Japan the color red
represents heroism, love, and is a positive visual indicator for the Tokyo
stock exchange. Further, how does one normalize scale to color where the
range value is either user generated or chosen arbitrarily? Nonlinear value
distributions add further complexity to representing data using a hue
shift and often need to be represented on a logarithmic scale.
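
To make the normalization question concrete, the following sketch maps a value onto a hue shift using a logarithmic scale. It is an illustration under stated assumptions rather than GeoSense's color logic: the function names are hypothetical, the default hues (120 degrees for green, 0 for red) simply reproduce the common green-to-red example discussed above, and both range bounds must be positive for the logarithm to be defined.

```python
import colorsys
import math


def log_normalize(value, vmin, vmax):
    """Map a value in [vmin, vmax] to [0, 1] on a logarithmic scale."""
    if vmin <= 0 or vmax <= vmin:
        raise ValueError("require 0 < vmin < vmax for a log scale")
    clamped = min(max(value, vmin), vmax)  # out-of-range values saturate
    return math.log(clamped / vmin) / math.log(vmax / vmin)


def hue_shift_color(value, vmin, vmax, start_hue=120.0, end_hue=0.0):
    """Return an (r, g, b) tuple in 0..255 along a hue shift.

    start_hue and end_hue are in degrees and are left to the caller,
    so no fixed meaning is attached to any particular color.
    """
    t = log_normalize(value, vmin, vmax)
    hue = start_hue + t * (end_hue - start_hue)
    r, g, b = colorsys.hsv_to_rgb(hue / 360.0, 1.0, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)
```

With a range of 0.01 to 100, a value of 1 sits at the midpoint of the scale, whereas a linear mapping would place it within one percent of the bottom; this is why skewed distributions tend to be unreadable under a linear hue shift.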
It was decided early on that the potential harm in suggestive coloring, especially within critical datasets like radiation, outweighs any aesthetic
benefit. To address these concerns GeoSense gives the user complete control
36

A view of bold, brightly colored shapes atop a dark tonal map. Blue dots represent earthquakes sized by magnitude. Red dots represent nuclear reactors sized by power.

over data representation; the choice of whether data is represented as a single pixel, relatively sized circle, or bounding box, as well as single or hue-shifted color, is completely customizable.
By default, the application promotes bold color set against dark, tonal map tiles, which best suits the type of data uploaded. To do this, we borrow a page from Swiss cartographer Eduard Imhof's first rule of color composition:

Pure, bright or very strong colors have loud, unbearable effects when they stand unrelieved over large areas adjacent to each other, but extraordinary effects can be achieved when they are used sparingly on or between dull background tones.
"Noise is not music ... only on a quiet background can a colorful theme be constructed." [36]


GeoSense addresses the multivariate nature of geospatial visualization by combining the proper amount of end user control with system constraints, in turn addressing the technical, artistic, and cultural complications that arise. The figure below describes the three primary methods of data representation and their literal to representational qualities:

Different rendering techniques used by GeoSense (pixel, box, circle), from left (most literal) to right (most representative), with their corresponding visualizations below.


Spatial-temporal narratives
In addition to the two and three dimensional canvases on which GeoSense displays information, a fourth dimension for time has been implemented through a time series graphing system. Data sources containing temporal attributes may be explored alongside their geophysical attributes in a shared context. In order to expose the value of a dataset's time quality, each datum is sequenced in time order and spaced at uniform time intervals.
Coupling the spatial dimensions of the map viewport with the temporal sequence of the series graph deepens an onlooker's understanding of the dataset. By reducing the complexity of the data into two understandable and intrinsically related parameters, time and space, an equally interactive and elegant view in all four dimensions is made tangible.
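
Sequencing data at uniform time intervals can be sketched as a simple bucketing step. The function name bucket_counts is hypothetical, and the choice to emit empty intervals (so that the series stays evenly spaced) is an assumption about how such a graph would be fed.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_counts(timestamps, start, interval):
    """Count events per uniform interval from start, keeping empty intervals."""
    # timedelta // timedelta yields the integer index of the interval.
    counts = Counter((t - start) // interval for t in timestamps if t >= start)
    if not counts:
        return []
    return [(start + i * interval, counts[i]) for i in range(max(counts) + 1)]
```

For example, three events spread over three days with a one-day interval yield three rows, the middle one holding a count of zero.
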

Top: Earthquakes shown with both geospatial and temporal analysis. Bottom: Arrangements of series graph display types: bar chart, scatterplot, area, sparkline.


Because data properties such as color and shape are selected at the data management level, parameters are synchronized across all visualization mediums:
a series of red dots for earthquakes on the maps will display as a red time series line on the graph. Additionally, temporal data may be explored through a
number of time based graph techniques that currently include scatterplot,
line, area fill, and bar chart.

The space (xy) plane represents the user's current viewport. It is defined by a constraining latitude/longitude and zoom level. The time (z) plane displays selections of a data set based on occurrences within a given time constraint. In this case, we show a selection between t1 and t2.

Users may also find interest in further exploring subsets of data through the time graph and can easily do so by interacting with a number of UI features allowing for time-range adjustment and on-graph annotation. The above figure describes the spatial-temporal relationship between the user's view of the time series graph and the visible geospatial viewport. As a user interacts with the time constraint controls, in this case t1 and t2, the data shown on both the map and the graph is filtered against the new parameters.
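
The combined constraint of viewport and time range can be sketched as a single selection step. The Point record and the select function are hypothetical names, and the sketch ignores viewports that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Point:
    lon: float
    lat: float
    time: datetime
    value: float


def select(points, west, south, east, north, t1, t2):
    """Keep points inside the viewport box and inside the range [t1, t2]."""
    return [
        p for p in points
        if west <= p.lon <= east
        and south <= p.lat <= north
        and t1 <= p.time <= t2
    ]
```

Because the map and the graph would both draw from the same selection, moving t1 or t2 changes the two views together.
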


Process
GeoSense began with a simple concept: making the simplest, most friction-free experience for mapping geospatial data, with special attention towards social collaboration and data analysis. Moreover, this tool should allow for the effortless creation of data and model mashups that expose insight into the meaning of the data. The initial goal was largely unconstrained in its definition and by design was allowed to grow and evolve as certain points of development were reached.
At the time of conceiving the idea, a number of related projects had recently been completed by members of the core team. For example, at least three large scale geospatial projects had taken form, all of which required us to build custom geospatial visualizations. These projects, Peddl [37], Place Pulse [38], and Sourcemap [39], provide deep insights into the complexities of design and implementation for custom made geospatial visualization where the datasets were both community driven and dynamically updating.


Left: A view from the Peddl marketplace. Middle: A view from a Place Pulse visualization. Right: A view from Sourcemap.com.

To begin, GeoSense was prototyped as a wireframe concept to assist with identifying the UI/UX foundation from which to build the service. These early prototypes explored different arrangements of user interfaces that, if implemented, would serve as the app's foundation. Early wireframes borrowed a common design pattern found in applications such as Google Maps, where the leftmost column of the screen is delegated to content related to the right column, which takes up nearly two-thirds of the total real estate with a geovisualization.

Concept
The wireframe prototypes proposed three key features for the GeoSense platform: 1) a GUI with the map as the locus of interaction, 2) a simple management interface for adding and removing data, and 3) layers of interactivity atop the map object that expose features to the users. Some of these features were defined as the ability to comment on geospatial coordinates as well as to build 'if-this-then-that' [40] style queries around the active datasets.


An early illustration demonstrating the split, two column real estate, the ability to add data, as well as a three-dimensional global view.


Demonstration of an early "if this, then that" geo-bound condition. This feature was later removed for the release version of GeoSense and is further discussed in the Featured Work section.

As is common in prototype design, a significant amount of time was spent on iterating the UI and UX in the form of a visual storyboard, with any development or system engineering postponed until the first "functional prototype".

Safecast worldmap (V1)


Upon completion of the GeoSense wireframe prototype, a production version
implementing the Safecast dataset underwent development. Understanding
that the application was going to be deployed periodically to a large user base,
the development of experimental features was put into a sandbox, forked
from the original repository, so that two instances of GeoSense could simultaneously exist: one for public viewing at http://blog.safecast.org/worldmap and
another which would eventually become GeoSense.
With the Safecast worldmap, referred to as version one, only the most fundamental features were developed, while a small amount of visual design and aesthetic polish was applied over the entire application. Initial features included the ability to show or hide the Safecast mobile dataset as well as a choropleth map of Japanese population averages per prefecture. Core features such as geospatial search, basic map controls, multiple map themes, and social sharing were also implemented.
A view of the Safecast Worldmap showing data aggregation across the island of Japan.

During this stage, the data being shown was populated from ten different Google Fusion Tables, each of which held an aggregated granularity of data dependent on zoom level. The tables were mapped to the user's zoom level within the application such that as the user clicked "zoom in" or "zoom out" the tables would be queried to render the respective height (100m, 1000m, and so on). Each table contained specific KML data which defined a four-point geographic box. The benefit of rendering tiles from a dedicated map server became immediately obvious, as the amount of client-side computation involved in displaying 10,000+ data points in a single view outmatched the capabilities of a JavaScript based approach. Further explanation and justification for and against the use of tile servers is given in the Technical Design section.

A closer view of the Fukushima area, showing the 20km evacuation radius and a finer data resolution.

Zooming in on the Fukushima prefecture revealed the 20km evacuation zone, as well as a higher granularity of data points. Clicking on any individual data point, or cluster, would reveal information such as CPM (counts per minute) and µSv/h (microsieverts per hour) pertaining to that specific set of data.
Version one of the Safecast Worldmap was live between February 15th, 2012, and May 11th, 2012, during which it received more than 10,000 unique visitors. Significant feedback was received from both the Safecast community and the public. The general sentiment was that the Worldmap was impressively simple, easy to comprehend, and a step in the right direction.


Generalizing the platform (V2)


The second iteration of the platform, referred to as version two, began with a complete rewrite of the application structure, as will be outlined in the Technical Design chapter. Version one had been built as a standalone application, more akin to an advanced prototype functional enough to garner interest and insight, but without the fundamental framework required for additional features. With a number of new members joining the development, version two quickly took on a much more structured framework with a specific focus on speed of development.
V2 takes a step back from Safecast, and a step towards generality.
Rather than build features specifically pertaining to the radiation dataset, efforts were spent building a platform that would expand the simplistic power
of the Safecast Worldmap to any and all users who had their own types of
geospatial data.

The current user interface for reviewing recently added data. Users are given the ability to select which columns represent the necessary attributes of location and intensity.


USER INTERFACE FOR DATA MANAGEMENT


Many of the design and engineering cycles during version two were put into creating a seamless experience for users to add their own data. This required a workflow that would assist users in preparing their data such that it could be understood and interpreted by our system.
To this end, an "add data" wizard was developed, in which users are instructed to attach a data file either by uploading it from their file system or by URL link. Once the data has been received by GeoSense, it is parsed and displayed back to the user as a table of columns and rows. To identify the data headers, the user is instructed to drag and drop labels onto the columns. Properly imported datasets are represented in the system on the left column, where they are shown inline with additional data sources. This visual group, or "accordion" component, allows users to easily manage, edit, or remove the current data sources.

The data management panel. Showing from left to right: initial view, data library browser, and expanded controls for added data sources.


An important advancement during V2 was the introduction of data models that exist separately from the visual representation. All data added into the system is pushed into a remote database where it is stored and made accessible through a public API. While this ultimately means that all data in GeoSense could be re-visualized elsewhere by a third party, it also means that the application can easily iterate through different types or methods of visualization. V2 began exploring this by introducing a modal switch which toggles between "Flat Map" and "3D Globe". Clicking the toggle changes the display type and automatically rebinds the active data models as appropriate for each visualization.

GeoSense (V3)
The third version of the product brings the first actual instance of a "GeoSense" in its entirety. Wrapped with an additional layer of instruction and
messaging, GeoSense becomes less an experimental application and more a
widely available consumer service website.

The landing page for geo.media.mit.edu, inviting users to create a map by entering a name and clicking the prominent create button.


Left: Coastal Japan showing earthquakes, nuclear reactors, and coastal flooding. Right: A view of Asia showing earthquakes alongside a time series graph. Bottom: A view of the WebGL 3D globe.

V3 features a homepage instructing users how to create their own mapping sandbox. The homepage also features a number of community insight tools such as "recently created maps" and "recently added datasets". Currently, all data stored on GeoSense is made publicly available. Maps created on GeoSense are likewise publicly viewable, though only users with a special admin URL are able to manipulate or add associated data.


SPATIAL COMMENTS AND CHAT


The release version of GeoSense also incorporates a number of critical features which add to the value of community input and collaboration around specific maps. A simple UI feature for leaving geotagged comments or commenting directly on a data point is provided. This familiar interface, akin to leaving a comment on YouTube or Facebook, invites users to leave annotations in direct spatial context. Similarly, a set of "on/off" toggles allows users to see the physical location of users currently viewing their map as well as the geo locations of all past contributors and editors.
Two users converse about the safety levels of the Safecast radiation dataset in reference to their residences. Comment bubbles on the map create links to and from the chat window, which users may use to specify a geo coordinate.

During this phase, GeoSense underwent a complete API overhaul, from basic restructuring of naming conventions to complete refactoring of routes. The API was generalized and cleaned to improve workflow for the development team as well as to prepare for community wide usage. Methods for data uploading, parsing, and aggregation were greatly enhanced during version three, which will be more fully detailed in the Technical Design chapter.
GeoSense version three, undergoing active development at the time
of writing this, will serve as the platform from which the project will continue
to evolve and also mirrors the state of recent releases, posted at GitHub
(http://github.com/tonydevincenzi/geo).

CONTINUED: BEYOND THE SCREEN


An obvious benefit of developing for web accessibility is the vast number of devices that can access the full range of the application. To test extensibility, we developed an iPad application that, with a simple wrapper around WebKit, allows for full functionality on an iPad tablet device. To complement the form factor and push the boundaries on how to present the project in situ at the MIT Media Lab, the GeoSense team developed a suite of technologies to transform the entire platform into an experimental augmented reality installation.
Featuring a full sized physical globe, users are given tablet devices as
instruments to explore data on and around the tangible earth. Moving the
globe rotates the data accordingly, as does moving the tablet device around
the space. This exciting exploration creates questions about how to best represent virtual geospatial data tethered to a physical object, and what other
user interaction scenarios may emerge in the future.


Technical Design

The GeoSense technical implementation is best described by outlining the underlying frameworks for the server, app, and web service respectively. A large number of framework services have been employed, iterated on, removed, and revised during the development of GeoSense. The current technology stack is by no means the most practical or scalable implementation, but is perhaps most fitting as it is built entirely atop open source platforms whose ethos aligns with the goals of GeoSense and Safecast.

Server structure
AMAZON EC2
The GeoSense web service, code named "Satellite", is hosted on an Amazon
EC2 instance server. Amazon EC2 was chosen for its ability to scale to meet
increased load demands as the service grows in size. It is also heavily adopted
and well documented by the contemporary web development community.

UBUNTU
The server runs an instance of Ubuntu Linux, a Unix-like operating system with a thriving community of developers who have documented the many ways of "rolling a server" to your own specifications, much like Amazon EC2. Satellite can run on any Unix-like operating system and is managed and deployed entirely through terminal configuration.

Satellite & satellite API


ARCHITECTURE
NODE
The Satellite web server is a Node.js based application. Node.js is a JavaScript runtime for writing scalable internet applications, most commonly web servers [41]. Node uses event driven, asynchronous I/O for improved scalability and reduced infrastructure overhead. Unlike the majority of JavaScript based programs, it is executed 'server side', the benefit of which is a close coupling of language and method between server-side and client-side rendering. In the case of GeoSense, this was an obvious benefit, as a number of the application's features mix server-side and client-side rendering techniques.


Node comes coupled with the Node Package Manager (NPM), a standalone manager for installing a community curated collection of "modules" that extend the basic functionality of Node. GeoSense uses the following major NPM packages:

EXPRESS / CONNECT
A fast, small server-side JavaScript web development framework with
features including routing, session support, cookie handling, and logging.

MONGOOSE
An object modeling tool designed to work in an asynchronous environment,
making integration with MongoDB extremely pleasant and straightforward.

NOWJS
An implementation of web sockets (via socket.io) and node-proxy libraries for
real-time communication for live updates between users.

GEOSENSE DATABASE
MONGODB, GIS
For data storage and management, MongoDB [42] (from humongous) is used as the central data repository. Mongo is a NoSQL document database, meaning that it stores data as JSON-like documents with dynamic schemas rather than as rows in fixed tables. Table-free database architectures are known to be faster for certain types of applications.
Mongo includes a number of crucial geospatial features, referred to here as "MongoGIS", that are optimized for geospatial data operations. These operations are central to data storage and retrieval within GeoSense. For example, Mongo makes it easy to index and quickly return results for complex queries such as "average the 500,000 points closest to my location where value is never higher than 5".
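
The example query can be made concrete with a plain Python analog. This is not how MongoDB evaluates the query (a geospatial index avoids scanning every point); it only spells out the logic of "filter by value, take the k nearest, average". The function names and the (lon, lat, value) tuple layout are assumptions.

```python
import heapq
import math


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def average_nearest(points, lon, lat, k, max_value):
    """Average the values of the k points nearest (lon, lat) with value <= max_value."""
    eligible = [p for p in points if p[2] <= max_value]
    nearest = heapq.nsmallest(
        k, eligible, key=lambda p: haversine_km(lon, lat, p[0], p[1]))
    if not nearest:
        return None
    return sum(p[2] for p in nearest) / len(nearest)
```

A linear scan like this is what becomes impractical at 500,000 points per request, which is the motivation for delegating the work to an indexed database.
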


Data import
Importing data is handled by the server once the client has specified and uploaded a suitable datatype. GeoSense currently supports XML, JSON, and
CSV datatypes.
Once a file has been posted to the server, it is put through a process which cleans and standardizes the import. Each line of the data source is read in linear order, and each column or property is transformed into a field within our associative MongoDB collection. Original conversions of the uploaded data are kept as a collection prefixed with o_ in the active database. As the document is being parsed, transformed fields are asynchronously written into a master collection that houses all uploaded points within GeoSense. Their unique _id is retained and used to associate each individual field with its parent collection. Attributes unique to the dataset, such as title, default color, created by, and modified date, are stored in an associative collection where the _id attribute is used as a linkage identifier.
Data import and parsing happen asynchronously once the user uploads their first dataset. The estimated time remaining on the data conversion is indicated to the user in the GUI. Once the data is properly converted and stored, it is drawn into the user's current viewport.
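
The cleaning and standardizing step might look like the following for a CSV upload. This is a sketch, not the Satellite import code: the document fields dataset_id, loc, and val are hypothetical, and the o_ collection and asynchronous writes described above are left out.

```python
import csv
import io


def parse_points(csv_text, lat_col, lon_col, value_col, dataset_id):
    """Turn CSV rows into point documents, skipping rows that fail validation."""
    points, skipped = [], 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
            val = float(row[value_col])
        except (KeyError, TypeError, ValueError):
            skipped += 1  # missing column or non-numeric cell
            continue
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            skipped += 1  # coordinates outside the valid range
            continue
        # Longitude first, matching the (longitude, latitude) order used
        # for MongoDB coordinates later in this chapter.
        points.append({"dataset_id": dataset_id, "loc": [lon, lat], "val": val})
    return points, skipped
```

Returning the number of skipped rows gives the interface something to report back to the user when an upload is only partially usable.
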

Aggregation and reduction through MapReduce
For datasets exceeding a certain number of fields (arbitrarily ~1,000) an aggregation process is executed to greatly increase the performance of the data
for both the client and server. To accomplish this, we create sub collections of
the dataset, each containing a reduced aggregate as a function of zoom level.
We currently support reductions for 15 discrete zoom levels as well as temporal reductions that host only the time series for each dataset reduced into
days, weeks, months, and years in accordance with the zoom level aggregate.


To accomplish this, we employ a technique referred to as "MapReduce". Traditionally, MapReduce is a framework for distributing the processing of huge datasets across a large number of nodes. In the case of GeoSense
and the GIS libraries for MongoDB, it is a tool for batch processing data and
aggregation operations.
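
The map and reduce steps of such an aggregation can be sketched in plain Python. GeoSense runs this inside MongoDB; the version below only illustrates the idea, and the grid sizing (cells of 360 / 2^zoom degrees) is an assumption rather than the scheme used by the project.

```python
import math
from collections import defaultdict


def cell_key(lon, lat, zoom):
    """Map step: snap a coordinate to a grid cell that halves per zoom level."""
    cell_deg = 360.0 / (2 ** zoom)
    return (math.floor((lon + 180.0) / cell_deg),
            math.floor((lat + 90.0) / cell_deg))


def aggregate(points, zoom):
    """Reduce step: collapse (lon, lat, value) tuples into per-cell statistics."""
    cells = defaultdict(lambda: {"count": 0, "sum": 0.0,
                                 "min": math.inf, "max": -math.inf})
    for lon, lat, val in points:
        c = cells[cell_key(lon, lat, zoom)]
        c["count"] += 1
        c["sum"] += val
        c["min"] = min(c["min"], val)
        c["max"] = max(c["max"], val)
    return {
        key: {"count": c["count"], "avg": c["sum"] / c["count"],
              "min": c["min"], "max": c["max"]}
        for key, c in cells.items()
    }
```

Running aggregate once per zoom level and storing each result separately would correspond to the sub collections described above.
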

Spatial indexing and grid queries


As described in the previous Design Theory chapter, all data stored and displayed within GeoSense is subject to a mesh grid. This grid, mesh, or lattice serves two functions: first, reducing the amount of visual complexity for the user, and second, standardizing and reducing the amount of computational processing for the client and server. For example, at a global zoom level, showing all 180,000 earthquakes over a magnitude of 4.4 since 1973 would be both visually and computationally inefficient. Instead, occurrences are organized into micro clusters, fitted to the known geospatial grid, and displayed dynamically with regard to zoom level and the bounding extremities of the user's viewport.
This approach produces an optimized number of queries against a geospatial index. To create and manage these queries, the GeoSense application constructs the viewport grid in accordance with the aggregate collections generated as explained in the previous section, Aggregation and Reduction through MapReduce.
The following structuring logic was developed with and paraphrased from Walter Mendez (MIT EE/CS 2015), who contributed to the GeoSense project during the spring of 2012:

On constructing the mesh grid

The grid is managed by a set of ordered pairs, which are not created at random. They follow a geometric pattern that is based entirely on the physical dimensions of the zoom level and the parameters of the viewport grid being generated.
The origin of this coordinate system, $(x_0, y_0)$, is placed at the lower left hand corner of the bounding area and, as a result, a change in the horizontal direction and the vertical direction, $x$ and $y$ respectively, can be defined as the following:

$$\Delta x = \frac{\mathit{length}_{zoom}}{\mathit{length}_{grid}}, \qquad \Delta y = \frac{\mathit{width}_{zoom}}{\mathit{width}_{grid}}$$
It hence follows that, given the zoom level's bounding corners, the lower left being $(x_0, y_0)$ and the upper right being $(x_f, y_f)$, any point in the grid can be reached by the following general formula:

$$\left(x_0 + m\,\frac{\mathit{length}_{zoom}}{\mathit{length}_{grid}},\; y_0 + n\,\frac{\mathit{width}_{zoom}}{\mathit{width}_{grid}}\right)$$
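where $m$ is in the range $\{0, \ldots, \mathit{length}_{grid}\}$ and $n$ is in the range $\{0, \ldots, \mathit{width}_{grid}\}$. The geometric constraint on the bounds of the grid is then defined. When $m$ and $n$ are equal to their respective maxima:

$$\left(x_0 + \mathit{length}_{grid}\,\frac{\mathit{length}_{zoom}}{\mathit{length}_{grid}},\; y_0 + \mathit{width}_{grid}\,\frac{\mathit{width}_{zoom}}{\mathit{width}_{grid}}\right) = \left(x_0 + \mathit{length}_{zoom},\; y_0 + \mathit{width}_{zoom}\right) = (x_f, y_f)$$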

Given MongoDB's geospatial indexing specifications, the database indexes the data using spatial coordinates (longitude, latitude). To create the boundaries of a grid, we specify a box by passing in a lower left hand corner and an upper right hand corner. Thus, for any given $m$ and $n$ in our grid, a bounded box would have as lower left and upper right corner respectively:

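$$\left(x_0 + m\,\frac{\mathit{length}_{zoom}}{\mathit{length}_{grid}},\; y_0 + n\,\frac{\mathit{width}_{zoom}}{\mathit{width}_{grid}}\right), \qquad \left(x_0 + (m+1)\,\frac{\mathit{length}_{zoom}}{\mathit{length}_{grid}},\; y_0 + (n+1)\,\frac{\mathit{width}_{zoom}}{\mathit{width}_{grid}}\right)$$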

This makes geometric sense. In order to get to the upper right hand corner of a box given the lower left hand corner, we need only add $\Delta x$ and $\Delta y$, that is, a single box side length and width, in each direction.
Finally, each cell within the grid contains an array storing all the
data points retrieved from the server, the number of points in said array, the
minimum, the maximum, the average, and the center point of the respective
container.
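
The grid construction above can be sketched in a few lines. The function names are hypothetical, as is the field name loc. The query document uses MongoDB's current $geoWithin operator with a $box shape, which may differ from the operator GeoSense used at the time of writing.

```python
def cell_box(x0, y0, length_zoom, width_zoom, length_grid, width_grid, m, n):
    """Lower left and upper right corners of grid cell (m, n)."""
    dx = length_zoom / length_grid  # delta x: one cell side in longitude
    dy = width_zoom / width_grid    # delta y: one cell side in latitude
    return ((x0 + m * dx, y0 + n * dy),
            (x0 + (m + 1) * dx, y0 + (n + 1) * dy))


def box_query(lower_left, upper_right, field="loc"):
    """MongoDB filter document selecting points inside the box."""
    return {field: {"$geoWithin": {"$box": [list(lower_left), list(upper_right)]}}}


def summarize_cell(lower_left, upper_right, values):
    """Per-cell record: the points, their count, min, max, average, and center."""
    count = len(values)
    return {
        "points": list(values),
        "count": count,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
        "avg": sum(values) / count if values else None,
        "center": ((lower_left[0] + upper_right[0]) / 2,
                   (lower_left[1] + upper_right[1]) / 2),
    }
```

For the last cell of a grid, the upper right corner returned by cell_box coincides with the upper right corner of the viewport, matching the bound derived above.
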

TEAMDATA DATABASE
POSTGIS
Data specific to Safecast is stored in a separate database, which operates outside the server bounds of GeoSense. Safecast's dataset, which is referred to as teamdata, is stored within a PostGIS (the spatial extension for PostgreSQL) database and is subject to a different upload and management process than data added directly through GeoSense. Though the Safecast dataset is community driven, it is handled and monitored by a number of Safecast volunteers due to the critical nature of the data.

APPLICATION STRUCTURE
The map platform, which is the publicly visible portion of GeoSense, is built fully in HTML5 and JavaScript. The application is organized in an MVC (Model, View, Controller) framework using Backbone.js [43], which provides logical structuring of the application into a manageable development flow. The application is organized into the following structure:

VIEWS
The visual build is constructed through a simple templating engine that serves views based on the application state. These views vary from '2D map view' to '3D map view' and 'About GeoSense' view. Each view is an individual module that contains a linked HTML and CSS file for format and styling.


MODELS
Models are used to define the parameters around how individual pieces of
data are handled within the GeoSense application. For example, the most
common model is 'point', which refers to a singular point of data containing a
latitude and longitude coordinate. Each point may differ from the last, both in
lat/lon and in additional values (intensity, date added, etc).

COLLECTIONS
Collections are bundles of models that exist together under the umbrella of
parent properties. For example, a million points (taken from the point model)
may make up the collection 'air pollution' that then has its own properties
independent from the individual models themselves. Collections, as containers of models, are bound to views within the application.

EXTERNAL LIBRARIES
A number of widely adopted external libraries are used as part of the GeoSense application. Listed below are their titles and basic operation:

TWITTER BOOTSTRAP
Twitter's Bootstrap framework is used underneath the application to provide easy access to commonly used design patterns such as headers, footers, button types, forms, modal windows, and more. Bootstrap is a welcome addition to the technology stack as it removes a vast amount of time-consuming work by replicating the expected behaviors of a web app. It is, in general, a fantastic boilerplate for starting a new application. However, precautions have to be taken to ensure that the ubiquitous "look and feel" of Bootstrap does not overtake the application. To that end, nearly all the default styles provided are restyled or adjusted.


JQUERY / JQUERY UI
jQuery, a JavaScript library for accessing and manipulating the DOM (Document Object Model) of the application, is fundamental to many JavaScript based applications. jQuery UI is an extension of jQuery that provides certain features such as "drag and drop", which may only be necessary in certain applications.

THREE.JS
Three.js is a JavaScript library that wraps a basic render model around the OpenGL based WebGL. Three.js simplifies access to WebGL and is instrumental in GeoSense's ability to display data in the third dimension.

OPENLAYERS
OpenLayers is an open source library for displaying and manipulating map
data. It is built entirely in Javascript, and provides an API for constructing
interactive map applications. GeoSense uses OpenLayers as the rendering
engine for two-dimensional maps and has heavily extended the canvas rendering class to support features unique to GeoSense.

This list covers the most fundamental libraries but is not exhaustive. For more information regarding the current state of the GeoSense library arrangement, visit the project on GitHub (http://www.github.com/tonydevincenzi/geo).


Challenges
Data purity
Because GeoSense does not offer itself as a source of data but rather a source
for data observation, there are certain precautions towards allowing the
community to generate and share data sources. For example, erroneous data
may be inserted into the system by any user and then replicated by future users. Rather than try and detect bad data, or even offer tools to report such incidents, GeoSense takes the position that it offers nothing but the platform
and that all data within the platform is community generated.
In the case of Safecast, the data is stored in the teamdata database,
which is part of the Safecast repository. GeoSense has integrated bespoke
hooks for the teamdata dataset, but only in a manner that is available at
safecast.org. Therefore, for all intents and purposes, the data available at
http://geo.media.mit.edu is community generated and not explicitly endorsed
by the platform. This is made clear in the GeoSense terms and conditions,
which are available online.
Data comes in many shapes and sizes. An ongoing challenge is continuing the development of upload compatibility from within the add data wizard. To date, GeoSense requires that the user specify at least three crucial columns for every uploaded dataset: latitude, longitude, and intensity. Ideally, a lightweight algorithm could handle the majority of the guesswork involved in specifying these columns, as the names held within header rows of geospatial data are often similar (e.g., lat or latitude).
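
One possible shape for such a lightweight algorithm is a lookup of normalized header names against a small alias table. The alias lists below are illustrative assumptions, not a vocabulary taken from GeoSense, and guess_columns is a hypothetical name.

```python
import re

# Assumed aliases for each required column; a real list would be grown
# from the headers seen in actual uploads.
ALIASES = {
    "latitude": {"lat", "latitude", "y"},
    "longitude": {"lon", "lng", "long", "longitude", "x"},
    "intensity": {"intensity", "value", "val", "magnitude", "cpm"},
}


def guess_columns(headers):
    """Return {role: header} for the first header matching each role."""
    guesses = {}
    for header in headers:
        key = re.sub(r"[^a-z]", "", header.lower())  # "Lat." -> "lat"
        for role, names in ALIASES.items():
            if role not in guesses and key in names:
                guesses[role] = header
    return guesses
```

Any role missing from the result would still fall back to the manual drag and drop labeling in the add data wizard.
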
Finally, certain considerations are taken when choosing how to handle a maximum file size for user upload. For instance, it is computationally expensive to upload and parse through a file the size of the Safecast dataset, which at time of writing is over 3 million entries housed in a 50 MB CSV file. GeoSense currently limits the file size upload to 20 MB, which can still easily cover one to two million entries in a well managed document. Increasing this capacity would require significant server enhancements and storage capacity, coming at significant cost.

Performance
When attempting to process and visualize large amounts of data, performance issues are one of the first hurdles to overcome. Rendering millions of live data points requires a dynamic relationship between the rendering engine (front end) and data server (back end). In its current build, Satellite, the GeoSense web service, aggregates and returns data from the back end based on the specifications requested by the front end. Because the data within the GeoSense application is handled separately from the visualizations, it is easy to adjust the requests based on the current application state. This is most evident in the scenario of rendering to the flat map, where we begin to experience extreme performance loss when more than ~20,000 individual objects are being rendered.
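
Adjusting requests to the application state could, for instance, mean choosing the finest aggregate that stays under the render budget. The sketch below assumes that a higher level number means finer detail and that the object count per level is already known for the current viewport; pick_level is a hypothetical name.

```python
def pick_level(counts_by_level, budget=20000):
    """Finest aggregation level whose object count fits the render budget.

    counts_by_level maps level -> number of objects in the current viewport.
    Falls back to the coarsest level when nothing fits the budget.
    """
    within = [level for level, n in counts_by_level.items() if n <= budget]
    return max(within) if within else min(counts_by_level)
```

The default budget of 20,000 mirrors the threshold observed for the flat map and would be tuned per rendering path.
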
Conversely, it is much easier to render large amounts of data through the WebGL pipeline, which is utilized by the 3D globe display type. Because WebGL has access to the video card's GPU, the majority of display logic can be pushed off the CPU, which is the general bottleneck for JavaScript-heavy applications.
Future versions of GeoSense may implement a custom tile server,
similar to how Google Fusion Maps are rendered, which in turn would alleviate the constraints of rendering data points into the map tiles. Tile servers
are, at this time, complex and expensive to manage. New services such as
MapBox have begun to innovate with products like TileMill, though the infancy of the software comes with too many limitations for it to be used by
GeoSense.

Scale
As GeoSense begins to grow in users and scope, scale becomes a pressing
issue. In its current state, scale is handled by basic load balancing and an elastic instance through Amazon EC2 [44]. GeoSense has been carefully designed
to accommodate substantial growth, though the costs of operation would grow in
parallel. Future funding will be required to keep the service running if extreme growth is experienced.

Custom instances
As GeoSense continues to grow, the community may want to create their own
instance of the platform on a different server. Because it is open source, the
entirety of the project can be downloaded and installed via the public GitHub
repository. This creates complexity when trying to develop GeoSense for both
Safecast and community usage. Because of this, there may be ongoing
branches of the GeoSense project that are specific to a certain instance of the
project, Safecast in this example, and that differ in certain features from the
instance hosted at http://geo.media.mit.edu. This fragmentation can cause
complications when developing new features, as it requires that all custom or
branched features are forward compatible with changes to the master repository. To avoid further complication, GeoSense will only "officially support"
development of the master repository and specific derivatives that are generated by the core team.

Use Cases
GeoSense has been evaluated against a number of different usage scenarios
whose interests and datasets differ greatly. In order to prove the versatility of
the system, it was crucial to select example maps and users whose feedback
would differ based on their individual needs. Our tool's true power is demonstrated through how we observe the community using it to tell stories; the
narratives developed within GeoSense exceeded our original intent and expectations. The following case studies were conducted with the GeoSense
platform:

SAFECAST
The first and most obvious usage scenario is Safecast, whose dataset was the
spark behind the development of GeoSense. With over 10,000 active viewers
through the development of GeoSense V3, Safecast has been the primary
driver behind feature-set development. For the first time, the Safecast dataset
was fully visible as a perfect mirror of its current state in the teamdata database: there were no intermediary hand-built aggregates or reductions as was
previously the case.

An image of GeoSense for Safecast showing a coastal area of Japan, featuring: radiation levels
(green to pink), coastal flood zones (red coast), nuclear reactors (red dot), and earthquakes (blue)

For our usage scenario, the Safecast dataset was combined with historical
earthquake data, nuclear power reactors, and nuclear power plants with reported INES (International Nuclear and Radiological Event Scale) incidents, as well as model-generated coastal flooding estimates from the 3/11 earthquake and ensuing tsunami. The data layers were chosen not only to tell an important story, but also to open the stage for discussion: common questions such as
"Where should I consider building a house?", "Is my child's school playground
safe from radiation?", and "What areas are at high risk for similar catastrophe?" have been asked and addressed. By allowing the community to discuss
data placed in context, the back-and-forth of email news groups and repetitive question-and-answer exchanges has been reduced. Much like the ancient stone markers found in coastal Japan warning inhabitants of tsunamis, GeoSense offers not only a view into the past but also a glimpse into the future, where individuals and communities alike can make clear, informed decisions.

SOURCEMAP
Sourcemap.com is the open directory of supply chains and environmental
footprints. Consumers use the site to learn about where products come from,
what they're made of, and how they impact people and the environment.
Companies use Sourcemap to communicate transparently with consumers
and to tell the story of how products are made. [39]
The GeoSense team is working closely with CEO Leonardo Bonanni
of Sourcemap on finding ways to explore the causal relationships between
climate, cultural, and ecological data in conjunction with product supply
chains. We have begun by exploring the relationship between North American farm location, food distribution patterns, global warming, and population
density. Properly visualized, these layers have yielded new insights related to operational risk
factors and supply chain optimization.

THE LACE RACE


The Lace Race is an ongoing global game developed by a team of artists and
researchers from the MIT ACT, Media Lab, CSAIL and Department of Architecture. It debuted at the Reykjavik Arts Festival in Reykjavik, Iceland. The
Lace Race game is simple: participants are given a single shoelace with a
unique identifier number. Each participant is then encouraged to continually
trade his or her shoelace(s) with strangers or other participants. For each encounter, the exchanging user is encouraged to tweet in the following format:
"#LaceRace 123 location", where "#LaceRace" refers to the game's hashtag,
"123" to the unique identifier, and "location" to the physical location of the exchange. GeoSense was then used to watch the Twitter hashtag #LaceRace and
produce a real-time map of all ongoing Lace Race activity.
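The tweet format lends itself to simple pattern matching. The following sketch shows one way a "#LaceRace 123 location" tweet could be split into its parts; the regular expression and the function name are assumptions for illustration, and turning the free-text location into coordinates (geocoding) is deliberately left out.

```python
import re

# Hashtag, then the numeric lace identifier, then free-text location.
LACE_TWEET = re.compile(r"#LaceRace\s+(\d+)\s+(.+)", re.IGNORECASE)


def parse_lace_tweet(text):
    """Return {'lace_id': int, 'location': str}, or None if the format is absent."""
    match = LACE_TWEET.search(text)
    if match is None:
        return None
    return {
        "lace_id": int(match.group(1)),
        "location": match.group(2).strip(),
    }
```

Tweets that mention the hashtag without an identifier and location return `None` and would simply be skipped when building the map.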
Users are also encouraged to use the geo-tagged comment system to
annotate their exchanges, to note where they saw specific laces, or even to
hunt down specific numbers as a form of information exchange.

Results
As of writing, GeoSense has reached more than 10,000 users through
Safecast alone. It was demonstrated to over 400 visitors and broadcast to
thousands during the 2012 spring MIT Media Lab Member's week. Many parties were interested in using GeoSense as a new way to decode their own
cryptic data. Specific interest was shown by members of the National Wildlife
Federation with regard to better understanding the social, economic, and environmental impact of seasonal fires; we anticipate many future partnerships.
Thanks to Safecast, a constant stream of users encounters GeoSense
for mission-critical usage of the radiation dataset. Results so far are
positive and encouraging, but we realize that only the surface has been scratched,
and we will continue to feverishly develop GeoSense until it reaches its full potential.

Future Work
GeoSense is an ongoing, ever-evolving project. Because it is open source and
serves as the visualization platform for Safecast's future work, it will always
be defined not only by the experimental directions we hope to take but also
by features that best suit the needs of the active user base. Hundreds of potential directions have been discussed; of these, the following are among the most
pressing:

Tile servers
As previously described in Technical Design and Challenges, technical limitations are quickly met when attempting to handle and visualize large and
dense sets of data. The most efficient method remains one of the oldest: rendering all of the data into the map tiles on the server itself. GeoSense currently renders visual information into a canvas layer client-side
and displays it as an overlay atop a pre-generated map tile. To date, we have
reached an efficiency that rivals the performance of even a dedicated tile
server; however, users of older machines and mobile devices may find the experience
slower and, in some cases, completely broken.
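A first step for any tile server is deciding which tile a data point falls into. The sketch below uses the widely used Web Mercator "slippy map" tiling scheme; that GeoSense would adopt this scheme is an assumption, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile index containing a point at the given zoom level."""
    n = 2 ** zoom  # number of tiles along each axis
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the map edge stay inside the valid range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def group_points_by_tile(points, zoom):
    """Bucket (lat, lon, value) points by the tile they would be drawn into."""
    tiles = defaultdict(list)
    for lat, lon, value in points:
        tiles[lat_lon_to_tile(lat, lon, zoom)].append((lat, lon, value))
    return dict(tiles)
```

With points grouped this way, each tile can be rendered once on the server and cached, so the client downloads an image instead of drawing every point itself.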

Expanded visualization types


With a robust method for handling large data sets and a community of active
users, GeoSense is in a prime position to iterate and experiment with new
types of visualizations. We imagine there to be a well of opportunity in exploring information visualization beyond geovisualization. We hope to work
towards finding new and expressive visual explanations of a dataset's potential meaning.

Models & mechanistic explanations


As we have begun to explore through the introduction of time series graphs and pre-generated model overlays, the idea of allowing users to cast their own models
against a dataset is compelling. We imagine that once a set of data is
represented in GeoSense, a number of conditions can be applied against it.
The possible conditions are unbounded, but we are currently exploring falloff decay, parameters for attraction and deflection, as well as movement and inertia. Ultimately, a suite of tools could be developed to allow users, or communities of
users, to develop models towards understanding the meaning or future impact of their geovisualization.
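To make one of these conditions concrete, falloff decay could be modeled as a value that weakens exponentially with distance from its source. This is only one possible reading of the term; the exponential form, the function name, and the `scale_km` parameter are assumptions for illustration.

```python
import math


def falloff(value, distance_km, scale_km):
    """Attenuate value exponentially with distance.

    scale_km is the distance over which the value drops by a factor of e.
    """
    if scale_km <= 0:
        raise ValueError("scale_km must be positive")
    return value * math.exp(-distance_km / scale_km)
```

A user-specified model of this kind would expose `scale_km` as a tunable parameter, letting a community compare the modeled falloff against what their measurements actually show.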

Boolean conditions and spatially bound alerts
Part and parcel of the original GeoSense proposal was to invite individual users
to create geo-fenced conditional alerts atop their geovisualization. This interface will allow users to specify an "if-this-then-that" statement where,
if certain criteria are met, a series of specified outcomes will execute. A situational example of this would be:
"If radiation over 500 CPM is reported within 5 km of my home, then email me a
notice."
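The condition in this example reduces to a threshold test combined with a distance test. The sketch below evaluates it using the haversine great-circle distance; the function names, the spherical-Earth radius, and the tuple layout of a reading are assumptions, and the email step is left out.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def alert_triggered(reading, home, threshold_cpm=500, radius_km=5.0):
    """Return True if a (lat, lon, cpm) reading exceeds the threshold near home."""
    lat, lon, cpm = reading
    near_home = haversine_km(lat, lon, home[0], home[1]) <= radius_km
    return cpm > threshold_cpm and near_home
```

Each incoming reading would be checked against every stored alert, and a `True` result would trigger the user's chosen outcome, such as sending the email notice.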

This feature was deprecated in the current build of GeoSense as, during development, it was found to be less crucial than a stable infrastructure of geospatial commenting and live chat amongst current users. We are looking to
reevaluate the importance of boolean conditions and spatial alerts in the
coming months.

Conclusion
GeoSense liberates the author, viewer, and data. It proposes that design may
be used as a lens to enhance human understanding and promote imagination
- that provocative discoveries can be uncovered through intent and serendipity alike. We have demonstrated how, through the juxtaposition of visual language and observational analysis, insightful narratives can be discovered;
leading a community of individuals to generate hypotheses around the causality of data and worldly events.
With geovisualization come many complexities. Daunting though they may
be, their very presence also provides inherent value; to be massively complex
is both boon and bane. To explore, to probe, and to transform lifeless tabulated data into instructive, insightful, and human-readable information is a
prelude to an even larger effort.
We have explored the visual marriage of time and space, where both
parameters are tuned and tweaked to provide the viewer with insights that
were once locked away within spreadsheets. We have also begun expanding
the known vocabulary of geovisualization for the digital age, where each pixel
can have tremendous meaning and consequence; devising a representational
taxonomy that serves both form and function.
Finally, we have seen the need for, and positive response to, community tools for building dialog and sharing intelligence. GeoSense has
opened the doors for both thought and voice, where the user plays the role of
designer, scientist, analyst, and philosopher. Our accomplishment is an important first step, but it is only that: a first step. To answer the harder questions, to gaze into the future, we must first have a tool to see into the past and
into the now; with GeoSense we may begin this process with massive data as
our vessel, assembled by and for a community of open minds and thinkers.

References
[1] Safecast, "Safecast," blog.safecast.org. [Online]. Available: http://blog.safecast.org/. [Accessed: 27-Apr.-2012].
[2] J. Mackinlay, S. K. Card, and B. Shneiderman, Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers, 1999.
[3] MobiThinking, "Global mobile statistics 2012," mobithinking.com. [Online]. Available: http://mobithinking.com/mobile-marketing-tools/latest-mobile-stats. [Accessed: 27-Apr.-2012].
[4] Geospatial Today. [Online]. Available: http://geospatialtoday.com. [Accessed: 27-Apr.-2012].
[5] F. B. Viegas, M. Wattenberg, F. van Ham, J. Kriss, and M. McKeon, "Many Eyes: A Site for Visualization at Internet Scale," pp. 1-8, Aug. 2007.
[6] E. R. Tufte, Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press, 1997, p. 156.
[7] stamen.com. [Online]. Available: http://stamen.com. [Accessed: 27-Apr.-2012].
[8] "We feel fine and searching the emotional web," presented at the Proceedings of the fourth ACM international conference on Web search and data mining, New York, NY, USA, 2011, pp. 117-126.
[9] Google, "Google Earth," google.com. [Online]. Available: http://www.google.com/earth/index.html. [Accessed: 27-Apr.-2012].
[10] NASA, "World Wind JAVA SDK," worldwind.arc.nasa.gov, 18-Jul.-2011. [Online]. Available: http://worldwind.arc.nasa.gov/java/. [Accessed: 27-Apr.-2012].
[11] NSIDC, "View NSIDC Data on Virtual Globes: Google Earth," nsidc.org. [Online]. Available: http://nsidc.org/data/virtual-globes/. [Accessed: 27-Apr.-2012].
[12] unhcr.org. [Online]. Available: http://www.unhcr.org. [Accessed: 27-Apr.-2012].
[13] ArcGIS, "ArcGIS Online," arcgis.com. [Online]. Available: http://www.arcgis.com/home/. [Accessed: 27-Apr.-2012].
[14] ESRI, "MapIt - Create Interactive Business Maps | Map SQL Server & Excel Data," esri.com. [Online]. Available: http://www.esri.com/software/mapit/index.html. [Accessed: 27-Apr.-2012].
[15] Pachube, "The Internet of Things Real-Time Web Service and Applications - Pachube," pachube.com. [Online]. Available: https://pachube.com/. [Accessed: 27-Apr.-2012].
[16] Ushahidi, "Ushahidi :: Home," ushahidi.com. [Online]. Available: http://www.ushahidi.com/. [Accessed: 27-Apr.-2012].
[17] "Automatic generation of tourist maps," ACM Trans. Graph., vol. 27, no. 3, pp. 100:1-100:11, 2008.
[18] "Crowdsourcing graphical perception: using mechanical turk to assess visualization design," presented at the Proceedings of the 28th international conference on Human factors in computing systems, New York, NY, USA, 2010, pp. 203-212.
[19] geocommons.com. [Online]. Available: http://geocommons.com. [Accessed: 28-Apr.-2012].
[20] worldmap.harvard.edu. [Online]. Available: http://worldmap.harvard.edu. [Accessed: 28-Apr.-2012].
[21] MapBox, "MapBox | MapBox," mapbox.com. [Online]. Available: http://mapbox.com/. [Accessed: 27-Apr.-2012].
[22] TileMill, "TileMill | MapBox," mapbox.com. [Online]. Available: http://mapbox.com/tilemill/. [Accessed: 27-Apr.-2012].
[23] "Your place or mine?: visualization as a community component," presented at the Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems, New York, NY, USA, 2008, pp. 275-284.
[24] CAIS, "thquake Prediction, Japan," cais.gsi.go.jp. [Online]. Available: http://cais.gsi.go.jp/YOCHIREN/activity/191/191.e.html. [Accessed: 27-Apr.-2012].
[25] CBS, "New USGS number puts Japan quake at 4th largest - CBS News," cbsnews.com, 14-Mar.-2011. [Online]. Available: http://www.cbsnews.com/stories/2011/03/14/501364/main20043126.shtml. [Accessed: 27-Apr.-2012].
[26] N. G. JP, "Damage Situation and Police Countermeasures associated with 2011 Tohoku district - off the Pacific Ocean Earthquake," npa.go.jp, 25-Apr.-2012. [Online]. Available: http://www.npa.go.jp/archive/keibi/biki/higaijokyo-e.pdf. [Accessed: 27-Apr.-2012].
[27] NISA, "INES (the International Nuclear and Radiological Event Scale) Rating on the Events in Fukushima Dai-ichi Nuclear Power Station by the Tohoku District," nisa.meti.go.jp. [Online]. Available: http://www.nisa.meti.go.jp/english/files/en20110412-4.pdf. [Accessed: 27-Apr.-2012].
[28] I. B. Times, "Analysis: A month on, Japan nuclear crisis still scarring - International Business Times," ibtimes.co.in, 09-Apr.-2011. [Online]. Available: http://www.ibtimes.co.in/articles/132391/20110409/japan-nuclear-crisis-radiation.htm. [Accessed: 27-Apr.-2012].
[29] L. Times, "Japan earthquake: Insurance cost for quake alone pegged at $35 billion, AIR says - Los Angeles Times," articles.latimes.com, 13-Mar.-2011. [Online]. Available: http://articles.latimes.com/2011/mar/13/world/la-fgw-japan-quake-insurance-20110314. [Accessed: 27-Apr.-2012].
[30] Safecast, "Safecast Data Downloads," maps.safecast.org. [Online]. Available: http://maps.safecast.org/downloads/. [Accessed: 27-Apr.-2012].
[31] CC, "Creative Commons - CC0 1.0 Universal," creativecommons.org. [Online]. Available: http://creativecommons.org/publicdomain/zero/1.0/. [Accessed: 27-Apr.-2012].
[32] TEPCO, "TEPCO: Status of Fukushima Daiichi and Fukushima Daini Nuclear Power Stations after great east japan earthquake," tepco.co.jp. [Online]. Available: http://www.tepco.co.jp/en/nu/fukushima-np/index-e.html. [Accessed: 27-Apr.-2012].
[33] P. W. Anderson, More and Different: Notes from a Thoughtful Curmudgeon, 1st ed. World Scientific Publishing Company, 2011, p. 424.
[34] E. R. Tufte, Envisioning Information. Graphics Press, 1990, p. 126.
[35] J. Albers, Search versus re-search. Trinity College Press, 1969, p. 85.
[36] E. Imhof, Cartographic Relief Presentation. ESRI Press, 2007, p. 388.
[37] peddl.com. [Online]. Available: https://peddl.com. [Accessed: 27-Apr.-2012].
[38] P. Pulse, "Place Pulse | The Collaborative Image of the City," pulse.media.mit.edu. [Online]. Available: http://pulse.media.mit.edu/. [Accessed: 27-Apr.-2012].
[39] sourcemap.com. [Online]. Available: http://sourcemap.com. [Accessed: 27-Apr.-2012].
[40] ifttt.com. [Online]. Available: http://ifttt.com. [Accessed: 27-Apr.-2012].
[41] ReadWriteHack, "Wait, What's Node.js Good for Again?," readwriteweb.com. [Online]. Available: http://www.readwriteweb.com/hack/2011/01/wait-whats-nodejs-good-for-aga.php. [Accessed: 27-Apr.-2012].
[42] mongodb.org. [Online]. Available: http://www.mongodb.org. [Accessed: 27-Apr.-2012].
[43] backbonejs.org. [Online]. Available: http://backbonejs.org. [Accessed: 27-Apr.-2012].
[44] C. Hidalgo, "Graphical Statistical Methods for the Representation of the Human Development Index and its Components."

Appendix
Tablet AR installation
GSPEAK BRIDGE
In order to translate the coordinate positions of both the iPad and the physical globe, a
translation bridge was developed and deployed as part of the GeoSense application. This bridge, written in Ruby, acts as an interpreter between Oblong's
Gspeak system and the GeoSense platform.

THE INTENT OF AN AUGMENTED REALITY APPLICATION


Paraphrased from Samuel Luescher's 2012 project proposal: "As a tangible interface to this data, we propose a physical globe whose position and orientation in space the application is monitoring. When holding up
a tablet to the globe, digital layers are superimposed on the camera image of
the globe that is displayed on the tablet screen. By coupling the physical affordances of the object with an AR application for tablet computers, we expect to tackle a number of usability problems that commonly occur with
mapping applications. We explore possible interaction techniques when coupling tablets with the globe and using them for individual navigation around
the geospatial data, subsequent decoupling of specific map views from the
globe and the tablet, as well as using the globe as a master control for larger
views."

Left: Samuel Luescher (front) and Anthony DeVincenzi (back) created a new map with GeoSense.
Right: A view of the tablet AR installation
