Welcome to Scribd!

NGDM - Lawrence O Hall and Kevin W Bowyer

Uploaded by

0% found this document useful (0 votes)

8 views19 pages

"Lookmarks" that point designers and users to interesting or anomalous regions greatly increase productivity. Lookmarks can point out interesting and anomalous regions; for example, Red Tide outbreaks in satellite images. If you can spend 40% as much time to do about 15x work, it is pretty important.

Original Description:

Original Title

NGDM_Lawrence O Hall and Kevin W Bowyer

Copyright

Available Formats

PPT, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Attribution Non-Commercial (BY-NC)

Available Formats

Download as PPT, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

8 views19 pages

NGDM - Lawrence O Hall and Kevin W Bowyer

Uploaded by

api-3798592

Copyright:

Attribution Non-Commercial (BY-NC)

Available Formats

Download as PPT, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 19

Search inside document

Finding “Lookmarks” for

Extreme-Scale Simulation and

Scientific Data

Lawrence O. Hall and Kevin W. Bowyer1

Computer Science & Engineering
University of South Florida
1
University of Notre Dame
Next Generation Data Mining - NGDM 07 1
Introduction
• Petascale simulations may require
significant time to debug / understand.
• Interesting regions in the simulation
are generally a small part of the whole.
• “Lookmarks” that point designers and
users to interesting or anomalous
regions greatly increase productivity.

Next Generation Data Mining - NGDM 07 2

Introduction
• Large-scale scientific databases are
being gathered, as from astronomical
observation or satellites.
• It is possible to gather more data than
there is time for experts to evaluate.
• Lookmarks can point out interesting
and anomalous regions; for example,
red tide outbreaks in satellite images.
Next Generation Data Mining - NGDM 07 3
MODIS Images from Southwest Florida Coast
(a) (b) (c)
11/13/2004 11/13/2004 11/13/2004
ERGB FLH Csat

26oN

Dark Water
Red Tide
25oN

83oW 82oW

Next Generation Data Mining - NGDM 07 4

Some Issues
• The data may be distributed, as in
Petascale simulations, in ways that
prevent it being all on one CPU.
• Classes of interest are typically much
smaller than the inhomogeneous “rest
of the data.”
• Most important for users may be to
point them to interesting regions, rather
than to correctly identify all examples.
Next Generation Data Mining - NGDM 07 5
Next Generation Data Mining - NGDM 07 6
Next Generation Data Mining - NGDM 07 7
How Important is This Idea?
A “before/after lookmarks” example:
• DOE lab staff ran a particular simulation
162 times, to detect tears/breaches.
• Each run generated 876GB of data.
• 180 person-hours were spent on finding
only tears in only the first 12 runs.

Next Generation Data Mining - NGDM 07 8

How Important is This Idea?
A “before/after lookmarks” example:
• Ground truth from first 12 runs, with an
added feature, used to train an “avatar.”
• Avatar reviews data, inserts lookmarks.
• 75 person-hours then spent checking
168 runs for tears and breaches.
40% of the time to do 15 times the data!
Next Generation Data Mining - NGDM 07 9
How Important is This Idea?
• Whenever you can spend about
40% as much time to do about 15x
work, it is pretty important.

Next Generation Data Mining - NGDM 07 10

Challenges
• Getting labeled data to start with.
– For simulations, for example, likely only
some regions in some time steps will be
labeled.
– The “uninteresting” data is likely
heterogeneous, with multiple underlying
classes.

Next Generation Data Mining - NGDM 07 11

Challenges
• In region-based prompting, should
the learning algorithm have a
measure of regional error?
– Does ground truth overlap define
regional error?
– Should measures of lift be used to
indicate how well regions are found?

Next Generation Data Mining - NGDM 07 12

Challenges
• User-marked regions are likely
imprecise.
– Even an interesting region where a
lookmark should exist may be
inhomogeneous.

• Distributed data may require a

distributed model be learned.

Next Generation Data Mining - NGDM 07 13

Approaches
• Active learning.
– May be helpful to indicate the most
useful examples/regions for labeling.
– May be used to gather more salient
regions for lookmark generation.

Next Generation Data Mining - NGDM 07 14

Approaches
• Semi-supervised learning.
– On a canister simulation looking at
stresses on bolts, close to 100% regional
accuracy was achieved with relatively
few training examples.

Next Generation Data Mining - NGDM 07 15

Approaches
• Ensembles of classifiers.
– May be used to learn on distributed data.
– Can be combined by weighted fusion
method.

• Synthetic examples and under-

sampling may be applied to address
skew in data.

Next Generation Data Mining - NGDM 07 16

Approaches
• Inhomogeneous classes may be
clustered into more homogenous
classes to ease the task of a
supervised learning algorithm.

Next Generation Data Mining - NGDM 07 17

Summary
• Advance science faster using “lookmarks”
that focus evaluation of large datasets.
• Data mining to develop lookmarks likely
requires distributed learning, learning under
data skew, and practical ways to enhance the
amount of labeled data.
• The combination of approaches needed will
cause the underlying approaches to evolve.

Next Generation Data Mining - NGDM 07 18

Questions ?

Next Generation Data Mining - NGDM 07 19

Acquisti NGDM
Document47 pages
Acquisti NGDM
api-3798592
No ratings yet
Shekhar SS SDMChallenges NGDM07
Document19 pages
Shekhar SS SDMChallenges NGDM07
api-3798592
No ratings yet
NGDM Senator 071011 DM
Document17 pages
NGDM Senator 071011 DM
api-3798592
No ratings yet
NGDM 10
Document8 pages
NGDM 10
api-3798592
No ratings yet
Bhavani NSF NGDM Oct2007 Short
Document15 pages
Bhavani NSF NGDM Oct2007 Short
api-3798592
No ratings yet
Berry Anomalies
Document40 pages
Berry Anomalies
api-3798592
No ratings yet
Clifton Privacy
Document13 pages
Clifton Privacy
api-3798592
No ratings yet
Ngdm07 Singh
Document30 pages
Ngdm07 Singh
api-3798592
No ratings yet
CD Iwork Shop Jiawei Han
Document36 pages
CD Iwork Shop Jiawei Han
api-3798592
No ratings yet
2007 10 10 NSF Ngdm07 Chris Fischer
Document47 pages
2007 10 10 NSF Ngdm07 Chris Fischer
api-3798592
No ratings yet
NGDM07v1 Wei Wang
Document26 pages
NGDM07v1 Wei Wang
api-3798592
No ratings yet
NGDM Final Donya Quick
Document31 pages
NGDM Final Donya Quick
api-3798592
No ratings yet
NSF Oct 2007 Vipin Kumar
Document37 pages
NSF Oct 2007 Vipin Kumar
api-3798592
No ratings yet
Alok Choudhary NGDM07 Panel Talk
Document16 pages
Alok Choudhary NGDM07 Panel Talk
api-3798592
No ratings yet
NGDM 10-09-07 - Haym Hirsh
Document17 pages
NGDM 10-09-07 - Haym Hirsh
api-3798592
No ratings yet
Kborne NGDM07
Document24 pages
Kborne NGDM07
api-3798592
No ratings yet
NGDM 07 Michael May
Document30 pages
NGDM 07 Michael May
api-3798592
No ratings yet
NGDM Talk Kargupta2
Document22 pages
NGDM Talk Kargupta2
api-3798592
No ratings yet
Finin NGDM Panel
Document17 pages
Finin NGDM Panel
api-3798592
No ratings yet
Ngdm07 Joshi
Document80 pages
Ngdm07 Joshi
api-3798592
No ratings yet
Xindong Wu NGDM07
Document32 pages
Xindong Wu NGDM07
api-3798592
No ratings yet
NGDM07 Philip Yu
Document22 pages
NGDM07 Philip Yu
api-3798592
No ratings yet
Weitzner Info Accountability Ngdm07
Document17 pages
Weitzner Info Accountability Ngdm07
api-3798592
100% (1)
HumanGeneFinding-NGDM2007 Salzberg
Document31 pages
HumanGeneFinding-NGDM2007 Salzberg
api-3798592
No ratings yet
Social Identity Group Discovery - Gary - Strong
Document7 pages
Social Identity Group Discovery - Gary - Strong
api-3798592
No ratings yet
NGDM07 Honavar
Document42 pages
NGDM07 Honavar
api-3798592
No ratings yet
Grossman Ngdm07
Document35 pages
Grossman Ngdm07
api-3798592
No ratings yet
Architecture Conscious Data Mining: Srinivasan Parthasarathy Data Mining Research Lab Ohio State University
Document16 pages
Architecture Conscious Data Mining: Srinivasan Parthasarathy Data Mining Research Lab Ohio State University
api-3798592
No ratings yet
Agouris
Document8 pages
Agouris
api-3798592
No ratings yet
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
From Everand
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
Mark Manson
Rating: 4 out of 5 stars
4/5 (5784)
The Little Book of Hygge: Danish Secrets to Happy Living
From Everand
The Little Book of Hygge: Danish Secrets to Happy Living
Meik Wiking
Rating: 3.5 out of 5 stars
3.5/5 (399)
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
From Everand
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
Margot Lee Shetterly
Rating: 4 out of 5 stars
4/5 (890)
Shoe Dog: A Memoir by the Creator of Nike
From Everand
Shoe Dog: A Memoir by the Creator of Nike
Phil Knight
Rating: 4.5 out of 5 stars
4.5/5 (537)
Grit: The Power of Passion and Perseverance
From Everand
Grit: The Power of Passion and Perseverance
Angela Duckworth
Rating: 4 out of 5 stars
4/5 (587)
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
From Everand
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
Ashlee Vance
Rating: 4.5 out of 5 stars
4.5/5 (474)
The Yellow House: A Memoir (2019 National Book Award Winner)
From Everand
The Yellow House: A Memoir (2019 National Book Award Winner)
Sarah M. Broom
Rating: 4 out of 5 stars
4/5 (98)
Team of Rivals: The Political Genius of Abraham Lincoln
From Everand
Team of Rivals: The Political Genius of Abraham Lincoln
Doris Kearns Goodwin
Rating: 4.5 out of 5 stars
4.5/5 (234)
Fear: Trump in the White House
From Everand
Fear: Trump in the White House
Bob Woodward
Rating: 3.5 out of 5 stars
3.5/5 (738)
Never Split the Difference: Negotiating As If Your Life Depended On It
From Everand
Never Split the Difference: Negotiating As If Your Life Depended On It
Chris Voss
Rating: 4.5 out of 5 stars
4.5/5 (838)
The Emperor of All Maladies: A Biography of Cancer
From Everand
The Emperor of All Maladies: A Biography of Cancer
Siddhartha Mukherjee
Rating: 4.5 out of 5 stars
4.5/5 (271)
Yes Please
From Everand
Yes Please
Amy Poehler
Rating: 4 out of 5 stars
4/5 (1888)
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
From Everand
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
Dave Eggers
Rating: 3.5 out of 5 stars
3.5/5 (231)
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
From Everand
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
Gilbert King
Rating: 4.5 out of 5 stars
4.5/5 (265)
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
From Everand
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
Ben Horowitz
Rating: 4.5 out of 5 stars
4.5/5 (344)
Principles: Life and Work
From Everand
Principles: Life and Work
Ray Dalio
Rating: 4 out of 5 stars
4/5 (599)
On Fire: The (Burning) Case for a Green New Deal
From Everand
On Fire: The (Burning) Case for a Green New Deal
Naomi Klein
Rating: 4 out of 5 stars
4/5 (72)
The World Is Flat 3.0: A Brief History of the Twenty-first Century
From Everand
The World Is Flat 3.0: A Brief History of the Twenty-first Century
Thomas L. Friedman
Rating: 3.5 out of 5 stars
3.5/5 (2219)
Rise of ISIS: A Threat We Can't Ignore
From Everand
Rise of ISIS: A Threat We Can't Ignore
Jay Sekulow
Rating: 3.5 out of 5 stars
3.5/5 (137)
The Unwinding: An Inner History of the New America
From Everand
The Unwinding: An Inner History of the New America
George Packer
Rating: 4 out of 5 stars
4/5 (45)
Angela's Ashes: A Memoir
From Everand
Angela's Ashes: A Memoir
Frank McCourt
Rating: 4.5 out of 5 stars
4.5/5 (440)
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
From Everand
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
Brené Brown
Rating: 4 out of 5 stars
4/5 (1090)
The Glass Castle: A Memoir
From Everand
The Glass Castle: A Memoir
Jeannette Walls
Rating: 4.5 out of 5 stars
4.5/5 (1711)
John Adams
From Everand
John Adams
David McCullough
Rating: 4.5 out of 5 stars
4.5/5 (2409)
Steve Jobs
From Everand
Steve Jobs
Walter Isaacson
Rating: 4.5 out of 5 stars
4.5/5 (806)
Bad Feminist: Essays
From Everand
Bad Feminist: Essays
Roxane Gay
Rating: 4 out of 5 stars
4/5 (1015)
The Outsider: A Novel
From Everand
The Outsider: A Novel
Stephen King
Rating: 4 out of 5 stars
4/5 (1800)
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
From Everand
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
Viet Thanh Nguyen
Rating: 4.5 out of 5 stars
4.5/5 (119)
The Light Between Oceans: A Novel
From Everand
The Light Between Oceans: A Novel
M.L. Stedman
Rating: 4.5 out of 5 stars
4.5/5 (789)
The Woman in Cabin 10
From Everand
The Woman in Cabin 10
Ruth Ware
Rating: 3.5 out of 5 stars
3.5/5 (2322)
A Man Called Ove: A Novel
From Everand
A Man Called Ove: A Novel
Fredrik Backman
Rating: 4.5 out of 5 stars
4.5/5 (4609)
Wolf Hall: A Novel
From Everand
Wolf Hall: A Novel
Hilary Mantel
Rating: 4 out of 5 stars
4/5 (3811)
Brooklyn: A Novel
From Everand
Brooklyn: A Novel
Colm Tóibín
Rating: 3.5 out of 5 stars
3.5/5 (1937)
The Perks of Being a Wallflower
From Everand
The Perks of Being a Wallflower
Stephen Chbosky
Rating: 4.5 out of 5 stars
4.5/5 (2099)
A Tree Grows in Brooklyn
From Everand
A Tree Grows in Brooklyn
Betty Smith
Rating: 4.5 out of 5 stars
4.5/5 (1929)
Sing, Unburied, Sing: A Novel
From Everand
Sing, Unburied, Sing: A Novel
Jesmyn Ward
Rating: 4 out of 5 stars
4/5 (1103)
Little Women
From Everand
Little Women
Louisa May Alcott
Rating: 4 out of 5 stars
4/5 (104)
The Constant Gardener: A Novel
From Everand
The Constant Gardener: A Novel
John le Carré
Rating: 3.5 out of 5 stars
3.5/5 (104)
The Art of Racing in the Rain: A Novel
From Everand
The Art of Racing in the Rain: A Novel
Garth Stein
Rating: 4 out of 5 stars
4/5 (4193)
Manhattan Beach: A Novel
From Everand
Manhattan Beach: A Novel
Jennifer Egan
Rating: 3.5 out of 5 stars
3.5/5 (791)
Her Body and Other Parties: Stories
From Everand
Her Body and Other Parties: Stories
Carmen Maria Machado
Rating: 4 out of 5 stars
4/5 (821)
Mathematics 1994 Paper 2
Document20 pages
Mathematics 1994 Paper 2
api-3826629
0% (1)
EMV v4.2 Book 4
Document155 pages
EMV v4.2 Book 4
systemcoding
100% (1)
Fundamental Statistics For The Behavioral Sciences v2.0
Document342 pages
Fundamental Statistics For The Behavioral Sciences v2.0
Mark Kyle P. Andres
No ratings yet
Aw Hook SimulationXpress Study 1
Document13 pages
Aw Hook SimulationXpress Study 1
Mechatronics
No ratings yet
IFBI Solved Maths 7
Document52 pages
IFBI Solved Maths 7
Anurag Baraik
No ratings yet
Solve Systems of Differential Equations
Document225 pages
Solve Systems of Differential Equations
Inspire Lo
No ratings yet
Ch02 (1) Answers Economics
Document19 pages
Ch02 (1) Answers Economics
sheetal taneja
No ratings yet
ENG101 Chapter 4
Document5 pages
ENG101 Chapter 4
Abdul Haseeb
No ratings yet
Momentum Npte
Document30 pages
Momentum Npte
dilini
No ratings yet
Chap 01
Document66 pages
Chap 01
Gaurav Harsha
No ratings yet
Take Home Exam
Document3 pages
Take Home Exam
Christine Jeroso
No ratings yet
4 Quadratic Equations
Document2 pages
4 Quadratic Equations
TJSingh
No ratings yet
Lesson Plan in Mathematics 10
Document7 pages
Lesson Plan in Mathematics 10
shacarmi ga
No ratings yet
Mass, Momentum and Energy Equations
Document25 pages
Mass, Momentum and Energy Equations
lassi19a
100% (1)
SEISMIC DYNAMIC ANALYSIS
Document47 pages
SEISMIC DYNAMIC ANALYSIS
B S Praveen Bsp
No ratings yet
Quiz
Document4 pages
Quiz
Rechel
No ratings yet
2019 Staar Algebra I Test Tagged
Document44 pages
2019 Staar Algebra I Test Tagged
BUBBLEGUMDUM sickles
No ratings yet
Functions (Grade 10) - Google Forms
Document7 pages
Functions (Grade 10) - Google Forms
Чингиз Алиев
No ratings yet
Irreversible Thermodynamics Kenneth Pitzer
Document4 pages
Irreversible Thermodynamics Kenneth Pitzer
rollo
No ratings yet
Specialist Data Modeling 20201217
Document79 pages
Specialist Data Modeling 20201217
Putri Aulia Amara
No ratings yet
Bae 2016
Document12 pages
Bae 2016
IngénieurCivil
No ratings yet
MATH PETA 4th
Document56 pages
MATH PETA 4th
durpy minsh
No ratings yet
Vector Bundles On Rational Homogeneous Spaces: Rong Du, Xinyi Fang and Yun Gao
Document38 pages
Vector Bundles On Rational Homogeneous Spaces: Rong Du, Xinyi Fang and Yun Gao
Johan Gunardi
No ratings yet
BEE4413 Chapter 7
Document20 pages
BEE4413 Chapter 7
NUrul Aqilah Mus
No ratings yet
Thermotropic and Thermochromic Materials-03-05143
Document26 pages
Thermotropic and Thermochromic Materials-03-05143
Johannes Knesl
No ratings yet
CHE412 Process Dynamics Control BSC Engg 7th Semester
Document17 pages
CHE412 Process Dynamics Control BSC Engg 7th Semester
wreckieb
No ratings yet
Fluid at Rest
Document25 pages
Fluid at Rest
Zinb Himdan
No ratings yet
Structural Dynamics Syallbus Nptel
Document3 pages
Structural Dynamics Syallbus Nptel
Sushil Mundel
No ratings yet
Belief Statement
Document7 pages
Belief Statement
api-366576866
No ratings yet
Critical Review of Recent Advances and Further Developments Needed in AC Optimal Power Flow
Document14 pages
Critical Review of Recent Advances and Further Developments Needed in AC Optimal Power Flow
shady
No ratings yet