Bare-metal or Cloud?
Bare-metal vs. Cloud
Price-Performance Ratio
Data Privacy
Data Gravity
Data Enrichment
Productivity of Developers & Data Scientists
Copyright 2013 Accenture. All rights reserved.
Big Data: Bare-metal vs. Cloud
Price-Performance Ratio Views
Bare-metal vs. Cloud
Servers designed by Daniel Campos from The Noun Project
Hadoop Deployment Comparison Study
Bare-metal vs. Cloud: TCO analysis + Accenture Data Platform Benchmark
Hadoop Deployment Comparison Study
TCO Analysis
TCO of Bare-metal Hadoop Cluster
Cost components: data center facility and electricity; server hardware; technical support; staff for operation
Cost figures shown: $3,000.00; $2,914.58; $6,656.00; $9,274.46
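The four cost figures on this slide can be added up to get an overall bare-metal TCO. The following is a minimal sketch that uses only the amounts shown above. The variable names are illustrative, and the figures are kept in slide order rather than paired with specific cost components.

```python
from decimal import Decimal

# Cost figures as they appear on the slide, in slide order.
# Which figure belongs to which cost component is not assumed here.
cost_figures = [
    Decimal("3000.00"),
    Decimal("2914.58"),
    Decimal("6656.00"),
    Decimal("9274.46"),
]

# Decimal keeps the cents exact, unlike binary floats.
total_tco = sum(cost_figures, Decimal("0"))

# Share of each figure in the total, as a percentage with one decimal place.
shares = [
    (amount / total_tco * 100).quantize(Decimal("0.1"))
    for amount in cost_figures
]

print(f"Total TCO: ${total_tco:,}")
for amount, share in zip(cost_figures, shares):
    print(f"${amount:,} is {share}% of the total")
```

Using `Decimal` rather than `float` is a design choice for currency arithmetic; it avoids rounding noise in the total.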
TCO of Hadoop-as-a-Service
Hadoop service instance types: m1.xlarge, m2.4xlarge, cc2.8xlarge
Assumptions: 50% cluster utilization; 1/3 of budget allocated for Spot instances
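One way to read the two assumptions on this slide is as inputs to a cluster-sizing calculation: a third of the budget goes to Spot instances, and instances are billed for only half of the hours in a month. The sketch below illustrates that reading. The function names, the 730-hour month, the budget, and the hourly prices are all hypothetical and do not come from the slides, and the sketch ignores upfront fees and other pricing details.

```python
from fractions import Fraction
from math import floor

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def split_budget(monthly_budget, spot_share=Fraction(1, 3)):
    """Split a monthly budget into (spot_budget, remaining_budget)."""
    spot_budget = monthly_budget * spot_share
    return spot_budget, monthly_budget - spot_budget


def affordable_instances(budget, hourly_price, utilization=Fraction(1, 2)):
    """Number of instances a budget covers when each instance is billed
    for only `utilization` of the hours in a month."""
    billed_hours = HOURS_PER_MONTH * utilization
    return floor(budget / (hourly_price * billed_hours))


# Hypothetical numbers, for illustration only (not from the slides).
budget = Fraction(30000)       # dollars per month, made up
spot_price = Fraction(1, 4)    # dollars per instance-hour, made up
other_price = Fraction(1)      # dollars per instance-hour, made up

spot_budget, remaining_budget = split_budget(budget)
spot_count = affordable_instances(spot_budget, spot_price)
other_count = affordable_instances(remaining_budget, other_price)
print(spot_count, other_count)
```

`Fraction` keeps the one-third split exact, so the instance counts do not depend on floating-point rounding.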
Accenture Data Platform Benchmark
Processing steps: Bucketing, Sorting, Slicing
Input: ~150 billion log entries (~24 TB) from 1 million users
Output: 1.1 billion sessions
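The bucketing, sorting, and slicing steps can be illustrated on a small scale. The sketch below is a single-process illustration, not the benchmark's MapReduce implementation. It assumes that bucketing groups log entries by user, sorting orders each user's entries by time, and slicing cuts a new session after a period of inactivity. The 30-minute gap, the record layout, and the function name are assumptions that the slide does not state.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumption: not stated on the slide


def sessionize(log_entries, gap=SESSION_GAP):
    """Group (user_id, timestamp) log entries into per-user sessions."""
    # Bucketing: collect each user's entries together.
    buckets = defaultdict(list)
    for user_id, timestamp in log_entries:
        buckets[user_id].append(timestamp)

    sessions = []
    for user_id, timestamps in buckets.items():
        # Sorting: order each user's entries by time.
        timestamps.sort()
        # Slicing: start a new session whenever the inactivity gap is exceeded.
        current = [timestamps[0]]
        for ts in timestamps[1:]:
            if ts - current[-1] > gap:
                sessions.append((user_id, current))
                current = [ts]
            else:
                current.append(ts)
        sessions.append((user_id, current))
    return sessions


entries = [
    ("alice", datetime(2013, 1, 1, 9, 0)),
    ("bob", datetime(2013, 1, 1, 9, 5)),
    ("alice", datetime(2013, 1, 1, 9, 10)),
    ("alice", datetime(2013, 1, 1, 11, 0)),
]
result = sessionize(entries)
for user_id, session in result:
    print(user_id, len(session))
```

In this toy input, the first two entries for "alice" fall into one session and her third entry, almost two hours later, starts a new one.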
Used item-based collaborative filtering algorithm
Mahout example library used as foundation
Generated 300 million ratings (3 million population, 50,000 items)
Ratings data: Who rated what item?
Co-occurrence matrix: How many people rated the pair of items?
Recommendation: Given the way the person rated these items, he/she is likely to be interested in these other items.
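The three stages above (ratings data, co-occurrence matrix, recommendation) can be sketched in a few lines. This is plain Python for illustration and does not use Mahout or reproduce its implementation. The function names, the scoring rule (co-occurrence count times the user's rating), and the sample ratings are assumptions.

```python
from collections import defaultdict
from itertools import permutations


def build_cooccurrence(ratings):
    """ratings: dict of user -> dict of item -> rating.
    Returns counts[a][b] = number of users who rated both a and b."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in ratings.values():
        for a, b in permutations(items, 2):
            counts[a][b] += 1
    return counts


def recommend(user, ratings, counts, top_n=3):
    """Score items the user has not rated by summing, over the user's
    rated items, co-occurrence count times the user's rating."""
    rated = ratings[user]
    scores = defaultdict(float)
    for item, rating in rated.items():
        for other, count in counts[item].items():
            if other not in rated:
                scores[other] += count * rating
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Made-up ratings: who rated what item?
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 4, "B": 2, "C": 5},
    "u3": {"B": 4, "C": 3},
    "u4": {"A": 5, "D": 1},
}
counts = build_cooccurrence(ratings)
recommended = recommend("u1", ratings, counts)
print(recommended)
```

For user "u1", item C scores higher than item D because C co-occurs with both of the items that "u1" rated.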
Accenture Data Platform Benchmark: Document Clustering
TF vectors
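A term-frequency (TF) vector represents a document by how often each vocabulary term occurs in it. The slide names TF vectors without further detail, so the sketch below makes its own assumptions: whitespace tokenization, lowercasing, raw counts, and a vocabulary built from the sample documents. The function name and the sample texts are illustrative.

```python
from collections import Counter


def tf_vector(text, vocabulary):
    """Return raw term counts for `text`, ordered by `vocabulary`."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


documents = [
    "big data on bare metal",
    "big data in the cloud",
    "cloud cloud cloud",
]

# Vocabulary: every distinct term across the documents, in sorted order.
vocabulary = sorted({term for doc in documents for term in doc.lower().split()})
vectors = [tf_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Vectors of this kind give every document the same dimensionality, which is what a clustering algorithm needs in order to compare documents.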
Experiment Setup: Price-Performance Ratio Comparison
Bare-metal Hadoop Cluster vs. Amazon EMR Clusters
1 bare-metal cluster vs. 9 Amazon EMR clusters
Fixed budget for cluster size
Manual and automated tuning
Measure execution time of benchmark
Tuning: Profile phase, Optimize phase
Manual and automated tuning
Measure execution time of optimize phase
Speedometer designed by Filippo Camedda from The Noun Project
Experiment Results: Starfish Automated Performance Tuning Tool
Starfish tuned: Recommendation Engine workload w/ 11 cascaded MapReduce jobs; 8x improvement in one tuning cycle using Starfish
Manually tuned: Sessionization workload; 2+ weeks of manual tuning, 1 day iterations
Manual and automated tuning
Achieve performance increases with less cost
Bare-metal: 533
Amazon EMR instance types: cc2.8xlarge, m2.4xlarge, m1.xlarge
Amazon EMR data labels: 408.07, 381.55, 250.13, 229.25, 204.10, 172.23, 125.82, 166.82, 114.35
Amazon EMR Configuration (x-axis): 13, 20, 68 (ODI); 28, 41, 112 (RI); 53, 77, 192 (RI+SI)
Experiment Results: Recommendation Engine
Execution Time (minutes)
Bare-metal: 21.59
Amazon EMR instance types: cc2.8xlarge, m2.4xlarge, m1.xlarge
Amazon EMR data labels: 23.33, 20.13, 14.28, 21.97, 19.97, 16.30, 18.48, 16.92, 15.08
Amazon EMR Configuration (x-axis): 13, 20, 68 (ODI); 28, 41, 112 (RI); 53, 77, 192 (RI+SI)
Experiment Results: Document Clustering
Execution Time (minutes)
Bare-metal: 1186.37
Amazon EMR instance types: cc2.8xlarge, m2.4xlarge, m1.xlarge
Amazon EMR data labels: 1661.03, 1649.98, 1157.37, 1112.68, 914.35, 779.98, 784.82, 629.98, 742.38
Amazon EMR Configuration (x-axis): 13, 20, 68 (ODI); 28, 41, 112 (RI); 53, 77, 192 (RI+SI)
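Since the experiment setup fixes the budget, one way to read the price-performance comparison is to compare execution times directly. The sketch below does this for the two charts that state their unit (minutes). It counts how many Amazon EMR data labels fall below the bare-metal time and computes the best speedup. The EMR values are kept in the order they appear in the chart text and are not attributed to particular instance types or configurations. The function and key names are illustrative.

```python
def compare_to_bare_metal(bare_metal_minutes, emr_minutes):
    """Summarize EMR execution times relative to the bare-metal time."""
    faster = [t for t in emr_minutes if t < bare_metal_minutes]
    return {
        "faster_configs": len(faster),
        "total_configs": len(emr_minutes),
        # Speedup of the fastest EMR run over bare-metal (above 1 means faster).
        "best_speedup": bare_metal_minutes / min(emr_minutes),
    }


# Execution times in minutes, copied from the charts in chart-text order.
results = {
    "Recommendation Engine": (
        21.59,
        [23.33, 20.13, 14.28, 21.97, 19.97, 16.30, 18.48, 16.92, 15.08],
    ),
    "Document Clustering": (
        1186.37,
        [1661.03, 1649.98, 1157.37, 1112.68, 914.35, 779.98, 784.82, 629.98, 742.38],
    ),
}

summary = {
    workload: compare_to_bare_metal(bare_metal, emr)
    for workload, (bare_metal, emr) in results.items()
}
for workload, stats in summary.items():
    print(
        f"{workload}: {stats['faster_configs']} of {stats['total_configs']} "
        f"EMR runs faster, best speedup {stats['best_speedup']:.2f}x"
    )
```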
Key Takeaways
Automated performance tuning tools are a necessity
Acknowledgement
Michael Wendt
R&D Developer
Data Insights R&D
Accenture Technology Labs
(408) 817-2190
michael.e.wendt@accenture.com
Scott Kurth
Group Lead
Data Insights R&D
Accenture Technology Labs
(408) 817-2775
scott.kurth@accenture.com