Girish Juneja
GM, Big Data Software
Software & Services Group
Data
Fab
Transistor
System
Enablement
Optimization
Intelligence
Data
Computing
Experience
Social
User Generated
Platform RDBMS
90s
2000s
NoSQL, RDBMS
Today
Unbounded MapReduce Query
Low Cost / Enterprise Use
Arrival of vast amounts of unstructured data
Real-time, e.g. recommendation engine
Process at storage node
Built-in data replication/reliability
Shared nothing, in memory
Unlimited Linear Scale
Distributed node addition
Security
Software
Global Ecosystem
45 nm
32 nm
22 nm
37%
Performance Gain at Low Voltage¹
22nm
A Revolutionary Leap in Process Technology
High-k Metal Gate
Intel lead vs. Industry
Tri Gate
Intel lead vs. Industry
>50%
Active Power Reduction at Constant Performance¹
3.5 years
4 years
Highest reliability & scalability
Highest memory capacity
Highest enterprise & database performance
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
¹ Source: Published results as of 8 May 2012. See http://www.intel.com/performance/server/xeonE7/summary.htm for full list of benchmarks and configuration details.
VT-*
MCA
AES-NI
SSD, 10GbE
TXT
File-based encryption for Hadoop jobs
ACLs for HDFS and HBase at cell level
Flash storage for MapReduce shuffle data
Caching and non-volatile memory for increased throughput
HDFS adaptive replication of hot files
Batch Analytics
Graph Analytics
Full SQL
Up to 20X faster crypto with AES-NI*
30X faster TeraSort on Intel Xeon processors, Intel 10GbE, and SSD
HPC
Up to 8.5X faster queries in Hive*
Job profiling and configuration, automated by Intel Active Tuner
*Based on internal testing
Cloud
Server
Network
IT
Fraud & threat detection
Life sciences research
Behavioral analysis
Warranty analysis
Customer segmentation
Infrastructure optimization
Analytics
Provides real-time retrieval of 6 months of data
Supports new BI with 15 types of queries
Enables targeted ad serving and promotions
Intel Distribution
Data Management
30 TB/month of billing data
300K reads/second; 800K inserts/second
CDR
Analytics
Provide curated data sets with pre-computed analysis (classification, correlation, biomarkers)
Provide APIs for applications to combine and analyze public and private data sets
Intel Distribution
Data Management
Use Hive and Hadoop for query and search
Dynamically partition and scale HBase
10-node cluster / Intel Xeon E5 processors / 10GbE
Analytics
Detect traffic law violations automatically
Detect driver license fraud by data mining
Data Management
30,000 cameras
6 Mb/s stream rate per camera
15 PB of images in use / 2B records in HBase
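As a rough sense of scale for the camera figures above, the aggregate ingest rate can be worked out with a short sketch. It assumes "Mb/s" means megabits per second, decimal units, and every camera streaming continuously; none of this is stated on the slide, and the function names are hypothetical. The result is an upper bound on raw stream volume, not a description of what the deployment actually stores.

```python
# Back-of-envelope ingest estimate from the slide's camera figures.
# Assumptions (not from the slide): "Mb/s" is megabits per second,
# decimal units (1 Gb = 1000 Mb), and all cameras stream continuously.

CAMERAS = 30_000
STREAM_MEGABITS_PER_SEC = 6
SECONDS_PER_DAY = 86_400


def aggregate_gigabits_per_sec(cameras: int, megabits_per_camera: float) -> float:
    """Total stream rate across all cameras, in gigabits per second."""
    return cameras * megabits_per_camera / 1_000


def terabytes_per_day(gigabits_per_sec: float) -> float:
    """Raw data volume per day, in terabytes, at a constant stream rate."""
    bytes_per_sec = gigabits_per_sec * 1e9 / 8
    return bytes_per_sec * SECONDS_PER_DAY / 1e12


total_gbps = aggregate_gigabits_per_sec(CAMERAS, STREAM_MEGABITS_PER_SEC)
daily_tb = terabytes_per_day(total_gbps)

print(f"Aggregate stream rate: {total_gbps:.0f} Gb/s")
print(f"Raw volume per day:    {daily_tb:.0f} TB")
```

Under these assumptions the raw streams alone would fill 15 PB in about a week, which suggests the stored image set is a filtered or sampled subset of the streams rather than continuous video.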
Foster the ecosystem and develop new markets for Intel and its partners
Resources
Content
Case Studies
Whitepapers
Demos
http://hadoop.intel.com