Revolution R Open
The Enhanced R Distribution
November 12, 2014
In today's webinar:
R Update
Revolution R Open
The Reproducible R Toolkit
MRAN
Other open-source projects
DeployR Open
ParallelR
RHadoop
Revolution R Plus
Q&A
David Smith
Chief Community Officer
Revolution Analytics
@revodavid
david@revolutionanalytics.com
Editor, blog.revolutionanalytics.com
Co-author, Introduction to R
OUR COMPANY
OUR PRODUCT
REVOLUTION R: The
enterprise-grade predictive
analytics application platform
based on the R language
SOME KUDOS
Visionary
Gartner Magic Quadrant
for Advanced Analytics
Platforms, 2014
What is R?
Most widely used data analysis software
Used by 2M+ data scientists, statisticians and analysts
www.revolutionanalytics.com/what-is-r
Poll #1
What software do you use for statistical analysis? (Select all that apply.)
R
SAS
SPSS
Python
Other
R Usage Growth
Language Popularity
#9: R
Rexer Data Miner Survey
Multi-threaded performance
Intel MKL replaces standard
BLAS/LAPACK algorithms
Pipelined operations
Optimized for Intel, works for all archs
High-performance algorithms
Sequential Parallel
Uses as many threads as there are
available cores
Control with:
setMKLthreads(<value>)
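As a rough sketch of how the thread control might be used around a BLAS-heavy operation: the guard below is an assumption, added so the snippet also runs on a standard R build where setMKLthreads() is not defined. The matrix size and the thread count of 2 are illustrative only.

```r
# Sketch only: time a BLAS-backed operation, limiting MKL threads when possible.
set.seed(1)
n <- 200
a <- matrix(rnorm(n * n), n, n)

# setMKLthreads() is the control named on the slide; on a standard R build it
# does not exist, so the call is skipped (this guard is an assumption).
if (exists("setMKLthreads")) {
  setMKLthreads(2)
}

# crossprod(a) computes t(a) %*% a through the BLAS library in use.
elapsed <- system.time(b <- crossprod(a))[["elapsed"]]
```

Rerunning the timing with different thread counts is one way to see how much a given workload benefits from multi-threading.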
100% Compatibility
Built on latest R engine
Currently R 3.1.1, R 3.1.2 in testing
www.nytimes.com/2011/07/08/health/research/08genes.html
http://arxiv.org/pdf/1010.1092.pdf
An R Reproducibility Problem
CRAN mirror: http://cran.revolutionanalytics.com/
checkpoint server: takes daily snapshots of the CRAN mirror at midnight UTC
Daily snapshots: http://mran.revolutionanalytics.com/snapshot/
checkpoint package:
library(checkpoint)
checkpoint("2014-09-17")
Using checkpoint
Easy to use: add 2 lines to the top of each script
library(checkpoint)
checkpoint("2014-09-17")
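To make the idea of a dated snapshot concrete, here is a minimal sketch in plain R. It does not use the checkpoint package, and the helper snapshot_repo() is hypothetical. It assumes that a snapshot for a given day is addressed by appending the date to the snapshot URL shown earlier; that URL layout is an assumption, not something the slides state.

```r
# Hypothetical helper: build a dated snapshot repository URL.
# Assumption: snapshots live at <snapshot base URL>/<YYYY-MM-DD>.
snapshot_repo <- function(date,
                          base = "http://mran.revolutionanalytics.com/snapshot/") {
  d <- as.Date(date, format = "%Y-%m-%d")  # NA if the string is not a valid date
  if (is.na(d)) {
    stop("date must be given as YYYY-MM-DD")
  }
  paste0(base, format(d, "%Y-%m-%d"))
}

repo <- snapshot_repo("2014-09-17")

# A script pinned to that date could then install packages from it, e.g.:
# install.packages("somepackage", repos = repo)  # "somepackage" is a placeholder
```

Fixing the repository to one date is what lets the same script resolve the same package versions each time it is run.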
Revolution Analytics
DeployR Open
Goal: embed results from R scripts into
existing applications, in real time
Problem:
Exposing arbitrary R functions is unwise
Need to handle concurrent R sessions
More at deployr.revolutionanalytics.com
DeployR: Integration
DeployR does not provide any application UI.
3 integration modes embed real-time R results into existing interfaces
Web app, mobile app, desktop app, BI tool, Excel,
No state preserved
3. Enterprise Authentication
Fraud detection
RHadoop
Collection of packages for interfacing R and Hadoop
Client (desktop) R interface to Hadoop:
rhdfs: Browse, read, write and modify files stored in HDFS
rhbase: Browse, read, write and modify tables stored in HBase
ravro: Read, write and run map-reduce on Apache Avro files in HDFS
R computations in Hadoop:
rmr2: write map-reduce tasks in R to run in Hadoop
plyrmr: R-based data manipulation computations on data in Hadoop
Reduce:
Input: Words with several key values
Output: words with counts
Map-Reduce:
Apply map to lines of text
Gather like words together and count
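The word-count flow described above can be imitated in plain R on a few lines of text. This is only a local sketch of the map, gather and reduce steps: it does not use rmr2 or Hadoop, and the sample lines and the helper map_words() are made up for illustration.

```r
# Local, in-memory imitation of a word-count map-reduce (no Hadoop involved).
lines <- c("the quick brown fox", "the lazy dog", "the fox")

# Map: turn one line of text into a list of (word, 1) pairs.
map_words <- function(line) {
  words <- strsplit(tolower(line), "[^a-z]+")[[1]]
  words <- words[nzchar(words)]  # drop empty strings left by the split
  lapply(words, function(w) list(key = w, value = 1L))
}
pairs <- unlist(lapply(lines, map_words), recursive = FALSE)

# Gather: group the values by key, so each word carries several values.
keys    <- vapply(pairs, function(p) p$key, character(1))
values  <- vapply(pairs, function(p) p$value, integer(1))
grouped <- split(values, keys)

# Reduce: collapse each word's values into a single count.
counts <- vapply(grouped, sum, integer(1))
```

Here counts is a named integer vector with one entry per distinct word. In a real Hadoop job the map and reduce functions would run on different nodes, with the framework doing the gathering in between.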
http://bit.ly/W35PLR
ParallelR
foreach replaces for loops
doSNOW (grids)
library("doMC")
registerDoMC(2)
x <- foreach(j=1:100) %dopar% birthday(j)
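The slides do not define birthday(). One plausible reading, offered here purely as an assumption, is a simulation of the birthday problem: estimate the chance that at least two people in a group of j share a birthday. The sketch below defines such a function and runs it sequentially with base R's lapply(), so it needs no parallel backend; the trials argument and the seed are illustrative choices.

```r
# Hypothetical birthday(): estimate by simulation the probability that a group
# of n people contains at least one shared birthday (365 equally likely days).
birthday <- function(n, trials = 200) {
  shared <- replicate(trials, {
    days <- sample(365, n, replace = TRUE)
    anyDuplicated(days) > 0  # TRUE if any birthday occurs twice
  })
  mean(shared)
}

set.seed(42)
# Sequential equivalent of the loop body: one independent task per group size.
x <- lapply(1:100, birthday)
```

Because each call is independent of the others, this is the kind of loop that can be handed to foreach with %dopar% once a backend such as doMC has been registered.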
Introducing
Revolution R Plus
DeployR
ConnectR
ScaleR
DistributedR
DevelopR
Poll #2
Which Revolution Analytics projects do you plan to use (or already use)?
Select all that apply:
Wrapping up
Revolution R Open is available now from
mran.revolutionanalytics.com/download
David Smith
Chief Community Officer
Revolution Analytics
@revodavid
david@revolutionanalytics.com
www.revolutionanalytics.com/plus
Thank you.