Lightning-Fast Cluster Computing by Example
Ramesh Mudunuri
Saturday, November 29, 2014

About me
Big data enthusiast
Product developer using Spark technology

What to expect
Introduction to Spark
Spark Ecosystem
How it differs from Hadoop MapReduce
Where it shines
How easy it is to install and start learning
Small code demos
Where to find additional information

This is not
A training class
A workshop
A product demo with commercial interest

What is Spark
Apache Spark is a fast and general engine for large-scale data processing.
General-purpose, large-scale, high-performance processing engine

Spark History
Started as a research project at the UC Berkeley AMPLab in 2010
Prominent research team member: Matei Zaharia
Matei later started the company Databricks
Now an Apache open source project

Spark Ecosystem
Spark SQL
Spark Streaming
GraphX

Code Demos
Write some interesting code snippets on the REPL using Scala
1. Read Meetup participants in and get some counts
2. Create a table and get some counts with Spark SQL
3. MLlib example
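
The first two demos above can be sketched roughly as follows. This is a minimal sketch, not the code shown in the talk: the `Participant` record, the sample rows, and the names `MeetupCounts`, `newSession`, `countByCityRdd`, and `countByCitySql` are all hypothetical, and it assumes a Spark 2.x or later build where `SparkSession` is the entry point (the 2014-era shell exposed `sc` and `SQLContext` instead).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record shape; the real Meetup participant data is not shown in the slides.
case class Participant(name: String, city: String)

object MeetupCounts {
  // Made-up sample rows used only for illustration.
  val sample: Seq[Participant] = Seq(
    Participant("Asha", "Hyderabad"),
    Participant("Ravi", "Hyderabad"),
    Participant("Meena", "Bengaluru")
  )

  // Local session for experimenting on one machine; spark-shell already provides `spark`.
  def newSession(): SparkSession =
    SparkSession.builder()
      .appName("meetup-counts")
      .master("local[*]")
      .getOrCreate()

  // Demo 1: load participants into an RDD and count them per city.
  def countByCityRdd(spark: SparkSession, people: Seq[Participant]): Map[String, Long] =
    spark.sparkContext
      .parallelize(people)
      .map(p => (p.city, 1L))
      .reduceByKey(_ + _)
      .collect()
      .toMap

  // Demo 2: register the same data as a temporary view and count with Spark SQL.
  def countByCitySql(spark: SparkSession, people: Seq[Participant]): Map[String, Long] = {
    import spark.implicits._
    people.toDS().createOrReplaceTempView("participants")
    spark.sql("SELECT city, COUNT(*) AS n FROM participants GROUP BY city")
      .collect()
      .map(row => (row.getString(0), row.getLong(1)))
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = newSession()
    println(countByCityRdd(spark, sample))
    println(countByCitySql(spark, sample))
    spark.stop()
  }
}
```

In the REPL the two function bodies can be pasted directly against the shell's own `spark` value. Both routes should give the same per-city counts, which makes it easy to compare the RDD and SQL styles side by side.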

List
Matei's papers
Spark documentation
Spark Summit videos
Books
Databricks workshop
My Twitter handle

Final note
Thank you - Hosts and Participants
Share the knowledge