
Big Data Analytics

Shreekant Kadam
XMBA - 58

What are we going to understand?

What is Big Data?

Why did we end up here?

To whom does it matter?

Are we ready to handle it?

What are the concerns?

Tools and Technologies

Simple to start
What is the maximum file size you have dealt with so far?

Movies/Files/Streaming video that you have used?

What have you observed?

What is the maximum download speed you get?

Simple computation

How much time does it take just to transfer the data?
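The transfer-time question can be worked out with a few lines of arithmetic. The sketch below is only an illustration: the function names, the file sizes and the 100 Mbps link speed are assumptions chosen for the example, and it uses decimal units (1 GB = 10^9 bytes) while ignoring protocol overhead.

```python
def transfer_time_seconds(file_size_gb: float, speed_mbps: float) -> float:
    """Seconds needed to move file_size_gb gigabytes over a speed_mbps link.

    Assumes decimal units and an ideal link with no protocol overhead.
    """
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    bits = file_size_gb * 8 * 1000**3       # gigabytes -> bits
    return bits / (speed_mbps * 1000**2)    # megabits/s -> bits/s


def human_readable(seconds: float) -> str:
    """Format a duration using the largest unit that fits."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


# Hypothetical sizes: a movie, a terabyte, a petabyte, all on a 100 Mbps link.
for label, size_gb in (("4 GB movie", 4), ("1 TB", 1_000), ("1 PB", 1_000_000)):
    print(label, "->", human_readable(transfer_time_seconds(size_gb, 100)))
```

Even on this idealized link, a terabyte takes most of a day and a petabyte takes years, which is why simply moving the data becomes a problem in its own right.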

What is big data?

Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few.

This data is big data.

Huge amount of data

There are huge volumes of data in the world:

+ From the beginning of recorded time until 2003, we created 5 billion gigabytes (5 exabytes) of data.

+ In 2011, the same amount was created every two days.

+ In 2013, the same amount of data is created every 10 minutes.

Big data spans three dimensions: Volume, Velocity and Variety

Volume: Enterprises are awash with ever-growing data of all types, easily amassing
terabytes, even petabytes, of information.

Turn 12 terabytes of Tweets created each day into improved product sentiment
analysis

Convert 350 billion annual meter readings to better predict power consumption

Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as
catching fraud, big data must be used as it streams into your enterprise in order to
maximize its value.

Scrutinize 5 million trade events created each day to identify potential fraud

Analyze 500 million daily call detail records in real-time to predict customer churn
faster

The latest I have heard is that even a 10-nanosecond delay is too much.

Variety: Big data is any type of data - structured and unstructured data such as text,
sensor data, audio, video, click streams, log files and more. New insights are found
when analyzing these data types together.

Monitor 100s of live video feeds from surveillance cameras to target points of
interest

Exploit the 80% data growth in images, video and documents to improve customer
satisfaction

Finally:

Big data is similar to small data, but bigger.

Because the data is bigger, it requires different approaches (techniques, tools, architecture), with an aim to solve new problems, or old problems in a better way.

To whom does it matter?

Research Community

Business Community - New tools, new capabilities, new infrastructure, new business models, etc.

On sectors

Financial Services..

The Social Layer in an Instrumented Interconnected World

30 billion RFID tags today (1.3B in 2005)

12+ TBs of tweet data every day

4.6 billion camera phones worldwide

? TBs of data every day

100s of millions of GPS-enabled devices sold annually

2+ billion people on the Web by end 2011

25+ TBs of log data every day

76 million smart meters in 2009, 200M by 2014

What does Big Data trigger?

From Big Data and the Web: Algorithms for Data Intensive Scalable Computing, Ph.D Thesis, Gianmarco

Types of tools typically used in a Big Data scenario

Where is the processing hosted?
Distributed server/cloud
Where is the data stored?
Distributed storage (e.g., Amazon S3)
What is the programming model?
Distributed processing (MapReduce)
How is the data stored and indexed?
High-performance schema-free database
What operations are performed on the data?
Analytic/semantic processing (e.g., RDF/OWL)
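To make the MapReduce programming model a little more concrete, here is a minimal single-machine sketch of its three steps (map, shuffle, reduce) applied to word counting. It only illustrates the idea and does not use the API of Hadoop or any other framework; the function names and sample documents are assumptions made for the example. In a real deployment, the map and reduce steps would run in parallel across many machines.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse the values of one key into a single total."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence on a list of documents."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input: each string stands in for a file on a different node.
print(word_count(["big data is big", "data is everywhere"]))
```

Because each document is mapped independently and each key is reduced independently, both steps can be spread over as many machines as the data requires.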

When is dealing with Big Data hard?

When the operations on data are complex:

E.g., simple counting is not a complex problem.

Modeling and reasoning with data of different kinds can get extremely complex.

Good news with big data:

Often, because of the vast amount of data, modeling techniques can get simpler (e.g., smart counting can replace complex model-based analytics), as long as we deal with the scale.
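One way to picture "smart counting" is next-word prediction done purely by counting which word most often follows another, with no language model involved. The sketch below is an assumed illustration, not a method taken from the slides: the function names and the tiny sample corpus are hypothetical, and the approach only becomes useful when the counts come from a very large amount of text.

```python
from collections import Counter, defaultdict


def build_follow_counts(sentences):
    """Count, for every word, how often each other word follows it."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of word, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; real use would need counts from far more text.
corpus = ["big data is big", "big data matters", "big ideas"]
follows = build_follow_counts(corpus)
print(most_likely_next(follows, "big"))
```

The logic stays trivial. The hard part is the scale: counting over billions of sentences is what calls for distributed storage and processing.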

Time for thinking

What do you do with the data?

Let's take an example:

From application developers to video streamers, organizations of all sizes
face the challenge of capturing, searching, analyzing, and leveraging as
much as terabytes of data per second, too much for the constraints of
traditional system capabilities and database management tools.

Why Big-Data?

Key enablers for the appearance and growth of Big-Data are:

+ Increase in storage capabilities

+ Increase in processing power

+ Availability of data

THINK
