
BIG DATA

REPORT

Table of Contents:
Abstract
TOPIC 1
    Definition and some facts of BIG DATA
    Variety
    Velocity
    Volume
    Overall Diagram of 3V’s
TOPIC 2
    Advantages and Disadvantages of BIG DATA
TOPIC 3
    Dialogue with Consumers
    Re-develop your Products
    Perform Risk Analysis
    Keeping your data safe
TOPIC 4
    Car Makers (Toyota)
    Finance (Visa)
    Utilities (oil & gas) (Chevron Corporation)
    General Manufacturing (General Motors India Limited, GM)
    Policing (CBI)
    Retail and Marketing (Air Jordan)
Conclusion

Abstract
This report explains what Big Data is, the advantages and disadvantages of Big
Data, some things that you can accomplish with Big Data, the utilization of Big
Data, and a conclusion. The Utilization of Big Data part describes where the data
comes from, what organizations can do with the data, and how this benefits them.

The conclusion discusses what the future might be like with Big Data, what people
will be doing when everything generates data, and finally what I want to do with
Big Data.

TOPIC 1

Definition and some facts of BIG DATA:

At first, company employees entered data into computer systems.
Then came a second generation, in which we as online users started entering our
own data into social networking sites.
Now a third generation has arrived, in which machines in companies and factories
automatically enter data into computer systems.
Overall, BIG DATA is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database management tools or
traditional data processing applications.
Big data is a popular term used to describe the exponential growth and availability
of data, both structured and unstructured.

In Big Data there are 3Vs. The 3Vs are Big Volume, Big Velocity and Big Variety.

These are the defining properties and the dimensions of big data.

Volume refers to the amount of data.

Variety refers to the number of types of data.

Velocity refers to the speed of the data processing.

Big Volume: Large amounts of data, analyzed either with simple (SQL) analytics or
with complex (non-SQL) analytics.

Big Velocity: Data arrives so fast that handling it is like drinking from a fire hose.

Big Variety: Large number of diverse data sources to integrate.

SQL stands for Structured Query Language.

SQL is a standardized query language for requesting information from a database.

SQL was first introduced in a commercial database system in 1979 by the
Oracle Corporation.

Historically, SQL has been the favorite query language for database management
systems running on minicomputers and mainframes.
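
As a minimal sketch of what "requesting information from a database" looks like, the following Python snippet runs a SQL query against a temporary in-memory database using the standard-library sqlite3 module. The customers table and its rows are hypothetical, invented purely for illustration.

```python
import sqlite3


def customers_in_city(city):
    """Return the names of customers in `city`, sorted alphabetically."""
    conn = sqlite3.connect(":memory:")  # temporary in-memory database
    try:
        # Hypothetical table and rows, for illustration only.
        conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
        conn.executemany(
            "INSERT INTO customers (name, city) VALUES (?, ?)",
            [("Asha", "Delhi"), ("Ben", "London"), ("Ravi", "Delhi")],
        )
        # The SQL query itself: ask for one column, filtered and ordered.
        rows = conn.execute(
            "SELECT name FROM customers WHERE city = ? ORDER BY name",
            (city,),
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(customers_in_city("Delhi"))
```

The query states what information is wanted (names of customers in one city) and leaves it to the database system to work out how to find it.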

Big data is a buzzword, or catch-phrase, used to describe a massive volume of
both structured and unstructured data that is so large that it's difficult to process
using traditional database and software techniques. In most enterprise
scenarios the data is too big or it moves too fast or it exceeds current processing
capacity.

“Big data is a term describing the storage and analysis of large and/or complex
data sets using a series of techniques including, but not limited to: NoSQL,
MapReduce and machine learning.”
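
To make the MapReduce idea mentioned in the quote more concrete, here is a small single-machine sketch in Python that counts words. It only imitates the three usual stages (map, shuffle, reduce); the function names are chosen for illustration and do not belong to any real MapReduce framework.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values of each key to get a count per word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    pairs = []
    # On a real cluster each document could be mapped on a different machine;
    # here the documents are simply processed one after another.
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big volume"]))
```

Because each document is mapped independently and each word is reduced independently, the work can in principle be spread over many machines, which is why the technique suits very large data sets.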

Variety
Unstructured Data- Refers to information that either does not have a pre-defined
data model or is not organized in a predefined manner. Unstructured information
is typically text-heavy. In other words, unstructured data sits at the other end of
the spectrum from structured data. It might be in any form: text, audio or video.
We cannot tell what the data means just by looking at it unless we apply human
understanding to it.

Examples of Unstructured Data

Book

Story

Heavy text

audio

video

RSS Feeds

Word documents

Excel Spreadsheets

Email messages

Structured Data- Data that resides in a fixed field within a record or file is called
structured data. This includes data contained in relational databases and
spreadsheets. Structured data has the advantage of being easily entered, stored,
queried and analyzed.

Examples of Structured Data:

Census records (birth, income, employment, place etc.)

Library Catalogues (date, author, place, subject, etc.)

Phone numbers (and the phone book)

Economic data (GDP, PPI, ASX etc.)

XML-TEI (bringing structure to the text through tagging particular elements,
like versions of the word “canal” in 17th-century Dutch)

Databases

Data warehouse

Enterprise systems (CRM, ERP, etc)

Relational Data- Relational data is data that speaks for itself; typically this is
the standard fare for data warehouses. It is extracted from ERP and other
operational systems. We already know what the data means and what its
structure is.

Semi-structured Data- Semi-structured data is a form of structured data that does
not conform to the formal structure of data models associated with relational
databases or other forms of data tables.

Examples of Semi-structured Data:

 Web pages
 Information integration
 XML
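
The difference between structured and semi-structured data can be sketched with a short Python example. The contact records below are made up for illustration: the CSV text has the same fixed fields in every record, while the JSON text lets each record carry different or nested fields.

```python
import csv
import io
import json

# Structured: every record has the same fixed fields, as in a relational table.
STRUCTURED_CSV = "name,phone\nAsha,555-0101\nBen,555-0102\n"

# Semi-structured: records describe themselves, and fields may be
# missing, extra, or nested (all values here are hypothetical).
SEMI_STRUCTURED_JSON = """
[
  {"name": "Asha", "phones": ["555-0101", "555-0199"]},
  {"name": "Ben", "email": "ben@example.com"}
]
"""


def read_structured(text):
    """Parse CSV text into a list of records with identical fields."""
    return list(csv.DictReader(io.StringIO(text)))


def read_semi_structured(text):
    """Parse JSON text into a list of records whose fields may differ."""
    return json.loads(text)


def field_names(records):
    """Return the sorted field names used by each record."""
    return [sorted(record.keys()) for record in records]


if __name__ == "__main__":
    print(field_names(read_structured(STRUCTURED_CSV)))
    print(field_names(read_semi_structured(SEMI_STRUCTURED_JSON)))
```

Every CSV record reports the same two fields, whereas the two JSON records report different ones, which is why semi-structured data does not fit neatly into a fixed table.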

Velocity
Velocity Rates

Real Time (Fastest)

Near Real Time

Periodic

Batch (Slowest)

Real Time- A real-time big data analytics platform delivers ultra-fast, interactive
analytical results with sub-second response times.

Batch- Batch processing collects data and processes it in groups at scheduled
times rather than as a continuous stream, so it is slower than real time.

Benefits of Batch Processing:

It can shift the time of job processing to when the computing resources are
less busy.
It avoids idling the computing resources with minute-by-minute manual
intervention and supervision.
By keeping a high overall rate of utilization, it amortizes the cost of the
computer, especially an expensive one.
It allows the system to use different priorities for batch and interactive work.

Rather than running one program multiple times to process one transaction
each time, batch processes will run the program only once for many
transactions, reducing system overhead.
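
The last benefit above can be illustrated with a small Python sketch. The Program class is a hypothetical stand-in for any program that pays a fixed startup cost each time it runs; the example simply counts how many startups each approach needs.

```python
class Program:
    """Hypothetical program that pays a fixed startup cost on every run."""

    def __init__(self):
        self.startups = 0

    def run(self, transactions):
        self.startups += 1        # overhead paid once per run
        return sum(transactions)  # the actual work: total the amounts


def process_one_at_a_time(program, transactions):
    """Run the program once per transaction."""
    total = 0
    for transaction in transactions:
        total += program.run([transaction])
    return total


def process_as_batch(program, transactions):
    """Run the program once for the whole batch of transactions."""
    return program.run(transactions)


if __name__ == "__main__":
    amounts = [10, 20, 30]
    single, batch = Program(), Program()
    print(process_one_at_a_time(single, amounts), single.startups)
    print(process_as_batch(batch, amounts), batch.startups)
```

Both approaches produce the same total, but the batch run pays the startup overhead once instead of once per transaction.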

Volume
Volume refers to the amount of data, measured in units such as KB, MB, GB, TB
and PB.

Volume can be counted in terms of:

Records
Transactions
KB, MB, GB, TB, PB
Tables, Files

We currently see exponential growth in data storage, as data is now more than
just text. We can find data in the form of videos, music and large images on
our social media channels. It is very common for enterprises to have storage
systems holding terabytes and petabytes. As the database grows, the applications
and architecture built to support the data need to be re-evaluated quite often.
Sometimes the same data is re-evaluated from multiple angles, and even though the
original data is the same, the newfound intelligence creates an explosion of data.
Big volume indeed represents Big Data.
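
To give a feel for the units mentioned in this section, the following Python sketch converts a raw byte count into the largest fitting unit. It assumes decimal units, where each step (KB, MB, GB, TB, PB) is 1000 times the previous one; some systems use 1024 instead.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000.

    Assumes decimal units (1 KB = 1000 bytes). Values beyond the
    petabyte range are still reported in PB.
    """
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


if __name__ == "__main__":
    print(human_readable(1500))
    print(human_readable(2 * 10**15))
```

Each unit is a thousand times larger than the one before it, so a petabyte is a million gigabytes.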

Overall Diagram of 3V’s

TOPIC 2

Advantages and Disadvantages of BIG DATA:

Advantages:

Data mining makes it easier to find correlations.

Analysis is more calculated now, so accuracy is higher.

Data is now combined into one big mass, which allows links to be found.

For example, a company with decades of information can make use of Big Data
and data analysis to create competitive advantages and open new business
opportunities.

Big Data started because companies were finding it hard to manage all their data.

It creates new growth opportunities and lots of jobs.

Disadvantages:

Big risks to security and privacy.

It is expensive: you need to spend a lot to get it working.

A lot of analysis is required: uncovering patterns, applying algorithms, and
finding connections and relationships.

Specialized analysts are still needed, and it is hard to find the right skill set.

TOPIC 3
Things that you can accomplish with BIG DATA

Dialogue with Consumers-

 Today’s consumers are a tough nut to crack. They look around a lot before
they buy. You want to persuade customers to buy your products.

 Big Data allows you to profile these increasingly vocal and fickle little ‘tyrants’
in a far-reaching manner so that you can engage in an almost one-on-one,
real-time conversation with them. This is not actually a luxury. If you don’t
treat them the way they want to be treated, they will leave you in the blink of
an eye.

Re-develop your Products-

 Big Data can also help you understand how others perceive your products so
that you can adapt them.

 Analysis of unstructured social media text allows you to uncover the
sentiments of your customers and even segment those in different
geographical locations or among different demographic groups.
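
As a rough illustration of segmenting customer sentiment by location, the Python sketch below scores short posts with a tiny word list and totals the scores per region. The word lists and sample posts are hypothetical, and real sentiment analysis relies on far richer language models than simple word counting.

```python
from collections import defaultdict

# Hypothetical word lists, for illustration only.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "poor"}


def score(text):
    """Count positive words minus negative words in one post."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def sentiment_by_region(posts):
    """Total the sentiment scores per region from (region, text) pairs."""
    totals = defaultdict(int)
    for region, text in posts:
        totals[region] += score(text)
    return dict(totals)


if __name__ == "__main__":
    sample_posts = [
        ("Delhi", "Love the new shoes, great fit!"),
        ("London", "Bad sole, poor grip."),
        ("Delhi", "Good value"),
    ]
    print(sentiment_by_region(sample_posts))
```

A positive total suggests customers in that region are broadly happy with the product, while a negative total points to a region where it may need to be adapted.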

Perform Risk Analysis-

 Success depends not only on how you run your company. Social and
economic factors are crucial for your accomplishments as well. Predictive
analytics, fueled by Big Data, allows you to scan and analyze newspaper
reports or social media feeds so that you always keep up to speed on
the latest developments in your industry and its environment.

 Detailed health-tests on your suppliers and customers are another benefit
that comes with Big Data. This will allow you to take action when one of them
is at risk of defaulting.

Keeping your data safe-

 You can map the entire data landscape across your company with Big Data
tools, thus allowing you to analyze the threats that you face internally. You
will be able to detect potentially sensitive information that is not protected in
an appropriate manner and make sure it is stored according to regulatory
requirements.

TOPIC 4
Utilization of BIG DATA:
Big Data is used in many fields like:

Car Makers (Toyota):

 Fault Logging and cost predictions- Car makers place hundreds of sensors
on components around the car, which constantly log data on performance and
faults. All of this data can be used to re-engineer designs for more efficient
products and to predict what strain warranty repairs are likely to place on
cost and staffing.

Finance (Visa):

 B2B supplier profiling- Finance professionals can use big data to check on
the ‘health’ of their suppliers and business partners. They can monitor a
variety of indicators including when creditors pay their bills and whether there
is any change.

 Fraud detection- Companies like Visa are using big data to create fraud
detection models, which can flag up potential fraudsters.
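
One very simple way to flag unusual transactions is to look for amounts that lie far from a cardholder's average. The Python sketch below does this with a z-score. It is only an illustration of the general idea, not the method used by Visa or any other company, and the threshold of 2.0 is an arbitrary assumption.

```python
from statistics import mean, pstdev


def flag_unusual(amounts, threshold=2.0):
    """Return the amounts lying more than `threshold` standard deviations
    from the mean of all the amounts.

    The default threshold is an assumption chosen for illustration.
    """
    average = mean(amounts)
    spread = pstdev(amounts)  # population standard deviation
    if spread == 0:
        return []  # all amounts are identical, so nothing stands out
    return [a for a in amounts if abs(a - average) / spread > threshold]


if __name__ == "__main__":
    history = [20, 25, 22, 19, 24, 21, 23, 500]
    print(flag_unusual(history))
```

A flagged amount is not proof of fraud; it only marks a transaction that is worth a closer look.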

Utilities (oil & gas) (Chevron Corporation):

 Asset monitoring- As with the machines in manufacturing plants, the utilities
companies use big data to keep track of all of their assets spread across a
country, continent or the globe. This enables them to fix any broken asset
(such as a sewage cleansing plant, a leaking pipe or a gas pump), perform
pre-emptive running maintenance or isolate areas in which repair actions
have been ineffective.

General Manufacturing (General Motors India Limited, GM):

 Simulations- Manufacturers can take real data from their products on the
market and then run simulations based on what would happen if they
changed one particular component or design aspect. They can then find
ways to make the product cheaper, more reliable or more environmentally
friendly. The Formula 1 racing teams are particularly adept in this area, as
are advanced aerospace companies.

 Expanded product design modeling- Similarly, with new big-data enabled
computer aided design programs, product designers can substitute
components or materials from huge databases and then access in-depth
information on how this affects the final product, including the ramifications
on cost, production processes, environmental effects, legislative
requirements, supply chain and so on.

Policing (CBI):

 Suspect tracking- By combining CCTV images, facial recognition software,
travel trends and identifiers on travel cards, police forces can capture
criminals by automatically linking people to their likely destinations on buses
and metro systems. This allows police to catch those that they miss at the
scene of the crime and also to control arrest statistics, meeting targets for
arrests in one London borough, for instance, as needed.

Retail and Marketing (Air Jordan):

 Mood mapping- Retailers use feeds from social networks to build an
understanding of how their products and company reputation are seen among
the public. With the constant streams of opinions from Facebook, Twitter,
Google+ and the like, companies are able to cheaply and quickly gather large
samples of customer opinion.

Title | Where | Needs | Benefits
--- | --- | --- | ---
1. Car Makers (Toyota) | From the factories and from the sensors to the data center (headquarters). Type of Data: What condition the car is in. | Safety and quality analysis. | Feedback from design.
2. Finance (Visa) | Wherever they buy. Type of Data: What they buy, where they buy, when they buy, how much they buy it for. | Detect fraud. Customer’s behavior. | Personal recommendation.
3. General Manufacturing (GM) | Several branches, with headquarters in Gurgaon. Type of Data: What condition the motor is in. | Safety and quality analysis. | Awareness and indication on what to fix.
4. Policing (CBI or CID). CBI: Central Bureau of Investigation (India). CID: Crime Investigation Department (India). Both are treated as the same here. | Several police departments to the main CBI headquarters located in Delhi. They mainly track people by their cellphones. Type of Data: Detail of the person who they are tracking. | Detecting a person’s behavior and actions. | Gives awareness of what that person is going to do next. What is their next plan?
5. Utilities, oil & gas (Chevron Corporation) | From the machines in the manufacturing plants to the data center (headquarters). Type of Data: What is going on in the manufacturing plant. | Keep track of what is going on in the manufacturing plants, like broken pipes, leakage, etc. | This gives them feedback from designs so they know how to improve the construction of the manufacturing plant, because that is their main source of oil and gas.
6. Retail and Marketing (Air Jordan). Air Jordan is a basketball shoe brand in America. | From social media networking sites to the headquarters of the company (data center). Type of Data: Customers’ opinions or feedback on the product. | Customers’ behavior (like it or not). Helps to find out consumers’ opinions and feelings. Feedback on their brand. | This gives them feedback on what the customers are thinking about the product. Gives feedback from audiences to improve their product.

Conclusion

What would the future be like with Big Data?

As larger and more complex data sets emerge, it becomes increasingly difficult to
process Big Data using on-hand database management tools or traditional data processing
applications. To get the most from their significant investments in datacenter resources,
companies must tackle Big Data with “Big Workflow,” a term coined by Adaptive
Computing to describe a comprehensive approach that maximizes datacenter resources and
streamlines the simulation and data analysis process.

What could you do with Big Data that you couldn’t do before?

With Big Data one of the major things that we can do is to predict the future. In today's world
we are surrounded by predictions. For instance, during political elections the main focus of
the media and the public is not on the differences between the candidates' positions, but
rather on the "horse race" aspect of the competition. Issues at stake are secondary compared
to the main question: who is going to win? So with these data trends that we receive we can
predict the future.
