
The Big Book of

Big Data
A field guide for Industry-based Big
Data Opportunities

Oracle Inc.
1st Edition

What is Big Data
Big Data & Analytics
Financial Services Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Media Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Healthcare Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Retail Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Consumer Goods Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Telecommunications Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Utilities Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Research Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Automotive Industry
  Big Data Use Cases
  Industry Solutions
  Data Sources
Engineering & Construction
  Big Data Use Cases
  Industry Solutions
  Data Sources
Oil and Gas Industry
  Big Data Use Cases
  Data Sources

Foreword

What is Big Data


We've all heard the term Big Data, but what does it really mean? No doubt you can think
of many examples of organizations dealing in large amounts of data and able to handle this with
relative competence if not ease. NASA collects and processes vast amounts of satellite imagery
daily. Apple just announced that 7 trillion push notifications are sent to iOS devices daily. So
when does big data become Big Data?

A simple definition would be that data becomes Big Data, or rather a Big Data Problem,
when the volume, velocity, and/or variety of the data exceeds the abilities of your current IT
systems to ingest, store, analyze, or otherwise process it.
This simple definition hints at some not-so-simple challenges. Certainly, there are solutions
for handling large quantities of data. Networking and bus technologies provide a transit system
for moving data rapidly. But what happens when that data is messy, a mix of structured and
unstructured data that doesn't fit neatly into defined data structures, AND it's high volume,
AND it needs to be processed quickly? Think of data from millions of sensors on electricity
networks or manufacturing lines or oil rigs. Identifying deviations from past trends in this data
(and whether the deviations are safe or unsafe deviations) in real time can help avoid a
power outage, reduce waste and defects, or even avoid a catastrophic oil spill.
This type of problem is occurring in almost all industries today. The volume of data is
growing too fast for traditional analytics because the data is becoming richer (each element is
larger or more varied), more granular (time intervals decreasing from months to days or days to
minutes), or just needs to be processed much faster than it used to.
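As a rough illustration of identifying deviations from past trends in a sensor stream, the sketch below flags readings that fall far from a rolling mean. The window size, the threshold, the sample readings, and the function name `flag_deviations` are assumptions chosen for illustration; a production system would also need to decide whether each deviation is safe or unsafe.

```python
from collections import deque
from statistics import mean, pstdev


def flag_deviations(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    history = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        history.append(value)


# Hypothetical voltage readings containing one spike.
voltages = [230.0, 230.5, 229.8, 230.2, 230.1, 229.9, 260.0, 230.3]
anomalies = list(flag_deviations(voltages))
```

Because the generator only keeps the last few readings in memory, the same logic can in principle be applied to a continuous feed rather than a stored list.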

The volume, velocity and variety problems all contribute to ever larger data storage and
processing requirements, but the final component, value, is where investments are made.
Companies need to find value in storing and processing the data in order to justify capturing
and keeping it in the first place, and this is a fundamental shift for many organizations. It can
also be a huge competitive advantage. Much of what this book focuses on is providing you with
that trick: convincing a company that in their industry there is competitive advantage to using
this data.

Oracle commissioned a study, and this was one of the surprising answers: only 14% of the
interviewed executives have not been considering unstructured data as an important part of their
business. Because of this, many companies and organizations are starting to realize that their
traditional IT systems are not prepared to deal with the rapid growth of Big Data that the very
same IT systems have enabled.
We have asked each of our industry big data experts to describe for you the big data
challenges that have surfaced in each industry. Each chapter contains descriptions of use cases in
the customer's language, and lists of common sources of data involved. Ask your customer
which data sources they have available; even if they already capture the data, there are likely
additional uses for it like the ones contained in this book. We've also given guidance on
whom to speak to at the customer and the opportunity they would most likely be interested in.
This book will enable you to have a conversation about Big Data that's specific to your
customer's industry, and link out to content for an executive discovery workshop.

Chapter 1

Big Data & Analytics


Big Data offers a new style of data analysis which is different from traditional Business
Intelligence. Traditional business intelligence projects required the business users to know what
kinds of questions they want to ask beforehand. The questions they wanted to ask drove the data
model of the Data Warehouse and how the data was stored. The data model also drove what data
was collected and by what mechanism. Creating this enterprise data model could be a lengthy
process and in many cases resulted in a system that is slow to adapt to changes within the
business.
Now, with Big Data, data analysis is happening bottom up. Organizations are collecting as
much data as they can without knowing beforehand exactly what questions they are going to ask.
This means it is no longer practical to transform every piece of data into the standard data model
of the corporate data warehouse at the point of data collection. Instead, data is stored in the form
in which it was originally captured and only given an appropriate structure by the analysis
process actually using it. This much more flexible approach leads to a more dynamic style of
data analysis that can react much more quickly to the rapid changes within a business.
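A minimal sketch of this idea, often called schema-on-read, is shown here. The raw records, field names, and the helper `read_with_schema` are hypothetical; the point is only that the records are kept as captured and a structure is imposed by the analysis that reads them.

```python
import json

# Raw events kept exactly as captured; fields vary from record to record.
raw_events = [
    '{"source": "twitter", "user": "a1", "text": "great service"}',
    '{"source": "sensor", "device": "m-17", "temp_c": 71.5}',
    '{"source": "sensor", "device": "m-18", "temp_c": 64.0, "rpm": 1200}',
]


def read_with_schema(lines, source, fields):
    """Parse raw lines and project only the fields this analysis needs."""
    for line in lines:
        record = json.loads(line)
        if record.get("source") == source:
            yield {name: record.get(name) for name in fields}


# Structure is applied only now, by the analysis that uses the data.
temperatures = list(read_with_schema(raw_events, "sensor", ["device", "temp_c"]))
```

A different analysis could read the same raw lines with a different field list, without any change to how the data was stored.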

"The real innovation here is that we can ask questions and get the answer
back before we have forgotten why we asked the question in the first
place."
- Hilary Mason, Chief Scientist, Bit.ly
The next question many people throw at this idea of providing structure to the data "just in
time" is "why don't I do that with my whole data warehouse?" The simple answer is that it is
cost prohibitive to structure all data the same way a data warehouse requires. Your customers
already know how long data warehousing projects take, and that is with data that is already
structured but needs transformation. Twitter feeds and sensor data don't have a structure to begin
with (at least not a data warehouse compatible structure). Have a few queries provide dynamic
structure to the data in a data warehouse and it will work just fine. Try to do that with thousands
or tens of thousands and it becomes cost prohibitive. So traditional data warehouses still have
their place, and in fact Oracle's analytics solution handles the whole lifecycle from Information
Discovery all the way through traditional Business Intelligence. The key role of the Big Data
Appliance is to reduce the data to the point that it can be put in a data warehouse with good
structure so it can be compared against the rest of the customer's data.

The architecture for accomplishing this new data paradigm is extremely important.
Some companies are using pieces from a variety of vendors and open source components to
accomplish this new data lifecycle, but only Oracle provides an end-to-end solution for big data,
from acquiring the data and finding new insights all the way to making repeatable
decisions and scaling the analytics out to the organization at large. This architecture is very
important to making use of big data, because the analysis of data is really on a continuum, not
a set of nicely isolated use cases. Data in your RDBMS (Oracle relational database) will be useful
as part of a big data analysis, and vice versa, so having a solution that lets data move easily
between the parts of the architecture is important. Corporate data warehouses don't become
obsolete in this model; they become more important as the business finds new analyses over time
that are new imperatives to running the business.

The last bit there is probably the best takeaway point: figure out where your customer is on
their big data lifecycle. Do you need to convince them there is something worthwhile to
Explore? Do they need to be convinced of large areas to expand (like new data sources), but
already have some Hadoop or Exadata boxes on which they are using data mining or similar
techniques? Do they already have custom adapters and move big data aggregations from
a Hadoop platform into their data warehouses, and need to be shown the value of scale of
truly exploiting a common architecture and platform?

The final important point when examining this problem is that Oracle has a large number of
components that fit the big data problem, and we sell best when we sell them together. BI
applications, BI foundation, the Exalytics platform it runs on, Data mining and Exadata, the Big
Data Connectors and the Big Data appliance itself all have something to contribute to the
solution and the conversation. Don't forget about the real-time components covered by the
Fusion Middleware platform with Complex Event Processing and Real Time Decisions for
streaming analysis and recommendations, and the Exalogic platform they run on. Add in our
partners, GBU industry applications and IBU industry solutions and you have a team at your
fingertips ready to help you sell this vision.

Chapter 2

Financial Services Industry


The Financial Services Industry is amongst the most data-driven of all. The regulatory
environment that commercial banks and insurance companies operate in requires these
institutions to store and analyze many years of transaction data, and the pervasiveness of
electronic trading has meant that Capital Markets firms both generate and act upon hundreds of
millions of market-related messages every day. For the most part, financial services firms have
relied on relational technologies coupled with business intelligence tools to handle this
ever-increasing data and analytics burden. It is, however, increasingly clear that while such
technologies will continue to play an integral role, new technologies, many of them developed
in response to the data analytics challenges first faced in e-commerce, internet search and other
industries, have a transformative role in data management within this industry.

BIG DATA USE CASES


Data-driven customer insight and Product Innovation
Banks are looking at ways to offer new targeted services to their customers, in order to
increase revenue, reduce churn and increase customer engagement. These require the
integration of external sources of information to analyze customers' behavior and preferences,
and even to detect life events. The availability of troves of web data about almost any
individual, including spending habits, risky behavior, etc., provides valuable information that
can target service offerings with a great level of sophistication. Add location information
(available from almost every cell phone) and you can achieve almost surgical customer targeting.
For example, integrating external customer data requires resolution mechanisms that use
multiple factors to match the identity of an external individual (Facebook account, etc.) with that
of the bank's customer. Real-time targeting of products and discounts requires the integration of
information across various channels, coupled with real-time decisioning software (RTD).
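To make the idea of multi-factor identity resolution more concrete, here is a small sketch that combines name similarity, city, and email into a single match score. The fields, weights, threshold, sample records, and the helpers `match_score` and `best_match` are all illustrative assumptions, not a description of how any Oracle product resolves identities.

```python
from difflib import SequenceMatcher


def match_score(customer, external):
    """Combine several weak signals into one score between 0 and 1."""
    name_similarity = SequenceMatcher(
        None, customer["name"].lower(), external["name"].lower()
    ).ratio()
    same_city = 1.0 if customer["city"].lower() == external["city"].lower() else 0.0
    same_email = 1.0 if customer["email"].lower() == external["email"].lower() else 0.0
    # Weights are illustrative assumptions, not tuned values.
    return 0.5 * name_similarity + 0.2 * same_city + 0.3 * same_email


def best_match(external, customers, threshold=0.8):
    """Return the most similar customer, or None if nothing is close enough."""
    best = max(customers, key=lambda c: match_score(c, external))
    return best if match_score(best, external) >= threshold else None


# Hypothetical bank records and one external (social media) profile.
bank_customers = [
    {"id": 1, "name": "Jane Smith", "city": "Austin", "email": "jane@example.com"},
    {"id": 2, "name": "Robert Brown", "city": "Denver", "email": "rb@example.com"},
]
profile = {"name": "jane smith", "city": "austin", "email": "jane@example.com"}
matched = best_match(profile, bank_customers)
```

Requiring several factors to agree before accepting a match is one way to limit the risk of attaching external data to the wrong customer.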
Oracle has a number of market leading technologies that enable banks to implement an
optimized cross-channel strategy. Oracle worked with a large US Bank that handles over 400
million interactions across channels each month to implement a customized customer treatment
model. The solution utilized Oracle Coherence for the underlying transaction integration and
Oracle Real Time Decisions software (RTD) to generate recommendations, which are pushed in
real time to smartphones, Facebook, website and call-center channels.
Retail Banks, Investment Banks and Insurance companies are all looking for new methods
and techniques to develop new products and strategies. But current technology is not well suited
to support the ultimate needs for product innovation that must consider data found outside
traditional transaction systems and systems of record (e.g. customer interaction weblogs, social
media and news). The challenge is how to store and analyze this massive amount of
unstructured data in an efficient and cost-effective way.

Sentiment Analysis and Brand Reputation


Whether looking for broad economic indicators, market
indicators, or sentiments concerning a specific organization or
its stocks, there is obviously a trove of data on the web to be
harvested, available from traditional as well as social media
sources. While keyword analysis and entity extraction have
been with us for a while, and are available from several data
vendors, the availability of social media sources is relatively
new, and has certainly captured the attention of many
institutions looking to gauge public sentiment.
For example, sentiment analysis is becoming so popular that a couple of hedge funds are
basing their entire strategies on trading signals generated by Twitter feed analytics. While this is
an extreme example, most firms at this point are using sentiment analysis to gauge public
opinion about specific companies, markets or the economy as a whole.
Financial institutions also want to ensure their own reputation and brand are held in high
regard, especially by key customers such as high-net-worth individuals. This includes overall
sentiment about the firm as well as customer satisfaction with specific brands and products.
Several offerings are available from data vendors to provide aggregate sentiment analysis
toward a company or as a broader indicator, but when it comes to gauging the opinions of key
customers towards specific products and services, the bank has to do a lot more internal analysis
to match these customers' internal data (about product usage and profile) with external sentiment
information.

On-Demand Risk Analytics

On-Demand risk, especially at a trading desk level, is now the desired goal of global banks.
The objective is not only faster measurement and reporting of risk, but also measurement across
asset classes.
Aggregation of global positions, pricing calculations, and Value at Risk (VaR) all fall within
the realm of Big Data. This is due to the mounting pressure to speed these calculations up well
beyond the capacity of current systems, and also to the need to deal with ever-growing volumes
of data. While firms have adopted compute grid technologies to enable faster risk computations,
the feeding of data into these grids has become a bottleneck. Technologies such as Oracle
Coherence complement compute grid technologies, and enable faster calculations by allowing
real-time access to in-memory positions data and by allowing MapReduce-style parallelization.
Such an architecture helped a global bank reduce VaR calculation time from 15 [...] simulations
was increased almost 25-fold.
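The following sketch shows, under simplifying assumptions, how a VaR calculation can be split into a map step (profit and loss per position) and a reduce step (summing across positions). It uses a basic historical-simulation approach with made-up positions and scenario returns; the function names are hypothetical and this is not the method used by the bank mentioned above. In a real grid, the map step would be distributed across many workers.

```python
from functools import reduce


def position_pnl(position, scenarios):
    """Map step: P&L of one position under each scenario return."""
    return [position["value"] * s[position["asset"]] for s in scenarios]


def add_vectors(a, b):
    """Reduce step: sum two P&L vectors scenario by scenario."""
    return [x + y for x, y in zip(a, b)]


def value_at_risk(positions, scenarios, confidence=0.95):
    """Historical-simulation VaR: the loss at the given confidence level."""
    per_position = map(lambda p: position_pnl(p, scenarios), positions)
    portfolio_pnl = reduce(add_vectors, per_position)
    losses = sorted(-pnl for pnl in portfolio_pnl)
    index = min(int(confidence * len(losses)), len(losses) - 1)
    return losses[index]


# Hypothetical portfolio and four scenario returns per asset.
positions = [{"asset": "A", "value": 1000.0}, {"asset": "B", "value": 500.0}]
scenarios = [
    {"A": 0.01, "B": -0.02},
    {"A": -0.03, "B": 0.01},
    {"A": 0.02, "B": 0.02},
    {"A": -0.01, "B": -0.04},
]
var_75 = value_at_risk(positions, scenarios, confidence=0.75)
```

Each position's P&L vector is independent of the others, which is what makes the calculation a natural fit for parallel execution.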
For Enterprise Risk Management, the added challenge is data integration from many
disparate systems. (It is not uncommon, for example, for data to be sent from source systems as
flat files.) The Oracle Big Data Appliance and Big Data Connectors enable ETL-style processes to
be parallelized on Hadoop before the data is loaded into an enterprise risk warehouse. For the
risk warehouse itself, the Oracle Exadata machine with in-database analytics virtually eliminates
the performance bottlenecks typically associated with running SQL processes on database servers.
Enterprise risk software, including Oracle Financial Services Liquidity Risk Management,
benefits from this capability. In a benchmark, Oracle Financial Services Liquidity Risk
Management running on Oracle Exadata Database Machine calculated business-as-usual
liquidity gaps for 371 million cash flows across 66 million accounts in just 69 minutes. After
applying modified behavior assumptions to simulate adverse market conditions, stressed
liquidity gaps were calculated in only 10 minutes. With the ability to execute an individual stress
test run in mere minutes, institutions can refine their scenarios to simulate any impact on
business-as-usual liquidity gaps and immediately assess the effects of a given counterbalancing
strategy.
Transaction Cost Analysis (TCA), which measures actual order execution performance against
established benchmark metrics, is another excellent example. TCA, initially adopted mainly as a
"check-the-box" tool for compliance with best execution regulations, has now found wide
acceptance outside the compliance department. TCA is now used to assess broker performance
internally, to identify outlier trades and to measure performance of algorithms sold by the
sell-side. A number of large global banks have implemented trade data warehouses using Oracle
Exadata technology (BNP Paribas is a reference customer). The Big Data Appliance (BDA) is a
new technology, but one that can complement the Exadata based trade warehouse. Using the
BDA will allow faster trade capture on HDFS, and faster processing (using Hadoop and R)
before processed data is loaded into Exadata for analysis.
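As a simple illustration of measuring execution performance against a benchmark, the sketch below compares an execution price with the volume-weighted average price (VWAP) of market trades and reports the difference in basis points. VWAP is only one of several possible benchmarks, and the trade data and function names here are assumptions for illustration.

```python
def vwap(trades):
    """Volume-weighted average price of (price, quantity) market trades."""
    total_quantity = sum(quantity for _, quantity in trades)
    return sum(price * quantity for price, quantity in trades) / total_quantity


def slippage_bps(exec_price, benchmark, side):
    """Signed cost versus the benchmark in basis points; positive is worse."""
    sign = 1 if side == "buy" else -1
    return sign * (exec_price - benchmark) / benchmark * 10_000


# Hypothetical market trades during the order's lifetime.
market_trades = [(100.0, 200), (101.0, 100), (99.0, 100)]
benchmark = vwap(market_trades)
buy_cost = slippage_bps(100.5, benchmark, "buy")
```

Running the same calculation per broker or per algorithm, over the full trade history, is the kind of workload that grows quickly with trade volume.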

Rogue Trading Detection


Related to the topic of finance and accounting, this use case may not be as common, but it is
considered more frequently as we are faced with the ever-increasing implications of rogue
trading. Deep analytics that correlate accounting data with position tracking and order
management systems can provide valuable insights that are not available using traditional data
management tools.
For example, in a couple of well-known cases (UBS and Société Générale), inconsistencies
between data managed by different systems could have raised red flags if found early on, and
might have prevented at least part of the huge losses incurred by the affected firms. Here too, a
lot of data needs to be crunched from multiple, inconsistent sources in a very dynamic way,
requiring a new technical approach to the analytics platform.
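A very small sketch of the kind of cross-system check described above: it compares positions reported by two systems and returns the keys where they disagree. The system names, keys, amounts, and the `reconcile` helper are hypothetical; real reconciliation also has to deal with differing identifiers, timing, and data formats.

```python
def reconcile(accounting, order_mgmt, tolerance=0.0):
    """Return {key: (accounting_value, order_mgmt_value)} for mismatches."""
    breaks = {}
    for key in sorted(set(accounting) | set(order_mgmt)):
        a = accounting.get(key, 0.0)
        b = order_mgmt.get(key, 0.0)
        if abs(a - b) > tolerance:
            breaks[key] = (a, b)
    return breaks


# Hypothetical positions keyed by (desk, instrument).
accounting = {
    ("desk-1", "FUT-A"): 1_000_000.0,
    ("desk-1", "FUT-B"): -250_000.0,
}
order_mgmt = {
    ("desk-1", "FUT-A"): 1_000_000.0,
    ("desk-1", "FUT-B"): -250_000.0,
    ("desk-1", "FUT-C"): 5_000_000.0,  # present in only one system
}
breaks = reconcile(accounting, order_mgmt)
```

A position that appears in one system but not the other is exactly the sort of inconsistency that could raise a red flag early.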

Investigation and Compliance (e-Discovery)


The recent increase in cases where financial institutions were found guilty of
misappropriating their customers' funds or misleading them as to the funds' allocation has
caused litigators and regulators to push for far greater scrutiny of information than
ever before. This scrutiny covers records of all the interactions associated with a financial
transaction such as a trade order, including emails, phone transcripts, text messages, contracts,
etc., all of which are unstructured in nature. Retention of these records has been mandated for
years, but the difficulty in associating them with the corresponding transactions has caused
regulators to look for a far greater degree of correlation between structured transaction records
and unstructured interaction records.
This obviously necessitates bringing disparate levels of structure into a holistic data
management platform, which can maintain their relationships and correlations. That is precisely
what the Oracle platform excels at, and any compliance executive should be interested in getting
in front of this type of analysis before regulators get involved in it.

Fraud Detection
Fraud detection involving card, debit, and wholesale payments is also quickly becoming a
big data problem, inasmuch as correlating data from multiple, unrelated sources has the
potential to catch more fraudulent activities earlier than current methods. Consider for instance
the potential of correlating Point of Sale data (available to any credit card issuer) with web
behavior analysis (either on the bank's site or externally), and potentially with other financial
institutions or service providers such as First Data or SWIFT, to detect suspect activities.
Payment providers have developed fraud detection tools that depend on massive datasets
containing not only financial details for transactions, but also IP addresses, browser information,
and other technical data that will help these companies refine models to predict, identify, and
prevent fraudulent activity. These enhance the traditional approaches to fraud prevention, which
are mostly based on sanctions lists and pre-defined rules.
Any compliance, fraud and security department in any financial institution should be
interested in applying new technologies to enhance the current Know Your Customer
initiatives, watch list screening, and the application of fundamental rules. Correlating
heterogeneous data sets has the potential to dramatically improve fraud detection, and
could also significantly decrease the number of false positives (e.g. using a card while traveling).

INDUSTRY SOLUTIONS
Link into the Oracle Industry Solutions database for conversation scripts and questionnaires
that can get your client thinking about their big data solutions.
The Financial Services solutions are at:
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/FinancialServices/Solution2/index.html?ssSourceNodeId=23441&ssSourceSiteId=ibu.
The Insurance solutions are at:
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/Insurance/Solution4/index.html?ssSourceNodeId=23456&ssSourceSiteId=ibu.

DATA SOURCES
Structured Internal Sources:

CRM Data

Marketing Plans

Unstructured and External Sources:

Call Center Logs

Syndicated and Retailer ePoint of Sale (ePOS) data

Social Listening Post data (enhanced with sentiment analysis)

Social Networking Data (unfiltered by a listening post)

Competitor Information

Zillow-type websites for home prices

Weather Forecasts and History

Government Census data

Economic Data, such as unemployment rates

ISP website surfing data

Chapter 3

Media Industry
BIG DATA USE CASES
Digital Advertising Sales
The old adage that "half of all advertising is effective, we just don't know
which half" no longer applies in the digital age. A fast-growing proportion
of advertising inventory is now sold in real time through online auctions
where media owners (publishers, web sites) trade detailed demographic
information about their users for performance-based advertising. In other
words, the Google model for advertising sales is appearing in mainstream
media. Data is now gold for media companies.
For example, a detailed understanding of audiences by socio-economic group,
demographics, content consumption patterns, likes and interests can be built up from account
details, content consumption logs, viewing data and social media interactions. This enables
tighter segmentation of consumers, enabling advertising sales managers to increase advertising
rates (CPMs) and win a greater share of online advertising. The data can be used directly by
advertising sales teams, or sold as an additional revenue source to advertising trading partners.
The same detailed audience data also allows advertising sales teams to offer targeted
addressable advertising to their buyers, where advertisers will pay a premium to reach known
specific demographic niches.
Advertising agencies are able to use the same types of data to create more effective content
(advertising creative), and plan more effective, cost-efficient campaigns when purchasing media
space on behalf of their brand clients.

Increasing subscribers
The same analysis of Big Data sources used to increase advertising rates can also be used to
produce and deliver more relevant content to users, target them with appropriate marketing
messages, and increase the number of users and paying subscribers using a media company's
services.

For example, a user's content consumption at a granular level can be tracked along with the
journey they take through content on PC, tablet and mobile devices. A detailed picture of that
individual's interests can be created, especially when combined with account data for registered
users, and external demographic, sentiment and social media data. Marketing heads can use this
data to target individuals with outbound emails and social media messages offering the most
relevant content, attracting them to use the media company's services and consume more
content.
More engaged users are more willing to pay to consume the media services they value.
Heads of circulation and subscription sales can use this data to help convert free grazing users
into paying subscribers by offering them a tailored subscription offer and bundle of services
through behavior-driven marketing.

Targeted Content Creation


The same understanding of users' interests and preferences can be used to commission,
produce and deliver content that is more relevant to users. Broadcast and publishing companies
traditionally rely on creative experience and intuition to choose what content to create and
publish, even though content is the single biggest cost for media companies.
Content consumed on connected devices like smartphones, tablets and connected set-top
boxes allows for detailed analysis of what consumers actually want to read or view, and how
they consume it, especially when linked to demographic, account and advertising data. For
example, if a viewer chooses one movie, what other movies does that same individual watch
later?
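The question above can be approached with a simple count of what each viewer watched after a given title. The sketch below assumes viewing histories are available as ordered lists per user; the data and the function name `watched_after` are made up for illustration.

```python
from collections import Counter


def watched_after(histories, title):
    """Count titles that viewers watched later than `title`."""
    counts = Counter()
    for history in histories.values():
        if title in history:
            later = history[history.index(title) + 1:]
            # Count each later title once per viewer.
            counts.update(set(later) - {title})
    return counts


# Hypothetical viewing histories, oldest first.
histories = {
    "u1": ["Alien", "Aliens", "Blade Runner"],
    "u2": ["Heat", "Alien", "Aliens"],
    "u3": ["Aliens", "Alien"],
}
follow_ups = watched_after(histories, "Alien")
```

Counts like these, aggregated over millions of viewers and linked to demographic data, are one input to commissioning decisions.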
Editors, heads of commissioning and content-budget holders can use this analysis to make
smarter decisions about where to invest their content budget. With huge pressure on margins in
most media companies, this helps maximize impact and ROI and reduce content flops.

Managing the Digital Supply Chain


As multiple content types are delivered over an increasing number of platforms, devices and
channels to reach a diverse set of consumers, managing and maximizing profitability is a
growing challenge.
Aggregating and analyzing multiple data sources enables media businesses to make informed
forecasts and decisions about costs, revenue and profitability by individual content asset, by
customer, and by each platform and channel partner.
Finance and operating officers manage an increasingly complex business, rising content costs
and fragmented sources of income. Effective analysis of supply chain data enables business
priorities to be set, and informed decisions to be made about new (analog and digital) products
and platforms. Informed decisions can be made about pricing, product bundling and the
most effective payment and advertising models for each service and platform.

INDUSTRY SOLUTIONS
The IBU Media Analytics and Content Personalization solution will be available shortly from
the IBU portal.
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-L-Z/MediaEntertainment/Solution1/index.html
The executive conversation script and discovery questionnaire will allow you to work with
media execs to focus in on their Big Data priorities and position Oracle Big Data, Analytics, BI,
Discovery and Real-Time Decisions solutions.

DATA SOURCES
These are examples of the type of data we commonly see used in the Media industry. Only a
fraction of this data typically ends up in a data warehouse and is available for analysis. What
questions could be answered if all of this data could be combined and analyzed?

Web logs and content consumption journeys

Blogs, social media and content recommendations

Cookies

Billing data

Content access logs

Purchase histories

Click-throughs

Order management

User Profiles and demographics

Digital TV return-path data

Device location

Mobile payments

Third-party channels and platforms

Advertising

Content licensing

Traditional media

Content metadata

Competitor content

CRM data

Chapter 4

Healthcare Industry
BIG DATA USE CASES
Remote Patient Monitoring
With the worldwide goal of reducing the costs of healthcare and improving patient outcomes,
many countries are looking to monitor patients more closely on a constant, real-time basis. The
monitoring can include in-home devices such as glucometers, weight scales, pedometers and
others. The Volume and Velocity of this data, as well as the real-time nature of the analysis
and action, necessitate a Big Data Solution.
For example, for patients suffering from a chronic
disease such as diabetes or congestive heart failure, the ability to monitor
the patient for weight gain, blood sugar levels and exercise attempts will
allow the care team to converse more appropriately with the patient. The
ability to extend the healthcare system into the home allows for a much
better quality of life for the patient, while at the same time giving more
visibility into the current health of the person.
The care team, composed of a Case Manager, Physician or Nurse, can proactively
contact the patient and provide suggestions to help improve the current condition of
concern, even recommending that the patient report to an Emergency Room for
immediate treatment if needed.
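As a rough illustration of the kind of rule a monitoring service might apply to incoming device readings, the sketch below flags rapid weight gain and high blood glucose so the care team knows whom to contact. The thresholds, field names and the `check_readings` function are hypothetical placeholders chosen for the example; they are not clinical guidance and not part of any Oracle product.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds for illustration only; real values come from the care team.
WEIGHT_GAIN_KG = 2.0        # gain over the look-back window that triggers an alert
WEIGHT_WINDOW_DAYS = 3
GLUCOSE_HIGH_MG_DL = 250


@dataclass
class Reading:
    patient_id: str
    day: date
    kind: str       # "weight_kg" or "glucose_mg_dl"
    value: float


def check_readings(readings):
    """Return a list of (patient_id, message) alerts for the care team."""
    alerts = []
    weights = {}  # patient_id -> list of (day, value) seen so far
    for r in sorted(readings, key=lambda r: r.day):
        if r.kind == "glucose_mg_dl" and r.value >= GLUCOSE_HIGH_MG_DL:
            alerts.append((r.patient_id, f"high glucose {r.value:.0f} mg/dL on {r.day}"))
        elif r.kind == "weight_kg":
            history = weights.setdefault(r.patient_id, [])
            # Compare with earlier weights inside the look-back window.
            recent = [v for d, v in history if (r.day - d).days <= WEIGHT_WINDOW_DAYS]
            if recent and r.value - min(recent) >= WEIGHT_GAIN_KG:
                gain = r.value - min(recent)
                alerts.append((r.patient_id, f"weight up {gain:.1f} kg by {r.day}"))
            history.append((r.day, r.value))
    return alerts
```

In practice the readings would arrive as a continuous stream rather than a list, but the per-patient rule evaluation would look much the same.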
Another example where real-time in-home devices can be used is for independent living. Many
countries are experiencing an ageing population, but that does not mean that the
population wants to give up the ability to live alone. Nor does living alone mean that there
are no people concerned about the well-being of the person. Having the ability to
unobtrusively monitor the person, with their permission, provides a level of safety by detecting
whether someone has fallen, not gotten out of bed, or has been missing meals.
Accountable Care Organizations (ACOs) or Service Providers will be interested
in providing the services needed to ensure that their customers are living independent and healthy
lives.

Healthcare Analytics
With the healthcare industry moving from a paper-based system to an online digital system
around the world, the usage of EMR (Electronic Medical Record) systems is on the rise.
Unfortunately, much of this data is locked in systems designed to treat patients in an episodic
fashion, and may not contain the full longitudinal health record of the patient. Harvesting this
data in its current format has proven to be difficult. With the maturing of some solutions based
on Big Data architectures, it is now possible to unlock and analyze this information.
The ability to review patient outcomes under different treatment plans
has long been a want and need of the medical research community.
The Volume and disparate nature of the stored data have long been
an issue in the industry.
The CMIO (Chief Medical Information Officer) or CRO (Chief Research
Officer) at many healthcare organizations is very interested in accessing the
scientific evidence to validate that the treatment plans being utilized are actually effective,
efficient and delivered at the best cost.

Translational Research Center


With DNA sequencing becoming more affordable and more common,
personalized medicine is becoming more of a reality. Many organizations are
experiencing the need to combine clinical and genomics research data in order to determine the
effectiveness of personalized treatments.
There are many drug therapies that have been found to be effective for a certain
cohort of patients with specific gene expressions. Being able to determine whether a
patient has the relevant gene expression before treatment begins allows for a better
prognosis and also determines if this is the right course of treatment for the
individual.
Many Research Institutes, Academic Medical Centers, Drug Makers or CROs (Contract
Research Organizations) will be acquiring solutions in this space over the next few years in order
to stay competitive.

INDUSTRY SOLUTIONS
There are several Oracle Solutions to address Big Data in Healthcare.
The Connected Health Solution can be utilized for Remote Patient Monitoring. This will
provide the framework needed to accept the data from each of the remote devices, and populate
the appropriate data stores.
The Translational Research Center is available:
http://www.oracle.com/us/industries/healthsciences/oracle-translational-research-ds-497608.pdf
http://www.oracle.com/webapps/dialogue/ns/dlgwelcome.jsp?p_ext=Y&p_dlg_id=11416590&src=7138239&Act=253
The conversation around this solution can become pretty complex in short order, so the
inclusion of subject matter experts is mandatory.

For the Healthcare Analytics Solution, there are many Oracle products that can be
positioned, from Fusion Middleware for the data acquisition and database
population to the Health Sciences products which form the base of the solution.
http://www.oracle.com/us/industries/healthcare/058441.html

DATA SOURCES
Much of the data required in Healthcare is proprietary data that is already in the possession of
the Healthcare Research Entity, or in public registries.
EMR Data
CDR
Data Warehouse
ERP
CRM
Cancer Genomics Hub
U.S. Health Data Healthdata.gov

Chapter 5

Retail Industry
Retailers are interested in solutions that help them differentiate from their competition and
maximize customer experience. Big Data capabilities enable retailers to collect and extract
insights from transaction history, purchase frequency and web behavior, as well as external
environments such as social media, demographics, weather and finance. The data can be
harnessed in multiple ways, from structured databases and distributed predictive analytic systems,
to mining of unstructured data.
Many companies are increasing the use of Data Discovery tools in addition to traditional BI
to tap into unmet customer demands. This new approach to analytics, often delivered through
easy-to-use, self-service analytics applications, helps the retailer to explore questions like
"why" and "what if", and brings a new agility to BI and a wider use of analytics across the
organization. This chapter will focus on these use cases:
Omni-Channel Marketing: how to get customers to spend with you.
Customer Satisfaction: making sure the customer experience is more positive than the
competition.
Segment and Sentiment Analysis: getting to know your customers from the data they
generate outside your walls.
Value Add to Customers: being able to identify changes to your go-to-market strategy
based on customer sentiments.

BIG DATA USE CASES


Big Data to support Omni-Channel Retail Marketing
Retail customers are online, selective and social, and retail marketing
struggles to understand each customer as an individual market segment of
one. The target is to present a personalized, tailored offering unique to the
individual customer to improve relevance and perceived customer
value.
For example, to do this the retailer looks for the answers hidden in
massive amounts of customer, spending, inventory, pricing and promotion
data, to come closer to a holistic view of their customers. Who are my customers by category?
What are the ways customers buy different product categories, and how do my customers behave
across a growing number of channels?
Merchandisers would be interested in applying this to an Omni-Channel marketing
operation. Big Data helps retailers to understand customer behavior segmentation and what
actions trigger behavior attributes in different segments and channels. With the growing demand
for mobile retailing making retail ever present, Big Data capabilities even allow for real-time
marketing execution in the hand of the consumer at the time of purchase.
As another example, Big Data also enables improvements to loyalty programs by revealing
what factors truly impact customer loyalty and retention, such as customer experience, ease of
use, value for money and the effect of rewards programs.
Marketing would be interested in using this solution to reduce marketing spend while
keeping results the same, or to gain a competitive advantage. Customer churn is a major problem
for retailers, and this Big Data solution can give insight not only into what the churn rate is but
also into the reasons behind the churn.

Big Data to improve Customer Satisfaction


Retailers understand that improving customer satisfaction is vital. And it means more than
simply tracking complaints. Combining structured data from sales, marketing and supply chain
with unstructured or semi-structured data from surveys, syndication data and other outside
sources can give retailers a new perspective of their customers.

For example, merging structured with unstructured content to find underlying customer
satisfaction issues allows enterprises to proactively monitor customer satisfaction levels. At
many retailers, sales and customer service still work in separate silos and customer feedback is
often not allowed to flow freely between the different operations, resulting in ineffective
distribution channels.
A COO would be interested in this. The convergence of sales information, call center operations
and social media enables Big Data to correlate product sales, support and
customer voice, to validate the true issues impacting customer satisfaction and to
target new customer segments. Even competitors' customers can be analyzed for industry
trends to reveal customers' propensity to buy certain products or services.
Another customer satisfaction issue solved by Big Data is to identify the most valuable
customers from a 360-degree view, to be able to reward them with offers and benefits relevant to
a loyalty program, and to exclude those customers that merely take advantage of discounts
without shopping at the merchant again.
Store operations, customer services and to some extent marketing would be interested in this
solution to get the most benefit from sales and promotions. The purpose of these is to keep loyal
customers by making them feel rewarded and special, and these insights allow better focus and
less waste in that effort.

Improved Segment and Sentiment Analysis


Retailers already define groups of individuals based on demographic categorization and
geographical segmentation. The use of Big Data analytics provides retailers with
opportunities for more refinement, to reach specific targets for sales and campaign efficiencies.
The main change is the rise of a new social consumer. Even online opinions about competitive
offerings can help the retailer to better position their own products by differentiating their
offering against competitors' product limitations and perceived quality.
For example, Big Data steps up to the challenge of merging different types of data from
multiple sources such as buying data, revenue, channels, geodata, social media, customer
feedback, emails, web activities, etc., to allow marketers to make better informed and time-critical
decisions to improve product revenues from cross-sell and up-sell.
Marketing, Buying & Merchandising organizations would be interested in this solution to
improve their results by impacting the share of wallet they can achieve from each customer.

Big Data to provide added value to Customers


The rise of the online social customer forces retailers to refine the alignment of previous
business processes to gain competitive advantage. Many business processes in retail have not
been designed to keep pace with the explosion of channels through which a retailer engages with
its customers. Today unstructured data makes up more than 80% of all data created in retail. By
efficiently integrating unstructured sources such as blogs, social networks and service
centers with Big Data capabilities, retailers can better understand their customers, their
preferred channels, lifestyles and evolving service needs.
For example, in a retail market where margins are under constant pressure and product
duplication is almost immediate on a global market, retail leaders need the capability to swiftly
respond to changes in customer demand. Integration between structured and unstructured
data provides market leaders with improved decision-making and drives faster response times to
market needs.
A lead analyst or any C-level executive would be interested in this. The analysts in retail
companies today spend a lot of time in spreadsheets, and discovery tools that allow them to spend
more time on analysis and less time managing and massaging data can improve the company's
ability to make timely decisions.

INDUSTRY SOLUTIONS
Link in to the Oracle Industry Solutions database for conversation scripts and questionnaires
that can get your client thinking about their analytics and data warehouse solutions, and check out
the Retail Insights sales play at the IBU Retail Industry Play Portal at
HTTP://MY.ORACLE.COM/SITE/IBU/PORTAL/INDUSTRYSALESPLAYS/INDUSTRIES-L-Z/RETAIL/SOLUTION1/INDEX.HTML
Learn more about Oracle retail solutions enhanced by Big Data capabilities on the Oracle
retail Content Portal http://contentportal.oraclecorp.com/industries/retail.html
Listen to the podcast for Business Services with an industry overview on the Sales and
Marketing Content Portal for Engineered Systems.
http://my.oracle.com/site/ibu/portal/ExaBusinessSolutions/SalesPlays-Industries/BusinessServices/index.html

Learn more about Data Warehousing Big Data Sales Content at:
http://my.oracle.com/site/ibu/technology/TechProductMktgHome/Database/DataWarehousing/SalesKits/index.htm
If the external data sources are the biggest topic, then check out the High Perf Demand
Signal Repositories plays at
http://my.oracle.com/site/ibu/portal/ExaBusinessSolutions/SalesPlays-Industries/Retail/index.html
The "Top 5 Objections, Competitive Traps or Questions"
section of the Discovery Guide is really useful for convincing your client that they shouldn't go
off on their own, and there are some great podcasts here to get you up to speed fast.

DATA SOURCES
There are two main classifications of data sources in Retail: internal and external. The
internal sources are available, but it's often too expensive and difficult to align their hierarchies
with the other data sources in the company, so it just hasn't been done to date. The other group is
external data sources, whose hierarchies don't match up nicely to the product or segment
hierarchies that are used internally either. With Big Data it is possible to build hierarchies on the
fly based on rules that can be easily found using tools like Endeca, making mix and match of
data sources easier than many retailers expect.
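To make the idea of rule-based hierarchies built on the fly more concrete, here is a minimal sketch that maps external category labels onto an internal product hierarchy with simple keyword rules and then rolls up sales. The rules, category names and the `map_to_internal` and `roll_up` helpers are invented for illustration; this is plain Python and does not represent how Endeca itself is configured.

```python
# Hypothetical rules: keyword found in an external label -> internal hierarchy path.
RULES = [
    ("shampoo", ("Personal Care", "Hair Care")),
    ("conditioner", ("Personal Care", "Hair Care")),
    ("toothpaste", ("Personal Care", "Oral Care")),
    ("cola", ("Beverages", "Soft Drinks")),
]


def map_to_internal(external_label, rules=RULES):
    """Return the first internal hierarchy path whose keyword appears in the label."""
    label = external_label.lower()
    for keyword, path in rules:
        if keyword in label:
            return path
    return ("Unmapped",)  # keep unmatched rows visible rather than dropping them


def roll_up(rows, rules=RULES):
    """Sum sales by internal hierarchy path for (external_label, sales) rows."""
    totals = {}
    for label, sales in rows:
        path = map_to_internal(label, rules)
        totals[path] = totals.get(path, 0) + sales
    return totals
```

Keeping an explicit "Unmapped" bucket is a deliberate choice in this sketch: it shows how much external data the current rules fail to place, which is usually the first thing to review.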
Examples of Structured Internal Sources

CRM Data

Sales Data

Order Management Data

Marketing Plans

Shipments

Promotions

Retail Execution

Consumer Call Centers

Examples of Unstructured and External Sources

Syndicated and Retailer electronic Point of Sale (ePOS) data

Retailer Loyalty and Market Basket data

Wholesaler and Distributor Spin data

Social Listening Post data (enhanced with sentiment analysis)

Social Networking Data (unfiltered by a listening post)

Competitor Bench Prices and Promotions (from other Retailer public websites)

Zillow type websites for home prices

Weather Forecasts and History

Government Census data

Economic Data, such as unemployment rates

ISP website surfing data

Chapter 6

Consumer Goods Industry


Consumer Goods companies usually have very robust analytical capabilities
already and use their data warehouses extensively. They've been forced to use their supply chain
and sales data very effectively in order to stay in business with cost pressures coming from big
retailers like Walmart and increasing costs of goods.
They probably aren't effectively using external data to be even more effective or
differentiated though, and this chapter goes through three big use cases:

Trade Data: data from retailers that is probably not captured or not used effectively at
your customer.

Sales & Supply Data: they have the data today but don't combine it well, and they don't
use it in real time.

Sentiment Data: Consumer Goods companies are all about their brand, and they'll be
interested in solutions that help them understand the shopper better.

BIG DATA USE CASES


Trade Data: big data from external sources about product sales
This is data external to the Consumer Goods company, and consists of sales data from outside the
company: market measurement data, retailer data, and competitor sales estimates. This can be
Terabytes of data every day (Walmart alone sends data every day for every SKU for every store,
and is considering moving to hourly!), and most companies use it a little but haven't really
figured out how to take full advantage of all of these data sources together at scale.
For example, most CG companies launch new products by spending a lot of money on R&D
and market trials, then launching the product and deciding based on first month sales whether the
product should be kept or killed. Most are killed.
If a marketing team used daily sales data from specific stores to test
where a product sold best in the store (at the checkout aisle, at the end of
the cosmetics aisle, with a special counter set up) during the first few
days of launch, they could change all of the other locations to do the
same and improve the success rate of new products, saving the
company millions or billions in R&D costs of failed products.
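The placement test described above boils down to comparing average daily sales by in-store placement during the first days of a launch. The sketch below shows that comparison in its simplest form; the tuple layout, placement names and `best_placement` function are assumptions made for the example, and a real analysis would also control for store size and check statistical significance.

```python
from collections import defaultdict
from statistics import mean


def best_placement(daily_sales):
    """Pick the placement with the highest average units sold per store-day.

    daily_sales: iterable of (store_id, placement, units_sold) rows covering
    the first days of a launch (hypothetical layout).
    Returns (placement, average_units).
    """
    by_placement = defaultdict(list)
    for _store, placement, units in daily_sales:
        by_placement[placement].append(units)
    averages = {p: mean(units) for p, units in by_placement.items()}
    winner = max(averages, key=averages.get)
    return winner, averages[winner]
```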

For another example, looking at a few years' worth of Walmart data and sales data for
historical product mixes and comparing those to competitor sales can provide a minimal product
mix designed to grab as much market share as possible with the fewest products.
Products in a category cannibalize each other, and big data can be used to estimate an optimal
mix of products to steal market share away from competitors while limiting cannibalization and
maximizing profits. Category managers could use the data to optimize their product mix, and
marketing managers could also use it to maximize return on marketing dollars.
Another example is Trade Promotion Optimization. Every Consumer Goods company
pays retailers to put their product on the shelf, and these payments are called trade
promotions. They are always bi-directional agreements to promote the company's products, but
a lot of money in the industry is wasted.
Key Account Managers and Sales Managers can use this data to make sure only the most
profitable promotions are run and figure out through the data if retailers actually implemented
the promotion or not. Figuring out how to spend as little as possible to get your product in the
best position on the shelf with the right displays and coupons can save a company billions.

Sales and Supply Data


Sales, production and supply chain data in general is all data inside the company firewall and
probably in their data warehouses, but looking at this data in-flight and looking at it across all of
the data types is very difficult inside the warehouse because of all of the different hierarchies
involved (product hierarchies, customer hierarchies, etc.). Big data can do this on the fly.
For example, if sales promotions are not being executed as planned it throws off supply
estimates and the company can end up running out of product due to overselling or having extra
product that takes up warehouse space. Both of these reduce
profit margins.
Supply managers can use it to derive more accurate
projections of future sales (or modify a plan mid-month).
Avoiding having hundreds of thousands of extra cases in a
warehouse can save a company millions. The same data turned
around the other way can be used by marketing to help
increase sales to match up with production.
Marketing can use it to run new campaigns or cancel
campaigns in order to speed up or reduce sales of products.

Sentiment Analysis
Use of social media (Twitter, Facebook, etc.) is not just communications specific but a
good example applicable across industries. Collect or stream data from social media sites into CRM
and customer service applications to determine the importance or "clout" of a customer and to
get a better overall picture of each customer's behavior, likes and dislikes.
For example, Facebook posts and posts from brand sites can be analyzed as a new product
launches to get an idea of frequent likes and dislikes about how the product was marketed and
where people are buying it from.
Marketing can use this information to change the marketing campaign, especially the
electronic portions mid-launch, focusing on positive aspects of the product that are generating a
lot of buzz or cancelling an ad campaign that segments of the population are finding offensive.
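As a toy illustration of how posts might be scored during a launch, the sketch below counts positive and negative words from two small word lists and summarizes how many posts lean each way. The word lists and the `score_post` and `summarize` functions are hypothetical; a production system would rely on a trained sentiment model or a full lexicon rather than this keyword count.

```python
import re
from collections import Counter

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"love", "great", "awesome", "tasty"}
NEGATIVE = {"hate", "awful", "offensive", "bland"}


def score_post(text):
    """Return the positive word count minus the negative word count for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(posts):
    """Count how many posts lean positive, negative or neutral."""
    summary = Counter()
    for post in posts:
        score = score_post(post)
        if score > 0:
            summary["positive"] += 1
        elif score < 0:
            summary["negative"] += 1
        else:
            summary["neutral"] += 1
    return summary
```

Tracking these counts day by day during a launch is one simple way to notice that a particular ad is drawing a negative reaction.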

INDUSTRY SOLUTIONS
Link in to the Oracle Industry Solutions database for conversation scripts and questionnaires
that can get your client thinking about their analytics and data warehouse solutions.
Check out the Retail Insights sales play and the rest of the Comprehensive Trade
Management solution at
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/ConsumerGoods/Solution1/index.html
This solution drove the sales of 11 Exa systems at P&G.
Download the Executive Conversation Script under the Retail Insights banner and it will walk
you through a whiteboard session about using data from partners (retailers and syndicated data
resellers like IRI/Nielsen) who are closer to the consumer.
Retail Insights covers this too, but if they are interested in making new product launches
more successful, pull down the Innovation Management
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/ConsumerGoods/Solution3/index.html)
Executive Conversation Script for a great overview of how hard it is for the company to launch
products in the first place, and how only about 20% of new products meet their objectives. Any
post-launch assistance you can give a product should definitely be put to use.
If the external data sources are the biggest topic, then check out the High Perf Demand
Signal Repositories play
(http://my.oracle.com/site/ibu/portal/ExaBusinessSolutions/SalesPlays-Industries/ConsumerGoods/index.html).
The "Top 5 Objections, Competitive Traps or Questions"
section of the Discovery Guide is really useful for convincing your client that they shouldn't go
off on their own, and there are some great podcasts here to get you up to speed fast.

DATA SOURCES
There are two main classifications of data sources in Consumer Goods: internal and external.
The internal sources are available, but it's too expensive and difficult to align their hierarchies
with the other data sources in the company, so it just hasn't been done to date. The other group
is external data sources, whose hierarchies don't match up nicely to the product or segment
hierarchies that are used internally either. Big Data can build hierarchies on the fly based on
rules that can be easily found using tools like Endeca, so this isn't as big of a hurdle.

Structured Internal Sources:


Data

Order Management Data

Billing Data

Promotions

Consumer Call Centers

Retail Execution

Marketing Plans

Shipments

Unstructured and External Sources:


Syndicated and Retailer electronic Point of Sale (ePOS) data

Zillow type websites for home prices

Retailer Loyalty and Market Basket data

Weather Forecasts and History

Wholesaler and Distributor Spin data

Government Census data

Social Listening Post data (enhanced with sentiment analysis)

Economic Data, e.g., unemployment rates

Competitor Bench Prices and Promotions (slurped in from Retailer public websites)

Social Networking Data (unfiltered by a listening post)

ISP website surfing data

Chapter 7

Telecommunications Industry
BIG DATA USE CASES
Sentiment Analysis & Social Marketing
Combine social media feeds (from Twitter, Facebook, etc.) and customer demographic,
psychographic (values, attitudes, interests, or lifestyles), purchase, and network usage data to
determine the importance or "clout" of a customer and to get a better overall picture of each
customer's behavior, likes, and dislikes. For example, analyzing Twitter feeds and Facebook
posts can reveal a better understanding of the service provider's customer service performance
and whether there are quality of service issues within specific regions or customer groups.
This combined data can be used by marketing teams to better target campaigns and
collaborate with partners on joint campaigns (e.g. cinema companies to offer discount vouchers).
Customer care and operations teams can also leverage this information to determine the next best
action (treatment, remedy, etc.) associated with that customer's social influence.

Service providers can also leverage sentiment analysis data to defend their brand image and
reputation by gaining deeper insight into overall social media impact and campaigns. They can
gauge social media sentiment on newly released products, offers, and campaigns in a
cost-effective manner and proactively create service requests to improve brand perception.

Cross Channel Insights


A customer's perception of a service provider's performance and value is increasingly defined
by how well the provider manages interactions across any channel, including mobile, web, call
centers, IVRs, dealers, and retail outlets. Customers often start the search for a particular
product or service on the web site, then talk with a call center agent for more
information, and finally complete the purchase in a retail store.
Simply optimizing the customer flow and service levels in each of
these channels will not deliver the results customers expect. Rather,
how well the service provider handles each of these cross-channel
customer interactions and delivers a seamless, frictionless experience
will determine if the brand promise is fulfilled and whether the
customer becomes a promoter or detractor.
For example, customers will start a purchase online, adding items to their shopping cart, and
then abandon the purchase. If the customer service agent is aware of this during a subsequent
call, they can help address the customer's questions, thereby improving customer satisfaction and
revenue. These customer interactions from each channel can then be captured, aggregated,
analyzed, and correlated with other KPIs like Net Promoter Scores to develop insights into
customer churn predictors, lifetime value, and brand improvement strategies.
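One way to picture the abandoned-cart example is as a small lookup that a call center screen could run when a customer calls in. The sketch below derives open carts from web events; the event names, tuple layout and `abandoned_carts` function are assumptions for illustration rather than any specific product's data model.

```python
from datetime import datetime, timedelta


def abandoned_carts(web_events, now, max_age=timedelta(days=7)):
    """Return {customer_id: [items]} for carts with items added but no purchase.

    web_events: iterable of (customer_id, timestamp, event, item) tuples where
    event is "add_to_cart" or "purchase" (hypothetical event names).
    """
    carts = {}
    for customer, ts, event, item in sorted(web_events, key=lambda e: e[1]):
        if now - ts > max_age:
            continue  # ignore stale activity
        if event == "add_to_cart":
            carts.setdefault(customer, []).append(item)
        elif event == "purchase":
            carts.pop(customer, None)  # a purchase clears the open cart
    return carts
```

When a call arrives, the agent application would look up the caller's ID in the returned dictionary and surface any open items.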

Location Based Marketing


Capture a customer's location when entering a certain area (geo-fencing) and correlate that
with demographic, usage, and preference data to create targeted offers and promotions for CSP and
partner services. CSPs can also analyze a subscriber's mobile network location data over a
certain period to look for patterns or relationships that would be
valuable to advertisers and partners.
For example, based on the results you may identify that a
particular subscriber always drives a certain route to work each
day. Therefore you know he tends to pass by a specific
Starbucks at 7:15am each day, so you send him an offer to stop by that Starbucks on his way to
work.
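A geo-fence check of the kind described here can be sketched with the standard haversine great-circle distance formula. The coordinates, radius and function names below are illustrative assumptions; a production system would typically use a spatial index or a geospatial database rather than checking points one at a time.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def in_geofence(position, fence_center, radius_km):
    """True when the (lat, lon) position is within radius_km of the fence center."""
    return haversine_km(*position, *fence_center) <= radius_km
```

An offer would then be triggered when `in_geofence` first becomes true for a subscriber who has opted in and whose usage pattern matches the campaign.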

Real-time, Context Sensitive Advertising


Telecoms can enable near real-time offer creation and advertising display by collecting inputs
from subscription platforms, value-added services platforms, dealers, distribution management
systems, and other sources. Ads, offers and promotions can then be tailored and delivered to the
customer when they access the website, via mobile/SMS, or when talking with a retail store rep
or call center agents.
Today, when a customer logs into a telecom website, the ads that are served up have little
correlation to a particular customer's service usage, content purchases, social media activity, or
site browsing history. Capturing all of this information would allow the service provider to
feature ads and offers that reflect recently consumed services and applications more relevant to
their current interests and likelihood to spend.
With a context-sensitive, 360-degree profile of the customer, telecoms can recommend services
or products in real time to the customer in the context of each interaction and prior history. The
adaptive logic can be integrated across multiple channels including the web, mobile, call center,
retail associate, in-store kiosk, etc. to reflect a customer preference for how and where they want
to interact with the service provider. Ad response, service usage and location data can be
collected and analyzed in real time using complex event processing to determine target
segments and product profitability margins prior to offer conceptualization, improving the
return on marketing and advertising spend.

Network Optimization & Monetization


Big Data is used to deliver real-time analytics to detect when a network is down, overloaded
or reaching capacity. This information can be analyzed alongside
marketing offers and promotions, seasonal trends, and customer
usage (e.g., mobile applications, online gaming, or over-the-top
services like Netflix) to identify network hotspots and determine
where to make capital investments to support value-added services
and content offerings.
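Detecting a cell that is overloaded or reaching capacity can be approximated with a rolling average over recent utilization samples. The sketch below is a minimal version of that idea; the sample layout, window size, 85% threshold and `overloaded_cells` function are assumptions for illustration, not values from any particular network platform.

```python
from collections import deque


def overloaded_cells(samples, window=3, threshold=0.85):
    """Yield (cell_id, timestamp, avg_utilization) when a cell's rolling average
    utilization over its last `window` samples reaches `threshold`.

    samples: iterable of (cell_id, timestamp, utilization in 0..1), in time order.
    """
    recent = {}  # cell_id -> deque holding the most recent utilization values
    for cell, ts, util in samples:
        buf = recent.setdefault(cell, deque(maxlen=window))
        buf.append(util)
        if len(buf) == window:
            avg = sum(buf) / window
            if avg >= threshold:
                yield cell, ts, avg
```

Averaging over a window rather than alerting on single samples keeps short spikes from being reported as hotspots.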
Today, Service Providers manage network bandwidth with either data caps or tiered pricing
models. Using Big Data, it is possible to create personalized network usage policies by combining
network data with unstructured data (captured from deep packet inspection probes and other
sources) to analyze customer behavioral patterns. These subscriber-specific usage policies would
increase customer satisfaction for the vast majority of users (they're not subsidizing the heavy
consumers) and enable service providers to maximize data consumption revenue streams. Service
providers can monitor network usage and identify patterns for when the network is overloaded or
under-utilized and then develop offers, services, and pricing models that optimize network
bandwidth, customer satisfaction, and profitability. The ability to perform real-time network
analysis (including service outages, interruptions, or slow-downs) using structured and
unstructured data can be used to impact operations performance in areas outside of core
network engineering, including sales, support, and call center agent operations, to help determine:
Product Marketing: What services, products and bundles were impacted by the network fault, and
how should this influence marketing and promotions? They can also run campaigns targeted to
individual customers based on network events like first-time use of a certain service or
download of specific applications.
Customer Management: Which customers were impacted when this network fault occurred,
and should service requests or direct communication take place to acknowledge the issue and
offer treatment?
Customer Care: Prepare the Call Centers with appropriate knowledge of the issue, customers,
and services affected to better prepare agents and scale up staffing volumes as necessary.
Revenue and Churn Forecasting: What is the potential revenue, profitability, and churn
impact from the outage? What is the cost or impact to revenue, brand, and other KPIs of
different actions?

Weblog or Click Stream Analysis


Website mouse click data is captured as weblogs which can be analyzed to better
understand how customers navigate the website and online store. This can help the service
provider improve the overall user interface and user experience, deliver different pages to
specific groups of users, trial new page designs/campaigns in advance of full launch, and make
products, services, self-care, agent-assisted care, and support easier to find and simpler to use.
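A first step in this kind of weblog analysis is often to count how visitors move from page to page within a session. The sketch below assumes a simplified log of (session_id, timestamp, page) tuples and a hypothetical `page_transitions` helper; real weblogs need parsing and sessionization before reaching this stage.

```python
from collections import Counter


def page_transitions(clicks):
    """Count page-to-page transitions within each session.

    clicks: iterable of (session_id, timestamp, page) tuples (hypothetical log format).
    Returns a Counter keyed by (from_page, to_page).
    """
    by_session = {}
    for session, _ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        by_session.setdefault(session, []).append(page)
    transitions = Counter()
    for pages in by_session.values():
        # Pair each page with the one visited immediately after it.
        transitions.update(zip(pages, pages[1:]))
    return transitions
```

The most common transitions show the paths customers actually take, and rare transitions into checkout or self-care pages point to content that is hard to find.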
In the Cable and Satellite industry, video and set-top box behavior including Internet sites
visited, channels watched, DVR recording information, PPV/VOD usage, phone calling patterns,
mobile use, app adoption, etc. can be collected, tracked and analyzed. This data can improve the
mix of products and offerings, determine how content and services are discovered, better target
offerings to particular customer segments, and improve the overall usability of the service.

INDUSTRY SOLUTIONS
Link in to the Oracle Industry Solutions portal for conversation scripts and questionnaires
that can get your client thinking about their analytics and data warehouse solutions.
Check out the Communications Industry Engineered Systems plays:
http://my.oracle.com/site/ibu/portal/ExaBusinessSolutions/SalesPlays-Industries/Communications/index.html
Open up the Data Warehouse discovery guide and start asking your customer about their
fulfillment and SLA improvement processes. Or take them on a discovery session about world
class analytics and start figuring out how your customer can get better visibility to revenue
leakage and the causes of it or integrating cross channel commerce into a single view of the
customer.
If Cross Channel takes on a life of its own in your conversations, link into the IBU Cross
Channel Customer Experience Solution for Communications:

http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/Communications/Solution2/index.html

DATA SOURCES
These are examples of the type of data we commonly see used in the Telecommunications
industry. Perhaps there are others that are more important to your customer? Only a fraction of
this data typically ends up in a data warehouse and is available for analysis. What questions
could be answered if all of this data could be combined and analyzed in some way?
Network usage

CRM data

Location-based data

Call Data Records (CDR)

Billing data

GPS data

Session Data Records (SDR)

Supply chain data

Contracts, rights & royalties

SMS data

Supplier & dealer data

System logs

Weblog data

Advertising data

Call center logs

Bandwidth usage

Financial system

Support logs

Communication faults

Order & Fulfillment data

QR Code data

User profiles & psychographic data

Digital TV/set-top box data

Augmented reality sources

Sensors (e.g. cars)

WiFi hotspot usage

Internal social media

Portals

Device profiles

External social media

Sales data

Mobile payments

Chapter 8

Utilities Industry
It is a time of great change and transition for the utilities industry: an evolving regulatory
environment, a strong push toward renewable energy sources and conservation, the advent of
smart meter and grid technologies, and the potential of competition all drive uncertainty. The most
significant opportunity (and risk, if not properly addressed) is presented by the coming torrent of
new data and events (Big Data) resulting from efforts to modernize utility networks and the
entire operational framework.

Big Data opportunities in utilities are also rapidly evolving. Change doesn't come easy for an
industry that has operated for a hundred-plus years with systems that have worked relatively well.
But the traditional systems that have served utilities well over the years were not built to handle
the frequency and volume of data emerging from smart meters, grid devices and other network
controls and sensors. As a result, utility businesses are cautiously structuring their current IT
infrastructure, systems and tools to accommodate emerging needs such as customer prepay,
demand response, self-service analytics, near-real-time operational control, distributed
generation, etc. Given the current technology landscape, utilities may be sacrificing the rapid and
dependable throughput of data that ensures efficient network performance, high reliability and
timely revenue flow.

BIG DATA USE CASES


Network Capacity Planning
Intelligent network capacity planning and management, which takes a more proactive
approach to the management of distribution networks based
on measurement and more advanced control options, enables
greater numbers of Distributed Energy Resource connections
and increases their network access whilst minimising,
deferring or completely avoiding network reinforcements.
For example, an intelligent network uses sensing, embedded
processing, digital communications, and software to manage
network-derived information, thus making itself:

Observable (able to measure the states of all grid elements)

Controllable (able to affect the state of any grid element)

Automated (able to adapt and self-heal)

Integrated (fully connected to utility processes and systems)

If a network operations team implements this, it can reduce network outages, limit exposure
during outages and generally improve reliability.

Customer Experience: Outage Notification


Utilities are constantly looking for new ways to enhance the customer experience. High
customer satisfaction has been, and always will be, the single most important measure of
success. To help meet this goal, utilities can provide the information they glean from smart grid
infrastructures back to customers in a variety of ways to enhance the overall customer
experience. Utilities are quickly learning that Big Data can help them achieve and maintain the
levels of satisfaction desired by customers and by regulators.
For instance, by integrating advanced metering, grid devices and network management
systems, utilities are able to address outages and other system conditions within their
territories more proactively. This allows them to be much more proactive in providing network
condition information to customers and other stakeholders. Leveraging Big Data and other
analytical tools, utilities can quickly address basic customer concerns by providing
interactive maps and other visual context via the Web and mobile platforms.
Implied in the scenario is the need to integrate key customer processes and interaction points
to operational applications for combined analysis and action. Today, many of these core
processes lack end-to-end integration, visibility and control, leading to high rates of exceptions,
poor efficiencies and negative impact on the customer experience. Leveraging a common data
model on top of these processes and systems to track and trace key transactions provides the
necessary visibility, insight and action to catch and remediate issues before they impact
customers.

Demand Response
Many Energy Service Providers and Market Operators administer customer-side Demand
Response and Load Control programs to ensure grid stability and stable operation during times
of peak demand or system emergencies arising from generator outages or transmission and/or
distribution constraints. With some programs, the customer (residential, commercial, or
industrial) reduces the required load upon instruction from the Energy Service Provider or
Market Operator. With other programs, the Energy Service Provider, Market Operator, or a
Curtailment Service Provider remotely reduces the load via device management.
Big Data solutions provide the technology foundation and framework enabling the analysis
of meter consumption and event data from a broad array of sources, both stored and streaming.
Utilities are able to perform continuous analytics against that data to look for anomalies, patterns
and trends that might indicate an opportunity to make actionable decisions on both supply and
demand.
Marketing and Operations would be interested in this solution. It provides the ability to
integrate these Big Data analytics into other core operational systems to kick off an action based
on rules and policy; and provide robust, business-centric visualization through real-time
dashboards to customers and other key stakeholders.
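The continuous analytics described above can be sketched, under simplifying assumptions, as a trailing-window check on interval meter reads. The function name `flag_anomalies`, the window size and the threshold are illustrative choices for this example, not part of any Oracle product; a production system would also account for seasonality and weather.

```python
from collections import deque
from statistics import mean, pstdev

def flag_anomalies(readings_kwh, window=4, threshold=3.0):
    """Return indexes of reads that deviate strongly from the trailing window.

    A read is flagged when it lies more than `threshold` standard deviations
    from the mean of the previous `window` reads (assumed parameters).
    """
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings_kwh):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Skip flat windows, where the standard deviation is zero.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                flagged.append(i)
        recent.append(value)
    return flagged

readings = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 1.1, 1.0]
print(flag_anomalies(readings))
```

A flagged index could then feed the rules and policies mentioned above, for example to trigger a load control event or a customer notification.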

Location-Based Services

This consists of any and all geo-spatial data: assets, maintenance crews, electrical network
equipment and other resources. Many organizations have geo-spatial data available from their
equipment, diagrams and vehicles.
For example, this data can be used to deliver real-time analytics to pinpoint maintenance
resource needs when a network is down, overloaded or reaching capacity. Analytics can also
identify patterns for when a network has the potential for reaching load constraints or when it has
extra capacity.
Integration into outage and distribution management applications allows for further
development of business capabilities such as distribution load management switching, where
protocols can be established to move customers to alternate feeders during times of over
capacity. A utility's use of Big Data fundamentally changes the way it can address network
capacity needs.
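To make the load management switching idea concrete, here is a minimal greedy sketch. The feeder names, the data layout and the function `plan_switching` are hypothetical; real switching protocols must also respect protection settings, voltage limits and operator approval.

```python
def plan_switching(load_mw, capacity_mw, switchable_blocks):
    """Greedy plan that moves switchable load off overloaded feeders.

    load_mw, capacity_mw: dicts keyed by feeder name (hypothetical layout).
    switchable_blocks: list of (source_feeder, alternate_feeder, block_mw).
    Returns the blocks to switch and the resulting feeder loads.
    """
    load = dict(load_mw)  # work on a copy so the input is not modified
    plan = []
    for source, alternate, block in switchable_blocks:
        overloaded = load[source] > capacity_mw[source]
        headroom = capacity_mw[alternate] - load[alternate]
        # Only switch when the source is over capacity and the alternate can absorb the block.
        if overloaded and block <= headroom:
            load[source] -= block
            load[alternate] += block
            plan.append((source, alternate, block))
    return plan, load

plan, new_load = plan_switching(
    load_mw={"F1": 11.0, "F2": 6.0},
    capacity_mw={"F1": 10.0, "F2": 9.0},
    switchable_blocks=[("F1", "F2", 2.0), ("F1", "F2", 2.0)],
)
print(plan, new_load)
```

In this example only the first block is moved, because feeder F1 is back under its capacity after one transfer.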

Retail Marketing & Product Development


Utilities are gradually moving away from one-size-fits-all marketing. Today, utilities can
make use of social media to analyze and correlate social sentiment for publicly available service
offerings. The use of Big Data solutions can help utilities determine competitor strengths and
weaknesses, enabling them to exploit competitive strongholds and target marketing programs
towards specific customers or segments of customers.
For example, at the customer premise level, utilities are able to analyze usage patterns at the
meter level and provide this usage information back to consumers with the intent of developing
market driven pricing offers that reflect individual consumption characteristics.
Customer and product organizations can use this data to create specific products for
customers and modify broader products for better performance. New products can also have a
huge impact on helping utilities achieve demand-side commodity reduction goals.
As another example, analytical information also allows utilities to look at similar granular
use and consumption patterns for neighborhoods, districts, or cities to facilitate better supply
planning and load forecasting in these service territories.
Network operations and asset management can utilize Big Data to optimize long term
investments and get the biggest benefit from short term investments, such as aligning network
refurbishment investments along a geo-spatial line instead of just renovating the oldest or
faultiest assets.

Renewable and Distributed Energy Generation Planning


Traditional generation investments involve large amounts of property to build a large plant
on, but newer renewable sources like wind and solar energy can be located closer to demand
sources. Big Data solutions can look at all of the factors of a city, from standard utility ones like
load profiles and capacity to more unstructured ones from city demographics, and can be used
to make smarter investment decisions.
For example, data on wealth distribution in office spaces, commuter congestion and electric
vehicle population history can be combined with current load profiles and capacity to predict
which buildings will have the highest growth in electric vehicles over the next two decades. This
data can feed portfolio planning decisions, such as deciding where to invest in solar panels to
help source cheaper and cleaner local energy to charge those vehicles instead of transporting it
in from a remote fossil plant at high cost.
Network operations and finance groups can use this data to make the limited amount of
renewable investment be as beneficial as possible for the utility company in the long term. Most
of these decisions will otherwise be based only on current network load and capacity and not the
long term change.
Another example is wind farm investments. Traditional utility data, demographic information
and new sensor data can be combined to provide the optimal investment scenarios necessary to
meet growing renewable energy portfolio requirements. The demographic data like suburban
and urban growth and shrinkage can be also used to focus energy supply investments on long
term profitability instead of just short term views.
Asset management groups would be interested in this kind of analysis to reduce risks and
costs associated with new or replacement supply and infrastructure planning and delivery. Using
this long term plan can also maximize the long term return on investment by growing supply
resources just-in-time to meet demand instead of under or over profiling it.

INDUSTRY SOLUTIONS
Link in to the Oracle Industry Solutions database for conversation scripts and questionnaires
that can get your client thinking about their analytics and data warehouse solutions.
If you're in conversations with the network operations or customer care departments, go to
the Utilities Data Management Industry Solution portal
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-L-Z/Utilities/Solution3/index.html).
There you'll find an Executive Conversation Script and Discovery Questionnaire that focus on the
structured and unstructured data in the utilities space.
For the finance and asset management departments, go to the Asset Reliability &
Optimization Industry Solution
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-L-Z/Utilities/Solution3/index.html).
The Executive Conversation Script and Discovery Questionnaire here explain how to talk to IT,
finance and operations about improving return on investment and overall revenue using Oracle's
asset lifecycle management software, a set of analytically heavy applications that can take
advantage of big data from meter data systems, real-time sensors and SCADA systems.
Oracle Utilities applications, technology and hardware products are engineered to work
together. Combined, these solutions process data faster and more reliably than the
myriad of products used by traditional utility operators. Meter Data Management is a type of big
data solution in its own right.
Link into the Engineered Systems sales plays
(http://my.oracle.com/site/ibu/portal/ExaBusinessSolutions/SalesPlays-Industries/Utilities/index.html)
for more information. To get the customer interested, pose this question:
Did you know that running your Meter Data Management (MDM) system on Oracle
Engineered Systems delivers superfast performance and storage efficiency, drastically reducing
costs while meeting goals for your Smart Metering and Meter to Cash Process?

Superfast performance of 5x to 7x significantly reduces the number of processors, driving
down the cost of software licensing and maintenance.

Storage is reduced up to 2x.

Overall costs can be lowered up to 50%.

DATA SOURCES
Utility companies have been dealing with big data problems around smart meter
implementations using traditional approaches for years. Geo-spatial and location-based data have
also been available and in some cases integrated. They also run millions of sensor reads every
second through their operations systems. Utilities are used to big data problems, but very
focused ones that don't fully leverage or even store all of the data they process. Combine these
traditional utilities data sources with the ones below and you can workshop some great use cases
with your customer.
Network Usage

Portals

Location Based Data

Call Data Records

Asset Data

GPS Driving Data

Surfing Behavior

Event Data

Order Management

Bandwidth Usage

CRM Data

Remote Control

Network Faults

Billing Data

Call Center Logging

User Profiles

Device Profiling

Social Media

Sensors

Mobile Payments

Chapter 9

Research Industry
BIG DATA USE CASES
Scientific Instruments Data Generation
VOLUME is one of the challenges of Big Data. It is about being able to ingest and manage
very large quantities of data and to cope with its exponential growth without limiting or
hindering the ability to access critical information.
Most of the Research data comes from various kinds of scientific instruments that can be
distributed or can be large, expensive centralized facilities operated at a global scale. In both
cases Research collaborations need to effectively deal with the data deluge generated by these
machines, quickly load and organize large volumes of raw data and translate it into knowledge
and information. This and other large data sets are used in both complex analytics and real time
analytics.
The volume of worldwide climate data is expanding rapidly, creating challenges for both
physical archiving and sharing, as well as for ease of access to relevant information in a
multidisciplinary environment. Data comes from many different sources, such as satellites,
temperature sensors,
ground sensors, ocean and marine sensors, weather stations, atmospheric balloons, and many
more.
Data captured from the above sources is used by researchers to monitor climate changes, to
generate weather forecasts and to support the decision-making process in case of natural
disasters. Research climate data also has a direct impact on businesses that use climate and
weather data to make informed economic decisions, such as agriculture, real estate, law firms,
and private research institutions.

Complex Analytics
Data is an asset in Research as in any other field, if not more, and it has a high potential
VALUE if harnessed correctly. This value (another characteristic of Big Data) is in the ability to
translate raw data into information and knowledge.

Most of the researchers and their organizations are required to exploit large data sets by
storing, retrieving and using deep analytics against a wide variety of data types while
simultaneously optimizing workloads and system operations.

In genomics, the cost of sequencing is dropping by 50% every 5 months. The challenge here is
to store these data and their genomic DNA research results and to have an efficient system
able to give researchers the query performance they need. "Analysis, not sequencing, will be
the main expense hurdle" (Chris Ponting of the University of Oxford in the United Kingdom).

Science is also the product of data analysis:


Science does not result from the launch of a mission or the collection of
data. Rather, science only occurs through the analysis and understanding of
that data.
- Philosophy of the NASA Science Mission Directorate (SMD)
Genomics data is usually combined with other types of data to provide insight about disease
evolution and propagation, medical data and related treatments, and patient care. Given the
complexity of the data and its correlations, cutting-edge data analysis is required.
Gaining efficiency in terms of time-to-discovery, data insight and cost of analysis provides
better health-care and more cost-effective solutions for patients, hospitals, health-care
providers and governments. Harnessing the power of data allows researchers to work together
with different stakeholders to bring better care through deeper knowledge and innovation.

(Almost) Real-time Data Streams


Another characteristic of Big Data is VELOCITY. Velocity means the rapid growth in the speed
of data generation and the consequent need to deal with extremely fast streams of data. There
are indeed Research experiments at small and large scale where high-velocity loading and
organization of information is key. As data comes in, successful filtering and retention in
real time (or almost) is a major challenge that requires complex workflows and fast
decision-making processes. The success of the scientific endeavor might depend on the
experiment's ability to quickly store large amounts of data and to rapidly organize it.

In radio astronomy, the LOFAR radio interferometer in the Netherlands is producing
1.6TB/sec, setting new frontiers in this field. In the same field, the Square Kilometre Array
will generate enough raw data to fill 15 million 64GB iPods every day!

High Energy Physics requires huge machines and facilities to look into matter at very tiny
dimensions. The largest production scientific machine is the Large Hadron Collider (LHC) at
CERN in Geneva, Switzerland. The LHC generates 60TB of data per day and
experiments need to load huge volumes of data in minimal time.
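The figures quoted above are stated in different units, so a short calculation helps compare them. This sketch assumes decimal units (1 TB = 10^12 bytes) and treats each figure as a sustained rate; both are assumptions made only for this comparison.

```python
SECONDS_PER_DAY = 24 * 60 * 60
GB, TB, PB = 10**9, 10**12, 10**15  # decimal units (assumed)

def bytes_per_day(rate_bytes_per_second):
    """Convert a sustained per-second rate to a daily volume."""
    return rate_bytes_per_second * SECONDS_PER_DAY

def bytes_per_second(volume_bytes_per_day):
    """Convert a daily volume to an average per-second rate."""
    return volume_bytes_per_day / SECONDS_PER_DAY

ska_per_day = 15_000_000 * 64 * GB        # 15 million 64GB devices per day
lofar_per_day = bytes_per_day(1.6 * TB)   # 1.6TB/sec sustained
lhc_per_second = bytes_per_second(60 * TB)  # 60TB per day

print(f"SKA:   {ska_per_day / PB:.0f} PB/day, {bytes_per_second(ska_per_day) / TB:.1f} TB/s")
print(f"LOFAR: {lofar_per_day / PB:.2f} PB/day")
print(f"LHC:   {lhc_per_second / GB:.2f} GB/s")
```

Under these assumptions the Square Kilometre Array figure works out to 960 PB per day (about 11.1 TB/s), LOFAR to roughly 138 PB per day, and the LHC to about 0.69 GB/s on average.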
For example, researchers need to store large amounts of data generated by the scientific
instruments and their analytical devices, make changes to test parameters based on the results
and retest multiple times before their time slot with the equipment expires. With some
equipment it may be possible or required to make changes during the test based on the data
coming off of the sensors.
Big data can utilize these streams of semi-structured data to do real-time complex event
processing or near real-time analytics to give structure to the data and assess patterns in the
results. This also allows predictive analytics, which is crucial to prevent issues in
these complex systems and ensure system reliability. Research organizations can then operate
24/7 research infrastructures and provide users with best-in-class, mission-critical services.

Data Visualization
Big Data also means VARIETY. This means the ability for the enterprise infrastructure to
quickly accommodate new data sources and to cope with a wide range of data types.
Enhancing the visualization of research information gives the ability to transform big data into
something easier to analyze, to enable new science with access to the latest investigative
methods and tools and to maximize analytic performance and achieve faster results.

The proposed Large Synoptic Survey Telescope will record 30 trillion bytes of image data
every day.

In Genomics, on average scientists can fully sequence 167 individuals per week, generating
250GB of images or 200 movie files.
Visualisation is, for example, key in clinical reporting. An appropriate and flexible
visualisation tool helps researchers and practitioners gain a faster and better understanding of
the complex systems they have to analyse. Increasingly, it can be a useful tool to support the
decision-making process in translational medicine. Throughout the whole chain, biologists,
bioinformaticians and doctors can see, through effective visualisation, patient correlations,
pathway networks, heat-maps, chemical structures and sequencing data, and better correlate them.
Both Research organisations and medical-care providers can then streamline processes and
move more and more to the new frontier of personalised medicine.

INDUSTRY SOLUTIONS
Oracle has two Industry Solutions for the Research segment: Research Data Management
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/EducationResearch/Solution2/index.html?ssSourceNodeId=23425&ssSourceSiteId=ibu)
and Research Analytics
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/EducationResearch/Solution2/index.html?ssSourceNodeId=23425&ssSourceSiteId=ibu).
Both solutions were conceived to address the main challenges Researchers and their
organizations are facing, in line with the above use cases.
Research Data Management (RDM) focuses on the overall Research Data Lifecycle, while
Research Analytics (RA) addresses more specifically Big Data related issues. In particular:
Oracle's Research Data Management solution empowers research institutions
to develop open, scalable, secure environments for knowledge development, discovery,
management, sharing and preservation.
Oracle Research Analytics helps Researchers to carry out collaborative and
high-performance analytics on large sets of structured or unstructured data in order to enable
innovative Research and reduce Time-to-Discovery.
On the Research Enterprise portal, you will find for each solution
(http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/EducationResearch/Solution2/index.html?ssSourceNodeId=23425&ssSourceSiteId=ibu):
Document Name | Audience | Goal/Description
Sales Cheat Sheet | Oracle Internal | 2 slides on how to best position Oracle with respect to the Industry challenges
Discovery Questionnaire | Oracle Internal | A document with all high-yield questions to qualify the pain and level of need of the customer
Executive Conversation Script | Oracle Internal | A document to help prepare for an introductory executive meeting, including a description of the target buyer's role
Solution Brief | Public/External | 2 slides with basic information on the Industry challenges and Oracle capabilities
Executive Presentation | Public/External | A slide-deck with the complete story on the Oracle value-proposition. Success Stories are also available.

A supporting glossary is also available.


More information on Oracle in Research as well as many other references and success stories
are available on the Research website on oracle.com: http://www.oracle.com/us/industries/educationand-research/oracle-research-institutions-1453042.html.

DATA SOURCES
The new frontier in Research is the possibility to perform distributed, interdisciplinary,
collaborative Research that harnesses the power of data in a reliable, cost-effective way. Most of
the effort focuses on aggregation, standardization and linkage of research data from multiple
sources and in multiple formats, providing an analytic approach to reporting on this data and the
accomplishments of research initiatives. How to access and preserve raw data, metadata
and research results over time, in different formats and in a trusted way, is also crucial.
Major data sources you will encounter talking to your customers are:
Environmental Sensors Data

DNA sequencing Data

Climate Data

Clinical Trials Data

Meteorology Data

Large scientific instruments data

Radio Astronomy Data antennas

Events data

Aggregated scientific data sets

Experiments data

Social Sciences data

Government Data

Chapter 10

Automotive Industry
Automotive OEMs are grappling with questions such as: What are our customers saying
about our brands? Where do they get most of the information about our products? What
advertising works? What promotions and incentives are effective?
Automotive OEMs are looking for insights they believe can be mined from a 360 degree
view of the customer. By leveraging Big Data solutions, automakers can boost marketing ROI
and lead-conversion rates, align product mixes with customer demand, and reduce warranty
costs. The industry is looking for ways to gain competitive advantage by leveraging data being
collected from the automobile, web browsing data, social media data, dealer interactions,
customer interactions with the call center as well as repair and warranty information. Insights
gained from analyzing, correlating and mining these types of data fall into two broad
categories: Customer Insights, and Service and Early Warning & Vehicle Quality.

BIG DATA USE CASES


Customer Insights
Customer data comes from a large variety of data sources,
internal and external to an Automotive OEM. Ubiquitous
social media is a vast source of customer data, and so is data
generated from the car itself through Telematics. Other
internal data sources are dealer interactions, web logs, data
generated through marketing campaigns and call center data.
Automotive customer data providers such as Polk and
Autodata provide aggregated customer data across brands.
Valuable insights can be gained by mining the deluge of
customer data coming from the variety of data sources. For example, Brand Managers are
concerned about understanding what is being said about their brands on the internet in blogs,
Twitter and other internet forums by customers and potential customers. A Marketing Manager
may be concerned about tracking the impact on sales from a marketing campaign.
Other key insights that could be garnered from leveraging such data are:
What are customers saying about our brands, and how is the perception trending?
What is the overall value of different customer touch points, both traditional and new digital?
Which web or social channels are most effective? Is print or TV or call center still valuable?

Customer loyalty-based marketing that drives effective campaigns: what marketing is most
effective?
Which incentives and promotions work? Does a one-size-fits-all incentive program work, or
does a more segmented and targeted approach to incentives work better?
How does this information change based on geography, region, and country? How can we
change our programs based on the local and regional effects?
How can we combine advertisement and incentive spend in a targeted way to drive more
demand and sales?
How do we ensure dealer co-marketing programs work effectively?
Beneficiaries and users of this category of analytics include Brand Managers, Marketing,
Sales and Finance. These insights allow Marketing, Brand Management and Sales managers to
maximize the effectiveness of their marketing and incentive spend. Brand Managers focused on
customer loyalty can monetize these Customer Insights into improved customer loyalty and
higher repeat buying rates. Big Data solutions enable the conversion of customer insights into
actions that improve customer satisfaction and ultimately improve profitability.

Early Warning & Vehicle Quality


Bringing Warranty, Service, and Vehicle Diagnostics Data
Together
Understanding vehicle quality issues and responding to them
pro-actively remains one of the most significant challenges in
the automotive industry. In addition to latency in the relevant data
available to automakers, the problem is further exacerbated
by the lack of a systematic approach to collecting and
analyzing data that will allow them to respond to vehicle
quality problems.

Automakers would like to answer questions such as:


Which products are having problems in service?
Are these problems recurring or one-off? If they are recurring, what is their effect on the
overall vehicle population?
Are the problems real issues or related to operator error?
How are our consumers or dealer service technicians explaining the problems? Can we
analyze their verbatim comments to get to the root cause?

How can we incorporate data from vehicle on-board diagnostics? With telematics and
connectivity, how can we leverage more detailed diagnostic trouble code data to get to
root causes?
How can we correlate problems on one vehicle line to others that might share common
components and parts?
How do we incorporate information from discussion forums, blogs, social media into
analysis of problems faced by consumers?
How do we incorporate voice of consumer into the overall quality improvement process?
How can we capture information from multiple sources to get a comprehensive view of
overall quality insight to improve customer and vehicle service?
Such Early Warning capabilities and Quality Analytics would be extremely interesting to
internal functions including Quality Engineering, Warranty, Product Development and
Manufacturing. Vehicle recalls are extremely expensive, and also damage the brand perception.
Therefore being able to provide early detection of quality issues could allow a rapid response,
thus avoiding potentially devastating costs and image problems.
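One of the questions above, correlating problems across vehicle lines that share components, can be sketched as a pooled claim-rate check. The data layout, the threshold of 5 claims per 1,000 vehicles and the function `flag_components` are assumptions chosen for illustration; real early warning systems would use statistical tests and time-in-service adjustments.

```python
from collections import defaultdict

def flag_components(claims, vehicles_in_service, lines_by_component,
                    max_claims_per_1000=5.0):
    """Flag components whose pooled warranty claim rate exceeds a threshold.

    claims: list of (vehicle_line, component) warranty claims.
    vehicles_in_service: dict of vehicle_line -> vehicles on the road.
    lines_by_component: dict of component -> vehicle lines that share it.
    All names and the threshold are hypothetical.
    """
    counts = defaultdict(int)
    for _vehicle_line, component in claims:
        counts[component] += 1

    flagged = {}
    for component, lines in lines_by_component.items():
        # Pool the fleet across every vehicle line that uses this component.
        fleet = sum(vehicles_in_service[line] for line in lines)
        if fleet == 0:
            continue
        rate = 1000 * counts[component] / fleet
        if rate > max_claims_per_1000:
            flagged[component] = round(rate, 2)
    return flagged

claims = ([("sedan", "fuel_pump")] * 4 + [("suv", "fuel_pump")] * 7
          + [("sedan", "wiper_motor")] * 2)
fleet_sizes = {"sedan": 1000, "suv": 1000}
shared = {"fuel_pump": ["sedan", "suv"], "wiper_motor": ["sedan"]}
print(flag_components(claims, fleet_sizes, shared))
```

Here the sedan line alone would stay under the threshold for the fuel pump, but pooling both lines that share the part surfaces the issue.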

INDUSTRY SOLUTIONS
Link in to the Oracle Industry Solutions database for conversation scripts and questionnaires
that can get your client thinking about their analytics solutions.
Check out the Automotive Insights sales play and the rest of the Automotive Sales,
Distribution & Aftermarket solution at
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/Automotive/Solution1/index.html
Download the Executive Conversation scripts for Integrated Sales & Marketing Industry
Solution as well as the After Sales Service Warranty Industry Solution.

DATA SOURCES
To gain better Consumer Insights, it is critical to link various data streams and consumer
touch points:

Link diverse data sets for different brands with specific need states

Social media including consumer sentiment

CRM interaction

Consumer Response (including 800 contacts)

Dealer merchandising conditions

Household purchases

To understand the drivers of consumer purchase decisions :

Triggers

Link dealer to purchase decision

What is the call center and CRM data telling us (call center vs. social media)

Value of Social Media Data

What is the value of social media data and how can we leverage it to impact revenue?

Is the Social Media Data telling us anything we didn't already know from the data
coming in from the call center?

Can social media help us reach/target consumers that are currently not loyal to our
brand?

To ensure deeper consumer insights, automakers need to have a strategy that allows them to
bring together a diverse set of data from varied data sources including:

Marketing systems

Social media sites such as Twitter, Facebook, Automotive blogs and forums,
Consumer websites, Review websites

Call center interactions

Incentive systems

Web click-thru

Sales systems

Customer data systems, which capture individual and household information

To develop an effective Early Warning mechanism, it is critical to bring together a diverse set
of data from:

Warranty claims systems

Call center customer service, dealer techline, roadside assistance, customer survey
interactions

Social media including consumer complaints

Vehicle Diagnostic Code systems

Vehicle BOM and parts systems

Repair order systems

Web blog and discussion forum entries

Chapter 11

Engineering & Construction


BIG DATA USE CASES
Material and Labor Costs
The construction industry is notoriously a low-margin industry, with most companies
earning margins in the range of 1% to 3%. Because of the low margins, there is high risk in
both bidding on and executing a construction project. The two largest factors contributing to
the overall cost of a project are material and labor costs. Any opportunity to drive down
either cost component greatly improves the profitability of the project.
For example, material costs fluctuate constantly according to many factors, including supply
and demand and regional conditions, with the price of oil having a very large impact.
Additionally, labor rates vary by skill set across regions and fluctuate with demand, including
unemployment rates. Labor and material prices impact the profitability of a project twice:
first during the bid and estimating process, and second during the execution of the contract to
procure and construct the asset the company is building. Obtaining reliable material prices
during the estimating phase helps ensure that your bid is based upon material prices that
reflect the market, and reduces the risk that the material costs have been underpriced.
Operations and Supply Management would be interested in a solution in this space that uses
the price of oil and the various risk factors around it to predict impacts and mitigate them via
contracts or by hedging their material investments. Comparing previous performance against
bids and margins from past jobs with outside and internal factors can give the company new
insight into how to run more profitably. As construction companies move to provide global and
expanded regional services, they need to know what the labor and material costs may be for
that area during both the estimating and construction phases of the project.
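
To illustrate why price accuracy matters so much at these margins, the arithmetic can be sketched as follows. This is a minimal sketch: the revenue, the cost shares and the 5% price rise are hypothetical figures chosen for illustration, not industry data.

```python
def margin_after_price_change(revenue, material_cost, labor_cost, other_cost,
                              material_change=0.0, labor_change=0.0):
    """Return the profit margin (as a fraction of revenue) after cost changes.

    All inputs are hypothetical. Changes are fractional (0.05 means +5%).
    """
    new_material = material_cost * (1 + material_change)
    new_labor = labor_cost * (1 + labor_change)
    profit = revenue - (new_material + new_labor + other_cost)
    return profit / revenue


# Hypothetical fixed-price bid: revenue of 100 with 98 in planned costs,
# which is a 2% margin.
planned = margin_after_price_change(100.0, 45.0, 40.0, 13.0)

# The same bid if material prices rise 5% after the bid is submitted.
actual = margin_after_price_change(100.0, 45.0, 40.0, 13.0,
                                   material_change=0.05)
```

Under these assumed numbers, a 5% rise in material prices alone turns a 2% planned margin into a small loss, which is why reliable price data during estimating carries so much weight.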

Subcontractor Performance
How well will a contractor perform on a project? How well did they perform on past
projects? Do all subcontractors perform the same? Answering these questions will help you
determine what to expect from your subcontractor on your project. Unfortunately, there is little
to go on in the E&C industry other than a company's own in-house performance data,
collected over many years. This creates a risk when a company tries to expand into new
markets or provide additional services where it does not have the history or data. In addition,
a subcontractor's performance may be tied to the local talent and vary at other locations,
which introduces additional risk.
For example, imagine winning the contract to do work in a part of the country where you
don't know the subcontractors, or where you need to place your bid based upon
subcontractor performance. Your estimated price could be too high, and therefore you won't
win the work. Even worse would be to win the work based upon pricing that is too low,
leaving your company in a position to lose money on the contract.
Operations executives would be interested in this type of big data usage. Often E&C
companies develop qualified subcontractors through surveys and in-house performance
databases. The personal performance success of estimators and project managers is tied directly
to the availability of data to help them do their jobs well. As a result, the performance of
operations executives is also tied to the reliability of this information.
Within the walls of a construction company, data can be developed to qualify a
subcontractor; however, it does not indicate how they will perform (e.g., what is the cost per
linear foot of pipe installed). There are generally too many parameters associated with
installation to create a simple table specifying the cost of installed quantities, which in turn
makes it difficult to predict the costs on a project or to compare the performance of one
subcontractor to that of another.
With a data source that rated the quality of the work performed as well as the price per unit,
a qualified estimate and forecast could be developed.

Equipment Costs
How much equipment do you need to complete your project? That depends, and the answer
is always changing. Should you buy, rent or lease equipment for your project? Optimization of
equipment procurement can have a significant impact on the profitability of a project.
Knowing where to obtain the equipment and what the prevailing costs are expected to be can
help in both the planning and the execution of the project. Equipment costs impact the project
in the same way as the material and labor costs above; however, the market for these
resources has greater micro fluctuations. Two equipment dealers will have different pricing
models for purchase, lease or rental options.

For example, imagine a contractor being able to develop a forecast of equipment
needs (type, duration, location, etc.) and then optimize the cost of that schedule
by locating the pieces of equipment and choosing a mix of in-house purchases, rentals and
leased equipment.
Operations and finance executives would be interested in this big data solution, not only to
minimize project impacts and avoid costly delays, but also to optimize project costs and
improve margins.
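
The buy, lease or rent decision for a single equipment need can be sketched as a simple cost comparison. This is only an illustration under stated assumptions: the function name, the quotes and the minimum lease term are hypothetical, and the model ignores financing, maintenance, transport and any change in resale value over time.

```python
def cheapest_option(months_needed, purchase_price, resale_value,
                    monthly_lease, lease_min_months, monthly_rental):
    """Pick the lowest-cost way to cover one equipment need.

    Simplified, hypothetical model:
      buy   = purchase price minus an assumed resale value
      lease = monthly rate, billed for at least the minimum lease term
      rent  = monthly rate for exactly the months needed
    """
    costs = {
        "buy": purchase_price - resale_value,
        "lease": monthly_lease * max(months_needed, lease_min_months),
        "rent": monthly_rental * months_needed,
    }
    option = min(costs, key=costs.get)
    return option, costs[option]


# Hypothetical excavator quotes from one dealer.
quotes = dict(purchase_price=200_000, resale_value=150_000,
              monthly_lease=8_000, lease_min_months=12,
              monthly_rental=12_000)

short_job = cheapest_option(months_needed=2, **quotes)
long_job = cheapest_option(months_needed=24, **quotes)
```

With these assumed quotes, renting is cheapest for the two-month need and buying is cheapest for the two-year need. A real solution would run this comparison across many dealers, locations and forecast needs, which is where the volume and variety of pricing data comes in.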

INDUSTRY SOLUTIONS
The E&C Industry is very fragmented, with many different types of companies providing
different services that need to be understood when engaging with these companies. For a better
understanding of how to position Big Data, link in to the Oracle Industry Solutions database for
conversation scripts and questionnaires that can get your client thinking about their analytics and
data warehouse solutions.
Go to the Sales & Marketing content portal for industry overview documents as well as
specific materials to support our E&C Industry Solutions:
http://my.oracle.com/site/ibu/portal/IndustrySalesPlays/Industries-A-K/EngineeringConstruction/Solution1/index.html
Within the Industry Solution Executive Overview you will find several short but very
insightful paragraphs that will help in understanding the complex nuances of this industry.

DATA SOURCES
Material and labor costs generally garner the greatest amount of interest and attention in
discussions of cost reduction because of the nature of the game and the difficulties around
obtaining this information. Equipment costs, while a very large cost on a project, are discussed
less because companies have developed some solutions through standardized methodologies.
However, there is a tremendous upside in being able to further optimize equipment usage.
Engineering News Record
Material standards and provider institutes (American Concrete Institute, National Ready Mix
Concrete Association, World Steel Association, etc.)
Granger
Subcontractor performance review
Equipment sales, lease and rental companies (e.g., Caterpillar, Hertz, Sun Rental)
U.S. Bureau of Labor Statistics
State and Local Government labor information
Payroll

Time cards with task performed

Chapter 12

Oil and Gas Industry


The history of the oil & gas industry is really the history of
Big Data. The oil & gas industry has been dealing with
problems of big data acquisition, storage, management,
processing and analysis since before digital computers
appeared on the scene. Seismic surveys which used to be measured in tens of gigabytes are
now measured in terabytes, and real-time data is now streaming in from producing wells
and from well delivery processes like drilling and hydraulic fracturing. Historically, the
industry dealt with its Big Data challenges through specialized processing and storage
hardware (think Cray supercomputers and D2 tapes). As time has moved forward, other
industries, in particular those using the internet, have begun to experience big data problems
and have developed general-purpose strategies and techniques for receiving and analyzing
huge amounts of data. The oil & gas industry can benefit from these Big Data approaches and
from the Oracle technologies which embody them.

BIG DATA USE CASES


The use cases for Big Data in oil & gas are different from other industries, partly because the
techniques for analysis of huge data volumes in oil & gas are so mature that the value
proposition for a new paradigm is not clear, and partly because there is little in oil & gas which
has the public-facing aspect which creates the internet-derived variety so difficult for other
industries to handle.

Seismic Processing
In the seismic processing workflow, millions upon millions of earth measurements are
integrated into a coherent model of the earth's subsurface. In spite of the magnitude of
the data problem, Big Data techniques like the BDA and Hadoop are of little use for most of
the workflow. There are specific steps in the seismic processing workflow which would
benefit from Hadoop, like the sorting of seismic traces, but the BDA would only be really
useful for those steps. A paper given at OOW 2011 describes the performance of Hadoop for
this step. The number of customers who do seismic processing is small: the seismic
contractors like WesternGeco and CGG, and some large oil companies who still do
processing in-house.
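
To give a feel for why trace sorting suits a Hadoop-style approach, the following is a minimal sketch of the map, group and sort pattern in plain Python. It is not the method from the OOW paper. It assumes the sort in question is the common regrouping of traces by source-receiver midpoint, ordered by offset, and the header fields `id`, `source_x` and `receiver_x` are hypothetical names.

```python
from collections import defaultdict


def map_trace(trace):
    """Map step: emit (midpoint, (offset, trace id)) for one trace header."""
    midpoint = (trace["source_x"] + trace["receiver_x"]) / 2
    offset = abs(trace["receiver_x"] - trace["source_x"])
    return midpoint, (offset, trace["id"])


def sort_into_gathers(traces):
    """Group traces by midpoint, then order each group by offset."""
    groups = defaultdict(list)
    for trace in traces:
        key, value = map_trace(trace)
        groups[key].append(value)      # "shuffle": collect values per key
    # "Reduce": sort each gather by offset and keep only the trace ids.
    return {key: [trace_id for _, trace_id in sorted(values)]
            for key, values in sorted(groups.items())}


# Hypothetical trace headers in acquisition (shot) order.
traces = [
    {"id": 1, "source_x": 0, "receiver_x": 100},
    {"id": 2, "source_x": 25, "receiver_x": 75},
    {"id": 3, "source_x": 0, "receiver_x": 200},
    {"id": 4, "source_x": 50, "receiver_x": 150},
]
gathers = sort_into_gathers(traces)
```

Each trace is mapped independently and each gather is sorted independently, so on a cluster both steps can be spread across nodes. That independence is what makes this one step a reasonable fit even though the rest of the workflow is not.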

Real-time Data Acquisition


The BDA would be a great solution for customers who want to centralize their real-time data
acquisition infrastructure and bring it in from the field. The situation today is that most
companies use historians to record sensor data in real time, then summarize that data before
transmitting it after the fact to a central repository. Rather than have historians distributed
throughout the field (whether that is a producing field, along a pipeline or in a plant
environment), this function could be centralized in a BDA where all data would be preserved
and analyzed. This use case only makes sense where the communications to the sensing
devices are very reliable.
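
The value of preserving all data rather than summaries can be shown with a small sketch. The pressure samples, the window size and the alarm limit below are hypothetical, and averaging is only one of several ways a historian might summarize.

```python
from statistics import mean


def summarize(readings, window):
    """Historian-style summary: one average per fixed-size window."""
    return [mean(readings[i:i + window])
            for i in range(0, len(readings), window)]


def exceedances(values, limit):
    """Return the positions of all values above the limit."""
    return [i for i, value in enumerate(values) if value > limit]


# Hypothetical pressure samples with one short spike.
raw = [100, 101, 99, 100, 180, 100, 101, 99]
summary = summarize(raw, window=4)

spikes_in_raw = exceedances(raw, limit=150)
spikes_in_summary = exceedances(summary, limit=150)
```

In this example the spike is visible in the raw samples but disappears once the data is averaged, so an analysis run on the central repository would never see it. Keeping the full stream centrally avoids that loss, at the price of depending on reliable communications.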

Research & Development


Every major operator and service company has an R&D lab where forward-looking computer
technology is evaluated. These groups are the perfect places to absorb Big Data technology. They
will all understand that the first company that figures out how to use this emerging paradigm
will have a tremendous competitive advantage. They will also not be disturbed by the lack of
commercial industry-specific software; it is the lack of availability of that software to the world
at large which will make the opportunity attractive.

General-purpose Parallel Processing


There are opportunities to position the Big Data Appliance and Hadoop as a general-purpose
parallel processing machine where each compute node has its own local storage and where the
nodes do not communicate one to another.
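
The shape of such a workload can be sketched as follows. This is an illustration only: threads stand in for compute nodes so the example runs on one machine, and the function names and the sum-of-squares task are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_partition(partition):
    """Work done by one node, using only its own local data."""
    return sum(x * x for x in partition)


def run_shared_nothing(partitions):
    """Process every partition independently, then combine the results.

    Threads stand in for compute nodes here. No worker reads another
    worker's partition, so the workers never need to communicate.
    """
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_partition, partitions))
    return sum(partials)


# Hypothetical data, already split across three "nodes".
total = run_shared_nothing([[1, 2], [3, 4], [5]])
```

Any job that can be split this way, with a final combine step and no exchange of data between workers along the way, is a candidate for this positioning.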

Downstream Retail
Oil & gas companies do have a public face in their retail operations. These
may be gas stations or convenience stores or even websites. Many of the retail uses of Big Data,
like Sentiment Analysis, Pricing Optimization and Customer Experience Management, would
apply equally to these operations.

Safety/Environment

There may be an opportunity for Big Data techniques to be used to improve safety or
environmental compliance through analysis of heterogeneous data after an incident or
accident. Endeca together with the BDA could be used to bring together personnel location
(from the GPS receivers in their cell phones), weather data, process-related information
(pressures and temperatures), wave state (for a floating or other offshore facility), vibration
and other sensor data, and possibly reports from the public via the internet.

Upstream Data Management


The BDA offers a place for companies to hold data which are currently difficult to manage.
There are many kinds of data acquired by oil & gas companies which are not well managed
because there are no data standards (microseismic, gravity/magnetic/electromagnetic, distributed
temperature sensors) or where the existing databases simply don't accommodate the data types
(bathymetry, weather, automated sampling and sample analysis, special core analysis, rock
mechanical properties tests, etc.). Some kinds of data, like time-series data, do not lend
themselves to storage in relational tables.
The BDA could be a permanent home for these data without requiring design and
implementation of a formal data model.

DATA SOURCES
Sensor data
Weather
Wave state
Earth Measurements
Lab Automation
