
Perspectives on Big Data and Big Data Analytics

Elena Geanina ULARU (1), Florina Camelia PUICAN (2), Anca APOSTU (3), Manole VELICANU (4)
(1) PhD Student, Institute of Doctoral Studies, Bucharest
(2) PhD Student, Institute of Doctoral Studies, Bucharest
(3) PhD Student, Institute of Doctoral Studies, Bucharest
(4) PhD Coordinator, Institute of Doctoral Studies, Bucharest
ularugeanina@yahoo.com, puicanflorina@yahoo.com, apostuanca@yahoo.com, manole.velicanu@ie.ase.ro

Nowadays companies are starting to realize the importance of using more data in order to support decisions for their strategies. It has been said, and proven through case studies, that "more data usually beats better algorithms". With this in mind, companies have started to realize that they can choose to invest in processing larger sets of data rather than in expensive algorithms. A large quantity of data is better used as a whole because of the possible correlations that emerge at scale, correlations that can never be found if the data is analyzed in separate or smaller sets. A larger amount of data gives better output, but working with it can also become a challenge due to processing limitations.
This article intends to define the concept of Big Data and to stress the importance of Big Data Analytics.
Keywords: Big Data, Big Data Analytics, Database, Internet, Hadoop project.

1 Introduction

Nowadays the Internet represents a big space where great amounts of information are added every day. The IBM Big Data Flood Infographic shows that 2.7 zettabytes of data exist in the digital universe today. According to the same study, 100 terabytes of data are uploaded daily to Facebook, and activity on social networks is leading to an estimated 35 zettabytes of data generated annually by 2020. To give an idea of these amounts, one zettabyte (ZB) equals 10^21 bytes, i.e. 10^12 GB. [1]

We can associate the importance of Big Data and Big Data Analysis with the society that we live in. Today we live in an Informational Society and we are moving towards a Knowledge Based Society; in order to extract better knowledge we need a larger amount of data. The Informational Society is a society in which information plays a major role on the economic, cultural and political stage. In the Knowledge Society, competitive advantage is gained through understanding information and predicting the evolution of facts based on data. The same happens with Big Data: every organization needs to collect a large set of data in order to support its decisions and to extract correlations, through data analysis, as a basis for those decisions.

In this article we will define the concept of Big Data, its importance and different perspectives on its use. In addition, we will stress the importance of Big Data Analysis and show how the analysis of Big Data can improve decisions in the future.

2 Big Data Concept

The term "Big Data" was first introduced to the computing world by Roger Magoulas from O'Reilly Media in 2005, in order to define a great amount of data that traditional data management techniques cannot manage and process due to its complexity and size. A study on the Evolution of Big Data as a Research and Scientific Topic shows that the term "Big Data" was present in research starting with the 1970s, but became established in publications around 2008. [2]

Nowadays the Big Data concept is treated from different points of view, covering its implications in many fields.

According to MIKE 2.0, the open source standard for Information Management, Big Data is defined by its size, comprising a large, complex and independent collection of data sets, each with the potential to interact. In addition, an important aspect of Big Data is the fact that it cannot be handled with standard data management techniques, due to the inconsistency and unpredictability of the possible combinations. [3]

In IBM's view, Big Data has four aspects:
Volume: refers to the quantity of data gathered by a company. This data must be used further to obtain important knowledge;
Velocity: refers to the time in which Big Data can be processed. Some activities are very important and need immediate responses, which is why fast processing maximizes efficiency;
Variety: refers to the types of data that Big Data can comprise. This data can be structured as well as unstructured;
Veracity: refers to the degree to which a leader trusts the information used in order to take a decision. Getting the right correlations in Big Data is therefore very important for the future of the business. [4]

In addition, in Gartner's IT Glossary, Big Data is defined as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. [5]

According to Ed Dumbill, chair of the O'Reilly Strata Conference, Big Data can be described as "data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it." [6] In a simpler definition, we consider Big Data to be an expression that comprises data sets that are very large and highly complex, both unstructured and organized, stored and processed using specific methods and techniques for business processes.

There are a lot of definitions of Big Data circulating around the world, but we consider that the most important one is the one that each leader gives to his own company's data. The way Big Data is defined has implications for the strategy of a business. Each leader has to define the concept in order to bring competitive advantage to the company.

The importance of Big Data

The main importance of Big Data consists in its potential to improve efficiency through the use of a large volume of data of different types. If Big Data is defined properly and used accordingly, organizations can get a better view of their business, leading to efficiency in different areas such as sales, improvement of the manufactured product and so forth.

Big Data can be used effectively in the following areas:
• In information technology, in order to improve security and troubleshooting by analyzing the patterns in existing logs;
• In customer service, by using information from call centers in order to discover customer patterns and thus enhance customer satisfaction by customizing services;
• In improving services and products through the use of social media content. By knowing potential customers' preferences, a company can modify its products in order to address a larger group of people;
• In the detection of fraud in online transactions for any industry;
• In risk assessment, by analyzing information from transactions on the financial market.

In the future we propose to anayze the companies nowadays are doing business
potencial of Big Data and the power that cross countries and continents and the
can be enabled through Big Data differences in privacy laws are considerable
Analysis. and have to be taken into consideration
when starting the Big Data initiative.
Big Data challenges In our opinion for an organization to
get competitive advantage from the
The understanding of Big Data manipulation of Big Data it has to take very
is mainly very important. In order to good care of all factors when implementing
determine the best strategy for a company it. One option of developing a Big Data
it is essential that the data that you are strategy is presented below. In addition, in
counting on must be properly analyzed. order to bring full capabilities to Big Data
Also the time span of this analysis is each company has to take into consideration
important because some of them need to its own typical business characteristics.
be performed very frequent in order to
determine fast any change in the business
environment.

Another aspect is represented by


the new technologies that are developed
every day. Considering the fact that Big
Data is new to the organizations
nowadays, it is necessary for these
organizations to learn how to use the new
developed technologies as soon as they
are on the market. This is an important
aspect that is going to bring competitive
advantage to a business.

The need for IT specialists it is


also a challenge for Big Data. According Fig 1. Developing a Big Data Strategy
to McKinsey’s study on Big Data called (Source http://www.navint.com) [7]
Big Data: The next frontier for
innovation, there is a need for up to
190,000 more workers with analytical
expertise and 1.5 million more data- 3 Big Data Analytics
literate managers only in the United
States. This statistics are a proof that in The world today is built on the
order for a company to take the Big Data foundations of data. Lives today are
initiative has to either hire experts or train impacted by the ability of the companies to
existing employees on the new field. dispose, interrogate and manage data. The
development of technology infrastructure is
adapted to help generate data, so that all the
Privacy and Security are also offered services can be improved as they are
important challenges for Big Data. used.
Because Big Data consists in a large As an example, internet today
amount of complex data, it is very became a huge information-gathering
difficult for a company to sort this data platform due to social media and online
on privacy levels and apply the according services. At any minute they are added data.
security. In addition many of the The explosion of data cannot be any more

In order to manage the giant volume of unstructured data stored, the "Big Data" phenomenon has emerged. It stands to reason that in the commercial sector Big Data has been adopted more rapidly in data-driven industries, such as financial services and telecommunications which, it can be argued, have been experiencing more rapid growth in data volumes compared to other market sectors, in addition to tighter regulatory requirements and falling profitability. At first, Big Data was seen as a means to reduce the costs of data management. Now, companies focus on its value-creation potential. In order to benefit from the additional insight gained, there is a need to assess the analytical and execution capabilities of Big Data.

To turn Big Data into a business advantage, businesses have to review the way they manage data within the data centre. The data is taken from a multitude of sources, both from within and from outside the organization. It can include content from videos, social data, documents and machine-generated data, from a variety of applications and platforms. Businesses need a system that is optimised for acquiring, organising and loading this unstructured data into their databases, so that it can be effectively rendered and analysed. Data analysis needs to be deep, and it needs to be rapid and conducted with business goals in mind.

The scalability of Big Data solutions within data centres is an essential consideration. Data is vast today, and it is only going to get bigger. If a data centre can only cope with the levels of data expected in the short to medium term, businesses will quickly have to spend on system refreshes and upgrades. Forward planning and scalability are therefore important.

In order to make every decision as desired, there is a need to bring the results of knowledge discovery into the business process and, at the same time, track any impact in the various dashboards, reports and exception analyses being monitored. New knowledge discovered through analysis may also have a bearing on business strategy, CRM strategy and financial strategy going forward (see Fig. 2).

Fig 2. Big Data Management

Up until mid-2009, the data management landscape was simple: online transaction processing (OLTP) systems (especially databases) supported the enterprise's business processes; operational data stores (ODSs) accumulated the business transactions to support operational reporting; and enterprise data warehouses (EDWs) accumulated and transformed business transactions to support both operational and strategic decision making.

Big Data Management is based on capturing and organizing relevant data. Data analytics aims to understand what happened and why, and to predict what will happen; deeper analytics means new analytical methods for deeper insights. [9]

Big Data analytics and the Apache Hadoop open source project are rapidly emerging as the preferred solution to the business and technology trends that are disrupting the traditional data management and processing landscape. Enterprises can gain a competitive advantage by being early adopters of Big Data analytics. Even though Big Data analytics can be technically challenging, enterprises should not delay implementation.

As the Hadoop projects mature and business intelligence (BI) tool support improves, Big Data analytics implementation complexity will be reduced, but the early-adopter competitive advantage will also wane. Technology implementation risk can be reduced by adapting existing architectural principles and patterns to the new technology and its changing requirements, rather than rejecting them. [10]

Big Data analytics can be differentiated from traditional data-processing architectures along a number of dimensions:
- speed of decision making, which is very important for decision makers;
- processing complexity, because it eases the decision-making process;
- transactional data volumes, which are very large;
- data structure: data can be structured or unstructured;
- flexibility of processing/analysis, consisting in the amount of analysis that can be performed on the data;
- concurrency. [9]

The Big Data analytics initiative should be a joint project involving both IT and business. IT should be responsible for deploying the right Big Data analysis tools and implementing sound data management practices. Both groups should understand that success will be measured by the value added by the business improvements that the initiative brings about.

Big Data provides many opportunities for deep insights via data mining:
• uncover relationships between social sentiment and sales data;
• predict product issues based on diagnostic sensor data generated by products in the field;
• in fact, the signal-to-noise issues often mean that deep analytics, mining the insight hidden in the noise, is essential, as many forms of Big Data are simply not consumable in raw form.

In terms of Big Data management and analytics, Oracle offers Engineered Systems as Big Data solutions (Fig. 3), such as Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics. Big Data solutions combine the best tools for each part of the problem. Traditional business intelligence tools rely on relational databases for storage and query execution and did not target Hadoop, so Oracle BI is combined with Oracle Big Data Connectors. The architecture loads key elements of information from Big Data sources into the DBMS; Oracle Big Data Connectors, Hive and Hadoop-aware ETL tools such as ODI provide the needed data integration capabilities (Fig. 4). The key benefits are that existing business intelligence investments and skills are leveraged, insights from Big Data are made consumable for business users, and Big Data is combined with application and OLTP data. [11]

Fig 3. Oracle Big Data Solution (Source: myoracle.com)
Fig 4. BI and Data Warehousing on Big Data (Source: myoracle.com)

"Big Data" is a data management and analytics market opportunity driven by new market requirements. For in-database analytics and data mining, Big Data Connectors are used to combine Hadoop and DBMS data for deep analytics. There is also the need to re-use SQL skills to apply deeper data mining techniques, or to re-use skills for statistical analysis. Everything is about "Big Data" instead of RAM-scale data; this is how the predictive learning of relationships between knowledge concepts and business events is done. [12]

Big Data presents a significant opportunity to create new value from giant data. It is important to determine appropriate governance procedures in order to manage development and implementations over the life of the technology and data. Failure to consider the longer-term implications of development will lead to productivity issues and cost escalations.

On the face of it, the cost of physically storing large quantities of data is dramatically reduced by the simplicity with which data can be loaded into a Big Data cluster, because the complex ETL layer seen in more traditional data warehouse solutions is no longer required. The cluster itself is also typically built using low-cost commodity hardware, and analysts are free to write code in almost any contemporary language through the streaming API available in Hadoop. Several aspects, however, add cost and complexity back:

• The business logic used within an ETL flow to tokenise a stream of data and apply data quality standards to it must be encoded (typically using Java) within each Map-Reduce program that processes the data, as must any changes in source syntax or semantics (a minimal sketch of such a mapper follows this list); [8]

• Although the storage nodes in a Hadoop cluster may be built using low-cost commodity x86 servers, the master nodes (Name Node, Secondary Name Node and Job Tracker) require higher resilience levels to be built into the servers if disaster is to be avoided. Map-Reduce operations also generate a lot of network chatter, so a fast private network is recommended. These requirements combine to add significant cost to a production cluster used in a commercial setting; [8]

• Compression capabilities in Hadoop are limited because of the HDFS block structure, and they require an understanding of the data and of compression technology to implement, adding to implementation complexity with limited impact on storage volumes.
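A minimal sketch of the first bullet above: ETL-style tokenisation and data quality rules hand-coded in Java inside a classic Hadoop Mapper. The three-field CSV layout, the field names and the validity rules are our own hypothetical assumptions, not taken from [8]; the point is that any change in source syntax or semantics means editing and redeploying this class, rather than reconfiguring an ETL tool.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleansingMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Tokenise: assume (hypothetically) a 3-field CSV record: id, customer, amount.
        String[] fields = line.toString().split(",");
        if (fields.length != 3) {
            return; // data quality rule: drop malformed records
        }
        String id = fields[0].trim();
        String amount = fields[2].trim();
        if (id.isEmpty() || !amount.matches("-?\\d+(\\.\\d+)?")) {
            return; // drop records with a missing id or a non-numeric amount
        }
        // Emit the cleansed record keyed by customer, ready for downstream aggregation.
        context.write(new Text(fields[1].trim()), new Text(id + "\t" + amount));
    }
}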
Other aspects to consider include the true cost of ownership of pre-production and production clusters, such as the design, build and maintenance of the clusters themselves, the transition of Map-Reduce code to the production cluster in accordance with standard operational procedures, and the development of these procedures. [8]

Whatever the true cost of Big Data compared to a relational data storage approach, it is important that the development of a Big Data strategy is done consciously, with an understanding of the true nature of the costs and complexity of the infrastructure, practices and procedures that are put in place.

4 Big Data Analytics Software

Currently, the trend is for enterprises to re-evaluate their approach to data storage, management and analytics, as the volume and complexity of data is growing rapidly, with unstructured data accounting for 90% of the data today.

Every day, 2.5 quintillion bytes of data are created — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from various sources such as: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, cell phone GPS signals, web and software logs, cameras, information-sensing mobile devices, aerial sensory technologies and genomics. This data is referred to as big data.

"Legacy systems will remain necessary for specific high-value, low-volume workloads, and complement the use of Hadoop - optimizing the data management structure in the organization by putting the right Big Data workloads in the right systems" [14].

As mentioned in the Introduction, Big Data spans four dimensions: Volume, Velocity, Variety and Veracity.
• Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes - even petabytes - of information (e.g. turn 12 terabytes of Tweets created each day into improved product sentiment analysis; convert 350 billion annual meter readings to better predict power consumption);
• Velocity: For time-sensitive processes such as catching fraud, big data flows must be analysed and used as they stream into the organization in order to maximize the value of the information (e.g. scrutinize 5 million trade events created each day to identify potential fraud; analyze 500 million daily call detail records in real time to predict customer churn faster);
• Variety: Big data consists of any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. The analysis of combined data types brings new aspects to problems and situations (e.g. monitor hundreds of live video feeds from surveillance cameras to target points of interest; exploit the 80% data growth in images, video and documents to improve customer satisfaction);
• Veracity: Since one in three business leaders do not trust the information they use to make decisions, establishing trust in big data presents a huge challenge as the variety and number of sources grows.

Apache Hadoop is a fast-growing big-data processing platform defined as "an open source software project that enables the distributed processing of large data sets across clusters of commodity servers" [15]. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software's ability to detect and handle failures at the application layer.

Developed by Doug Cutting, Cloudera's Chief Architect and the Chairman of the Apache Software Foundation, Apache Hadoop was born out of necessity as data from the web exploded and grew far beyond the ability of traditional systems to handle it. Hadoop was initially inspired by papers published by Google outlining its approach to handling an avalanche of data, and has since become the de facto standard for storing, processing and analyzing hundreds of terabytes, and even petabytes, of data.

Apache Hadoop is 100% open source, and pioneered a fundamentally new way of storing and processing data. Instead of relying on expensive, proprietary hardware and different systems to store and process data, Hadoop enables distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store and process the data, and can scale without limits.

In today's hyper-connected world, where more and more data is created every day, Hadoop's breakthrough advantages mean that businesses and organizations can now find value in data that was until recently considered useless.

Hadoop can handle all types of data from disparate systems: structured, unstructured, log files, pictures, audio files, communications records, email - regardless of native format. Even when different types of data have been stored in unrelated systems, it is possible to store it all in a Hadoop cluster with no prior need for a schema. By making all data usable, Hadoop provides the support to discover previously unseen relationships and reveal answers that have always been just out of reach.

In addition, Hadoop's cost advantages over legacy systems redefine the economics of data. Legacy systems, while fine for certain workloads, simply were not engineered with the needs of Big Data in mind, and are far too expensive to be used for general purposes with today's largest data sets.

Apache Hadoop has two main subprojects (a minimal word-count job exercising both is sketched after this list):
• MapReduce - the framework that understands and assigns work to the nodes in a cluster. It was defined by Google in 2004 and is able to distribute data workloads across thousands of nodes. It is based on a "break the problem up into smaller sub-problems" strategy, and can be exposed via SQL and SQL-based BI tools;
• Hadoop Distributed File System (HDFS) - an Apache open source distributed file system that spans all the nodes in a Hadoop cluster for data storage. It links together the file systems on many local nodes to make them into one big file system. HDFS assumes nodes will fail, so it achieves reliability by replicating data across multiple nodes.
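To make the two subprojects concrete, the canonical word-count job is sketched below against the classic org.apache.hadoop.mapreduce API of the Hadoop 1.x line that was current when this article was written; this is the standard textbook example, not code from a specific product, and the input and output paths are hypothetical command-line arguments. The mapper and reducer embody the "break the problem up into smaller sub-problems" strategy, while HDFS supplies the replicated storage underneath.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map step: split each input line into words and emit (word, 1).
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce step: sum the partial counts emitted for each word.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count"); // Hadoop 1.x-style constructor
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class); // combine locally to cut network chatter
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Because every map and reduce task is independent, the framework can rerun a failed task on another node holding a replica of the input block, which is exactly the application-layer fault tolerance described above.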
HDFS is expected to run on high-performance commodity hardware; it is known for highly scalable storage and automatic data replication across three nodes for fault tolerance. Furthermore, this automatic replication across three nodes eliminates the need for backup (write once, read many times).

Hadoop is supplemented by an ecosystem of Apache projects, such as Pig, Hive and ZooKeeper, that extend the value of Hadoop and improve its usability. Due to its cost-effectiveness, scalability and streamlined architecture, Hadoop changes the economics and the dynamics of large-scale computing, having a remarkable influence based on four salient characteristics. Hadoop enables a computing solution that is:
• Scalable: new nodes can be added as needed, without needing to change data formats, how data is loaded, how jobs are written, or the applications on top.
• Cost effective: Hadoop brings massively parallel computing to commodity servers. The result is a sizeable decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
• Flexible: Hadoop is schema-less and can absorb any type of data, structured or not, from any number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways, enabling deeper analyses than any one system can provide.
• Fault tolerant: when you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.

Text mining makes sense of text-rich information such as insurance claims, warranty claims, customer surveys, or the growing streams of customer comments on social networks. Optimization helps retailers and consumer goods makers, among others, with tasks such as setting prices for the best possible balance of strong-yet-profitable sales. Forecasting is used by insurance companies, for example, to estimate exposure or losses in the event of a hurricane or flood.

Cost will certainly be a software selection factor, as that is a big reason companies are adopting Hadoop; they are trying to retain and make use of all their data, and they are expecting cost savings over conventional relational databases when scaling out over hundreds of terabytes or more. Sears, for example, has more than 2 petabytes of data on hand and, until it implemented Hadoop two years ago, Shelley says, the company was constantly outgrowing databases and still could not store everything on one platform.

Once an application can run on Hadoop, it will presumably be able to handle projects with even bigger and more varied data sets, and users will be able to quickly analyze new data sets without the delays associated with transforming data to meet a rigid, predefined data model, as required in relational environments.

From an architectural point of view, Hadoop consists of Hadoop Common, which provides access to the filesystems supported by Hadoop. The Hadoop Common package contains the necessary JAR files and scripts needed to start Hadoop. The package also provides source code, documentation, and a contribution section that includes projects from the Hadoop community.

For effective scheduling of work, every Hadoop-compatible filesystem should provide location awareness: the name of the rack (more precisely, of the network switch) where a worker node is. Hadoop applications can use this information to run work on the node where the data is and, failing that, on the same rack/switch, reducing backbone traffic. The Hadoop Distributed File System (HDFS) uses this when replicating data, to try to keep different copies of the data on different racks. The goal is to reduce the impact of a rack power outage or switch failure, so that even if these events occur the data may still be readable.
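As a sketch of how this location awareness can be supplied programmatically, the class below implements Hadoop's DNSToSwitchMapping interface in its single-method Hadoop 1.x form; a cluster would be pointed at such a class through the topology.node.switch.mapping.impl configuration property (a topology script is the more common alternative). The subnet-to-rack naming rule here is purely hypothetical.

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.net.DNSToSwitchMapping;

public class SubnetRackMapping implements DNSToSwitchMapping {
    // Map each node name or IP address to a rack path used by HDFS replica placement.
    public List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<String>();
        for (String name : names) {
            // Hypothetical rule: use the third octet of the IP as the rack id,
            // e.g. 10.1.7.42 -> /rack-7; anything else falls back to the default rack.
            String[] octets = name.split("\\.");
            if (octets.length == 4) {
                racks.add("/rack-" + octets[2]);
            } else {
                racks.add("/default-rack");
            }
        }
        return racks;
    }
}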
Fig 5. A multi-node Hadoop cluster [13]

As shown in Fig. 5, a small Hadoop cluster will include a single master and multiple worker nodes. The master node consists of a JobTracker, TaskTracker, NameNode and DataNode. A slave or worker node acts as both a DataNode and a TaskTracker, though it is possible to have data-only worker nodes and compute-only worker nodes; these are normally used only in non-standard applications. Hadoop requires JRE 1.6 or higher, and the standard startup and shutdown scripts require Secure Shell (SSH) to be set up between the nodes in the cluster.

In a larger cluster, HDFS is managed through a dedicated NameNode server that hosts the filesystem index, and a secondary NameNode that can generate snapshots of the NameNode's memory structures, thus preventing filesystem corruption and reducing loss of data. Similarly, a standalone JobTracker server can manage job scheduling.
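The filesystem described above can also be driven directly from Java through the org.apache.hadoop.fs API, as in the minimal sketch below: the client contacts the NameNode for metadata, while the file bytes flow to and from DataNodes. The NameNode address, the path and the replication factor are hypothetical, and fs.default.name is the Hadoop 1.x-era property naming the default filesystem.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:8020"); // hypothetical NameNode address
        FileSystem fs = FileSystem.get(conf);

        // Write once: create a file replicated on three DataNodes.
        Path file = new Path("/data/sample.txt");
        FSDataOutputStream out = fs.create(file, (short) 3);
        out.writeBytes("hello big data\n");
        out.close();

        // Read many times: the read can be served from any replica.
        BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(file)));
        System.out.println(in.readLine());
        in.close();
        fs.close();
    }
}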

In clusters where the Hadoop MapReduce engine is deployed against an alternate filesystem, the NameNode, secondary NameNode and DataNode architecture of HDFS is replaced by the filesystem-specific equivalent.

One of the cost advantages of Hadoop is that, because it relies on an internally redundant data structure and is deployed on industry-standard servers rather than on expensive specialized data storage systems, you can afford to store data that was not previously viable to keep.

Big Data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make businesses more agile, and to answer questions that were previously considered beyond reach. Enterprises that build their own Big Data solution can afford to store literally all the data in their organization and keep it all online for real-time interactive querying, business intelligence, analysis and visualization.

5 Conclusions

2012 is the year in which companies are starting to orient themselves towards the use of Big Data. That is why this article presents the Big Data concept and its associated technologies, in order to better understand the multiple benefits of this new concept and technology. In our future research we propose to further investigate the practical advantages that can be gained through Hadoop.

References

[1] G. Noseworthy, Infographic: Managing the Big Flood of Big Data in Digital Marketing, 2012, http://analyzingmedia.com/2012/infographic-big-flood-of-big-data-in-digital-marketing/
[2] H. Moed, The Evolution of Big Data as a Research and Scientific Topic: Overview of the Literature, ResearchTrends, 2012, http://www.researchtrends.com
[3] MIKE 2.0, Big Data Definition, http://mike2.openmethodology.org/wiki/Big_Data_Definition
[4] P. Zikopoulos, T. Deutsch, D. Deroos, Harness the Power of Big Data, 2012, http://www.ibmbigdatahub.com/blog/harness-power-big-data-book-excerpt
[5] Gartner, Big Data Definition, http://www.gartner.com/it-glossary/big-data/
[6] E. Dumbill, "What is big data?", 2012, http://strata.oreilly.com/2012/01/what-is-big-data.html
[7] Navint Partners, "Why is BIG Data Important?", White Paper, May 2012, http://www.navint.com/images/Big.Data.pdf
[8] Greenplum, A unified engine for RDBMS and Map Reduce, 2009, http://www.greenplum.com/resources/mapreduce/
[9] 4syth.com (emerging big data thought leaders), For Big Data Analytics There's No Such Thing as Too Big: The Compelling Economics and Technology of Big Data Computing, White Paper, March 2012
[10] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh and A. Hung Byers, Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute, May 2011
[11] Oracle Information Architecture: An Architect's Guide to Big Data, An Oracle White Paper in Enterprise Architecture, August 2012
[12] http://bigdataarchitecture.com/
[13] http://www.oracle.com/us/corporate/press/1453796
[14] http://www.informationweek.com/software/business-intelligence/sas-gets-hip-to-hadoop-for-big-data/240009035?pgno=2
[15] http://en.wikipedia.org/wiki/Apache_Hadoop

Elena-Geanina ULARU graduated from the Faculty of Cybernetics, Statistics and Economic Informatics of the Academy of Economic Studies in 2008. She holds a Master's degree obtained at the Faculty of Cybernetics, Statistics and Economic Informatics of the Academy of Economic Studies and is currently a second-year PhD student at the Institute of Doctoral Studies, doing her research at the University of Economics, Prague.

Florina Camelia PUICAN is a second-year PhD student at the Institute of Doctoral Studies, Bucharest. In 2008 she graduated from the Faculty of Business Administration with teaching in foreign languages (English) at the Academy of Economic Studies, Bucharest, and in 2009 from the Faculty of Mathematics and Computer Science, Computer Science section, University of Bucharest. Since 2010 she holds a Master's degree obtained at the Faculty of Business Administration with teaching in foreign languages (English) at the Academy of Economic Studies, Bucharest. During her studies and work experience she has acquired a wide range of skills in economics, information technology and information systems for business, and in the design and management of information systems and databases.

Anca APOSTU graduated from the Academy of Economic Studies of Bucharest (Romania), Faculty of Cybernetics, Statistics and Economic Informatics, in 2006. She has held a Master's diploma in Economic Informatics since 2010 and is currently a PhD candidate in Economic Informatics with the doctoral thesis "Informatics solution in a distributed environment regarding unitary tracking of prices". Her scientific fields of interest include: Economics, Databases, Programming, Information Systems, Information Security and Distributed Systems.

Manole VELICANU is a Professor at the Economic Informatics, Cybernetics and Statistics Department of the Faculty of Cybernetics, Statistics and Economic Informatics of the Academy of Economic Studies of Bucharest. He graduated in Economic Informatics from the Academy of Economic Studies in 1976, has held a PhD diploma in Economic Informatics since 1994, and since 2002 he has been a PhD coordinator in the field of Economic Informatics. He is the author of 21 books in the domain of economic informatics and 84 published articles, and has presented 74 scientific papers at conferences. He has participated (as director or as team member) in more than 50 research projects financed by national and international research programs. His fields of interest include: Database Systems, Business Intelligence, Information Systems, Artificial Intelligence and Programming Languages.
