Publication History
Received: 25 December 2015
Accepted: 12 January 2016
Published: 1 February 2016
Citation
Bhuvaneswari Ragothaman, Elsa Jose, Surya Prabha M, Sarojini Ilango B. Big Data Framework for Healthcare using Hadoop. Indian Journal of Science, 2016, 23(78), 121-126
Big Data Framework for Healthcare using Hadoop
Bhuvaneswari Ragothaman1, Elsa Jose2, Surya Prabha M3, Dr. B. Sarojini Ilango4
1,2 MPhil Scholar, Avinashilingam University, Coimbatore
3 Ph.D Scholar, Avinashilingam University, Coimbatore
4 Assistant Professor, Avinashilingam University, Coimbatore
Abstract
The technological advancements prevailing today make it possible to sense, collect, transmit, store and analyze voluminous, heterogeneous medical data. This data can be mined to reveal valuable and interesting patterns. However, extracting knowledge from big data poses a number of challenges that need to be addressed. In this paper we propose a big data architecture for processing medical data. To handle this large volume of data, the Hadoop tool can be used. The Apache Hadoop framework for processing a distributed healthcare database is also discussed.
[Figure: characteristics of Big Data - Velocity, Variety, Veracity]
Role of Big Data in Healthcare

In healthcare, a huge amount of data is generated from various departments and sources, such as research data, clinical data, medical insurance claims, drug details and electronic medical records (EMR). This enormous amount of data, termed Big Data, poses problems of storage, processing and management. At the same time, Big Data analytics has enabled various advancements in healthcare. With the help of analysis, diseases can be identified more easily, drugs and treatment can be given more efficiently, and side effects of the treatment can be avoided. This also minimizes the cost of treatment. The quality of the treatment and the quality of the hospital can be improved through the examinations conducted by the Indian Medical Council.
Big Data Architecture for Healthcare

The Big Data architecture for healthcare consists of three layers: the Data Collection Layer, the Data Management Layer and the Application Service Layer [6].

3. Application Service Layer: This layer consists of an Application Program Interface (API), a User Interface and Data Access. The API is used by developers, and the user interface enables end users to access the data in an efficient way.
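The layered design described above can be illustrated with a minimal Python sketch. This is not an implementation from the paper; all class names, field names and sample records are hypothetical, and each layer stands in for what would be a distributed service in practice.

```python
# Illustrative sketch of the three-layer healthcare Big Data architecture.
# All class and field names are hypothetical, not from the paper.

class DataCollectionLayer:
    """Gathers raw records from sources such as EMRs and insurance claims."""
    def collect(self):
        # In a real system these would stream in from hospital systems.
        return [
            {"patient_id": 1, "source": "EMR", "reading": 98.6},
            {"patient_id": 2, "source": "claims", "reading": 101.2},
        ]

class DataManagementLayer:
    """Stores and indexes collected records (e.g., on HDFS in practice)."""
    def __init__(self):
        self._store = {}

    def ingest(self, records):
        for rec in records:
            self._store.setdefault(rec["patient_id"], []).append(rec)

    def query(self, patient_id):
        return self._store.get(patient_id, [])

class ApplicationServiceLayer:
    """Exposes data access to developers (API) and end users (UI)."""
    def __init__(self, management):
        self._management = management

    def api_get_readings(self, patient_id):
        return [r["reading"] for r in self._management.query(patient_id)]

# Wire the layers together, bottom to top.
collection = DataCollectionLayer()
management = DataManagementLayer()
management.ingest(collection.collect())
service = ApplicationServiceLayer(management)
print(service.api_get_readings(2))  # [101.2]
```

The point of the sketch is the direction of dependency: the application service layer only ever talks to the management layer, never directly to the raw collection sources.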
Apache Hadoop Framework

Hadoop is a framework that allows the distributed processing of large data sets across clusters of computers using simple programming models. MapReduce and HDFS are the two main components of Hadoop, and more than 150 ecosystem tools are available around it for handling large amounts of data. Hadoop was inspired by Google's MapReduce and Google File System papers; it is now maintained as an open-source project by the Apache Software Foundation and is used in various fields, by companies such as Yahoo and Facebook, to handle the large amounts of data they generate. The Hadoop framework consists of four modules:

1. Hadoop Common consists of common utilities that support the other Hadoop modules.
2. Hadoop Distributed File System (HDFS) is a distributed file system that provides high-throughput access to application data.
3. Hadoop YARN is a framework for job scheduling and cluster resource management.
4. Hadoop MapReduce is a YARN-based system for parallel processing of large data sets.

Pig

Pig is a language used for analyzing and processing large data sets. It runs in two modes: Local mode and Hadoop mode. To run a script in Local mode, no HDFS or Hadoop installation is required; the script runs against the local file system. To run a script in Hadoop mode, a Hadoop installation is required, and the script is compiled into jobs that run on the Hadoop cluster.

Hadoop Distributed File System (HDFS)

HDFS is a distributed file system designed to hold very large amounts of data and is highly fault tolerant. In the case of a system failure, loss of data is minimized: HDFS replicates each block of data across several data nodes in the cluster (three by default) and manages the data transfer.

HDFS Architecture

HDFS follows a master-slave architecture and consists of a Name node, Data nodes and Blocks.
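The MapReduce module listed above follows a simple map-shuffle-reduce model, which can be sketched in pure Python on a word-count example. This is only a single-process illustration of the programming model, not the Hadoop API; Hadoop runs the same three phases in parallel across a cluster.

```python
# Minimal pure-Python sketch of the MapReduce programming model that
# Hadoop MapReduce implements at cluster scale. Function names are
# illustrative, not Hadoop API calls.
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the values for each key.
    return {key: sum(values) for key, values in grouped.items()}

# Each "document" stands in for one input split processed by one mapper.
splits = ["big data in healthcare", "big data with hadoop"]
pairs = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts["big"])     # 2
print(counts["hadoop"])  # 1
```

Because each map call sees only its own split and each reduce call sees only one key's values, every phase can be distributed across machines without the phases themselves changing.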
Data Node: This node manages the data storage and performs read and write operations on the file system as per client requests. According to the instructions given by the name node, it also performs operations such as block creation, deletion and replication.

Blocks: Data stored in HDFS is divided into segments that are stored on individual data nodes; these segments are known as blocks. A block is the minimum amount of data that HDFS can read or write.
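The splitting of a file into blocks and the replication of each block across data nodes can be sketched as follows. The sizes and the round-robin placement policy here are deliberately simplified assumptions for illustration; in real HDFS the default block size is 128 MB, the default replication factor is 3, and the name node applies a rack-aware placement policy.

```python
# Sketch of how HDFS splits a file into fixed-size blocks and replicates
# each block across data nodes. Sizes are tiny for illustration; real
# HDFS blocks default to 128 MB.

BLOCK_SIZE = 8    # bytes per block (illustrative only)
REPLICATION = 3   # copies of each block (HDFS default)
DATA_NODES = ["node1", "node2", "node3", "node4"]

def split_into_blocks(data, block_size=BLOCK_SIZE):
    # The file is cut into fixed-size segments; the last block may be short.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes, replication=REPLICATION):
    # Round-robin placement stands in for the name node's placement policy.
    placement = {}
    for i, _ in enumerate(blocks):
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement

file_data = b"patient records stored in HDFS"     # 30 bytes
blocks = split_into_blocks(file_data)
placement = place_replicas(blocks, DATA_NODES)

print(len(blocks))   # 4 blocks for a 30-byte file with 8-byte blocks
print(placement[0])  # ['node1', 'node2', 'node3']
```

Because every block exists on several nodes, the failure of any single machine leaves at least two live copies of each of its blocks, which is what makes HDFS fault tolerant.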
References:
1. Saravana Kumar N M, Eswari T, Sampath P, Lavanya S. Predictive Methodology for Diabetic Data Analysis in Big Data. Elsevier, 2nd International Symposium on Big Data and Cloud Computing (ISBCC'15), 2015.
2. Archenaa J, Mary Anita E A. A Survey of Big Data Analytics in Healthcare and Government. Elsevier, 2nd International Symposium on Big Data and Cloud Computing (ISBCC'15), 2015.
3. O'Driscoll A, Daugelaite J, Sleator R D. 'Big data', Hadoop and cloud computing in genomics. Elsevier, Journal of Biomedical Informatics, 2013.
4. Viceconti M, Hunter P, Hose R. Big Data, Big Knowledge: Big Data for Personalized Healthcare. IEEE Journal of Biomedical and Health Informatics, Vol 19, No 4, July 2015.
5. Hashem I A T, Yaqoob I, Anuar N B, Mokhtar S, Gani A, Khan S U. The rise of "big data" on cloud computing: Review and open research issues. Elsevier, Information Systems 47 (2015) 98-115, 2015.
6. Zhang Y, Qiu M, Tsai C-W, Hassan M M, Alamri A. Health-CPS: Healthcare Cyber-Physical System Assisted by Cloud and Big Data. IEEE Systems Journal, 2015.