CHARM: A Cost-efficient Multi-cloud Data Hosting Scheme with High Availability

Abstract:
More and more enterprises and organizations are hosting their data in the cloud
in order to reduce IT maintenance cost and enhance data reliability.
However, facing numerous cloud vendors as well as their heterogeneous pricing
policies, customers may well be perplexed about which cloud(s) are suitable for
storing their data and which hosting strategy is cheaper. The general status quo is
that customers usually put their data into a single cloud (which is subject to
vendor lock-in risk) and then simply trust to luck. Based on a comprehensive
analysis of various state-of-the-art cloud vendors, this paper proposes a novel data
hosting scheme (named CHARM) which integrates two key desired functions. The
first is selecting several suitable clouds and an appropriate redundancy strategy to
store data with minimized monetary cost and guaranteed availability. The second is
triggering a transition process to redistribute data according to variations in
data access patterns and cloud pricing. We evaluate the performance of CHARM
using both trace-driven simulations and prototype experiments. The results show
that, compared with the major existing schemes, CHARM not only saves around
20% of monetary cost but also exhibits sound adaptability to data and price
adjustments.

Existing System
Head office: 3nd floor, Krishna Reddy Buildings, OPP: ICICI ATM, Ramalingapuram, Nellore
www.pvrtechnology.com, E-Mail: pvrieeeprojects@gmail.com, Ph: 81432 71457

In existing industrial data hosting systems, data availability (and reliability) is
usually guaranteed by replication or erasure coding. In the multi-cloud scenario,
we also use them to meet different availability requirements, but the
implementation is different. For replication, replicas are put into several clouds,
and a read access is served only by the cheapest cloud, that is, the one that charges
the least for outgoing bandwidth and GET operations (unless that cloud is
unavailable at the time). For erasure coding, data is encoded into n blocks,
comprising m data blocks and n - m coding blocks, and these blocks are put into n
different clouds. In this case, though data availability can be guaranteed with less
storage space (compared with replication), a read access has to be served by
multiple clouds that store the corresponding data blocks. Consequently, erasure
coding cannot make full use of the cheapest cloud as replication does. Worse still,
this shortcoming is amplified in the multi-cloud scenario, where bandwidth is
generally (much) more expensive than storage space.
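
The trade-off described above can be made concrete with a small cost sketch. The
prices below are made-up illustrative values, not actual vendor prices, and the
function names are hypothetical. The sketch assumes storage is billed per GB-month
and outgoing bandwidth per GB, and it ignores per-operation charges.

```python
# Hypothetical per-cloud prices: (storage $/GB-month, outgoing bandwidth $/GB).
# These numbers are illustrative assumptions, not real vendor prices.
PRICES = {
    "cloud_a": (0.030, 0.12),
    "cloud_b": (0.025, 0.20),
    "cloud_c": (0.040, 0.09),
}


def replication_cost(size_gb, reads, clouds):
    """Full copy in every cloud; all reads go to the cheapest-bandwidth cloud."""
    storage = sum(PRICES[c][0] for c in clouds) * size_gb
    cheapest_bw = min(PRICES[c][1] for c in clouds)
    return storage + cheapest_bw * size_gb * reads


def erasure_cost(size_gb, reads, clouds, m):
    """One block of size/m per cloud; each read fetches m blocks."""
    block = size_gb / m
    storage = sum(PRICES[c][0] for c in clouds) * block
    # A read needs any m blocks, so use the m cheapest-bandwidth clouds.
    bw = sorted(PRICES[c][1] for c in clouds)[:m]
    return storage + sum(bw) * block * reads
```

With these assumed prices, both two replicas (on cloud_a and cloud_c) and an
m = 2, n = 3 erasure code tolerate one cloud outage. A 1 GB file that is never read
is cheaper under erasure coding, while the same file read ten times a month is
cheaper under replication.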

Proposed System:

The proposed CHARM scheme. In this paper, we propose a novel cost-efficient
data hosting scheme with high availability in heterogeneous multi-cloud, named
CHARM. It intelligently puts data into multiple clouds with minimized monetary
cost and guaranteed availability. Specifically, we combine the two widely used
redundancy mechanisms, i.e., replication and erasure coding, into a uniform model
to meet the required availability in the presence of different data access patterns.
Next, we design an efficient heuristic-based algorithm to choose proper data
storage modes (involving both clouds and redundancy mechanisms). Moreover, we
implement the necessary procedure for storage mode transition (for efficiently
redistributing data) by monitoring the variations of data access patterns and pricing
policies. We evaluate the performance of CHARM using both trace-driven
simulations and prototype experiments. The traces are collected from two online
storage systems, both of which possess hundreds of thousands of users. In the
prototype experiments, we replay samples from the two traces for a whole month
on top of four mainstream commercial clouds: Amazon S3, Windows Azure,
Google Cloud Storage, and Aliyun OSS. Evaluation results show that, compared
with the major existing schemes (which are elaborated in Section VII-B), CHARM
not only saves around 20% (more precisely, 7% to 44%) of monetary cost but also
exhibits sound adaptability to data and price adjustments.

Advantages:
CHARM favors the replication mechanism when the file size is small. That is why
gray level 4 extends into the region of lower read count and smaller file size. The
storage mode table depends only on the prices of the available clouds and the
required availability. If the prices change, the table changes accordingly.

Problem Statement
Nevertheless, in the multi-cloud setting, people still encounter two critical
problems:
How to choose appropriate clouds to minimize monetary cost in the
presence of heterogeneous pricing policies?
How to meet the different availability requirements of different services?
As to monetary cost, it mainly depends on data-level usage, particularly
storage capacity consumption and network bandwidth consumption.
As to the availability requirement, the major concern lies in which redundancy
mechanism (i.e., replication or erasure coding) is more economical for
specific data access patterns. In other words, the fundamental challenge
is:

How to combine the two mechanisms elegantly so as to greatly reduce
monetary cost and meanwhile guarantee required availability?
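
The availability requirement raised in this section can be checked numerically.
The sketch below is one way to do so under two stated assumptions that go beyond
the text: replication is treated as the special case of erasure coding with m = 1,
and cloud outages are treated as independent. The function name and the
availability figures are hypothetical.

```python
from itertools import combinations
from math import prod


def availability(cloud_avail, m):
    """Probability that at least m of the given clouds are up.

    cloud_avail: per-cloud availability values in [0, 1].
    m: number of blocks needed to rebuild the data (m = 1 means replication).
    Assumes cloud outages are independent, which is a simplification.
    """
    n = len(cloud_avail)
    total = 0.0
    for k in range(m, n + 1):
        # Enumerate every way exactly k of the n clouds can be up.
        for up in combinations(range(n), k):
            up_set = set(up)
            total += prod(
                a if i in up_set else 1.0 - a
                for i, a in enumerate(cloud_avail)
            )
    return total
```

For example, two replicas on clouds that are each 99% available give about 99.99%,
and an m = 2, n = 3 code on three such clouds gives about 99.97%.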

Scope
As a holistic storage system, there are several other factors to be considered, such as
cache strategies, geographical data consistency, etc. However, we focus only on the data
hosting strategy that minimizes monetary cost while meeting flexible availability
requirements. Though we have considered complexity and feasibility when designing
this strategy, the system design is out of the scope of this paper, and we leave the detailed
system design of multi-cloud data hosting to future work. The complexity of this
algorithm is dominated by its first loop, and the worst-case complexity is O(Fn), where Fn is
the number of files. To reduce the complexity further, we can classify files with
similar access patterns into groups and perform the transition per group. This is also
out of the scope of this paper.

Implementation of modules
Architecture:

Multi-cloud:
Many data centers are distributed around the world, and one region, such as
America or Asia, usually has several data centers belonging to the same or different
cloud providers. So technically all the data centers can be accessed by a user in a
certain region, but the user would experience different performance. The latency
of some data centers is very low, while that of others may be intolerably high.
CHARM chooses clouds for storing data from all the available clouds that meet
the performance requirement, that is, they can offer acceptable throughput and
latency when they are not in outage. The storage mode transition does not impact
the performance of the service. Since it is not a latency-sensitive process, we can
decrease the priority of transition operations and perform the transition in batches
when the proxy has a low workload.

Data hosting:
In this section, we elaborate a cost-efficient data hosting model with high
availability in heterogeneous multi-cloud, named CHARM. The architecture of
CHARM is shown in Figure 3. The whole model is located in the proxy in Figure
1. There are four main components in CHARM: Data Hosting, Storage Mode
Switching (SMS), Workload Statistic, and Predictor. Workload Statistic keeps
collecting and processing access logs to guide the placement of data. It also sends
statistical information to Predictor, which guides the action of SMS. Data Hosting
stores data using replication or erasure coding, according to the size and access
frequency of the data. SMS decides whether the storage mode of certain data
should be changed from replication to erasure coding or the reverse, according to
the output of Predictor. The change of storage mode runs in the
background so as not to impact the online service. Predictor is used to predict the
future access frequency of files. The time interval for prediction is one month, that
is, we use the preceding months to predict the access frequency of files in the next
month. However, we do not put emphasis on the design of the predictor, because
there are already many good prediction algorithms. Moreover, a very simple
predictor, which uses the weighted moving average approach, works well in our
data hosting model. Data Hosting and SMS are two important modules in
CHARM. Data Hosting decides the storage mode and the clouds that the data should
be stored in. This is a complex integer programming problem demonstrated in the
following subsections. Then we illustrate how SMS works in detail in Section V, that
is, when and how many times the transition should be implemented.
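
A weighted moving average predictor of the kind mentioned above could look like the
following sketch. The weights and the function name are assumptions chosen for
illustration; the source does not specify them.

```python
def predict_next_month(monthly_reads, weights=(0.5, 0.3, 0.2)):
    """Predict next month's read count from the most recent months.

    monthly_reads: read counts ordered from oldest to newest.
    weights: hypothetical weights, most recent month first.
    """
    recent = list(reversed(monthly_reads))[: len(weights)]
    if not recent:
        return 0.0
    used = weights[: len(recent)]
    # Renormalise so that short histories still produce a proper average.
    return sum(w * r for w, r in zip(used, recent)) / sum(used)
```

For a file read 100, 200, and 300 times in the last three months, this sketch
predicts 230 reads for the next month, weighting the latest month most heavily.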

Cloud Storage:

Cloud storage services have become increasingly popular. Because of the importance of
privacy, many cloud storage encryption schemes have been proposed to protect data from
those who do not have access. All such schemes assume that cloud storage providers are
safe and cannot be hacked; however, in practice, some authorities (i.e., coercers) may
force cloud storage providers to reveal user secrets or confidential data on the cloud, thus
altogether circumventing storage encryption schemes. In this paper, we present our
design for a new cloud storage encryption scheme that enables cloud storage providers to
create convincing fake user secrets to protect user privacy. Since coercers cannot tell
whether the obtained secrets are true or not, the cloud storage providers ensure that user
privacy is still securely protected. Most of the proposed schemes assume that cloud storage
service providers or the trusted third parties handling key management are trusted and
cannot be hacked; however, in practice, some entities may intercept communications
between users and cloud storage providers and then compel storage providers to release
user secrets by using government power or other means. In this case, encrypted data are
assumed to be known and storage providers are requested to release user secrets. We aimed
to build an encryption scheme that could help cloud storage providers avoid this
predicament. In our approach, we offer cloud storage providers a means to create fake user
secrets. Given such fake user secrets, outside coercers can only obtain forged data from a
user's stored ciphertext. Once coercers think the received secrets are real, they will be
satisfied and, more importantly, cloud storage providers will not have revealed any real
secrets. Therefore, user privacy is still protected. This concept comes from a special kind
of encryption scheme called deniable encryption.

Owner Module:
The Owner module lets owners upload their files under an access policy. The owner
first obtains the public key for the file to be uploaded and then requests the secret
key for that file. Using that secret key, the owner uploads the file.

User Module:
This module helps the client search for a file using the file ID and
file name. If the file ID or name is incorrect, the file is not returned;
otherwise the server asks for the public key and returns the encrypted file. To
obtain the decrypted file, the user must have the secret key.

Algorithm:
The key idea of this heuristic algorithm can be described as follows.
We first assign each cloud a value, calculated from four factors (i.e.,
availability, storage, bandwidth, and operation prices), to indicate the preference
for that cloud. We choose the n most preferred clouds, and then heuristically
exchange a cloud in the preferred set with a cloud in the complementary set to
search for a better solution. This is similar to the idea of the Kernighan-Lin
heuristic algorithm, which is applied to effectively partition graphs so as to
minimize the sum of the costs on all edges cut. The preference for a cloud is affected
by the four factors, which have different weights. For availability, higher is better;
for prices, lower is better.
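
The idea above can be sketched as a simple swap-based local search. This is only an
illustration in the spirit of the description, not the paper's exact algorithm: the
linear preference formula, the default weights, the dictionary field names, and the
caller-supplied `cost` function are all assumptions.

```python
def preference(cloud, weights):
    """Higher is better: availability counts positively, prices negatively."""
    w_avail, w_storage, w_bw, w_op = weights
    return (
        w_avail * cloud["availability"]
        - w_storage * cloud["storage_price"]
        - w_bw * cloud["bandwidth_price"]
        - w_op * cloud["operation_price"]
    )


def choose_clouds(clouds, n, cost, weights=(1.0, 1.0, 1.0, 1.0)):
    """Pick n clouds: start from the n most preferred, then swap greedily.

    clouds: dict mapping a cloud name to its attribute dict.
    cost: hypothetical caller-supplied function mapping a tuple of names to
          a monetary cost (it may return float("inf") for infeasible sets).
    """
    ranked = sorted(
        clouds, key=lambda c: preference(clouds[c], weights), reverse=True
    )
    chosen, rest = ranked[:n], ranked[n:]
    best = cost(tuple(chosen))
    improved = True
    while improved:
        improved = False
        for i in range(len(chosen)):
            for j in range(len(rest)):
                # Try exchanging one preferred cloud with one outside cloud.
                trial = chosen[:i] + [rest[j]] + chosen[i + 1:]
                trial_cost = cost(tuple(trial))
                if trial_cost < best:
                    chosen[i], rest[j] = rest[j], chosen[i]
                    best = trial_cost
                    improved = True
    return sorted(chosen), best
```

The search stops when no single exchange lowers the cost, so it returns a local
optimum rather than a guaranteed global one, which is the usual trade-off of this
kind of heuristic.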

Conclusion:

Cloud services are experiencing rapid development, and services based on
multi-cloud are also becoming prevalent. One of the main concerns when moving
services into clouds is capital expenditure. So, in this paper, we design a novel
storage scheme, CHARM, which guides customers to distribute data among clouds
cost-effectively. CHARM makes fine-grained decisions about which storage mode
to use and which clouds to place data in. The evaluation demonstrates the efficiency
of CHARM.
