
IPASJ International Journal of Computer Science (IIJCS)

Web Site: http://www.ipasj.org/IIJCS/IIJCS.htm


A Publisher for Research Motivation ........ Email:editoriijcs@ipasj.org
Volume 5, Issue 10, October 2017 ISSN 2321-5992

COST MINIMIZATION MODEL FOR ONLINE SOCIAL NETWORKS IN GEO-DISTRIBUTED CLOUDS

Abdul Abrar (2), Md Ateeq Ur Rahman (1)
(2) SCET, Hyderabad
(1) Professor, Dept of CSE, SCET, Hyderabad

Abstract
The growth of Internet services has had a great impact on social network users. Cloud computing is a recent technology
adopted by several Internet Service Providers (ISPs), and the data center is the core of cloud technologies. Data should
be easily available irrespective of geographic location. Different system objectives can be addressed when placing and
replicating data at different cloud servers, and social users have a different proximity to every location. This paper
addresses the data placement problem for Online Social Networks (OSNs). Usually, users' data are interconnected across
different locations, and every read/write operation incurs costs. We therefore propose an enhanced cost optimization
model for placing social media data, which investigates the storage cost, the redistribution cost and the total estimated
cost. The experimental analysis is carried out on a Twitter dataset. Local information about each user, i.e. profile data
and connection details, is collected over a certain period of time. The cost is optimized with respect to the throughput
of the data placements over the cloud servers. User data are placed on the cloud servers based on geographic distance.
The cost is explored for one-time cost reduction and continuous cost reduction in the inter-cloud setting. We achieve
10% lower cost than the prior model.
Keywords: Internet Service Providers, Data placement, Data replication, Data center, Inter-cloud, Cost optimization.
I. INTRODUCTION

A substantial number of users are geographically dispersed across Online Social Networks (OSNs) [1]. OSN users share
different sorts of data, such as images and videos of different sizes, with their connections, and the volume of data and
the number of items grow tremendously. Moreover, users expect low latency, data consistency and availability from the
OSN provider. Each user is assigned a specific threshold to protect their information from anonymous users [2]. A
possible solution would be to store the data of every user in every available datacentre. However, as the different copies
of user data may need regular updating, and because the data are very large, such a huge investment becomes infeasible
and uneconomic. Hence, there is always a trade-off between data storage cost and latency [3].
Several social network service providers maintain their own data centers to protect users' data. Constructing a private
cloud is a complex and expensive task. The providers must continually improve storage space, data transmission speed,
the energy consumption model and costs [4].
To resolve this cost-related issue, a recent technology named cloud computing is adopted. Cloud computing lets users
lease software, hardware and platforms for a certain period of time, and the cost depends on usage. However, the privacy
of the outsourced data has to be examined. In addition, cloud rental is also very costly and energy-expensive if naive
social media data replication and distribution are used [5]. Even though several cloud providers are available, basic
amenities such as setup, private storage and its management have to be analyzed for each service provider.
Let us consider two users at different locations who share their sensitive data. One solution is to store the data in each
of the two datacenters, which incurs the storage cost of both datacenters [6]. The alternative is to reduce the storage
cost by keeping the data in only one of the datacenters, which has a serious drawback: higher latency for the users.
Thus, a placement is sought in which both users have tolerable latency while paying the storage cost only once. Hence,
we need to explore all possible placement strategies to find the best one. Placement and replication of sensitive data is
a complex task in Online Social Networks (OSNs). Taking into account the latency for a user and all of their friends to
access the main user's data makes our work different from others.


The rest of the paper is organized as follows: Section II reviews related work; Section III presents the proposed
work; Section IV gives the experimental analysis and results; and Section V concludes the paper.
II. RELATED WORK

This section reviews prior work by other researchers. Online Social Networks have witnessed tremendous growth in
recent years, and Internet applications such as YouTube, Facebook and Twitter have become pervasive in everyday life.
Most researchers have studied the social structure, user behavior and network usage of the YouTube application. The
migration of data from social media to the cloud should be designed effectively for cost minimization. The authors in [7]
discussed a cluster-based social media system with cloud assistance that improves cost minimization. A similar study was
performed in [8], which proposed content migration using the Weighted Partitioning Around Medoids (WPAM) algorithm;
the model balanced the data load according to the users' access patterns. The authors in [9] suggested the SPAR model,
which performs data migration through a middleware application. The model replicated the attached nodes in the server,
with the aim of reducing the rate of data replication. The authors in [10] discussed the SNAP model, which is specifically
designed for cloud scenarios and partitions small tasks in a small-scale network. They also modularized the connected
graph in parallel.
Multi-cloud or multi-datacenter operation is a recent concept explored for optimal data center communication. An
inter-datacenter communication service was explored in [11]. The authors maintained replicated data on remote servers to
reduce the burden on social users. Data residing on a remote server must be handled with care when update operations
are performed. Their model performed read or write operations such that the replicas reduce the total inter-datacenter
communication. The authors in [12] discussed the issues that Facebook users in the US face due to Internet connectivity
problems.
TCP proxies are used to resolve the interconnectivity issue. The authors proposed to use local servers as
TCP proxies and caching servers to improve service responsiveness and efficiency, focusing on the interplay between
user behavior, OSN mechanisms, and network characteristics.
The authors in [13] discussed social media streaming algorithms for geo-distributed clouds. The objective of their model
was to reduce the response time of data retrieval. Lyapunov optimization techniques were proposed for user influence in
both the online and offline models. The authors in [14] studied social applications for efficient data collection and
preprocessing. User-generated information was collected at both local and global scope for designing networking
protocols. They determined the resource allocation across clouds for different network sizes. The authors in [15] defined
partitioning algorithms for content analysis of social media networks. Initially, the media data were analyzed for
skewness of distribution. The authors formulated an optimization problem and solved it to preserve social relations and
to balance the workload among servers.
III. PROPOSED WORK

This section describes the working model of our proposed work. The objective of our study is to optimize the cloud
service cost without compromising the Quality of Service (QoS). The proposed model is explained as follows:
a) Construction of Online Social Networks (OSNs)

The construction of Online Social Networks (OSNs) describes the type of network used for building trust relationships
between users and their connections. Authentication is a significant process between the social users and the social
network service providers. New social users register with the OSN provider and then seek approval from it. Once
approval is granted, the new users can access the social network. Similarly, existing social users can access the social
data directly. A connected graph is plotted for every social user in the social network. In this step, we deploy the
single-master multi-slave paradigm. Consider a social graph with N users N1, N2, ..., Nn with weights wi. There are C
cloud servers, C1, C2, ..., Ck, at k different locations, with calculated weight as:

The constraint is to realize load balance among the cloud servers by controlling the weight difference and to mitigate
the cross-boundary traffic.
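
The two quantities named in this constraint can be sketched in Python. This is a minimal illustration, assuming (hypothetically) that the weight of a cloud is the sum of the weights wi of the users placed on it; the function names and the toy data are illustrative and are not taken from the paper.

```python
def cloud_weights(user_weight, placement, clouds):
    """Sum the user weights placed on each cloud (assumed definition of cloud weight)."""
    totals = {c: 0.0 for c in clouds}
    for user, cloud in placement.items():
        totals[cloud] += user_weight[user]
    return totals


def weight_imbalance(totals):
    """Largest weight difference between any two clouds."""
    values = list(totals.values())
    return max(values) - min(values)


def cross_boundary_edges(edges, placement):
    """Number of social links whose two endpoints sit on different clouds."""
    return sum(1 for u, v in edges if placement[u] != placement[v])


# Toy data (hypothetical): four users, two clouds.
user_weight = {"N1": 2.0, "N2": 1.0, "N3": 1.0, "N4": 3.0}
edges = [("N1", "N2"), ("N2", "N3"), ("N3", "N4")]
placement = {"N1": "C1", "N2": "C1", "N3": "C2", "N4": "C2"}

totals = cloud_weights(user_weight, placement, ["C1", "C2"])
print(totals, weight_imbalance(totals), cross_boundary_edges(edges, placement))
```

A placement that lowers both the imbalance and the number of cross-boundary links is preferable under the stated constraint; how the two are traded off is not specified in the text.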
b) Storage modeling and inter-cloud traffic cost


Generally, the social graph represents users as nodes and the social associations between users as edges. The storage
cost is computed for each social user with a single replica of her data. During a billing period, each user incurs traffic
costs. The traffic is analyzed for inter-cloud and intra-cloud synchronization. The read/write operations are also
considered when calculating the cost. Fortunately, their charge can be included as part of a user's traffic cost.
For example, let $\tau_u = w_u T$ denote user $u$'s traffic cost, where $w_u$ is the number of writes performed on $u$'s
data and $T$ is the average traffic cost incurred by a single write. Then, one can include the cost charged for a single
write in $T$, so that optimizing the total inter-cloud traffic cost with our model actually optimizes the sum of the
traffic cost and the read/write operation cost.
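
The relation between a user's traffic cost, the number of writes and the per-write cost T can be sketched as follows. This is an illustrative sketch only: the function names and the numeric values are assumptions, and the per-write operation charge is simply added to T as the text suggests.

```python
def per_write_cost(traffic_cost_per_write, operation_cost_per_write=0.0):
    """Average cost T of one write, with the write-operation charge folded in."""
    return traffic_cost_per_write + operation_cost_per_write


def user_traffic_cost(num_writes, T):
    """Traffic cost of one user: number of writes times the per-write cost T."""
    return num_writes * T


def total_traffic_cost(writes_per_user, T):
    """Sum of the per-user traffic costs over all users."""
    return sum(user_traffic_cost(w, T) for w in writes_per_user.values())


# Hypothetical numbers, chosen only for illustration.
writes_per_user = {"alice": 10, "bob": 4}
T = per_write_cost(traffic_cost_per_write=0.5, operation_cost_per_write=0.25)
total = total_traffic_cost(writes_per_user, T)
print(T, total)
```

Because the operation charge enters only through T, minimizing the traffic term also minimizes the combined traffic and operation cost.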
c) Prototyping the redistribution cost

Cost optimization is the major objective of our study. This step investigates the placement of data based on its cost.
Latency is the index used together with the billing period for long-term services. The service cost is analyzed according
to the user's budget. The redistribution cost occurs at the beginning of every billing period. Every user has a primary
datacentre, which is the datacentre nearest to their location. Every user reads data from the nearest datacentre that
holds a copy of the data. Thus, the final latency for every user is the sum of the latency between the user and their own
data and the latency between each of their friends and the secondary replica nearest to that friend.
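
The latency definition above can be written as a short Python sketch. The latency table, the location names and the datacentre names are hypothetical, and the code assumes that every reader (the user or a friend) is served by the closest datacentre holding a copy of the user's data.

```python
def nearest_replica_latency(reader_location, replica_sites, latency):
    """Latency from a reader to the closest datacentre holding a copy."""
    return min(latency[reader_location][dc] for dc in replica_sites)


def user_latency(user, friends, location, replicas, latency):
    """Latency of the user to their own data plus that of every friend to the nearest replica."""
    own = nearest_replica_latency(location[user], replicas[user], latency)
    friends_part = sum(
        nearest_replica_latency(location[f], replicas[user], latency)
        for f in friends[user]
    )
    return own + friends_part


# Hypothetical latencies in milliseconds: location -> datacentre -> latency.
latency = {
    "EU": {"dc_eu": 10, "dc_us": 90},
    "US": {"dc_eu": 90, "dc_us": 10},
    "ASIA": {"dc_eu": 120, "dc_us": 150},
}
location = {"u": "EU", "f1": "US", "f2": "ASIA"}
friends = {"u": ["f1", "f2"]}

one_copy = user_latency("u", friends, location, {"u": {"dc_eu"}}, latency)
two_copies = user_latency("u", friends, location, {"u": {"dc_eu", "dc_us"}}, latency)
print(one_copy, two_copies)
```

In this toy setting, adding a second replica lowers the latency total while doubling the storage for that user, which is the trade-off the model has to balance.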
d) Approximating the total cost:

Let us consider the social graph constructed during a billing period. The storage cost in the period is for storing
users' data replicas, including the data replicas of existing users and of those who have just joined the service in this
period. The inter-cloud traffic cost in the period is for propagating all users' writes to maintain replica consistency.
The redistribution cost is the cost of moving data across clouds for optimization; it is only incurred at the beginning
of a period. There is also some underlying cost for maintenance.

Thus, the storage cost is the cost of storing users' data and replicas for one month in different data centers.
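
Read together, the components described in this subsection suggest that the total cost of a billing period is their sum. One hedged way to write this, with symbols introduced here purely for illustration (they do not appear in the source), is:

```latex
\[
C_{\mathrm{total}}(t) \;=\; C_{\mathrm{storage}}(t) \;+\; C_{\mathrm{traffic}}(t)
\;+\; C_{\mathrm{redist}}(t) \;+\; C_{\mathrm{maint}}(t)
\]
```

Here $t$ indexes the billing period, $C_{\mathrm{traffic}}(t)$ is the inter-cloud traffic cost, $C_{\mathrm{redist}}(t)$ is incurred only at the beginning of the period, and $C_{\mathrm{maint}}(t)$ stands for the underlying maintenance cost.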

Fig.1 Workflow of the Proposed System

IV. EXPERIMENTAL RESULTS AND ANALYSIS

This section presents the experimental analysis of our proposed model. We collected social media data from a Twitter
dataset composed of 3,117,553 users with 23,888,143 connections. Each social user owns her profile, tweets and list of
followers. The geographic distance between the user and the cloud is used throughout our proposed model. From
continuous network monitoring, we extracted the local and global information of each user. Since the objective of the
study is to optimize the cost of geo-distributed clouds, the standard storage cost is renormalized against the
inter-cloud traffic cost, whereas the redistribution cost is derived from the standard inter-cloud traffic cost. Finally,
the total cost is estimated from the normalized standard cost.


a) One-time cost reduction:

Throughput is the performance factor used for the cost analysis. To ensure a fair comparison across social networks,
the local information about social users is used for data placement and replication on the cloud servers. Cost is
influenced by the accessibility of the data and its QoS requirements. The greedy and random methods are used for
selecting and placing the data on clouds. Let us assume each user can place data on 10 clouds. The greedy method places
data on the user's most preferred cloud. Users who are geographically close to one another tend to have similar sorted
lists of clouds. Thus, greedy can assign local users to the same nearby cloud, whereas random tends to straddle local
social relations across clouds. As the number of clouds C increases (C = 0 to 9), the storage, inter-cloud traffic and
estimated total costs are optimized for every user. Similarly, the redistribution cost also decreases, which indicates
that the data overhead is reduced. Fig. 2 shows the total cost estimated using our proposed model for increasing C.
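
The difference between the greedy and random placements can be sketched as follows. This is a minimal illustration under stated assumptions: each user has a list of clouds sorted by preference (for example by geographic distance), greedy takes the first entry, and random draws uniformly from the list. The data, the cloud names and the function names are hypothetical.

```python
import random


def greedy_placement(preferences):
    """Place each user on the first (most preferred) cloud of their sorted list."""
    return {user: clouds[0] for user, clouds in preferences.items()}


def random_placement(preferences, rng):
    """Place each user on a cloud drawn uniformly at random from their list."""
    return {user: rng.choice(clouds) for user, clouds in preferences.items()}


def split_relations(edges, placement):
    """Count social relations whose two users sit on different clouds."""
    return sum(1 for u, v in edges if placement[u] != placement[v])


# Hypothetical data: users a, b are close to C0; users c, d are close to C2.
preferences = {
    "a": ["C0", "C1", "C2"],
    "b": ["C0", "C1", "C2"],
    "c": ["C2", "C1", "C0"],
    "d": ["C2", "C1", "C0"],
}
edges = [("a", "b"), ("c", "d"), ("b", "c")]

greedy = greedy_placement(preferences)
rand = random_placement(preferences, random.Random(0))
print("greedy split relations:", split_relations(edges, greedy))
print("random split relations:", split_relations(edges, rand))
```

With greedy, the two local relations stay inside one cloud and only the long-distance relation is split; the random result depends on the seed and can split local relations as well.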

Fig.2. Total cost estimation for one-time cost reduction

b) Continuous Cost reduction:

Our proposed model incurs 10% lower maintenance cost using the greedy model. Depending on the growth rate of the
Online Social Network (OSN), the OSN provider estimates the total cost for each billing period.
Fig. 3 shows the cost administration and redistribution cost over the estimated total cost. The result is quite
natural, because the reduction in storage cost dominates the total cost reduction, and a user's stored data
accumulates and becomes much larger than her traffic as time elapses.

Fig.3. Cost administration and redistribution cost over the estimated total cost

Depending on the cloud usage, the sensitive data are placed so as to find a locally optimal solution. Since social users
can have an increasing number of friends, the time complexity of the algorithm can grow without limit. Our proposed
model yields a time complexity of $O(N^2)$, where $N$ is the number of clouds in the system; the complexity also depends
on the number of iterations.


V. CONCLUSION

In this paper, we studied cost optimization on geo-distributed clouds for Online Social Networks (OSNs). We modeled
the storage cost of data placement without compromising the Quality of Service (QoS) requirement. The objective of the
study is to optimize the cost by minimizing the use of data replicas over geo-distributed clouds. The proposed model
achieves a reduced total cost while ensuring efficient QoS and data accessibility. The experimental analysis is carried
out on a Twitter dataset using the throughput factor. Geographic distance is used throughout our proposed model. From
continuous network monitoring, we extracted the local and global information of the users. The cost is optimized for
both one-time cost reduction and continuous cost reduction. Finally, we achieved 10% lower cost than the
state-of-the-art approaches.

REFERENCES
[1] Lei Jiao et al., "Optimizing Cost for Online Social Networks on Geo-Distributed Clouds," IEEE/ACM
Transactions on Networking, Vol. 24, No. 1, February 2016.
[2] E. Protalinski. (2015). Facebook Passes 1.55B Monthly Active Users and 1.01B Daily Active Users. Available:
http://venturebeat.com/2015/11/04/facebook-passes-1-55b-monthlyactive-users-and-1-01-billion-daily-active-users/
[3] J. Constine. (2012). How Big is Facebook's Data? 2.5 Billion Pieces of Content and 500+ Terabytes Ingested
Every Day. Available:
http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/
[4] R. Miller. (2012). Facebook's $1 Billion Data Center Network. Available:
http://www.datacenterknowledge.com/archives/2012/02/02/facebooks-1-billion-data-center-network/
[5] R. Miller. (2013). How Dropbox Stores Stuff for 200 Million Users. Available:
http://www.datacenterknowledge.com/archives/2013/10/23/howdropbox-stores-stuff-for-200-million-users/
[6] A. Weiss, "Computing in the clouds," Computing 16, 2007, pp. 16-25.
[7] Advantages and Disadvantages of Cloud Computing. Available:
http://www.levelcloud.net/why-levelcloud/cloud-educationcenter/advantages-and-disadvantages-of-cloud-computing/
[8] Y. Wu, C. Wu, B. Li, L. Zhang, Z. Li, and F. C. M. Lau, "Scaling social media applications into geo-distributed
clouds," IEEE Conference on Computer Communications (INFOCOM), 2012, pp. 684-692.
[9] M. Mitchell. (1998). An Introduction to Genetic Algorithms. MIT Press.
[10] The Facebook Data Center FAQ. Available:
http://www.datacenterknowledge.com/the-facebook-data-center-faq/
[11] M. Obitko. (1998). Introduction to Genetic Algorithms. Available:
http://www.obitko.com/tutorials/geneticalgorithms/recommendations.php
[12] X. Liu, J. Chen, and Y. Yang, "A Probabilistic Strategy for Setting Temporal Constraints in Scientific
Workflows," Concurrency and Computation: Practice and Experience, Wiley, 23(16), 2011, pp. 1893-1919.
[13] J. M. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, et al., "The little engine(s) that could:
scaling online social networks," ACM SIGCOMM Computer Communication Review 41(4), 2011, pp. 375-386.
[14] D. A. Tran, K. Nguyen, and C. Pham, "S-CLONE: Socially-aware data replication for social networks,"
Computer Networks 56, 2012, pp. 2001-2013.
[15] L. Jiao, J. Li, T. Xu, W. Du, and X. Fu, "Optimizing Cost for Online Social Networks on Geo-Distributed
Clouds," IEEE/ACM Transactions on Networking, 2014, pp. 99-112.
