
Green Cloud

A literature review of Energy-Aware Computing

James W. Smith


jws7@cs.st-andrews.ac.uk

Keywords
Cloud Computing, Energy Efficiency, Energy-Aware Computing, Power-Efficient Software, Green Cloud, Low-Carbon Computing

Table of Contents
Keywords
1 Introduction
2 Cloud Computing
3 Virtualisation
4 Datacentres
  4.1 Smart Construction
  4.2 Power Usage Effectiveness
  4.3 Data Centre Productivity
  4.4 Datacentre Cooling
5 Energy Aware Computing
  5.1 Justification
  5.2 Constructing Energy-Aware Systems
    5.2.1 Powering down
    5.2.2 Task Consolidation
    5.2.3 Resource Scaling
    5.2.4 Load balancing vs load skewing
  5.3 Measuring Energy Consumption
6 Power Efficient Software
  6.1 Power Efficient Software Principles
    6.1.1 Consume power according to work done
  6.2 Software Modularity
7 Future Research
Bibliography

1 Introduction
In 2007, the total carbon footprint of the IT industry, including personal computers, mobile phones, telecom devices and datacentres, was 830 MtCO2e: 2% of the estimated total emissions from all human activity that year, a figure expected to grow in the coming years [1]. Energy powering devices in use accounts for 75% of this total [2].

There are two possible solutions for making IT systems greener: 1) improve efficiency or 2) find a plentiful supply of clean, affordable energy. As the latter is still in the realms of science fiction, energy efficiency is where the main focus of research will lie in the near future. IT companies are learning that cutting emissions and cutting costs naturally go together: by making systems energy efficient, money may be saved automatically. Energy-Aware Computing research is attempting to address this problem. Work in this field is tackling issues ranging from reducing the amount of energy required by a single processor chip to finding the most effective means of cooling a warehouse-sized datacentre.

Cloud Computing has the potential to have a massive impact, positive or negative, on the future carbon footprint of the IT sector. On the one hand, Cloud Computing datacentres now consume 0.5% of all the electricity generated in the world, a figure that will continue to grow as Cloud Computing becomes widespread, particularly as these systems are always-on and always-available. On the other hand, the large datacentres required by clouds have the potential to provide the most efficient environments for computation. Computing at this concentration and scale will drive cloud providers to build efficient systems, both to reduce the total cost of ownership (TCO) and to improve their green credentials.

Even in local datacentres, moving to a Private Cloud system can tap into these benefits, and steps can be taken to apply solutions from large-scale public clouds. For example, by accepting the performance degradation that is inevitable with a virtualized system, many current servers can be migrated to a smaller number of physical machines, enabling surplus equipment to be powered off. This is a simple example, but one which can in theory have a significant impact on energy consumption.

The main aim of Energy-Aware Computing is to promote awareness of energy consumption in both software and hardware systems. The unique position of Cloud Computing allows this area to be brought into sharper focus and will go some way to improving the carbon footprint of IT now and in the future.

Despite this progress, enterprise has been reluctant to take up Cloud services over fears in the areas of security, privacy and administrative control. These companies would rather employ their own people to administer hardware they own, on premises, with controlled access and established security procedures.


The alternative to using large-scale public Cloud systems is to use a Private Cloud, which can provide all of these properties. Clouds are typified by utility access, large-scale elasticity and rapid provision of resources, all of which are made possible through the use of virtualization on large IT clusters.

A Cloud is generally considered to be unique in terms of geography and architecture. A traditional cluster is a set of computing machines where the user knows both the location of the system and the complete make-up of its hardware. A grid is a system where the exact location is unknown (it may be distributed) but the architecture is known. With a Cloud, one traditionally knows neither, thanks to the level of abstraction provided by virtualization, which could allow both of these factors, architecture and location, to change during the lifetime of the system. This is a problem for researchers, who need to be able to reproduce results and report architecture information in scientific papers. The question remains: is it possible to perform scientific experiments on a Cloud if one cannot control the architecture, longevity and number of the computing nodes?

What if Amazon wanted to do research by measuring their own Cloud: would they then be monitoring a Private Cloud? For the sake of argument, let's say that yes, they are. A Cloud, any cloud, is a Private Cloud for someone at some point if they are able to control the architecture, longevity and number of nodes. Therefore research into Private Clouds, particularly for the purposes of measurement and usage, is useful research, and lessons learned here are transferable to larger systems. The research that will be undertaken in the future by the author will be done on the StACC Private Cloud system, with the hope that results gathered can benefit the scientific community and public cloud operators alike.

The following analyses how Cloud Computing can influence the energy usage of a computing system and how traditional energy-aware computing techniques can be applied to Cloud Computing.


2 Cloud Computing
Cloud Computing, a forefront research area in Computer Science, has the potential to change the face of the IT industry. There has been a significant amount of disagreement over how Cloud Computing is defined. The US National Institute of Standards and Technology (NIST) definition [3], one of the most widely accepted, describes cloud computing as having:

- Essential characteristics: on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service.
- Deployment models: private, community, public and hybrid clouds.
- Service models: Software, Platform and Infrastructure as a Service (SaaS, PaaS, IaaS).

Some have suggested that Cloud Computing is not a new technology, arguing that it effectively utilises technologies that have been around for a number of years; in particular it shares a connection with the field of Grid Computing [4]. The idea of centralized utility computing delivered over a network is not a new one: John McCarthy predicted that computation may someday be organized as a public utility as far back as 1961. Research in grid computing has led to significant development of on-demand computational resources, but these are usually only for the benefit of members of the owner organizations. Cloud Computing, however, allows anyone, anywhere in the world, to access a huge amount of computational resources with their credit card.

There is no doubt over the significant momentum Cloud Computing has been gaining in recent years. There are now a number of conferences on Clouds and related topics, such as IEEE Cloud^1 and Cloud Expo^2, which are evidence of the significant interest in this area from both academia and industry. Almost every major corporation in the IT industry now has some interest in the area of Cloud Computing. These companies are leading the way in technological advances in the area; in recent years we have seen the launch of a vast number of services such as Amazon EC2, Google App Engine and Microsoft Azure, giving users an extensive range of choices. They are investing large amounts of capital and personnel to research ways of leveraging some of the current $16bn market, which analysis firm IDC suggests will grow to $42bn by 2012^3. Some reports have even estimated that revenue could reach $150bn by 2013^4.

1 http://www.thecloudcomputing.org/2010/
2 http://cloudcomputingexpo.com/
3 http://www.idc.com
4 http://www.gartner.com/DisplayDocument?id=914826

Cloud Computing is not simply a technological improvement of datacentres but a fundamental change in how IT is provisioned and used [5]. Now any organization or individual can dynamically scale their resources according to how much they require and be charged only for how much they use. This means that developers with innovative new ideas and products can utilise Cloud services and no longer require large capital expenditure on hardware to deploy their service. It removes the problem of over- or under-provisioning services and passes that cost onto the provider of the service [6].

There are a number of benefits that can be achieved from using Cloud Computing, including system scalability, reliability and security. There are also a number of issues, such as trust, privacy, availability and performance [7]. Reviews such as "Research Challenges for Enterprise Cloud Computing" cover the particular issues and challenges of Cloud Computing in more depth than will be presented here [8][9].

One of the areas in which academia can contribute to Cloud Computing is examining the questions users and system administrators will have about migrating their systems to Cloud platforms. It will be important to provide impartial advice to support decision-making in migration to cloud systems and to examine the feasibility of cloud adoption in enterprise. Motahari-Nezhad et al. have called attention to the lack of such support [10]. The Cloud Adoption Toolkit developed at the St Andrews Cloud Computing Collaboratory is one effort attempting to provide a framework to support decision-making and provide tools to address these concerns [11]. The Toolkit could easily be expanded to include an analysis of the energy consumption of a system and of how efficiency may be affected by a move to a cloud system.

There are different types of Clouds available, each with benefits and drawbacks. A Public Cloud is one where services are provided to the general public. Public Clouds require no capital investment from their users, moving this cost and risk to the providers. The service usually provides fine-grained control over services and billing, allowing users to pay for their computing as a utility. They are, however, vulnerable to issues relating to privacy, security and administrative control.

Private Clouds have been developed as an alternative, designed for use by a single organisation. They therefore provide control over the issues that can plague a Public Cloud, but as tradeoffs do not provide the same freedom from capital expenditure or the same flexibility. They are often criticised by some as not being true Clouds.

Hybrid Cloud is the term used when an organisation uses a combination of the above models, for example using a Private Cloud but bursting out to a Public Cloud service should demand exceed the current infrastructure. Hybrid Clouds provide more flexibility than Private Cloud systems while going some way to easing concerns over privacy, security and general control.


Virtual Private Clouds are an alternative solution where a Public Cloud service is presented to the user as if it were on-premises, thanks to VPN (Virtual Private Network) technology virtualizing the network layer, allowing the organisation to implement its own network configuration and controls on top of the public service.


3 Virtualisation
Virtualisation may refer to many things, but in this context we take it to mean hardware virtualisation, which allows multiple guest operating systems to run on a single node. Hardware virtualisation hides the underlying computing system and presents an abstract computing platform by using a hypervisor. In datacentres, the number of physical machines can be reduced by using virtualisation to consolidate virtual appliances onto shared servers. This can help to improve the efficiency of IT systems, an assumption that will be examined in this section. Virtual Appliances are virtual machine images containing everything needed to run a particular task (operating system, databases, configuration, software, etc.) that can be used as a "black box".

The advantages are simple: virtualisation allows multiple virtual machines to be run on a single physical machine in order to provide more capability and increase the utilization level of the hardware. It can also increase efficiency, allowing more work to be done with less IT equipment. Naturally, virtualized systems perform at a lower throughput due to the intermediate level of complexity; a throughput reduction of 15% has been reported [12]. Virtualized systems may require particular attention from an energy-aware computing perspective, as higher utilization does mean increased power consumption and more waste heat. It has not yet been decided whether virtualisation helps energy efficiency by reducing the number of physical machines, or hinders it by increasing power consumption levels and cooling needs.

To demonstrate these challenges, Brebner et al. [13] modelled a number of architectures for a SOA system: a traditional multiple-physical-server approach, an optimised version of that initial model, a virtualized system, and finally a system running on Amazon's EC2 Cloud Computing platform (which uses virtualisation to provide Infrastructure as a Service (IaaS)). Their work focuses on the utilization of hardware and the consumption of electricity by IT equipment only. The results show that the traditional refined system consumed 51 kWh per day for their simulated workload. The virtualized system used 33.6 kWh per day, a saving of roughly a third compared to the traditional system. Using Amazon Web Services' EC2 platform allows the system to be scaled dynamically with workload, giving an approximate energy usage of only 10 kWh per day for the same workload. This huge saving is down to the fact that in EC2 you only pay for the workload you are doing, so when load is light the physical machines that you will later use are S.E.P. (someone else's problem), meaning that your carbon footprint is light. In a system where you own the physical machines doing the virtualisation, those machines remain your responsibility even when no useful work is being done.


However, some questions remain unanswered. Brebner et al. [13] take into account only the energy consumed by IT equipment, not that consumed by infrastructure such as cooling. They also assume electricity figures for an unloaded and a fully utilised server of 100 W per CPU at idle and double that at full utilization. This seems slightly out of line with other figures in the field [14], so could be clarified in future work.
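To make that assumption concrete, the following is a minimal sketch of the linear idle-to-peak power model implied by those figures; the utilization profile and all constants are illustrative rather than taken from [13].

```python
# Sketch: estimate daily server energy under a linear power model between
# an assumed 100 W idle and 200 W fully-utilized draw per CPU. All figures
# here are illustrative, not measurements.

IDLE_W = 100.0   # assumed idle power per CPU (watts)
PEAK_W = 200.0   # assumed fully-utilized power per CPU (watts)

def power_at(utilization: float) -> float:
    """Linear interpolation between idle and peak power."""
    return IDLE_W + (PEAK_W - IDLE_W) * utilization

def daily_energy_kwh(hourly_utilization: list) -> float:
    """Energy over a day given 24 hourly utilization samples (0.0-1.0)."""
    watt_hours = sum(power_at(u) for u in hourly_utilization)  # 1 h each
    return watt_hours / 1000.0

# Example: a server idle overnight and busy during working hours.
profile = [0.05] * 8 + [0.7] * 10 + [0.2] * 6
print(f"{daily_energy_kwh(profile):.2f} kWh/day")  # 3.26 kWh/day
```

Note that even the almost-idle overnight hours contribute substantially, which is the non-proportionality problem returned to in sections 5 and 6.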

4 Datacentres
Cloud Computing is advancing the age of the computing datacentre, where massive plants of computation provide IT as a utility to users over a network. However, these plants can have a massive impact on the sector's energy consumption and carbon emissions. The United States Environmental Protection Agency (EPA) highlighted some key issues with regard to computing datacentres in the United States in their 2007 report [15]:

- U.S. datacentres consumed 1.5% of the total electricity generated in the whole country that year, the equivalent of the combined consumption of 5.8 million average U.S. households, at a cost of roughly $4.5 billion.
- Total energy consumption in this area doubled over the period 2000-2006, and equivalent growth is expected in the coming years.
- Only 50% of the energy consumed by datacentres can be attributed to useful work done by the computing servers; the other 50% is expended on infrastructure needs such as climate control, lighting and security.

Significant energy consumption is not limited to the United States; of all the electricity generated globally, 0.5% is consumed by datacentres. This has a notable impact on carbon dioxide emissions. Datacentres are responsible for around 80-116 metric megatons of carbon emissions (MtCO2) each year, a figure not unlike that of countries such as the Netherlands and Argentina (146 and 142 MtCO2 respectively) [1][16].

These figures provide motivation to improve the efficiency of datacentres, from silicon chips to load distribution and cooling. A remarkable amount of research in this area is taking place in industry, with major IT corporations involved in an arms race to build the largest-capacity, most cost-effective and environmentally friendly datacentres. The following literature reviews the questions of how to construct smart datacentres, how to measure their performance and how to improve their efficiency. While we cannot improve the current efficiency of large-scale industrial datacentres, by learning the techniques that are used we may be able to improve the efficiency of smaller organisational datacentres and systems such as the StACC Private Cloud.

4.1 Smart Construction


There are two major issues to be considered when building datacentres: one is energy supply, the other is energy efficiency. As any energy supply is finite, the second becomes increasingly important as systems are expanded and expected to do more work.

As physicists are years, possibly decades, away from creating a renewable and clean energy source, improving the efficiency of datacentres is an area in which much progress will be seen in the coming years. Computer Science can have a fundamental impact here, from improving the effectiveness of hardware to constructing power-efficient software that minimises waste and is energy-optimised.

Engineers building these systems take into consideration a large number of factors, including the location of potential warehouse sites. Potential areas are evaluated on their geographical features: climate, infrastructure links, fibre-optic connectivity and, perhaps most importantly, access to a plentiful supply of affordable electricity [17].

Affordable electricity does not always equal green electricity. In fact, usually the cheapest power available is also the dirtiest. For example, in January 2010 Facebook signed a contract with PacifiCorp to power their new highly efficient datacentre with electricity generated from coal, the fuel that releases the most carbon per unit of electricity generated. This move angered organizations such as Greenpeace, who suggested that a clean and renewable form of energy should have been used instead. They reported: "increasing the energy efficiency of servers and reducing the energy footprint of the infrastructure of datacentres is clearly to be commended, but efficiency by itself is not green if you are simply working to maximise output from the cheapest and dirtiest energy source available" [2].

A number of datacentres have recently been constructed by major corporations such as Google, Microsoft and Yahoo in the north-western US state of Washington, near the banks of the Columbia River [18]. This location provides relatively low land costs, proximity to the strong running water of the Columbia River (used for cooling and hydroelectric generation), and low ambient air temperature. A single site with all of these attributes can be key to building the most efficient datacentre.

Smart construction is only one area that can affect the efficiency of datacentres; other areas are particularly open to improvement from Computer Science research. Cooling can be made effective through the use of smart monitoring and control algorithms, virtualisation can be examined to choose the best platform for work, and other Energy-Aware Computing principles can be applied.

4.2 Power Usage Effectiveness


Power Usage Effectiveness (PUE), defined by The Green Grid [19], a global consortium of IT companies and professionals seeking to improve energy efficiency in datacentres, is an approach to measuring the efficiency of a datacentre in terms of its electricity use. It aims to compare how much power is being used for useful computing and how much is needed for infrastructure. A datacentre's PUE is the ratio of the total power consumed by the facility to the power used by the computing equipment:

PUE = Total Facility Power / IT Equipment Power

Total Facility Power is defined as the power used for the datacentre alone (excluding power used for other purposes, for example where the datacentre is in a mixed-use building), and IT Equipment Power is defined as the power used by all ICT equipment within the datacentre.


In an ideal world, 100% efficiency would be achieved, giving a PUE rating of 1.0, meaning all power is used by IT equipment only. Research from the Lawrence Berkeley National Laboratory [20] shows that 22 datacentres measured in 2008 had PUE values in the range 1.3 to 3.0.

Clearly, PUE is a helpful metric. However, it is difficult to measure the electricity consumption of components at such a granularity that only IT items, and not infrastructure items, are measured. Unless the IT equipment is on a separate electricity supply, every single IT device needs to be monitored, a laborious task. Conducting a survey to determine the PUE of datacentres would go some way towards understanding the current situation with relation to energy and would help raise awareness of the problems of inefficiency.
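As a quick illustration of the metric, the calculation itself is a single ratio; the facility figures below are made up:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power over IT power.

    1.0 is the ideal; the Berkeley survey cited above found 1.3-3.0.
    """
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw

# A hypothetical facility drawing 900 kW overall whose IT load is 500 kW:
print(f"PUE = {pue(900, 500):.2f}")  # 1.80: 0.8 W of overhead per IT watt
```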

4.3 Data Centre Productivity


The Green Grid is also working on producing a metric to define the productivity of a datacentre, where the amount of useful work done is divided by the total power consumed by the facility [19]:

Datacentre Productivity = Useful Work / Total Facility Power

This would be a useful metric, particularly as the entire idea of Energy-Aware Computing is to use the least amount of power possible to complete a given task. Datacentre Productivity as a measurement would be a natural extension of this principle and is particularly apt for Cloud Computing.

While this metric would certainly be useful, how does one measure Useful Work? Evaluating the datacentre in this manner would treat it like a black box, where electricity and requests go in and useful work comes out. This is an entirely appropriate level of abstraction, but the complexities of the underlying components and the work they are doing will make measuring this black box very difficult indeed. So instead measurement will begin at a per-component level for each piece of specific work, for example transferring data from point A to point B or running an algorithm on a CPU. By calculating the energy usage of each component, an overall picture for the warehouse may be built up over time.
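One way such per-component accounting might look is sketched below; the component names and energy figures are hypothetical, intended only to show how a bottom-up picture could be assembled.

```python
# Sketch of bottom-up energy accounting for "useful work": each component
# attributes measured energy (joules) to a named piece of work. All names
# and figures here are hypothetical.

from collections import defaultdict

# (task, component) -> joules attributed so far
ledger = defaultdict(float)

def charge(task: str, component: str, joules: float) -> None:
    """Attribute a component's measured energy to a named piece of work."""
    ledger[(task, component)] += joules

charge("transfer A->B", "nic", 120.0)
charge("transfer A->B", "disk", 450.0)
charge("run algorithm", "cpu", 900.0)

# Roll components up to per-task totals.
per_task = defaultdict(float)
for (task, _component), joules in ledger.items():
    per_task[task] += joules
for task, joules in per_task.items():
    print(f"{task}: {joules:.0f} J")
```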

4.4 Datacentre Cooling


Cooling is a great challenge in making IT systems more efficient. Waste heat is generated by nearly all electrical components, particularly when those components are running at a high workload, as they ideally would be in a Cloud Computing environment. Ranganathan [21] suggests that for every dollar spent on electricity in large-scale datacentres, another dollar is spent on cooling.


Studies [21][22] have shown that operating electronic components outside of their optimal temperature range can lead to a significant decline in reliability. For every 10 °C increase in temperature over 20 °C, the Uptime Institute has identified a 50% increase in the chance of server failure, with similar statistics for hard drives. It is therefore vital to keep IT equipment within its optimal temperature range to preserve the lifetime of the system.

In datacentres, cooling hot electrical equipment can take one of two approaches: air-based or water-based systems. Air has the advantage that it can be directly circulated around the equipment, concentrated and directed to where it is most needed. If air is applied inefficiently, however, for example by cooling the entire room rather than directing it at hot components, it may need to be somewhat colder than would otherwise be required in order to provide the same level of cooling.

Water can be used to apply cooling directly to hot equipment by piping it across hot components. The necessary caveat with water is that care must be taken to ensure that electrical equipment is not damaged in the case of failure or leakage. As water has a large specific heat capacity and very high density relative to air, it requires much less volume flux per unit of heat flux transferred (a worked comparison follows below). Hybrid schemes can also be used, where small fans blow chilled air cooled by fins connected to the water coils [23]. If the suggested temperature range of working servers is around 20-25 °C and the warm air vented by those systems is 35 °C, then an airflow rate of 1-2 cubic metres per square metre of floor area is required. For a water-cooled system, only 1-2 cubic metres of cooled water is required for an entire 4,000 square metre datacentre, a significant saving. Some datacentres nevertheless do not use water, because cooling a datacentre with water requires efficient and safe distribution, a difficult task that may involve custom-designed server racks and other infrastructure equipment, increasing the cost and complexity of the system.

Location, and in particular the local climate, can have a large impact on the energy required to cool a datacentre. Cooling systems have a desired temperature for the inflow air. In a warm climate, the ambient air temperature going into the system may be higher than the desired range, resulting in additional energy expenditure to cool the air through refrigeration units before it enters the rest of the datacentre. In a situation where the external air temperature is sufficiently cool, a direct heat-exchange system can be used, where the warm interior air is passed through a heat exchanger and exposed to the cooler external air. Such a system can be an effective means of cooling since the heat exchanger requires only mechanical power to drive the air through the system, not refrigeration [24]. For example, in a location where the outside ambient air temperature is lower than the warm air produced by the hot components, a direct-exchange system can be used all year round, with no need for refrigeration. If the interior air temperature rises too high, recirculated air can be directed through a refrigeration unit to be cooled.
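The air/water comparison above can be made concrete with the standard heat-transport relation Q = rho * V * c_p * dT; the 1 MW heat load and 10 K temperature rise in the sketch are assumed figures chosen for illustration.

```python
# Required volume flow to carry away a heat load, from Q = rho * V * cp * dT.
# The 1 MW heat load and 10 K temperature rise are assumed for illustration;
# the density and specific heat values are standard physical constants.

def flow_m3_per_s(heat_w: float, rho_kg_m3: float,
                  cp_j_kg_k: float, delta_t_k: float) -> float:
    """Volume flux (m^3/s) needed to remove heat_w watts of waste heat."""
    return heat_w / (rho_kg_m3 * cp_j_kg_k * delta_t_k)

HEAT_W, DT = 1_000_000.0, 10.0                     # 1 MW of servers, 10 K rise
air = flow_m3_per_s(HEAT_W, 1.2, 1005.0, DT)       # ~83 m^3/s of air
water = flow_m3_per_s(HEAT_W, 1000.0, 4186.0, DT)  # ~0.024 m^3/s of water
print(f"air: {air:.0f} m^3/s, water: {water:.3f} m^3/s "
      f"({air / water:.0f}x less volume flux for water)")
```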

Such a hybrid system could be used only when required (i.e. when the external air temperature rises to a level at which a direct-exchange system no longer provides sufficient cooling), so that the more expensive heat-pump system is used only when absolutely necessary, or it could be employed to save cost in a predominately refrigeration-based system.

The mechanical operations required to manipulate airflow consume a certain amount of energy, which in an ideal situation increases proportionally with the amount of air displaced, much like energy consumption to useful work done in computing terms. When the temperature of the air increases, the amount of air that must be moved also increases, and the opposite is true as well. It is therefore possible that less power can be used for mechanical purposes by cooling the air in a refrigeration unit, which also consumes energy. A balance must be struck between the energy expended cooling air in a refrigeration unit and that expended in moving air; by optimising the coupled system, the most energy-efficient mechanical cooling may be found [23].

Distributing the air around a datacentre can be done in a number of ways. One of the more sophisticated architectures is to pump the cool air through the floor and vent it through the racks in an upwards motion. The hot air can then either be collected at the top of the room or funnelled back down to the floor. The benefit of the former is that the incoming cool air supply has a reduced risk of becoming heated, while the latter protects components in higher parts of the datacentre from receiving heat from the lower levels [23].

Cooling and computational load balancing can be handled in the same system, as proposed by Parolini et al. [25]. In traditional datacentres, cooling systems and computation systems are controlled independently. The cooling system tries to efficiently control Computer Room Air Conditioning (CRAC) units in order to cool the datacentre to its optimum operating temperature. Computational load is then distributed according to what will give the best throughput performance, regardless of current temperature considerations, information to which it is not even privy. If these two systems were combined, and computational work were distributed smartly according to information reported by the cooling infrastructure, significant savings could be made in the energy needed to cool warm parts of the datacentre. This would of course require a unified framework for reporting information between the two systems and would allow controls to be adjusted according to administrative policy.

Parolini et al. propose an algorithm that allows the system to maintain a small difference between the input and desired output temperature of each CRAC node, thereby minimising the work it has to do. This lowers the power consumption required by the cooling system. The algorithm is still a prototype and requires computational load to be submitted as homogeneous and independent tasks with no priority structure, but it could potentially be adapted, at least in part, into a real-world load allocation algorithm.
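A toy sketch of the coordinated idea, though not Parolini et al.'s actual algorithm, is to prefer placing work on nodes whose inlet temperature leaves the most cooling headroom; the node data and thresholds below are entirely hypothetical.

```python
# Sketch of temperature-aware load placement in the spirit of coordinated
# cooling/computation control [25] (not the authors' algorithm). The node
# readings, limits and names are hypothetical.

nodes = {
    "rack1": {"inlet_c": 21.0, "util": 0.40},
    "rack2": {"inlet_c": 27.0, "util": 0.30},  # sits in a hot spot
    "rack3": {"inlet_c": 22.0, "util": 0.85},
}
TEMP_LIMIT_C, UTIL_LIMIT = 30.0, 0.90

def place(task_util: float) -> str:
    """Choose the coolest node that still has CPU and thermal headroom."""
    candidates = [
        (info["inlet_c"], name)
        for name, info in nodes.items()
        if info["util"] + task_util <= UTIL_LIMIT
        and info["inlet_c"] < TEMP_LIMIT_C
    ]
    if not candidates:
        raise RuntimeError("no node has thermal or CPU headroom")
    _, chosen = min(candidates)
    nodes[chosen]["util"] += task_util
    return chosen

print(place(0.25))  # rack1: the coolest node with spare capacity
```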


Kumar et al. [26] have taken this approach and developed it to include coordination between the physical and virtual platform layers, allowing them to save power and maintain system usability. For example, they are able to smartly allocate Virtual Machines according to system power consumption. This improves on other approaches in the field by not only maintaining smart distribution of work at the software level, but also controlling the operation of hardware components, powering them down when not in use.


5 Energy Aware Computing


5.1 Justification
Reports have estimated that the capital cost of acquiring computing hardware will be exceeded by the total cost of operation, mostly the cost of the electricity needed to operate and cool it, even over a relatively short amortization period of 3-5 years [27][28][29]. This means that the total cost of ownership (TCO) of a computing system is no longer dominated by the initial capital expenditure; operating costs are instead becoming an increasingly significant factor. Forecasting operating cost is an area where enterprise traditionally has difficulty, which will only be compounded by the rising cost of energy. Microsoft has stated that they believe operational costs for servers will exceed hardware purchase costs by 2015 [30].

Cost is not the only problem: producing the energy releases harmful carbon dioxide emissions into the atmosphere. In the U.S. alone, computing equipment consumes more than 20 million gigajoules of energy per year, and producing that energy releases the equivalent of four million tons of carbon dioxide into the atmosphere [31]. This problem seems unlikely to go away as global energy prices rise and users increase their demand for Cloud Computing services.

The overall problem is to minimise the energy used to perform a certain piece of useful work by controlling the resources (CPU, memory, network) and employing smart management of the system. This is in contrast to the traditional computing model, where systems have been designed to achieve maximum performance, usually speed in pure wall-clock terms, for any workload. In energy-efficient systems maximum performance may still be desired in some situations, but the system must now also minimise the total amount of energy used.

5.2 Constructing Energy-Aware Systems


Constructing systems that are aware of their components and tasks, and are able to use that information to implement sophisticated energy-aware management, is a challenging problem. Systems must be able to identify and monitor where power is consumed and control either the supply of or the demand for that power. Each component in a system must expose its power consumption information and control mechanisms to some central authority. By doing this, a system command module will be able to react to consumption rates and increase or decrease the workload causing the demand. In the event that exposing internal component information is not possible, due to proprietary reasons or lack of implementation, there are tools that may be able to provide similar functionality. In the School of Computer Science at the University of St Andrews, Yi Yu is developing a tool that will provide scalable energy monitoring to system administrators. This software will allow energy information to be collected from real resources, at scale and in a heterogeneous environment.


Agents will pass data to a Controller, which may then be able to use the information collected to make informed decisions relating to system management policy [32].

Alongside the consumption figures of components in different power states, a system must also know the cost of transitioning between those states. This knowledge will allow the controller to calculate how a system should be managed when a state migration is desirable.

5.2.1 Powering down

Switching off or powering down components, and even entire systems, when not in use can be considered a key area of Energy-Aware Computing [27]. The effect and extent of these power state transitions require careful consideration. For example, powering down a CPU can be an effective means of saving energy. However, suspending the system cache, memory and controllers in addition to powering down a CPU will save even more energy, but at the penalty of increased cost and time to return the system to a useful state. A balance must be achieved between energy savings and system performance.

There is a simple analogy that explains these tradeoffs. A traditional laptop computer may be sent to sleep by its operating system when not in use. However, the user may return to the machine some time later to discover that the laptop battery has lost some of the energy it had before it went to sleep. This is because a sleeping machine still consumes energy to keep its system state stored in volatile memory, allowing it to respond quickly when required. More energy could be saved by shutting down the machine rather than sending it to sleep. In that case, the system state is copied to disk and the machine is powered off completely, incurring a greater time to return to a useful state; any energy lost from the battery will then be caused by battery drain rather than system consumption.

In a large IT system such as a Cloud Computing datacentre, components may be algorithmically transitioned between power states. To do so requires information gathered from monitoring the system, knowledge of the overheads incurred by transitions between states, and the means to perform transitions from some central authority. If this can be achieved, significant energy savings could be made by powering down parts of the datacentre that are not being utilized. Further investigation of the cost of power transitions in production servers is required to ascertain the effectiveness of power-down schemes for Cloud Computing environments.

5.2.2 Task Consolidation

Srikantaiah et al. [33] discuss an approach to consolidating applications or tasks onto a smaller number of physical machines, allowing surplus machines to be switched off using the powering-down techniques above. The goal of Srikantaiah's approach is to keep servers well utilised so that power costs are effectively amortized.

This is balanced against over-utilization, which can cause internal contention: cache contention, conflicts at the functional units of the CPU, disk scheduling conflicts, and disk write-buffer conflicts. The authors propose that there is some optimal point where an appropriate level of performance is retained while energy costs are amortized effectively among tasks. The challenge is to find this optimal point and track it through the life of the system. This differs from traditional virtualisation in that consolidation is achieved at the task level rather than with complete virtual machines, so it would fit at either the Platform or Software as a Service (PaaS, SaaS) levels of the Cloud Computing stack.

Systems are modelled as a bin-packing problem, where each server is a bin with dimensions corresponding to available resources such as CPU, memory, network bandwidth, etc. Tasks are treated as objects with their resource requirements as dimensions. Finding a way to minimise the number of bins required for a given workload provides a solution to the consolidation problem (a sketch follows below).

There are problems with this approach. It relies upon being able to accurately predict the resources a task will consume in order to allocate it correctly. In some situations, such as when an application adjusts its performance according to the available resources, this may prove impossible. These applications will consume all resources given to them; the problem then becomes deciding how many resources they should be allocated. Software designers will need to take these unique requirements into account when designing applications to be used on such a system.
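A minimal sketch of the bin-packing view uses the classic first-fit-decreasing heuristic on a single CPU dimension for brevity (the model described above also packs memory, network bandwidth, and so on; the task sizes here are illustrative).

```python
# Sketch of consolidation as bin packing: first-fit decreasing on one
# CPU dimension. Srikantaiah et al. [33] pack several resource dimensions;
# this reduces the idea to one for clarity. Task demands are illustrative.

def first_fit_decreasing(tasks, capacity: float = 1.0):
    """Pack task CPU demands into as few servers (bins) as possible."""
    servers = []
    for demand in sorted(tasks, reverse=True):
        for srv in servers:
            if sum(srv) + demand <= capacity:
                srv.append(demand)
                break
        else:
            servers.append([demand])  # power on another server
    return servers

tasks = [0.5, 0.2, 0.4, 0.7, 0.1, 0.3]
placement = first_fit_decreasing(tasks)
print(f"{len(placement)} servers needed: {placement}")
# Servers not in the placement can be powered down (section 5.2.1).
```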

5.2.3 Resource Scaling

An alternative to task consolidation is to modify the amount of resources given to a task in order to adjust the time taken for it to complete its work. If a task can be completed within its deadline with fewer resources than those provided for the quickest performance, then it may be possible to save resources and therefore reduce energy consumption. Maximum resources for best performance will be assigned if a task's deadline is less than or equal to the shortest time in which the task can be completed with all resources fully utilised. Otherwise, resources will be assigned according to the task deadline. In this approach the control algorithm requires information about the deadlines of incoming tasks. This is a difficult problem unless tasks are engineered to be predictable. As in task consolidation, the caveat regarding performance-adjusting applications also applies here.

In Cloud Computing, service providers regularly have to adhere to Service Level Agreements (SLAs) with their clients. These SLAs may take the form of a time commitment for a task to be completed. In this case, tasks have hard deadlines that the resource-scheduling management software can be configured to meet.

An example of a resource-scaling algorithm is speed scaling, which varies the speed of a CPU in order to consume less power and generate less heat. Aside from saving the energy used in a processor, there is also motivation to bound the maximum temperature a processor can reach in a given schedule, in order to aid the


process of cooling [34]. The goal is to balance quality of service against the possible energy benefits [35].

In a speed-scaling algorithm, jobs arrive at a server, or cluster of servers, with an amount of work to be done. A server runs at a given speed, and the time taken to complete a job is the amount of work required divided by this speed. The power consumed by the processor is a function of its speed. It is therefore suggested that by reducing the speed of a processor, the time taken to complete a given job will increase, but power will be saved:

- Amount of work (constant): w
- Server speed: s
- Time taken for the task: t = w / s
- Power consumption: p = f(s)
- If s decreases, then t increases and p decreases.

There are two types of scheduling algorithm to consider: online and offline. Online algorithms are unaware of the nature of jobs arriving in the future, whereas offline algorithms have the opportunity to be optimal, as they know the whole sequence of events in advance and can schedule jobs accordingly. The latter is impractical in real-world situations and is often used theoretically to evaluate the performance of online algorithms [36].

Each speed-scaling model relies upon some quality-of-service measure. Two of the most common are response and slowdown. The response of a job is the time taken from when it is released to when it is completed; the slowdown is the time taken for a job to complete relative to how long it would have taken on a dedicated (non-speed-scaled) processor. For example, a slowdown of 2 means a job takes twice as long as it would on a traditional processor.

At the moment there are a number of techniques that allow individual system components to have their performance adjusted to save energy. For Cloud Computing and other large-scale IT systems, the challenge will be to perform autonomic power adjustments on a system-wide scale.
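A minimal sketch of deadline-driven speed scaling follows, assuming the common cube-law power model p = s^3; the text above only requires p = f(s), so the cube law is an illustrative assumption, as are the work and deadline figures.

```python
# Sketch of deadline-driven speed scaling under an assumed cube-law power
# model p = s**3 (the model above says only p = f(s)). Speeds are in
# normalised units where full speed is s = 1.0.

def schedule(work: float, deadline: float):
    """Pick the slowest speed that still meets the deadline."""
    s = min(1.0, work / deadline)   # from t = w / s, need s >= w / deadline
    t = work / s
    energy = (s ** 3) * t           # E = p * t with p = s^3
    return s, energy

w = 8.0
_, e_full = schedule(w, deadline=8.0)    # tight deadline: must run flat out
s, e_slow = schedule(w, deadline=16.0)   # deadline twice as loose
print(f"speed {s:.2f}, energy ratio {e_slow / e_full:.2f}")  # 0.50, 0.25
```

Under the cube law, halving the speed quarters the energy for a fixed amount of work, at the cost of doubling the completion time, which is exactly the tradeoff an SLA deadline makes explicit.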

5.2.4 Load balancing vs load skewing

Traditional techniques used to improve performance can have a detrimental effect on energy consumption. For example, in a traditional server cluster, load balancers are used to spread the load equally in order to achieve the best possible performance and scalability. However, with work distributed across many servers at low levels of utilization, the consumption of energy is disproportionate to the amount of useful work done.

Load-balancing algorithms allocate work fairly to each active server. For example, imagine a four-node cluster and 8 tasks to be allocated. If each task requires around 25% of a single node's resources, then a traditional load-balancing algorithm would assign 2 tasks to each node in the cluster, giving each a 50% utilization rate.


While this may provide the best possible performance in terms of speed, it may not provide the optimal result for energy efficiency. As current systems do not consume power proportionally to how much work they are doing (with low levels of utilization incurring disproportionate amounts of energy), perhaps a better solution would be to skew the load on the servers. Chen et al. [14] point out that it may be possible to rewrite load-balancing algorithms to be more energy-aware, and introduce the concept of load skewing. If servers were continually allocated work while they had resources remaining, in our example giving 2 nodes 4 tasks each to reach 100% utilization on both nodes, then we would be able to power down the unused servers and therefore save on energy consumption. This would of course incur additional penalties in performance, perhaps by utilizing servers past their optimal state [33], and be liable for the cost of starting up and switching off servers on demand, as discussed in section 5.2.1. However, if these penalties could be understood and protected against, generally improving the energy-awareness of load-allocation algorithms in Cloud Computing could have a significant impact on datacentre energy consumption.
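The balanced-versus-skewed tradeoff in the four-node example can be sketched with a simple non-proportional power model; the idle and peak wattages below are illustrative, chosen in the spirit of the blade-server behaviour reported by Chen et al. [14], not taken from their paper.

```python
# Sketch: total cluster power for balanced vs skewed allocation of the
# eight-task, four-node example. The per-server idle and peak figures are
# illustrative; the key property is the high idle floor.

PEAK_W, IDLE_W = 300.0, 150.0   # assumed per-server figures

def node_power(utilization: float) -> float:
    """High idle floor plus a utilization-proportional dynamic part."""
    return IDLE_W + (PEAK_W - IDLE_W) * utilization

def cluster_power(per_node_util) -> float:
    """Nodes marked None are powered off and draw nothing."""
    return sum(node_power(u) for u in per_node_util if u is not None)

balanced = [0.5, 0.5, 0.5, 0.5]   # 2 tasks (25% each) on every node
skewed = [1.0, 1.0, None, None]   # 4 tasks per node; two nodes powered off

print(f"balanced: {cluster_power(balanced):.0f} W")  # 900 W
print(f"skewed:   {cluster_power(skewed):.0f} W")    # 600 W
```

The same work is done in both cases; the skewed allocation saves a third of the power in this toy model purely by eliminating two idle floors.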

5.3 Measuring Energy Consumption


Some reports have estimated that only 13.4% of organisations monitor power consumption [37]. This is a serious challenge for energy-aware computing to overcome, as the best way to save energy is to make people think about how much energy they are currently using. If Cloud datacentres were monitored, smart system management could be introduced, allowing administrators to control the system and set policies with reference to energy-consumption information. Given this information and control, they would have both the incentive and the means to save money and improve their emissions.


Monitoring usually requires specialist hardware to be installed in the datacentre, for example on legacy systems where no energy-monitoring capability exists, which increases the cost and complexity of the system. Cloud Computing adds to these demands, with much larger centres of computation and elastic scaling of resources. Any monitoring devices or software would have to be inherently scalable while retaining a light footprint on system resources to minimise the observer effect. Even then, there is no guarantee that such a setup would provide data in a presentable and useful fashion to controllers, nor that it would be vendor independent. Current monitoring systems are proprietary and compete with other vendors for sales, giving little incentive to collaborate on a cross-platform information model. IT environments are rarely single-vendor, so there is not currently a consistent or easily accessible set of data for use in system management. However, tools are beginning to emerge, such as the Scalable Energy Monitor developed by Yi Yu at St Andrews [38].

There are a number of domestic energy-monitoring tools available, such as the Envi CC128, and enterprise-level hardware such as smart Power Distribution Units. These systems will either report information regarding energy consumption or even allow individual power sockets to be controlled remotely, depending on their level of functionality. However, as noted above, there is not yet a standardised software application that can collect information from these sources and present it in a useful manner. There are, however, software tools such as PowerTOP, an open-source program developed by Intel, that identify programs consuming power while the computer is idle by examining which processes interrupt the CPU idle state with requests for computation. The lack of fine-grained, useful information from these sources means that it is not yet possible to specify how much energy a given application consumes. This is a research challenge that will be answered if Power-Efficient Software is brought to the forefront of Software Engineering.

Monitoring energy consumption is only the beginning of making a system more efficient. Once energy statistics have been collected and presented in a useful manner, the challenge will be to use that information to minimise waste and increase productivity. Adapting the system workload according to current energy consumption, as VMware's vSphere^5 does, would allow work to be migrated and system resources to be manipulated according to the current energy state. In Cloud Computing this could mean migrating VMs across hosts, increasing the utilization of some and allowing under-utilized nodes to be powered down. Cloud providers may also look to introduce energy terms into their SLAs. This could mean that users are incentivised to use the system in an energy-efficient manner in return for lower costs and a pre-agreed performance degradation [38].
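A minimal sketch of the agent/controller pattern described above follows; every name, threshold and reading here is hypothetical rather than taken from any of the tools mentioned.

```python
# Sketch of the agent/controller monitoring pattern: agents sample a power
# meter and report to a central controller that applies policy. All names,
# figures and the power-down threshold are hypothetical.

import statistics

class Controller:
    """Collects agent power reports and applies a simple power-down policy."""

    def __init__(self, idle_watts_threshold: float):
        self.threshold = idle_watts_threshold
        self.readings = {}

    def report(self, node: str, watts: float) -> None:
        """Called by a monitoring agent with one meter sample."""
        self.readings.setdefault(node, []).append(watts)

    def nodes_to_power_down(self):
        """Policy: flag nodes whose recent mean draw looks idle."""
        return [node for node, watts in self.readings.items()
                if statistics.mean(watts[-10:]) < self.threshold]

ctl = Controller(idle_watts_threshold=160.0)
for sample in (152.0, 149.0, 155.0):
    ctl.report("node-07", sample)
print(ctl.nodes_to_power_down())  # ['node-07']
```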
5 http://www.vmware.com/products/vsphere/upgrade-center/


6 Power Efficient Software


Future improvements in energy efficiency are likely to result from rethinking algorithms and applications at the higher levels of the computing stack, alongside improvements in circuitry and other low-level components [21]. R.N. Mayo et al. [39] discovered that even simple tasks, such as listening to music or making a phone call, can consume significantly different amounts of energy on a variety of heterogeneous mobile devices. As these tasks have the same purpose on each device, the results show that the implementation of a task and the system upon which it is performed can have a dramatic impact on efficiency.

Software has always been constructed and optimised to maximise its efficiency in certain terms, wall-clock time for example. Other optimisations are carried out for scalability or robustness, but optimisations are rarely made for energy consumption [27]. However, techniques exist to reduce the power needed by a piece of software to complete a set of tasks, leading engineers to realise that software can be constructed in an energy-efficient manner [40]. The potential to enhance energy efficiency through software will depend upon the dissemination of these techniques to make them as ubiquitous as other performance-enhancing measures.

6.1 Power Efficient Software Principles


Saxe outlines four key principles for producing power-efficient software [40]:

1) The amount of work done by the software should directly correspond to the amount of resources consumed, so that if more energy is applied and the system runs in a higher state, the software does more useful work by a magnitude appropriate to the energy increase.

2) The software should minimise unnecessary computation by using an event-based architecture rather than a polling system, remaining dormant until action is required.

3) Extra care should be taken to ensure that the software has no problems with memory leaks or freeing unallocated memory. These problems cause increased interference from the host operating system, resulting in additional energy consumption.

4) Where possible, software requests for additional resources should be made infrequently and in batches, allowing overhead costs to be amortized (see the sketch below).
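Principle 4 can be illustrated with a toy cost model in which each request carries a fixed overhead that batching amortizes; the energy figures are assumptions for illustration, not measurements.

```python
# Sketch of principle 4: batch small resource requests so the fixed
# per-request overhead (wakeup, syscall, network round trip) is amortized.
# Both cost constants are assumed figures.

PER_REQUEST_OVERHEAD_J = 0.5   # assumed fixed cost per request
PER_ITEM_COST_J = 0.01         # assumed marginal cost per item

def energy(items: int, batch_size: int) -> float:
    """Total energy to fetch `items` resources in batches."""
    requests = -(-items // batch_size)   # ceiling division
    return requests * PER_REQUEST_OVERHEAD_J + items * PER_ITEM_COST_J

print(energy(1000, 1))    # 510.0 J: one request per item
print(energy(1000, 100))  # 15.0 J: ten batched requests
```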

6.1.1 Consume power according to work done

Brown et al. [27] state that most software systems today do not gracefully adjust their power consumption according to the amount of useful work they are doing.


Ideally, computer systems would consume an amount of power directly corresponding to their level of utilization [28]. Studies have shown that a far more common occurrence in contemporary software is that while system utilization is low, a disproportionate amount of power is consumed. For example, at a 10% level of utilization a typical blade server will consume 50% of its peak power [14].

This unexpected level of consumption can be down to a number of factors, one of the most important being the activation of hardware when no useful work is being done. Programs must be examined in detail to assess their impact on energy efficiency, and where possible the power-efficient software principles should be applied. They can be very important when applied to large-scale systems that already place heavy demands on resources and energy. If these systems can implement a change that saves them even a single percentage point in electricity consumption, that could yield a significant monetary saving, not to mention potentially extending the lifetime of the system components.

Cloud Computing could benefit from following these approaches. For example, running PowerTOP on the front-end node of the StACC Private Cloud, it is possible to see that the application consuming the most power is the open-source Cloud Computing platform Eucalyptus. PowerTOP is a utility created by Intel in 2007 that monitors a system and reports to the user which processes are responsible for wakeups that prevent a CPU from entering a sleep state [42]. Running on the StACC front-end, it reports that between 30% and 45% of wakeups (or 80-90 per 5-second interval) are caused by Eucalyptus. The noted power consumption is most likely down to the polling architecture Eucalyptus employs to gather and report information about the system, where each node in the cluster is polled for updates every few seconds using the describeResource request [43]. An event-based architecture, where nodes are contacted only when they are needed to do some work, would be more efficient in terms of power consumption, but may suffer from poor speed performance or inaccurate information reporting. Engineers must examine tradeoffs of this type and, where possible, modify implementations to suit the system requirements.
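The structural difference between the two designs can be sketched as follows; this is a schematic contrast between polling and event-driven loops, not Eucalyptus code.

```python
# Schematic contrast between a polling loop and an event-driven loop.
# The 5-second interval mirrors the polling behaviour described above;
# get_status and the queued jobs are placeholders.

import queue
import time

def poll_loop(get_status, interval_s: float = 5.0) -> None:
    """Polling design: wake the CPU every interval, work or no work."""
    while True:
        get_status()            # e.g. a describeResource-style status call
        time.sleep(interval_s)  # 720 wakeups per hour even on an idle node

def event_loop(work_queue: queue.Queue) -> None:
    """Event-driven design: block until work arrives; idle costs nothing."""
    while True:
        job = work_queue.get()  # sleeps in the kernel until a job is posted
        job()
```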

6.2 Software Modularity


System modularity can cause inefficiencies in energy consumption. A potential key problem is modular development: different teams often develop different parts of a system without consideration of their interaction with one another. For example, on a traditional desktop computer, MP3 player software may be written by an entirely different company from the one that developed the underlying operating system. On a mobile phone, by contrast, there is a chance that the music player has been built into the operating system by the same team, which has the knowledge of how best to make the code as efficient as possible.


There may be times when a modular approach to system development is not the best possible idea.

Workload on any system is variable and therefore requires performance and resources that may not be fully utilised at all times. Studies have reported that the average utilization of traditional servers can be as low as 10-30%. The over-provisioning of resources is built into systems to deal with sudden surges in workload, and the systems are optimised to perform well in these peak-performance situations. However, the energy consumption of current systems does not scale down proportionally when utilization falls, causing power inefficiencies.


7 Future Research
The above work provides a platform on top of which the Green Cloud could be built. The practices of Energy-Aware Computing will improve the efficiency of Cloud systems and their datacentres, and Clouds themselves will produce naturally efficient and focused centres of computation, advancing the pursuit of green computing.

Future research could answer some of the questions that have not yet been covered by previous work in this field. It would be useful to monitor Virtual Machine performance, both for energy consumption and for performance degradation compared to traditional systems. This would help to prove the case for virtualisation's energy-efficiency credentials. Tests could be conducted for different hypervisors, allocation software and other areas related to virtualisation. Standard Linux performance benchmarking tools could be used in this evaluation to give a constant amount of work to both environments, allowing the effects of power and the VM to be carefully monitored. Appropriate figures from this work could be used to advise administrators on how to consider virtualisation and IaaS platforms in terms of energy. The StACC Private Cloud, running the Eucalyptus IaaS software, will provide a good test bed for these experiments, allowing monitoring of real-time use of a Cloud service alongside controlled experiments.

This energy information may also allow the construction of a framework for calculating the energy usage of an application. It is hoped that once this information is gathered, it may be possible to introduce energy-smart control algorithms for Cloud management, for example by modifying the Eucalyptus open-source software. If such algorithms were developed and implemented successfully, Cloud providers could begin to introduce energy terms into their SLAs.

It would also be interesting to assess the potential of software optimisation in terms of energy consumption. Currently software engineers optimise for traditional performance, but guidelines for energy could be drafted to aid the long-term efficiency of computing systems.


Bibliography
[1] E. Farnworth and J.C. Castilla-Rubio, "SMART 2020: Enabling the Low Carbon Economy in the Information Age," The Climate Group, 2008.
[2] Greenpeace, "Make IT Green: Cloud Computing."
[3] P. Mell and T. Grance, The NIST Definition of Cloud Computing, 2010.
[4] I. Foster, Y. Zhao, I. Raicu, and S. Lu, "Cloud Computing and Grid Computing 360-Degree Compared," Grid Computing Environments Workshop (GCE '08), 2008, pp. 1-10.
[5] A.C. Board, P.W. Vogels, G. Olsen, E.C. Cloud, E.B. Storage, S.S. Service, L. Tucker, S. Microsystems, C. Machine, R. Networks, and G. Badros, "CTO Roundtable: Cloud Computing," 2009.
[6] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, et al., "Above the Clouds: A Berkeley View of Cloud Computing," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, 2009.
[7] H. Erdogmus, "Cloud Computing: Does Nirvana Hide behind the Nebula?," 2008, pp. 4-6.
[8] A. Khajeh-Hosseini and I. Sommerville, "Research Challenges for Enterprise Cloud Computing," StACC White Paper.
[9] I. Sriram and A. Khajeh-Hosseini, "Research Agenda in Cloud Technologies," 2008.
[10] H.R. Motahari-Nezhad, B. Stephenson, and S. Singhal, "Outsourcing Business to Cloud Computing Services: Opportunities and Challenges," 2009.
[11] D. Greenwood, A. Khajeh-Hosseini, J. Smith, and I. Sommerville, "The Cloud Adoption Toolkit: Addressing the Challenges of Cloud Adoption in Enterprise," 2010.
[12] "Load Testing SugarCRM in a Virtual Machine - Determining the CPU Cost of Virtualization with VMware ESX," Web Performance.
[13] P. Brebner, L. O'Brien, and J. Gray, "Performance Modelling Power Consumption and Carbon Emissions for Server Virtualization of Service Oriented Architectures (SOAs)," 2009.
[14] G. Chen, W. He, J. Liu, S. Nath, L. Rigas, L. Xiao, and F. Zhao, "Energy-Aware Server Provisioning and Load Dispatching for Connection-Intensive Internet Services."
[15] U.S. Environmental Protection Agency, "Report to Congress on Server and Data Center Energy Efficiency, Public Law 109-431," 2007.
[16] W. Forrest and J.M. Kaplan, "Data Centers: How to Cut Carbon Emissions and Costs."
[17] "IEEE Spectrum: Tech Titans Building Boom."
[18] J. Markoff and S. Hansell, "Hiding in Plain Sight, Google Seeks More Power," 2006.
[19] A. Rawson, J. Pfleuger, and T. Cader, "Green Grid Data Center Power Efficiency Metrics," 2008.
[20] S. Greenberg, E. Mills, B. Tschudi, P. Rumsey, and B. Myatt, "Best Practices for Data Centers: Lessons Learned from Benchmarking 22 Data Centers," Proceedings of the ACEEE Summer Study on Energy Efficiency in Buildings, Asilomar, CA, vol. 3, 2006, pp. 76-87.
[21] P. Ranganathan, "Recipe for Efficiency: Principles of Power-Aware Computing."
[22] R. Sullivan, "Alternating Cold and Hot Aisles Provides More Reliable Cooling for Server Farms," Uptime Institute, 2000.
[23] A. Woods, "Cooling the Data Center," Communications of the ACM, vol. 53, 2010, p. 36.
[24] A. Novoselac, "A Critical Review on the Performance and Design of Combined Cooled Ceiling and Displacement Ventilation Systems," Energy and Buildings, vol. 34, 2002, pp. 497-509.
[25] L. Parolini, B. Sinopoli, and B.H. Krogh, "Reducing Data Center Energy Consumption via Coordinated Cooling and Load Management."
[26] S. Kumar, "vManage: Loosely Coupled Platform and Virtualization Management in Data Centers," 2009.
[27] D. Brown and C. Reams, "Toward Energy-Efficient Computing," Communications of the ACM, vol. 53, 2010, pp. 50-58.
[28] L.A. Barroso and U. Hölzle, "The Case for Energy-Proportional Computing," Computer, vol. 40, 2007, pp. 33-37.
[29] S. Shankland, "Power Could Cost More than Servers, Google Warns," CNET News, 2005.
[30] "Microsoft Environment - The Green Grid Consortium."
[31] "Enterprise Power & Cooling Tutorial," 2007.
[32] Y. Yu and S. Bhatti, "Scaleable Energy Monitoring," 2010.
[33] S. Srikantaiah, "Energy Aware Consolidation for Cloud Computing."
[34] N. Bansal, T. Kimbrel, and K. Pruhs, "Speed Scaling to Manage Energy and Temperature," Journal of the ACM, vol. 54, 2007.
[35] L. Atkins, "Algorithmic Aspects of Power Management," 2009.
[36] A. Borodin and R. El-Yaniv, Online Computation and Competitive Analysis, Cambridge University Press, 1998.
[37] EDS and NCC, The Green IT Paradox: Results of the NCC Rapid Survey, EDS; NCC, 2009.
[38] Y. Yu and S. Bhatti, "Energy Measurement for the Cloud," CloudCAT 2010, 2010.
[39] R. Mayo and P. Ranganathan, "Energy Consumption in Mobile Devices: Why Future Systems Need Requirements-Aware Energy Scale-Down," Power-Aware Computer Systems, 2005.
[40] E. Saxe, "Power-Efficient Software," ACM Queue, vol. 8, 2010, p. 10.
[41] G. Chen, W. He, J. Liu, S. Nath, L. Rigas, L. Xiao, and F. Zhao, "Energy-Aware Server Provisioning and Load Dispatching for Connection-Intensive Internet Services."
[42] Intel Corporation, "Less Watts White Paper," 2010, pp. 1-8.
[43] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus Open-Source Cloud-Computing System," 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), 2009, pp. 124-131.
