
Data Center Energy Efficiency

Data Center Energy Efficiency: Power vs. Performance



Anish Dhesikan
Massachusetts Academy of Math and Science

Abstract

In the computational equipment of large data centers, there are ongoing problems with
the ratio of performance delivered to power consumed, resulting in unnecessary increases in
costs, carbon emissions, and global effects. A new method of balancing the load among the
servers was designed to increase the energy efficiency of data centers. Computers were simulated
within a network as a server cluster in which all web requests were directed to one IP address
and distributed to the servers by a load balancer. Various methods of load balancing, including
methods in which some servers were shut down while the majority of the load was processed by
fewer servers, were implemented and tested for energy efficiency and availability. The new
designs were more energy efficient than the typical methods, but several demonstrated increased
energy efficiency at the cost of decreased availability and decreased ability to handle load spikes.
Nevertheless, these designs can be scaled and implemented in large server farms to improve the
ratio of performance delivered to power consumed in the data center.

Introduction

Data centers today account for a substantial share of global energy usage, consuming
as much energy as is produced by 26 nuclear power plants. Experts estimate that data centers
consume a total of 26 GW of power worldwide, which is equivalent to approximately 1.4% of all
energy consumption in the world (Riccardi, Careglio, Santos-Boada, Solé-Pareta, Fiore, &
Palmieri, 2011). Almost every modern institution houses data centers, or rooms of servers and
other technologies that support information technology operations. Approximately $3.3
billion is spent annually on energy to power data centers in the United States alone. This
consumption is growing by approximately 12% each year;
therefore, more energy-efficient practices must be implemented promptly (Dixit, Ekbote,
Judge, & Pouchet, 2008).
As technology advances, data centers are becoming more complex, causing the amount of
computational power and power density to increase drastically. Computational jobs that were
previously impossible are now being executed frequently, but the power needed for the servers to
perform and maintain performance has increased considerably. With this growing power density
comes a growing need for efficient cooling methods in data centers. This increasingly pertinent
issue has resulted in an obvious need for immediate improvements in the energy efficiency of
data centers (Patterson, 2012).

Literature Review

Background

A growing problem with global warming has led to renewed interest in studies of
improving the energy efficiency of common technologies. Data centers consume a significant
amount of energy throughout the world. Knowledge of data center energy metrics has become
essential in reducing energy costs, carbon emissions, and global effects.
The cooling of the equipment contributes a substantial portion of this consumption.
Conventionally, computer room air conditioning (CRAC) units chill hot air and maintain the data
center at a constant temperature. It is generally recommended that the data center be maintained
at temperatures between 18 °C and 27 °C. However, some data centers operate outside this range,
depending on the characteristics of the individual data center. Information technology (IT) equipment,
including servers, accounts for approximately 50% of the energy consumed by an average data
center. Cooling, on average, accounts for between 25% and 30% of the energy use (Sun & Lee,
2006). Cooling thus consumes the equivalent of roughly half of the energy consumed by the IT equipment;
this comparison demonstrates the magnitude of the energy spent on simply cooling the
equipment that makes up the data center (Iyengar et al., 2012). Hence, improvements in the
energy efficiency of cooling systems are needed.

Design and Function

Data centers are designed to allow multiple people to access applications hosted on
servers; therefore, the larger the organization, the larger the data center required to support its
processes. Data centers house the electronic equipment that processes information, stores data,
and networks communication within an institute (Alger, 2010).

Figure 1. Cooling system layout and energy consumption distribution in typical data centers. Cooling
the facility consumes the equivalent of approximately 50% of the energy used by the information
technology equipment (Iyengar et al., 2012).
The energy efficiency of a data center can be measured and compared with power usage
effectiveness (PUE), which is the ratio of total power consumed by the data center to the power
consumed by the IT equipment (Cho, Lim, & Kim, 2012). This ratio should optimally be as close
to 1 as possible, indicating that almost all of the energy consumed by the data center is going into
powering the equipment. The current average PUE value is approximately 2, indicating that only
50% of the energy consumed by the entire data center is being used by the IT equipment
(Riccardi et al., 2011).
The essential goal of efforts to improve data center energy efficiency is to
decrease the PUE. The lower the PUE, the more energy efficient the data center is. By changing
various components of the data center, a much lower PUE can be achieved (Riccardi et al.,
2011).
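The PUE definition above can be illustrated with a short calculation. The following is a minimal sketch; the function names and the 1000 kW and 500 kW figures are hypothetical and chosen only to reproduce the average PUE of approximately 2 mentioned in the text.

```python
def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw


def it_share(pue_value: float) -> float:
    """Fraction of total facility power that reaches the IT equipment."""
    return 1.0 / pue_value


# Hypothetical facility: 1000 kW total draw, of which 500 kW powers IT equipment.
example_pue = pue(1000.0, 500.0)
print(example_pue)            # 2.0
print(it_share(example_pue))  # 0.5
```

A PUE of 2 therefore corresponds to half of the facility power reaching the IT equipment, and a PUE closer to 1 corresponds to a larger share.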

Infrastructure and Layout

A typical data center includes several server racks and one or many CRAC units, oriented
in such a way that there are hot and cold aisles in the room, as shown in Figure 2. This system
separates the hot air of the servers from the cooled air of the CRAC unit. However, the layout
continues to have some undesirable results because the hot exhaust air generally cannot be
completely separated from the cold air (Fakhim, Behnia, Armfield, & Srinarayana, 2011).


Figure 2. Hot and cold aisle system of typical data centers. The configuration of the server racks results in a separation of hot air
and cold air in different areas, commonly referred to as hot aisles and cold aisles (Fakhim, Behnia, Armfield, & Srinarayana,
2011).
The CRAC unit recycles the hot air from the servers by cooling it and pumping it back
into the under-floor plenum of the data center. The cool air rises up through the perforations in
the tiles of the floor, cooling the servers along the way (Fakhim et al., 2011).





Typical Load Distribution Methods

Web servers generally deal with requests from clients through a process that involves
load balancing. The process begins with several clients sending requests. These requests are
carried through the Internet to a hosting network, which brings the request to a load balancer.
This load balancer manages the requests by using a setting that determines to which web server
the request will be sent. The request is then acknowledged and the desired product is sent back to
the client (Doyle, Chase, Gadde, & Vahdat, 2002).

Figure 3. Web-content delivery process. Requests are sent by multiple clients and are eventually managed by a load balancer or
request distributor. The load balancer sends the requests only to certain servers (Doyle et al., 2002).
One typical method of load distribution is random selection, in which each request is sent
to a random server in the network. More complex versions of random selection factor in other
elements, including the load previously sent to each server and its current CPU utilization. Another
common method known as the round robin method sends requests to servers in sequential
order such that the requests are equally dispersed among the servers (P. Kohli, personal
communication, January 14, 2013).
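The two conventional methods described above can be sketched in a few lines. This is an illustrative sketch only, not the configuration of any particular load balancer; the server names and the request count of 1000 are hypothetical.

```python
import itertools
import random
from collections import Counter

servers = ["server-1", "server-2", "server-3", "server-4"]  # hypothetical names


def random_selection(pool, rng):
    """Send the request to a server chosen uniformly at random."""
    return rng.choice(pool)


def round_robin(pool):
    """Return an iterator that yields servers in a fixed repeating order."""
    return itertools.cycle(pool)


rng = random.Random(0)  # seeded so the example is repeatable
random_counts = Counter(random_selection(servers, rng) for _ in range(1000))

rr = round_robin(servers)
rr_counts = Counter(next(rr) for _ in range(1000))

print(random_counts)  # close to, but generally not exactly, 250 requests each
print(rr_counts)      # exactly 250 requests each
```

Round robin spreads the requests exactly evenly, while random selection is only even on average. Neither method considers whether a server could be powered off.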

Recent Developments

Ongoing concern with the energy efficiency of conventional load distribution methods
has yielded a need for newer, more effective methods. In a study conducted by Microsoft, data on
server CPU utilization, memory usage, and power consumption were collected over a 45-day
period. With these data, algorithms for server provisioning were developed and implemented in
the data center. The study reported that the algorithm saved approximately 20% to 30% of the
energy consumption, and only slightly affected user experience. Microsoft later conducted
another study in which it was found that spinning down storage disks could lead to energy
savings of 28% to 36% (Alger, 2010).
Although the servers consume most of the energy used in data centers, up to 42% of the
energy can be required by the cooling systems (Dixit et al., 2008). The conventional method of
cooling involves an air conditioning unit that maintains the data center at a constant temperature.
The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) advises data centers
to have an inlet temperature setting between 20 °C and 25 °C. However, some dense data centers
are set to operate at an inlet temperature of 15 °C or 16 °C, while other data centers operate at
temperatures up to 32 °C (Patterson, 2012).
As previously stated, ASHRAE has conducted various experiments on data center
cooling and has provided recommendations for temperatures to maintain in the room; however,
ASHRAE has recently attempted to increase the range of possible temperatures. Several
experiments have supported the hypothesis that maintaining a lower temperature in the data
center requires significantly more energy. Hence, further research is being completed to test
whether data centers can be safely maintained at even higher temperatures.
One team of researchers set out to combine ASHRAE standards with a system that
monitors the path of the cooling. The scientists tested several conditions and designs, looking
specifically into optimizing the airflow dynamics within the room. After much experimentation
and research in several different scenarios, the team concluded that their most efficient design
allows for up to a 14.2 °C increase in temperature. This is a vast improvement that could lead to
significant savings in energy consumption of cooling systems in data centers (Green et al., 2012).
With a growing need for more energy and cost efficient cooling methods, a way to
determine the perfect temperature for the data center is necessary. Many studies have attempted
to consider various factors of the servers in determining this perfect temperature. These factors
include the server PSU (Power Supply Unit), computer memory, CPU (Central Processing Unit)
usage, voltage regulators, internal fans, and spinning media. Other factors include the effect of
temperature on the servers and the power usage of the computer room air-handling (CRAH)
units, which are all elements in determining the ideal temperature for the data center.
Nevertheless, a change in the temperature of the data center may reduce the life and
reliability of servers (Patterson, 2012).
A guide from the California Energy Commission (CEC) states that for every one-degree
increase in inlet temperature, 2% of air conditioning expenses are saved (Poniatowski, 2010).
However, this rule of thumb can be misleading because many other factors are not considered.
One study indicates that the optimal method of choosing the inlet temperature in the data center is
to increase the room temperature by a fixed amount, such as 2 °C, and then measure the total
energy consumption to test whether there are significant savings.
However, there are many other methods of cooling the data center, including natural
means. Intel conducted an experiment in which the researchers used a simpler method with the
servers in the data center. The researchers tested efficiencies of economizers, which are devices
that take outside air or water to help reduce energy spent on cooling the air.
In the experiment, there was a control data center in which a conventional method of
cooling was used. In such typical data centers, cooling units take air from the data center itself,
cool this air, and recirculate it into the room. The variable room used an air economizer, which
brought outside air into the data center and pumped out hot air from the room. A standard air
filter kept dust and particles from entering the server room, but the researchers did not control
humidity during the experiment. Intel concluded that the economizer reduces energy
consumption by 74% for a data center in New Mexico, and estimated that this averages to
approximately a 67% decrease in cooling-related energy consumption per year for this data
center (Alger, 2010).
However, some researchers have completely redesigned the cooling of the servers for
optimal energy efficiency. One team designed a system that uses water-based cooling rather than
air conditioning. It is not a combination of two methods of cooling, but rather a 100%
water-cooled process. The team concluded that its design has the potential to save up to 25% of the
energy consumption of the data center, which is approximately a 90% increase in efficiency of
the cooling systems. This is an incredibly significant improvement that could lead to substantial
changes in the future of data center energy metrics (Iyengar et al., 2012).
On a different note, Dell performed an experiment to find out more about the different
temperatures within a server cabinet. The researchers found that there was a significant
difference in inlet temperatures of servers when they placed them at different heights. The higher
the server was placed, the higher its inlet temperature was. The temperatures varied by
approximately 6 °C, which is significant enough to cause execution or computational errors in the
servers. Dell concluded that servers in the top rows of the server rack are far more likely to have
malfunctions or server failures than those in the bottom rows (Alger, 2010). This issue could be
largely due to heat recirculation, in which air from the hot aisles of the data center escapes into
the cold aisles from the top (Mukherjee, Banerjee, Varsamopoulos, Gupta, & Rungta, 2009).


Figure 4. Thermal image demonstrating heat recirculation. It is found that heat often escapes from the hot aisle into the cold aisle
from the top, causing servers located in the top rows of the rack to receive less cooling than those in the bottom rows (Mukherjee
et al., 2009).
Another team of researchers designed a product known as Energy Farm, which optimizes
server energy consumption by concentrating the functions of the data center to fewer servers
when possible, as predicted by mathematical models of data center performance needs. Energy
Farm makes the best use of the servers and shuts off the servers that are not needed at any given
time. With such a setup, not all computers must be running and consuming energy; the processes
of many computers are concentrated to fewer computers. After much testing, researchers
concluded that Energy Farm has the potential to increase resource allocation efficiency by up to
68 percent, which can lead to significant savings in costs, energy, and carbon emissions
(Riccardi et al., 2011).

Engineering Proposal

Engineering Problem:
The ratio of power spent on servers to performance delivered by servers is not optimal in
large data centers, resulting in unnecessary increases in costs, carbon emissions, and global
effects.

Engineering Goals:
A new method of balancing the load among the servers can be designed to increase the
energy efficiency of data centers.



Methods and Procedures:
One server will be housed in a temperature-controlled room. A laptop will be connected
to the server via Ethernet cable and the laptop will have load simulation software. The laptop
will send simulated requests to the server at an increasing rate to understand the relationship
between number of requests and CPU utilization, as well as the relationship between number of
requests and power consumption of the server.
With the understanding of these relationships, an entire data center will be simulated in
which it is assumed that all servers in the data center are the same. The number of requests will
be naturally fluctuating to simulate the natural flow of users to the web page or application.
Based on the number of requests at any given time, the CPU utilization and power consumption
of each server will be predicted. Additionally, an algorithm will control how many servers are on
at any time to handle the given load. Various designs of load balancing will be tested and
evaluated for energy efficiency, availability, and other factors.
The analysis of the data is straightforward. With quantitative data points, it will be simple
to determine which method of load distribution is most energy efficient by seeing which one
consumes the least amount of total energy. However, there are many other factors that play a role
in deciding which method is best for a business: intensity of shifts in load, speed of computer
startups, and several others. Such elements may be accounted for in the process itself through the
variations in load distribution, but it may be difficult to obtain concrete quantitative data to
determine these optimal settings.
The new designs will be tested for energy efficiency, automation, self-maintenance, user
control, affordability, adaptability, server stability, and durability. Comparing the total energy
consumption of the new system to its conventional counterpart will test the energy efficiency.
This is the most important of the engineering criteria because it accomplishes the primary
engineering goal. Secondary goals include automating the process, having user input for error
detection, and ensuring that the servers do not fail. These criteria will be tested by considering
them either implemented or not implemented with the final product.

Methodology

A desktop computer (CyberPowerPC; Windows 8 64-bit (6.2, Build 9200); Intel Core i7-3770K
CPU @ 3.50 GHz; 16.0 GB RAM; AMD Radeon HD 7700 Series graphics card with AMD Radeon
Graphics Processor (0x683F)) was housed in a temperature-controlled room. The ambient room temperature
was maintained at 20 °C. The computer was plugged into a hardware power logger (Watts Up?
Pro watt meter), and the power logger was plugged into a standard wall outlet.


Figure 5. Watts Up? Pro Watt Meter. The desktop computer was plugged into the power-logging device, and the device was
plugged into a wall outlet.


XAMPP was downloaded and installed on the desktop computer, and the XAMPP
control panel was opened. The services for Apache and MySQL were started.

Figure 6. XAMPP control panel. Both the Apache and MySQL services were started and are running on the desktop computer.

A web browser was opened and localhost was typed into the URL bar to bring up the
interface of XAMPP for Windows. Under the Tools heading in the left sidebar, phpMyAdmin
was clicked to lead to the database interface.


Figure 7. XAMPP interface on desktop computer. The link to phpMyAdmin is found on the left sidebar of the XAMPP for
Windows interface.
In the phpMyAdmin interface, a new database entitled anish_db was created and a
table entitled company_tbl was made within the new database. See Figure 8 for all
specifications of the database.


Figure 8. Interface of phpMyAdmin. Within phpMyAdmin, a new database anish_db was created with the specifications of the
server listed on the right.
In the XAMPP\htdocs folder found in the drive in which XAMPP was installed, code was
written to read CSV files and insert the data from these files into the database (see appendix for
code). In the same directory, CSV files of a specific format were added (see appendix for format).
The function file was accessed through a web browser from the desktop computer to verify that
the information was written to the database. The database table was then emptied.
A laptop (Lenovo Y560; Windows 7 64-bit; Intel Core i5 CPU
M480 @ 2.67 GHz; 8.0 GB RAM; Intel HD Graphics) was connected to the
desktop computer using a standard Ethernet cable. The IP address of the desktop computer in the
local area network was found using the ipconfig command in the Command Prompt. On the laptop, the
network settings were changed such that the IPv4 address of the laptop was automatically chosen
based on that of the desktop computer. Notepad was run as administrator, and the file in the
directory C:\Windows\System32\drivers\etc\hosts was opened. A line with the IP address of the
desktop computer, followed by a space and anish.com, was added to the end of the hosts file
so that anish.com, when typed in a web browser, resolved to the web server.


Figure 9. Desktop computer connected to laptop. The desktop computer was connected to the laptop via Ethernet cable, and web
requests were sent from the laptop to the computer.
On the desktop computer, Windows Performance Monitor software was used to log the
CPU utilization, memory availability, and disk reads/writes at 1-second intervals. Software
known as Real Temp was also used on the desktop to log the CPU temperature at 1-second
intervals. On the laptop, the Watts Up USB software was used to set the logging interval on the
power logger to 1 second. Software known as Web Performance Load Tester was
downloaded on the laptop and used to simulate users. In this software, the record button was
pressed, causing a web browser to open. The default web page was shown, and anish.com was
entered into the address bar to navigate to the web server. Once the page was loaded, the web browser
was closed and the recording was automatically stopped. A new linear load configuration was
created with the recording such that the number of simulated users gradually increased to 1000
users over 5 minutes.
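The linear load configuration can be expressed as a simple function of elapsed time. The sketch below assumes that the ramp starts from zero users and rises at a constant rate; the function name and parameters are hypothetical and are not part of the Web Performance Load Tester software.

```python
def simulated_users(elapsed_s: float, peak_users: int = 1000, ramp_s: float = 300.0) -> int:
    """Number of simulated users at a given time in a linear ramp.

    Assumes the ramp starts at zero users and reaches peak_users after ramp_s seconds.
    """
    fraction = min(max(elapsed_s / ramp_s, 0.0), 1.0)
    return round(peak_users * fraction)


for t in (0, 60, 150, 300):
    print(t, simulated_users(t))
```

Under these assumptions, the load grows by 200 users per minute until the peak of 1000 users is reached at 5 minutes.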
The user-defined data collector set on the Windows Performance Monitor of the desktop
computer was started, and then the load configuration on the laptop was initialized. When the
load configuration was complete, the performance monitor was stopped and the data from the
power logger was collected. The final data comprised number of users, CPU utilization, power
consumption, and CPU temperature of the server.


Results


Figure 10. Graph of number of users vs. CPU utilization. Approximately 1200 data points were collected comparing number of
users and CPU utilization.

Figure 11. Desktop computer logging performance. CPU utilization and other performance measurements were logged using
software known as Windows Performance Monitor.
[Plot for Figure 10: Number of Users vs. % CPU Utilization (A. Dhesikan). x-axis: Number of Users, 0 to 400; y-axis: CPU Utilization (%), 0 to 40; linear fit y = 0.0701x, R² = 0.9012.]


Figure 12. Graph of number of users vs. power consumption in watts. Approximately 1200 data points were collected comparing
number of users to power consumption.


Figure 13. Graph of %CPU utilization vs. CPU temperature in °C. Approximately 4000 data points were collected comparing
CPU utilization to CPU temperature.

[Plot for Figure 12: Number of Users vs. Power Consumption (A. Dhesikan). x-axis: Number of Users, 0 to 400; y-axis: Power Consumption (Watts), 50 to 90; linear fit y = 0.0763x + 54.263, R² = 0.929.]

[Plot for Figure 13: CPU Utilization vs. CPU Temperature (A. Dhesikan). x-axis: CPU Utilization (%), 0 to 45; y-axis: CPU Temperature (°C), 15 to 85; linear fit y = 0.5571x + 25.093, R² = 0.7602.]

Table 1. Engineering Matrix.

Criteria               Max Points   Design A   Design B   Design C   Design D
Energy Efficiency      10           0.0        10.0       8.8        6.3
High Availability      10           10.0       5.2        8.2        8.3
Automation             8            8          8          8          8
Server Reliability     5            5.0        1.1        3.3        2.3
Low Script Run Time    4            4          4          4          4
Total                  42           27.0       28.3       32.3       28.9
Percent                100%         64.29%     67.38%     76.90%     68.81%

Design A: Load is distributed equally among four computers

Design B: Entire load is sent to one computer and the others are powered off; if the CPU
utilization of the computer exceeds 70%, another computer is powered on and load is distributed
evenly among computers that are powered on. Whenever every powered-on computer
reaches 70% CPU utilization, another computer is powered on, and so on.

Design C: Load is distributed equally between two computers and other computers are powered
off; if the CPU utilization of each computer that is powered on exceeds 70%, another computer
is powered on.

Design D: Entire load is sent to one computer and the others are powered off; if the CPU
utilization of the computer exceeds 50%, another computer is powered on and load is distributed
evenly among computers that are powered on. Whenever every powered-on computer
reaches 50% CPU utilization, another computer is powered on, and so on.
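The four designs can be compared with a small simulation. The following is a simplified sketch and not the script used in this study. It assumes that servers power on and off instantly, that a server is powered off as soon as the load fits on fewer servers (the scale-down rule is an assumption), that powered-off servers draw 0 W, and that the linear fits from Figures 10 and 12 can be extrapolated beyond the measured range. The load profile and all names are hypothetical, so the output is not expected to reproduce the reported scores.

```python
import math

CPU_PER_USER = 0.0701      # % CPU per user, slope of the fitted line in Figure 10
IDLE_WATTS = 54.263        # intercept of the fitted line in Figure 12
WATTS_PER_USER = 0.0763    # slope of the fitted line in Figure 12
TOTAL_SERVERS = 4

# (minimum number of servers kept on, CPU threshold in %) for each design
DESIGNS = {
    "A": (4, 100.0),  # all four servers always on
    "B": (1, 70.0),
    "C": (2, 70.0),
    "D": (1, 50.0),
}


def servers_on(users: int, min_on: int, threshold: float) -> int:
    """Fewest servers that keep per-server CPU at or below the threshold."""
    total_cpu = CPU_PER_USER * users
    needed = math.ceil(total_cpu / threshold) if total_cpu > 0 else 0
    return min(TOTAL_SERVERS, max(min_on, needed))


def cluster_watts(users: int, n_on: int) -> float:
    """Predicted cluster power; powered-off servers are assumed to draw 0 W."""
    return n_on * IDLE_WATTS + WATTS_PER_USER * users


def energy_wh(load, design, step_s=1.0):
    """Total energy in watt-hours for a sequence of per-step user counts."""
    min_on, threshold = DESIGNS[design]
    joules = sum(
        cluster_watts(u, servers_on(u, min_on, threshold)) * step_s for u in load
    )
    return joules / 3600.0


# Hypothetical 30-minute load: rises linearly to 2000 users, then falls back to 0.
load = [int(2000 * (1 - abs(t - 900) / 900)) for t in range(1800)]
for name in DESIGNS:
    print(name, round(energy_wh(load, name), 1), "Wh")
```

Because load is spread evenly and the power model is linear, the energy difference between designs in this sketch comes entirely from the idle power of the servers that each design keeps on.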

Criteria Determination:
Energy Efficiency- The four designs were simulated over 30 minutes and the total energy
saved was calculated after that amount of time. A linear function was used to calculate
the score based on energy saved.
High Availability- The four designs were simulated over 30 minutes and the average
percent of requests delayed was found. A linear model was used to calculate the score
based on the average percent of requests delayed.
Automation- If the design worked with no user input, a score of 8 was given. If the
design did not work without user input, a score of 0 was given.
Server Reliability- The four designs were simulated over 30 minutes and the total number
of times a server switched on or off was found. A high number of power cycles results in
a low score for reliability and a low number of power cycles results in a high score. A
linear model was used to calculate the score based on the number of times a server was
switched on or off.
Low Script Run Time- The time for execution of the script was found to be the same for
all designs due to the nature of the script. This results in all designs receiving the same
score of 4.
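The linear scoring used for several criteria can be sketched as a mapping between two endpoints. The endpoints used in the study are not stated, so the values of `worst` and `best` below, like the function name, are hypothetical.

```python
def linear_score(value: float, worst: float, best: float, max_points: float) -> float:
    """Map value linearly so that worst gives 0 points and best gives max_points."""
    if best == worst:
        return max_points
    fraction = (value - worst) / (best - worst)
    return max_points * min(max(fraction, 0.0), 1.0)


# Hypothetical "higher is better" criterion, such as percent energy saved.
print(linear_score(0.0, worst=0.0, best=8.0, max_points=10))  # 0.0
print(linear_score(4.0, worst=0.0, best=8.0, max_points=10))  # 5.0
print(linear_score(8.0, worst=0.0, best=8.0, max_points=10))  # 10.0
```

For a criterion where lower values are better, such as the percent of requests delayed or the number of power cycles, the same function applies with `worst` set to the larger value and `best` to the smaller one.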

Data Analysis and Discussion

The first set of data shows the relationship between number of users and CPU utilization.
The data suggest a fairly strong linear relationship between the two variables, with an R² value of 0.90123.
There are a few anomalies in the data toward the end of experimentation; these are most likely due
to other factors, such as disk reads/writes, causing variance in the processing capabilities. The slope
of the linear function, 0.0701, indicates that each additional user increases the CPU
utilization of the computer by 0.0701 percentage points. The y-intercept of 0 suggests that when the number of users is zero,
the CPU utilization of the server is 0%.
The second set of data shows the relationship between number of users and power
consumption in watts. The data suggest a fairly strong linear relationship between the two variables, with
an R² value of 0.92904. As in the previous set of data, there are anomalies toward the end
of experimentation. There are also a few anomalies throughout experimentation, suggesting that other
variables affect the power consumption. The slope of the linear function, 0.0763, indicates
that each additional user increases the power consumption of the server by 0.0763 watts. The
y-intercept of 54.263 suggests that when the number of users is zero, the power consumption is
54.263 watts.
The third set of data shows the relationship between CPU utilization and CPU
temperature in degrees Celsius. The data suggest a weaker relationship between the two variables,
with an R² value of 0.76021. There are several anomalies in the beginning of experimentation,
which may have substantially reduced the strength of the model. However, there is an
increasing trend between the two variables. The slope of the linear function, 0.5571, indicates
that each one-percentage-point increase in CPU utilization raises the CPU temperature by 0.5571 °C. The y-intercept of 25.093
suggests that when the CPU utilization is 0%, the CPU temperature is 25.093 °C.
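The fitted lines reported in Figures 10, 12, and 13 can be written as small functions for predicting server behavior from load. This is a minimal sketch; the function names and the example load of 200 users are hypothetical, and the fits are only supported inside the measured range of roughly 0 to 400 users.

```python
def cpu_utilization_pct(users: float) -> float:
    """Figure 10 fit: CPU utilization (%) as a function of number of users."""
    return 0.0701 * users


def power_watts(users: float) -> float:
    """Figure 12 fit: server power draw (W) as a function of number of users."""
    return 0.0763 * users + 54.263


def cpu_temperature_c(cpu_pct: float) -> float:
    """Figure 13 fit: CPU temperature (deg C) as a function of CPU utilization (%)."""
    return 0.5571 * cpu_pct + 25.093


users = 200  # hypothetical load inside the measured range
cpu = cpu_utilization_pct(users)
print(f"{cpu:.2f} % CPU, {power_watts(users):.2f} W, {cpu_temperature_c(cpu):.2f} deg C")
```

Note that the temperature fit takes CPU utilization, not number of users, as its input, matching the axes of Figure 13.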
Data of similar external studies have not been made publicly available at this time.
Nevertheless, the variables in this study are specific to the conditions provided; the type of
request sent by the user and the specifications of the computer may be vastly different in external
studies, possibly causing discrepancies.

Conclusions

A method of load balancing with greater energy efficiency can be designed; powering off
servers that are not necessary to process the current web requests to the data center saves energy
over time. In a data center simulation with only four servers, Design B was demonstrated to be
7.2% more energy efficient than the conventional method (Design A). Design C was
demonstrated to be 6.3% more energy efficient, and Design D was demonstrated to be 4.14%
more energy efficient. These percentages would be expected to increase if the simulation were scaled to a
larger data center because of the greater number of combinations of servers, which allows for more
variation in server status and, in turn, potentially greater energy efficiency.

Limitations and Assumptions

Limitations for this project include the use of only one server to simulate a large data
center. In doing so, it is assumed that a single server can accurately represent a large server farm
in which all servers are the same model. In the simulation, four servers were assumed to be able
to represent a large data center and the processes deciding the load distribution in the data center
were assumed to be able to scale to a large data center. It was also assumed that the CPU
utilization and other performance data of one server are the same as those of all servers of the same model.
Additionally, instead of having a natural flow of users to the webpage, a virtual user simulation
tool was used. It was assumed that this load simulation tool was an accurate representation of
real visitors to the website.
Based on the data gathered from experimentation, it was assumed that the CPU utilization
could be predicted to a moderate degree of accuracy based on the number of users or number of
requests being processed. It was also assumed that the power consumption could be predicted by
number of users.
Ambient room temperature, number of processes running on the server, physical location
or state of servers, type of computer or server, and software used to simulate load were
controlled in the experiment. However, humidity and background processes running on the
server were not controlled.
There were several possible sources of error during the experiment. For instance, other
processes may have been running in the background of the server, causing random increases and
fluctuations in energy consumption. Also, the software used to simulate the users to the web
server may not have produced a perfectly concurrent flow of users, causing the results to shift in
either direction. Finally, there may have been a difference in the time stamps of the different sets
of logged data, causing all dependent variable values to be measured for the incorrect
independent variable setting. However, because the independent variable was changed gradually,
this would not have caused a great difference in understanding the relationship between number
of users and power consumed.

Applications and Future Experiments

The proposed methods of load balancing can be scaled up to large server farms to
increase the energy efficiency of data centers throughout the world.
The prototype would first need to be redesigned for a larger server cluster and tested against
criteria including energy efficiency and availability. Once one of the methods is shown to be
more efficient in large data centers, the automation of the process would need to be scaled up
for the greater number of servers. With all aspects tested and prepared, the new method could
be implemented in data centers to begin saving energy immediately.
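The core decision in a consolidating load balancer, how many servers to keep powered on for a given load, can be sketched as follows. This is a minimal illustration and not the prototype used in the study. The function name, the per-server capacity, the headroom fraction, and the minimum number of active servers are all assumptions made for the example.

```python
import math


def active_servers_needed(requests_per_s, capacity_per_server, total_servers,
                          headroom=0.25, min_active=1):
    """Estimate how many servers should stay on for the current load.

    headroom is the fraction of each server's capacity kept free to absorb
    load spikes; min_active keeps the site available even when idle.
    """
    usable = capacity_per_server * (1.0 - headroom)
    needed = math.ceil(requests_per_s / usable) if requests_per_s > 0 else 0
    # Never go below the availability floor or above the cluster size.
    return max(min_active, min(total_servers, needed))


# Hypothetical cluster: 10 servers, each able to handle 100 requests per second.
moderate = active_servers_needed(300, 100, 10)   # 4 servers stay on
idle = active_servers_needed(0, 100, 10)         # 1 server stays on
overload = active_servers_needed(2000, 100, 10)  # all 10 servers stay on
```

A larger headroom trades energy savings for a better ability to absorb load spikes, which is the same trade-off between efficiency and availability that would need to be tested at scale.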
In the future, the study could be repeated with new servers that have few background
processes running to obtain a more accurate measure of the energy saved. Also, the entire test
could be performed on a full-scale data center to obtain an actual representation of total
energy consumption and availability. The variation in the number of users of any given web page
could be predicted using visitor history and mathematical modeling, and this prediction could be
factored into the load distribution design. Additionally, further characteristics of the servers,
including CPU temperature, could be considered in the balancing of the load.
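As a minimal sketch of predicting load from visitor history, the example below averages past user counts by hour of day. The history values are hypothetical, the function name is introduced for the example, and a real model would likely need to account for weekdays, trends, and special events.

```python
from collections import defaultdict


def hourly_forecast(history):
    """Average past user counts per hour of day.

    history is a list of (hour_of_day, users) pairs from earlier days.
    Returns a dict mapping each observed hour to its mean user count.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, users in history:
        totals[hour] += users
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


# Hypothetical visitor history: (hour of day, concurrent users).
history = [(9, 100), (9, 120), (14, 300), (14, 340), (14, 320)]
forecast = hourly_forecast(history)
```

A load balancer could consult such a forecast shortly before each hour and power servers on ahead of an expected rise in visitors, rather than reacting only after the load has already increased.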







Literature Cited

Alger, D. (2010). Grow a greener data center. Indianapolis, IN: Cisco Press.

Cho, J., Lim, T., and Kim, B.S. (2012). Viability of datacenter cooling systems for energy efficiency
in temperate or subtropical regions: Case study. Energy and Buildings, Retrieved from
http://www.sciencedirect.com

Dixit, S., Ekbote, A., Judge, J., and Pouchet, J. (2008). Reducing data center energy
consumption. ASHRAE Journal, 50 (11), 14.

Doyle, R., Chase, J., Gadde, S., and Vahdat, A. (2002). The Trickle-Down Effect: Web Caching
and Server Request Distribution. Computer Communications, Retrieved from
http://www.sciencedirect.com

Fakhim, B., Behnia, M., Armfield, S.W., and Srinarayana, N. (2012). Cooling solutions in an
operational data centre: A case study. Applied Thermal Engineering, 31 (14-15), 2279-
2291. Retrieved from http://www.sciencedirect.com

Green, M., Karajgikar, S., Vozza, P., Gmitter, N., and Dyer, D. (2012). Achieving Energy
Efficient Data Centers Using Cooling Path Management Coupled with ASHRAE
Standards. Semiconductor Thermal Measurement and Management Symposium, 288-292.
Retrieved from http://ieeexplore.ieee.org/ doi: 10.1109/STHERM.2012.6188862

Iyengar, M., David, M., Parida, P., Kamath, V., Kochuparambil, B., Graybill, D., Chainer, T.
(2012). Extreme energy efficiency using water cooled servers inside a chiller-less data
center. Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm),
137-149. Retrieved from http://ieeexplore.ieee.org/ doi: 10.1109/ITHERM.2012.6231424

Mukherjee, T., Banerjee, A., Varsamopoulos, G., Gupta, S., and Rungta, S. (2009). Spatio-
temporal thermal-aware job scheduling to minimize energy consumption in virtualized
heterogeneous data centers. Computer Networks, 53 (17), 2888-2904. Retrieved from
http://www.sciencedirect.com

Patterson, M.K. (2008). The Effect of Data Center Temperature on Energy Efficiency. Thermal
and Thermomechanical Phenomena in Electronic Systems, 1167-1174. Retrieved from
http://ieeexplore.ieee.org/

Poniatowski, M. (2010). Foundations of green IT: Consolidation, virtualization, efficiency, and
ROI in the data center. Indianapolis, IN: Prentice Hall.

Ricciardi, S., Careglio, D., Santos-Boada, G., Solé-Pareta, J., Fiore, U., and Palmieri, F. (2011).
Saving Energy in Data Center Infrastructures. Data Compression, Communications and
Processing (CCP), 265-270. Retrieved from http://ieeexplore.ieee.org/ doi:
10.1109/CCP.2011.9

Sun, H.S., and Lee, S.E. (2006). Case study of data centers energy performance. Energy and
Buildings, 38 (5), 4078-4094. Retrieved from http://www.sciencedirect.com

Appendix

DataSource.php:
Publicly available at http://code.google.com/p/php-csv-parser/

db_config.php:
<?php

// These constants define how to connect to the database and which database to connect to
// Make sure to change these values to fit your own
define('DB_HOST', 'localhost');
// DO NOT USE root user in a production environment
define('DB_USER', 'root');
// Make sure to add a secure password
define('DB_PASS', '');
// This needs to change to match your database. Often, on shared hosting, it'll be your cpanel
// username, followed by an underscore,
// followed by the actual database name
define('DB_NAME', 'anish_db');

?>

insert_csv_data_into_db.php:
<?php

// Contains the login info for the database
require_once('db_config.php');

$csv_data_file_src = 'FILTERED_hair salon wrentham ma 59.csv';
$csv_data_file_src2 = 'FILTERED_frozen yogurt wrentham ma.csv';

// THESE ALSO NEED TO BE CHANGED UPON UPLOAD
$visual_menu_id = 1;
$user_id = 2;

// Does the actual storing of data into the DB
storeDataFromCSV($csv_data_file_src, 1, 2);
storeDataFromCSV($csv_data_file_src2, 1, 3);

function retrieveAllDataRowsAsAssociativeArray($csv_filesrc, $row_to_use_as_header = 1){
    require_once('DataSource.php');
    // Instantiate the datasource class
    $csv = new File_CSV_DataSource;
    // We can set it to ignore X number of rows

    // load the csv file, ignore the first line since it doesn't have the header
    $csv->load($csv_filesrc, $row_to_use_as_header);

    // Make the CSV symmetric so that we can actually use most of the functions
    $csv->symmetrize();
    // Retrieve the data from the new CSV as a set of arrays with arrays in them
    /* EXAMPLE:
    array (
        0 =>
        array (
            'name' => 'john',
            'age' => '13',
            'skill' => 'knows magic',
        ),
        1 =>
        array (
            'name' => 'tanaka',
            'age' => '8',
            'skill' => 'makes sushi',
        ),
        2 =>
        array (
            'name' => 'jose',
            'age' => '5',
            'skill' => 'dances salsa',
        ),
    )
    */
    $dataArray = $csv->connect();
    return $dataArray;
}

function storeDataFromCSV($csv_data_file_src, $visual_menu_id, $user_id){
    // Get the array with all data rows matched to header row
    $csvDataArray = retrieveAllDataRowsAsAssociativeArray($csv_data_file_src, 0);
    print_r(array_keys($csvDataArray[0]));
    //print_r($csvDataArray);

    // Connect to the database
    $dbc = mysqli_connect(DB_HOST, DB_USER, DB_PASS, DB_NAME)
        or die('Could not connect to database...');

    // Check for duplicates in the database before inserting
    foreach($csvDataArray as $csvRow){

        $company_name = mysqli_real_escape_string($dbc, trim($csvRow['Company']));
        $website_url = mysqli_real_escape_string($dbc, trim($csvRow['Webpage']));
        // $slogan
        $addr1 = mysqli_real_escape_string($dbc, trim($csvRow['Address1']));
        $addr2 = mysqli_real_escape_string($dbc, trim($csvRow['Address2']));
        $phone = mysqli_real_escape_string($dbc, trim($csvRow['Phone']));
        $email = mysqli_real_escape_string($dbc, trim($csvRow['E-mail']));
        // $is_mobile

        /*
        $company_name = $csvRow['Company'];
        $website_url = $csvRow['Webpage'];
        // $slogan
        $addr1 = $csvRow['Address1'];
        $addr2 = $csvRow['Address2'];
        $phone = $csvRow['Phone'];
        $email = $csvRow['E-mail'];
        // $is_mobile
        */

        // Assume it is already there
        $duplicate_exists = True;

        // Check duplicate by comparing the website URL
        $query1 = "
            SELECT website_url FROM company_tbl WHERE
            website_url = '$website_url'
        ";

        $data = mysqli_query($dbc, $query1)
            or die("Error INS: ".mysqli_error($dbc));

        /*
        if(mysqli_num_rows($data) == 0){
            // There are no duplicates
            $duplicate_exists = False;
        }
        */
        // We want more duplicates right now because we want to try to increase the CPU usage
        $duplicate_exists = False;

        // Insert the same one 30 times
        for($i=0; $i < 30; $i++){
            // This is a new one, so let's insert it into the DB
            if(!$duplicate_exists){

                $query2 = "
                    INSERT INTO company_tbl (name, website_url, address_line1,
                        address_line2, phone, email, visual_menu_id, user_id, date_added)
                    VALUES ('$company_name', '$website_url', '$addr1', '$addr2',
                        '$phone', '$email', '$visual_menu_id', '$user_id', NOW())
                ";

                // Submit the queries
                $result = mysqli_query($dbc, $query2)
                    or die("Error RES: ".mysqli_error($dbc));
                if($result){
                    echo "$website_url successfully inserted into database for user $user_id <br />";
                }

            }else{
                echo "Duplicate already exists for $website_url. It was not inserted.<br />";
            }
        }
    }

    // Close the database
    mysqli_close($dbc);
}

?>



csv file format:
Company,Address1,Address2,Webpage,Phone,E-mail
American Skin Care,158 Main Street,"Norfolk, MA 02056",http://americanskincarenorfolk.com/,(508) 528-2888,none
Beauty Nail & Spa Salon,11 Robert toner Blvd,"North Attleborough, MA 02760",http://beautynailspa-attleboro.com/,(508) 699-8881,none
Hair's Boston,225 Franklin Village Drive,"Franklin, MA 02038",http://hairsboston.com/,(508) 520-3919,none
Joseph Witt Salon,313 North Main Street,"Mansfield, MA 02048",http://josephwittsalon.com/,(508) 339-2623,none
L'Equipe Personalized Hairdressing,276 Franklin Village Drive,"Franklin, MA 02038",http://lequipesalon.com/,(508) 520-7828,none
MG Salon & Spa,114 Main Street,"Medway, MA 02053",http://mgsalonspa.com/,(508) 533-0779,none
Phillip Richard Salon,9 Washington Street,"Plainville, MA 02762",http://philliprichardsalon.com/,(508) 643-3700,none
Salon Michique's,1764 Mendon Road,"Cumberland, RI 02864",http://salonmichiquesspa.com/,(401) 333-6111,none
Unique Eyebrow Threading,3335 Mendon Rd,"Cumberland, RI 02864",http://uniqueeyebrowthreading.com/,(401) 405-0787,none
American Laser Skincare,550 North Main Street #4,"Attleboro, MA 02703",http://www.americanlaser.com/,(508) 223-4400,none
Beauty By Zangi,5 West Street,"Walpole, MA 02081",http://www.beautybyzangi.info/,(508) 660-1031,none
Bohemia,762 East Washington Street,"North Attleborough, MA 02760",http://www.bohemiasalon.com/,(508) 695-5500,none
Brian Richards Salon,Suite 3,"456 West Central Street, Franklin, MA 02038",http://www.brianrichardsalon.com/,(508) 528-7300,none



Acknowledgements

The author wishes to thank several mentors who contributed to various aspects of this
project. Mr. Harvell, a WPI graduate and visiting scholar at Mass Academy, provided ongoing
support and guidance throughout the project and offered helpful ideas and suggestions
during project development. Mr. Puneet Kohli, Director of Software Engineering at RSA,
provided valuable information that aided the understanding of the project and surrounding
fields. Finally, the author would like to thank his parents and brother for contributing support,
funding, and resources, including all computers, servers, and routers used for
experimentation.