02 Types PDF

Types of Grids
Dr. Alejandro Zunino
CONICET / ISISTAN-UNICEN
Contents
In this class
Types of Grids
Desktop Grids
Motivations
Limitations
Boinc!
Applications
Clusters
Types of Grids
Grid computing vendors have adopted various nomenclatures to explain and define the different types of grids:
based on the structure of the organization (virtual or otherwise) that is served by the grid
defined by the principal resources used in the grid.
Departmental Grids
Solve problems for a particular group of people within an enterprise:
Cluster Grids (Sun Microsystems):
One or more systems working together to provide a single point of access to users.
Used by a team for a single project.
Support both high-throughput and high-performance jobs.
Infra Grids (IBM):
A grid that optimizes resources within an enterprise.
Does not involve any other internal partner.
Enterprise Grids
Consist of resources spread across an enterprise. Provide service to all users within that enterprise:
Enterprise Grids (Platform Computing):
Deployed within large corporations that have a global presence or a need to access resources outside a single corporate location.
Intra Grids (IBM):
Resource sharing among different groups within an enterprise constitutes an intra grid.
Can be local or traverse the wide area network.
Campus Grids (Sun Microsystems):
Enable multiple projects or departments to share computing resources in a cooperative way.
May consist of dispersed workstations and servers as well as centralized resources located in multiple administrative domains, in departments, or across the enterprise.
Extraprise Grids
Established between companies, their partners, and their customers. The grid resources are generally made available through a virtual private network (VPN):
Extra Grids (IBM):
Enable sharing of resources with external partners.
Assume that connectivity between the two enterprises is through some trusted service, such as a private network or a VPN.
Partner Grids (Platform Computing):
Grids between organizations within similar industries, which have a need to collaborate on projects and use each other's resources as a means to reach a common goal.
Global Grids
Grids established over the public Internet. They can be established by organizations to facilitate their business or purchased in part, or in whole, from service providers:
Global Grids (Sun):
Allow users to tap into external resources.
Provide the power of distributed resources to users anywhere in the world for computing and collaboration.
They can be used to send overflow work over the public network to a grid services provider.
Inter Grids (IBM):
Provide the ability to share compute and data/storage resources across the public Web.
Compute Grids
Provide access to computational resources:
Desktop Grids:
Leverage the resources of desktop computers.
Server Grids:
Some corporations, while adopting Grid Computing, keep it limited to server resources.
Special servers are bought solely for the purpose of creating an internal utility grid with resources made available to various departments.
No desktops are included in server grids.
High-Performance/Cluster Grids:
High-end systems, such as supercomputers or HPC clusters.
Others
Data Grids:
Optimized for data-oriented operations.
Utility Grids:
Commercial compute resources that are maintained and managed by a service provider.
Customers that have the need to augment their existing, internal computational resources may purchase cycles from a utility grid.
Also offer applications that can be purchased by the minute.
Desktop Grids
Overview
Historical context
What is (and what isn't) a Desktop Grid?
Deployment challenges and value proposition
development for Desktop Grid technology
Key areas to assess when evaluating the
suitability of a Desktop Grid
The role of Desktop Grids in an Enterprise
computing infrastructure standards on
Desktop Grids
Examples
Motivation
CPU Availability
High computational power at low cost
Reuse existing infrastructure of resources
Successful deployment of high-throughput, compute-intensive applications
High-Throughput?
Typically means that the performance metric is the task completion rate over long periods of time (e.g., a month)
As opposed to makespan (the time to complete one whole set of tasks)
Cause Computing
Searching for extra-terrestrials:
SETI@home: http://setiathome.ssl.berkeley.edu/
Evaluating AIDS drug candidates:
FightAIDS@Home: http://www.fightaidsathome.org/
Screening for extremely large prime numbers:
Great Internet Mersenne Prime Search: http://www.mersenne.org/prime.htm
Predicting climate on a global scale:
ClimatePrediction.net: http://www.climateprediction.net/index.php
Features
Long computations
Short communication packets
User-initiated tasks have preference
Minimally intrusive on the user and his
Internet connection
Primitive version of today's Desktop Grids
Why Would I Donate CPU Time?
Donate to scientific cause
Many users wish to advance the specific field of study
Projects that help fight disease may have an emotional connection for those participating
Stress test computers
Places a computer under full CPU load
Teams, credits, and competition
Hopes of climbing to the top of the world charts
Personal benefits and recognition
Projects such as PlanetQuest plan on allowing individuals to name those planets discovered using their computers
Limitations
Lack of Resource Management
Passive resource management
Rely on the enrolled PCs to initiate communication with the central administration server on a periodic basis.
Limits the degree to which the timeliness of results from such a grid can be predicted.
Limits the ability to re-prioritize the computational behavior of the grid
for example: replacing the PC that is working on a particular task, in a timely manner.
Limitations
Lack of Security
Even if some form of encryption is used in transit, the data usually reside in an unencrypted format on the enrolled PC.
This limits the nature of the problems that can be attempted over the public Internet to those in which compromise of the data is not a pressing issue.
The answers produced on the enrolled PC may be vulnerable to tampering:
Ex: SETI@Home alternative clients (buggy)
Limitations
Machine Heterogeneity
A wide variety of machines might be enrolled; these can vary in CPU speed, RAM, hard-drive capacity, and operating system level.
The management infrastructure either needs to operate at the lowest common denominator or needs to be aware of differences in the machines and assign tasks appropriately.
Limitations
Resource Availability
The cause-computing paradigm relies on the idea of
voluntary participation.
The PC may be turned off for the night, the
screensaver may be changed, the control
program may be disabled (either deliberately or
inadvertently), etc.
This adds another layer of unpredictability to the
performance expectations that can be associated
with such a grid.
So... What For?

Embarrassingly Parallel (aka Pleasantly
Parallel) applications
independent tasks (concurrent, out of order)
Example: Mandelbrot
Data Parallel / Iterative applications

Synchronized processes
Example: Jacobi, Matrix Multiply
Workflow applications
Described by a DAG
Example: some image processing applications
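The Mandelbrot example can be sketched in C to show why it counts as embarrassingly parallel. This is a minimal sketch, not code from any grid product: the function names mandel_iters and mandel_row, the image region, and the one-row-per-task split are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Escape-time iteration count for the point c = (cr, ci). */
static int mandel_iters(double cr, double ci, int max_iter) {
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;  /* real part of z*z + c */
        zi = 2.0 * zr * zi + ci;            /* imaginary part of z*z + c */
        zr = t;
        n++;
    }
    return n;
}

/* One independent task (hypothetical split): compute row `row` of a
   width x height image covering [-2,1] x [-1.5,1.5].
   The row reads no other row's results, so rows can be handed to
   different clients and completed in any order. */
void mandel_row(int row, int width, int height, int max_iter, int *out) {
    assert(width > 1 && height > 1 && out != NULL);
    double ci = -1.5 + 3.0 * row / (height - 1);
    for (int x = 0; x < width; x++) {
        double cr = -2.0 + 3.0 * x / (width - 1);
        out[x] = mandel_iters(cr, ci, max_iter);
    }
}
```

Each call to mandel_row depends only on its own arguments, which is the property that lets a server distribute rows as separate work units without any synchronization between clients.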
Examples
67 TFlops/sec, 500,000 workers, $700,000
17.5 TFlops/sec, 80,000 workers
186 TFlops/sec, 195,000 workers
Desktop Grid
A defined (named) collection of machines on
a shared network.
may include dedicated machines, intermittently
connected machines, and shared machines
Any single machine is part of one, and only one,
Desktop Grid.
A set of user-controlled policies describing

the way in which each of these machines
participates in the grid:
Support for automated addition and removal of
machines without user or administrative
intervention.
Desktop Grid
The machines on the grid are unaware of
each other except as informed by the central
server.
client-server architecture (no peer-to-peer)
Managed mechanism for distribution,

execution, and retrieval of work to and from
the grid under control of a central server.
Components
Grid Server
This is a central machine that controls and
administers the Desktop Grid.
Grid Client
An individual node that is a member of the
Desktop Grid from which spare computational
resources will be harvested.
Grid Client Executive

The software component of the grid
infrastructure that resides on a PC, enables that
PC to serve as a Grid Client, and manages all
interaction between the Grid Client and the Grid
Server.
Components
Work Unit
A computation assigned to a Grid Client by the Grid Server:
a grid-enabled version of an application
instructions for establishing an environment for the application on the Grid Client
input data (or a pointer to the location of the input data)
instructions on how to execute the application and produce the output data.
Technologies: Considerations
Security
Unobtrusiveness
Application Integration
Robustness
Scalability
Central Management
Technologies: Security
Disallow (or limit) access to network or local
resources by the distributed application.
Encrypt application and data to preserve
confidentiality and integrity.
Ensure that the Grid Client environment (disk
contents, memory utilization, registry contents, and
other settings) remains unchanged after running
the distributed application.
Prevent local user from interfering with the
execution of the distributed application.
Prevent local user from tampering with or deleting
data associated with the distributed application.
Technologies: Integration
Ability to simulate a standalone environment
within the Grid Client.
Integrated security and encryption of
sensitive data.
Easy integration (tools, examples, and
wizards are provided).
Support for any application...
Binary-level integration (no recompilation,
relinking, or source code access...).
Technologies: Scalability
Automatic addition, configuration, and
registration of new Grid Clients.
Compatible with heterogeneous resource

population.
Configurable over multiple geographic

locations.
Technologies: Unobtrusiveness
Centrally manage unobtrusiveness levels that are
changeable based on time-of-day or other factors.
Ensure that the Grid Client Executive relinquishes
client resources automatically.
Ensure invisibility to local user.
Prevent distributed application from displaying
dialogs or action requests.
Prevent performance degradation (and total system
failure) due to execution of the distributed
application.
Require very little (ideally, zero) interaction with the
day-to-day user of the Grid Client.
Technologies: Robustness
Allocate work to appropriately configured Grid
Clients.
Automatically reallocate work units when Grid
Clients are removed from grid either permanently or
temporarily.
Automatically reallocate work units due to other
resource or network failures.
Prevent aberrant applications from completely
consuming Grid Client resources (disk, memory,
CPU, etc.).
Provide transparent support for many OSs in the Grid
Client population.
Technologies: Central Manageability

Automated monitoring of all grid resources.
Central queuing and management of work units for
the grid.
Central policy administration for grid access and
utilization.
Compatibility with existing IT management systems.
Product installation and upgrade can be
accomplished using typical enterprise software
management environments.
Remote client deployment and management.
PC Grids Versus Supercomputers
An Example: BOINC
Berkeley Open Infrastructure for Network
Computing (BOINC)
http://boinc.berkeley.edu/
Features:
Flexible application framework
Existing applications in common languages (C, C++,
Fortran) can run as BOINC applications with little or
no modification.
New versions of applications can be deployed with no
participant involvement.
Security
BOINC protects against several types of attacks:
digital signatures based on public-key encryption protect against the distribution of viruses.
Multiple servers and fault-tolerance
Separate scheduling and data servers, with multiple servers of each type.
Clients automatically try alternate servers; if all servers are down, clients do exponential backoff to avoid flooding the servers when they come back up.
An Example: BOINC
Support for large data
BOINC supports applications that produce or consume large amounts of data, or that use large amounts of memory.
Data distribution and collection can be spread across many servers, and participant hosts transfer large data unobtrusively.
Users can specify limits on disk usage and network bandwidth. Work is dispatched only to hosts able to handle it.
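The exponential backoff mentioned for clients can be illustrated with a small C sketch. It is not BOINC's actual client code: the function name backoff_delay, the base delay, and the cap are assumptions, and real clients typically also add random jitter, which is omitted here.

```c
#include <assert.h>

/* Delay in seconds before retry number `attempt` (0-based):
   base * 2^attempt, capped at max_delay.
   Hypothetical helper, not part of the BOINC API. */
double backoff_delay(int attempt, double base, double max_delay) {
    assert(attempt >= 0 && base > 0.0 && max_delay >= base);
    double d = base;
    /* Double once per failed attempt; stop early once the cap is reached. */
    for (int i = 0; i < attempt && d < max_delay; i++) {
        d *= 2.0;
    }
    return d < max_delay ? d : max_delay;
}
```

With a base of 60 seconds and a cap of one hour, successive failures wait 60, 120, 240, ... seconds until the cap applies, so thousands of clients do not all reconnect at the same moment when the servers return.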
BOINC
BOINC Credits
Credit System is designed to avoid cheating
by validating results before granting credit
This ensures users are returning accurate
results
BOINC Manages the Details, But...

Validation:
When a sufficient number (a 'quorum') of successful results have been returned, the application compares them and sees if there is a 'consensus':
method of comparing results (which may need to take into account platform-varying floating point arithmetic)
policy for determining consensus (e.g., best two out of three)
If a consensus is reached, a particular result is designated as the 'canonical' result.
If a result arrives after a consensus has already been reached, the new result is compared with the canonical result; this determines whether the user gets credit.
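The validation steps can be sketched in C as follows. This is only an illustration under stated assumptions: each result is reduced to a single double, results "agree" when they differ by at most a tolerance, and the names results_agree, find_canonical, and late_result_gets_credit are hypothetical rather than BOINC validator functions.

```c
#include <assert.h>
#include <stddef.h>

/* Comparison method: two results agree if they differ by at most tol,
   which absorbs platform-varying floating point arithmetic. */
static int results_agree(double a, double b, double tol) {
    double d = a - b;
    if (d < 0.0) d = -d;
    return d <= tol;
}

/* Consensus policy: return the index of the first result that at least
   `quorum` results (itself included) agree with, or -1 if none does.
   That result plays the role of the canonical result. */
int find_canonical(const double *results, size_t n, size_t quorum, double tol) {
    assert(results != NULL && quorum >= 1);
    for (size_t i = 0; i < n; i++) {
        size_t votes = 0;
        for (size_t j = 0; j < n; j++) {
            if (results_agree(results[i], results[j], tol)) votes++;
        }
        if (votes >= quorum) return (int)i;
    }
    return -1;
}

/* A result arriving after consensus earns credit only if it matches
   the canonical result. */
int late_result_gets_credit(double canonical, double late, double tol) {
    return results_agree(canonical, late, tol);
}
```

A "best two out of three" policy corresponds to calling find_canonical with three results and a quorum of 2.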
Projects Using BOINC
Climateprediction.net, BBC Climate Change Experiment, and Seasonal Attribution Project: study climate change.
Cell Computing: biomedical research
Einstein@home: search for gravitational signals emitted by pulsars.
Predictor@home: predict protein structure from protein sequence
SZTAKI Desktop Grid: search for generalized binary number systems.
LHC@home: improve the design of the CERN LHC particle accelerator
Quantum Monte Carlo at Home: study the structure and reactivity of molecules using Quantum Chemistry.
SIMAP: calculate protein similarity data for use by many biological research projects.
Rosetta@home: help researchers develop cures for human diseases.
World Community Grid: advance our knowledge of human disease
SETI@home: Look for radio evidence of extraterrestrial life.
Using BOINC: hello.C
#include <stdio.h>        // fprintf(), FILE, fflush(), fclose()
#include <stdlib.h>       // exit(), rand()
#include "diagnostics.h"  // boinc_init_diagnostics()
#include "boinc_api.h"    //
#include "filesys.h"      // boinc_fopen(), etc...
#include "util.h"         // parse_command_line(), boinc_sleep()

int main(int argc, char **argv) {
    int rc;                   // return code from various functions
    char resolved_name[512];  // physical file name for out.txt
    FILE* f;                  // file pointer for out.txt

    /* Before initializing BOINC itself, initialize diagnostics so as
       to get stderr output to the file stderr.txt */
    rc = boinc_init_diagnostics(BOINC_DIAG_REDIRECTSTDERR|
                                BOINC_DIAG_DUMPCALLSTACKENABLED|
                                BOINC_DIAG_TRACETOSTDERR);
    if (rc) exit(rc);

    /* Output written to stderr will be returned with the Result (task) */

    /* BOINC apps that do not use graphics just call boinc_init() */
    rc = boinc_init();
    if (rc) {
        fprintf(stderr, "APP: boinc_init() failed. RC=%d\n", rc);
        fflush(0);
        exit(rc);
    }

    /* Input/output files need to be "resolved" from their logical name
       for the application to the actual path on the client's disk */
    rc = boinc_resolve_filename("out.txt", resolved_name,
                                sizeof(resolved_name));
    if (rc) {
        fprintf(stderr, "APP: cannot resolve output file name. RC=%d\n", rc);
        boinc_finish(rc);  /* back to BOINC core */
    }

    fprintf(stderr, "Hello, stderr!\n");

    /* Open files with boinc_fopen() not just fopen()
       (Output files should usually be opened in "append" mode, in case
       this is actually a restart (which will not be the case here)) */
    f = boinc_fopen(resolved_name, "a");
    fprintf(f, "Hello, BOINC World!\n");

    /* Now run up a wee bit of credit.
       This is the "worker" loop */
    { int j, num, N;
        N = 123456789;
        fprintf(f, "Starting some computation...\n");
        for ( j=0 ; j<N ; j++ ){
            num=rand()+rand();  // just do something to spin the wheels
        }
        fprintf(f, "Computation completed.\n");
    }
    fclose(f);
    fprintf(stderr, "goodbye!\n");

    /* All BOINC applications must exit via boinc_finish(rc), not
       merely exit() */
    boinc_finish(0);  /* does not return */
}

/*
   Dummy graphics API entry points.
   This app does not do graphics,
   but it still must provide these empty callbacks.
*/
void app_graphics_init() {}
void app_graphics_resize(int width, int height){}
void app_graphics_render(int xs, int ys, double time_of_day) {}
...
Using BOINC: hello_re.xml (Results)
<file_info>
    <name><OUTFILE_0/></name>
    <generated_locally/>
    <upload_when_present/>
    <url><UPLOAD_URL/></url>
    <max_nbytes>2048</max_nbytes>
</file_info>
<result>
    <file_ref>
        <file_name><OUTFILE_0/></file_name>
        <open_name>out.txt</open_name>
    </file_ref>
</result>
Using BOINC: hello_wu.xml (Work Unit)
<workunit>
    <min_quorum>1</min_quorum>
    <target_nresults>2</target_nresults>
    <max_error_results>3</max_error_results>
    <max_total_results>9</max_total_results>
    <max_success_results>11</max_success_results>
    <rsc_fpops_est>3e9</rsc_fpops_est>
    <rsc_fpops_bound>9e11</rsc_fpops_bound>
    <delay_bound>8000</delay_bound>
    <rsc_memory_bound>204800</rsc_memory_bound>
    <rsc_disk_bound>307200</rsc_disk_bound>
</workunit>
Applications for Desktop Grids
Analyzing Application Distribution Possibilities
Data Parallel:
process large input datasets in a sequential fashion with no application dependencies between or among the records of the dataset.
Parameter Sweep:
use an iterative approach to generate a multidimensional series of input values used to evaluate a particular set of output functions.
Probabilistic:
process a very large number of trials using randomized inputs to generate input values used to evaluate a particular set of output functions.
Figure: Input -> Application -> Output
I1 -> App* -> O1
I2 -> App* -> O2
...
In -> App* -> On
*The application is untouched
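A parameter sweep can be illustrated by mapping a work-unit index to one point of an input grid. The sketch below assumes a two-parameter sweep on an evenly spaced grid; the type SweepPoint and the function sweep_point are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* Hypothetical input of one work unit: a point (a, b) of the sweep. */
typedef struct {
    double a;
    double b;
} SweepPoint;

/* Map work-unit index k in [0, na*nb) to one point of an na x nb grid
   spanning [a_min, a_max] x [b_min, b_max]. Each index yields an
   independent input for the unchanged application. */
SweepPoint sweep_point(int k, int na, int nb,
                       double a_min, double a_max,
                       double b_min, double b_max) {
    assert(na > 1 && nb > 1 && k >= 0 && k < na * nb);
    SweepPoint p;
    /* k / nb is the grid row (integer division), k % nb the column. */
    p.a = a_min + (a_max - a_min) * (k / nb) / (na - 1);
    p.b = b_min + (b_max - b_min) * (k % nb) / (nb - 1);
    return p;
}
```

A server-side loop over k from 0 to na*nb - 1 would then generate one input record per work unit, matching the I1 ... In column of the figure.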
Enabling Applications for Grids

How to decompose the input(s) of a large, monolithic job into an equivalent set of smaller input(s) that can be processed in a distributed fashion?
How to recompose the output(s) from these smaller distributed instances of the application into a combined output that is indistinguishable from that which would have been produced by the single large job?
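The two questions can be made concrete with a toy job. In this sketch the "application" is a sum of squares over an index range, chosen only because its partial outputs combine by simple addition; the names Chunk, chunk_of, process_chunk, and recompose are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [begin, end) handled by one work unit. */
typedef struct {
    long begin;
    long end;
} Chunk;

/* Decompose: split [0, total) into `parts` nearly equal chunks and
   return chunk number i. The first (total % parts) chunks get one
   extra element. */
Chunk chunk_of(long total, int parts, int i) {
    assert(total >= 0 && parts > 0 && i >= 0 && i < parts);
    long base = total / parts;
    long extra = total % parts;
    Chunk c;
    c.begin = i * base + (i < extra ? i : extra);
    c.end = c.begin + base + (i < extra ? 1 : 0);
    return c;
}

/* Stand-in application: sum of squares over one chunk. */
long long process_chunk(Chunk c) {
    long long s = 0;
    for (long k = c.begin; k < c.end; k++) {
        s += (long long)k * k;
    }
    return s;
}

/* Recompose: combine partial outputs into the monolithic job's answer. */
long long recompose(const long long *partial, int parts) {
    assert(partial != NULL && parts > 0);
    long long s = 0;
    for (int i = 0; i < parts; i++) {
        s += partial[i];
    }
    return s;
}
```

The check that matters is the one stated in the text: the recomposed output must equal what process_chunk would return for the single chunk covering the whole range.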
Determining Application Suitability
Compute Intensity:
reflects the relative percentage of time spent moving data to and from the Desktop Grid Client compared to the time spent performing calculations on that data:
CI = (4 * WorkUnitDuration) / (InputSize + OutputSize)
with WorkUnitDuration in seconds and InputSize and OutputSize in KB.
In general, grid-enabled applications where CI is greater than 1.0 are well suited for distributed processing using a Desktop Grid solution.
What if the network is very fast (e.g., 1 Gbps)? Then CI < 1 may be OK.
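The Compute Intensity rule is easy to turn into a helper for screening candidate applications. The sketch below assumes the duration is given in seconds and the sizes in KB; the names compute_intensity and suits_desktop_grid are hypothetical.

```c
#include <assert.h>

/* CI = 4 * duration / (input + output),
   with duration in seconds and sizes in KB. */
double compute_intensity(double duration_s, double input_kb, double output_kb) {
    assert(duration_s >= 0.0 && input_kb + output_kb > 0.0);
    return 4.0 * duration_s / (input_kb + output_kb);
}

/* Rule of thumb from the text: CI greater than 1.0 suggests the
   application is well suited to a Desktop Grid. */
int suits_desktop_grid(double ci) {
    return ci > 1.0;
}
```

For a work unit that runs 900 seconds, reads 2,000 KB, and writes 400 KB, compute_intensity(900, 2000, 400) evaluates to 1.5, so the rule of thumb accepts it. The 1.0 threshold is only a guideline; on a very fast network a lower value may still be acceptable.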
Determining Application Suitability
Example:
a typical work unit executes in 15 minutes (900 seconds) on a hypothetical average grid client
consumes 2MB (2,000 KB) of input data
produces 0.4MB (400 KB) of output data
CI = (4 * 900) / (2000 + 400) = 1.5
The Grid Server: Additional Services
Client Group-level Operations
As the size and complexity of the grid grows, it is more useful to administer the grid as a collection of virtual, overlapping groups.
Set of rules that allow client membership to be determined automatically for both new Grid Clients and for Grid Clients that have changed status (for example, upgrading the Windows operating system on that client or adding memory to that client).
Data Caching
The time needed to move data to and from the Grid Client plays an important role in the calculation of CI.
Data caching, in which data needed for a work unit is placed in (or very close to) the Grid Client, can be:
manually controlled:
certain data sets can be pushed to particular Clients and then any work unit that needs those data sets is assigned exclusively to those Clients
automatically administered:
the Grid Server examines its queue of work and ensures that any data needed for a work unit will be available at the Client
Job-level Scheduling and Administration:
Run this application using these inputs with this priority and put the answers here.
This is substantially more abstract than the fundamental work unit level of the internal workings of the Desktop Grid.
Should support various levels of job priority along with the ability to select particular Clients (or groups of Clients) for a particular job based on characteristics of the job itself.
Performance Tuning and Analysis
Data and reports to allow an administrator to determine important performance characteristics of each Grid Client and the grid as a whole:
optimum (theoretical) throughput calculations
actual throughput calculations for any particular job or set of work units
identification of any problematic Clients (or groups of Clients)
...
Security
Each separately identified function within the Grid Server user environment should include user-level security:
which users may add new applications
which users may submit jobs
which users may review job output
...

System Interfaces
The Grid Server should support a variety of
interfaces for its various user and administrative
functions.
At minimum, a browser-based interface.
Other interfaces that might be provided include:
a command-line interface (for scripting support)
an API (for invoking grid functionality from other programs)
an XML interface
Practical Uses of Desktop Grids
Data Mining
Engineering Design: CAD/CAM and rendering
Financial Modeling: Portfolio management and risk management
Geophysical Modeling: Climate prediction and seismic computations
Graphic Design: Animation and three-dimensional rendering
Life Sciences: Disease simulation and target identification
Material Sciences: Physical property prediction and product optimization
Supply Chain Management: Process optimization and total cost minimization
Conclusions
Started just a few years ago as noble-minded projects for combining spare compute capacity of individual PCs
Can aggregate the unused cycles of an organization's existing PC resources into a powerful, virtual computing engine.
Supplement (or replace) existing HPC resources at a fraction of the cost
Not all computing problems are well suited
