In this class
Types of Grids
Desktop Grids
Motivations
Applications
Limitations
Boinc!
Clusters
Types of Grids
Grid computing vendors have adopted
various nomenclatures to explain and define
the different types of grids.
One classification is based on the structure of the organization
(virtual or otherwise) that is served by the grid
Departmental Grids
Solve problems for a particular group of
people within an enterprise:
Cluster Grids (Sun Microsystems):
One or more systems working together to provide a
single point of access to users.
Used by a team for a single project
Support both high throughput and high performance
jobs.
Enterprise Grids
Consist of resources spread across an
enterprise. Provide service to all users within
that enterprise:
Enterprise Grids (Platform Computing):
deployed within large corporations that have a global
presence or a need to access resources outside a
single corporate location.
Campus Grids (Sun Microsystems):
Enable multiple projects or departments to share
computing resources in a cooperative way
Extraprise Grids
Established between companies, their
partners, and their customers. The grid
resources are generally made available
through a virtual private network (VPN):
Extra Grids (IBM):
Global Grids
Grids established over the public Internet.
They can be established by organizations to
facilitate their business or purchased in part,
or in whole, from service providers:
Global Grids (Sun):
Compute Grids
Provide access to computational resources:
Desktop Grids:
Others
Data Grids:
optimized for data-oriented operations.
Server Grids:
Some corporations, while adopting grid computing,
keep it limited to server resources.
Special servers are bought solely for the purpose of
creating an internal utility grid with resources made
available to various departments.
No desktops are included in server grids.
High-Performance/Cluster Grids:
High-end systems, such as supercomputers or HPC
clusters
Utility Grids:
commercial compute resources that are
maintained and managed by a service provider
Customers that have the need to augment their
existing, internal computational resources may
purchase cycles from a utility grid.
Utility grids may also offer applications that can be purchased by
the minute.
Overview
Desktop Grids
Historical context
What is (and what isn't) a Desktop Grid?
Deployment challenges and value proposition
development for Desktop Grid technology
Key areas to assess when evaluating the
suitability of a Desktop Grid
The role of Desktop Grids in an Enterprise
computing infrastructure standards on
Desktop Grids
Examples
Motivation
CPU Availability
Cause Computing
Searching for extra-terrestrials:
SETI@home: http://setiathome.ssl.berkeley.edu/
Features
Long computations
Short communication packets
User-initiated tasks have preference
Minimally intrusive on the user and the user's
Internet connection
Primitive version of today's Desktop Grids
Limitations
Lack of Resource Management
Lack of Security
Machine Heterogeneity
Resource Availability
The cause-computing paradigm relies on the idea of
voluntary participation.
The PC may be turned off for the night, the
screensaver may be changed, the control
program may be disabled (either deliberately or
inadvertently), etc.
This adds another layer of unpredictability to the
performance expectations that can be associated
with such a grid.
Workflow applications
Described by a DAG
Example: some image processing applications
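A workflow described by a DAG can only run a task once all of its predecessors have finished. The sketch below orders tasks with Kahn's algorithm. The task numbering, the edge-list representation, and the names `topo_order` and `MAX_TASKS` are assumptions made for illustration and do not come from any particular grid product.

```c
#include <assert.h>

#define MAX_TASKS 16  /* hypothetical upper bound for this sketch */

/* Kahn's algorithm. Each edge {a, b} means "task a must finish before task b".
   Writes a valid execution order into order[] and returns the number of tasks
   ordered; the result is smaller than n if the graph contains a cycle. */
int topo_order(int n, const int (*edges)[2], int m, int *order) {
    int indegree[MAX_TASKS] = {0};
    int done[MAX_TASKS] = {0};
    int count = 0;

    assert(n >= 0 && n <= MAX_TASKS);
    for (int e = 0; e < m; e++) {
        assert(edges[e][0] >= 0 && edges[e][0] < n);
        assert(edges[e][1] >= 0 && edges[e][1] < n);
        indegree[edges[e][1]]++;
    }

    /* Repeatedly pick the lowest-numbered task with no unfinished predecessor. */
    for (;;) {
        int next = -1;
        for (int t = 0; t < n; t++) {
            if (!done[t] && indegree[t] == 0) { next = t; break; }
        }
        if (next < 0) break;  /* nothing is runnable: finished, or a cycle */
        done[next] = 1;
        order[count++] = next;
        for (int e = 0; e < m; e++) {
            if (edges[e][0] == next) indegree[edges[e][1]]--;
        }
    }
    return count;
}
```

For a four-stage image pipeline with edges {0,1}, {0,2}, {1,3}, {2,3}, tasks 1 and 2 depend only on task 0, so a Grid Server could hand them to two different clients at the same time before scheduling task 3.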
Examples
67 TFlops/sec, 500,000 workers, $700,000
17.5 TFlops/sec, 80,000 workers
186 TFlops/sec, 195,000 workers
Desktop Grid
A defined (named) collection of machines on
a shared network.
It may include dedicated machines, intermittently
connected machines, and shared machines.
Any single machine is part of one, and only one,
Desktop Grid.
The machines on the grid are unaware of
each other except as informed by the central
server.
client-server architecture (no peer-to-peer)
Components
Grid Server
This is a central machine that controls and
administers the Desktop Grid.
Grid Client
An individual node that is a member of the
Desktop Grid from which spare computational
resources will be harvested.
Work Unit
A computation assigned to a Grid Client by the
Grid Server
a grid-enabled version of an application
instructions for establishing an environment for the
application on the Grid Client
Technologies: Considerations
Security
Unobtrusiveness
Application Integration
Robustness
Scalability
Central Management
Technologies: Security
Disallow (or limit) access to network or local
resources by the distributed application.
Encrypt application and data to preserve
confidentiality and integrity.
Ensure that the Grid Client environment (disk
contents, memory utilization, registry contents, and
other settings) remains unchanged after running
the distributed application.
Prevent local user from interfering with the
execution of the distributed application.
Prevent local user from tampering with or deleting
data associated with the distributed application.
Technologies: Integration
Ability to simulate a standalone environment
within the Grid Client.
Integrated security and encryption of
sensitive data.
Easy integration (tools, examples, and
wizards are provided).
Support for any application...
Binary-level integration (no recompilation,
relinking, or source code access...).
Technologies: Scalability
Automatic addition, configuration, and
registration of new Grid Clients.
Technologies: Unobtrusiveness
Centrally manage unobtrusiveness levels that are
changeable based on time-of-day or other factors.
Ensure that the Grid Client Executive relinquishes
client resources automatically.
Ensure invisibility to local user.
Prevent distributed application from displaying
dialogs or action requests.
Prevent performance degradation (and total system
failure) due to execution of the distributed
application.
Require very little (ideally, zero) interaction with the
day-to-day user of the Grid Client.
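One way to picture centrally managed unobtrusiveness levels is as a small table of time windows, each with a CPU cap, that the server pushes to every client. The sketch below is an assumption-laden illustration: the `Level` structure, the percentage cap, and the rule that any local user activity drops the grid application to zero are all hypothetical and are not taken from a specific product.

```c
#include <assert.h>

/* Hypothetical unobtrusiveness level: a time-of-day window and a CPU cap. */
typedef struct {
    int start_hour;       /* first hour (0-23) in which the level applies */
    int end_hour;         /* hour at which it stops applying (exclusive) */
    int max_cpu_percent;  /* CPU share the grid application may use */
} Level;

/* A window whose start is later than its end wraps past midnight. */
static int in_window(const Level *lv, int hour) {
    if (lv->start_hour <= lv->end_hour)
        return hour >= lv->start_hour && hour < lv->end_hour;
    return hour >= lv->start_hour || hour < lv->end_hour;
}

/* Returns the CPU share the grid application may use right now.
   Local user activity always wins, so the client relinquishes the CPU. */
int allowed_cpu_percent(const Level *levels, int n, int hour, int user_active) {
    assert(hour >= 0 && hour < 24);
    if (user_active) return 0;
    for (int i = 0; i < n; i++) {
        if (in_window(&levels[i], hour)) return levels[i].max_cpu_percent;
    }
    return 0;  /* no level covers this hour: stay idle */
}
```

With the levels {8, 18, 10} and {18, 8, 100}, an idle PC would give the grid 10% of its CPU during office hours and all of it overnight.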
Technologies: Robustness
Allocate work to appropriately configured Grid
Clients.
Automatically reallocate work units when Grid
Clients are removed from grid either permanently or
temporarily.
Automatically reallocate work units due to other
resource or network failures.
Prevent aberrant applications from completely
consuming Grid Client resources (disk, memory,
CPU, etc.).
Provide transparent support for many operating systems in the Grid
Client population.
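The reallocation requirement can be sketched as two steps on the Grid Server: put the unfinished work of a lost client back in the queue, then hand queued work to the clients that remain. The `WorkUnit` structure, the `NO_CLIENT` marker, and the round-robin choice below are assumptions for illustration only; a real server would also match work to appropriately configured clients.

```c
#include <assert.h>

#define NO_CLIENT (-1)

/* Hypothetical server-side record for one work unit. */
typedef struct {
    int id;
    int client;    /* index of the assigned Grid Client, or NO_CLIENT */
    int finished;  /* nonzero once a result has been returned */
} WorkUnit;

/* Put every unfinished unit of a lost client back in the queue.
   Returns the number of units released. */
int release_client_work(WorkUnit *units, int n, int lost_client) {
    int released = 0;
    for (int i = 0; i < n; i++) {
        if (units[i].client == lost_client && !units[i].finished) {
            units[i].client = NO_CLIENT;
            released++;
        }
    }
    return released;
}

/* Hand each queued unit to one of the available clients, round-robin.
   Returns the number of units assigned. */
int assign_pending(WorkUnit *units, int n, const int *available, int n_clients) {
    int assigned = 0;
    if (n_clients <= 0) return 0;
    for (int i = 0; i < n; i++) {
        if (units[i].client == NO_CLIENT && !units[i].finished) {
            units[i].client = available[assigned % n_clients];
            assigned++;
        }
    }
    return assigned;
}
```

Finished units keep their original client, so results that were already returned are not recomputed when that client later leaves the grid.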
An Example: BOINC
Berkeley Open Infrastructure for Network
Computing (BOINC)
http://boinc.berkeley.edu/
Features:
Flexible application framework
Existing applications in common languages (C, C++,
Fortran) can run as BOINC applications with little or
no modification.
New versions of applications can be deployed with no
participant involvement.
Security
BOINC protects against several types of attacks:
Digital signatures based on public-key cryptography
protect against the distribution of viruses.
Support for large data
BOINC supports applications that produce or consume
large amounts of data, or that use large amounts of
memory.
BOINC
BOINC Credits
The credit system is designed to prevent cheating
by validating results before granting credit.
This helps ensure that users return accurate
results.
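One common way to validate results before granting credit is to send the same work unit to several clients and accept a value only when enough of the returned results agree. The sketch below illustrates that idea under stated assumptions: results are single numbers, agreement means "within a tolerance", and the names `find_canonical` and `grant_credit` are hypothetical. It is not BOINC's actual validator.

```c
#include <assert.h>

static double abs_diff(double a, double b) {
    return a > b ? a - b : b - a;
}

/* Returns the index of the first result that at least `quorum` results
   (itself included) agree with to within `tol`, or -1 if there is none. */
int find_canonical(const double *results, int n, int quorum, double tol) {
    for (int i = 0; i < n; i++) {
        int agree = 0;
        for (int j = 0; j < n; j++) {
            if (abs_diff(results[i], results[j]) <= tol) agree++;
        }
        if (agree >= quorum) return i;
    }
    return -1;
}

/* Sets valid[i] to 1 for every result that matches the canonical one and
   returns how many results earn credit (0 if no quorum was reached). */
int grant_credit(const double *results, int n, int quorum, double tol, int *valid) {
    int canonical = find_canonical(results, n, quorum, tol);
    int granted = 0;
    for (int i = 0; i < n; i++) {
        valid[i] = canonical >= 0 &&
                   abs_diff(results[i], results[canonical]) <= tol;
        granted += valid[i];
    }
    return granted;
}
```

With a quorum of 2, a client that returns a fabricated value is outvoted by two honest clients and receives no credit, which removes the incentive to return results without computing them.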
// Uses boinc_init_diagnostics(), parse_command_line(),
// boinc_fopen(), etc...
void app_graphics_init() {}
...
fprintf(stderr, "Hello, stderr!\n");
...
f = boinc_fopen(resolved_name, "a");
...
/* This is the "worker" loop */
{ int j, num, N;
    N = 123456789;
    ...
}
fprintf(f, "Computation completed.\n");
fclose(f);
fprintf(stderr, "goodbye!\n");
boinc_finish(0);
}
Work unit template (<workunit>) fields: min_quorum, target_nresults, max_error_results, max_total_results, max_success_results, rsc_fpops_est, rsc_fpops_bound, delay_bound, rsc_mem_bound, rsc_disk_bound
Result template (<result>): a <file_ref> containing <file_name><OUTFILE_0/></file_name> and <open_name>out.txt</open_name>
Data Parallel:
process large input datasets in a sequential
fashion with no application dependencies
between or among the records of the dataset.
Parameter Sweep:
use an iterative approach to generate a
multidimensional series of input values used to
evaluate a particular set of output functions.
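A parameter sweep can be turned into independent work units by numbering every point of the multidimensional grid and decoding a work unit's number back into its input values. The sketch below assumes evenly spaced values along each dimension; the `Axis` structure and the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical description of one swept dimension. */
typedef struct {
    double start;  /* first value */
    double step;   /* increment between consecutive values */
    int count;     /* number of values along this dimension */
} Axis;

/* Total number of input points, i.e. the number of work units. */
long sweep_size(const Axis *axes, int dims) {
    long total = 1;
    for (int d = 0; d < dims; d++) total *= axes[d].count;
    return total;
}

/* Fills point[] with the input values for sweep index k.
   The last dimension varies fastest, like nested for loops. */
void sweep_point(const Axis *axes, int dims, long k, double *point) {
    assert(k >= 0 && k < sweep_size(axes, dims));
    for (int d = dims - 1; d >= 0; d--) {
        int i = (int)(k % axes[d].count);
        point[d] = axes[d].start + i * axes[d].step;
        k /= axes[d].count;
    }
}
```

Because each index decodes independently, the server only needs to send a work unit its index (or a range of indices) rather than the input values themselves.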
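[Figure: inputs I1, I2, ..., In are each processed by a copy of the application (App*) to produce outputs O1, O2, ..., On. Column labels: Input, Application, Output.]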
Probabilistic:
process a very large number of trials using
randomized inputs to generate input values used
to evaluate a particular set of output functions.
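A probabilistic application splits naturally into work units that each run a batch of random trials from their own seed, with the server summing the counts afterwards. The sketch below estimates pi this way. The seeding scheme, the small linear congruential generator, and the function names are assumptions chosen to keep the example reproducible; a production run would want a statistically stronger generator with properly independent streams.

```c
#include <assert.h>
#include <stdint.h>

/* Small 64-bit linear congruential generator, so that a work unit is fully
   reproducible from its seed. Returns a value in [0, 1). */
static double next_uniform(uint64_t *state) {
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) / 9007199254740992.0;  /* divide by 2^53 */
}

/* One work unit: runs `trials` random trials and returns how many points
   fell inside the unit quarter circle. */
long run_trials(uint64_t seed, long trials) {
    uint64_t state = seed;
    long hits = 0;
    for (long t = 0; t < trials; t++) {
        double x = next_uniform(&state);
        double y = next_uniform(&state);
        if (x * x + y * y <= 1.0) hits++;
    }
    return hits;
}

/* Server side: combine the counts returned by all work units. */
double estimate_pi(long total_hits, long total_trials) {
    return 4.0 * (double)total_hits / (double)total_trials;
}
```

Each work unit needs only a seed and a trial count as input and returns a single integer, so the communication cost stays small compared with the computation.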
(4 × WorkUnitDuration) / (InputSize + OutputSize)
Data Caching
automatically administered:
the Grid Server examines its queue of work and ensures that
any data needed for a work unit will be available at the Client
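The caching step can be sketched as a comparison between what the queued work needs and what a client already holds. In the sketch below, datasets are identified by integer ids and the function name `datasets_to_send` is hypothetical; how a real Grid Server tracks client caches is not described in the slides.

```c
#include <assert.h>

/* Returns nonzero if id appears among the first n entries of ids[]. */
static int contains(const int *ids, int n, int id) {
    for (int i = 0; i < n; i++) {
        if (ids[i] == id) return 1;
    }
    return 0;
}

/* Collects into missing[] the distinct dataset ids that queued work units
   need and the client has not cached yet. Returns how many were collected,
   never more than max_missing. */
int datasets_to_send(const int *needed, int n_needed,
                     const int *cached, int n_cached,
                     int *missing, int max_missing) {
    int count = 0;
    assert(max_missing >= 0);
    for (int i = 0; i < n_needed; i++) {
        int id = needed[i];
        if (contains(cached, n_cached, id)) continue;  /* already at the client */
        if (contains(missing, count, id)) continue;    /* already scheduled */
        if (count == max_missing) break;               /* output buffer is full */
        missing[count++] = id;
    }
    return count;
}
```

A dataset shared by many work units is transferred once, which is the main saving that caching offers.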
Security
...
...
Data Mining.
Engineering Design, CAD/CAM and rendering
Financial Modeling: Portfolio management and risk
management
Geophysical Modeling: Climate prediction and seismic
computations
Graphic Design: Animation and three-dimensional
rendering
Life Sciences: Disease simulation and target
identification
Material Sciences: Physical property prediction and
product optimization
Supply Chain Management: Process optimization and
total cost minimization
Conclusions
Desktop Grids started just a few years ago as noble-minded
projects for combining the spare compute
capacity of individual PCs
Can aggregate the unused cycles of an
organization's existing PC resources into a
powerful, virtual computing engine.
Supplement (or replace) existing HPC
resources at a fraction of the cost
Not all computing problems are well suited to Desktop Grids