
Thomas J. Owens
E-mail: owens@sr
Department of Geological Sciences
University of South Carolina
Columbia, SC 29208
Phone: +1-803-777-4530
Fax: +1-803-777-0906

Over the Network and to the Grid ...
The folks at the Southern California Earthquake Center are rising rapidly to the top of the Electronic Seismologist's Favorite People List. First, because they are doing interesting work in keeping seismology on the leading edge of the IT world. Second (and far more important in their ESFPL ranking), they are more than willing to share their experiences through contributions to the ES. This issue the subject is "Grid Computing." Although not the most overhyped IT term at the moment (that would be "Web Services"), grid computing is a phrase that gets tossed around in a lot of settings where the ES suspects those involved may not have a firm grasp of what it really means.

As luck would have it, the ES knows a few things about grid computing, having pioneered the method in the early 1980's. Over Thanksgiving break. In today's vernacular, the ES ran his own Virtual Organization back then when traveling from Utah to Lawrence Livermore National Laboratory to access data and computer resources (and to bicycle ... a lot). In addition to being a key financial supporter of the ES's Ph.D. research, LLNL is conveniently located near the Fall AGU venue, and Thanksgiving is conveniently placed a couple of weeks before AGU. And all those lab folks actually take Thanksgiving off! Recognizing this unique opportunity to start a revolution in information technology (and finish his AGU talk), the ES spent a couple of Thanksgivings at LLNL running dozens of receiver function inversions (12 hours each). Grid security was no problem in the early days, once you got past the people with uniforms and guns. Resource discovery and monitoring: wander the halls and computer rooms looking for free systems! Data transfer: 9-track tapes. Job submission: walk around to as many machines as you could find and run a script. After that, it got a little boring. Fortunately, these were also the early days of microwave dinners, so the ES could settle in for one or more pseudoturkey dinners while his grid worked. Now, there have been a few minor innovations since this original seismological foray into grid computing. My friend Al invented the Internet, which quickly replaced the rolly chair that the ES used to get from machine to machine after those turkey dinners, and then the SCEC folks came along and made the whole thing purr. The ES apologizes if any other contributors may have been overlooked ...

GRID COMPUTING IN THE SCEC COMMUNITY MODELING ENVIRONMENT
Philip Maechling 1, Vipin Gupta 1, Nitin Gupta 1, Edward H. Field 2, David Okaya 1, and Thomas H. Jordan 1

1. Southern California Earthquake Center
University of Southern California
Los Angeles, CA 90089-0742
Telephone: +1-213-740-5843
Fax: +1-213-740-0011
E-mail: maechlin@usc.edu, vgupta@usc.edu, niting@usc.edu, okaya@usc.edu, tjordan@usc.edu

2. U.S. Geological Survey
525 South Wilson Avenue
Pasadena, CA 91106-0001
Telephone: +1-626-583-7814
E-mail: field@usgs.gov
URL: http://www.scec.org/cme/

Dynamic Communities Sharing Computer Resources
In our work on the Southern California Earthquake Center Community Modeling Environment (SCEC/CME) Project (Jordan et al., 2003), we are developing computer systems to support dynamic distributed scientific collaborations. Scientists participating in SCEC collaborations are often willing to share their computer resources, particularly if in return they can gain access to computing capabilities that they do not currently possess. Interorganizational computer sharing can be difficult to achieve due to the many organizational and technical differences between groups. Recently, however, a new software technology called grid computing (Foster et al., 2001) has emerged which is designed to help dynamic organizations, such as SCEC, share heterogeneous collections of computer and storage resources.

Grid Computing and Virtual Organizations
Grid technology enables organizations to share computer resources with other organizations even when the shared computers are administered differently and have dissimilar hardware and operating systems. Organizations can create a grid environment to provide their users with computer resources such as CPU cycles, disk storage, and software programs that are avail-

Seismological Research Letters Volume 76, Number 5 September/October 2005 581


TABLE 1
Fundamental Grid Computing Capabilities

Grid Computing Capability    Description of Functionality
Grid Security                Identify computer users and computers in the grid and define what each user is permitted to do
Data Transfer                Transfer data from one computer to another
Job Submission               Run a computer program on a local or remote computer
Resource Discovery           Determine what computers and what data storage devices are in the grid and the status of these resources
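
The four capabilities in Table 1 are usually chained into one sequence: establish an identity, check the remote machine, move files in, run the job, and move results back. The following is a minimal Python sketch of that sequence, assuming the Globus Toolkit command-line tools named in this article's later tables (grid-proxy-init, grid-info-search, globus-url-copy, globus-job-run). The host name, the /tmp paths, and the file names are hypothetical, and the function only builds the command lines; it does not run them.

```python
def remote_run_plan(host, program, inputs, outputs, hours=2):
    """Return the ordered grid command lines for one remote run.

    Nothing is executed; each entry is an argument list that could be
    handed to subprocess.run() on a machine with Globus installed.
    """
    remote = f"gsiftp://{host}/tmp"  # hypothetical staging directory
    plan = [
        ["grid-proxy-init", "-hours", str(hours)],  # 1. grid security
        ["grid-info-search", "-h", host, "-x"],     # 2. monitoring and discovery
    ]
    # 3. data transfer: stage the program and its input files to the remote host
    for name in [program, *inputs]:
        plan.append(["globus-url-copy", f"file:///tmp/{name}", f"{remote}/{name}"])
    # 4. job submission: run the staged program on the remote host
    plan.append(["globus-job-run", host, f"/tmp/{program}"])
    # 5. data transfer: copy the output files back for local analysis
    for name in outputs:
        plan.append(["globus-url-copy", f"{remote}/{name}", f"file:///tmp/{name}"])
    return plan


if __name__ == "__main__":
    for command in remote_run_plan("earth1.usc.edu", "myprog", ["in.dat"], ["out.dat"]):
        print(" ".join(command))
```

Keeping the plan as data rather than executing it directly makes the order of the steps easy to inspect, which is the point of the sketch.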

able outside of their local computer administrative domains. This is done by creating a new computer administrative domain referred to as a virtual organization (VO). A VO has its own set of administrative policies that represents a combination of local computer policies, the computer policies of the groups you are sharing with, plus some administrative policies required by the VO itself. When we run a program on the "grid", we are saying, in a sense, that our program is running outside of our own local administrative domain. Grid middleware is used to facilitate the execution of computer programs in a VO.

In addition to creating multiorganizational administrative domains, grid middleware also strives to hide the heterogeneity of the shared computing environment. Grid software provides a set of commands to perform basic computing operations, and these commands are the same regardless of the underlying computers and operating systems.

Basic Grid Computing Capabilities
Grid computing is built upon four basic capabilities. These capabilities are security, data transfer, job submission, and discovery and monitoring of computing resources. Grid computing is based on the premise that these four capabilities are the basic building blocks required to share computer resources in a meaningful way. Table 1 briefly describes each of these fundamental capabilities.

Before we describe these grid functions in detail, let's look at how these capabilities can be combined to share computer resources. Assume a user wants to run a program on a remote, grid-enabled, computer. First he runs a grid security program to establish his identity on the grid. Then he issues a grid monitoring command to confirm that the remote computer isn't too busy. Then he runs a grid data transfer command to move his program, and input files, from his local computer to the remote computer. Now he issues a grid job submission command to run the program on the remote computer. Finally, he uses a grid data transfer command to copy the resulting output files back to his local computer for further analysis.

In our following discussion, we describe grid commands that are provided by our grid software. While these commands are a useful starting place, and they help explain basic grid capabilities, we should point out that users typically don't interact with the grid using these basic grid commands because the commands are quite cumbersome. The real grid-computing payoff comes when grid commands are called programmatically from application programs. Then complex systems can be built on top of grid services. When this is done, grid-based computer sharing occurs transparently to the user. The grid-based OpenSHA Hazard Map application described in a companion article (Field et al., 2005) is an example of how grid computing can be so well integrated into an application program that the use of the grid becomes transparent to the user.

Relationship between Grid Computing and Distributed Computing
Grid computing is a type of distributed computing. In an earlier ES article (Maechling et al., 2005) we discussed a variety of distributed computing techniques, including Java servlets, Java RMI, CORBA, and Web services. We are sometimes asked how those distributed computing technologies are related to grid computing. To answer this, we start by characterizing those other technologies as distributed component technologies. Software developers utilize distributed components to execute programs on other people's computers. Distributed components do not provide general-purpose computing capabilities, however. Organizations offering distributed components are offering fixed solutions. As long as you want to use the distributed component exactly as defined by the organization that deployed it, then the system works. If you have your own version of a component, however, you cannot immediately begin to run it on someone else's computer. You must negotiate the deployment of your version of the component. Grid computing, in contrast, offers a general-purpose computing environment on other people's computers. Once the grid VO is established, you can run your own component on someone else's computer. So the grid provides a general-purpose distributed computing environment. This is a more powerful capability than running only existing distributed components.

Addressing Grid Hype
Within the computer science world, particularly within the high-performance computing community, there has been a lot of interest in grid computing over the last few years. For example, there is now a collection of supercomputers in the U.S. called the TeraGrid (http://www.teragrid.org/) that is configured to support grid computing. The level of interest and activity in grid computing have, in some cases, risen to the level of grid hype. Grid hype can hurt organizations in a couple of ways. For one, it leads to unrealistic expectations that grid computing



cannot meet. For another, it can attract organizations into grid computing that are not equipped to handle the additional system administrative burden required to establish and maintain a grid.

One of the more common unrealistic grid-computing expectations is that you can plug a grid-enabled computer into a network port and immediately gain free, or low-cost, computing cycles. While easy sharing of computer resources through grid technology may eventually make this possible, grid technology has not yet reached this level of ease of use.

Another expectation is that grid computing is a replacement for parallel computing technologies such as computational clusters. Grid software works with computational clusters. For example, it can provide easier access to clusters by providing standardized job submission, data transfer, and monitoring commands. But grid computing does not replace cluster computing. Besides technical issues such as the very high-speed network connections used by clusters, an important distinction is that computational clusters are typically homogeneous collections of computers, while grids are typically heterogeneous collections.

There are additional grid issues. Grid software and administration are currently quite complex, and there are very few experienced grid administrators. Our SCEC grid requires a significant amount of system administrator time. If your collaboration will benefit from sharing computers, grid software may work for you. Costs will be associated with rolling out a grid system, however, including training, system administration time, support and maintenance of the grid, and user learning time.

Establishing the SCEC Grid
On the SCEC/CME Project, we built the SCEC grid using Globus Toolkit (http://www.globus.org/), which is the grid software standard in the scientific and academic research worlds. Globus Toolkit is an open-source software distribution available for download from the Globus Web site. Globus Toolkit is a collection of software programs that, once installed and configured on your computer, provides basic grid functionality.

We installed Globus Toolkit on several of our SCEC computers and configured our grid software so that we could access computer resources at collaborating institutions. For example, the SCEC grid computers are configured to share computer and storage resources with the USC High Performance Computing and Communications (HPCC; http://www.usc.edu/hpcc/) group, with collaborating groups at USC's Information Sciences Institute (ISI), and with the TeraGrid network of supercomputers.

As we mentioned previously, the SCEC/CME OpenSHA working group has implemented a Probabilistic Seismic Hazard Analysis (PSHA) Hazard Map program that uses grid software. By using grid software, this PSHA program can be run on a large shared collection of USC workstations called a Condor Pool. The OpenSHA software that performs these hazard map calculations is written primarily in Java. The PSHA calculations performed by this software consist of a series of hazard-curve calculations. A hazard map with dimensions of 100 km × 100 km and a grid spacing of 1 km requires 10,000 hazard-curve calculations. These hazard-curve calculations are independent and can be performed in any order. Each hazard-curve calculation outputs at least one file.

A second seismological application that we have grid-enabled is an anelastic wave model (AWM) earthquake wave-propagation simulation program written by Kim Olsen called AWM-Olsen (Olsen et al., 1997). AWM-Olsen is used by SCEC researchers for a wide variety of geophysical research. For example, it was used to run the TeraShake (Minster et al., 2004) simulations in Fall 2004. AWM-Olsen is a 4th-order finite-difference Fortran90 program that utilizes the Message Passing Interface (MPI) in order to run on computational clusters. We use grid software on SCEC computers to submit this program to both the USC and TeraGrid computational clusters and to transfer the large input and output files between the computational clusters and local SCEC disk storage. As we describe the basic capabilities of grid computing, we'll comment on how grid computing can support these two very different seismological applications.

Grid Security
Globus designers recognized that every grid operation had to be highly secure. If Globus isn't secure, people won't use it to share. Basic Globus grid security is a two-step process: (1) verify a user's identity, and (2) verify that the identified user has permission to use the grid resources he is trying to use. Step 2 is dependent on Step 1.

In a Globus grid, every user is issued a trustworthy identification called a grid certificate. When a user tries to run a grid command, his grid certificate is sent along with the command so the target computer system can determine who has issued the command. Every time a user tries to run a grid command, he must prove his identity with a grid certificate. In security terms, proving one's identity is called authentication. By using robust grid certificates, Globus ensures that it can reliably identify all users in the grid. Computers are also issued grid certificates so that Globus can reliably identify all computers in the grid.

The other half of Globus grid security is called authorization. Once a user proves his identity using a trustworthy grid certificate, Globus checks to see if that person has permission to run the command he has issued. So when a user issues a grid command, Globus first checks his identification, then checks whether that person has permission to do what he is asking to do. It is entirely possible for a grid system to accept a user's identification but to deny that user's grid command due to lack of authorization.

Globus uses Public Key Infrastructure (PKI)-based grid certificates. Grid certificates are issued and signed by a Certificate Authority (CA). A CA is typically the computer system administration department of an organization. As with personal identifications, certain CA's are stricter, or more demanding, than others. A strict CA won't issue a grid certificate without substantial verification that you are who you say you are. It is often the case that strict CA's are more widely accepted than less strict CA's. Personal ID's often work the same way. For example, the



TABLE 2
Globus Grid Certificate Initialization Command Example

Globus Command   Globus Security Infrastructure (GSI)
Example          % grid-proxy-init -hours 2
                 Your identity: /C=US/O=USC/OU=SCEC/CN=Philip Maechling/UID=philipm
                 Enter GRID pass phrase for this identity:
                 Creating proxy ......................................
                 Done
                 Your proxy is valid until: Mon Mar 7 17:37:16 2005
Description      User enters a pass phrase to verify identity. Once the pass phrase is accepted, grid commands
                 issued from this account will use that identity for the next two hours.
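
The two-step security check (authentication first, then authorization) can be illustrated with a small Python sketch. This is not how Globus implements security: real grid certificates are verified cryptographically, whereas the sketch only looks up names in tables. The CA name, the local account, and the permitted commands are all hypothetical; the certificate subject is the one shown in Table 2.

```python
# Hypothetical policy tables for one grid-enabled computer.
TRUSTED_CAS = {"USC CA"}  # certificate authorities this site accepts

# Certificate subject -> local account the grid user is mapped to.
LOCAL_ACCOUNT = {"/C=US/O=USC/OU=SCEC/CN=Philip Maechling": "scecuser"}

# Local account -> grid commands that account may run.
ALLOWED = {"scecuser": {"globus-url-copy", "globus-job-run"}}


def check_request(subject, issuer, command):
    """Return (decision, detail) for one grid command."""
    # Step 1: authentication. A real grid checks the CA's signature on the
    # certificate; this sketch only checks the issuer against a trusted list.
    if issuer not in TRUSTED_CAS:
        return ("denied", "untrusted certificate authority")
    # Step 2: authorization. It is only reached if step 1 succeeded.
    account = LOCAL_ACCOUNT.get(subject)
    if account is None:
        return ("denied", "identity accepted, but no local account mapping")
    if command not in ALLOWED[account]:
        return ("denied", f"{account} may not run {command}")
    return ("allowed", account)


if __name__ == "__main__":
    user = "/C=US/O=USC/OU=SCEC/CN=Philip Maechling"
    print(check_request(user, "USC CA", "globus-job-run"))
    print(check_request(user, "Fitness Center CA", "globus-job-run"))
```

The second denial case is the one the text singles out: the identity is accepted, yet the command is refused for lack of authorization.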

local fitness center may issue you an ID without asking many questions. Not many other organizations will trust that ID, however. The federal government is significantly more demanding. When you apply for a passport, they take your picture and your fingerprints and they check up on you before they issue you a passport. Once you have the passport, it is widely trusted throughout the world.

This leads us to an important practical issue that organizations face as they begin to use grid computing. To start using Globus software, your organization must decide the following security issues: Which certificate authority (CA) will issue the grid certificates that your users, and computers, will use? And which certificate authorities will you trust?

For a test environment, an organization can act as its own CA and issue its own grid certificates. If an organization wishes to interoperate with external grids, however, it will need to find a CA that all participating organizations trust. In the Internet world, an organization called ICANN (http://www.icann.org/) is charged with coordinating the names and numbers used on the Internet. There is no equivalent centralized grid Certificate Authority, so organizations usually implement their own CA's and coordinate the use of their grid certificates with other organizations. Access to, or operation of, a Certificate Authority is one of the administrative overheads associated with grid computing.

In the case of our grid-based PSHA program, we want to utilize computers in the USC campus grid. SCEC facilities are on the USC campus, so our users, and computers, are issued grid certificates signed by the USC Certificate Authority. USC accepts its own grid certificates, so SCEC grid computers can interoperate with computers on the USC grid using USC CA-signed grid certificates.

For the AWM-Olsen program, we want to submit jobs to the TeraGrid. In order to interoperate with the TeraGrid, USC spent a substantial amount of time working with TeraGrid security groups to agree upon appropriate computer security policies and procedures. After significant review, and some policy updates, USC and the TeraGrid agreed to accept each other's grid certificates. Now, when SCEC users issue grid commands to be executed on TeraGrid computers, the SCEC users prove their identity using USC CA-signed grid certificates.

Organizations are understandably cautious about which Certificate Authorities they will trust and therefore which grid certificates they will accept. In our experience, CA issues (authentication issues) are the most time-consuming administrative aspects of setting up a grid.

Table 2 shows an example of a commonly used Globus security command.

Common Grid Security Issues
Before we leave the topic of grid security, we'd like to mention three specific security issues that organizations are often concerned about: Can an organization limit the use of its grid to only approved individuals? Can an organization prevent a trusted user's grid-based program from damaging its computer, or data? Can grid users transfer data across the grid without exposing the data in clear text?

The first issue is addressed by Globus with the two-part authentication and authorization-based grid security system described earlier. To use an organization's system, a user must present a trusted identification, a grid certificate. Once the user is reliably identified, the grid software will then verify that the user has permission to issue grid commands on the specified system. Properly configured, grid software does enable organizations to limit use of their computers to approved individuals only.

The second issue is important once trusted users are allowed to run programs, or perform other grid operations, on an organization's computers. Can an organization protect its shared, grid-enabled systems from accidental, or malicious, activities of trusted users? Typically, this is handled by mapping external users to local computer accounts. When an external user issues a grid command, the grid software maps the external user to a local user account. Then the external user has all the permissions of the local user, but no more. For example, the local account may have a disk quota, and so the external grid user will be limited to the quota of the local account to which he is mapped. By mapping remote grid users to local accounts, the disk allocation and file access permissions of external grid users can be controlled and remote grid access can be reasonably safe.

Some grid tools, such as the Condor system that we discuss later, provide a "sandbox"-based approach for running grid programs. In a "sandbox"-based system, external programs run in



TABLE 3
Globus Data Transfer Command Example

Globus Command   Globus Data Transfer (GridFTP)
Example          % globus-url-copy gsiftp://earth2.usc.edu/tmp/testfile2.txt file:///tmp/testfile1.txt
Description      This command will copy a file from the computer earth2 to a local file called
                 /tmp/testfile1.txt.
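
Both arguments in Table 3 are URLs: the scheme says whether a location is on a remote GridFTP server or on the local disk, and the rest gives the host and path. A short Python sketch using the standard library's URL parser makes the parts explicit. The helper names are hypothetical, and the command line is only built, not executed.

```python
from urllib.parse import urlparse


def describe(url):
    """Split a transfer URL into (kind, host, path)."""
    parts = urlparse(url)
    # gsiftp:// marks a remote GridFTP location; anything else is treated
    # as local in this sketch (Table 3 uses file:// for the local side).
    kind = "remote" if parts.scheme == "gsiftp" else "local"
    return kind, parts.netloc, parts.path


def copy_command(source, destination):
    """Build (not run) a globus-url-copy command line: source, then destination."""
    return ["globus-url-copy", source, destination]


if __name__ == "__main__":
    src = "gsiftp://earth2.usc.edu/tmp/testfile2.txt"
    dst = "file:///tmp/testfile1.txt"
    print(describe(src))
    print(describe(dst))
    print(" ".join(copy_command(src, dst)))
```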

a secure, well controlled region of the computer and are prevented from accessing anything outside this "sandbox." This technique is helpful if the grid users don't have local accounts on all of the grid-enabled computers.

Globus addresses the third concern, transmission security, by providing the capability to encrypt data during transmission using Secure Sockets Layer (SSL) software that is bundled with Globus. Even sensitive data sets such as passwords and financial data can be transferred securely using Globus grid tools.

Grid Data Transfers
Once grid software is installed, and the grid security issues are worked out, grid commands, such as data transfers, can be issued. Globus data transfers use a program called GridFTP. GridFTP has been optimized for high-performance transfers with capabilities such as parallel transfers and partial file transfers that are not commonly found in other versions of FTP. When transferring files using GridFTP, the source and destination files are specified using Uniform Resource Locators (URL's) like the URL's that locate Web pages. This means that files transferred with GridFTP must be placed in locations that are externally visible as URL's. Table 3 shows an example of a Globus URL Copy command that copies a file from a remote computer to a local file.

The data transfer requirements for our two seismological applications are fairly similar. In order to run either of these programs on a remote computer, we copy the executable to the remote computer, copy the input parameter files (if any), start the program on the remote computer, and, when the calculations are done, copy the resulting output files back to our local computer.

Grid Job Management
Next, let us consider Globus job management. In the Globus world, job management refers to two main capabilities: job submission and job monitoring. Job submission refers to the process of starting a program on a computer. Job monitoring refers to determining what happened after the program started. We will focus on job submission here, but Globus also provides job monitoring capabilities.

Those of us who primarily run programs on personal computers or workstations do not commonly work with job submission programs. For the most part, we just double-click the program icon, or type the program name and hit the "ENTER" key, and the program starts to run. For UNIX users, the most common job submission programs are the "&" (ampersand) operator that runs the program in the background, and the "at" and "cron" commands that schedule programs to run at specific times.

When you are submitting your program to run on a collection of computers (e.g., a pool of computers), however, or if you are submitting your program to run on a computing cluster, job submission is more complex. On these systems, programs are submitted to a job submission manager, often using a job submission script. Submitted programs are placed in a job queue by the job submission manager and a program runs when it reaches the front of the queue. Job queues are managed by a job scheduler program that uses some type of scheduling algorithm. From the system operator's perspective, it is important to keep the system as busy as possible as long as programs are in the queue. From the user's perspective, it is important to minimize the wait time before the job runs.

There are a variety of job submission managers, and each one has its own job submission language. Job submission systems used in the SCEC grid include Condor (http://www.cs.wisc.edu/condor/) and the Portable Batch System (PBS; http://www.openpbs.org/), each of which has its own scripting language.

Globus implements yet another job submission scripting language called Resource Specification Language (RSL). RSL is a scripting language that can be used to submit a job to run on a Globus grid. Globus takes an RSL command and translates it into the appropriate underlying job-submission language. Because Globus can translate RSL into a variety of job-submission languages, RSL can be used as a universal job-submission language. Table 4 shows an example of a Globus job-submission command that submits a job for execution on a remote host.

Our two example grid-based seismological application programs have significantly different job-submission requirements and illustrate how Globus helps support a heterogeneous computing environment.

The characteristics of our PSHA hazard map program make it an ideal candidate to run on a collection of independent computers because we run the same program repeatedly and because there are no dependencies between the runs. The USC HPCC group has configured a collection of more than 100 campus workstations as a "pool" of computers that is available for general computing when they are not busy. This collection of computers is called a Condor Pool. Programs can be run on computers in the Condor Pool by using a job-submission program called Condor. The Condor job manager monitors all the computers in the Condor Pool and runs the job at the front



TABLE 4
Globus Job Submission Command Example

Globus Command   Globus Resource Allocation Management (GRAM)
Example          % globus-job-run earth1.usc.edu -s myprog
Description      This command will submit the program called "myprog" to execute on the computer earth1.usc.edu
                 and will copy the executable to the target machine if necessary.
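
The idea of a single job description that is translated into whatever language the local job-submission manager expects can be sketched in Python. This is an illustration only and not the translation Globus itself performs. The job dictionary, the queue name, and the mapping of "count" to Condor instances or PBS nodes are assumptions made for the sketch; the output syntax follows the general shape of RSL relations, Condor submit description files, and PBS script directives.

```python
def to_rsl(job):
    """Render a job as an RSL-style string of (attribute=value) relations."""
    rsl = f"&(executable={job['executable']})(count={job['count']})"
    if job.get("queue"):
        rsl += f"(queue={job['queue']})"
    return rsl


def to_condor(job):
    """Render the same job as a Condor-style submit description."""
    return "\n".join([
        "universe = vanilla",
        f"executable = {job['executable']}",
        f"queue {job['count']}",  # assumption: count = number of independent runs
    ])


def to_pbs(job):
    """Render the same job as a PBS-style batch script."""
    lines = ["#!/bin/sh"]
    if job.get("queue"):
        lines.append(f"#PBS -q {job['queue']}")
    lines.append(f"#PBS -l nodes={job['count']}")  # assumption: count = nodes requested
    lines.append(job["executable"])
    return "\n".join(lines)


if __name__ == "__main__":
    job = {"executable": "/tmp/myprog", "count": 4, "queue": "quick"}  # hypothetical job
    print(to_rsl(job))
    print(to_condor(job))
    print(to_pbs(job))
```

The user writes only the neutral description; which back-end renderer is applied depends on the computer the job is sent to.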

TABLE 5
Globus Monitoring and Discovery Command Example

Globus Command   Globus Monitoring and Discovery Services (MDS)
Example          % grid-info-search -h earth1.usc.edu -x
                 dn: Mds-Host-hn=earth1.usc.edu,Mds-Vo-name=local,o=grid
                 Mds-Cpu-model: Intel(R) Xeon(TM) CPU I
                 Mds-Cpu-speedMHz: 1394
                 Mds-Os-name: Linux
                 Mds-Memory-Ram-Total-sizeMB: 4800
                 Mds-Cpu-Total-Free-15minX100: 385
                 Mds-Device-name: /usr/local
                 Mds-Fs-sizeMB: 9844
Description      This Globus Monitoring and Discovery command returns detailed information about computers in the
                 grid, including operating system, type of CPU's in the system, amount of RAM, free time of the CPU's,
                 and file system information.
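
Output like that in Table 5 is a list of "attribute: value" lines, so a program can check a remote computer against minimum requirements before submitting a job. The following Python sketch assumes the attribute names as spelled in Table 5; the sample text is a subset of that table, and the requirement thresholds are hypothetical.

```python
# A subset of the Table 5 output, used as sample input.
SAMPLE = """\
dn: Mds-Host-hn=earth1.usc.edu,Mds-Vo-name=local,o=grid
Mds-Cpu-speedMHz: 1394
Mds-Os-name: Linux
Mds-Memory-Ram-Total-sizeMB: 4800
Mds-Fs-sizeMB: 9844
"""


def parse_mds(text):
    """Turn 'attribute: value' lines into a dictionary of strings."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            record[key.strip()] = value.strip()
    return record


def meets_requirements(record, os_name, min_ram_mb):
    """Check operating system and total RAM against a program's minimum needs."""
    ram_mb = int(record.get("Mds-Memory-Ram-Total-sizeMB", 0))
    return record.get("Mds-Os-name") == os_name and ram_mb >= min_ram_mb


if __name__ == "__main__":
    host = parse_mds(SAMPLE)
    print(meets_requirements(host, "Linux", 2048))
    print(meets_requirements(host, "Linux", 8192))
```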

of the queue on the next available computer. USC HPCC has installed a version of Condor, called Condor-G, which works with Globus.

To submit our PSHA program to the USC Condor Pool, we create a Globus RSL script and submit the RSL script to the Globus job manager. Globus then converts our RSL script to a Condor script and submits the Condor script to the Condor job-submission manager. The Condor job-submission manager places our PSHA program in the Condor Pool queue, and the Condor job scheduler runs the program on the next available computers.

Running the AWM-Olsen program requires a significantly different type of job submission script. The AWM-Olsen program runs on computational clusters. Clusters often use job-submission managers such as the Portable Batch System (PBS) to handle user job submissions. It's worth noting how the queuing approach for clusters reverses the queuing approach for a Condor Pool. Condor establishes a single queue over a large collection of computers. PBS establishes many queues, and each queue refers to portions of one large computer. For example, the job queues on the USC HPC Linux Cluster vary by the number of processors that the job will run on, and by the interconnection (e.g., Ethernet, Myrinet) between nodes accessed by the queue.

To run AWM-Olsen on the USC HPC Cluster, or on the TeraGrid, we create an RSL submission script and submit it to the Globus job-submission manager. Globus then translates the RSL script into the appropriate underlying PBS commands and submits the PBS commands to the cluster's own job-submission manager for execution.

Grid Monitoring and Discovery
Before running a job on a remote computer, it is important to verify that the remote computer meets the minimum system requirements for your program. Globus provides a Monitoring and Discovery Service (MDS) to make this possible. MDS allows users to determine information about computers in the grid such as the type of CPU's, the operating system, the amount of computer memory, how busy the system is, and file system information such as size and free space. Table 5 shows an example of an MDS command and the type of information that is returned.

The Globus MDS system separates system-monitoring capabilities from the query and reporting capabilities for performance and convenience reasons. In the background, Globus continuously monitors the grid and places status information into a cached schema. When a user queries for system status, status information is retrieved from the cached schema. By using this caching system, users can query a single system, and Globus can respond quickly with information about all the systems in the organization's grid.

Grid Issues and Risk Reduction Strategies
No grid computing discussion is complete without comments on limitations with current grid systems. One significant issue regarding grid software is that it is changing rapidly. Globus, in particular, has been changing versions quite frequently. Due to this rapid rate of change, Globus installations around the country have a variety of Globus versions deployed, which leads to compatibility issues. On the SCEC/CME Project, as our baseline, we use the version of Globus that is distributed in



the current National Science Foundation Middleware Initiative (NMI,
http://www.nsf-middleware.org/) software distribution. When NMI
releases a software distribution that contains a new version of
Globus, we upgrade our systems with the new release. NMI releases tend
to be less frequent than new versions of Globus. We believe that the
NMI releases are well tested and that they are widely installed.
Interoperability is high. The NMI version of Globus lags behind the
latest Globus release, however, so you don't have immediate access to
new features.

Another issue to consider is that the grid software shakeout is just
beginning. The Globus Toolkit is the de facto grid software standard
within scientific communities, and therefore we believe it is a good
choice for SCEC. Other grid software tools are available, however,
including versions from commercial vendors such as Sun, Avaki, Data
Synapse, and others. It is not clear what the grid software standard
will be five years from now. The key to selecting grid software is to
select standards-based grid software. The leading grid software
standards bodies are the Global Grid Forum (http://www.ggf.org/) and
the World Wide Web Consortium (http://www.w3.org/). Standards-based
grid software from one vendor should interoperate with standards-based
grid software from another vendor. We recommend working with grid
software that is based on GGF and W3C software standards.

Discussion
We believe that SCEC and other research groups will benefit by sharing
computer resources. Since grid computing provides at least a partial
solution to the problem of sharing computer resources, it addresses a
real need in the scientific community, and it is likely to persist in
some form. We believe that grid software will eventually be integrated
into operating system and network software installed on most
computers.

As the technical issues related to grid computing are resolved, and as
organizations begin to recognize the benefits of sharing computers,
the challenges associated with using grids will shift from technical
issues to organizational issues. Research organizations will need to
develop new processes for defining priority of access, fair use, and
compensation for use of shared computer resources. For those of us
building collaboratories, the long-term significance of grid computing
is likely to be less technical and more social because it challenges
us to define, in very unambiguous terms, what sharing means within our
collaborations.

References
Field, E. H., N. Gupta, V. Gupta, M. Blanpied, P. Maechling, and T. H.
    Jordan (2005). Hazard calculations for the WGCEP-2002 forecast
    using OpenSHA and distributed object technologies, Seismological
    Research Letters 76, 161-167.
Field, E. H., V. Gupta, N. Gupta, P. Maechling, and T. H. Jordan
    (2005). Hazard map calculations using grid computing,
    Seismological Research Letters (in review).
Field, E. H., T. H. Jordan, and C. A. Cornell (2003). OpenSHA: A
    developing community-modeling environment for seismic hazard
    analysis, Seismological Research Letters 74, 406-419.
Foster, I., C. Kesselman, and S. Tuecke (2001). The anatomy of the
    grid: Enabling scalable virtual organizations, International
    Journal of Supercomputer Applications 15.
Jordan, T. H., P. J. Maechling, and the SCEC/CME Collaboration (2003).
    The SCEC Community Modeling Environment: An information
    infrastructure for system-level science, Seismological Research
    Letters 74, 324-328.
Maechling, P., Vipin Gupta, Nitin Gupta, Edward H. Field, David Okaya,
    and Thomas H. Jordan (2005). Seismic hazard analysis using
    distributed computing in the SCEC Community Modeling Environment,
    Seismological Research Letters 76, 177-181.
Minster, J. B., K. Olsen, R. Moore, S. Day, P. Maechling, T. Jordan,
    M. Faerman, Y. Cui, G. Ely, Y. Hu, B. Shkoller, C. Marcinkovich,
    J. Bielak, D. Okaya, R. Archuleta, N. Wilkins-Diehr, S. Cutchin,
    A. Chourasia, G. Kremenek, A. Jagatheesan, L. Brieger,
    A. Majumdar, G. Chukkapalli, Q. Xin, R. Moore, B. Banister,
    D. Thorp, P. Kovatch, L. Diegel, T. Sherwin, C. Jordan,
    M. Thiebaux, and J. Lopez (2004). The SCEC TeraShake earthquake
    simulation, Eos, Transactions of the American Geophysical Union
    85, Fall Meeting Supplement, abstract SF31B-05.
Olsen, K., R. Madariaga, and R. Archuleta (1997). Three dimensional
    dynamic simulation of the 1992 Landers earthquake, Science 278,
    834-838.
WGCEP (Working Group on California Earthquake Probabilities) (1988).
    Probabilities of large earthquakes occurring in California on the
    San Andreas Fault, USGS Open-File Report 88-393.

Seismological Research Letters Volume 76, Number 5 September/October 2005 587
