
The Impact of Policy on Uptake and Usage of National and International Computational Grids

S. J. Zasada, S. Manos, P. V. Coveney
Centre for Computational Science, Department of Chemistry, University College London, Christopher Ingold Laboratories, 20 Gordon Street, London, WC1H 0AJ

Contents

Case studies: Motivating applications
  TeraGyroid
  SPICE
  GENIUS
General requirements of these projects
Impact on the policies of grid resource providers
Looking to the future

LB3D/TeraGyroid Project

The TeraGyroid project coupled cutting-edge grid technologies, high-performance computing, visualisation and computational steering capabilities to produce a major leap forward in soft condensed matter simulation.

J. Chin and P. V. Coveney, Proc. R. Soc. London A, 462, 3575-3600 (2006).
G. Giupponi, J. Harting, P. V. Coveney, Europhysics Letters, 73, 533-539 (2006).

Lattice sizes up to 1024^3

Won the award for "Most Innovative Data-Intensive Application" in the HPC Challenge competition at SC'03.

Computational Biomedicine: Simulated Pore Computing Environment (SPICE)


Interactive HPC ANALYTICS CHALLENGE WINNER @ SC05

Translocation of biomolecules through protein pores.
Size, complexity and timescale make the computations expensive: millions of CPU hours using simple MD. Need to do better...
Novel algorithm:
Steered Molecular Dynamics to pull DNA through the pore.
Jarzynski's equality to compute the equilibrium free energy profile from nonequilibrium pulling.
Reduces computational cost by a factor of approximately 100.
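As an illustration of the free energy step, Jarzynski's equality relates the equilibrium free energy difference to the work W measured over repeated nonequilibrium pulls: ΔF = -kT ln <exp(-W/kT)>. The sketch below is a minimal estimator under stated assumptions; the function name, the work values and the unit choice (kT = 1) are hypothetical and are not taken from the SPICE code.

```python
import math


def jarzynski_free_energy(work_values, kT):
    """Estimate a free energy difference from nonequilibrium work samples.

    Implements dF = -kT * ln( mean( exp(-W / kT) ) ), using a log-sum-exp
    shift so that large work values do not underflow the exponential.
    """
    if not work_values:
        raise ValueError("at least one work sample is required")
    scaled = [-w / kT for w in work_values]
    shift = max(scaled)
    log_mean = shift + math.log(
        sum(math.exp(s - shift) for s in scaled) / len(scaled)
    )
    return -kT * log_mean


# Hypothetical work values (in units of kT) from three pulling runs.
works = [1.0, 2.0, 3.0]
estimate = jarzynski_free_energy(works, kT=1.0)
mean_work = sum(works) / len(works)
# The estimate never exceeds the mean work; the gap is the dissipated work.
print(estimate, mean_work)
```

The exponential average is dominated by low-work trajectories, so in practice the number of pulls needed for a converged estimate depends on how far from equilibrium the pulling is.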

SPICE: Grid Infrastructure

RealityGrid Steering Infrastructure (http://www.realitygrid.org)


Underlying grid middleware and complexity hidden from end-user
RealityGrid Steering API
Application uses client-side API

Federated Grid: UK-NGS & US-TeraGrid
High-end systems provide real-time interactivity
Advanced networks provide schedulable capacity and high QoS
Significant performance using optically switched lightpaths: UKLight/GLIF

Grid Enabled Neurosurgical Imaging Using Simulation


The GENIUS project aims to model large-scale, patient-specific cerebral blood flow in clinically relevant time frames.
Objectives:
To study cerebral blood flow using patient-specific image-based models.
To provide insights into cerebral blood flow and its anomalies.
To develop tools and policies by means of which users can better exploit the ability to reserve and co-reserve HPC resources.
To develop interfaces which permit users to easily deploy and monitor simulations across multiple computational resources.
To visualize and steer the results of distributed simulations in real time.

The clinical work flow


Book computing resources in advance or have a system by which simulations can be run urgently.
Shift imaging data around quickly over high-bandwidth, low-latency dedicated links.
Interactive simulations and real-time visualisation for immediate feedback.
15-20 minute turnaround

Modelling blood flow using HemeLB


HemeLB: an efficient fluid solver for modelling blood flow in the brain
Uses the lattice-Boltzmann method
Efficient for sparse geometries, such as a vascular tree
Machine-topology-aware graph-growing partitioning technique, to help hide cross-site latencies
Optimized inter- and intra-machine communications
Full checkpoint capabilities
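To make the idea of graph-growing partitioning on a sparse geometry concrete, the sketch below assigns fluid sites to processors by breadth-first growth, so each part is a compact, connected region where possible. It is an illustrative toy, not HemeLB's partitioner: the function name and the 6-neighbour stencil are assumptions, and the machine-topology awareness mentioned above is omitted.

```python
from collections import deque

# Assumed stencil: the six face neighbours of a lattice site.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def graph_growing_partition(sites, n_parts):
    """Map each fluid site (x, y, z) to a part index in range(n_parts).

    Each part grows breadth-first from a seed until it holds its share of
    sites. Sites left on the frontier seed the next part, so consecutive
    parts are adjacent. Only fluid sites are stored, which suits sparse
    geometries such as a vascular tree.
    """
    remaining = set(sites)
    target = -(-len(remaining) // n_parts)  # ceiling division
    owner = {}
    frontier = deque()
    for part in range(n_parts):
        size = 0
        while remaining and size < target:
            site = None
            while frontier:
                candidate = frontier.popleft()
                if candidate in remaining:
                    site = candidate
                    break
            if site is None:
                # Empty frontier: start, or restart in a disconnected region.
                site = min(remaining)
            remaining.remove(site)
            owner[site] = part
            size += 1
            x, y, z = site
            for dx, dy, dz in NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    frontier.append(neighbour)
    return owner


# A hypothetical straight "vessel" of 10 sites split across 2 processors.
vessel = [(x, 0, 0) for x in range(10)]
owner = graph_growing_partition(vessel, n_parts=2)
print(sorted(owner.items()))
```

A topology-aware variant would additionally choose which parts land on the same machine, so that most neighbouring parts communicate within a machine rather than across sites.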

Haemodynamic simulation and visualisation


First step is the conversion of patient-specific MRA or 3DRA data (DICOM format) to a 3D model. The vasculature is of high contrast, at 300-400 µm resolution and 500^3 - 700^3 voxels.
3DRA (3-dimensional rotational angiography): vasculature is obtained using digital subtraction imaging with a high-contrast X-ray-absorbing fluid.

Beyond the batch job


Typical computing scenario involves jobs submitted into a queue
Submit -> Run -> Post-process

This won't work in a clinical scenario: correctness and timeliness are both important in clinical computing, and late results are useless.

Advance reservations
GENIUS Toolkit
Emergency computing
Grid middleware: the Application Hosting Environment
Blood flow modelling, computational steering and real-time in-situ visualisation
Distributed computing
Lightpaths
THE REQUIREMENT: To incorporate these methodologies into a clinician's day-to-day activities, rather than just providing such facilities on an ad hoc basis.

Cross-site Runs with MPI-g


GENIUS has been designed to run across multiple machines using MPI-g.
Some problems won't fit on a single machine, and require the RAM/processors of multiple machines on the grid.
MPI-g allows jobs to be turned around faster by using small numbers of processors on several machines, which is essential for clinicians.
HemeLB performs well on cross-site runs, and makes use of overlapping communication in MPI-g.

HemeLB/MPI-g Requires Co-Allocation


We can reserve multiple resources for specified time periods.
Co-allocation is useful for meta-computing jobs like HemeLB, for visualisation and for workflow applications.
We use HARC - Highly Available Robust Coscheduler (developed by Jon Maclaren at LSU).
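The core question in co-allocation is when every required resource is free at the same time. The sketch below finds the earliest common start time from per-resource lists of free windows. It is a simplified illustration only: it does not use or model the HARC API, and the resource names, window values and function name are hypothetical.

```python
def earliest_common_slot(free_windows, duration):
    """Return the earliest start at which all resources are free for `duration`.

    free_windows maps a resource name to a list of (start, end) free windows,
    with times in minutes from an arbitrary origin. Returns None if no common
    slot exists. An earliest feasible start always coincides with the start of
    some free window, so only those times need to be checked.
    """
    candidates = sorted(
        start for windows in free_windows.values() for start, _ in windows
    )
    for t in candidates:
        if all(
            any(start <= t and t + duration <= end for start, end in windows)
            for windows in free_windows.values()
        ):
            return t
    return None


# Hypothetical free windows on two machines; a 30-minute cross-site run is needed.
windows = {
    "machine_a": [(0, 30), (60, 120)],
    "machine_b": [(20, 50), (70, 130)],
}
print(earliest_common_slot(windows, duration=30))  # 70
```

Finding the slot is only half the problem: the reservations on all resources must then be made atomically, so that either every booking succeeds or none does.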

Slide courtesy Jon Maclaren

HARC
HARC provides a secure co-allocation service
Multiple Acceptors are used
Works well provided a majority of Acceptors stay alive
Paxos Commit keeps everything in sync
Gives the (distributed) service high availability
Deployment of 7 acceptors -> Mean Time To Failure ~ years
Transport-level security using X.509 certificates
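The majority requirement can be quantified with a simple model. Assuming, purely for illustration, that each acceptor is independently available with the same probability p, the chance that a majority is alive is a binomial tail sum. This is not how the Mean Time To Failure figure above was derived; the function name and the value p = 0.99 are hypothetical.

```python
from math import comb


def majority_available(n_acceptors, p_up):
    """Probability that a strict majority of acceptors is up.

    Assumes independent failures and identical per-acceptor availability
    p_up, which real deployments only approximate.
    """
    majority = n_acceptors // 2 + 1
    return sum(
        comb(n_acceptors, k) * p_up**k * (1 - p_up) ** (n_acceptors - k)
        for k in range(majority, n_acceptors + 1)
    )


# Compare a single acceptor with a 7-acceptor deployment.
for n in (1, 3, 7):
    print(n, majority_available(n, p_up=0.99))
```

With 7 acceptors the service survives any 3 simultaneous failures, which is why adding acceptors raises availability well above that of any single host.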

HARC is a good platform on which to build portals/other services


XML over HTTPS: simpler than SOAP services
Easy to interoperate with
Very easy to use with the Java Client API

SPRUCE
Special PRiority and Urgent Computing Environment

Applications with dynamic data and result deadlines are being deployed
Late results are useless
Wildfire path prediction
Storm/Flood prediction
Patient-specific medical treatment

Some jobs need priority access
Right-of-Way Token
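The effect of a right-of-way token on a queue can be sketched with a priority queue in which token-holding jobs overtake ordinary ones. This is a toy model and not SPRUCE itself: the class, its methods and the two-level priority scheme are assumptions made for illustration.

```python
import heapq
import itertools


class UrgentQueue:
    """Job queue in which jobs holding a token run before all others.

    Within each priority level, jobs keep their submission order.
    """

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker preserving FIFO order

    def submit(self, job, has_token=False):
        priority = 0 if has_token else 1
        heapq.heappush(self._heap, (priority, next(self._order), job))

    def next_job(self):
        """Pop the next job to run, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = UrgentQueue()
queue.submit("batch-1")
queue.submit("batch-2")
queue.submit("patient-scan", has_token=True)  # hypothetical urgent job
print([queue.next_job() for _ in range(3)])
# ['patient-scan', 'batch-1', 'batch-2']
```

A real urgent-computing policy also has to decide what happens to jobs that are already running, for example whether they are preempted or allowed to finish.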

Real Time Visualisation and Steering


A way to let HemeLB know the parameters to be steered -> we use the RealityGrid steering system to steer the input data on the fly. One aim is to do all this for distributed (cross-site) simulations
For medical applications, need may be urgent

Application Hosting Environment


Need to utilize resources from globally distributed grids
Administratively distinct
Running different middleware stacks

Wrestling with middleware can't be a limiting step for scientists
Need tools to hide complexity of underlying grids

General requirements of these projects


Ability to co-reserve resources
Launch emergency simulations
Consistent interfaces for federated access
Access to back end nodes: steering, visualisation
Lightpath network connections
Cross-site simulations (MPIg)
Support for software (ReG steering toolkit etc.)

Impact on resource provider policies


TeraGrid, NGS & HPCx starting to support advanced reservation with HARC
DEISA are evaluating HARC deployment on their systems
Some TeraGrid sites support emergency jobs with SPRUCE
Lightpath connection in place between Manchester and Oxford NGS nodes
MPIg and RealityGrid steering deployed on NGS and TeraGrid resources

Virtual Physiological Human


Funded under EU FP7
15 projects: 1 NoE, 3 IPs, 9 STREPs, 2 CAs
"a methodological and technological framework that, once established, will enable collaborative investigation of the human body as a single complex system ... It is a way to share observations, to derive predictive hypotheses from them, and to integrate them into a constantly improving understanding of human physiology/pathology, by regarding it as a single system."

VPH requires clinical (grid) computing?


Computational experiments integrated seamlessly into current clinical practice
Clinical decisions influenced by patient-specific computations: turnaround time for data acquisition, simulation, post-processing, visualisation, final results and reporting.

Fitting the computational time scale to the clinical time scale:


Capture the clinical workflow
Get results which will influence clinical decisions: 1 day? 1 week?
This project: 15 to 30 minutes

Development of procedures and software in consultation with clinicians
Security/Access is a concern
On-demand availability of storage, networking and computational resources

Conclusions
The projects presented have all put pressure on resource providers to offer new services and new ways of working.
For interactive work, the batch processing model does not work.
If HPC is to be exploited by clinicians, it needs to be used in a way that fits in with the clinical workflow.
VPH initiative: likely to increase pressure for nonstandard services from resource providers.

Acknowledgements
Rob Haines, Robin Pinning, John Brooke, Stephen Pickles, Mark McKeown, NGS staff, TeraGrid staff, LONI staff, JANET/David Salmon, Simon Clifford, Frank Smith, Nick Ovenden, Brian Toonen, Nicholas Karonis, David Hawkes, Jon Maclaren, Shantenu Jha, Daniel Katz, Shawn Brown, Ken Yoshimoto, Doru Marcusiu
