
Accessing Cloud Computing to Support Water Resources Modeling

Scott D. Christensen, sdc50@byu.net
Nathan R. Swain, nathan.swain@byu.net
E. James Nelson, jimn@byu.edu
Norman L. Jones, njones@byu.edu

A Background

Advances in water resources modeling are providing us with better information, but they require more computational power to run. Cloud computing enables universal access to cost-effective computing, yet a significant technical barrier to accessing these resources remains. Here we present a set of Python tools, TethysCluster and CondorPy, that have been developed to lower the barrier to modeling in the cloud by providing:

(1) programmatic access to dynamically scalable computing resources
(2) a batch scheduling system to queue and dispatch the jobs to the computing resources
(3) data management for job inputs and outputs
(4) the ability to dynamically create, submit, and monitor computing jobs

While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision support systems deployed as web apps.

B CondorPy and TethysCluster

CondorPy

HTCondor is a software system that enables High Throughput Computing (HTC) by managing computing resources and scheduling computing jobs. It enables diverse computing systems to be linked together into a unified computing pool. CondorPy serves as a cross-platform, high-level interface for HTCondor, and allows jobs to be created, submitted, and monitored from a Python scripting environment. This interface facilitates the use of HTCondor in a web environment like Tethys Platform (see panel D).

TethysCluster

Large modeling tasks often require a large amount of computing resources. Commercial cloud providers such as Amazon Web Services (AWS) and Microsoft Azure provide on-demand, scalable resources; however, configuring them with HTCondor can prove challenging. StarCluster is a Python module that automatically provisions and configures Linux computing resources with AWS. TethysCluster is an adaptation of StarCluster that expands its functionality to work with both Linux and Windows resources with AWS as well as Azure.

C Applications

Stochastic Analysis

Uncertainty is inherent to hydrologic modeling, and is often accounted for by performing a stochastic analysis, which requires running hundreds or thousands of model simulations. For spatially distributed, physics-based models such as GSSHA, running thousands of models may take months or even years. TethysCluster and CondorPy enable this type of analysis to be done much faster using cloud computing.

Figure: Probabilistic flood map resulting from 5000 model runs using the spatially distributed, physics-based hydrologic model GSSHA.

Hierarchical Modeling

Running high fidelity models over large domains often requires powerful computers and lots of time. One way to alleviate this problem is to partially parallelize the computation by decomposing the domain into smaller models. This results in a series of hierarchical models whose execution must be coordinated. CondorPy facilitates running this type of workflow with HTCondor in a parallel computing environment.

Figure: Top: large watershed shown divided into hierarchical sub-basins. Bottom: diagram showing the parallelization and hierarchy of the models.

Ensemble Forecast Processing

TethysCluster and CondorPy are used by the Streamflow Prediction Tool (a Tethys web app) to automatically process a 52-member ensemble forecast produced by the European Centre for Medium-Range Weather Forecasts. A scheduled Python script uses CondorPy to create 52 jobs that process each ensemble forecast every 12 hours, when a new forecast becomes available. TethysCluster can be used to automatically provision and de-provision cloud computing resources.

D Tethys Platform Integration

Tethys Platform

Tethys Platform is a water resources web development platform that lowers the barrier to creating web apps. Tethys Platform provides open source web GIS and visualization tools, all integrated into a unified Python SDK.

Job Manager

CondorPy has been integrated into the Tethys Platform Python SDK in the form of a job manager that enables developers to define computing jobs and submit them to HTCondor pools to offload large computing tasks.

Cluster Management

Cloud computing resources are easy to provision through the admin site of Tethys Portal, the web interface of Tethys Platform. TethysCluster works behind the scenes to automatically configure the cloud resources into an HTCondor computing pool.

E Summary

Two Python modules have been developed to lower the technical barrier to accessing cloud computing for performing large modeling tasks. TethysCluster automates the process of provisioning diverse cloud resources and configuring them with HTCondor. CondorPy interfaces with HTCondor to enable computing jobs to be created, submitted, and monitored programmatically. CondorPy and TethysCluster have been integrated into Tethys Platform, enabling web apps to easily perform large computing tasks.

Figure: Screenshot of a Tethys web app, the Streamflow Prediction Tool, which uses CondorPy and TethysCluster to process ensemble forecasts.
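
To make the ensemble processing step more concrete, the following is a minimal sketch of the kind of HTCondor submit description that could queue 52 jobs at once. It uses only the Python standard library and standard HTCondor submit commands; it is not CondorPy's API. The executable name process_forecast.py, the logs directory, and the choice of one job per ensemble member are assumptions made for illustration.

```python
def build_submit_description(executable, num_members, log_dir="logs"):
    """Build HTCondor submit-description text that queues one job per member."""
    if num_members < 1:
        raise ValueError("num_members must be at least 1")
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # $(Process) is expanded by HTCondor to 0, 1, ..., num_members - 1,
        # so each queued job receives a distinct ensemble-member index.
        "arguments = $(Process)",
        f"output = {log_dir}/member_$(Process).out",
        f"error = {log_dir}/member_$(Process).err",
        f"log = {log_dir}/ensemble.log",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"queue {num_members}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical executable name; 52 matches the ensemble size in the text.
submit_text = build_submit_description("process_forecast.py", 52)
print(submit_text)
```

Saved to a file, text like this could be passed to condor_submit by hand. A Python interface such as CondorPy spares the developer from writing and submitting these descriptions manually.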

CondorPy

ci-water.github.io/condorpy
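
The coordination requirement described under Hierarchical Modeling can be sketched as a dependency graph: a downstream sub-basin model can start only after its upstream models finish. The example below is an illustration using Python's standard graphlib module (Python 3.9+) and HTCondor DAGMan's JOB and PARENT ... CHILD syntax; it is not CondorPy's own workflow API. The sub-basin names and the .sub file naming are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical watershed: each key is a sub-basin model and each value is the
# set of upstream models whose results it needs before it can run.
upstream = {
    "headwater_a": set(),
    "headwater_b": set(),
    "headwater_c": set(),
    "middle_ab": {"headwater_a", "headwater_b"},
    "outlet": {"middle_ab", "headwater_c"},
}


def parallel_stages(dependencies):
    """Group models into stages; models within a stage can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


def dagman_text(dependencies):
    """Render the dependency graph as an HTCondor DAGMan input file."""
    lines = [f"JOB {name} {name}.sub" for name in sorted(dependencies)]
    for child in sorted(dependencies):
        parents = sorted(dependencies[child])
        if parents:
            lines.append(f"PARENT {' '.join(parents)} CHILD {child}")
    return "\n".join(lines) + "\n"


print(parallel_stages(upstream))
print(dagman_text(upstream))
```

In this sketch the three headwater models form the first stage and run in parallel, while the outlet model runs last, so the speedup is bounded by the depth of the hierarchy rather than by the number of sub-basins.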

TethysCluster

ci-water.github.io/TethysCluster
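
The idea of automatically provisioning and de-provisioning cloud resources can be illustrated with a simple sizing rule. The sketch below is not TethysCluster's API or its actual policy; the function names, the slots-per-worker figure, and the worker limits are all hypothetical. It only shows how a pool size might be derived from the number of jobs waiting and running.

```python
import math


def target_worker_count(idle_jobs, running_jobs, slots_per_worker,
                        min_workers=0, max_workers=20):
    """Return how many workers are needed to give every job its own slot."""
    if slots_per_worker < 1:
        raise ValueError("slots_per_worker must be at least 1")
    demand = idle_jobs + running_jobs
    needed = math.ceil(demand / slots_per_worker)
    # Clamp to the allowed pool size (hypothetical cost limit).
    return max(min_workers, min(max_workers, needed))


def scaling_action(current_workers, target_workers):
    """Translate a target pool size into a provisioning decision."""
    delta = target_workers - current_workers
    if delta > 0:
        return ("provision", delta)
    if delta < 0:
        return ("deprovision", -delta)
    return ("hold", 0)


# Example: 52 ensemble jobs arrive while 2 eight-slot workers are running.
target = target_worker_count(idle_jobs=52, running_jobs=0, slots_per_worker=8)
print(target, scaling_action(2, target))
```

With a rule of this kind, a pool can grow when a new forecast arrives and shrink back once the queue is empty, which keeps cloud costs tied to actual demand.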

This material is based upon work supported by the National Science Foundation under Grant No. 1135483.
