Science Faculty HPC Facility: Projects

Prospective users of the Science Faculty HPC Facility are asked to provide a short overview of the work that they propose to carry out upon the facility, so that their domain science's techniques and methodologies become visible to the VUW research community as a whole.

Provisioning Projects

These first three projects were selected from amongst the research groups that drove the acquisition of the facility, to run during its provisioning phase, as they were felt to represent the three major gaps in research computing provision at VUW that the facility is hoped to fill.

SBS: Biodiversity: HPC helps reveal the diversity of an ant genome

School of Biological Sciences PhD candidate Monica Gruber will be using the VUW Science Faculty's new High Performance Computing Facility to analyse complex genomic data for a number of studies of the invasive yellow crazy ant. The yellow crazy ant is a pest in many Indo-Pacific island nations, Australia and South-east Asia.

The theme of Monica's studies relates to whether there is a genetic basis for the ant's invasion success. Her genomic studies include comparing the genome with those of other ants, bees and wasps, to identify genes associated with population growth and behaviour.

A more practical aim of this research is to discover species-specific pathogens or parasites that may be applied to biological control of this ant.

The typical data set size is around 10GB, with 7 data sets requiring multiple analyses. In-core memory requirements during pilot analyses have already exceeded 170GB. Searching gene databases to detect pathogens and gene homologues is potentially parallelisable across distributed computing resources.

ARC: Multi-millennial Ice Sheet Modeling at the Continental-scale

Antarctic Research Centre researcher Nicholas Golledge will be using VUW Science Faculty's new High Performance Computing Facility to produce high spatial resolution simulations of the entire Antarctic ice sheet.

Nick's research focuses on running numerical ice sheet models to simulate the three-dimensional structure and dynamic character of past and present ice sheets and glaciers. To capture the physical processes of the ice-sheet system most accurately, these models must be run at high spatial resolution, which for the Antarctic continent means a grid scale of less than 10km.

Nick uses a model code, PISM, developed by a team at the University of Alaska Fairbanks, that is specifically written to make use of massively parallel architectures, allowing the processing to be distributed across several hundred cores and so achieve significant reductions in model runtimes.

As a researcher targeted for pre-deployment testing of the facility, Nick has already been able to produce a 7.5km resolution simulation of the Antarctic ice sheet at the height of the Last Glacial Maximum, some 20,000 years ago, and is already gaining new insights into the research as a result.

SCPS: Quantum chemical calculations shed light on electronic structure

School of Chemical and Physical Sciences researcher Matthias Lein will be using the VUW Science Faculty's new High Performance Computing Facility to compute the electronic structure of transition metal coordination compounds and nano-sized materials.

Accurate calculations of electronic interactions and structural optimizations are still formidable computational tasks, even though the general approach has been known for a long time. While smaller molecules can be computed quickly even on a desktop computer, the kinds of molecules that interest researchers need several orders of magnitude more computational power to achieve the same level of precision.

The new facility will be mainly used for the theoretical prediction of chemical structures and spectroscopic properties of the associated compounds.

A typical computation will run on 8 compute cores simultaneously and use up to 64 GB of memory along the way.

The data set produced at the end is dwarfed by the amount of intermediate data generated while the calculation is running. The intermediate data can grow to several TB, but are then reduced to a much smaller set of results.

Projects

SGEES: Rainfall-runoff modelling for the Lake Taupo catchment

School of Geography, Environment and Earth Sciences PhD student Deborah Maxwell will be using the VUW Science Faculty's new High Performance Computing Facility to improve model prediction of inflows to Lake Taupo, and consequently, Mighty River Power's management of water that can pass through the Taupo Control Gates into the Waikato Power System.

Deborah is developing, in MATLAB, a rainfall-runoff model for Lake Taupo that overlays the spatial distribution of effective precipitation, losses and storage within the whole catchment onto a routing of runoff through the various sub-catchments to the lake.

Whilst individual simulations can be run quickly, calibrating the model parameters requires Monte Carlo methods: inputs are sampled at random from their distributions and the model is run repeatedly until a statistically significant distribution of outputs is obtained. This necessitates a large number of simulations.
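
A minimal Python sketch of this kind of Monte Carlo calibration loop follows. It is not Deborah's MATLAB model: the single-store `toy_runoff_model`, its two parameters, the uniform sampling ranges and the rainfall series are all hypothetical stand-ins chosen only to show the sample, simulate and score cycle.

```python
import random

def toy_runoff_model(rainfall, loss_fraction, storage_coeff):
    """Hypothetical stand-in for a rainfall-runoff model: one linear store."""
    storage = 0.0
    inflows = []
    for rain in rainfall:
        storage += rain * (1.0 - loss_fraction)  # effective precipitation enters storage
        outflow = storage_coeff * storage        # a fixed fraction drains each time step
        storage -= outflow
        inflows.append(outflow)
    return inflows

def sum_squared_error(simulated, observed):
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def monte_carlo_calibrate(rainfall, observed, n_samples, seed=0):
    """Sample parameter sets at random, run the model for each, keep the best fit."""
    rng = random.Random(seed)
    best_params, best_error = None, float("inf")
    for _ in range(n_samples):
        # Assumed uniform sampling ranges for the two hypothetical parameters.
        params = (rng.uniform(0.0, 0.9), rng.uniform(0.05, 0.95))
        error = sum_squared_error(toy_runoff_model(rainfall, *params), observed)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

# Synthetic "observations" generated from known parameters, for illustration only.
rainfall = [5.0, 0.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0]
observed = toy_runoff_model(rainfall, 0.3, 0.4)
best_params, best_error = monte_carlo_calibrate(rainfall, observed, n_samples=5000)
```

Each pass through the loop is independent of the others, which is why a workload of this shape splits naturally into many concurrent batch runs.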

Deborah's overall processing times benefit from the ability to compile the MATLAB codes and then run multiple concurrent simulations in non-interactive batch-mode, against the MATLAB Compiler Runtime.

SMSOR: Modelling the joint survival function parametrically using copulas in R

School of Mathematics, Statistics, and Operations Research Masters student Boyd Anderson is using the Science Faculty HPC to find a parametric joint survival function for a real automotive warranty data-set. In particular, he is using a copula (a function linking marginal variables into a multivariate distribution) to model the underlying dependence structure of the data-set. The copulas being considered come from the Archimedean and Elliptical families: nine different copulas in total, each with at least 5 parameters.
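
For readers unfamiliar with copulas, the general construction can be written out as below. This is only the textbook form for the bivariate case, with the Clayton copula shown as one example of an Archimedean member; the source does not say which copulas or how many variables the project uses, so the symbols T_1, T_2, S_1, S_2, \varphi and \theta are illustrative assumptions.

```latex
% Joint survival function expressed through a (survival) copula C_theta
S(t_1, t_2) = P(T_1 > t_1,\, T_2 > t_2) = C_\theta\bigl(S_1(t_1),\, S_2(t_2)\bigr)

% Archimedean family: built from a generator function \varphi
C(u, v) = \varphi^{-1}\bigl(\varphi(u) + \varphi(v)\bigr)

% Example member (Clayton), with generator \varphi(u) = (u^{-\theta} - 1)/\theta
C_\theta(u, v) = \bigl(u^{-\theta} + v^{-\theta} - 1\bigr)^{-1/\theta}, \qquad \theta > 0
```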

The computationally expensive part of this project is finding the optimum copula parameters to describe the behaviour of the data-set. To do this, two optimisation heuristics were selected: Differential Evolution (DE) and Particle Swarm Optimisation (PSO). Both have been implemented in R, and are sufficiently parallelised to run on 24 cores. The average run time per optimisation is 2-3 days, and each model will be run multiple times to increase the confidence in the computed best fit.
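
As an illustration of the first heuristic, here is a minimal Python sketch of a common Differential Evolution variant (rand/1/bin). It is not the project's R implementation: the `sphere` objective is a hypothetical stand-in for a copula fitting criterion, and the population size, generation count, weight and crossover values are assumed for the example.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, generations=300,
                           weight=0.5, crossover=0.9, seed=0):
    """Minimise `objective` over box `bounds` using DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate always mutates
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = population[a][d] + weight * (population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clamp to the search box
                else:
                    value = population[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy selection: keep the better vector
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]

def sphere(x):
    """Hypothetical stand-in objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)

best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 5)
```

The objective evaluations dominate the cost, and in a generational variant that builds all trial vectors from the previous generation they are independent of one another, which is one place where an implementation can spread work across cores.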

SECS: Large scale evaluation of graph layout algorithms

School of Engineering and Computer Science PhD student Roman Klapaukh will be using the Science Faculty HPC Facility to run simulations of graph layout algorithms.

Specifically, we are looking at many different variants of the force-directed layout and how they affect the final layout.

Unlike in many other HPC projects, a single computation is very quick. The difficulty lies in performing enough runs of the algorithm to do all the tests we require. Running on the Science Faculty HPC allows us to perform sufficiently many simultaneous trials.

SECS: Simulation Framework for Classifying Handwritten Image Patterns

School of Engineering and Computer Science Evolutionary Computation researcher Toktam Ebadi will be using the Science Faculty HPC Facility to execute a simulation framework for classifying image patterns.

Toktam’s research focuses on developing a Feature Pattern Classification System (FPCS) that investigates the suitability of Learning Classifier Systems (LCSs) for the image domain.

Two implementations of FPCS have been developed: the original FPCS, which suits online reinforcement learning scenarios, and a supervised FPCS, which suits scenarios where ground truth data is available. Such a system is beneficial in identifying objects in digital images; however, as the number of classes increases, more rules are required and therefore more memory.

In order to overcome memory limitations, larger datasets were previously divided into separate parts, with training performed on each part; however, this resulted in unwanted behaviour in the FPCS. Using the Science Faculty HPC Facility will thus enable execution of the FPCS on problems with larger numbers of classes and examples.

SCPS: Dynamics of Bose-Einstein condensates

Department of Physics (University of Otago) PhD student Sam Rooney, currently being hosted by SCPS, will be using the Science Faculty HPC Facility to simulate the dynamics of Bose-Einstein condensates at finite temperatures.

To quantitatively account for atomic interactions and thermal fluctuations, the equation of motion takes the form of a stochastic nonlinear Schrödinger equation which must be solved numerically.

A major computational difficulty is performing enough simulations to achieve stochastic convergence, which typically requires hundreds of trajectories. Individual simulations can require anywhere from 10^3 to 10^6 modes, leading to simulation times of the order of hours to months on a single CPU. Using the HPC facility enables us to trivially parallelize our numerics by performing many trajectories simultaneously.
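
The structure of such an ensemble can be sketched in Python as below. This is not the stochastic nonlinear Schrödinger equation solver: `run_trajectory` integrates a hypothetical one-variable damped, noisy toy equation with the Euler-Maruyama method, and the step count, time step, damping and noise strength are assumed values. The point is only that each trajectory depends on nothing but its own seed.

```python
import math
import random
import statistics

def run_trajectory(seed, n_steps=1000, dt=0.01, damping=1.0, noise=0.5):
    """One Euler-Maruyama trajectory of a toy damped, noisy variable."""
    rng = random.Random(seed)  # independent random stream per trajectory
    x = 1.0
    for _ in range(n_steps):
        x += -damping * x * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x

# Each seed defines one independent trajectory. Here they run one after
# another; on a cluster each seed could instead be a separate batch job.
seeds = range(200)
finals = list(map(run_trajectory, seeds))

# Ensemble statistics: the standard error shrinks as more trajectories are added.
ensemble_mean = statistics.fmean(finals)
standard_error = statistics.stdev(finals) / math.sqrt(len(finals))
```

Because no trajectory reads another's state, the work can be divided across cores with no communication beyond collecting the final results.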

Page Updated: 12 Mar 2013 by kevin. © Victoria University of Wellington, New Zealand, unless otherwise stated. Header image used and relicensed under Creative Commons. Original author: whurley.