Mathematical Methods and Algorithms for Extreme Scale (MMAES) Session 1

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M201

Chair: Vassil Alexandrov

127	Efficient Algorithm for Computing the Ergodic Projector of Markov Multi-Chains [abstract] Abstract: This paper extends the Markov uni-chain series expansion theory to Markov multi-chains, i.e., to Markov chains having multiple ergodic classes and possible transient states. The introduced series expansion approximation (SEA) provides a controllable approximation for Markov multi-chain ergodic projectors which may be a useful tool in large-scale network analysis. As we will illustrate by means of numerical examples, the new algorithm is for large networks faster than the power algorithm.	Joost Berkhout, Bernd Heidergott
376	Transmathematical Basis of Infinitely Scalable Pipeline Machines [abstract] Abstract: A current Grand Challenge is to scale high-performance machines up to exascale. Here we take the theoretical approach of setting out the mathematical basis of pipeline machines that are infinitely scalable, whence any particular scale can be achieved as technology allows. We briefly discuss both hardware and software simulations of such a machine, which lead us to believe that exascale is technologically achievable now. The efficiency of von Neumann machines declines with increasing size but our pipeline machines retain constant efficiency regardless of size. These machines have perfect parallelism in the sense that every instruction of an inline program is executed, on successive data, on every clock tick. Furthermore programs with shared data effectively execute in less than a clock tick. We show that pipeline machines are faster than single or multi-core, von Neumann machines for sufficiently many program runs of a sufficiently time consuming program. Our pipeline machines exploit the totality of transreal arithmetic and the known waiting time of statically compiled programs to deliver the interesting property that they need no hardware or software exception handling.	James Anderson
420	Multilevel Communication optimal Least Squares [abstract] Abstract: Using a recently proposed communication optimal variant of TSQR, weak scalability of the least squares solver (LS) with multiple right hand sides is studied. The communication for TSQR based LS solver for multiple right hand sides remains optimal in the sense that no additional messages are necessary compared to TSQR. However, LS has additional communication volume and flops compared to that for TSQR. Additional flops and words sent for LS is derived. A PGAS model, namely, global address space programming framework (GPI) is used for inter-nodal one sided communication. Within NUMA sockets, C++-11 threading model is used. Scalability results of the proposed method up to a few thousand cores are shown.	Pawan Kumar
406	Developing A Large Time Step, Robust, and Low Communication Multi-Moment PDE Integration Scheme for Exascale Applications [abstract] Abstract: The Boundary Averaged Multi-moment Constrained finite-Volume (BA-MCV) method is derived, explained, and evaluated for 1-D transport to assess accuracy, maximum stable time step (MSTS), oscillations for discontinuous data, and parallel communication burden. The BA-MCV scheme is altered from the original MCV scheme to compute the updates of point wise cell boundary derivatives entirely locally. Then it is altered such that boundary moments are replaced with the interface upwind value. The scheme is stable at a maximum stable CFL (MSCFL) value of one no matter how high-order the scheme is, giving significantly larger time steps than Galerkin methods, for which the MSCFL decreases nearly quadratically with increasing order. The BA-MCV method is compared against a SE method at varying order, both using the ADER-DT time discretization. BA-MCV error for a sine wave was comparable to the same order of accuracy for a SE method. The resulting large time step, multi-moment, low communication scheme is well suited for exascale architectures.	Matthew Norman