Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY) Session 2
Time and Date: 13:25 - 15:05 on 8th June 2016
Room: Macaw
Chair: Stephane Louise
488 | Using Semantics-Aware Composition and Weaving for Multi-Variant Progressive Parallelization Abstract: When writing parallel software for high performance computing, a common practice is to start from a sequential variant of a program that is successively enriched with parallelization directives. This process, progressive parallelization, has the advantage that, at every point in time, a correct version of the program exists. However, progressive parallelization leads to an entanglement of concerns, especially if different variants of the same functional code have to be maintained and evolved concurrently. We propose orchestration style sheets (OSS) as a novel approach to separate parallelization concerns from problem-specific code by placing them in reusable style sheets, so that concerns for different platforms are always separated and never lead to entanglement. A weaving process automatically generates platform-specific code for the required target platforms, taking semantic properties of the source code into account. Based on a scientific computing case study for fluid mechanics, we show that OSS are an adequate way to improve maintainability and reuse of Fortran code parallelized for several different platforms. |
Johannes Mey, Sven Karol, Uwe Aßmann, Immo Huismann, Joerg Stiller, Jochen Fröhlich |
402 | Evaluating Performance and Energy-Efficiency of a Parallel Signal Correlation Algorithm on Current Multi- and Many-Core Architectures Abstract: The increasing variety and affordability of multi- and many-core embedded architectures can pose both a challenge and an opportunity to developers of high performance computing applications. In this paper we present a case study in which we develop and evaluate a unified parallel approach to a signal correlation algorithm currently in use in a commercial/industrial locating system. We utilize both the HPX C++ and CUDA runtimes to achieve scalable code for current embedded multi- and many-core architectures (NVIDIA Tegra, Intel Broadwell M, ARM Cortex-A15). We also compare our approach with traditional high-performance hardware as well as a native embedded many-core variant. To increase the accuracy of our performance analysis, we introduce a dedicated performance model. The results show that our approach is feasible and enables us to harness the advantages of modern micro-server architectures, but also indicate that some of the currently existing many-core embedded architectures have limitations that can lead to traditional hardware being superior in both efficiency and absolute performance. |
Arne Hendricks, Thomas Heller, Andreas Schaefer, Maximilian Kasparek, Dietmar Fey |
201 | Tabu Search for Partitioning Dynamic Dataflow Programs Abstract: An important challenge of dataflow programming is the problem of partitioning dataflow components onto a target architecture. A common objective associated with this problem is to maximize the data processing throughput. This NP-complete problem is very difficult to solve with high-quality, close-to-optimal solutions because of the very large size of the design space and the possibly large variability of input data. This paper introduces four variants of the tabu search metaheuristic expressly developed for partitioning components of a dataflow program. The approach relies on a simulation tool capable of estimating the performance of any partitioning configuration by exploiting a model of the target architecture and the profiling results. The partitioning solutions generated with tabu search are validated for consistency and high accuracy against executions on experimental platforms. |
Malgorzata Michalska, Nicolas Zufferey, Marco Mattavelli |
283 | A Partition Scheduler Model for Dynamic Dataflow Programs Abstract: The definition of an efficient scheduling policy is an important, difficult and open design problem for the implementation of applications based on dynamic dataflow programs, for which optimal closed-form solutions do not exist. This paper describes an approach based on the study of the execution of a dynamic dataflow program on a target architecture with different scheduling policies. The method is based on a representation of the execution of a dataflow program with the associated dependencies, and on the cost of using a scheduling policy, expressed as the number of conditions that need to be verified to have a successful execution within each partition. The relation between the potential gain of the overall execution satisfying intrinsic data dependencies and the runtime cost of finding an admissible schedule is a key issue in finding close-to-optimal solutions for the scheduling problem of dynamic dataflow applications. |
Malgorzata Michalska, Endri Bezati, Simone Casale Brunet, Marco Mattavelli |
309 | A Fast Evaluation Approach of Data Consistency Protocols within a Compilation Toolchain Abstract: Shared memory is a critical issue for large distributed systems. Although several data consistency protocols have been proposed, selecting the protocol that best suits the application requirements and system constraints remains a challenge. The development of multi-consistency systems, where different protocols can be deployed during runtime, appears to be an interesting alternative. In order to explore the design space of the consistency protocols, a fast and accurate method should be used. In this work we rely on a compilation toolchain that transparently handles data consistency decisions for a multi-protocol platform. We focus on the analytical evaluation of the consistency configuration that stands within the optimization loop. We propose to use a TLM NoC simulator to get feedback on expected network contentions. We evaluate the approach using five workloads and three different data consistency protocols. As a result, we are able to obtain a fast and accurate evaluation of the different consistency alternatives. |
Loïc Cudennec, Safae Dahmani, Guy Gogniat, Cédric Maignan, Martha Johanna Sepulveda |