506 |
Numerical modelling of pollutant propagation in Lake Baikal during the spring thermal bar [abstract] Abstract: In this paper, the phenomenon of the thermal bar in Lake Baikal and the propagation of pollutants from the Selenga River are studied with a nonhydrostatic mathematical model. An unsteady flow is simulated by solving numerically a system of thermal convection equations in the Boussinesq approximation using second-order implicit difference schemes in both space and time. To calculate the velocity and pressure fields in the model, an original procedure for buoyant flows, SIMPLED, a modification of Patankar and Spalding's well-known SIMPLE algorithm, has been developed. The simulation results show that the thermal bar plays a key role in the propagation of pollutants in the area where the Selenga River flows into Lake Baikal. |
Bair Tsydenov, Anthony Kay, Alexander Starchenko |
730 |
I have a DRIHM: A case study in lifting computational science services up to the scientific mainstream [abstract] Abstract: While we are witnessing a transition from petascale to exascale computing, we experience, when teaching students and scientists to adopt distributed computing infrastructures for computational sciences, what Geoffrey A. Moore once called the chasm between the visionaries in computational sciences and the early majority of scientific pragmatists. Using the EU-funded DRIHM project (Distributed Research Infrastructure for Hydro-Meteorology) as a case study, we see that innovative research infrastructures have difficulty being accepted by the scientific pragmatists: the infrastructure services are not yet "mainstream". Excellence in computational science workforces, however, can only be achieved if the tools are not only available but also used. In this paper we show for DRIHM how the chasm manifests itself and how it can be crossed. |
Michael Schiffers, Nils Gentschen Felde, Dieter Kranzlmüller |
523 |
Random Set Method Application to Flood Embankment Stability Modelling [abstract] Abstract: In this work the application of random set theory to flood embankment stability modelling is presented. The objective of this paper is to illustrate a method of uncertainty analysis in a real geotechnical problem. |
Anna Pięta, Krzysztof Krawiec |
260 |
MPJ Express Meets YARN: Towards Java HPC on Hadoop Systems [abstract] Abstract: Many organizations—including academic, research, and commercial institutions—have invested heavily in setting up High Performance Computing (HPC) facilities for running computational science applications. On the other hand, the Apache Hadoop software—after emerging in 2005—has become a popular, reliable, and scalable open-source framework for processing large-scale data (Big Data). Realizing the importance and significance of Big Data, an increasing number of organizations are investing in relatively cheaper Hadoop clusters for executing their mission-critical data processing applications. An issue here is that system administrators at these sites might have to maintain two parallel facilities for running HPC and Hadoop computations. This, of course, is not ideal due to redundant maintenance work and poor economics. This paper attempts to bridge this gap by allowing HPC and Hadoop jobs to co-exist on a single hardware facility. We achieve this goal by exploiting YARN—introduced in Hadoop v2.0—which decouples the computational and resource-scheduling part of the Hadoop framework from HDFS. In this context, we have developed a YARN-based reference runtime system for the MPJ Express software that allows executing parallel MPI-like Java applications on Hadoop clusters. The main contribution of this paper is to provide the Big Data community access to MPI-like programming using MPJ Express. As an aside, this work allows parallel Java applications to perform computations on data stored in the Hadoop Distributed File System (HDFS). |
Hamza Zafar, Farrukh Aftab Khan, Bryan Carpenter, Aamir Shafi, Asad Waqar Malik |
393 |
Scalable Multilevel Support Vector Machines [abstract] Abstract: Solving different types of optimization models (including parameter fitting) for support vector machines on large-scale training data is often an expensive computational task. This paper proposes a multilevel algorithmic framework that scales efficiently to very large data sets. Instead of solving the whole training set in one optimization process, the support vectors are obtained and gradually refined at multiple levels of coarseness of the data. The proposed framework includes: (a) construction of a hierarchy of large-scale data coarse representations, and (b) a local processing step that updates the hyperplane throughout this hierarchy. Our multilevel framework substantially improves the computational time without losing the quality of classifiers. The algorithms are demonstrated for both regular and weighted support vector machines. Experimental results are presented for balanced and imbalanced classification problems. Quality improvement on several imbalanced data sets has been observed. |
Talayeh Razzaghi, Ilya Safro |
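The multilevel idea above can be sketched compactly. The snippet below is a minimal illustration only, assuming NumPy arrays, binary labels and scikit-learn's SVC; the random-subsampling coarsening and the margin threshold of 1.5 are illustrative assumptions, not the authors' actual coarsening or refinement rules.

```python
import numpy as np
from sklearn.svm import SVC

def coarsen(X, y, rng):
    """Randomly keep half of the points -- an illustrative stand-in for the
    paper's construction of coarse representations of the training data."""
    keep = rng.permutation(len(y))[: len(y) // 2]
    return X[keep], y[keep]

def multilevel_svm(X, y, min_size=500, seed=0):
    rng = np.random.default_rng(seed)
    levels = [(X, y)]
    while len(levels[-1][1]) > min_size:            # build the coarsening hierarchy
        levels.append(coarsen(*levels[-1], rng))
    model = SVC(kernel="rbf").fit(*levels[-1])      # solve exactly on the coarsest level
    for Xl, yl in reversed(levels[:-1]):            # refine level by level
        near = np.abs(model.decision_function(Xl)) < 1.5   # points near the hyperplane
        if len(np.unique(yl[near])) < 2:            # guard: keep both classes present
            near[:] = True
        model = SVC(kernel="rbf").fit(Xl[near], yl[near])  # retrain locally
    return model
```

Retraining only on the points near the current hyperplane at each finer level is what keeps the per-level optimization problems small.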
407 |
Arbitrarily High-Order-Accurate, Hermite WENO Limited, Boundary-Averaged Multi-Moment Constrained Finite-Volume (BA-MCV) Schemes for 1-D Transport [abstract] Abstract: This study introduces the Boundary Averaged Multi-moment Constrained finite-Volume (BA-MCV) scheme for 1-D transport with Hermite Weighted Essentially Non-Oscillatory (HWENO) limiting using the ADER Differential Transform (ADER-DT) time discretization. The BA-MCV scheme evolves a cell average using a Finite-Volume (FV) scheme, and it adds further constraints as pointwise derivatives of the state at cell boundaries, which are evolved in strong form using PDE derivatives. The resulting scheme maintains a Maximum Stable CFL (MSCFL) value of one regardless of the order of the scheme. Parallel communication requirements are also very low and will be described. Using test cases of a function with increasing steepness, the accuracy of the BA-MCV method is tested in limited and non-limited contexts for varying levels of smoothness. Polynomial $h$-refinement convergence and exponential $p$-refinement convergence are demonstrated. The overall ADER-DT + BA-MCV + HWENO scheme is a scalable, larger-time-step alternative to Galerkin methods for multi-moment fluid simulation in climate and weather applications. |
Matthew Norman |
434 |
A Formal Method for Parallel Genetic Algorithms [abstract] Abstract: We present a formal model that allows us to analyze non-trivial properties of the behavior of parallel genetic algorithms implemented using multiple islands. The model is based on a probabilistic labeled transition system that represents the evolution of the population in each island, as well as the interaction among different islands. By studying the traces these systems can perform, the resulting model allows us to formally compare the behavior of different algorithms. |
Natalia Lopez, Pablo Rabanal, Ismael Rodriguez, Fernando Rubio |
484 |
Comparison of Two Diversification Methods to Solve the Quadratic Assignment Problem [abstract] Abstract: The quadratic assignment problem is one of the most studied NP-hard problems. It is known for its complexity, which makes it a good candidate for parallel design. In this paper, we propose and analyze two parallel cooperative algorithms based on hybrid iterative tabu search. The only difference between the two approaches is the diversification method. On 15 of the hardest well-known instances from the QAPLIB benchmark, our algorithms produce competitive results. This experimentation shows that our propositions can match or exceed several leading algorithms from the literature on almost all of the hardest benchmark instances. |
Omar Abdelkafi, Lhassane Idoumghar, Julien Lepagnot |
522 |
A Matlab toolbox for Kriging metamodelling [abstract] Abstract: Metamodelling offers an efficient way to imitate the behaviour of computationally expensive simulators. Kriging-based metamodels are popular for approximating computation-intensive simulations of a deterministic nature. Despite the existence of various Kriging variants in the literature, only a handful of Kriging implementations are publicly available, and most, if not all, free libraries only provide the standard Kriging metamodel. The ooDACE toolbox offers a robust, flexible and easily extendable framework where various Kriging variants are implemented in an object-oriented fashion under a single platform. This paper presents an incremental update of the ooDACE toolbox introducing an implementation of Gradient Enhanced Kriging, which has been tested and validated on several engineering problems. |
Selvakumar Ulaganathan, Ivo Couckuyt, Dirk Deschrijver, Eric Laermans, Tom Dhaene |
607 |
Improving Transactional Memory Performance for Irregular Applications [abstract] Abstract: Transactional memory (TM) offers optimistic concurrency support in modern multicore architectures, helping programmers to extract parallelism in irregular applications when data dependence information is not available before runtime. In fact, recent research focuses on exploiting thread-level parallelism using TM approaches. However, the proposed techniques are of general use, valid for any type of application. This work presents ReduxSTM, a software TM system specially designed to extract maximum parallelism from irregular applications. Commit management and conflict detection were tailored to take advantage of both transaction ordering constraints, which assure correct results, and the existence of (partial) reduction patterns, a very frequent memory access pattern in irregular applications. Both facts are used to avoid unnecessary transaction aborts. A function in the 300.twolf package from SPEC CPU2000 was taken as a motivating irregular program. This code was parallelized using ReduxSTM and an ordered version of TinySTM, a state-of-the-art TM system. The experimental evaluation shows that our proposed TM system exploits more parallelism from the sequential program and obtains better performance than the other system. |
Manuel Pedrero, Eladio Gutiérrez, Sergio Romero, Oscar Plata |
635 |
Building Java Intelligent Applications Data Mining for Java Type-2 Fuzzy Inference Systems [abstract] Abstract: This paper introduces JT2FISClustering, a data mining extension for JT2FIS. JT2FIS is a Java class library for building intelligent applications. This extension is used to extract information from a data set and transform it into an Interval Type-2 Fuzzy Inference System in Java applications. Mamdani and Takagi-Sugeno Fuzzy Inference Systems can be generated using fuzzy c-means or subtractive data mining methods. We compare the outputs and performance of MATLAB versus Java in order to validate the proposed extension. |
Manuel Castañón-Puga, Josué-Miguel Flores-Parra, Juan Ramón Castro, Carelia Gaxiola-Pacheco, Luis Enrique Palafox-Maestre |
639 |
The Framework for Rapid Graphics Application Development: The Multi-scale Problem Visualization [abstract] Abstract: Interactive real-time visualization plays a significant role in the simulation research domain. Multi-scale problems need high-performance visualization with good quality, and the same could be said about other problem domains, e.g. big data analysis, physics simulation, etc. The state of the art shows that a universal tool for solving such problems does not exist. Modern computer graphics requires enormous effort to implement efficient algorithms on modern GPUs and graphics APIs. In the first part of our paper we introduce a framework for rapid graphics application development and its extensions for multi-scale problem visualization. In the second part of the paper we present a prototype solution to a multi-scale problem: the simulation and monitoring of high-precision agent movement, from behavioral patterns in an airport up to world-wide flight traffic. Finally we summarize our results and discuss future investigations. |
Alexey Bezgodov, Andrey Karsakov, Aleksandr Zagarskikh, Vladislav Karbovskii |
29 |
A multiscale model for the feto-placental circulation in the monochorionic twin pregnancies [abstract] Abstract: We developed a mathematical model of monochorionic twin pregnancies to simulate both normal gestation and the Twin-Twin Transfusion Syndrome (TTTS), a disease in which the interplacental anastomoses create a flow imbalance, causing one of the twins to receive too much blood and fluid, becoming hypertensive with polyhydramnios (the Recipient), and the other to become hypotensive with oligohydramnios (the Donor). This syndrome, if untreated, almost certainly leads to the death of one or both twins. We propose a compartment model to simulate the flows between the placenta and the fetuses and the accumulation of amniotic fluid in the sacs. The aim of our work is to provide a simple but realistic model of the twins-mother system and to stress it by simulating the pathological cases and the related treatments, i.e. amnioreduction (elimination of the excess liquid in the recipient sac), laser therapy (removal of all the anastomoses) and other possible innovative therapies acting on pressure and flow parameters. |
Ilaria Stura, Pietro Gaglioti, Tullia Todros, Caterina Guiot |
86 |
Sequential and Parallel Implementation of GRASP for the 0-1 Multidimensional Knapsack Problem [abstract] Abstract: The knapsack problem is a widely known problem in combinatorial optimization and has been the object of much research in recent decades. The problem has a great number of variants, and obtaining an exact solution to any of these is not easily accomplished, which motivates the search for alternative techniques to solve the problem. Among these alternatives, metaheuristics seem to be suitable for the search for approximate solutions to the problem. In this work we propose a sequential and a parallel implementation for the multidimensional knapsack problem using the GRASP metaheuristic. The obtained results show that GRASP can lead to good-quality results, even optimal in some instances, and that CUDA may be used to expand the neighborhood search and as a result may lead to improved result quality. |
Bianca De Almeida Dantas, Edson Cáceres |
89 |
Telescopic hybrid fast solver for 3D elliptic problems with point singularities [abstract] Abstract: This paper describes a telescopic solver for two-dimensional h-adaptive grids with point singularities. The input for the telescopic solver is an h-refined two-dimensional computational mesh with rectangular finite elements. The candidates for point singularities are first localized over the mesh by using a greedy algorithm. Having the candidates for point singularities, we execute a direct solver that performs multiple refinements towards the selected point singularities and runs a parallel direct solver algorithm with logarithmic cost with respect to the refinement level. The direct solvers executed over each candidate point singularity return local Schur complement matrices that can be merged together and submitted to an iterative solver. In this paper we utilize a parallel, logarithmic computational cost GPU solver or the parallel multi-threaded GALOIS solver as the direct solver. We use Incomplete LU Preconditioned Conjugate Gradients (ILUPCG) as the iterative solver. We also show that elimination of point singularities from the refined mesh significantly reduces the number of iterations to be performed by the ILUPCG iterative solver. |
Anna Paszynska, Konrad Jopek, Krzysztof Banaś, Maciej Paszynski, Andrew Lenerth, Donald Nguyen, Keshav Pingali, Lisandro Dalcin, Victor Calo |
95 |
Adapting map resolution to accomplish execution time constraints in wind field calculation [abstract] Abstract: Forest fires are natural hazards that destroy thousands of hectares around the world every year. Forest fire propagation prediction is a key point in fighting against such hazards. Several models and simulators have been developed to predict forest fire propagation. These models require input parameters such as a digital elevation map, a vegetation map, and other parameters describing the vegetation and meteorological conditions. However, some meteorological parameters, such as wind speed and direction, change from one point to another due to the effect of the terrain topography. Therefore, it is necessary to couple wind field models, such as WindNinja, to estimate the wind speed and direction at each point of the terrain. The output provided by the wind field simulator is used as input to the fire propagation model. Coupling a wind field model and a forest fire propagation model improves prediction accuracy, but significantly increases prediction time. This fact is critical since propagation predictions must be provided in advance to allow the control centers to manage firefighters in the best possible way. This work analyses WindNinja execution time, describes a WindNinja parallelisation based on map partitioning, determines the limitations of this methodology for large maps and presents an improvement based on adapting map resolution to meet execution time limitations. |
Gemma Sanjuan, Tomas Margalef, Ana Cortes |
103 |
Efficient BSP/CGM algorithms for the maximum subsequence sum and related problems [abstract] Abstract: Given a sequence of n numbers, with at least one positive value, the maximum subsequence sum problem consists in finding the contiguous subsequence with the largest sum or score among all subsequences of the original sequence. Several scientific applications use algorithms that solve the maximum subsequence sum problem. In Computational Biology in particular, these algorithms can help in the identification of transmembrane domains and in the search for regions of high GC content, a required step in locating pathogenicity islands. The sequential algorithm that solves this problem has O(n) time complexity. In this work we present BSP/CGM parallel algorithms to solve the maximum subsequence sum problem and three related problems: the maximum longest subsequence sum, the maximum shortest subsequence sum and the number of disjoint subsequences of maximum sum. To the best of our knowledge there are no parallel BSP/CGM algorithms for these related problems. Our algorithms use p processors and require O(n/p) parallel time with a constant number of communication rounds for the maximum subsequence sum algorithm, and O(log p) communication rounds, with O(n/p) local computation per round, for the algorithms of the related problems. We implemented the algorithms on a cluster of computers using MPI and on a machine with a GPU using CUDA, both with good speed-ups. |
Anderson C. Lima, Edson N. Cáceres, Rodrigo G. Branco, Roussian R. A. Gaioso, Samuel B. Ferraz, Siang W. Song, Wellinton S. Martins |
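The O(n) sequential algorithm the abstract refers to is the classical scan commonly attributed to Kadane; a plain Python version is shown below for reference (the BSP/CGM parallel algorithms in the paper build on this computation and are not reproduced here).

```python
def max_subsequence_sum(seq):
    """Kadane's O(n) scan: largest sum of a contiguous subsequence.
    The sequence is assumed to contain at least one positive value."""
    best = curr = seq[0]
    start = best_start = best_end = 0
    for i, x in enumerate(seq[1:], start=1):
        if curr + x < x:                  # better to restart the subsequence at i
            curr, start = x, i
        else:
            curr += x
        if curr > best:
            best, best_start, best_end = curr, start, i
    return best, (best_start, best_end)

# The maximum subsequence of [2, -4, 6, -1, 3, -7, 5] is [6, -1, 3], with sum 8.
print(max_subsequence_sum([2, -4, 6, -1, 3, -7, 5]))   # (8, (2, 4))
```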
225 |
Fire Hazard Safety Optimisation for Building Environments [abstract] Abstract: This article provides a theoretical study of fire hazard safety in building environments. The working hypothesis is that the navigation costs and hazard spread are modeled deterministically over time. Based on the dynamic navigation costs under fire hazard, the article introduces the notion of dynamic safety in a recursive manner. Several theoretical results are then proposed to calculate the dynamic safety over time and to establish that it represents the maximum amount of time one can safely delay at a node. Based on the recursive equations, an algorithm is proposed to calculate the dynamic safety and successor matrices. Finally, some experimental results are provided to illustrate the efficiency of the algorithm and to present a real case study. |
Sabin Tabirca, Tatiana Tabirca, Laurence Yang |
295 |
A Structuring Concept for Securing Modern Day Computing Systems [abstract] Abstract: Security within computing systems is ambiguous, often achieved through obscurity, a knowledgeable user, or plain luck. Presented is a novel concept for structuring computing systems to achieve a higher degree of overall system security through the compartmentalization and isolation of executed instructions for each component. Envisioned is a scalable model which focuses on lower-level operations to shift the view of security from a binary outcome to a deterministic metric based on a set of independent characteristics. |
Orhio Creado, Phu Dung Le, Jan Newmarch, Jeff Tan |
323 |
Federated Big Data for resource aggregation and load balancing with DIRAC [abstract] Abstract: BigDataDIRAC is a federated Big Data solution with a Distributed Infrastructure with Remote Agent Control (DIRAC) access point. Users have the opportunity to access multiple Big Data resources scattered over different geographical areas, much like access to grid resources. This approach opens the possibility of offering users not only grid and cloud resources, but also Big Data resources, from the same DIRAC environment. We describe a system that allows access to a federation of Big Data resources, including load balancing, using DIRAC. A proof of concept is shown, and load balancing performance evaluations are presented using several use cases supported by three computing centers in two countries and four Hadoop clusters. |
Victor Fernandez, Víctor Méndez, Tomás F. Pena |
324 |
Big Data Analytics Performance for Large Out-Of-Core Matrix Solvers on Advanced Hybrid Architectures [abstract] Abstract: This paper examines the performance of large Out-Of-Core matrix computations to assess the optimal Big Data system performance of advanced computer architectures, based on the performance evaluation of a large dense Lower-Upper Matrix Decomposition (LUD) employing a highly tuned, I/O-managed, slab-based LUD software package developed by the Lockheed Martin Corporation. We present extensive benchmark studies conducted with this package on UMBC's Bluegrit and Bluewave clusters and NASA-GSFC's Discover cluster systems. Our results show that the single-node speedup achieved by Phi coprocessors relative to the host SandyBridge CPU processors is about 1.5X, an even smaller relative performance gain than in the studies published by F. Masci (Masci, 2013), where a 2-2.5X gain is obtained. Surprisingly, the Westmere with the Tesla GPU scales comparably with the Sandy Bridge and the Phi coprocessor up to 12 processes and then fails to continue to scale. Across 20 CPU nodes, SandyBridge obtains a uniform speedup of 0.5X over Westmere for problem sizes of 10K, 20K and 40K unknowns. With an Infiniband DDR interconnect, the performance of Nehalem processors is comparable to Westmere without the interconnect. |
Raghavendra Rao, Milton Halem, John Dorband |
352 |
A critical survey of data grid replication strategies based on data mining techniques [abstract] Abstract: Replication is one common way to effectively address the challenges of improving data management in data grids. It has attracted a great deal of attention from many researchers; hence, a lot of work has been done and many strategies have been proposed. However, most of the existing replication strategies consider a single file-based granularity and do not take into account file access patterns or possible file correlations, even though file correlations are becoming an increasingly important consideration for performance enhancement in data grids. In this regard, knowledge about file correlations can be extracted from historical and operational data using techniques from the data mining field. Data mining techniques have proved to be a powerful tool facilitating the extraction of meaningful knowledge from large data sets. As a consequence of the convergence of data mining and data grids, mining grid data is an interesting research field which aims at analyzing grid systems with data mining techniques in order to efficiently discover new meaningful knowledge to enhance data management in data grids. More precisely, in this paper, the extracted knowledge is used to enhance replica management. Gaps in the current literature and opportunities for further research are presented. In addition, we propose a new guideline for data mining application in the context of data grid replication strategies. To the best of our knowledge, this is the first survey mainly dedicated to data grid replication strategies based on data mining techniques. |
Tarek Hamrouni, Sarra Slimani, Faouzi Ben Charrrada |
428 |
Reduction of Computational Load for MOPSO [abstract] Abstract: The run time of many optimisation algorithms, particularly those that explicitly consider multiple objectives, can be impractically large when applied to real-world problems. This paper reports an investigation into the behaviour of Multi-Objective Particle Swarm Optimisation (MOPSO) that seeks to reduce the number of objective function evaluations needed without degrading solution quality. By restricting archive size and strategically reducing the trial solution population size, it has been found that the number of function evaluations can be reduced by 66.7% without significant reduction in solution quality. In fact, careful manipulation of algorithm operating parameters can even significantly improve solution quality. |
Mathew Curtis, Andrew Lewis |
501 |
The Effects of Hotspot Detection and Virtual Machine Migration Policies on Energy Consumption and Service Levels in the Cloud [abstract] Abstract: Cloud computing has received much attention among researchers lately. Managing Cloud resources efficiently necessitates effective policies that assign applications to hardware in a way that requires the least resources possible. Applications are first assigned to virtual machines, which are subsequently placed on the most appropriate server host. If a server becomes overloaded, some of its virtual machines are reassigned. This process requires a hotspot detection mechanism in combination with techniques that select the virtual machine(s) to migrate. In this work we introduce two new virtual machine selection policies, Median Migration Time and Maximum Utilisation, and show that they outperform existing approaches on the criteria of minimising energy consumption, service level agreement violations and the number of migrations when combined with different hotspot detection mechanisms. We show that parametrising the hotspot detection policies correctly has a significant influence on the workload balance of the system. |
S Sohrabi, I. Moser |
614 |
Towards a Performance-realism Compromise in the Development of the Pedestrian Navigation Model [abstract] Abstract: Despite the emergence of new approaches and increasingly powerful processing resources, there are cases in the domain of pedestrian modeling that require maintaining a compromise between computational performance and the realism of the behavior of the simulated agents. The present paper seeks to address this issue through comparative computational experiments and visual validation of the simulations using real-world data. The acquired results show that a reasonable compromise may be reached in multi-level navigation incorporating both route planning and collision avoidance. |
Daniil Voloshin, Vladislav Karbovskii, Dmitriy Rybokonenko |
641 |
A Methodology for Designing Energy-Aware Systems for Computational Science [abstract] Abstract: Energy consumption is currently one of the main issues in large distributed systems. More specifically, the efficient management of energy without losing performance has become a hot topic in the field. Thus, the design of systems solving complex problems must take energy efficiency into account. In this paper we present a formal methodology to check the correctness, from an energy-aware point of view, of large systems, such as HPC clusters and cloud environments, dedicated to computational science. Our approach uses a simulation platform to model and simulate computational science environments, and metamorphic testing to check the correctness of energy consumption in these systems. |
Pablo Cañizares, Alberto Núñez, Manuel Nuñez, J.Jose Pardo |
528 |
Towards an automatic co-generator for manycores’ architecture and runtime: STHORM case-study [abstract] Abstract: The increasing design complexity of manycore architectures at the hardware and software levels requires powerful tools capable of validating every functional and non-functional property of the architecture. At the design phase, the chip architect needs to explore several parameters from the design space and iterate over different instances of the architecture in order to meet the defined requirements. Each new architectural instance requires the configuration and generation of a new hardware model/simulator, its runtime, and the applications that will run on the platform, which is a very long and error-prone task. In this context, the IP-XACT standard has become widely used in the semiconductor industry to package IPs and provide a low-level SW stack to ease their integration. In this work, we present preliminary work on a methodology for automatically configuring and assembling an IP-XACT golden model and generating the corresponding manycore architecture HW model, low-level software runtime and applications. We use the STHORM manycore architecture and the HBDC application as a case study. |
Charly Bechara, Karim Ben Chehida, Farhat Thabet |
306 |
Enhancing ELM-based facial image classification by exploiting multiple facial views [abstract] Abstract: In this paper, we investigate the effectiveness of the Extreme Learning Machine (ELM) network in facial image classification. In order to enhance performance, we exploit knowledge related to the human face structure. To this end, we train a multi-view ELM network employing automatically created facial regions of interest. By jointly learning the network parameters and optimized network output combination weights, each facial region appropriately contributes to the final classification result. Experimental results on three publicly available databases show that the proposed approach outperforms facial image classification based on a single facial representation and on other facial region combination schemes. |
Alexandros Iosifidis, Anastasios Tefas, Ioannis Pitas |
429 |
Automatic Query Driven Data Modelling in Cassandra [abstract] Abstract: Non-relational databases have recently been the preferred choice when it comes to dealing with Big Data challenges, but their performance is very sensitive to the chosen data organisation. We have seen differences of over 70 times in response time for the same query on different models. This forces users to be fully aware of the queries they intend to serve in order to design their data model. The common practice, then, is to replicate data into different models designed to fit different query requirements. In this scenario, the user is in charge of the code implementation required to keep consistency between the different data replicas. Manually replicating data in such high layers of the database results in a lot of squandered storage, because the underlying system replication mechanisms were originally designed for availability and reliability ends. In this paper, we propose and design a mechanism and a prototype to provide users with transparent management, where queries are matched with a well-performing model option. Additionally, we propose to do so by transforming the replication mechanism into a heterogeneous one, in order to avoid squandering disk space while keeping the availability and reliability features. The result is a system where, regardless of the query or model the user specifies, response time will always be that of an affine query. |
Roger Hernandez, Yolanda Becerra, Jordi Torres, Eduard Ayguade |
186 |
A clustering-based approach to static scheduling of multiple workflows with soft deadlines in heterogeneous distributed systems [abstract] Abstract: Typical patterns of using scientific workflow management systems (SWMS) include periodic executions of prebuilt workflows with precisely known estimates of tasks’ execution times. Combining such workflows into sets can significantly improve the resulting schedules in terms of fairness and meeting users’ constraints. In this paper, we propose a clustering-based approach to static scheduling of multiple workflows with soft deadlines. This approach generalizes commonly used techniques of grouping and ordering parts of different workflows. We introduce a new scheduling algorithm, MDW-C, for multiple workflows with soft deadlines and compare its effectiveness with the task-based and workflow-based algorithms which we proposed earlier in [1]. Experiments with several types of synthetic and domain-specific test data sets showed the superiority of a mixed clustering scheme over task-based and workflow-based schemes. This was confirmed by an evaluation of the proposed algorithms on the basis of the CLAVIRE workflow management platform. |
Klavdiya Bochenina, Nikolay Butakov, Alexey Dukhanov, Denis Nasonov |
268 |
Challenges and Solutions in Executing Numerical Weather Prediction in a Cloud Infrastructure [abstract] Abstract: Cloud computing has emerged as an option for performing large-scale scientific computing. The elasticity of the cloud and its pay-as-you-go model present an interesting opportunity for applications commonly executed on clusters or supercomputers. This paper presents the challenges of migrating a numerical weather prediction (NWP) application to a cloud computing infrastructure and executing it there. We compared the execution of this High-Performance Computing (HPC) application on a local cluster and in the cloud using different instance sizes. The experiments demonstrate that processing and networking create a limiting factor, but that storing input and output datasets in the cloud presents an interesting option to share results and ease the deployment of a test-bed for a weather research platform. Results show that cloud infrastructure can be used as a viable HPC alternative for numerical weather prediction software. |
Emmanuell Diaz Carreño, Eduardo Roloff, Philippe Navaux |
325 |
Flexible Dynamic Time Warping for Time Series Classification [abstract] Abstract: Measuring the similarity or distance between two time series sequences is critical for the classification of a set of time series sequences. Given two time series sequences, X and Y, the dynamic time warping (DTW) algorithm can calculate the distance between X and Y. But the DTW algorithm may align some neighboring points in X to corresponding points which are far apart in Y. It may obtain an alignment with a higher score, but with less representative information. This paper proposes the flexible dynamic time warping (FDTW) method for measuring the similarity of two time series sequences. The FDTW algorithm adds an additional score as a reward for contiguously long one-to-one fragments. As the experimental results show, the DTW, DDTW and FDTW methods each outperform the others on some test sets. By combining the FDTW, DTW and DDTW methods to form a classifier ensemble with a voting scheme, the ensemble achieves a lower average error rate than each individual method. |
Che-Jui Hsu, Kuo-Si Huang, Chang-Biau Yang, Yi-Pu Guo |
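For reference, the classical DTW recurrence that FDTW extends is D(i, j) = cost(x_i, y_j) + min(D(i-1, j), D(i, j-1), D(i-1, j-1)); a small Python version follows. The FDTW reward for long one-to-one runs is the paper's contribution and is not reproduced here, only the standard baseline.

```python
import numpy as np

def dtw_distance(x, y):
    """Classical dynamic time warping between 1-D sequences x and y.
    D[i, j] = minimal cost of aligning x[:i] with y[:j]."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # extend the warping path from the left, lower, or diagonal cell
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))   # 0.0: the repeated 1 is absorbed
```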
511 |
Onedata - a Step Forward towards Globalization of Data Access for Computing Infrastructures [abstract] Abstract: To satisfy the requirements of data globalization, and of high-performance access in particular, we introduce our onedata system, which virtualizes storage systems provided by storage resource providers distributed globally. onedata introduces new data organization concepts together with providers' cooperation procedures that involve the use of GlobalRegistry as a mediator. The most significant features include metadata synchronization and on-demand file transfer. |
Lukasz Dutka, Michał Wrzeszcz, Tomasz Lichoń, Rafał Słota, Konrad Zemek, Krzysztof Trzepla, Łukasz Opioła, Renata Slota, Jacek Kitowski |
536 |
Ocean forecast information system for emergency interventions [abstract] Abstract: The paper describes the computation and information system required to support fast and efficient operations in emergency situations in the marine environment. The most common cases that trigger emergency procedures are identified, and the main features of a Search And Rescue (SAR) intervention are described in terms of their evolution, the inputs and level of detail required, and the weaknesses that still exist. The improvements that can come from a more integrated information system, from the computation of the environmental conditions to the adoption of a dedicated graphical interface providing all the necessary information in a clear and complete way, are also explained. |
Roberto Vettor, Carlos Guedes Soares |
682 |
Optimizing Performance of ROMS on Intel Xeon Phi [abstract] Abstract: ROMS (Regional Oceanic Modeling System) is an open-source ocean modeling system that is widely used by the scientific community. It uses a coarse-grained parallelization scheme which partitions the computational domain into tiles. ROMS operates on a lot of multi-dimensional arrays, which makes it an ideal candidate to gain from architectures with wide and powerful Vector Processing Units (VPU) such as Intel Xeon Phi. In this paper we present an analysis of the BENCHMARK application of ROMS and the issues affecting its performance on Xeon Phi. We then present an iterative optimization strategy for this application on Xeon Phi which results in a speed-up of over 2x compared to the baseline code in the native mode and 1.5x in symmetric mode. |
Gopal Bhaskaran, Pratyush Gaurav |
336 |
Fuzzy indication of reliability in metagenomics NGS data analysis [abstract] Abstract: NGS data processing in metagenomics studies has to deal with noisy data that can contain a large number of read errors which are difficult to detect and account for. This work introduces a fuzzy indicator of reliability technique to facilitate solutions to this problem. It includes modified Hamming and Levenshtein distance functions that are intended as drop-in replacements in NGS analysis procedures which rely on distances, such as phylogenetic tree construction. The distances utilise fuzzy sets of reliable bases or an equivalent fuzzy logic, potentially aggregating multiple sources of base reliability. |
Milko Krachunov, Dimitar Vassilev, Maria Nisheva, Ognyan Kulev, Valeriya Simeonova, Vladimir Dimitrov |
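A minimal sketch of what a reliability-weighted Hamming distance could look like, assuming each base carries a membership value in a fuzzy set of reliable bases; the specific weighting below (scaling each mismatch by the product of the two reliabilities) is an illustrative assumption, not necessarily the paper's exact formulation.

```python
def fuzzy_hamming(seq_a, rel_a, seq_b, rel_b):
    """Hamming-style distance between equal-length reads seq_a and seq_b.
    rel_a[i], rel_b[i] in [0, 1] express how reliable each base call is;
    a mismatch at an unreliable position contributes less to the distance."""
    assert len(seq_a) == len(seq_b) == len(rel_a) == len(rel_b)
    return sum(ra * rb
               for a, b, ra, rb in zip(seq_a, seq_b, rel_a, rel_b)
               if a != b)

# The mismatch at position 2 is down-weighted because one read is unsure there.
print(fuzzy_hamming("ACGT", [1.0, 1.0, 0.3, 1.0],
                    "ACTT", [1.0, 1.0, 0.9, 1.0]))   # ~0.27 instead of 1
```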
559 |
Pairwise genome comparison workflow in the Cloud using Galaxy [abstract] Abstract: Workflows are becoming the new paradigm in bioinformatics. In general, bioinformatics problems are solved by interconnecting several small software pieces to perform complex analyses. This demands a minimum level of expertise to create, enact and monitor such tool compositions. In addition, bioinformatics is immersed in big-data territory, facing huge problems in analysing such amounts of data. We have addressed these problems by integrating a tool management platform (Galaxy) and a Cloud infrastructure, which prevents moving the big datasets between different locations and allows the dynamic scaling of the computing resources depending on the user's needs. The result is a user-friendly platform that facilitates the work of the end users while performing their experiments, installed in a Cloud environment that includes authentication, security and big-data transfer mechanisms. To demonstrate the suitability of our approach we have integrated into the infrastructure an existing pairwise and multiple genome comparison tool, which involves the management of huge datasets and high computational demands. |
Óscar Torreño Tirado, Michael T. Krieger, Paul Heinzlreiter, Oswaldo Trelles |
583 |
WebGL based visualisation and analysis of stratigraphic data for the purposes of the mining industry [abstract] Abstract: In recent years the combination of databases, data and internet technologies has greatly enhanced the functionality of many systems based on spatial data, and facilitated the dissemination of such information. In this paper, we propose a web-based data visualisation and analysis system for stratigraphic data from a Polish mine, with visualisation and analysis tools which can be accessed via the Internet. WWW technologies such as active web pages and WebGL provide a user-friendly interface for browsing, plotting, comparing, and downloading information of interest, without the need for dedicated mining industry software. |
Anna Pieta, Justyna Bała |
33 |
Modeling and Simulation of Masticatory Muscles [abstract] Abstract: Medical simulators play an important role in helping the development of prototype prostheses, in pre-surgical planning and in a better understanding of the mechanical phenomena involved in muscular activity. This article focuses on modeling and simulating the activity of the jaw muscular system. The model involves the use of three-dimensional bone models and muscle modeling based on Hill-type actuators. Ligament restrictions on mandible movement were taken into account in our model. Data collected from patients were used to partially parameterize our model so that it could be used in medical applications. In addition, the simulation of muscles employed a new methodology based on insertion curves, with many lines of action for each group of muscles. A simulator was developed, which allowed real-time visualization of individual muscle activation at each corresponding simulation time. The model-derived trajectory was then compared to the collected data, remaining mostly within the convex hull of the captured mandible motion curves. Furthermore, the model accurately described the desired border movements. |
Eduardo Garcia, Márcio Leal, Marta Villamil |
35 |
Fully automatic 2D hp-adaptive Finite Element Method for Non-Stationary Heat Transfer [abstract] Abstract: In this paper we present a fully automatic hp-adaptive finite element method code for non-stationary two-dimensional problems. The code utilizes the θ-scheme for time discretization and a fully automatic hp-adaptive finite element method discretization for the numerical solution of each time step. The code is verified on an exemplary non-stationary heat transfer problem over the L-shaped domain. |
Paweł Matuszyk, Marcin Sieniek, Maciej Paszyński |
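For reference, applying the θ-scheme to a semi-discrete problem of the form $M\,\dot{u} + K u = f$ (mass matrix $M$, stiffness matrix $K$; this generic form is assumed here, since the abstract does not spell it out) gives the time step

$$\left(M + \theta\,\Delta t\,K\right)u^{n+1} \;=\; \left(M - (1-\theta)\,\Delta t\,K\right)u^{n} \;+\; \Delta t\left(\theta f^{n+1} + (1-\theta) f^{n}\right),$$

where $\theta = 0$ is the explicit Euler scheme, $\theta = 1$ the implicit Euler scheme, and $\theta = \tfrac{1}{2}$ the Crank-Nicolson scheme.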
46 |
Parallelization of an Encryption Algorithm Based on a Spatiotemporal Chaotic System and a Chaotic Neural Network [abstract] Abstract: In this paper the results of parallelizing a block cipher based on a spatiotemporal chaotic system and a chaotic neural network are presented. A data dependence analysis of loops was applied in order to parallelize the algorithm. The parallelism of the algorithm is expressed in accordance with the OpenMP standard. As a result of this study, it was found that the most time-consuming loops of the algorithm are suitable for parallelization. Efficiency measurements of the parallel algorithm working in the ECB, CTR, CBC and CFB modes of operation are shown. |
Dariusz Burak |
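The paper's parallelization is done in the cipher's source code with OpenMP via loop data-dependence analysis; purely as a language-neutral illustration of why, for example, the CTR mode of operation parallelizes well (block i depends only on the key and the counter i, so the loop over blocks carries no dependence), here is a toy Python sketch. The encrypt_block function is a placeholder, not the chaotic cipher from the paper.

```python
from concurrent.futures import ProcessPoolExecutor

def encrypt_block(args):
    """Placeholder block transform: XOR with a keystream derived from (key, counter).
    Stands in for the real chaotic-cipher round, which is not reproduced here."""
    block, key, counter = args
    keystream = bytes((key[i % len(key)] ^ (counter + i)) & 0xFF
                      for i in range(len(block)))
    return bytes(b ^ k for b, k in zip(block, keystream))

def ctr_encrypt_parallel(plaintext, key, block_size=16, workers=4):
    blocks = [plaintext[i:i + block_size]
              for i in range(0, len(plaintext), block_size)]
    # In CTR mode every block depends only on (key, block index), so the loop
    # over blocks has no carried dependence and can be processed concurrently.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        cipher_blocks = pool.map(encrypt_block,
                                 [(b, key, i) for i, b in enumerate(blocks)])
    return b"".join(cipher_blocks)

if __name__ == "__main__":
    print(ctr_encrypt_parallel(b"parallel counter mode demo", b"secret").hex())
```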
64 |
Cryptanalysing the shrinking generator [abstract] Abstract: Some linear cellular automata generate exactly the same PN-sequences as those generated by maximum-length LFSRs. Hence, cellular automata can be considered as alternative generators to maximum-length LFSRs. Moreover, some LFSR-based keystream generators can be modelled as linear structures based on cellular automata. In this work, we analyse a family of one-dimensional, linear, regular and cyclic cellular automata based on rule 102 that describes the behaviour of the shrinking generator, which was designed as a non-linear generator. This implies that the output sequence of the generator is vulnerable to a cryptanalysis that takes advantage of this linearity. |
Sara D. Cardell, Amparo Fúster-Sabater |
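Rule 102 is the linear (over GF(2)) update s_i(t+1) = s_i(t) XOR s_{i+1}(t); a minimal sketch of one such cyclic automaton is given below. The correspondence with the shrinking generator's sequences is the subject of the paper and is not reproduced here.

```python
def rule102_step(state):
    """One synchronous update of a cyclic (periodic-boundary) rule-102 cellular
    automaton: each cell becomes the XOR of itself and its right neighbour."""
    n = len(state)
    return [state[i] ^ state[(i + 1) % n] for i in range(n)]

state = [0, 1, 0, 0, 1, 1, 0, 1]
for _ in range(4):
    print(state)
    state = rule102_step(state)
```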
74 |
D-Aid - An App to Map Disasters and Manage Relief Teams and Resources [abstract] Abstract: Natural or man-made disasters cause damage to life and property. Lack of appropriate emergency management increases the physical damage and loss of life. D-Aid, the smartphone app proposed in this article, intends to help volunteers and relief teams quickly map and aid victims of a disaster. Anyone can post an occurrence on a web map after a disaster, streamlining and decentralizing access to information. Through visualization techniques like heat maps and Voronoi diagrams, implemented both in the D-Aid app and on a web map, everyone can easily get information about the number of victims, their needs and imminent dangers after disasters. |
Luana Carine Schunke, Luiz Paulo Luna de Oliveira, Mauricio Cardoso, Marta Becker Villamil |
168 |
My Best Current Friend in a Social Network [abstract] Abstract: Due to their popularity, social networks (SNs) have been subject to different analyses. A research field in this area is the identification of several types of users and groups. To make the identification process easier, an SN is usually represented through a graph. Usual tools to analyze a graph are the centrality measures, which identify the most important vertices within a graph; among them is PageRank (a measure originally designed to rank web pages). Informally, in the context of an SN, the PageRank of a user i represents the probability that another user of the SN is viewing the page of i after a considerable time of navigation in the SN. In this paper, we define a new type of user in an SN: the best current friend. Informally, the idea is to identify, among the friends of a user i, the friend k that would generate the highest decrease in the PageRank of i if k stopped being i's friend. This may be useful to identify the users/customers whose friendship/relationship should be a priority to keep. We provide formal definitions, algorithms and some experiments on this subject. Our experiments showed that the best current friend of a user is not necessarily among those who have the highest PageRank in the SN, or among the ones who have many friends. |
Francisco Moreno, Santiago Hernández, Edison Ospina |
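A brute-force sketch of the notion defined above, assuming an undirected friendship graph handled with networkx: for each friend k of i, remove the edge (i, k), recompute PageRank, and keep the friend whose loss lowers PageRank(i) the most. This naive full recomputation is for illustration only; the paper's algorithms are the actual contribution.

```python
import networkx as nx

def best_current_friend(G, i):
    """Friend k of i whose loss would cause the largest drop in PageRank(i)."""
    base = nx.pagerank(G)[i]
    best_friend, best_drop = None, -1.0
    for k in G.neighbors(i):
        H = G.copy()
        H.remove_edge(i, k)                    # k stops being i's friend
        drop = base - nx.pagerank(H)[i]
        if drop > best_drop:
            best_friend, best_drop = k, drop
    return best_friend, best_drop

G = nx.karate_club_graph()                     # stand-in social network
print(best_current_friend(G, 0))
```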
398 |
Clustering Heterogeneous Semi-Structured Social Science Datasets [abstract] Abstract: Social scientists have begun to collect large datasets that are heterogeneous and semi-structured, but the ability to analyze such data has lagged behind its collection. We design a process to map such datasets to a numerical form, apply singular value decomposition clustering, and explore the impact of individual attributes or fields by overlaying visualizations of the clusters. This provides a new path for understanding such datasets, which we illustrate with three real-world examples: the Global Terrorism Database, which records details of every terrorist attack since 1970; a Chicago police dataset, which records details of every drug-related incident over a period of approximately a month; and a dataset describing members of a Hezbollah crime/terror network within the U.S. |
David Skillicorn, Christian Leuprecht |
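A minimal sketch of the pipeline described above, assuming the semi-structured records have already been mapped to a numerical matrix (rows = records, columns = attributes); k-means on the leading singular vectors stands in for the paper's SVD clustering, and the overlay is reduced to colouring a scatter plot by cluster. The three-blob toy data is purely illustrative.

```python
import numpy as np
from numpy.linalg import svd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

def svd_cluster(X, n_clusters=4, rank=2):
    """Project records onto the top singular vectors, then cluster there."""
    Xc = X - X.mean(axis=0)                    # centre each attribute
    U, s, Vt = svd(Xc, full_matrices=False)
    coords = U[:, :rank] * s[:rank]            # low-dimensional embedding of the records
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(coords)
    return coords, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.5, size=(50, 6)) for m in (0, 2, 4)])
coords, labels = svd_cluster(X, n_clusters=3)
plt.scatter(coords[:, 0], coords[:, 1], c=labels)   # overlay: colour by cluster
plt.show()
```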
473 |
CFD post-processing in Unity3D [abstract] Abstract: In architecture and urban design, the urban climate at the meso/micro scale is a strong design criterion for outdoor thermal comfort and a building's energy performance. Evaluating the effect of buildings on the local climate and vice versa can be done with computational fluid dynamics (CFD) methods. The results from CFD are typically visualized through post-processing software closely related to the product family of the pre-processing and simulation tools. The built-in functions are made for engineers and lack user-friendliness for real-time exploration of results. To bridge the gap between architect and engineer we propose visualizations based on game engine technology. This paper demonstrates the implementation of CFD-to-Unity3D conversion and weather data visualization. |
Matthias Berger, Verina Cristie |
596 |
Helsim: a particle-in-cell simulator for highly imbalanced particle distributions [abstract] Abstract: Helsim is a 3D electro-magnetic particle-in-cell simulator used to simulate the behaviour of plasma in space. Particle-in-cell simulators track the movement of particles through space, with the particles generating and being subjected to various fields (electric, magnetic and/or gravitational). Helsim dissociates the particle data structures from the fields, allowing them to be distributed and load-balanced independently, and can simulate experiments with highly imbalanced particle distributions with ease. This paper shows weak scaling results of a highly imbalanced particle setup on up to 32 thousand cores. The results validate the basic claims of scalability for imbalanced particle distributions, but also highlight a problem with a workaround we had to implement to circumvent an OpenMPI bug we encountered. |
Roel Wuyts, Tom Haber, Giovanni Lapenta |
724 |
Efficient visualization of urban simulation data using modern GPUs [abstract] Abstract: Visualization of simulation results for major urban areas is a difficult task. Multi-scale processes and the connectivity of the urban environment may require interactive visualization of dynamic scenes with many objects at different scales. To visualize these scenes it is not always possible to use standard GIS systems. The wide availability of high-performance gaming graphics cards has led to the emergence of specialized frameworks which are able to cope with such kinds of visualization. This paper presents a framework and special algorithms that take full advantage of the GPU to render urban simulation data over a virtual globe. Experiments on the scalability of the framework have shown that it successfully deals with the visualization of up to two million moving agents and up to eight million fixed points of interest on top of the virtual globe without detriment to the smoothness of the image. |
Aleksandr Zagarskikh, Andrey Karsakov, Alexey Bezgodov |
732 |
Cloud Technology for Forecasting Accuracy Evaluation of Extreme Metocean Events [abstract] Abstract: The paper describes an approach to ensemble-based simulation for forecasting extreme metocean events as an urgent computing problem. The approach is based on a conceptual basis of data-flow construction developed for simulation-based ensemble forecasting. It was used to develop an architecture for ensemble-based data processing on top of the CLAVIRE cloud computing environment, with an extension for urgent computing resource provisioning and scheduling. Finally, a solution for ensemble water level forecasting in the Baltic Sea was developed as part of the St. Petersburg flood prevention system. |
Sergey Kosukhin, Sergey Kovalchuk, Alexander Boukhanovsky |
320 |
Co-clustering based approach for Indian monsoon prediction [abstract] Abstract: Prediction of the Indian monsoon is a challenging task due to its complex dynamics and variability over the years. Statistical predictors that perform well in one set of years are not as skilful in others. In this paper, we attempt to identify sets of predictors that have high skill for particular clusters of years. A co-clustering algorithm, which extracts groups of years paired with good predictor sets for those years, is used for this purpose. A weighted ensemble of these predictors is used in the final prediction. Results on the past 65 years of data show that the approach is competitive with state-of-the-art techniques. |
Moumita Saha, Pabitra Mitra |
139 |
Agent Based Simulations for the Estimation of Sustainability Indicators [abstract] Abstract: We present a methodology to improve the estimation of several sustainability indicators based on the measurement of walking distance to infrastructures, combining Agent Based Simulation with Volunteered Geographic Information. Joining these two forces, we construct a more realistic and accurate distribution of the infrastructures based on knowledge created by citizens and their perceptions instead of official data sources. A Situated Multi-Agent System is in charge of simulating not only the functional disparity and sociodemographic characteristics of the population but also the geographic reality in a dynamic way. Namely, the system analyzes different geographic barriers for each collective, bringing new possibilities to improve the assessment of the needs of the population for a more sustainable development of the city. In this article we describe the methodology for carrying out several sustainability indicator measurements and present the results of the proposed methodology applied to several municipalities. |
Ander Pijoan, Cruz E. Borges, Iraia Oribe-Garcia, Cristina Martín, Ainhoa Alonso-Vicario |
276 |
Bray-Curtis Metrics as Measure of Liquid State Machine Separation Ability in Function of Connections Density [abstract] Abstract: Separation ability is one of the two most important properties of Liquid State Machines used in the Liquid Computing theory. To measure the so-called distance between the states that a Liquid State Machine can exist in, different norms and metrics can be applied. Until now we have used the Euclidean distance to measure the distance between states representing different stimulations of simulated cortical microcircuits. In this paper we compare our previously used methods with the approach based on the Bray-Curtis measure of dissimilarity. A systematic analysis of efficiency, and its comparison for different numbers of simulated synapses present in the model, is discussed to some extent. |
Grzegorz Wójcik, Marcin Ważny |
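For reference, the Bray-Curtis dissimilarity between two non-negative state vectors u and v is $\sum_i |u_i - v_i| / \sum_i (u_i + v_i)$; a small Python version follows (SciPy's scipy.spatial.distance.braycurtis computes the same quantity). The example vectors are hypothetical liquid-state readouts, e.g. per-neuron spike counts.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: 0 for identical vectors, 1 for disjoint ones."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

print(bray_curtis([3, 0, 2, 5], [1, 1, 2, 7]))   # 5 / 21 = 0.2381...
```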
365 |
A First Step to Performance Prediction for Heterogeneous Processing on Manycores [abstract] Abstract: In order to maintain the continuous growth of computer performance while keeping energy consumption under control, the microelectronic industry develops architectures capable of processing more and more tasks concurrently. Thus, the next generations of microprocessors may count hundreds of independent cores that may differ in their functions and features. As an extensive knowledge of their internals cannot be a prerequisite to their programming, and for the sake of portability, these forthcoming computers require the compilation flow to evolve and cope with heterogeneity issues. In this paper, we take a first step toward a possible solution to this challenge by exploring the results of SPMD-type parallelism and predicting the performance of the compilation results, so that our tools can guide a compiler to build an optimal partition of tasks automatically, even on heterogeneous targets. Experimental results show that our tools predict real-world performance with very good accuracy. |
Nicolas Benoit, Stephane Louise |
468 |
A decision support system for emergency flood embankment stability [abstract] Abstract: This article presents a decision support system for emergency flood embankment stability. The proposed methodology is based on the analysis of data from both a flood embankment measurement network and data generated through numerical modeling. Decisions about the risk of embankment failure are made on the basis of this analysis. The authors present both the general concept of the system as well as a detailed description of the system components. |
Magdalena Habrat, Michał Lupa, Monika Chuchro, Andrzej Leśniak |
422 |
A Methodology for Profiling and Partitioning Stream Programs on Many-core Architectures [abstract] Abstract: Maximizing data throughput is a very common implementation objective for streaming applications. This task is particularly challenging for implementations based on many-core and multi-core target platforms because, in general, it implies tackling several NP-complete combinatorial problems. Moreover, an efficient design space exploration requires an accurate evaluation on the basis of dataflow program execution profiling. The focus of the paper is on the methodological challenges of obtaining accurate profiling measures. Experimental results validate a many-core platform built from an array of Transport Triggered Architecture processors for exploring the partitioning search space based on execution trace analysis. |
Malgorzata Michalska, Jani Boutellier, Marco Mattavelli |
590 |
Minimum-overlap clusterings and the sparsity of overcomplete decompositions of binary matrices [abstract] Abstract: Given a set of $n$ binary data points, a widely used technique is to group its features into $k$ clusters: sets of features for which there is, in turn, a set of data points that has similar values in those features. In the case where $n < k$, an exact decomposition is always possible, and the question of how much the clusters overlap is of interest. In this paper we approach the question through matrix decomposition, and relate the degree of overlap with the sparsity of one of the resulting matrices. We present i) analytical results regarding bounds on this sparsity, and ii) a heuristic to estimate the minimum amount of overlap that an exact grouping of features into $k$ clusters must have. Happily, adding new data will not alter this minimum amount of overlap. An interpretation of this amount, and of its change with $k$, is given for a biological example. |
Victor Mireles, Tim Conrad |
736 |
Modeling of critical situations in the migration policy implementation [abstract] Abstract: This paper describes an approach to modeling potentially critical situations in society. Potentially critical situations are caused by a lack of compliance between current local policies and the desired goals, methods and means of implementing these policies. The modeling approach is proposed to improve the efficiency of local government management, taking into account potentially critical situations that may arise at the level of individuals, of social groups and of society as a whole. The use of the proposed method is shown using the example of migration policy in St. Petersburg. |
Sergey Mityagin, Sergey Ivanov, Alexander Boukhanovsky, Iliya Gubarev, Tihonova Olga |
450 |
Parallelization of context-free grammar parsing on GPU using CUDA [abstract] Abstract: During the last decade, increasing interest in parallel programming has been observed. It is caused by the tendency to develop microprocessors as multicore units that can perform instructions simultaneously. A popular and widely used example of such a platform is the graphics processing unit (GPU). Its ability to perform calculations simultaneously is being investigated as a way of improving the performance of complex algorithms. Therefore, GPUs provide architectures that allow programmers and software developers to use their computational power in much the same way as a CPU. One of these architectures is the CUDA platform, developed by NVIDIA.
The purpose of our work was to implement the parallel CYK algorithm, which is one of the most popular and effective parsing algorithms for context-free languages. Parsing is crucial for systems dedicated to working with natural, biological (like RNA), or artificial languages, i.e. interpreters of scripting languages, compilers, and systems concerned with pattern or natural/biological language recognition. Parallelization of context-free grammar parsing on the GPU was done using the CUDA platform. The paper presents a review of existing parallelizations of the CYK algorithm in the literature, delivers descriptions of the proposed algorithms, and discusses the experimental results obtained. We considered algorithms in which each cell of the CYK matrix is assigned to a respective thread (processor), algorithms in which each pair of cells is assigned to a thread, a version with shared memory, and finally a version with a limited number of non-terminals.
The algorithms were evaluated on five artificial grammars with different numbers of terminals and non-terminals, different grammar rule sizes, and different lengths of input sequences. A significant performance improvement (up to about 10x) compared with CPU-based computations was achieved. |
Olgierd Unold, Piotr Skrzypczak
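For reference, the sequential CYK algorithm that the parallel versions build on, for a grammar in Chomsky normal form; each cell of the triangular matrix holds the set of non-terminals deriving the corresponding substring, and it is these cells (or pairs of cells) that the paper maps to GPU threads. The toy grammar below is illustrative, not taken from the paper.

```python
def cyk_parse(word, terminal_rules, binary_rules, start="S"):
    """CYK recognition for a CNF grammar.
    terminal_rules: {terminal: {non-terminals producing it}}
    binary_rules:   {(B, C): {A such that A -> B C}}"""
    n = len(word)
    # table[i][l] = non-terminals deriving word[i : i + l + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(terminal_rules.get(ch, ()))
    for length in range(2, n + 1):               # substring length
        for i in range(n - length + 1):          # substring start
            for split in range(1, length):       # split point
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for B in left:
                    for C in right:
                        table[i][length - 1] |= binary_rules.get((B, C), set())
    return start in table[0][n - 1]

# Toy grammar in CNF for the language { a^n b^n : n >= 1 }.
terminals = {"a": {"A"}, "b": {"B"}}
binaries = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
print(cyk_parse("aabb", terminals, binaries))   # True
print(cyk_parse("abb", terminals, binaries))    # False
```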