Poster Track (POSTER) Session 1
Time and Date: Before Lunch
Room: Poster Room
Chair: None
3 | Efficient Skyline Query over Multiple Relations [abstract] Abstract: A skyline query on multiple relations, known as a skyline join query, finds skyline points in the join results of multiple data sources. The skyline join query has been studied extensively. However, most existing skyline join algorithms can only process queries over two relations and ignore the common case involving more than two. In this paper, we propose an efficient skyline join algorithm, Skyjog, which is applicable to queries over two or more relations. Skyjog quickly identifies most skyline join results with simple calculations. Extensive experiments demonstrate that Skyjog outperforms the state-of-the-art skyline join algorithms on both two and more than two relations. |
Jinchao Zhang |
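The core notion this abstract builds on, skyline (Pareto-dominance) filtering, can be sketched minimally. This naive O(n^2) filter over a single relation is illustrative only; it is not the Skyjog algorithm, which operates over joins of multiple relations:

```python
def dominates(p, q):
    """True if p dominates q: p is no worse in every dimension
    and strictly better (here: smaller) in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Naive skyline: keep the points dominated by no other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]
```

For example, `skyline([(1, 4), (2, 2), (4, 1), (3, 3)])` drops `(3, 3)`, which is dominated by `(2, 2)`.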
6 | Social Media Conversation Monitoring: Visualize Information Contents of Twitter messages using conversational metrics [abstract] Abstract: In this paper we present a novel method to extract and visualize actionable information from streams of social media messages, analyzed as conversational elements. Our method has been applied to over 4 million messages related to more than 35 different events, demonstrating good results in identifying conversational patterns. |
Carlo Lipizzi, José Emmanuel Ramirez Marquez, Dante Gama Dessavre, Luca Iandoli |
17 | Processing High-Volume Geospatial Data: A case of monitoring Heavy Haul Railway operations [abstract] Abstract: Sensor technology such as GPS can be used in the mapping of transportation networks (e.g., road, rail). Advances in sensor technology enable fast and cost-effective acquisition of geospatial data. However, GPS suffers errors in positional accuracy due to factors such as signal arrival time, ionospheric effects, multipath distortions and so on. In railway systems, positional accuracy is of utmost importance, because the current position of a particular wagon may be read as being on the wrong track, causing incorrect analysis for the safety and maintenance of track and wagons. Also, the numerous lightweight sensors installed in each wagon along with GPS produce a large amount of continuous data streams as multiple trains operate for long hours during their trips. This causes problems, as huge amounts of location data need to be processed continuously and traditional data processing and storage applications cannot handle them. In this paper, we propose efficient algorithms and a suitable data structure to achieve rapid and accurate location mapping, improving both run-time performance and mapping accuracy, based on 2 months of historical data consisting of approximately 250 million records. Our large-scale evaluation demonstrates that our system is capable of real-time performance, processing tens of thousands of records per second, and is accurate, mapping rail track information with 98.5% accuracy. |
Prajwol Sangat, Maria Indrawan-Santiago, David Taniar, Beng Oh, Paul Reichl |
22 | A Suite of Java Message-Passing Benchmarks to Support the Validation of Testing Models, Criteria and Tools [abstract] Abstract: This paper proposes a novel suite of benchmarks for evaluating the structural testing of concurrent message-passing programs. The main focus is on supplying a set of programs that generate controlled and qualified demand on concurrent software testing artifacts, such as models, criteria and tools. We hope to reveal strengths and limitations of such artifacts, and also to allow a fair comparison of different testing approaches. Such an evaluation represents a challenge for the testing area, which must consider benchmarks simple enough to be validated manually, if necessary, but also complex enough to exercise non-trivial aspects of communication/synchronization found in programs. The proposed suite is composed of thirteen bug-free programs and five faulty programs (the latter used to evaluate the capability to reveal unknown defects). The benchmarks are developed in Java and are available as free software on the Internet with a detailed description. They were validated in experimental studies and have also been used in different research projects and for educational purposes. The experimental and practical results show that the benchmarks can generate a qualified workload for testing message-passing programs. The main contribution of this study is the development of a more robust and fair suite of benchmarks capable of improving the evaluation of testing activity for concurrent programs. |
George Gabriel Mendes Dourado, Paulo S Lopes De Souza, Rafael R. Prado, Raphael Negrisoli Batista, Simone R. S. Souza, Julio C. Estrella, Sarita M. Bruschi, Joao Lourenco |
50 | Second-Order Adjoint Solvers for Systems of Parameterized Nonlinear Equations [abstract] Abstract: Adjoint mode algorithmic (also known as automatic) differentiation (AD) transforms implementations of multivariate vector functions as computer programs into first-order adjoint code. Its reapplication or combination with tangent mode AD yields higher-order adjoint code. Second derivatives play an important role in nonlinear programming. For example, second-order (Newton-type) nonlinear optimization methods promise faster convergence in the neighborhood of the minimum by taking second derivative information into account. The adjoint mode is of particular interest in large-scale gradient-based nonlinear optimization because its computational cost is independent of the number of free variables. Part of the objective function may be given implicitly as the solution of a system of n parameterized nonlinear equations. If the system parameters depend on the free variables of the objective, then second derivatives of the nonlinear system’s solution with respect to those parameters are required. The local computational overhead as well as the additional memory requirement for computing second-order adjoints of the solution vector with respect to the parameters by AD depend on the number of iterations performed by the nonlinear solver. This dependence can be eliminated by taking a symbolic approach to the differentiation of the nonlinear system. |
Niloofar Safiran, Johannes Lotz, Uwe Naumann |
59 | Cryptographic Properties of Equivalent Ciphers [abstract] Abstract: In this work, the cryptographic properties of equivalent ciphers are investigated. More precisely, we analyze the use of nonlinear filtering functions applied to Linear Feedback Shift Registers (LFSRs) to generate pseudorandom sequences for cryptographic applications. The emphasis is on classes of equivalent nonlinear filter generators whose elements produce exactly the same sequence. Given a pair (nonlinear filter, LFSR), the work develops a method of computing different equivalent nonlinear filters applied to distinct LFSRs, all of which generate the same cryptographic sequence. Moreover, it is shown that important cryptographic properties of the filtering functions, such as nonlinearity or algebraic immunity, are not invariant among elements of the same equivalence class. The security of a cipher is that of its weakest equivalent; thus, this article allows one to compute weaker equivalents of any nonlinear filter, in the sense that such equivalents could be used to cryptanalyze apparently secure sequence generators. To assess the resistance of a cipher against a given type of cryptanalytic attack, one should determine the weakest equivalent in its corresponding class and not only consider a particular instance. |
Amparo Fuster-Sabater, Sara D. Cardell |
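The object of study above, an LFSR filtered by a nonlinear Boolean function, can be sketched as follows. The tap positions and the filter function here are arbitrary illustrative choices, not those of any analyzed cipher:

```python
def lfsr_filter_generator(state, taps, filt):
    """Generate a keystream by applying a nonlinear Boolean function
    `filt` to the state of a linear feedback shift register (LFSR).
    `taps` are the feedback positions; the register shifts right."""
    state = list(state)
    while True:
        yield filt(state)
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]
```

Two generators with different (filter, LFSR) pairs are equivalent in the paper's sense when they emit identical keystreams bit for bit.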
70 | Understanding User Behavior: from HPC to HTC [abstract] Abstract: In this paper, we investigate the differences and similarities in user job submission behavior in High Performance Computing (HPC) and High Throughput Computing (HTC). We consider job submission behavior in terms of parallel batch-wise submissions, as well as delays and pauses in job submission. We compare differences in batch characteristics by classifying batches using a popular model. Our findings show that modeling HTC job submission behavior requires knowledge of the underlying bags of tasks, which is often unavailable. Furthermore, we find evidence that subsequent job submission behavior is not influenced by the different complexities and requirements of HPC and HTC jobs. |
Stephan Schlagkamp, Rafael Ferreira Da Silva, Ewa Deelman, Uwe Schwiegelshohn |
72 | A training engine for automatic quantification of left ventricular trabeculation from cardiac magnetic resonance images [abstract] Abstract: The medical community needs an objective quantification of non-compacted cardiomyopathy, characterized by a trabeculated mass in the left ventricle myocardium. Building on a software tool proposed for the automatic quantification of the exact degree of hyper-trabeculation in the left ventricle myocardium, in which the trabecular and compacted zones are detected, we have designed, developed and tested a training engine that automatically adjusts the threshold used to identify the trabecular zones and the intensity of the black zone used to determine the external layer of the compacted zone. Significant improvements are obtained with respect to the manual process and the cardiologist-aided automatic software tool, thus saving valuable diagnosis time. |
Gregorio Bernabe, Javier Cuenca, Domingo Gimenez, Josefa González-Carrillo |
114 | Simulating Refugee Movements: where would you go? [abstract] Abstract: The challenge of understanding refugee movements is huge and affects countries worldwide on a daily basis. Yet, in terms of simulation, the challenge appears to have been largely ignored. I argue that we as researchers can, and should, harness our computational skills to better understand and predict refugee movements. I reflect on the computational challenges of modelling refugees, and present a simulation case study example focused on the Northern Mali Conflict in 2012. Compared to UNHCR data, the simulation predicts fewer refugees moving towards Mauritania, and more refugees moving towards Niger. This outcome aligns with UNHCR reports, which mention that unregistered refugees were known to reside outside of the official camps, though further investigations are required to rule out competing theories. |
Derek Groen |
141 | Cost-benefit analysis and exploration of cost-energy-performance trade-offs in scientific computing infrastructures [abstract] Abstract: Running a scientific computing infrastructure can be very costly, and selecting the appropriate hardware for a specific computational problem that minimizes the total cost is not trivial. There are trade-offs between the initial cost of hardware, performance, and energy consumption. Energy is a very important aspect of scientific computing infrastructures; in fact, in some cases energy consumption costs end up becoming more significant than the cost of the hardware itself. In this work, we demonstrate how a cost-benefit analysis of different scenarios reveals insights into trade-offs between cost, performance, and energy for specific scientific computational workloads. We aim to answer the question of which configuration offers the lowest total cost of ownership for a scientific cluster. We also analyze at what point in time a hardware upgrade becomes beneficial, taking into account performance-per-watt improvements and hardware devaluation. This knowledge enables those responsible for planning, designing, and managing the life cycle of a scientific computing infrastructure to find a cost-effective solution tailored to their specific applications and resource utilization patterns. We demonstrate that quite substantial cost savings can be achieved by using this methodology. |
Pablo Llopis Sanmillán, Gabriel G. Castañé, Jesús Carretero |
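The hardware-versus-energy trade-off this analysis explores can be illustrated with a toy total-cost-of-ownership model. The formula, parameter names, and the assumed energy price are illustrative simplifications, not the paper's cost model:

```python
def total_cost_of_ownership(hw_cost, power_watts, utilization, years,
                            energy_price_kwh=0.15):
    """Toy TCO = purchase cost + lifetime energy cost.
    `utilization` in [0, 1]; `energy_price_kwh` is an assumed rate."""
    hours = years * 365 * 24 * utilization
    energy_kwh = power_watts / 1000.0 * hours
    return hw_cost + energy_kwh * energy_price_kwh
```

Comparing such totals across candidate configurations shows how, over a long enough lifetime, the energy term can dominate the purchase price.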
165 | An Evolutionary Algorithm for Autonomous Robot Navigation [abstract] Abstract: This paper presents an implementation of an evolutionary algorithm to control a robot with autonomous navigation in avoiding obstacles. The paper describes how the evolutionary system controls the sensors and motors in order to complete this task. A simulator was developed to test the algorithm and its configurations. The tests were performed in a simulated environment containing a set of barriers that were observed by means of a set of sensors. The solution obtained in the simulator was embedded in a real robot, which was tested in an arena containing obstacles. The robot was able to navigate and avoid the obstacles in this environment. |
Anderson Soares, Telma Woerle de Lima Soares |
170 | Preconditioning Large Scale Iterative Solution of $\mathbf{A} x=b$ Using a Statistical Method with Application to Matrix-Free Spectral Solution of Helmholtz Equation [abstract] Abstract: The problem of preconditioning the linear system A x=b, where no entries of A are known but the matrix-vector product is given in the linear functional form A x = L(x), is considered. A statistical multiregressor method is proposed whose convergence improves during the Krylov iterations. It is shown that this method effectively improves the rate of convergence of the GMRES algorithm. The results are validated by applying the proposed preconditioner to a pseudo-spectral solution of the Helmholtz equation. |
Arash Ghasemi, Lafayette K. Taylor |
250 | OPC: A Real-time Solution for Big Data Computing [abstract] Abstract: Big data computing is one of the most significant research topics in the Internet of Things and cloud computing. Hadoop technology in particular achieves high calculation efficiency in non-real-time scenarios (computing times of around 1 minute) for both structured and unstructured data; however, it cannot meet the demands of real-time solutions, which require a computing time of <10 s. To solve this problem, the authors of this article propose Objectification Parallel Computing (OPC) and provide an efficient real-time solution on this basis. |
Xiang Li, Zhi Yang, Wang Liao, Charles Zhou |
252 | Big Data System and Analysis for Agricultural Applications [abstract] Abstract: The data system for agriculture is more complicated than those of other industries. Building an agricultural information system in China, and in the Chinese language, poses more challenges than in other countries. We tried to build an integrated information system covering 1636 counties in China as a test bed for a future worldwide big data system for agriculture.
We used collaborative learning agents to analyze the prices of all agricultural products, including crops, vegetables, farm animals and aquatic products. We also analyzed the factors driving price fluctuations.
In addition, we developed an automatic diagnostic system for agricultural produce based on real-time photos from smartphones. This automatic diagnostic system links the farmers, the 1 million field engineers, and fertilizer and other equipment suppliers.
|
Xiang Li, Charles Zhou, Jishun Ou and Wang Liao |
264 | Comparison of the Parallel Fast Marching Method, the Fast Iterative Method, and the Parallel Semi-Ordered Fast Iterative Method [abstract] Abstract: Solving the eikonal equation allows computing a monotone front propagation of anisotropic nature and is thus a widely applied technique in different areas of science and engineering. Various methods are available, of which only a subset is suitable for shared-memory parallelization, the key focus of this analysis. We evaluate three different approaches: the recently developed parallel fast marching method based on domain decomposition, the inherently parallel fast iterative method, and a parallel approach to the semi-ordered fast iterative method, which offers increased stability for variations in the front velocity compared to established iterative methods. We introduce the individual algorithms, evaluate their accuracy, and show benchmark results based on a dual-socket Intel Ivy Bridge-EP cluster node using C++/OpenMP implementations. Our investigations show that the parallel fast marching method performs best in terms of accuracy and single-thread performance, and reasonably well with respect to parallel efficiency for up to 8-16 threads. |
Josef Weinbub, Andreas Hoessinger |
266 | BEAM: A computational workflow system for managing and modeling material characterization data in HPC environments [abstract] Abstract: Improvements in scientific instrumentation allow imaging at mesoscopic to atomic length scales, many spectroscopic modes, and now—with the rise of multimodal acquisition systems and the associated processing capability—the era of multidimensional, informationally dense data sets has arrived. Technical issues in these combinatorial scientific fields are exacerbated by computational challenges best summarized as a necessity for drastic improvement in the capability to transfer, store, and analyze large volumes of data. The Bellerophon Environment for Analysis of Materials (BEAM) platform provides material scientists the capability to directly leverage the integrated computational and analytical power of High Performance Computing (HPC) to perform scalable data analysis and simulation via an intuitive, cross-platform client user interface. This framework delivers authenticated, “push-button” execution of complex user workflows that deploy data analysis algorithms and computational simulations utilizing the converged compute-and-data infrastructure at Oak Ridge National Laboratory’s (ORNL) Compute and Data Environment for Science (CADES) and HPC environments like Titan at the Oak Ridge Leadership Computing Facility (OLCF). In this work we address the underlying HPC needs for characterization in the material science community, elaborate on how BEAM’s design and infrastructure tackle those needs, and present a small subset of use cases in which scientists utilized BEAM across a broad range of analytical techniques and analysis modes. |
E. J. Lingerfelt, A. Belianinov, E. Endeve, O. Ovchinnikov, S. Somnath, J. Borreguero, N. Grodowitz, B. Park, R. K. Archibald, C. T. Symons, S. V. Kalinin, O. E. B. Messer, M. Shankar, S. Jesse |
282 | Assessing Run-time Overhead of Securing Kepler [abstract] Abstract: We have developed a model for securing data-flow-based application chains and implemented it as an add-on package for the scientific workflow system Kepler. Our Security Analysis Package (SAP) leverages Kepler's Provenance Recorder (PR). SAP secures data flows from external input-based attacks, from access to unauthorized external sites, and from data integrity issues. Unsurprisingly, real-time security comes at the cost of some run-time overhead. About half of this overhead appears to come from the use of the Kepler PR and the other half from the security functions added by SAP. |
Donghoon Kim, Mladen Vouk |
283 | A Partition Scheduler Model for Dynamic Dataflow Programs [abstract] Abstract: Defining an efficient scheduling policy is an important, difficult and open design problem for implementing applications based on dynamic dataflow programs, for which optimal closed-form solutions do not exist. This paper describes an approach based on studying the execution of a dynamic dataflow program on a target architecture under different scheduling policies. The method is based on a representation of the execution of a dataflow program with its associated dependencies, and on the cost of using a scheduling policy, expressed as the number of conditions that need to be verified for a successful execution within each partition. The relation between the potential gain of the overall execution satisfying intrinsic data dependencies and the runtime cost of finding an admissible schedule is a key issue in finding close-to-optimal solutions to the scheduling problem of dynamic dataflow applications. |
Malgorzata Michalska, Endri Bezati, Simone Casale Brunet, Marco Mattavelli |
290 | Novel Druggable Sites of Insulin-Degrading Enzyme Identified through Applied Structural Bioinformatics Analysis [abstract] Abstract: Insulin-degrading enzyme (IDE) plays critical roles in the proteolysis of diverse substrates, such as insulin and amyloid β. Pathologically, IDE is implicated in type 2 diabetes mellitus and Alzheimer’s disease, but potent and selective regulators of IDE remain elusive. We have applied structural bioinformatics techniques to the largest ensemble of IDE structures determined to date, and identified structural clusters associated with distinct conformational states, together with their respective clustroids. IDE utilizes its intrinsic large-scale structural motions to adopt multiple conformational states and perform molecular functions. The conformational space occupied by IDE structures can be shifted through mutations and inter-molecular interactions with other proteins, small molecules or substrate peptides. We observed that IDE-N is generally more dynamic than IDE-C and suggest that there are possibly other open conformational states of IDE whose structures remain unknown. We also identified novel druggable sites that are specific to particular conformational states of IDE; these sites can potentially be explored for designing investigative probes or therapeutic agents for specific spatiotemporal contexts. |
Suryani Lukman |
309 | A Fast Evaluation Approach of Data Consistency Protocols within a Compilation Toolchain [abstract] Abstract: Shared memory is a critical issue for large distributed systems. Although several data consistency protocols have been proposed, selecting the protocol that best suits the application requirements and system constraints remains a challenge. The development of multi-consistency systems, in which different protocols can be deployed at runtime, appears to be an interesting alternative. To explore the design space of consistency protocols, a fast and accurate method is needed. In this work we rely on a compilation toolchain that transparently handles data consistency decisions for a multi-protocol platform. We focus on the analytical evaluation of the consistency configuration that sits within the optimization loop. We propose to use a TLM NoC simulator to get feedback on expected network contention. We evaluate the approach using five workloads and three different data consistency protocols. As a result, we are able to obtain a fast and accurate evaluation of the different consistency alternatives. |
Loïc Cudennec, Safae Dahmani, Guy Gogniat, Cédric Maignan, Martha Johanna Sepulveda |
312 | A Simple and Efficient Method to Handle Sparse Preference Data Using Domination Graphs: An Application to YouTube [abstract] Abstract: The phenomenal growth of the number of videos on YouTube provides enormous potential for users to find content of interest to them. Unfortunately, as the size of the repository grows, the task of discovering high-quality content becomes more daunting. To address this, YouTube occasionally asks users for feedback on videos. In one such event (the YouTube Comedy Slam), users were asked to rate which of two videos was funnier. This yielded sparse pairwise data indicating a participant’s relative assessment of two videos. Given this data, several questions immediately arise: how do we make inferences for uncompared pairs, overcome noisy and often contradictory data, and handle severely skewed, real-world sampling? To address these questions, we introduce the concept of a domination graph, and demonstrate a simple and scalable method, based on the Adsorption algorithm, to efficiently propagate preferences through the graph. Before tackling the publicly available YouTube data, we extensively test our approach on synthetic data by attempting to recover an underlying, known rank-order of videos using similarly created sparse preference data. |
Shumeet Baluja |
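The propagation idea can be sketched with a much-simplified stand-in: blend each node's score with the average of its neighbors while keeping seed nodes anchored. This is not the Adsorption algorithm itself (which uses per-node injection, continuation and abandonment weights), and the graph, seeds and `alpha` below are illustrative:

```python
from collections import defaultdict

def propagate_scores(edges, seeds, iters=30, alpha=0.5):
    """Iteratively blend each node's score with the mean score of its
    (undirected) neighbors; seed nodes stay fixed at their labels."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    scores = {n: seeds.get(n, 0.0) for n in nbrs}
    for _ in range(iters):
        new = {}
        for n in nbrs:
            avg = sum(scores[m] for m in nbrs[n]) / len(nbrs[n])
            new[n] = seeds[n] if n in seeds else alpha * avg + (1 - alpha) * scores[n]
        scores = new
    return scores
```

On a chain a-b-c with seeds a=1.0 and c=0.0, the uncompared middle node b converges to 0.5, showing how preferences reach pairs that were never directly compared.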
315 | Hydra: A High-throughput Virtual Screening Data Visualization and Analysis Tool [abstract] Abstract: Virtual high-throughput biochemical screening offers a cost-effective alternative to the empirical testing of millions of compounds. However, virtual screening data can contain errors and thus often requires some manual processing to eliminate false positives, evaluate the ligand-macromolecule fit, and identify new molecular interactions. This analysis is generally hindered by highly specific software and hardware requirements and complex user interfaces. Hydra is an HTML5- and JavaScript-based application that ameliorates this issue by displaying ligand-macromolecule models calculated by virtual screening programs in a single, simple online interface. The application is capable of loading raw data sets from the DOCK virtual screening platform and utilizing pre-processed datasets from other software to display compounds side by side in a user-defined grid of 3Dmol.js instances. It also searches databases for selected compound information to display natively within the interface. This tool provides a highly accessible platform for streamlined analysis of virtual screening results. |
Curtis Sera, Shelby Matlock, Yasuhiro Watashiba, Kohei Ichikawa, Jason Haga |
316 | Modelling complex systems with distributed agency and fuzzy inference systems. Knowledge-based curricula in higher education [abstract] Abstract: Higher education has become a cornerstone in the rise of the knowledge society. This concept describes a new type of social configuration that places knowledge as the main value in human interaction for future social and economic development, and as one of the main areas to improve in order to attain a better quality of life. In this paper we discuss the importance of this structure and its characteristics, then focus on the need to generate a higher education curriculum that fits this emphasis on the social development of a new knowledge-based society. Finally, we use complex systems simulation to analyze six agents: 1) Students; 2) Teaching - Teachers; 3) Training plan - Teachers; 4) Scientific research, IT development, innovation and professional performance; 5) Management - Managers; and 6) Environment and relevance, or External Agents; and five variables: 1) Teaching; 2) Extracurricular activities; 3) Research and development; 4) Management; and 5) Educational culture. The final goal is to propose a curriculum oriented toward creating a knowledge-based society from higher education as a main factor in achieving this objective. |
Eduardo Ahumada-Tello, Manuel Castanon-Puga |
320 | Formal Analysis of Control Strategies for a Cyber Physical System [abstract] Abstract: Cyber-physical systems (CPS) use recent computing, communication, and control methods to monitor and control geographically dispersed field sites in order to provide and maintain a high level of confidence in their operation. Simulation methods are frequently used to test such systems; however, they may not be adequate to show the absence of errors given the complexity of the system under test. Failure to detect errors in safety-critical systems can lead to catastrophic situations. In this paper we propose an approach for the reliability analysis of cyber-physical systems based on simulation and formal analysis. We demonstrate the approach on an industrial case study of a four-tank process that exhibits several challenging features in the design and implementation of CPS. Our experimental results show that the proposed approach can be used efficiently to test and verify the four-tank process system: simulation results showed the validity of the approximation and abstraction of the system, and formal analysis was then used to validate that several design requirements were satisfied in the design model. |
Abdullah Abu Omar, Amjad Gawanmeh and Alain April |
331 | An Execution Framework for Grid-Clustering Methods [abstract] Abstract: Cluster analysis methods have proven extremely valuable for explorative data analysis and are also fundamental to data mining methods. The goal of cluster analysis is to find correlations in the value space and to separate the data values into an a priori unknown set of subgroups based on a similarity metric. In the case of high-dimensional data, i.e. data with a large number of describing attributes, clustering can become a very time-consuming task, which often limits the number of observations to be clustered in practice. To overcome this problem, grid-clustering methods have been developed, which do not calculate pairwise similarity values between data values but instead organize the value space surrounding the data values, e.g. by specific data structure indices. In this paper we present a framework that allows evaluating different data structures for the generation of a multi-dimensional grid structure grouping the data values into blocks. The values are then clustered by a topological neighbor-search algorithm on the basis of the block structure. As the first data structure to be evaluated, we present the BANG file structure and show its feasibility as a clustering index. The developed framework is planned to be contributed as a package to the WEKA software. |
Erich Schikuta, Florian Fritz |
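The general grid-clustering idea described in the abstract, bucketing values into cells and then merging topologically neighboring non-empty cells, can be sketched for the 2D case. This is a minimal illustration, not the BANG file structure the paper evaluates:

```python
from collections import defaultdict

def grid_cluster(points, cell_size):
    """Grid clustering sketch: bucket 2D points into grid cells, then
    merge adjacent (8-neighborhood) non-empty cells into clusters."""
    cells = defaultdict(list)
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells[key].append(p)
    seen, clusters = set(), []
    for start in cells:
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:
            cx, cy = stack.pop()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        clusters.append(members)
    return clusters
```

No pairwise distances are computed; the grid index alone determines which values end up in the same cluster, which is what makes grid methods attractive for large data sets.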
333 | Motion Deblurring for Space-Based Imaging on Sandroid CubeSats Using Improved Genetic Algorithm [abstract] Abstract: NanoSats have become a viable alternative to larger spacecraft, providing end users with access to space and functionality similar to mainstream missions. However, motion blur ruins images captured as NanoSats operate in low-Earth orbit at high speeds. In this paper, we address the problem of deblurring images degraded by shaking of the space-based imaging system or by movements of the observed targets. We propose a motion deblurring strategy that relies on the powerful on-board computing capability of our Sandroid CubeSats, a class of NanoSats, to compensate for the functional inadequacy of the hardware. Within this strategy we use an Improved Genetic Algorithm to obtain a better linear motion blur kernel and then perform non-blind deconvolution on a single image taken by the space-based imaging system on our Sandroid CubeSats to produce a deblurred result. Experimental results demonstrate the effectiveness of the proposed strategy. |
Xiaoqiang Wu, Fengge Wu, Junsuo Zhao |
337 | Best practices in debugging Kepler workflows [abstract] Abstract: In this paper we present various techniques related to Kepler development, debugging, and JVM customisation. We highlight some aspects of the development process that may help people work more effectively with Kepler, especially when developing new components for the Kepler platform. We present knowledge and ideas gained over time while working with Kepler tools throughout various projects and different applications of Kepler in existing environments. These ideas are presented to save time and effort for other people who are just starting their experience with the Kepler project. |
Michał Owsiak, Marcin Plociennik, Bartek Palak, Tomasz Zok, Olivier Hoenen |
338 | Accelerated hybrid approach for spectral problems arising in graph analytics [abstract] Abstract: Data sets such as graphs are growing so rapidly that performing meaningful data analytics in reasonable time is beyond the ability of common software and hardware for many applications. In this context, performance and efficiency are primary concerns. The spectral analysis of real networks exemplifies this problem. In this paper we present a solution based on Krylov methods which combines accelerators, to increase the throughput of graph traversals, with latency-oriented architectures, to solve small problems. We focus on a hybrid acceleration of the implicitly restarted Arnoldi method targeting large non-symmetric problems with irregular sparsity patterns. The result of this cooperation is an efficient solver for computing eigenpairs of real networks. Moreover, this approach can be applied to other methods based on coarsening. |
Alexandre Fender, Nahid Emad, Joe Eaton, Serge Petiton |
344 | Sliding Window-based Probabilistic Change Detection for Remote-sensed Images [abstract] Abstract: A recent probabilistic change detection algorithm provides a way of assessing changes on remote-sensed images that is more robust to geometric and atmospheric errors than existing pixel-based methods. However, its grid (patch)-based change detection results in coarse-resolution change maps and often discretizes continuous changes that occur across grid boundaries. In this study, we propose a sliding window-based extension of the probabilistic change detection approach to overcome these limitations. |
Seokyong Hong, Ranga Raju Vatsavai |
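The paper's window formulation is not given in the abstract; as a rough illustration of the sliding-window idea it extends, a per-pixel change map can be computed by comparing local neighborhoods of two co-registered images. The window size and threshold below are arbitrary placeholders, not values from the paper:

```python
def sliding_window_change_map(img_a, img_b, win=3, threshold=0.5):
    """Illustrative sketch: mark a pixel as changed when the mean absolute
    difference over its surrounding window exceeds a threshold.
    img_a/img_b are equally sized 2-D lists of floats."""
    h, w = len(img_a), len(img_a[0])
    r = win // 2
    change = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            # accumulate |a - b| over the (clipped) window around (y, x)
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += abs(img_a[yy][xx] - img_b[yy][xx])
                        count += 1
            change[y][x] = 1 if total / count > threshold else 0
    return change
```

Because windows overlap, a change straddling a grid boundary contributes to every window that covers it, which is the effect the paper's extension aims for.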
347 | Implementing OpenSHMEM for the Adapteva Epiphany RISC Array Processor [abstract] Abstract: The energy-efficient Adapteva Epiphany architecture exhibits massive many-core scalability in a physically compact 2D array of RISC cores with a fast network-on-chip (NoC). With fully divergent cores capable of MIMD execution, the physical topology and memory-mapped capabilities of the core and network translate well to partitioned global address space (PGAS) parallel programming models. Following an investigation into the use of two-sided communication using threaded MPI, one-sided communication using SHMEM is being explored. Here we present work in progress on the development of an OpenSHMEM 1.2 implementation for the Epiphany architecture. |
James Ross, David Richie |
355 | Research of ZigBee and Big Data Analysis based Pulse Monitoring System for Efficient Physical Training [abstract] Abstract: With the development of the Internet of Things (IoT), more and more wearable systems are being used to strengthen existing applications. Existing approaches, however, can neither monitor abnormal heart-rate conditions nor derive scientific and efficient training plans from heart-rate variation. We therefore propose a ZigBee and big data analysis based pulse monitoring system. The system is composed of multiple ZigBee-based pulse monitoring sensors, customized gateways and a back-end system. Individuals' pulse information is collected by the sensors and passed to the back-end system to support big data analysis of training conditions. To guarantee efficient collection of the pulse signal, we investigated photoelectric methods for dynamic and continuous heart-rate monitoring as well as comprehensive anti-jamming methods. Finally, using suitable big data analysis methods, we built a training model stratified by criteria such as age and mood. Results show the system can be used to improve the level of physical training, accumulate individuals' training data and support more efficient and scientific training plans. |
Hongliang Yuan, Jun Wang, Jun Liu |
356 | Formal Analysis of Collision Prevention of Two Wireless Personal Area Networks [abstract] Abstract: There are several challenges in the design and operation of Wireless Personal Area Networks (WPANs), such as wireless networking and communication, power consumption, and mobility. In particular, the operation of several WPANs within the same area can result in collision if two WPANs operate on the same wireless channel and come into close range. Methods for collision detection and prevention therefore need to be validated properly, given the sensitivity of WPAN applications. Existing approaches depend on paper-and-pencil methods to prove the correctness of proposed techniques, which may not be enough when practical issues, such as mobility, are taken into consideration. In this paper, we use formal analysis to verify the correctness of collision prevention conditions for two WPANs. |
Amjad Gawanmeh, Youssef Iraqi |
359 | A Multi-Objective Evolutionary Algorithm with Efficient Data Structure and Heuristic Initialization for Fault Service Restoration [abstract] Abstract: Service restoration in energy distribution systems is a complex optimization problem with many restrictions. After a fault occurs, the challenge is to obtain a service restoration plan that reconnects all the healthy out-of-service areas while satisfying all operational and technical constraints. Recent works have studied the use of meta-heuristics to find a sub-optimal solution with low computational complexity. One of these works uses a multi-objective algorithm with a data structure called node depth encoding. This paper proposes and analyses a new heuristic initialization procedure to be used with node depth encoding, which guarantees the analysis of all possible solutions while considering only a restricted number of switches incident to the out-of-service areas. The proposed methodology is evaluated by applying it to the real, large-scale distribution system of the city of Londrina (Brazil). The results show that the new heuristic improves overall performance, reducing the number of switch operations needed to reconfigure the distribution system. |
Anderson Soares |
367 | Random Neural Network based Intelligent Intrusion Detection for Wireless Sensor Networks [abstract] Abstract: Security and privacy of data are among the prime concerns in today's embedded devices. Primitive security techniques, such as signature-based malware detection and regular updates of a signature database, cannot effectively secure such resource-limited systems. Furthermore, energy-efficient wireless sensor nodes running on batteries cannot afford to implement cryptographic algorithms, as such techniques have a significant impact on system power consumption. Therefore, to operate wireless embedded devices securely, the system must be able to detect and prevent intrusions before the network (i.e., sensor nodes and base station) is destabilized by attackers. In this paper, we present an intrusion detection mechanism implemented as an intelligent security architecture based on Random Neural Networks (RNN). To validate the feasibility of the proposed security solution, we implemented it for an existing wireless sensor network system and practically demonstrated its functionality, successfully detecting the presence of suspicious sensor nodes within the system's operating range and anomalous activity in the base station with an accuracy of 97.23%. Overall, the proposed security solution incurs minimal performance overhead. |
Ahmed Saeed, Ali Ahmadinia, Abbas Javed, Hadi Larijani |
376 | GPU-based pedestrian detection for autonomous driving [abstract] Abstract: We propose a real-time pedestrian detection system for the embedded Nvidia Tegra X1 GPU-CPU hybrid platform. The pipeline is composed of the following state-of-the-art algorithms: Histograms of Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG) features extracted from the input image; a pyramidal sliding-window technique for foreground segmentation; and a Support Vector Machine (SVM) for classification. Results show an 8x speedup on the target Tegra X1 platform and a better performance/watt ratio than the desktop CUDA platforms in the study. |
Victor Campmany, Sergio Silva, Antonio Espinosa, Juan Carlos Moure, David Vazquez, Antonio Lopez |
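The pyramidal sliding-window stage mentioned in the abstract amounts to enumerating candidate detection windows over an image pyramid. A minimal sketch of that enumeration follows; the window size, stride and scale set are illustrative placeholders, not values from the paper:

```python
def sliding_windows(width, height, win=64, stride=32, scales=(1.0, 0.5)):
    """Yield (x, y, scale) for every candidate window over an image pyramid.
    Each scale shrinks the image; the fixed-size window then covers
    proportionally larger pedestrians at smaller scales."""
    for s in scales:
        w, h = int(width * s), int(height * s)
        for y in range(0, h - win + 1, stride):
            for x in range(0, w - win + 1, stride):
                yield x, y, s
```

In a full pipeline, each yielded window would have LBP/HOG features extracted and be scored by the SVM; on a GPU, all windows of one scale are typically classified in parallel.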
378 | Effects of Simulation Parameters on Naïve Creatures Learning to Safely Cross a Highway on Bimodal Threshold Nature of Success [abstract] Abstract: We describe a model of simulated cognitive agents (naïve creatures) learning to safely cross a cellular automaton based highway. These creatures have the ability to learn from each other. We investigate how the creatures' learning outcomes are affected by the model parameters (e.g., traffic density, the creatures' ability to change their crossing point, and the creatures' fear and desire). We observe and study a bimodal distribution in the number of successful creatures across various creature populations at the end of a simulation: this number is either low or high, depending on the values of the model parameters. |
Anna T. Lawniczak, Leslie Ly, Fei Yu |
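The abstract does not specify the highway CA's update rule. For intuition only, a single-lane cellular-automaton traffic step in the classic Nagel-Schreckenberg style (a standard choice for such highway models, not necessarily the one used in the paper) can be sketched as:

```python
import random

def nasch_step(positions, velocities, length, vmax=5, p_slow=0.0, rng=None):
    """One Nagel-Schreckenberg update on a circular single-lane road.
    positions/velocities are parallel lists with cars sorted by position."""
    rng = rng or random.Random()
    n = len(positions)
    new_v = []
    for i in range(n):
        # gap to the car ahead (circular road)
        gap = (positions[(i + 1) % n] - positions[i] - 1) % length
        v = min(velocities[i] + 1, vmax, gap)   # accelerate, never collide
        if v > 0 and rng.random() < p_slow:     # random slowdown
            v -= 1
        new_v.append(v)
    new_pos = [(positions[i] + new_v[i]) % length for i in range(n)]
    return new_pos, new_v
```

A crossing creature would then observe the gaps at its crossing point over successive steps and decide, from learned experience, whether it is safe to step onto the lane.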
405 | Non-invasive procedure to probe the route choices of commuters in rail transit systems [abstract] Abstract: Accurately determining the probability of the various route choices is critical to understanding the actual spatiotemporal flow of commuters and the instantaneous capacity of trains and stations. Here, we report a novel procedure, based solely on recorded tap-in/tap-out ticketing data, that infers the route choices of commuters in a rail transit system (RTS). We show that there exists a signature travel time distribution, in the form of a Gumbel type 1 function, from a given origin O to a destination D. Any particular route can then be considered a superposition of this mapping function, and one can compute the probability that a specific path, over other possible paths, is taken by a commuter from O to D. The procedure is demonstrated on different scenarios using travel data from smart fare cards of Singapore's RTS; results show that the forecasted characteristic profile deviates by less than 10^-5 from the actual distribution. We note that our method uses only two parameters, both of which can be experimentally accounted for. |
Christopher Monterola, Erika Fille Legara, Lee Kk, Pan Di, Terence Hung |
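To make the route-inference idea concrete, here is a hedged sketch: fit a type-1 (max) Gumbel distribution to observed origin-destination travel times by the method of moments, then score candidate routes by the normalized likelihood of an observed trip time. The route parameters in the usage below are invented for illustration, not Singapore RTS values:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel(samples):
    """Method-of-moments fit of a type-1 Gumbel: beta from the variance,
    mu from the mean minus gamma*beta."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    beta = math.sqrt(6.0 * var) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta

def gumbel_pdf(x, mu, beta):
    z = (x - mu) / beta
    return math.exp(-(z + math.exp(-z))) / beta

def route_probabilities(travel_time, routes):
    """routes: {name: (mu, beta)} per candidate path O -> D.
    Returns the normalized likelihood that the observed time came
    from each route."""
    lik = {r: gumbel_pdf(travel_time, mu, beta) for r, (mu, beta) in routes.items()}
    total = sum(lik.values())
    return {r: v / total for r, v in lik.items()}
```

Summing these per-trip probabilities over all trips between O and D would estimate the aggregate route-choice split the abstract describes.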
412 | The Bilingual Semantic Network of Computing Concepts [abstract] Abstract: We describe the construction of a bilingual (English-Russian / Russian-English) semantic network covering basic concepts of computing. To construct the semantic network, we used the Computing Curricula series created during 2000-2015 under the aegis of ACM and IEEE and the current standards of IT specialist training in Russia, as well as some other English-language and Russian-language sources. The resulting network can be used as a basic component of an intelligent information system that processes bilingual search queries while considering their semantics, and it can help support and guide automated translation of academic texts from one language to the other. The network can also support comparative analysis and integration of the programs and teaching materials for Computing and IT education in Russia and English-speaking countries. Finally, it can support cross-lingual information retrieval, knowledge management and machine translation, which play an important role in e-learning personalization and retrieval in the computing domain, allowing learners to benefit from online educational resources available in both languages. |
Evgeniy Khenner, Olfa Nasraoui |
415 | Detecting Extreme Events in Gridded Climate Data [abstract] Abstract: Detecting and tracking extreme events in gridded climatological data is a challenging problem on several fronts: algorithms, scalability, and I/O. Successful detection of these events will give climate scientists an alternate view of the behavior of different climatological variables, leading to enhanced scientific understanding of the impacts of events such as heat and cold waves and, on a larger scale, the El Niño Southern Oscillation. Recent advances in computing power and research in data science enable us to look at this problem from a different perspective than was previously possible. In this paper we present our computationally efficient algorithms for anomalous cluster detection on big climate data. We provide results on the detection and tracking of surface temperature and geopotential height anomalies, a trend analysis, and a study of relationships between the variables. We also identify the limitations of our approaches, future directions for research, and alternate approaches. |
Bharathkumar Ramachandra, Krishna Karthik Gadiraju, Ranga Raju Vatsavai, Dale P. Kaiser, Thomas P. Karnowski |
419 | A computational approach to investigate patterns of acute respiratory illness dynamics in the regions with distinct seasonal climate transitions [abstract] Abstract: In this work we present a set of computational algorithms for analyzing acute respiratory infection (ARI) incidence data in regions with distinct seasonal climate transitions. Their capabilities include: (a) collecting incidence data and correcting for under-reporting; (b) distinguishing phases of seasonal ARI dynamics (lower ARI level, higher ARI level, level transitions, epidemic outbreak); and (c) finding connections between ARI dynamics (epidemic and interepidemic) and weather factors. The algorithms are tested on data for Saint Petersburg, Moscow and Novosibirsk and compared with results for the Ile-de-France region (Paris and its suburbs). The results are used to clarify the underlying mechanisms of ARI dynamics in temperate regions. |
Vasiliy Leonenko, Sergei Ivanov, Yulia Novoselova |
430 | A parallel algorithm for modeling of dynamical processes on large stochastic Kronecker graphs [abstract] Abstract: The stochastic Kronecker graph (SKG) is one of the most widely used approaches to modeling complex networks, owing to the simplicity of its generative procedure and its ability to reproduce the properties of real graphs. In this paper, we present a novel parallel algorithm for modeling dynamical processes on large Poisson SKGs (PSKGs). The proposed algorithm combines the modeling of the spreading of a process with the generation of the vertices and edges involved in it. An experimental study of the algorithm was carried out for different initiator matrices and for graphs with widely varying numbers of vertices (from several million to one billion) using the supercomputer of Lobachevsky State University of Nizhni Novgorod, Russia. The results confirmed the applicability of the algorithm to mid-size HPC clusters, with efficiency varying on average from 0.2 to 0.7. |
Klavdiya Bochenina, Sergey Kesarev |
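The SKG generative procedure behind such models can be sketched R-MAT style: to place one edge in a graph with 2^k vertices, descend k levels of a 2×2 initiator matrix, choosing a quadrant at each level with probability proportional to its entry. This shows only the standard recursion, not the paper's parallel on-demand generation:

```python
import random

def sample_skg_edge(theta, k, rng):
    """Sample one edge of a stochastic Kronecker graph with 2^k vertices
    from a 2x2 initiator matrix theta (entries >= 0)."""
    flat = [theta[0][0], theta[0][1], theta[1][0], theta[1][1]]
    total = sum(flat)
    row = col = 0
    for _ in range(k):
        # pick one of the four quadrants proportionally to theta
        u = rng.random() * total
        acc, q = 0.0, 3
        for i, p in enumerate(flat):
            acc += p
            if u < acc:
                q = i
                break
        row = 2 * row + q // 2   # descend into the chosen quadrant
        col = 2 * col + q % 2
    return row, col
```

Generating edges lazily, only for vertices actually reached by the spreading process, is the idea the paper's algorithm exploits to handle billion-vertex graphs.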
434 | Evaluation of the Cardiovascular Risk in Middle-Aged Workers: an Artificial Neural Networks-Based Approach [abstract] Abstract: A method of the evaluation of the risk of cardiovascular events in the group of middle-aged male workers was developed on the basis of artificial neural networks (ANN). The list of analyzed variables included parameters of allostatic load and signs of myocardial involvement. The results were compared with traditional scales and risk charts (SCORE, PROCAM, and Framingham). A better prognostic value of the proposed model was observed, which makes it reasonable to use both additional markers and ANN. |
Anton Selivanov, Aleksandr Sboev, Svetlana Gorokhova, Viktor Pfaf, Dmitry Gudovkikh, Roman Rybka, Aleksey Serenko, Ivan Moloshnikov |
457 | Matching User Accounts across Social Networks based on Users' Messages [abstract] Abstract: Identifying users across social networks has received increasing attention. Existing methods mainly estimate the pairwise similarity between users in different social networks and rely largely on users' profiles and activities. However, privacy-conscious users may change their profiles and relationships. In this paper, we propose MUSIC (Modeling User Style for Identifying aCcounts across Social Networks), a framework that addresses this problem as follows: first, we build a model of a user's content style from the user's messages using word embedding technology; second, we reduce the problem of finding users across social networks to a classification problem on a single social network. Our experimental results validate the effectiveness and efficiency of the framework and show that either all of a user's messages or only the user's original posts provide nearly the same efficiency in identifying such users. |
Ying Sha, Qi Liang, Kaijian Zheng |
458 | Crowd turbulence with ABM and Verlet Integration on GPU cards [abstract] Abstract: Managing crowds is a key problem in a world with a growing population. Being able to predict and manage possible disasters directly affects the safety of crowd events. Current simulations focus mostly on navigation, but crowds have their own special characteristics and phenomena. Under specific conditions, a dense mass can develop crowd turbulence, which may lead to a disaster. Understanding this internal phenomenon is important for modeling behavior. In the particular case of crowd turbulence, agents are moved by the crowd through a series of pushes, an involuntary movement that can be hard to reproduce. We propose a simple model of this complex problem based on intentional and involuntary interactions among the agents. The implementation is a hybrid of the Verlet integration method and agent-based modeling. We implemented the proposed model using C and OpenCL and evaluated its performance on an Nvidia GPU. |
Albert Gutierrez-Milla, Francisco Borges, Remo Suppi, Emilio Luque |
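The hybrid of Verlet integration with pushing agents can be sketched as follows (in Python for brevity; the paper's implementation is C/OpenCL). The Verlet update is the standard position (Störmer) form; the push-force shape and its parameters are illustrative assumptions, not the paper's model:

```python
import math

def verlet_step(pos, prev_pos, force, mass, dt):
    """Position-Verlet update per coordinate:
    x(t+dt) = 2*x(t) - x(t-dt) + (F/m)*dt^2.
    Returns (new position, new previous position)."""
    new = tuple(
        2 * p - q + (f / mass) * dt * dt
        for p, q, f in zip(pos, prev_pos, force)
    )
    return new, pos

def push_force(pos_a, pos_b, radius=0.5, strength=100.0):
    """Involuntary push on agent a from agent b when their discs overlap.
    Linear in the overlap depth; radius/strength are made-up constants."""
    dx = pos_a[0] - pos_b[0]
    dy = pos_a[1] - pos_b[1]
    d = math.hypot(dx, dy)
    if d >= radius or d == 0.0:
        return (0.0, 0.0)
    mag = strength * (radius - d) / d
    return (mag * dx, mag * dy)
```

Per time step, each agent would sum its intentional (navigation) force with the pushes from overlapping neighbors and advance via `verlet_step`; since each agent's update is independent, the loop maps naturally onto one GPU work-item per agent.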
466 | Upwinding Second Order Lagrangian Particle Method for Euler Equations [abstract] Abstract: A new second order upwind Lagrangian particle method for solving Euler equations for compressible inviscid fluid or gas flows is proposed. Similar to smoothed particle hydrodynamics (SPH), the method represents fluid cells with Lagrangian particles and is suitable for the simulation of complex free surface / multiphase flows. The main contributions of our method, which is different from SPH in all other aspects, are (a) significant improvement of approximation of differential operators based on a polynomial fit via weighted least squares approximation and the convergence of prescribed order, (b) an upwind second-order particle-based algorithm with limiter, providing accuracy and long term stability, and (c) accurate resolution of states at free interfaces. Numerical verification tests demonstrating the convergence order for fixed domain and free surface problems are presented. |
Roman Samulyak, Hsin-Chiang Chen, Kwangmin Yu |
483 | Accelerating BWA Aligner Using Multistage Data Parallelization on Multicore and Manycore Architectures [abstract] Abstract: Rapid progress in next-generation sequencing (NGS) technologies has drastically decreased the cost and time required to obtain genome sequences. A series of powerful computing accelerators, such as GPUs and the Xeon Phi MIC, are becoming a common platform for reducing the computational cost of the most demanding stages of genomic data analysis. GPUs have received more attention in the literature so far. However, the Xeon Phi constitutes a very attractive alternative because applications do not need to be rewritten in a different, accelerator-specific programming language. Sequence alignment is a fundamental step in any variant analysis study, and many tools address this problem. We selected BWA, one of the most popular sequence aligners, and studied different data management strategies to improve its execution time on hybrid systems made of multicore CPUs and Xeon Phi accelerators. Our main contributions are new strategies that combine data splitting and index replication to achieve a better balance in the use of system memory and to reduce latency penalties. Our experimental results show significant speed-ups when such strategies are executed on our hybrid platform, taking advantage of the combined computing power of a standard multicore CPU and a Xeon Phi accelerator. |
Shaolong Chen, Miquel Senar |
500 | Integrated Machine Learning in Kepler [abstract] Abstract: We present a method to integrate multiple implementations of a machine learning algorithm in Kepler actors. This feature enables the user to compare accuracy and scalability of various implementations of a machine learning technique without having to change the workflow. These actors are based on the Execution Choice actor. They can be incorporated into any workflow to provide machine learning functionality. We describe a use case where actors that provide several implementations of k-means clustering can be used in a workflow to process sensor data from weather stations for predicting wildfire risks. |
Mai Nguyen, Daniel Crawl, Tahereh Masoumi, Ilkay Altintas |
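As an illustration of what one interchangeable implementation behind such an actor might look like, here is a plain Lloyd's k-means over 2-D points. The Kepler/Execution Choice wiring itself is omitted, and initializing with the first k points is a simplification made for determinism:

```python
def kmeans(points, k, iters=100):
    """Lloyd's k-means over 2-D points; one of several interchangeable
    implementations an Execution Choice actor could dispatch to."""
    centers = list(points[:k])  # simplistic deterministic initialization
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: (p[0] - centers[i][0]) ** 2
                                + (p[1] - centers[i][1]) ** 2)
            clusters[j].append(p)
        # update step: move each center to its cluster mean
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers
```

Swapping this for, say, a Spark MLlib or scikit-learn k-means behind the same actor interface is exactly the comparison the workflow approach enables, without changing the workflow itself.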
501 | Introducing Triquetrum, A Possible Future for Kepler and Ptolemy II [abstract] Abstract: Triquetrum is an open platform for managing and executing scientific workflows that is under development as an Eclipse project. Both Triquetrum and Kepler use Ptolemy II as their execution engine. Triquetrum presents opportunities and risks for the Kepler community. The opportunities include a possibly larger community for interaction and a path for Kepler to move from Kepler's one-off ant-based build environment towards a more common OSGi-based environment and a way to maintain a stable Ptolemy II core. The risks include the fact that Triquetrum is a fork of Ptolemy II that would result in package name changes and other possible changes. In addition, Triquetrum is licensed under the Eclipse Public License v1.0, which includes a patent clause that could conflict with the University of California patent clause. This paper describes these opportunities and risks. |
Christopher Brooks, Jay Jay Billings |
511 | A Hybrid Approach for Personalized Learning Path Construction [abstract] Abstract: E-learning systems are gaining importance as they provide "anywhere, anytime" learning, and are therefore seen as an alternative to the traditional classroom. As "one size does not fit all", these systems focus on providing a personalized learning experience to every learner. Catering to the different requirements of each individual is essential for improving performance; furthermore, personalized learning increases motivation and reduces cognitive load. The key challenge is to provide pedagogically rich, personalized sequencing of learning materials according to the divergent needs of learners. In this paper, a hybrid evolutionary approach is proposed to generate personalized learning paths. It uses Ant Colony Optimization (ACO) to generate an individual path and one or more social paths; the number of social paths generated depends on the previous users with similar features. A genetic algorithm is then applied to derive the optimal path from those paths. |
Vanitha V, Selvamani K |
525 | Data-driven travel demand modelling and agent-based traffic simulation in Amsterdam urban area [abstract] Abstract: The goal of this project is the development of a large-scale agent-based traffic simulation system for the Amsterdam urban area, validated on sensor data and adjusted for decision support in critical situations and for policy making in sustainable city development, emission control and electric car research. In this paper we briefly describe the agent-based simulation workflow and give the details of our data-driven approach for (1) modeling the road network of the Amsterdam metropolitan area extended by major national roads, (2) recreating the car-owner population distribution from municipal demographic data, (3) modeling agent activity based on a travel survey, and (4) modeling the inflow and outflow boundary conditions based on traffic sensor data. The models are implemented in scientific Python and the MATSim agent-based freeware. Simulation results for 46.5 thousand agents, with travel plans sampled from the model distributions, show that the travel demand model is consistent but should be improved to correspond with sensor data. The next steps in our project are extensive validation, calibration and testing of large-scale scenarios, including critical events like the major power outage in the Netherlands (doi:10.1016/j.procs.2015.11.039), and modelling emissions and heat islands caused by traffic jams. |
V.R. Melnikov, V.V. Krzhizhanovskaya, M.H. Lees, A.V. Boukhanovsky |
527 | Multi-agent simulation of passenger evacuation from a damaged ship under storm conditions [abstract] Abstract: In this paper, we present a multi-agent model for simulating evacuation processes that takes ship motions into account, together with a method for modeling crowd dynamics. To capture the specifics of evacuation in storm conditions, an information model has been developed based on three interrelated processes: sea wave dynamics; ship motions under the influence of sea waves; and crowd dynamics under ship motions. We developed a combined method for simulating agents' movement on the inclined decks of a ship. Our approach joins the classical Social Force model with the handling of collisions with obstacles. Depending on the specific problem, three different models of ship dynamics on irregular waves can be used. To run these simulations, a distributed test bench based on the cloud platform CALVIRE was developed. The outputs could be applied to design contingency plans that assist crew members within decision support systems (DSS). |
Marina Balakhontceva, Vladislav Karbovskii, Alexander Boukhanovsky, Serge Sutulo |