Poster papers (POSTER) Session 1
Time and Date: Before Lunch
Room: HG F Galerie
Chair: None
17 | Numerical Simulation of Magnetic Nanoparticles Injection in Two-phase Flow in Porous Media under Magnetic Field Effect [abstract] Abstract: In this paper, the problem of magnetic nanoparticles injection into a water-oil two-phase flow under an external permanent magnetic field is investigated. A mathematical model of the problem under consideration has been developed. We treat the water-nanoparticles suspension as a miscible mixture, while it is immiscible with the oil phase; therefore, the magnetization, density, and viscosity of the ferrofluid are obtained using miscible mixture relations. On the other hand, the pressure of the magnetized phase includes additional terms beyond the conventional thermodynamic pressure, namely, the ferrofluid dynamic pressure, the fluid magnetic pressure and the magneto-strictive pressure, while the magnetic normal pressure is neglected in this study. The countercurrent imbibition flow problem is taken as an example. Physical variables, including water-nanoparticles suspension saturation, nanoparticles concentration, and nanoparticles deposited on pore walls/throats, are investigated under the influence of the magnetic field. Moreover, variations of both permeability and porosity are presented in figures. |
Mohamed El-Amin, Ahmed Saad, Shuyu Sun and Amgad Salama |
27 | RBF Interpolation with CSRBF of Large Data Sets [abstract] Abstract: This contribution presents a new analysis of the properties of interpolation and approximation using Radial Basis Functions (RBF) as applied to large data sets. The RBF approach is a convenient method for scattered d-dimensional interpolation and approximation, used in GIS systems, in the solution of partial differential equations (PDEs), etc. The RBF method leads to the solution of a linear system of equations, and the computational complexity of this solution is nearly independent of the dimensionality of the problem. However, RBF methods are usually applied to small data sets with a small span of geometric coordinates.
This contribution explores fundamental properties of RBF interpolation and approximation for large data sets and a large span of geometric coordinates of the given data, especially with regard to the expected numerical stability and robustness of the computation. |
Vaclav Skala |
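To make the abstract's point concrete, the following is a minimal sketch of plain (globally supported) Gaussian RBF interpolation — not the CSRBF variant the paper analyzes; the function name, shape parameter, and toy data are illustrative assumptions.

```python
import numpy as np

def rbf_interpolate(centers, values, query, shape=3.0):
    """Scattered-data interpolation with Gaussian RBFs: solve the dense
    system A @ w = values with A[i, j] = exp(-(shape * ||x_i - x_j||)^2),
    then evaluate the interpolant at the query points. The dense solve is
    O(n^3), which is why large data sets call for compactly supported
    RBFs (CSRBF) that make A sparse."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    w = np.linalg.solve(np.exp(-(shape * d) ** 2), values)
    dq = np.linalg.norm(query[:, None, :] - centers[None, :, :], axis=-1)
    return np.exp(-(shape * dq) ** 2) @ w

# 2D example: the interpolant reproduces the data at the centers
rng = np.random.default_rng(0)
pts = rng.random((20, 2))
vals = pts[:, 0] + pts[:, 1]
approx = rbf_interpolate(pts, vals, pts)
```

The numerical-stability concerns raised in the abstract show up here directly: the dense kernel matrix becomes severely ill-conditioned as points cluster or the coordinate span grows.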
32 | SW-SGD: The Sliding Window Stochastic Gradient Descent Algorithm [abstract] Abstract: Stochastic Gradient Descent (SGD, or 1-SGD in our notation) is probably the most popular family of optimisation algorithms used in machine learning on large data sets due to its ability to optimise efficiently with respect to the number of complete training set data touches (epochs) used. Various authors have worked on data or model parallelism for SGD, but there is little work on how SGD fits with the memory hierarchies ubiquitous in HPC machines. Standard practice suggests randomising the order of training points and streaming the whole set through the learner, which results in extremely low temporal locality of access to the training set and thus, when dealing with large data sets, makes minimal use of the small, fast layers of memory in an HPC memory hierarchy. Mini-batch SGD with batch size n (n-SGD) is often used to control the noise on the gradient and make convergence smoother and easier to identify, but this can reduce learning efficiency with respect to epochs when compared to 1-SGD, whilst also having the same extremely low temporal locality. In this paper we introduce Sliding Window SGD (SW-SGD), which exploits temporal locality of training point access in an attempt to combine the advantages of 1-SGD (epoch efficiency) with those of n-SGD (smoother convergence and easier identification of convergence) by leveraging HPC memory hierarchies. We give initial results on part of the Pascal dataset that show that memory hierarchies can be used to improve SGD performance.
|
Thomas J. Ashby, Imen Chakroun and Tom Haber |
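A minimal sketch of the sliding-window idea, assuming a squared-loss linear model (the paper's actual loss, window policy, and hyperparameters are not specified here): each update reuses a deque of recently seen points, so those points stay hot in cache.

```python
import numpy as np
from collections import deque

def sw_sgd(X, y, window=8, lr=0.01, epochs=5, seed=0):
    """Sliding-window SGD: each update averages the gradient over the
    newest training point plus a small window of recently seen points,
    aiming for n-SGD-like smoothing at 1-SGD-like epoch efficiency."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    buf = deque(maxlen=window)                  # the sliding window
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            buf.append(i)
            idx = list(buf)
            # squared-loss gradient averaged over the window contents
            grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w -= lr * grad
    return w

# toy regression problem with known solution w* = [2, -1]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0])
w = sw_sgd(X, y, window=8, lr=0.05, epochs=20)
```

Unlike a mini-batch, the window advances by one point per update, so every training point is still touched once per epoch.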
44 | Curvature-Based Feature Detection for Head Modeling [abstract] Abstract: In the field of 3D head modeling and animation, the models often need to be enhanced by a set of labels or parameters. Feature points are often needed to mark important regions of the face. This additional information can be used to animate or deform the input model. However, automatic detection of features remains a challenging task. This paper presents a novel approach to feature detection based on curvature and its derived descriptors, such as shape index, curvedness and Willmore energy. Four important feature regions are detected using the proposed approach - eyes, nose, mouth and ears. For each region, the important feature points necessary for use in deformation-based head modeling are detected. Results show that the feature points are detected with sufficient accuracy for further use in mesh deformations.
|
Martin Prantl, Věra Skorkovská, Petr Martínek and Ivana Kolingerová |
56 | Behavioral Characterization of Criminality Spread in Cities [abstract] Abstract: Complex networks are commonly used to model urban street networks, which aids the analysis of criminal activities in cities. Despite several works focusing on this application, there is a lack of a clear methodology covering everything from data preparation to an in-depth analysis of crime behavior. In this sense, we propose a methodology for employing complex networks in the analysis of criminality spread within highly criminal areas of a city. Our methodology comprises tasks from urban crime mapping to criminality spread behavior analysis. We demonstrate it using a real crime dataset from the real-world city of San Francisco - CA, USA. Our results confirm the effectiveness of our methodology in analyzing crime behavior within highly criminal areas of the city. Hence, this paper supports further development and planning of public safety in cities by means of complex networks. |
Gabriel Spadon, Lucas C. Scabora, Paulo H. Oliveira, Marcus V. S. Araujo, Bruno B. Machado, Elaine P. M. Sousa, Caetano Traina-Jr and Jose F. Rodrigues-Jr |
70 | Column-wise Guided Data Imputation [abstract] Abstract: This paper investigates data imputation techniques for the pre-processing of datasets with missing values. The current literature focuses mainly on overall accuracy, evaluated by estimating the missing values of the dataset at hand; however, the predictions can be suboptimal when considering the model performance for each feature. To address this problem, a Column-wise Guided Data Imputation method (cGDI) is proposed. Its main novelty resides in the selection of the most suitable model from a multitude of imputation techniques for each individual feature, through a learning process on the known data. To assess the performance of the proposed technique, empirical experiments have been conducted on 13 publicly available datasets. The results show that cGDI outperforms two baselines and achieves comparable or greater estimation accuracy than four state-of-the-art methods widely applied to this problem. The sensitivity of cGDI to various missing rates and training set sizes is also tested in four different settings (combining 25% and 50% MCAR with 5-fold and 30-fold cross-validation). The use of this approach is proposed when considering multivariate imputation as a way of dealing with missingness in the data. Furthermore, cGDI has a straightforward implementation, and any other known imputation technique can easily be added. |
Alessio Petrozziello and Ivan Jordanov |
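A toy sketch of the per-column model-selection idea behind cGDI, with two deliberately simple candidate imputers (constant mean/median fills); the paper's actual candidate pool, holdout scheme, and scoring are richer, so everything below is an illustrative assumption.

```python
import numpy as np

def column_wise_impute(X):
    """Column-wise guided imputation sketch: for each column with gaps,
    score candidate imputers on held-out known values, then fill the
    real gaps with the winning candidate refit on all known values."""
    X = X.copy()
    candidates = {"mean": np.mean, "median": np.median}
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        if not miss.any():
            continue
        known = X[~miss, j]
        train, val = known[1::2], known[::2]     # simple holdout split
        # pick the candidate with the lowest squared error on the holdout
        best = min(candidates,
                   key=lambda n: np.mean((candidates[n](train) - val) ** 2))
        X[miss, j] = candidates[best](known)     # refit on all known values
    return X

# column 0 has known values [1, 2, 3]; its gap is filled with 2.0
data = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0], [np.nan, 40.0]])
filled = column_wise_impute(data)
```

Swapping in regression- or neighbor-based candidates only requires adding entries to the `candidates` dictionary, which mirrors the abstract's claim that other imputation techniques can easily be added.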
72 | Stability Analysis of the Modified IMPES Scheme for Two–Phase Flow in Porous Media Including Dynamic Capillary Pressure [abstract] Abstract: In this paper, the problem of two-phase flow in porous media including dynamic capillary pressure is studied numerically. The IMplicit Pressure Explicit Saturation (IMPES) scheme has been modified to solve the governing equations. The pressure equation is treated implicitly together with the saturation equation to obtain the pressure, while the saturation equation is solved explicitly to update the saturation at each time step. We present a stability analysis of the modified scheme and derive its stability condition. A comparison between static and dynamic capillary pressure is presented to illustrate the efficiency of the modified scheme. |
Mohamed El-Amin |
76 | Enabling efficient stencil code generation in OpenACC [abstract] Abstract: The OpenACC programming model simplifies programming for accelerator devices such as GPUs. Its abstract accelerator model defines a least common denominator for accelerator devices, so it cannot represent architectural specifics of these devices without losing portability. Therefore, this general-purpose approach delivers good performance on average, but it misses optimization opportunities for code generation and execution of specific classes of applications. In this paper, we propose stencil extensions to enable efficient code generation in OpenACC. Our results show that our stencil extensions may improve the performance of OpenACC by up to 28% on GPUs and 45% on CPUs. |
Alyson Pereira, Rodrigo Rocha, Márcio Castro, Luís Góes and Mario Dantas |
78 | Phenomena of Nonlinear Diffusion in Complex 3D Media [abstract] Abstract: Physical phenomena with critical blow-up regimes, simulated by the 3D nonlinear diffusion equation in a spherical shell, are studied. To solve the model numerically, the original differential operator is split along the radial coordinate, and an original technique of using two coordinate maps is employed for solving the 2D subproblem on the sphere. This results in 1D finite difference subproblems with simple periodic boundary conditions in the latitudinal and longitudinal directions, which lead to unconditionally stable implicit second-order finite difference schemes. The band structure of the resulting matrices allows applying fast direct (non-iterative) linear solvers using the Sherman-Morrison formula and the Thomas algorithm. The developed method is tested in several numerical experiments. Our tests demonstrate that the model allows simulating different regimes of blow-up in a complex 3D domain. In particular, heat localisation is shown to lead to the breakup of the medium into individual fragments, followed by the formation and development of self-organising patterns, which may have promising applications in thermonuclear fusion, nonlinear inelastic deformation and fracture of loaded solids and media, and other areas. |
Yuri Skiba and Denis Filatov |
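The solver combination the abstract mentions — Thomas algorithm plus the Sherman-Morrison formula for the periodic (cyclic) tridiagonal systems arising from the periodic boundary conditions — can be sketched as follows; the function names and the toy system are illustrative, not the authors' code.

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b,
    super-diagonal c and right-hand side d in O(n) (a[0] is unused)."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def cyclic_thomas(a, b, c, d):
    """Periodic tridiagonal solve: the wrap-around corner entries
    (A[0, n-1] = a[0], A[n-1, 0] = c[n-1]) are folded into a rank-one
    update handled by the Sherman-Morrison formula."""
    n = len(d)
    gamma = -b[0]
    bb = b.copy()
    bb[0] -= gamma
    bb[-1] -= a[0] * c[-1] / gamma
    u = np.zeros(n)
    u[0], u[-1] = gamma, c[-1]
    x = thomas(a, bb, c, d)                 # two plain tridiagonal solves
    z = thomas(a, bb, c, u)
    fact = (x[0] + a[0] * x[-1] / gamma) / (1.0 + z[0] + a[0] * z[-1] / gamma)
    return x - fact * z

# periodic system: diagonal 4, off-diagonals 1, wrap-around corners 1
n = 8
x = cyclic_thomas(np.ones(n), 4.0 * np.ones(n), np.ones(n), np.arange(1.0, n + 1))
```

The whole solve stays O(n), which is what makes the scheme's implicit steps "fast direct" rather than iterative.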
80 | Effective Learning with 2-Dimensional Active Selection on Feature and Instance [abstract] Abstract: Active learning is an effective way to minimize the number of queries on ground-truth labels from an oracle via human-computer interaction. Traditional active learning focuses on the selection of the unlabeled instances that are most informative for parameter optimization under the existing hypothesis. However, the variation of the optimal feature subset and the emergence of new class labels are neglected. In this paper, we propose a novel active learning method with 2-dimensional selection on both feature and instance. For feature-dimensional selection, discriminative feature selection is implemented to extract the smallest possible subset of features that can most accurately reveal the underlying classification labels. For instance-dimensional selection, an indeterminate model is adopted to balance between model update and model upgrade. Finally, the active selection in both dimensions is integrated into a unified framework for robust model learning. |
Xiao-Yu Zhang, Shupeng Wang, Lei Zhang, Chao Li, Yang Chen, Yong Wang and Binbin Li |
89 | Feasibility Study of Social Network Analysis on Loosely Structured Communication Networks [abstract] Abstract: Organised criminal groups are moving more of their activities from traditional physical crime into the cyber domain, where they form online communities that are used as marketplaces for illegal materials, products and services. The trading of illicit goods drives an underground economy by providing services that facilitate almost any type of cyber crime. The challenge for law enforcement agencies is to know which individuals to focus their efforts on in order to effectively disrupt the services provided by cyber criminals. This paper presents our study assessing the performance of graph-based centrality measures for identifying important individuals within a criminal network. These measures have previously been used on small and structured general social networks. In this study, we test the measures on a new dataset that is larger, loosely structured and resembles a network within cyber criminal forums. Our results show that well-established measures have weaknesses when applied to this challenging dataset. |
Jan William Johnsen and Katrin Franke |
98 | Using compiler analysis to improve the program understanding of legacy scientific code: A case study on an ACME Land module [abstract] Abstract: It is well known that the complexity of software systems becomes a barrier to further rapid model development and software modernization. In this study, we present a procedure that uses compiler-based technologies to better understand complex scientific code. The approach requires no extra software installation or configuration, and its software analysis can be transparent to developers and users. We design a sample code to illustrate the data collection and analysis procedure based on compiler technologies, and also present a case study in which information from interprocedural analysis (provided by PGI) is used to analyze a scientific function module from an Earth System Model. Since a huge amount of information can be collected and transformed from a variety of components of a modern compiler, we believe this study provides a very generic method for better understanding legacy scientific code. |
Dali Wang, Yu Pei and Oscar Hernandez |
107 | Selection of Random Walkers that Optimizes the Global Mean First Passage Time for Search in Complex Networks [abstract] Abstract: We investigate a method of finding optimal initial positions of multiple random walkers searching in a complex network, so that the global mean first passage time (GMFPT) to a general target can be minimized. Without specifying the properties of the target node, we optimize the efficiency of search by multiple random walkers through the minimization of the resources wasted, which maps onto the problem of minimizing the overlap of sites visited. According to the Fourier-transformed formula for the mean first passage time (MFPT) of multiple walkers, we can minimize the search time by reducing the overlap between the probability distributions of sites visited by the random walkers, or, equivalently, by maximizing the union of the sets of sites visited by all the random walkers over a preset number of time steps. We employ a mutation-only genetic algorithm (MOGA) to solve this optimization problem. In the algorithm, we introduce a representation of the chromosomes that describes the walks for a population of walkers with different starting origins, and a corresponding mutation matrix to modify them. From simulation results on two kinds of random networks (WS and BA), we find that the algorithm can produce satisfactory results in selecting origins for the walkers to achieve minimum overlap. This method thus provides guidance for setting up a search process by many random walkers on a complex network for a general target, since we have not used any information on the target in our analysis. |
Mu Cong Ding and Kwok Yip Szeto |
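A toy sketch of the mutation-only GA idea: a chromosome is a tuple of walker origins, and fitness rewards a large union of visited sites (i.e. low overlap). The simulation-based fitness, selection scheme, and parameters below are illustrative assumptions, not the paper's exact MOGA or mutation matrix.

```python
import random

def visited(adj, start, steps, rng):
    """Sites visited by one random walker in a fixed number of steps."""
    seen, node = {start}, start
    for _ in range(steps):
        node = rng.choice(adj[node])
        seen.add(node)
    return seen

def moga_origins(adj, k, steps=20, pop=20, gens=40, seed=0):
    """Mutation-only GA: keep the fitter half of the population and
    produce children by mutating one origin per survivor. Fitness is the
    size of the union of visited sites, estimated by one walk per origin."""
    rng = random.Random(seed)
    nodes = list(adj)

    def fitness(chrom):
        union = set()
        for s in chrom:
            union |= visited(adj, s, steps, rng)
        return len(union)

    popn = [tuple(rng.choices(nodes, k=k)) for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness, reverse=True)
        survivors = popn[: pop // 2]
        children = []
        for chrom in survivors:            # one random origin replaced
            c = list(chrom)
            c[rng.randrange(k)] = rng.choice(nodes)
            children.append(tuple(c))
        popn = survivors + children
    return max(popn, key=fitness)
```

On a ring graph, well-spread origins maximize the union, so the GA tends to push origins apart — the overlap-minimization behavior the abstract describes.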
113 | #RighttoBreathe why not? Social Media Analysis of the Local in the Capital City of India [abstract] Abstract: How are citizens experiencing the increase in air pollution? How can social media data be explored and used to develop deeper insights into citizens' observations, opinions and experiences? In this paper, these questions are examined using data from the micro-blogging social media application Twitter. A suitable framework, Health Practice ("HELP"), emerges from computational analysis and visualization of the content of the tweets. |
Nitin Upadhyay and Shalini Upadhyay |
118 | A Statistical Analysis of the Performance Variability of Read/Write Operations on Parallel File Systems [abstract] Abstract: Modern High Performance Computing systems typically include a complex file I/O path in which many sources of performance variability emerge. This paper reports experimental research on the performance variability of read and write operations on parallel file systems. To properly account for the inherent system variability and to obtain statistically significant and reproducible results, formal experimental design and analysis methods were employed in this study. Main and interaction effects were investigated for nine factors and two experimental environments. This research reveals that, under the evaluated conditions, six effects dominate I/O time, accounting for 99.32% of the performance variability. It further shows that some factors traditionally explored in I/O optimization research presented no statistical evidence of significant performance effects in this study's experiments. Moreover, high-level effects were identified through the interpretation of the set of statistically significant factors, providing a case for further research on the subject. |
Eduardo Camilo Inacio, Pedro Barbetta and Mario Dantas |
122 | Path Planning for Groups on Graphs [abstract] Abstract: This paper introduces a new method of planning paths for crowds, applicable to environments described by a graph. The main idea is to group members of the crowd by their common initial and target positions and then plan the path for one member of each group. If the crowd can be divided into a few groups this way, the proposed approach saves a huge amount of computation and memory in dynamic environments. |
Jakub Szkandera, Ivana Kolingerová and Martin Maňák |
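The grouping idea is simple enough to sketch directly: agents sharing a (start, goal) pair reuse one planned path instead of each running their own search. BFS and the data shapes below are illustrative assumptions; the paper targets dynamic environments with its own planner.

```python
from collections import deque

def plan_group_paths(graph, agents):
    """Group path planning sketch: plan one BFS path per distinct
    (start, goal) pair and share it among all agents in that group.
    `agents` is a list of (name, start, goal) triples."""
    def bfs(start, goal):
        prev, frontier = {start: None}, deque([start])
        while frontier:
            u = frontier.popleft()
            if u == goal:                  # reconstruct path back to start
                path = []
                while u is not None:
                    path.append(u)
                    u = prev[u]
                return path[::-1]
            for v in graph[u]:
                if v not in prev:
                    prev[v] = u
                    frontier.append(v)
        return None

    cache = {}                             # one planned path per group
    plans = {}
    for name, start, goal in agents:
        if (start, goal) not in cache:
            cache[(start, goal)] = bfs(start, goal)
        plans[name] = cache[(start, goal)]
    return plans
```

With g groups among n agents, planning cost drops from n searches to g, which is the source of the savings claimed in the abstract.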
133 | par2hier: towards vector representations for hierarchical content [abstract] Abstract: Word embeddings have received a lot of attention in the natural language processing area for their capability of capturing word semantics (e.g. word2vec, GloVe). The need to capture semantics at a higher, more abstract level led to the creation of models such as paragraph vectors for sentences and documents, and seq2vec for biological sequences. In this paper we illustrate an approach for creating vector representations of hierarchical content, where each node in the hierarchy is represented as a (recursive) function of its paragraph vector and the hierarchical vectors of its child nodes, computed via matrix factorization. We evaluate the effectiveness of our solution against flat paragraph vectors on a text categorization task, obtaining significant µF1 improvements. |
Tommaso Teofili |
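A minimal sketch of the recursive combination the abstract describes, using a truncated SVD as the matrix factorization and a simple additive combiner — the specific factorization rank, combiner, and normalization in par2hier may differ, so treat these choices as assumptions.

```python
import numpy as np

def hier_vector(tree, pvecs, node):
    """Hierarchical vector of `node`: its paragraph vector combined with
    a matrix-factorization summary (the dominant right-singular vector)
    of its children's hierarchical vectors, computed recursively.
    `tree` maps node -> list of children; `pvecs` maps node -> vector."""
    children = tree.get(node, [])
    if not children:
        return pvecs[node]                 # leaves: plain paragraph vector
    child_mat = np.stack([hier_vector(tree, pvecs, c) for c in children])
    # SVD of the children matrix; the top right-singular vector
    # serves as a low-rank summary of the subtree content
    _, _, vt = np.linalg.svd(child_mat, full_matrices=False)
    v = pvecs[node] + vt[0]
    return v / np.linalg.norm(v)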
142 | Recognizing Compound Entity Phrases in Hybrid Academic Domains in View of Community Division [abstract] Abstract: Classifying compound named entities in academic domains, such as the names of papers, patents and projects, plays an important role in enhancing many applications such as knowledge discovery and intellectual property protection. However, there is very little work on this novel and hard problem. Prior mainstream approaches mainly focus on classifying basic named entities (e.g. person names, organization names, Twitter named entities, and simple entities in specific sci-tech domains). This paper identifies four intrinsic characteristics of entity phrases in academic domains, and further proposes a generic, weakly supervised framework named GenericSegVal to address the problem. GenericSegVal consists of four key components: POS tagging, template extraction, text splitting and segment validation. We use context templates to roughly extract possible candidate compound entities, which reduces the search space of text splitting. We reduce the text splitting problem to the community division problem, which is addressed with a dynamic programming strategy. The construction of the indicative word set used in segment validation is reduced to the classical minimum set cover problem, which is also addressed with dynamic programming. Experimental results on classifying real-world science-technology compound entities show that GenericSegVal achieves a sharp increase in both precision and recall compared with a supervised bidirectional LSTM approach. |
Yang Yan, Tingwen Liu, Quangang Li, Jinqiao Shi, Li Guo and Yubin Wang |
147 | Social Contact Patterns in an Individual-based Simulator for the Transmission of Infectious Diseases (Stride) [abstract] Abstract: Individual-based models are convenient for studying the spread of infectious diseases in combination with heterogeneous social mixing patterns. In this paper, we present an open-source individual-based Simulator for the TRansmission of Infectious Diseases (Stride), in which social contact rates depend on both the context and the age of the individuals involved. Stride is a generic simulator for close-contact diseases with a focus on model flexibility and performance. Using a case study, we illustrate the social contact patterns in the model and show how mixing patterns can impact the disease dynamics. |
Elise Kuylen, Sean Stijven, Jan Broeckhove and Lander Willem |
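A toy sketch of the core mechanism — age- and context-dependent contact rates driving individual-level transmission. This is not Stride's model (Stride has calibrated contact matrices, multiple contexts per person, and realistic disease histories); the one-day infectious period and data shapes are simplifying assumptions.

```python
import random

def simulate(pop, rates, p_inf, days, seed=42):
    """Each day, every infectious person draws contacts within their
    context at an age-dependent rate; each contact with a susceptible
    transmits with probability p_inf. `pop` maps person id ->
    (age_group, context); `rates` maps age_group -> contacts per day.
    Person 0 seeds the outbreak; cases recover after one day."""
    rng = random.Random(seed)
    state = {p: "S" for p in pop}
    state[0] = "I"
    by_context = {}
    for p, (_, ctx) in pop.items():
        by_context.setdefault(ctx, []).append(p)
    for _ in range(days):
        new = []
        for p, (age, ctx) in pop.items():
            if state[p] != "I":
                continue
            for _ in range(rates[age]):        # age-dependent contact rate
                q = rng.choice(by_context[ctx])
                if q != p and state[q] == "S" and rng.random() < p_inf:
                    new.append(q)
            state[p] = "R"                     # recover after one day
        for q in new:
            state[q] = "I"
    return state
```

Varying `rates` per age group or splitting the population across contexts changes the outbreak size, which is the kind of mixing-pattern effect the case study examines.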
205 | Improving Operational Intensity in Data Bound Markov Chain Monte Carlo [abstract] Abstract: Typically, parallel algorithms are developed to leverage the processing power of multiple processors simultaneously, speeding up overall execution. At the same time, the discrepancy between DRAM bandwidth and microprocessor speed hinders reaching peak performance. This paper explores how operational intensity can be improved by performing useful computation during otherwise stalled cycles. While the proposed methodology is applicable to a wide variety of parallel algorithms, and at different scales, the concepts are demonstrated in a machine learning context. Performance improvements are shown for Bayesian logistic regression with a Markov chain Monte Carlo sampler, either with multiple chains or with multiple proposals, on a dense data set two orders of magnitude larger than the last-level cache on contemporary systems. |
Balazs Nemeth, Tom Haber, Thomas J. Ashby and Wim Lamotte |
222 | Cognitive Agents Success in Learning to Cross a CA Based Highway Comparison for Two Decision Formulas [abstract] Abstract: We study cognitive agents' success in learning to cross a Cellular Automaton (CA) based highway for two decision formulas used by the agents in their decision-making process. We describe the main features of the simulation model, focusing on the agents' decision-making process. The agents use a type of "observational social learning" strategy based on observing the performance of other agents, mimicking what worked for them and avoiding what did not. We investigate how incorporating the assessment of the outcomes of agents' waiting decisions into their decision-making process, which was previously based only on the assessment of the outcomes of their crossing decisions, affects the agents' success in learning to cross the highway. We measure the agents' success by the numbers of successful, killed and queued agents at the end of the simulation.
|
Anna T. Lawniczak and Fei Yu |
226 | Algorithm for simultaneous adaptation and time step iterations for the electromagnetic waves propagation and heating of the human head induced by cell phone [abstract] Abstract: In this paper, we propose a parallel algorithm with simultaneous adaptation and time step iterations for the solution of difficult coupled time-dependent problems. In particular, we focus on the problem of propagation of electromagnetic waves over the human head, induced by a cell phone antenna, coupled with the Pennes bioheat equation modeling the heating of the human head. Our algorithm allows the utilization of multiple cores for a faster solution of these time-dependent problems. Each core is assigned to a single time step. We utilize an hp-adaptive algorithm for the iterative solution of both the Maxwell and Pennes equations in every time step. We progress with parallel computations in subsequent time steps by using the solutions from previous time steps with a given accuracy. At the same time, we increase the accuracy of intermediate time step solutions by performing hp-adaptive computations in parallel, in every time step. |
Luis Emilio Garcia-Castillo, Ignacio Gomez-Revuelto, Marcin Łoś and Maciej Paszynski |
227 | Optimization of DBN using Regularization Methods Applied for Recognizing Arabic Handwritten Script [abstract] Abstract: Since the mid-2010s, deep learning has boomed and consequently had great success in a large field of applications such as speech and pattern recognition. Handwriting recognition is indeed among the most successful applications in the field of pattern recognition. Despite being quite mature, this field remains difficult for the Arabic handwritten script, and several questions are still a challenge. In this study, a Deep Belief Network (DBN) for Arabic handwritten script recognition is investigated. We then protect the DBN architecture against over-fitting by means of dropout and dropconnect, given their strong performance. When training with both regularization methods, randomly selected subsets of activations/weights are dropped. As a result, the evaluation on the HACDB database at the character level shows an improvement in classification error rate when using a DBN trained with dropout and dropconnect. |
Mohamed Elleuch, Najiba Tagougui and Monji Kherallah |
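The two regularizers differ only in what they mask, which a single dense layer makes clear. This is just the masking idea on one layer, not a DBN (which would be pretrained layer-wise as stacked RBMs); the function shape and scaling convention are assumptions.

```python
import numpy as np

def forward(x, W, rng, p_drop=0.5, train=True):
    """One fully connected ReLU layer with both regularizers applied:
    dropconnect zeroes individual weights, dropout zeroes whole
    activations. Inverted scaling by 1/(1 - p_drop) keeps expected
    activations unchanged, so no rescaling is needed at test time."""
    if train:
        keep_w = rng.random(W.shape) >= p_drop          # dropconnect mask
        h = np.maximum((W * keep_w / (1 - p_drop)) @ x, 0.0)
        keep_h = rng.random(h.shape) >= p_drop          # dropout mask
        return h * keep_h / (1 - p_drop)
    return np.maximum(W @ x, 0.0)
```

At test time (`train=False`) the layer is deterministic and uses the full weight matrix.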
249 | Efficient OpenCL-based concurrent tasks offloading on accelerators [abstract] Abstract: Current heterogeneous platforms with CPUs and accelerators have the ability to launch several independent tasks simultaneously in order to exploit concurrency among them. These tasks typically consist of data transfer commands and kernel computation commands. In this paper we develop a runtime approach to optimize the concurrency between data transfer and kernel computation commands in a multithreaded scenario where each CPU thread offloads tasks to the accelerator. It deploys a heuristic based on a temporal execution model for concurrent tasks and is able to establish a near-optimal task execution order that significantly reduces the total execution time, including data transfers. Our approach has been evaluated employing five different benchmarks composed of dominant-kernel and dominant-transfer real tasks. In these experiments our heuristic achieves speedups of up to 1.5x on AMD R9 and NVIDIA K20c accelerators and 1.3x on an Intel Xeon Phi device. |
Antonio J Lázaro-Muñoz, Jose González-Linares, Juan Gómez Luna and Nicolás Guil Mata |
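To illustrate why task ordering matters when transfers overlap kernels, here is a classic simplification (not the authors' heuristic or execution model): if each task is a transfer followed by a kernel, and the copy and compute engines each process one task at a time, the system is a two-machine flow shop, for which Johnson's rule yields the makespan-optimal order.

```python
def johnson_order(tasks):
    """Johnson's rule for a two-stage (transfer, kernel) pipeline:
    tasks with short transfers go first (sorted ascending by transfer),
    tasks with short kernels go last (sorted descending by kernel).
    Each task is a (transfer_time, kernel_time) pair."""
    first = sorted([t for t in tasks if t[0] <= t[1]], key=lambda t: t[0])
    last = sorted([t for t in tasks if t[0] > t[1]],
                  key=lambda t: t[1], reverse=True)
    return first + last

def makespan(order):
    """Pipeline makespan: a kernel starts once its transfer is done
    and the previous kernel has finished."""
    t_end = k_end = 0
    for transfer, kernel in order:
        t_end += transfer
        k_end = max(k_end, t_end) + kernel
    return k_end
```

Real OpenCL command queues are more complex (bidirectional transfers, multiple queues, multithreaded submission), which is why the paper's runtime uses a temporal execution model rather than this textbook rule.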
256 | On Patterns of Multi-domain Interaction for Scientific Software Development focused on Separation of Concerns [abstract] Abstract: This year's ICCS conference theme promotes the use of computational science as a means to foster multidisciplinarity and synergies with other fields.
Our thesis is that this trend towards multidisciplinarity should be accompanied by the use of best practices from the software engineering community, in order to avoid overly complex and tangled code that is difficult to validate, maintain and port.
In this paper we argue for the need of applying separation of concerns principles when the development involves scientists from various application fields. We overview several strategies that may be used to achieve this separation, focusing mainly on two approaches drawn from our previous experiences with multidisciplinary projects, addressing two distinct patterns of multi-domain interaction that may occur in scientific software development. |
Ileana Ober and Iulian Ober |
276 | Crowd Evacuation Modeling and Simulation Using Care HPS [abstract] Abstract: The problem of evacuating crowded closed spaces, such as discotheques, public exhibition pavilions or concert houses, has become increasingly important and has gained attention both from practitioners and from public authorities. This kind of problem can be modeled using Agent-Based Modeling techniques and then simulated in order to study evacuation strategies. In this paper, we present the Fira of Barcelona evacuation model implemented with Care HPS.
Our aim is to show that this model, albeit simple, can be expanded and adapted so that experts can test various scenarios and validate the outcome of their design. Some preliminary experiments are carried out using a parallel and distributed architecture, and their results are presented, validated and discussed. As the main contributions: i) we extended Care HPS with new partitioning approaches and other features to support this model; and ii) we found that the crowd evacuation problem has real-world bottlenecks, such as exits, that require deeper code optimization in order to decrease the total execution time. Finally, we draw some conclusions and point out ways in which this work can be further extended. |
Mohammed Alghazzawi, Ghazal Tashakor, Francisco Borges and Remo Suppi |
306 | Video face recognition through multi-scale and optimization of margin distributions [abstract] Abstract: Video-based face recognition has attracted much attention and made great progress in the past decade. It has a wide range of applications, such as video conferencing, human-computer interaction, judicial identification, video surveillance, and entrance control. Video-based face recognition methods consist of two main steps: the first constructs the face models used to represent the individual image sets; the second generates the similarity metric used to compare the face models with the query face. Inspired by image-set-based object classification methods, we present a multi-scale image-set-based collaborative representation method, optimized by margin distribution, for face recognition in videos. We use the collaborative representation method to obtain outputs for different sizes of sub-image sets, and obtain the final result by optimally combining these outputs. Experimental results on public video face databases demonstrate that our multi-scale image-set-based collaborative representation method outperforms a number of existing state-of-the-art ones. |
Gaopeng Gou, Zhen Li, Gang Xiong, Yangyang Guan and Junzheng Shi |
318 | MCM: A new MPI Communication Management for Cloud Environments [abstract] Abstract: On-demand cloud computing offers an attractive new dimension to High Performance Computing (HPC). Many HPC applications are moving to clouds due to characteristics of this environment such as elasticity, pay-per-use, and low maintenance cost. HPC applications normally use the Message Passing Interface (MPI). However, for this kind of application in cloud environments, network performance remains a challenge due to the effects of network virtualization: low-latency communication mechanisms are required, and the hidden network topology information makes it difficult to use existing optimizations based on network topology. In this paper a method named MPI Communication Management (MCM) is presented. This method characterizes the underlying network topology and analyzes the behavior of parallel applications in the cloud, improving the application's message latency. MCM achieves lower application execution time in case of congestion, obtaining better performance in clouds. Finally, experiments verify the functionality and improvements of MCM with MPI applications in the Amazon EC2 public cloud. |
Laura Espínola, Daniel Franco and Emilio Luque |
320 | Clustering of comorbidities based on conditional probabilities of diseases in hypertensive patients [abstract] Abstract: Treatment of chronic diseases, such as arterial hypertension, is always a difficult decision for cardiologists. As the majority of hypertensive patients are of older age, they also have many comorbid diseases. Optimized treatment should be targeted at the specific cluster of comorbidities. The objective of this study is to find effective algorithms for clustering comorbidities in hypertensive patients. A hierarchical structure of diseases, their types and groups, was extracted from text descriptions in the EHR database of the Federal Almazov North-West Medical Research Centre. Three approaches were tested to find connections between comorbidities: frequency analysis, association rule mining, and Bayesian networks. A robust cluster of diseases was found, containing the cardiovascular, endocrinological, musculoskeletal and nervous system groups. Further research will focus on investigating this cluster at the next level of the hierarchy and incorporating the time scale of patients' visits into the analysis. |
Nikita Bukhanov, Marina Balakhontсeva, Alexey Krikunov, Arthur Sabirov, Anna Semakova, Nadezhda Zvartau and Aleksandra Konradi |
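The conditional-probability view of comorbidity above can be sketched in a few lines: given patient records, estimate P(disease B | disease A) from co-occurrence counts and keep strongly connected pairs as cluster candidates. The records and the 0.5 threshold below are hypothetical illustrations, not data or parameters from the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical patient records: each is the set of diagnosed disease groups.
patients = [
    {"cardiovascular", "endocrinological"},
    {"cardiovascular", "endocrinological", "musculoskeletal"},
    {"cardiovascular", "nervous"},
    {"cardiovascular", "musculoskeletal"},
    {"endocrinological"},
]

def conditional_probabilities(records):
    """Return P(b | a) for every ordered disease pair observed together."""
    single = Counter()   # how many patients have each disease
    pair = Counter()     # how many patients have both diseases of a pair
    for rec in records:
        single.update(rec)
        for a, b in combinations(sorted(rec), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): pair[(a, b)] / single[a] for (a, b) in pair}

probs = conditional_probabilities(patients)
# Pairs that co-occur with high conditional probability are cluster candidates.
cluster_edges = {pair for pair, p in probs.items() if p >= 0.5}
```

Such an edge set can then be fed into any graph-clustering step; association rule mining and Bayesian networks, as tested in the paper, refine this basic counting idea.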
332 | Dual-mixed finite elements for the three-field Stokes model as a finite volume method on staggered grids [abstract] Abstract: In this paper, a new three-field weak formulation for Stokes problems is developed, and from this, a dual-mixed finite element method is proposed on a rectangular mesh. In our mixed methods, the components of the stress tensor are approximated by piecewise constant functions or $Q_1$ functions, while the velocity and pressure are discretized by the lowest-order Raviart-Thomas element and piecewise constant functions, respectively. Using certain quadrature rules, we demonstrate that this scheme can be reduced to a finite volume method on staggered grids, which is extensively used in computational fluid mechanics and engineering. |
Jisheng Kou and Shuyu Sun |
366 | Influenza peaks forecasting in Russia: assessing the applicability of statistical methods [abstract] Abstract: This paper compares the accuracy of different statistical methods for influenza peak prediction. The research demonstrates the performance of long short-term memory neural networks along with the results obtained by established statistical methods: the Serfling model, averaging, and point-to-point estimates. The prediction accuracy of these methods is compared with the accuracy of modeling approaches. Possible applications of the methods and ways to improve their accuracy are discussed. |
Vasiliy Leonenko, Klavdiya Bochenina and Sergey Kesarev |
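The Serfling model mentioned above is, in its classic form, a harmonic regression baseline y(t) = a + b·t + c·cos(2πt/period) + d·sin(2πt/period) fitted to weekly incidence; epidemic weeks are those far above the baseline. A minimal pure-Python fit via the normal equations is sketched below; the function names and the assumption of the classic four-term form are mine, not the paper's.

```python
import math

def serfling_fit(weeks, rates, period=52.0):
    """Fit the classic Serfling baseline
       y(t) = a + b*t + c*cos(2*pi*t/period) + d*sin(2*pi*t/period)
    by ordinary least squares (normal equations + Gaussian elimination)."""
    w = 2 * math.pi / period
    rows = [[1.0, t, math.cos(w * t), math.sin(w * t)] for t in weeks]
    n = 4
    # Build the normal equations A^T A x = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        aty[col], aty[piv] = aty[piv], aty[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution.
    coef = [0.0] * n
    for r in reversed(range(n)):
        coef[r] = (aty[r] - sum(ata[r][c] * coef[c]
                                for c in range(r + 1, n))) / ata[r][r]
    return coef  # [a, b, c, d]

def serfling_predict(coef, t, period=52.0):
    a, b, c, d = coef
    w = 2 * math.pi / period
    return a + b * t + c * math.cos(w * t) + d * math.sin(w * t)
```

Averaging and point-to-point estimates, also compared in the paper, replace this parametric baseline with simpler nonparametric ones.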
380 | Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training [abstract] Abstract: In this paper we tackle a bi-objective execution time and power consumption optimization problem concerning parallel applications constructed from sequences of atomic operations called processes, which can depend on each other through communication. The decision space consists of application parameters and process mappings. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for speech recognition. A simulation lasting over 2 hours accurately predicts real results from executions that take over 335 hours on a cluster with 8 GPUs. Additionally, the modeling process helps to reveal a data imbalance bottleneck in the application and to estimate its impact on application performance, giving incentives for further optimization. |
Paweł Rościszewski |
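The Pareto front mentioned above is the set of configurations not dominated in either objective. Extracting it from a list of simulated (time, power) results takes only a few lines; the sample configurations below are invented for illustration.

```python
def pareto_front(points):
    """Return the non-dominated (time, power) configurations:
    a point is dropped if some other point is at least as good in
    both objectives and differs from it (both objectives minimized)."""
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                       for q in points)]

# Hypothetical simulated results: (execution time [h], power [W]).
configs = [(10, 200), (12, 150), (15, 120), (11, 210), (16, 125)]
front = pareto_front(configs)
```

Here (11, 210) and (16, 125) are dominated and removed; the remaining three points form the time/power trade-off curve a user would choose from.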
394 | Imperative BSPlib-style Communications in BSML [abstract] Abstract: Bulk synchronous parallelism (BSP) offers an abstract and simple model of parallelism, yet realistically takes into account the communication costs of parallel algorithms. BSP has been used in many application domains. BSPlib and its variants are programming libraries for the C language that support the BSP style.
Bulk Synchronous Parallel ML (BSML) is a library for BSP programming in the functional language OCaml. It is based on an extension of the λ-calculus with parallel operations on a data structure named parallel vector. BSML offers a global view of programs: a BSML program can be seen as a sequential program working on a parallel data structure (seq of par), while a BSPlib program is written in SPMD style and understood as a parallel composition of communicating sequential programs (par of seq). The communication styles of BSML and BSPlib are also quite different.
This paper shows that BSPlib-style communications can be implemented on top of BSML, without the need to extend BSML parallel primitives. |
Frederic Loulergue |
399 | Heterogeneous Personal Computing: A Case Study in Materials Science [abstract] Abstract: The HRTE platform (Heterogeneous Run-Time Environment) enables the construction of problem-solving environments (PSEs) dedicated to a specific area that exploit the heterogeneous processing resources available in a desktop computer (e.g., GPUs). An HRTE-enabled PSE supports inter-operation between existing processing modules and new ones (HModules), optimizing the typical communication patterns of a PSE. HModules can register multiple implementations, allowing HRTE to select the target device at runtime.
The paper describes the main features of HRTE and the programming interface used to build HModules. An example of an application in the Materials Science area illustrates the approach and allows us to show some promising performance figures. |
Nuno Oliveira and Pedro Medeiros |
401 | Classification of Critical Points Using a Second Order Derivative [abstract] Abstract: This article presents a new method for the classification of critical points. A vector field is usually classified using only the Jacobian matrix of the approximated vector field. This work shows how an approximation using second order derivatives can be used for more detailed classification. An algorithm for calculating the curvature of the main axes is also presented. |
Michal Smolik and Vaclav Skala |
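The first-order classification that this article refines can be sketched concretely: for a 2D field, the trace and determinant of the Jacobian at a critical point determine whether it is a saddle, node, focus, or center. The function below is an illustrative implementation of that standard baseline, not of the paper's second-order extension.

```python
def classify_critical_point(j11, j12, j21, j22, eps=1e-12):
    """Classify a 2D critical point from its Jacobian J = [[j11, j12],
    [j21, j22]] via the trace/determinant criterion (first order only)."""
    tr = j11 + j22
    det = j11 * j22 - j12 * j21
    disc = tr * tr - 4 * det        # discriminant of the eigenvalue equation
    if det < -eps:
        return "saddle"             # real eigenvalues of opposite sign
    if disc >= 0:                   # real eigenvalues of the same sign
        return "unstable node" if tr > 0 else "stable node"
    if abs(tr) <= eps:              # purely imaginary eigenvalues
        return "center"
    return "unstable focus" if tr > 0 else "stable focus"
```

The second-order information discussed in the paper distinguishes cases this coarse scheme cannot, such as the curvature of the main axes near the critical point.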
403 | Building hybrid scientific similarity networks using research papers and social networks [abstract] Abstract: Research collaboration is very important in the modern scientific world, because it provides a wide range of opportunities, from knowledge transfer between departments, institutions or even countries, to experimental research where scientists also share materials and equipment. Because of this, the field of building and analyzing collaboration networks is growing in popularity. In this work we propose a methodology for building a hybrid scientific similarity network based on features obtained from two leading scientific platforms, ResearchGate and Scopus. Experimental evaluation demonstrates the good quality of the proposed approach and the ability of the hybrid network to reveal interconnections not distinguishable when using a single network. |
Gali-Ketema Mbogo and Alexander Visheratin |
426 | A Self-Enforcing Network as a Tool for Clustering and Analyzing Complex Data [abstract] Abstract: The Self-Enforcing Network (SEN), a self-organized learning neural network, is introduced as a clustering tool to define reference types in complex data. To achieve this, a cue validity factor is defined, which steers the clustering of the data. Finding reference types allows the analysis and classification of new data. The results show that a user can influence the clustering of data by SEN, thus allowing analysis of the data depending on specific interests. The described tool is demonstrated on concrete examples with clinical data and shows the potential of such a network for the analysis of complex data. |
Christina Kluever |
430 | A multicomponent QM study of H2 dissociation on small aluminum cluster [abstract] Abstract: H2 dissociation on a small aluminum cluster, Al2, is studied using our multicomponent quantum-mechanical (MC_QM) method, which can account for the nuclear quantum effect (NQE) of light nuclei such as the proton and deuteron. We demonstrate that no standard density functional can reproduce the CCSD(T) geometry of the van der Waals H2…Al2 complex well, even when the empirical dispersion correction is included. Our MC_QM calculations reveal that the NQE stabilizes each stationary-point structure, and that H2 dissociation is barrierless on the MC_QM effective potential energy hypersurface. The H/D isotope effects on the dissociation reaction are also analyzed. |
Taro Udagawa, Kimichi Suzuki and Masanori Tachikawa |
443 | High Performance and Enhanced Scalability for Parallel Applications using MPI-3’s non-blocking Collectives [abstract] Abstract: Collective communications occupy 20-90% of total execution times in many MPI applications. In this paper, we propose strategies for automatically identifying the most time-consuming collective operations that also act as scalability bottlenecks. We then explore the use of MPI-3’s non-blocking collectives for these communications. We also rearrange the codes to adequately overlap the independent
computations with the non-blocking collective communications. Applying these strategies for different graph and machine learning applications, we obtained up to 33% performance improvements for large-scale runs on a Cray supercomputer. |
Surendra Varma Pericherla and Sathish Vadhiyar |
452 | DomainWatcher: Detecting Malicious Domains Based on Local and Global Textual Features [abstract] Abstract: Malicious domains are usually involved in a range of illegal activities, posing threats to people's privacy and property. Therefore, detecting malicious domains to protect people from attacks has aroused widespread concern. This paper introduces a novel approach named DomainWatcher to detect malicious domains based on local and global textual features. Specifically, we extract local features from the domain to be classified and introduce two types of global textual features, namely the imitation feature and the bigram feature, by measuring the similarity between the domain to be classified and known domains. Experimental results on real-world data show that DomainWatcher achieves high precision, recall and F1-measure with low time consumption. |
Panpan Zhang, Tingwen Liu, Yang Zhang, Jing Ya, Jinqiao Shi and Yubin Wang |
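The bigram feature above is based on comparing character-bigram statistics of a candidate domain with those of known domains. A minimal sketch of that idea follows; the exact feature definition and similarity measure in DomainWatcher may differ, so the functions here are illustrative only.

```python
from collections import Counter

def bigram_features(domain):
    """Character-bigram counts of a domain's first label,
    a simple textual feature vector for the domain name."""
    label = domain.lower().split(".")[0]
    return Counter(label[i:i + 2] for i in range(len(label) - 1))

def cosine_similarity(f, g):
    """Cosine similarity between two bigram count vectors,
    used to compare a candidate domain with known domains."""
    dot = sum(f[k] * g[k] for k in f)
    n1 = sum(v * v for v in f.values()) ** 0.5
    n2 = sum(v * v for v in g.values()) ** 0.5
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

Algorithmically generated malicious domains tend to have bigram distributions far from those of benign domains, which is what makes such a feature discriminative.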
455 | Modeling perfusion by fractal tree and stochastic dynamics [abstract] Abstract: We present a novel approach to modeling perfusion processes. It consists of generating an arterial tree using a fractal approach; the advection-diffusion processes are then captured by GPU-accelerated stochastic dynamics. Preliminary results of this modeling and its possible application to real case scenarios are discussed. |
Katarzyna Jesionek, Dominik Szczerba, Jarosław Wasilewski and Marcin Kostur |
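The fractal generation of an arterial tree can be illustrated by a recursive binary branching scheme: each vessel segment spawns two children, shortened and rotated by fixed factors. The branching ratio and spread angle below are illustrative choices, not the paper's parameters.

```python
import math

def fractal_tree(x, y, angle, length, depth,
                 ratio=0.75, spread=math.radians(35)):
    """Recursively generate a binary fractal tree as a list of
    segments ((x0, y0), (x1, y1)) -- a minimal stand-in for a
    fractal arterial-tree construction."""
    if depth == 0:
        return []
    x1 = x + length * math.cos(angle)
    y1 = y + length * math.sin(angle)
    segments = [((x, y), (x1, y1))]
    for da in (-spread, spread):  # two child branches per segment
        segments += fractal_tree(x1, y1, angle + da,
                                 length * ratio, depth - 1)
    return segments
```

A depth-d tree has 2^d - 1 segments; in a perfusion model, particles undergoing stochastic advection-diffusion would then be released from the terminal segments.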
464 | Counterion Effect on the Mechanism of Gold (I)-Catalyzed Cycloisomerization of 3-Allenylmethylindoles: A Computational Study [abstract] Abstract: A DFT study of the gold-catalyzed cycloisomerization of 3-allenylmethylindoles has been carried out to elucidate the mechanism and regioselectivity of the reaction. The calculation results suggest that the reaction proceeds through initial coordination of the catalyst to the allene moiety, followed by intramolecular nucleophilic attack of the indole at the activated terminal carbon of the allene moiety. The results predict that the regioselectivity of the reaction depends on the substitution pattern of the allene moiety, which is consistent with the experimental observations. It is also found that the less reactive C2-position of the indole can be a possible nucleophilic site when there is a substituent on the terminal carbon of the allene moiety. Furthermore, our calculations show that the bistriflimide counterion is crucial for catalyzing the proton transfer steps and, at the same time, for controlling the regioselectivity of the reaction. The calculation results are in good agreement with the experimental observations. |
Mengistu Gemech Menkir and Shyi-Long Lee |
476 | Lost in Translation: The Fundamental Flaws in Star Coordinate Visualizations [abstract] Abstract: The Star Coordinate Plot is a simple and efficient technique for visualizing multidimensional data. Since the method was proposed in the early 2000s, several researchers have attempted to address its weakness of tending to project data points towards the origin of the star coordinate space, but so far no one has provided a critical analysis of the issue in the literature. As a result, this weakness of the Star Coordinate Plot is still not well understood. In this paper, we first explain the weakness by pointing out two fundamental errors in the design of the original Star Coordinate Plot. We show how these errors result in three categories of data points that are lost in the translation from n-dimensional space to the two-dimensional star coordinate space. We then propose the Enhanced Star Coordinate data visualization method to address these issues. Our experimental results show that the proposed method is superior to the original Star Coordinate Plot on several evaluation datasets. |
Swee Chuan Tan and Jeksen Tan |
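The projection-to-origin weakness analyzed above is easy to reproduce: in the classic Star Coordinate mapping, each dimension is a unit axis at angle 2πi/n and the 2D position is the sum of coordinate values times their axis vectors, so values on opposing axes cancel. The sketch below shows the classic mapping, not the paper's Enhanced Star Coordinate method.

```python
import math

def star_coordinates(point):
    """Project an n-dimensional point to 2D by the classic Star
    Coordinate mapping: axis i is a unit vector at angle 2*pi*i/n,
    and the plotted position is the vector sum of coordinate values
    along their axes."""
    n = len(point)
    x = sum(v * math.cos(2 * math.pi * i / n) for i, v in enumerate(point))
    y = sum(v * math.sin(2 * math.pi * i / n) for i, v in enumerate(point))
    return (x, y)
```

For example, the 4D point (1, 1, 1, 1) maps to the origin because its contributions on opposite axes cancel exactly, one of the ways distinct points become indistinguishable in the plot.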
495 | Performance and Scalability Study of FMM Kernels on Novel Multi- and Many-core Architectures [abstract] Abstract: We describe and evaluate the performance and scalability of efficient implementations of common Fast Multipole Method (FMM) kernels for modern multi-core (Intel Xeon Haswell),
many-core (Intel Xeon Phi Knights Landing) and Nvidia Pascal GPUs. We offer optimization guidelines for each kernel and architecture, and perform detailed performance and
scalability evaluations on these architectures. Task granularity issues are exposed, and saturation points (best threading configuration) for optimal performance are found for each kernel
and processor. These results motivate the use of hybrid execution models for FMM on heterogeneous architectures, in which per-kernel execution configurations are determined based on
platform characteristics. |
Antón Rey, Francisco Igual, Manuel Prieto-Matías and Jan Prins |
501 | Detection of tourists attraction points using Instagram profiles [abstract] Abstract: During their vacations, people try to explore new places. However, due to time limits it is not possible for tourists to see all interesting locations in the city they visit. Thus, they have to choose which places to visit first and which can be missed during the trip. In this paper, we shed light on the differences between the favorite places of tourists and of locals using their Instagram profiles. A time-window-based identification method is proposed to distinguish visitors from residents. In addition, a list of potential tourist attraction points in Saint Petersburg, Russia was obtained by analyzing locals' popular places. |
Ksenia Mukhina, Stepan Rakitin and Alexander Visheratin |
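The time-window identification idea above can be sketched simply: a user whose posts in the city span longer than some window is likely a resident, otherwise a visitor. The 14-day threshold and function name below are hypothetical; the paper's exact window is not stated here.

```python
from datetime import datetime, timedelta

def classify_user(post_times, window=timedelta(days=14)):
    """Label a user 'local' if their posting activity in the city
    spans more than the given time window, else 'tourist'."""
    if not post_times:
        return "unknown"
    span = max(post_times) - min(post_times)
    return "local" if span > window else "tourist"
```

Once users are split this way, attraction points can be ranked separately by tourist and local check-in counts to expose the differences the paper studies.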
502 | Formal Approach to Control Design of Complex and Dynamical Systems [abstract] Abstract: In this paper, we tackle the design of complex discrete event systems whose structures change as they evolve through different operating modes.
Based on the Supervisory Control Theory (SCT), we propose a formal hierarchical approach for controlling the design of these systems by using a multi-model approach;
it involves representing complex systems by a set of simple models, each of which describes the system in a given operating mode. The resulting framework resolves the problems of reconfiguration and of components common to multiple operating modes; it is designed using the Colored Petri Nets (CP-Nets) formalism and is illustrated with a flexible manufacturing system. |
Hela Kadri, Samir Ben Ahmed and Simon Collart-Dutilleul |
505 | Impacts of Building Geometries and Radiation Properties on Urban Thermal Environment [abstract] Abstract: The urban thermal environment strongly affects the livability and quality of life of residents, especially in a tropical megacity like Singapore. To identify the impacts of building geometries and radiation properties on the urban thermal environment, a numerical model is developed and different cases with specific configurations are simulated. It is found that the urban thermal environment is quite sensitive to building geometry resolution, building height and the gap width between buildings. With more detailed building geometry information, the model is able to capture the heating effects of trapped radiation in the recessed areas of buildings. Much hotter air accumulates around the rooftops of taller buildings, while the air temperature at 1 m above the ground is hotter around low-rise buildings. Absorptivity is found to contribute most to the thermal environment. With new materials of low absorptivity and high reflectivity, it is possible to improve the urban thermal environment. |
Ming Xu |
547 | Parallel Post-Processing of the Earth Climate Model Output [abstract] Abstract: The increasing resolution of climate and weather models has resulted in a fast growth of their data production. This calls for a modern and efficient approach to the post-processing of these data. To this end, we have developed a new software package
in Python that exploits the parallel nature of the post-processing workload to process the output of EC-Earth, a coupled atmosphere-ocean model. We describe the design of our post-processing package, and present benchmark results showing the resulting performance improvement. |
Gijs van den Oord and Rena Bakhshi |
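The parallelism exploited above comes from the fact that individual output variables (or files) can be post-processed independently. A minimal Python sketch of that embarrassingly parallel structure follows; it uses a thread pool and a dummy per-variable task for illustration, whereas the actual package's tasks and worker model (e.g. processes) may differ.

```python
from concurrent.futures import ThreadPoolExecutor

def process_variable(name):
    """Stand-in for post-processing one model output variable
    (e.g. regridding a field or computing a monthly mean)."""
    return name, len(name)  # dummy result for illustration

def postprocess(variables, workers=4):
    """Dispatch independent output variables to a worker pool and
    collect the results -- the parallel pattern the package exploits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_variable, variables))
```

Because the tasks share no state, the wall-clock time scales with the slowest variable rather than the sum of all of them, which is where the benchmark improvement comes from.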
552 | Towards a Fuzzy Cognitive Map for Opinion Mining [abstract] Abstract: In this paper, we propose a Fuzzy Cognitive Map (FCM) for opinion mining, with special attention to media influence on public opinion. In particular, we describe the FCM, its concepts and the relationships among them. Our opinion mining model is based on a multilevel FCM, which distributes the concepts according to the aspects that describe the elements composing public opinion: social, technological and biological. We carry out preliminary tests, and the results are very encouraging. |
Jose Aguilar, Oswaldo Terán, Hebert Sánchez, José-Antonio Gutiérrez-De-Mesa and Jorge Cordero |
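An FCM as used above iterates concept activations through a weighted causal graph until a fixed point. The sketch below uses one common update variant (concept keeps its own value plus weighted influences, squashed by a sigmoid); the weights and concepts are invented for illustration and are not the paper's model.

```python
import math

def fcm_step(weights, state, lam=1.0):
    """One synchronous FCM update: x_i' = sigmoid(x_i + sum_j w[j][i]*x_j),
    where weights[j][i] is the causal influence of concept j on concept i."""
    n = len(state)
    nxt = []
    for i in range(n):
        act = state[i] + sum(weights[j][i] * state[j]
                             for j in range(n) if j != i)
        nxt.append(1.0 / (1.0 + math.exp(-lam * act)))
    return nxt

def fcm_run(weights, state, steps=50, tol=1e-6):
    """Iterate the map until it converges to a fixed point (or step limit)."""
    for _ in range(steps):
        nxt = fcm_step(weights, state)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state
```

For instance, with a "media coverage" concept positively influencing a "public opinion" concept, increasing the former's activation drives the latter upward over the iterations, which is the qualitative behavior an opinion-mining FCM exploits.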
588 | GPU-Accelerated Real-Time Path Planning and the Predictable Execution Model [abstract] Abstract: Path planning is one of the key functional blocks for autonomous vehicles constantly updating their route in real-time. Heterogeneous many-cores are appealing candidates for its execution, but the high degree of resource sharing results in very unpredictable timing behavior. The predictable execution model (PREM) has the potential to enable the deployment of real-time applications on top of commercial off-the-shelf (COTS) heterogeneous systems by separating compute and memory operations, and scheduling the latter in an interference-free manner. This paper studies PREM applied to a state-of-the-art path planner running on a NVIDIA Tegra X1, providing insight on memory sharing and its impact on performance and predictability. The results show that PREM reduces the execution time variance to near-zero, providing a 3× decrease in the worst case execution time. |
Björn Forsberg, Daniele Palossi, Andrea Marongiu and Luca Benini |
598 | Application of Block-structured Adaptive Mesh Refinement to Particle Simulation [abstract] Abstract: We implemented particle treatment in the block-structured adaptive mesh refinement (AMR) framework which we have developed. In the AMR framework, the simulation domain is divided into multiple sub-domains, which are assigned to a number of processes for parallel calculation using MPI. A sub-domain is composed of multiple block-structured regions, each of which has a fixed number of grid cells. When high resolution is required in a certain region of a sub-domain, a block-structured region with refined grids, called a child block, is locally created. To apply this AMR framework to particle simulations such as particle-in-cell simulation, we set up several arrays for particle treatment for each sub-domain assigned to one process. These arrays are shared among all the blocks constituting the corresponding sub-domain. For the particle calculation in each block, we also set up several other arrays which are privately defined and used in each block. The functions of these particle-treatment arrays are described in this paper. To test the implementation of the particle treatment in the AMR framework, we performed test simulations adopting the sugarscape model, which was proposed for simulating an artificial society using many agents representing the inhabitants of a certain area. We treated the inhabitants as a bunch of particles and assigned the sugar amount at each grid point as the environment in a two-dimensional simulation domain. In the simulation, we initially place two peaks of sugar and randomly distribute the inhabitants. The simulation results show that the inhabitant agents are accelerated towards and gather at the two sugar peaks, as expected. When the inhabitant density exceeds a certain criterion, a child block with refined grids is adaptively created at the corresponding region.
We confirmed that the motions of the inhabitants crossing the block boundaries are smooth, which ensures that the particle treatment to the AMR framework has been correctly implemented. |
Hideyuki Usui, Saki Kito, Masanori Nunami and Masaharu Matsumoto |