Main Track (MT) Session 3
Time and Date: 11:00 - 12:40 on 11th June 2014
Room: Kuranda
Chair: E. Luque
186 | Triplet Finder: On the Way to Triggerless Online Reconstruction with GPUs for the PANDA Experiment
Abstract: PANDA is a state-of-the-art hadron physics experiment currently under construction at FAIR, Darmstadt. In order to select events for offline analysis, PANDA will use a software-based triggerless online reconstruction, performed with a data rate of 200 GB/s. To process the raw data rate of the detector in real time, we design and implement a GPU version of the Triplet Finder, a fast and robust first-stage tracking algorithm able to reconstruct tracks with good quality, specially designed for the Straw Tube Tracker subdetector of PANDA. We reduce the algorithmic complexity of processing many hits together by splitting them into bunches, which can be processed independently. We evaluate different ways of processing bunches, GPU dynamic parallelism being one of them. We also propose an optimized technique for associating hits with reconstructed track candidates. The evaluation of our GPU implementation demonstrates that the Triplet Finder can process almost 6 Mhits/s on a single K20X GPU, making it a promising algorithm for the online event filtering scheme of PANDA. |
Andrew Adinetz, Andreas Herten, Jiri Kraus, Marius Mertens, Dirk Pleiter, Tobias Stockmanns, Peter Wintz |
189 | A Technique for Parallel Share-Frequent Sensor Pattern Mining from Wireless Sensor Networks
Abstract: WSNs generate a huge amount of data in the form of streams, and mining useful knowledge from these streams is a challenging task. Existing works generate sensor association rules using the occurrence frequency of patterns with binary frequency (either absent or present) or the support of a pattern as a criterion. However, the binary frequency or support of a pattern may not be a sufficient indicator for finding meaningful patterns in WSN data, because it only reflects the number of epochs in the sensor data that contain that pattern. The share measure of sensorsets can reveal useful knowledge about the numerical values associated with sensors in a sensor database. Therefore, in this paper, we propose a new type of behavioral pattern called share-frequent sensor patterns, which considers the non-binary frequency values of sensors in epochs. To discover share-frequent sensor patterns from sensor datasets, we propose a novel parallel and distributed framework. In this framework, we develop a novel tree structure, called the parallel share-frequent sensor pattern tree (PShrFSP-tree), that is constructed at each local node independently. It captures the database contents with a single scan and generates candidate patterns using a pattern growth technique; the locally generated candidate patterns are then merged at the final stage to produce the global share-frequent sensor patterns. Comprehensive experimental results show that our proposed model is very efficient for mining share-frequent patterns from WSN data in terms of time and scalability. |
Md Mamunur Rashid, Dr. Iqbal Gondal, Joarder Kamruzzaman |
205 | Performance-Aware Energy Saving Mechanism in Interconnection Networks for Parallel Systems
Abstract: The growing processing power of parallel computing systems requires interconnection networks with a higher level of complexity and higher performance, which in turn consume more energy. Link components contribute a substantial proportion of the total energy consumption of the networks. Many researchers have proposed approaches that judiciously change the link speed as a function of traffic to save energy when the traffic is light. However, reducing the link speed increases the average packet latency and thus degrades network performance. This paper addresses that issue with several proposals. The simulation results show that the extended energy saving mechanism in our proposals outperforms the energy saving mechanisms in the open literature. |
Hai Nguyen, Daniel Franco, Emilio Luque |
214 | Handling Data-skew Effects in Join Operations using MapReduce
Abstract: For over a decade, MapReduce has been a prominent programming model for handling vast amounts of raw data in large-scale systems. This model ensures scalability, reliability and availability with reasonable query processing time. However, these large-scale systems still face some challenges: data skew, task imbalance, high disk I/O and redistribution costs can have disastrous effects on performance. In this paper, we introduce the MRFA-Join algorithm: a new frequency-adaptive algorithm based on the MapReduce programming model and a randomised key redistribution approach for join processing of large-scale datasets. A cost analysis of this algorithm shows that our approach is insensitive to data skew and ensures perfect balancing properties during all stages of join computation. These performance properties have been confirmed by a series of experiments. |
Mostafa Bamha, Frédéric Loulergue, Mohamad Al Hajj Hassan |
216 | Speeding-Up a Video Summarization Approach using GPUs and Multicore-CPUs
Abstract: The recent progress of digital media has stimulated the creation, storage and distribution of data, such as digital videos, generating a large volume of data and requiring efficient technologies to increase the usability of these data. Video summarization methods generate concise summaries of video contents and enable faster browsing, indexing and accessing of large video collections; however, these methods often perform slowly on long, high-quality videos. One way to reduce this long execution time is to develop a parallel algorithm that takes advantage of recent computer architectures that allow high parallelism. This paper introduces parallelizations of a summarization method called VSUMM, targeting either Graphics Processing Units (GPUs) or multicore Central Processing Units (CPUs), and ultimately a sensible distribution of computation steps onto both types of hardware to maximise performance, called "hybrid". We performed experiments using 180 videos varying frame resolution (320 x 240, 640 x 360, and 1920 x 1080) and video length (1, 3, 5, 10, 20, and 30 minutes). From the results, we observed that the hybrid version reached the best results in terms of execution time, achieving a 7x speedup on average. |
Suellen Almeida, Antonio Carlos Nazaré Jr, Arnaldo De Albuquerque Araújo, Guillermo Cámara-Chávez, David Menotti |