Efficient top-k query evaluation relies on practices that use auxiliary data structures to enable early termination. Such techniques were designed to trade off complex work in the buffer pool against costly access...
ISBN: 9789897583582 (print)
Growing needs in terms of latency, throughput and flexibility are driving the architectures of tomorrow's Radio Access Networks towards more centralized configurations that rely on cloud-computing paradigms. In these new architectures, digital signals are processed on a large variety of hardware units (e.g., CPUs, Field Programmable Gate Arrays, Graphics Processing Units). Optimizing model compilers that target these architectures must rely on efficient analysis techniques to generate optimized software for signal-processing applications. In this paper, we present a blocking combination of the iterative and worklist algorithms to perform static data-flow analysis on functional views denoted with UML Activity and SysML Block diagrams. We demonstrate the effectiveness of the blocking mechanism with reaching-definition analysis of UML/SysML models for a 5G channel decoder (receiver side) and a Software Defined Radio system. We show that, compared with a non-blocking combination of the iterative and worklist algorithms, the blocking mechanism significantly reduces the number of unnecessary visits of the models' control-flow graphs.
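For orientation, reaching-definition analysis with a plain worklist can be sketched as follows. This is a minimal textbook-style sketch, not the blocking combination proposed in the paper; the four-block control-flow graph, the block names `B1` to `B4`, the definition labels `d1` to `d3`, and the function name `reaching_definitions` are all hypothetical and chosen only for illustration.

```python
from collections import deque


def reaching_definitions(cfg, gen, kill):
    """Plain worklist reaching-definition analysis (illustrative only).

    cfg  maps each block to its list of successor blocks.
    gen  maps each block to the set of definitions it creates.
    kill maps each block to the set of definitions it overwrites.
    Returns (in_sets, out_sets, visits), where visits counts block visits.
    """
    # Build predecessor lists from the successor lists.
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)

    in_sets = {n: set() for n in cfg}
    out_sets = {n: set(gen[n]) for n in cfg}
    worklist = deque(cfg)
    visits = 0

    while worklist:
        n = worklist.popleft()
        visits += 1
        # IN[n] is the union of OUT[p] over all predecessors p.
        in_sets[n] = set().union(*(out_sets[p] for p in preds[n]))
        new_out = gen[n] | (in_sets[n] - kill[n])
        if new_out != out_sets[n]:
            out_sets[n] = new_out
            # Only successors of a changed block need to be revisited.
            for s in cfg[n]:
                if s not in worklist:
                    worklist.append(s)
    return in_sets, out_sets, visits


# Hypothetical example: B1 defines x (d1) and y (d2); B3 redefines x (d3)
# inside a loop headed by B2; B4 is the exit block.
cfg = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
gen = {"B1": {"d1", "d2"}, "B2": set(), "B3": {"d3"}, "B4": set()}
kill = {"B1": {"d3"}, "B2": set(), "B3": {"d1"}, "B4": set()}

in_sets, out_sets, visits = reaching_definitions(cfg, gen, kill)
```

The `visits` counter is included because the abstract measures its contribution in terms of avoided visits of the control-flow graph; how the blocking mechanism reduces that count is not described in the abstract and is not modeled here.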
ISBN: 9781713812883 (print)
Complex models in science and engineering need better techniques to execute simulations efficiently. The Discrete-Event System Specification (DEVS) formalism, a well-known technique for modeling and simulation, has been extended to run on parallel computers. Here, we present the design and implementation of a parallel version of the Cadmium DEVS simulator. We empirically evaluate the protocol on multithreaded architectures and discuss the performance obtained when applying these techniques to real problems.
ISBN: 9783030553395 (print)
The proceedings contain 18 papers. The special focus in this conference is on New Trends in Information and Communications Technology Applications. The topics include: parallel genetic algorithm for optimizing compiler sequences ordering; preface; latency evaluation of an SDN controlled by FlowVisor and two heterogeneous controllers; using machine learning to predict the sequences of optimization passes; a proposed method for feature extraction to enhance classification algorithms performance; heart disease prediction system using optimization techniques; optimizing energy in cooperative sensing cognitive radios; real-time sickle cell anemia diagnosis based hardware accelerator; energy-efficient particle swarm optimization for lifetime coverage prolongation in wireless sensor networks; iris recognition using localized Zernike features with partial iris pattern; analysis of Hollywood's film production using background subtraction visual activity index based on computer vision algorithm; breast cancer recognition by computer aided based on improved fuzzy c-mean and ANN; brain computer interface enhancement based on Stone's blind source separation and naive Bayes classifier; policing based traffic engineering fast reroute in SD-WAN architectures: approach development and investigation; performance comparison between traditional network and broadcast cognitive radio networks using idle probability; a developed cloud security cryptosystem using MinHash technique.
ISBN: 9781450369770 (print)
With micro-services continuously gaining popularity and low-power processors making their way into data centers, efficient execution of managed runtime systems on low-power architectures is also gaining interest. Apart from the inherent performance differences between high- and low-power processors, porting a managed runtime system to a low-power architecture may inadvertently introduce additional overheads and design trade-offs. In this work we investigate how the lack of strong hardware support for Self Modifying Code (SMC) in low-power architectures influences Just-In-Time (JIT) compilation and execution in modern virtual machines. In particular, we examine how low-power architectures with no or limited hardware support for SMC impose restrictions on call-site implementations when the latter need to be patchable by the runtime system. We present four different memory-safe implementations for call-site generation and discuss their advantages and disadvantages in the absence of strong hardware support for SMC. Finally, we evaluate each technique on different workloads using micro-benchmarks, and we evaluate the best two techniques on the DaCapo benchmark suite, showing performance differences of up to 15%.
ISBN: 9783030491895 (print)
The proceedings contain 21 papers. The special focus in this conference is on Mining Humanistic Data. The topics include: Threat Landscape of Next Generation IoT-Enabled Smart Grids; Towards a Smart Port: The Role of the Telecom Industry; A Graph-Based Extension for the Set-Based Model Implementing Algorithms Based on Important Nodes; A Sentiment-Based Hotel Review Summarization Using Machine Learning Techniques; An Advanced Deep Learning Model for Short-Term Forecasting U.S. Natural Gas Price and Movement; Fake News Detection Regarding the Hong Kong Events from Tweets; Improving Movie Recommendation Systems Filtering by Exploiting User-Based Reviews and Movie Synopses; The Converging Triangle of Cultural Content, Cognitive Science, and Behavioral Economics; Application and Algorithm: Maximal Motif Discovery for Biological Data in a Sliding Window; A New Approach to 5G and MEC Integration; Fingerprints Recognition System-Based on Mobile Device Identification Using Circular String Pattern Matching Techniques; Mining and Analysis of Air Quality Data to Aid Climate Change; Business Aspects of the Neutral Host Model: The Immersive Video Services Case; Combined 5G-Based Video Production and Distribution in a Crowded Stadium Event; Dynamic Network Slicing: Challenges and Opportunities; Dynamic Resource Allocation and Computation Offloading for Edge Computing System; Intelligent Orchestration of End-to-End Network Slices for the Allocation of Mission Critical Services over NFV Architectures; On the Prediction of Future User Connections Based on Historical Records in Wireless Networks; Programmable Edge-to-Cloud Virtualization for 5G Media Industry: The 5G-MEDIA Approach.
ISBN: 9781450362252 (print)
Compilation techniques for nested-parallel applications that can adapt to hardware and dataset characteristics are vital for unlocking the power of modern hardware. This paper proposes such a technique, which builds on flattening and is applied in the context of a functional data-parallel language. Our solution uses the degree of utilized parallelism as the driver for generating a multitude of code versions, which together cover all possible mappings of the application's regular nested parallelism to the levels of parallelism supported by the hardware. These code versions are then combined into one program by guarding them with predicates, whose threshold values are automatically tuned to hardware and dataset characteristics. Our unsupervised method, which statically clusters datasets to code versions, differs from autotuning work that typically searches for the combination of code transformations producing a single version, best suited for a specific dataset or on average for all datasets. By fully integrating our technique in the repertoire of a compiler for the Futhark programming language, we demonstrate significant performance gains on two GPUs for three real-world applications from the financial domain and for six Rodinia benchmarks.
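The idea of guarding several code versions with a tuned threshold predicate can be sketched in a few lines. This is only an illustration under loud assumptions, not the Futhark compiler's mechanism: the two versions, the function names, the four-lane cost model in `toy_cost`, and the candidate thresholds are all hypothetical.

```python
from math import ceil


def sum_rows_outer(rows):
    """Version A: use only the outer level; each row is reduced sequentially."""
    return [sum(row) for row in rows]


def sum_rows_flat(rows):
    """Version B: flatten the regular nest, then reduce fixed-width segments."""
    width = len(rows[0]) if rows else 0
    flat = [x for row in rows for x in row]
    return [sum(flat[i * width:(i + 1) * width]) for i in range(len(rows))]


def choose_version(rows, threshold):
    """Predicate guard: enough outer parallelism selects version A."""
    return "outer" if len(rows) >= threshold else "flat"


def sum_rows(rows, threshold):
    """Single program combining both versions behind the guard."""
    if choose_version(rows, threshold) == "outer":
        return sum_rows_outer(rows)
    return sum_rows_flat(rows)


def toy_cost(version, rows, lanes=4):
    """Hypothetical cost model for a machine with `lanes` parallel lanes."""
    n, w = len(rows), len(rows[0])
    if version == "outer":
        return ceil(n / lanes) * w      # rows in parallel, elements in sequence
    return ceil(n * w / lanes) + 2      # all elements in parallel, fixed overhead


def tune_threshold(candidates, datasets, cost):
    """Pick the candidate threshold with the lowest total cost on the datasets."""
    def total(t):
        return sum(cost(choose_version(d, t), d) for d in datasets)
    return min(candidates, key=total)


# Two hypothetical training datasets: few long rows, and many short rows.
few_long = [[1] * 100 for _ in range(2)]
many_short = [[1] * 4 for _ in range(64)]
best = tune_threshold([1, 4, 16, 128], [few_long, many_short], toy_cost)
```

Both versions compute the same result; only the mapping of the nested parallelism differs, so the guard can switch between them freely once the threshold has been tuned.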
ISBN: 9781450362771 (print)
Much research has been done over the last several decades to parallelize loops with recurrences. Recently, the sampling-and-reconstruction method was proposed to parallelize a broad class of loops with recurrences in an automated fashion, with a practical runtime approach. Although the parallelized code achieves linear scalability across multi-core architectures, the sequential merge inherent to this method prevents it from scaling on many-core architectures such as GPUs. At the same time, existing parallel merge approaches used for simple reduction loops cannot be directly and correctly applied to this method. Based on this observation, we propose new methods to merge partial results in parallel on GPUs and achieve linear scalability. Our approach involves refined runtime-checking rules to avoid unnecessary runtime check failures and reduce the overhead of reprocessing. We also propose a sample convergence technique to reduce the number of sample points so that communication and computation overhead is reduced. Finally, based on GPU architectural features, we develop optimization techniques to further improve performance. Our evaluation on a set of representative algorithms shows that our parallel merge implementation is substantially more efficient than sequential merge and achieves linear scalability on different GPUs.
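To make the notion of merging partial results concrete, the sketch below handles only the simplest case: a linear recurrence x_i = a_i * x_(i-1) + b_i, whose per-chunk summaries compose associatively and can therefore be merged pairwise in a tree. This is a swapped-in textbook technique, not the sampling-and-reconstruction method or the paper's GPU merge, which the abstract says needs more than a simple reduction-style merge. All names and data are hypothetical, and the tree levels run sequentially here.

```python
def summarize(chunk):
    """Summarize a chunk of (a, b) steps as (A, B) with x_out = A * x_in + B."""
    A, B = 1, 0
    for a, b in chunk:
        # Applying x -> a * x + b after x -> A * x + B.
        A, B = a * A, a * B + b
    return A, B


def combine(first, second):
    """Compose two summaries: apply `first`, then `second` (associative)."""
    A1, B1 = first
    A2, B2 = second
    return A2 * A1, A2 * B1 + B2


def tree_merge(parts):
    """Merge summaries pairwise, level by level; each level's combines
    are independent, which is what a parallel implementation would exploit."""
    parts = list(parts)
    while len(parts) > 1:
        merged = [combine(parts[i], parts[i + 1])
                  for i in range(0, len(parts) - 1, 2)]
        if len(parts) % 2:
            merged.append(parts[-1])  # odd element carries over unchanged
        parts = merged
    return parts[0]


def sequential(steps, x0):
    """Reference: evaluate the recurrence one step at a time."""
    x = x0
    for a, b in steps:
        x = a * x + b
    return x


# Hypothetical coefficients, split into chunks of two steps each.
steps = [(2, 1), (3, 0), (1, 5), (2, 2), (1, 1)]
chunks = [steps[i:i + 2] for i in range(0, len(steps), 2)]
A, B = tree_merge([summarize(c) for c in chunks])
x0 = 1
merged_result = A * x0 + B
```

With n chunks the tree has about log2(n) levels, compared with n - 1 dependent steps for a sequential merge.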
ISBN: 9781450371421 (print)
The proceedings contain 54 papers. The topics discussed include: discovering architectural rules in practice; self-accounting in architecture-based self-adaptation; towards a parallel template catalogue for software performance predictions; a recommender system for software architecture decision making; an exploration and experiment tool suite for code to architecture mapping techniques; the applicability of Palladio for assessing the quality of cloud-based microservice architectures; Kubow: an architecture-based self-adaptation service for cloud native applications; a bottom-up approach for reconstructing software architecture product lines; and a formal semantics for supporting the automated synthesis of choreography-based architectures.
ISBN: 9781728148946 (print)
The proceedings contain 13 papers. The topics discussed include: future energy systems introduction; on using graph signal processing for electrical load disaggregation; variations in residential electricity demand across income categories in urban Bangalore: results from primary survey; multi-tier big data introduction; wireless water quality monitoring and quality deterioration prediction system; improving throughput of big data applications; techniques for automating assessment of parallel programming assignments; reskilling to match the needs of exascale architectures; experiences of teaching parallel computing to undergraduates and post-graduates; and theoretical and practical approaches for teaching parallel code correctness.