ISBN: 9781509060580 (print)
Recent advances in computing and sensor technologies have facilitated the emergence of increasingly sophisticated and complex cyber-physical systems and wireless sensor networks. Moreover, integrating cyber-physical systems and wireless sensor networks with other contemporary technologies, such as unmanned aerial vehicles (i.e., drones) and fog computing, enables the creation of completely new smart solutions. Building upon the concept of a Smart Mobile Access Point (SMAP), which is a key element of a smart network, we propose a novel hierarchical placement strategy for SMAPs to improve the scalability of SMAP-based monitoring systems. SMAPs predict communication behavior based on information collected from the network and select the best approach to support the network at any given time. To improve network performance, they can autonomously change their positions; the placement of SMAPs therefore plays an important role in such systems. Initial placement of SMAPs is an NP problem. We solve it using a parallel implementation of a genetic algorithm with an efficient evaluation phase. The adopted hierarchical placement approach is scalable; it enables the construction of arbitrarily large SMAP-based systems.
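As a rough illustration of how a genetic algorithm can search for an initial placement, the sketch below evolves k access-point positions for a set of sensors. Everything in it is an assumption made for illustration and not the authors' method: the unit-square layout, the cost function (sum of sensor-to-nearest-access-point distances), the operators, and the names `coverage_cost` and `evolve_placement` are hypothetical. The evaluation step is sequential here; it is the part a parallel implementation would distribute.

```python
import math
import random

def coverage_cost(placement, sensors):
    """Assumed cost: sum of distances from each sensor to its nearest access point."""
    return sum(min(math.dist(s, p) for p in placement) for s in sensors)

def clamp(v):
    """Keep a coordinate inside the unit square."""
    return min(1.0, max(0.0, v))

def evolve_placement(sensors, k, generations=50, pop_size=30, seed=0):
    """Toy genetic algorithm that picks k access-point positions in [0, 1] x [0, 1]."""
    rng = random.Random(seed)
    population = [[(rng.random(), rng.random()) for _ in range(k)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation phase: each individual is scored independently,
        # so this is the natural step to run in parallel.
        population.sort(key=lambda ind: coverage_cost(ind, sensors))
        survivors = population[: pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]            # one-point crossover
            i = rng.randrange(k)                 # mutate one position slightly
            x, y = child[i]
            child[i] = (clamp(x + rng.gauss(0, 0.05)),
                        clamp(y + rng.gauss(0, 0.05)))
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ind: coverage_cost(ind, sensors))
```

Because the best half of each generation is carried over unchanged, the best cost never gets worse from one generation to the next.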
ISBN: 9783319712543; 9783319712550 (print)
The paper presents experience with incorporating Big Data technologies into introductory parallel and distributed computing courses and with building a service-oriented infrastructure to support practical exercises involving these technologies. The presented approach helped to provide a smooth practical experience for students with different technical backgrounds by enabling them to run and test their MapReduce and Spark programs on a provided Hadoop cluster via convenient web interfaces. This approach also enabled automation of routine actions related to the submission of programs to a cluster and the evaluation of programming assignments.
ISBN: 9783319619828; 9783319619811 (print)
This paper introduces aspect libraries, a unit of modularity in parallel programs with compositional properties. Aspects address the complexity of parallel programs by enabling the composition of (multiple) parallelism modules with a given (sequential) base program. This paper illustrates the introduction of parallelism using reusable parallel libraries, coded in AspectJ. These libraries provide performance comparable to traditional parallel programming techniques and enable the composition of multiple parallelism modules (e.g., shared memory with distributed memory) with a given base program.
ISBN: 9781509049318 (print)
Data-race-free (DRF) parallel programming is becoming a standard, as the newly adopted memory models of mainstream programming languages such as C++ and Java impose data-race freedom as a requirement. We propose compiler techniques that automatically delineate extended data-race-free (xDRF) regions, namely regions of code that provide the same guarantees as synchronization-free regions (in the context of DRF codes). xDRF regions stretch across synchronization boundaries, function calls, and loop back-edges while preserving data-race-free semantics, thus increasing the optimization opportunities exposed to the compiler and to the underlying architecture. Our compiler techniques precisely analyze the threads' memory access behavior and data sharing in shared-memory, general-purpose parallel applications and can therefore infer the limits of xDRF code regions. We evaluate the potential of our technique by employing the xDRF region classification in a state-of-the-art, dual-mode cache coherence protocol. Larger xDRF regions reduce the coherence bookkeeping and enable optimizations for performance (6.8%) and energy efficiency (11.7%) compared to a standard directory-based coherence protocol.
ISBN: 9781538634370 (print)
The functionality of DRAMs, especially their state transitions, is described in JEDEC standards. These standards contain a finite state machine that is intended to provide an overview of the possible state transitions and the commands that control them. However, today's DRAMs are highly concurrent devices, as they provide bank parallelism. The state diagram used in JEDEC standards does not model this concurrency and, furthermore, is misleading in several respects. In this paper, we present for the first time an easily comprehensible model of the DRAM states and transitions, using a Petri net, which also covers the DRAM concurrency.
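To make the concurrency argument concrete, the following sketch models each bank as its own pair of Petri net places, so that one bank can be active while another is idle, which a single global state machine cannot express. It is a heavily simplified illustration and not the net from the paper: the place names, the restriction to the ACT (activate), PRE (precharge), and RD (read) commands, and the absence of timing and refresh are all assumptions, and `PetriNet` and `dram_net` are hypothetical names.

```python
class PetriNet:
    """Minimal place/transition net with unit arc weights."""

    def __init__(self):
        self.marking = {}       # place name -> token count
        self.transitions = {}   # transition name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

def dram_net(num_banks=2):
    """Assumed toy model: every bank has an 'idle' and an 'active' place."""
    net = PetriNet()
    for b in range(num_banks):
        idle, active = f"bank{b}.idle", f"bank{b}.active"
        net.marking[idle] = 1     # each bank starts precharged (idle)
        net.marking[active] = 0
        net.add_transition(f"ACT{b}", [idle], [active])
        net.add_transition(f"PRE{b}", [active], [idle])
        # A read needs an open row and leaves the bank active.
        net.add_transition(f"RD{b}", [active], [active])
    return net
```

After firing `ACT0`, the transition `RD0` is enabled while `RD1` is not, and `ACT1` can still fire independently: the marking records the state of every bank at once.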
ISBN: 9783319629322; 9783319629315 (print)
This paper proposes a parallel implementation of the cellular-automata interference algorithm for two waves, using fragmented programming technology and the LuNA system that is based on it. The technology is based on a strategy of data flow control. Unlike existing systems and technologies, LuNA provides a unified technology for implementing parallel programs on a heterogeneous multicomputer. A LuNA program contains a description of data fragments, computational fragments, and the information dependencies between them. In this work, the LuNA program was executed on a computational cluster with homogeneous nodes. A comparison of the LuNA and MPI implementations showed that the execution time of the LuNA program exceeded that of the MPI program. This is due to the peculiarities of the algorithms used for the distribution, search, and transfer of data and computation fragments between the nodes of a cluster. The complexity of writing the LuNA program is much lower than that of writing the MPI program.
ISBN: 9781538619964 (print)
Bayesian Network algorithms are widely applied in the fields of bioinformatics, document classification, big data, and marketing informatics. In this paper, several Bayesian Network algorithms are evaluated, including Naive Bayes, Tree Augmented Naive Bayes, k-BAN, and k-BAN with Order Swapping. The algorithms are implemented using Scala and compared with the bnlearn library in R and Weka. Several datasets with varying numbers of attributes and instances are used to test the accuracy and efficiency of the implementations of the algorithms provided by the three packages. When handling huge datasets, issues involving accuracy, efficiency, and serial vs. parallel execution become more critical and should be addressed. We implemented several parallel algorithms as well as an efficient way to perform cross-validations, resulting in significant speedups.
ISBN: 9781538626849 (print)
Because of the wide use of randomized scheduling in concurrency testing research, it is important to understand randomized scheduling and its limitations. This work analyzes how randomized scheduling discovers concurrency bugs by focusing on the probabilities of the two possible orders of a pair of events. Analysis shows that the disparity between probabilities can be large for programs that encounter a large number of events during execution. Because sets of ordered event pairs define conditions for discovering concurrency bugs, this disparity can make some concurrency bugs highly unlikely. The complementary nature of the two possible orders also indicates a potential trade-off between the probability of discovering frequently-occurring and infrequently-occurring concurrency bugs. To help address this trade-off in a more balanced way, randomized-stride scheduling is proposed, where scheduling granularity for each thread is adjusted using a randomized stride calculated based on thread length. With some assumptions, strides can be calculated to allow covering the least likely event pair orders. Experiments confirm the analysis results and also suggest that randomized-stride scheduling is more effective for discovering concurrency bugs compared to the original randomized scheduling implementation, and compared to other algorithms in recent literature.
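The disparity between the two orders of an event pair can be illustrated with a small simulation. The model below is an assumption chosen for simplicity and is not taken from the paper: two threads, a scheduler that picks either thread with probability 1/2 at every step, event `a` as the first step of thread A, and event `b` as the n-th step of thread B. Under this model, `b` precedes `a` only if B is chosen n times in a row, so the probability is 0.5 ** n and shrinks exponentially with thread length. The names `b_before_a` and `estimate` are hypothetical.

```python
import random

def b_before_a(n, rng):
    """Simulate one run; return True if event b (B's n-th step) precedes
    event a (A's first step) under a uniform random scheduler."""
    steps_b = 0
    while True:
        if rng.random() < 0.5:
            return False        # A was scheduled first, so a precedes b
        steps_b += 1            # B advanced by one step
        if steps_b == n:
            return True         # B reached b before A ever ran

def estimate(n, trials=20000, seed=0):
    """Monte Carlo estimate of P(b before a); the exact value is 0.5 ** n."""
    rng = random.Random(seed)
    return sum(b_before_a(n, rng) for _ in range(trials)) / trials
```

For n = 20 the exact probability is already below one in a million, which is the kind of unlikely order that adjusting per-thread scheduling granularity aims to make reachable.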
ISBN: 9783319589435; 9783319589428 (print)
Global-scale human simulations have applications in diverse fields such as economics, anthropology, and marketing. The sheer number of agents, however, makes them extremely sensitive to variations in algorithmic complexity, resulting in potentially prohibitive computational resource costs. In this paper, we show that the computational capability of modern servers has increased to the point where billions of individual agents can be modeled on moderate institutional resources and (in a few years) on high-end consumer systems. We close by proposing future frameworks to enable collaborative modelling of the global human population.
ISBN: 9783319522777 (digital); 9783319522777; 9783319522760 (print)
High Efficiency Video Coding (HEVC) is able to reduce the bit-rate by up to 50% compared to H.264/AVC, using increasingly complex computational processes for motion estimation. In this paper, some motion estimation operations are parallelised using the Open Computing Language (OpenCL) on a Graphics Processing Unit (GPU). The parallelisation strategy is three-fold: calculation of the distortion measure using 4 x 4 blocks, accumulation of distortion measure values for different block sizes, and calculation of local minima. Moreover, we use 3D arrays to store the distortion measure values and the motion vectors. Two 3D arrays are used for transferring data from the GPU to the CPU to continue the encoding process. The proposed parallelisation reduces the execution time by 52.5% on average compared to the HEVC Test Model. Additionally, the impact on compression efficiency is negligible: an increase in the BD-BR of 2.044% on average and a reduction in the BD-PSNR of 0.062% on average.
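The three steps of the strategy can be sketched sequentially as follows. This is an illustration under stated assumptions, not the paper's OpenCL kernels: the abstract does not name the distortion measure, so the sum of absolute differences (SAD) is assumed here, blocks are plain nested lists, and the function names `sad_4x4_grid`, `accumulate`, and `best_candidate` are hypothetical. The point it shows is that distortion for larger blocks can be obtained by summing 4 x 4 results instead of recomputing from pixels.

```python
def sad_4x4_grid(cur, ref):
    """Step 1: SAD of every 4x4 sub-block of two equally sized square blocks
    whose side length is a multiple of 4."""
    n = len(cur)
    grid = []
    for by in range(0, n, 4):
        row = []
        for bx in range(0, n, 4):
            row.append(sum(abs(cur[y][x] - ref[y][x])
                           for y in range(by, by + 4)
                           for x in range(bx, bx + 4)))
        grid.append(row)
    return grid

def accumulate(grid, size):
    """Step 2: combine 4x4 SADs into SADs of size x size blocks
    (size must be a multiple of 4 that divides the block side)."""
    step = size // 4
    return [[sum(grid[gy + dy][gx + dx]
                 for dy in range(step) for dx in range(step))
             for gx in range(0, len(grid[0]), step)]
            for gy in range(0, len(grid), step)]

def best_candidate(costs):
    """Step 3: index of the candidate with the minimum distortion."""
    return min(range(len(costs)), key=costs.__getitem__)
```

On a GPU, each 4 x 4 SAD in step 1 is independent of the others, which is what makes that step a natural fit for data-parallel execution.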