Moving object detection is important in traffic video analysis, and a growing number of algorithms are being applied to it. Most of these algorithms are time-consuming and cannot satisfy real-time...
To provide timely results for ‘Big Data Analytics’, it is crucial to satisfy deadline requirements for MapReduce jobs in production environments. In this paper, we propose a deadline-oriented task scheduling approac...
Stragglers can delay jobs and seriously reduce cluster efficiency. Much research has been devoted to this problem, such as Blacklist[8], speculative execution[1, 6], and Dolly[8]. In this paper, we put forward ...
With the rapid growth of open source software, choosing software from many alternatives has become a great challenge. Traditional ranking approaches mainly focus on the characteristics of the software itself, such...
ISBN: 9781509032068 (print)
As we approach the exascale era in supercomputing, designing a balanced computer system with powerful computing ability and low energy consumption becomes increasingly important. The GPU is a widely used accelerator in most recently deployed supercomputers. It uses massive multithreading to hide long latencies and has high energy efficiency. In contrast to their strong computing power, GPUs have few on-chip resources, with several MB of fast on-chip memory storage per SM (Streaming Multiprocessor). GPU caches exhibit poor efficiency because the throughput-oriented execution model is mismatched with the cache hierarchy design. Because of this severe deficiency in on-chip memory, the benefit of the GPU's high computing capacity is dramatically reduced by poor cache performance, which limits system performance and energy efficiency. In this paper, we propose a locality-protection scheme that makes full use of data locality within the fixed cache capacity. We present a Locality Protected method based on instruction PC (LPP) to improve GPU performance. First, we use a PC-based collector to gather the reuse information of each cache line. With this dynamic reuse information, an intelligent cache allocation unit (ICAU) coordinates the reuse information with the LRU (Least Recently Used) replacement policy to identify the cache line with the least locality for eviction. The results show that LPP provides a speedup of up to 17.8% and an average improvement of 5.5% over the baseline method.
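The abstract does not give the internals of the PC-based collector or the ICAU, so the following Python sketch is only an illustration of the general idea of blending per-PC reuse information with LRU order when choosing a victim. The class name `PCReuseCacheSet`, the saturating counter, its limit `max_score`, and the tie-breaking rule are all assumptions made for this example, not details taken from the paper.

```python
from collections import OrderedDict, defaultdict


class PCReuseCacheSet:
    """One cache set whose victim choice blends a per-PC reuse score with LRU order.

    Hypothetical model: the paper's actual collector and ICAU may differ.
    """

    def __init__(self, ways, max_score=3):
        self.ways = ways
        self.max_score = max_score
        # tag -> [inserting PC, reused flag]; iteration order is oldest first
        self.lines = OrderedDict()
        # PC -> saturating reuse counter (stand-in for the PC-based collector)
        self.reuse_score = defaultdict(int)

    def access(self, pc, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            entry = self.lines[tag]
            entry[1] = True
            owner = entry[0]
            # Reward the PC that brought this line in, up to the saturation limit
            self.reuse_score[owner] = min(self.max_score, self.reuse_score[owner] + 1)
            self.lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self._evict()
        self.lines[tag] = [pc, False]
        return False

    def _evict(self):
        # Pick the line whose inserting PC has the lowest reuse score.
        # min() returns the first minimum, and iteration is oldest first,
        # so ties fall back to plain LRU.
        victim = min(self.lines, key=lambda t: self.reuse_score[self.lines[t][0]])
        owner, reused = self.lines.pop(victim)
        if not reused:
            # Penalize PCs whose lines leave the cache without any reuse
            self.reuse_score[owner] = max(0, self.reuse_score[owner] - 1)
```

In a 2-way set, if the line loaded by one PC has been hit once and a second PC then loads a line that is never reused, the next miss evicts the second PC's line even though it is more recent, whereas plain LRU would evict the reused line.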
Functional programming languages have a long history and receive more and more attention today. The paper focuses on the development of functional languages and aims to introduce the concepts, such as higher-order fun...
Currently, the performance problems of software systems are receiving more and more attention. Among various diagnosis methods based on system traces, principal component analysis (PCA) based methods are widely used due to th...
Bloom filters are frequently used to perform set queries that test the existence of some items. However, Bloom filters face a dilemma: the transmission bandwidth and the accuracy cannot be optimized simultaneously. Th...
ISBN: 9781509056972 (print)
This paper investigates the problem of maximizing uniform multicast throughput (MUMT) for multi-channel dense wireless sensor networks, where all nodes are located within one-hop transmission range and can communicate with each other on multiple orthogonal channels. Networks of this kind are widely applied in the real world, and maximizing their uniform multicast throughput deserves in-depth study. Previous research has proved that the MUMT problem is NP-hard. However, previous approaches are either hard to implement or use too many relay nodes to complete the multicast task, and thus incur high overhead or poor performance. To solve the MUMT problem efficiently, we adopt the concept of the maximum independent set with a size constraint and, based on it, present a novel Single-Broadcast based Multicast algorithm called SBM. We prove that the SBM algorithm achieves a constant ratio to the theoretical throughput upper bound. Extensive experimental results demonstrate that SBM performs better than existing work in terms of both uniform multicast throughput and total number of transmissions.
The coupling of microwaves into apertures plays an important part in many electromagnetic physics and engineering fields. When the width of apertures is very small, Finite Difference Time Domain (FDTD) simulation of t...