Regenerating codes have been proposed to achieve an optimal trade-off curve between the amount of storage space and the network traffic for repair. However, existing repair schemes based on regenerating codes are inad...
Moving object detection is important in traffic video analysis, and a growing number of algorithms are being applied to it. Most of these algorithms are time-consuming and cannot satisfy real-time...
To provide timely results for ‘Big Data Analytics’, it is crucial to satisfy deadline requirements for MapReduce jobs in production environments. In this paper, we propose a deadline-oriented task scheduling approac...
One of the most significant challenges for routing protocols in mobile networks is coping with the unpredictable motion and unreliable behaviour of mobile nodes. In this paper, we present a hierarchical r...
Stragglers can delay jobs and seriously reduce cluster efficiency. Much research has been devoted to this problem, such as Blacklist[8], speculative execution[1, 6], and Dolly[8]. In this paper, we put forward ...
With the rapid growth of open source software, choosing among the many alternatives has become a great challenge. Traditional ranking approaches mainly focus on the characteristics of the software itself, such...
ISBN: 9781509032068 (print)
As we approach the exascale era in supercomputing, designing a balanced computer system with powerful computing ability and low energy consumption becomes increasingly important. GPUs are widely used accelerators in most recently deployed supercomputers. They adopt massive multithreading to hide long latencies and have high energy efficiency. In contrast to their strong computing power, GPUs have few on-chip resources, with several MB of fast on-chip memory storage per SM (Streaming Multiprocessor). GPU caches exhibit poor efficiency due to the mismatch between the throughput-oriented execution model and the cache hierarchy design. Because of this severe shortage of on-chip memory, the benefit of the GPU's high computing capacity is dramatically reduced by poor cache performance, which limits system performance and energy efficiency. In this paper, we put forward a locality-protected scheme to make full use of data locality within the fixed cache capacity. We present a Locality Protected method based on instruction PC (LPP) to improve GPU performance. First, we use a PC-based collector to collect the reuse information of each cache line. After obtaining the dynamic reuse information of the cache line, we use an intelligent cache allocation unit (ICAU), which coordinates the reuse information with the LRU (Least Recently Used) replacement policy, to find the cache line with the least locality for eviction. The results show that LPP provides up to a 17.8% speedup and an average improvement of 5.5% over the baseline method.
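The abstract gives no implementation details for LPP, so the following is only a minimal Python sketch of the general idea it describes: per-PC reuse information combined with LRU order to choose an eviction victim. The class name `ReuseAwareCacheSet`, the saturating counter ceiling `MAX_REUSE`, and the rule "lowest reuse counter first, LRU as tie-breaker" are all assumptions made for illustration, not the authors' design.

```python
from collections import OrderedDict, defaultdict

MAX_REUSE = 3  # ceiling of the saturating reuse counter (assumed value)


class ReuseAwareCacheSet:
    """One cache set that mixes per-PC reuse counters with LRU order.

    Hypothetical illustration only; not the LPP/ICAU hardware design.
    """

    def __init__(self, ways, reuse_table):
        self.ways = ways
        # tag -> PC of the instruction that filled the line.
        # Iteration order runs from least to most recently used.
        self.lines = OrderedDict()
        # Shared table: PC -> how often lines filled by that PC were reused.
        self.reuse = reuse_table

    def access(self, tag, pc):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            # Hit: credit the PC that brought the line in, then mark it MRU.
            owner_pc = self.lines[tag]
            self.reuse[owner_pc] = min(self.reuse[owner_pc] + 1, MAX_REUSE)
            self.lines.move_to_end(tag)
            return True
        if len(self.lines) >= self.ways:
            del self.lines[self.pick_victim()]
        self.lines[tag] = pc
        return False

    def pick_victim(self):
        """Pick the line whose filling PC shows the least reuse.

        min() returns the first minimal element, and iteration starts at the
        LRU end, so ties fall back to plain LRU.
        """
        return min(self.lines, key=lambda tag: self.reuse[self.lines[tag]])
```

In this sketch, a line filled by a PC with observed reuse survives even when it is the least recently used line in the set; with no reuse history at all, the policy degenerates to ordinary LRU.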
Currently, the performance problems of software systems are receiving more and more attention. Among various diagnosis methods based on system traces, principal component analysis (PCA) based methods are widely used due to th...
Functional programming languages have a long history and receive more and more attention today. The paper focuses on the development of functional languages and aims to introduce the concepts, such as higher-order fun...
Cybercrime caused by malware has become a persistent and damaging threat, making trusted security solutions urgently needed, especially for resource-constrained end devices. The existing industry and academic approache...