Algorithms based on membrane systems, also called membrane-inspired algorithms, have been shown to be powerful for solving combinatorial optimization problems, and they are increasingly used in practical engineering. Wit...
ISBN: (print) 8955191197
Due to the explosive growth of the Internet, stream services are becoming more popular. Large media servers supporting thousands of concurrent media streams must deliver both good throughput and high availability. Although they are reciprocal from the viewpoint of system performance, stream service continuity, as well as throughput, should be considered in clustered media servers, because media stream services should remain continuously available to clients even in the event of a system failure. In this paper, we introduce service availability as the gauge of service degree on media servers and present a basic scheme for guaranteeing continuous stream service. To do this, we propose the LSS (Log Stream Status) and RRD (Ready Resource for instant Delivery) mechanisms. They are classified according to how the status of a stream service is managed and how a closed service is resumed. Although these mechanisms require additional overhead, we found that they can be applied to clustered media servers with minimal overhead. Additionally, we concluded that they can provide higher service availability than traditional clustered media servers.
Distributed Virtual Environment (DVE) systems have become more and more important in both academia and industry. To guarantee the load constraint, the physical world integrity and the virtual world in...
In this paper, we provide a unified expression to obtain conditions on the restricted isometry constant δ_{2s}(Φ). These conditions cover the important results proposed by Candes et al., and each of them is a sufficient condition for sparse signal recovery. In the noiseless case, when δ_{2s}(Φ) satisfies any one of these conditions, the s-sparse signal can be exactly recovered via l1 constrained minimization.
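For orientation, the standard definitions behind this abstract can be written as follows. This is a sketch of the textbook setting only: the abstract does not give the paper's unified expression, so it is not reproduced here. The symbols x, y, z and the last bound (one well-known sufficient condition due to Candes) are supplied for illustration and are not taken from the abstract.

```latex
% Restricted isometry constant of order s: the smallest \delta_s such that,
% for every s-sparse vector x,
\begin{align}
  (1-\delta_s)\,\|x\|_2^2 \;\le\; \|\Phi x\|_2^2 \;\le\; (1+\delta_s)\,\|x\|_2^2 .
\end{align}
% Noiseless recovery of x from y = \Phi x by l1 constrained minimization:
\begin{align}
  \hat{x} \;=\; \arg\min_{z}\ \|z\|_1 \quad \text{subject to} \quad \Phi z = y .
\end{align}
% One well-known sufficient condition for exact recovery of s-sparse x:
\begin{align}
  \delta_{2s}(\Phi) \;<\; \sqrt{2}-1 .
\end{align}
```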
A new lookahead instruction prefetching mechanism is proposed in this paper. Although significant performance improvement can be obtained by improving both the cache miss ratio and the average access time for successfully prefetched blocks, most conventional prefetching mechanisms improve only one of the two factors. To achieve balanced improvement of both, a lookahead prefetching scheme that fetches multiple blocks per prefetch request and adopts a prefetch-on-miss mechanism is proposed. Performance is evaluated through trace-driven simulation, and the proposed prefetch scheme reduces the memory access delay time by 32% to 56% relative to a cache system that performs no prefetching.
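The general idea of fetching several blocks on a miss can be illustrated with a toy trace-driven simulation. This is a minimal sketch and not the paper's mechanism: the LRU replacement policy, the function name `simulate`, and the parameters `capacity` and `lookahead` are all assumptions made for illustration.

```python
from collections import OrderedDict


def simulate(trace, capacity, lookahead):
    """Count demand misses for an LRU block cache with prefetch-on-miss.

    On a miss to block b, blocks b+1 .. b+lookahead are fetched together
    with b. Setting lookahead=0 disables prefetching. (Hypothetical model.)
    """
    cache = OrderedDict()  # block id -> True, ordered from LRU to MRU
    misses = 0

    def insert(block):
        cache[block] = True
        cache.move_to_end(block)           # mark as most recently used
        while len(cache) > capacity:
            cache.popitem(last=False)      # evict the least recently used

    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # hit: refresh recency
            continue
        misses += 1
        insert(block)                      # demand fetch
        for k in range(1, lookahead + 1):  # sequential blocks fetched with it
            insert(block + k)
    return misses


sequential = list(range(16))
print(simulate(sequential, capacity=8, lookahead=0))
print(simulate(sequential, capacity=8, lookahead=3))
```

On a purely sequential trace, the second call misses only once per group of four blocks, which is the miss-ratio side of the trade-off the abstract describes; the sketch does not model access time.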
In multiprocessor systems, cache misses due to coherence transactions account for a large share of total cache misses. However, this type of cache miss depends strongly on the type of data sharing among processors, especially false sharing. Until now, small cache block sizes have mainly been used to avoid false sharing in multiprocessor systems, but the smaller the cache block size, the lower the prefetching effect. Moreover, it has been shown that high spatial locality appears in many parallel programs. The paper presents two advanced full-map directory schemes which provide a low cache miss ratio and low communication traffic by avoiding false sharing and taking advantage of the spatial locality existing in many parallel programs. The performance was evaluated with an event-driven simulator, and the empirical results show that the proposed scheme can provide a decrease of about 6% to 77% in the cache miss ratio and a decrease of 46% to 96% in the communication traffic.
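The dependence of false sharing on block size can be seen in a toy invalidation model. This is a minimal sketch under simplifying assumptions (write-invalidate coherence, unlimited cache capacity, a hypothetical `coherence_misses` function); it does not model the directory schemes the paper proposes.

```python
def coherence_misses(accesses, block_size):
    """Count misses caused by remote-write invalidations.

    accesses: list of (cpu, op, address) tuples with op in {"r", "w"}.
    A miss is counted only when a CPU re-accesses a block whose copy was
    invalidated by another CPU's write; cold misses are not counted.
    """
    valid = {}           # block -> set of CPUs holding a valid copy
    invalidated = set()  # (cpu, block) pairs invalidated by a remote write
    misses = 0
    for cpu, op, addr in accesses:
        block = addr // block_size
        sharers = valid.setdefault(block, set())
        if cpu not in sharers:
            if (cpu, block) in invalidated:
                misses += 1                      # coherence miss
                invalidated.discard((cpu, block))
            sharers.add(cpu)
        if op == "w":
            for other in sharers - {cpu}:        # invalidate remote copies
                invalidated.add((other, block))
            valid[block] = {cpu}
    return misses


# CPU 0 writes address 0 and CPU 1 writes address 4; they never touch
# the same word, so any coherence miss here is false sharing.
ping_pong = [(0, "w", 0), (1, "w", 4)] * 4
print(coherence_misses(ping_pong, block_size=8))  # both words in one block
print(coherence_misses(ping_pong, block_size=4))  # one word per block
```

With the larger block, every access after the first two is a coherence miss even though no data is actually shared; with the smaller block there are none, at the cost of the reduced prefetching effect the abstract mentions.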
ISBN: (print) 0780341228
In this paper, an effective memory-processor integrated architecture, called memory based processor array for artificial neural networks (MPAA), is proposed. The MPAA can be easily integrated into any host system via the memory interface. Specifically, the MPAA system provides an efficient mechanism for local memory accesses on both a row basis and a column basis using hybrid row and column decoding, which suits the computation model of ANNs, such as the access and alignment patterns required for matrix-by-vector operations. Mapping algorithms to implement the multilayer perceptron with backpropagation learning on the MPAA system are also provided. The proposed algorithms support both neuron-level and layer-level parallelism, which allows the MPAA system to operate the learning phase as well as the recall phase in a pipelined fashion. Performance is evaluated by detailed comparison in terms of two metrics: cost and the number of computation steps.
The article proposes a selective compressed memory system (SCMS) focusing on a compressed cache architecture, in which only data blocks with good compression efficiency are compressed and all compressed blocks are stored in a fixed memory space. The selective compression technique reduces the overhead caused by online data decompression, and the fixed memory space allocation allows efficient management of the compressed blocks. The results from a trace-driven simulation show that the SCMS approach can provide a decrease of around 35% in the on-chip cache miss ratio as well as a 53% decrease in data traffic compared with conventional memory systems. Furthermore, a large amount of the decompression overhead can be eliminated, and thus the average memory access time can be reduced by up to 20% compared with conventional memory systems.
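Selective compression with fixed-size slots can be sketched in a few lines. This is an illustration under stated assumptions, not the SCMS design: the 64-byte block, the half-block slot, the use of `zlib` as the compressor, and the names `store_block` and `load_block` are all hypothetical choices.

```python
import zlib

BLOCK_SIZE = 64               # assumed uncompressed block size in bytes
SLOT_SIZE = BLOCK_SIZE // 2   # assumed fixed slot for compressed blocks


def store_block(block):
    """Return (is_compressed, payload) for one block.

    The block is kept compressed only if it fits the fixed slot; otherwise
    it is stored as-is, so no decompression is needed when it is read.
    """
    packed = zlib.compress(block)
    if len(packed) <= SLOT_SIZE:
        # Pad to the fixed slot size so every compressed block is uniform.
        return True, packed.ljust(SLOT_SIZE, b"\x00")
    return False, block


def load_block(is_compressed, payload):
    """Recover the original block from a stored payload."""
    if is_compressed:
        # decompressobj stops at the end of the stream; the slot padding
        # ends up in unused_data and is ignored here.
        return zlib.decompressobj().decompress(payload)
    return payload


zeros = bytes(BLOCK_SIZE)            # highly compressible block
distinct = bytes(range(BLOCK_SIZE))  # no repeated bytes, compresses poorly
for block in (zeros, distinct):
    flag, payload = store_block(block)
    print(flag, len(payload), load_block(flag, payload) == block)
```

Because every compressed block occupies the same slot size, slots can be indexed directly, which is the management benefit the abstract attributes to fixed space allocation.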
Similarity computation is especially significant in collaborative filtering algorithms. In the existing literature and in large recommender systems, researchers generally use cosine similarity or Pearson correlation coeff...
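For reference, the two measures named in this excerpt can be computed as below. This is a minimal sketch of the standard definitions on plain rating vectors; the function names are chosen for illustration, and the excerpt is truncated, so nothing here reflects the paper's own method.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation: cosine similarity of the mean-centered vectors.

    Assumes neither vector is constant (otherwise the norm is zero).
    """
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u],
                             [b - mean_v for b in v])


ratings_a = [1, 2, 3]
ratings_b = [3, 2, 1]
print(cosine_similarity(ratings_a, ratings_b))
print(pearson_correlation(ratings_a, ratings_b))
```

The example pair shows why the choice matters: the two users rate in exactly opposite order, which Pearson reports as a strong negative correlation while cosine similarity still returns a positive value.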
This research explores the potential of on-chip cache compression, which can reduce not only the cache miss ratio but also the miss penalty if main memory is also managed in compressed form. However, the decompression time has a critical effect on the memory access time, and variable-sized compressed blocks tend to increase the design complexity of the compressed cache architecture. This paper suggests several techniques to reduce the decompression overhead and to manage the compressed blocks efficiently, including selective compression, fixed space allocation for the compressed blocks, parallel decompression, and the use of a decompression buffer. Moreover, a simple compressed cache architecture based on these techniques and its management method are proposed. The results from trace-driven simulation show that this approach can provide a decrease of around 35% in the on-chip cache miss ratio as well as a 53% decrease in data traffic compared with conventional memory systems. Also, a large amount of the decompression overhead can be eliminated, and thus the average memory access time can be reduced by up to 20% compared with conventional memory systems.