The trimming power of on-chip optical networks consisting of 105 microrings is simulated. The total trimming power is no larger than 14 W from 20°C to 100°C if the distribution of these rings is optimized. ...
Coordination among users is indispensable for efficient medium access in wireless networks. As transmission rates increase rapidly, however, coordination time becomes intolerable. We present AFD (asymmetric full duplex), which achieves high coordination efficiency at nearly zero overhead. In AFD, channel contention is performed simultaneously with data transmission. We propose a 3D pipeline contention scheme in which the contention process is divided into several parallel stages and executed in a pipelined manner in a 3D domain defined by time, frequency, and spatial antenna. To mitigate interference between the data packet and the contention signal, we adopt a singleton PN sequence as the contention pilot. AFD provides a novel network-scale full-duplex capability. Its performance is evaluated by both simulations and testbed measurements. AFD significantly outperforms IEEE 802.11: Jain's fairness index is around 0.95, with a throughput gain of up to 120%.
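The fairness figure quoted above uses Jain's fairness index, which for per-user throughputs x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2). It ranges from 1/n (one user gets everything) to 1 (all users get equal shares). A minimal sketch follows; the function name and the sample throughput values are illustrative assumptions and are not taken from the paper.

```python
def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(throughputs)
    if n == 0:
        raise ValueError("need at least one throughput value")
    total = sum(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if sum_of_squares == 0:
        raise ValueError("at least one throughput must be non-zero")
    return (total * total) / (n * sum_of_squares)


if __name__ == "__main__":
    # Hypothetical per-user throughputs (arbitrary units), not measured data.
    print(jain_fairness_index([4.0, 4.0, 4.0, 2.0]))
```

An index near 0.95, as reported for AFD, therefore indicates that users receive close to equal shares of the channel.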
Erasure codes are promising for improving storage system reliability because they are more space-efficient than replication. Traditional erasure codes split data into equal-sized data blocks and encode strips across different data blocks. This causes heavy repair traffic when clients read only part of the data, since most strips read for repair are not in the requested blocks. This paper proposes a novel discrete data dividing method that avoids this problem completely. The key idea is to encode strips from the same data block. When repairing failed blocks, the strips to be read are therefore either in the same data block as the corrupted strips or among the encoded strips, so no read data is wasted. We design and implement this data layout in an HDFS-like storage system. Experiments on a small-scale testbed show that the proposed discrete data dividing method avoids downloading data blocks that clients do not need during repair operations.
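The key idea, encoding strips that all come from the same data block, can be illustrated with a toy example. The sketch below assumes a single XOR parity strip per block, which is a stand-in chosen for brevity; the abstract does not say which erasure code the authors use, and all function names here are hypothetical.

```python
from functools import reduce


def xor_strips(strips):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def split_block(block, strip_count):
    """Split one data block into strip_count equal-sized strips."""
    if strip_count <= 0 or len(block) % strip_count != 0:
        raise ValueError("block length must be a positive multiple of strip_count")
    size = len(block) // strip_count
    return [block[i * size:(i + 1) * size] for i in range(strip_count)]


def encode_block(block, strip_count):
    """Return the block's strips and one parity strip computed from them only."""
    strips = split_block(block, strip_count)
    return strips, xor_strips(strips)


def repair_strip(strips, parity, lost_index):
    """Rebuild one lost strip from the other strips of the SAME block plus parity."""
    survivors = [s for i, s in enumerate(strips) if i != lost_index]
    return xor_strips(survivors + [parity])


if __name__ == "__main__":
    strips, parity = encode_block(b"abcdefgh", 4)
    print(repair_strip(strips, parity, 2))
```

Because the parity covers only strips of one block, a repair touches that block and its encoded strip, never strips belonging to blocks the client did not ask for.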
Power gating is a representative circuit-level technique for mitigating leakage power. In low-power Network-on-Chip (NoC) design, however, previous fine-grained power-gating methods degrade network performance because of serial wake-up latency and head-of-line blocking. We therefore propose a flexible Virtual Channel (VC) management scheme for fine-grained power gating that achieves high throughput and low power. The proposed power-gating method with early wake-up is evaluated using synthetic workloads. Compared with an optimized early wake-up power-gating technique, it improves performance effectively under medium and high network loads, increasing network throughput by 15.7% to 44.1% across the synthetic loads while keeping network power consumption as low as that of the optimized method. For PARSEC application traces of a token-based protocol, it decreases packet latency by 20.3% on average while increasing peak power by less than 3.6% compared with the optimized method.
Coordination among users is indispensable for efficient access control in wireless networks. As data transmission rates increase rapidly, however, coordination time becomes intolerable, even several times longer than the data transmission time. We present SIF, a signature-based frequency-domain contention mechanism that achieves high coordination efficiency with low overhead. In SIF, each user is assigned a distinct PN sequence as its signature. A contending user transmits its signature on specific OFDM subcarriers and uses the binary sequence formed by the ON/OFF states of all OFDM subcarriers to deliver its contention information. A signature-based detection method is proposed to detect the CVs of other nodes quickly and reliably. The collision probability of SIF is shown to be very low even in large wireless networks, e.g., less than 0.2% with 100 users. Moreover, because SIF completes coordination within one slot in most cases, the throughput gain is up to 200% compared with 802.11.
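The idea of carrying contention information in the ON/OFF states of subcarriers can be sketched as a plain binary encoding. This is only an illustration under stated assumptions: the abstract does not specify how values map to subcarriers, so the most-significant-bit-first layout, the function names, and the subcarrier count below are all hypothetical, and signature (PN sequence) detection is not modeled.

```python
def encode_on_off(value, num_subcarriers):
    """Map an integer to ON (1) / OFF (0) subcarrier states, MSB first."""
    if not 0 <= value < 2 ** num_subcarriers:
        raise ValueError("value does not fit in the given number of subcarriers")
    return [(value >> i) & 1 for i in reversed(range(num_subcarriers))]


def decode_on_off(states):
    """Recover the integer from observed ON/OFF subcarrier states."""
    value = 0
    for state in states:
        value = (value << 1) | (1 if state else 0)
    return value


if __name__ == "__main__":
    states = encode_on_off(5, 4)
    print(states, decode_on_off(states))
```

With n usable subcarriers, one OFDM symbol can carry up to 2^n distinct values this way, which is consistent with coordination finishing within a single slot in most cases.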
Traditional wireless relay networks suffer from large end-to-end delay and low throughput because a relay cannot receive and forward at the same time. In this paper, we propose IWFR (Immediate Wireless Full-Duplex Relay), which exploits full duplex to shorten the end-to-end delay and improve throughput. We also design a new implicit acknowledgement mechanism that eliminates ACK overhead and markedly improves relay throughput. To implement IWFR, we modify the full-duplex node architecture to support immediate relaying. Simulations show that IWFR shortens the end-to-end delay by 60% on average and raises throughput to 240% of that of the original relay.
Deep Belief Networks (DBNs) are state-of-the-art machine learning techniques and among the most important unsupervised learning algorithms. Training DBNs is computationally intensive, which naturally motivates FPGA acceleration. Fixed-point arithmetic can be used when implementing DBNs on FPGAs to reduce execution time, but its implications for accuracy are not clear. Previous studies have focused only on accelerators with a few fixed bit-widths. One contribution of this paper is a comprehensive experimental evaluation of the effect of bit-width on various DBN configurations. Explicit performance change points are identified across bit-widths. The impact of approximating the sigmoid function, a required part of DBNs, is also evaluated. A mixed-bit-width DBN is proposed that fits the bit-widths of FPGA primitives and achieves performance similar to that of the software implementation. Our results provide a guide for bit-width design choices when implementing DBNs on FPGAs and clearly document the trade-off in accuracy.
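The bit-width versus accuracy trade-off can be made concrete with a small software model of fixed-point rounding. The sketch below is an assumption-laden illustration, not the paper's method: it uses a signed format with a sign bit plus `int_bits` integer and `frac_bits` fractional bits, round-to-nearest with saturation, and measures only the error of a quantized sigmoid output over a hypothetical input grid.

```python
import math


def quantize(x, int_bits, frac_bits):
    """Round x to signed fixed point (sign + int_bits + frac_bits), saturating."""
    scale = 1 << frac_bits
    max_code = (1 << (int_bits + frac_bits)) - 1
    min_code = -(1 << (int_bits + frac_bits))
    code = max(min_code, min(max_code, round(x * scale)))
    return code / scale


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def max_sigmoid_error(frac_bits, int_bits=4):
    """Worst-case |quantized sigmoid - exact sigmoid| on a grid over [-8, 8]."""
    worst = 0.0
    for step in range(-32, 33):
        x = step * 0.25  # illustrative grid, not from the paper
        exact = sigmoid(x)
        approx = quantize(sigmoid(quantize(x, int_bits, frac_bits)), int_bits, frac_bits)
        worst = max(worst, abs(approx - exact))
    return worst


if __name__ == "__main__":
    for bits in (4, 8, 12):
        print(bits, max_sigmoid_error(bits))
```

Sweeping `frac_bits` in this way is a cheap software check of where accuracy starts to degrade before committing a bit-width to hardware.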
Cloud services must upgrade continuously in order to maintain ***, a large body of empirical evidence suggests that upgrade procedures used in practice are failure-prone and often cause planned or unplanned *** this paper, we first define cloud service online upgrade, and then analyze the shortcomings of current mainstream cloud service online upgrade *** particular, the mixed-version problem that accompanies rolling upgrade and the capacity-loss problem caused by split-mode upgrade have been *** that, we propose a solution called Delayed Switch, which can upgrade cloud services with lower loss of availability and capacity than existing *** prove the performance of Delayed Switch in theory and develop a prototype system applying this *** by conducting experiments with a typical e-commerce service named Rubis, we validate the effectiveness and efficiency of our approach.
ISBN: 9781479974276 (print)
Asynchrony-based overlapping of computation and communication is commonly used in MPI applications. However, this overlapping frequently introduces synchronization errors in asynchronous MPI programs. In this paper, we propose a symbolic-execution-based method for detecting input-related synchronization errors. The path space of an MPI program is systematically explored, and the operations related to synchronization errors are checked specifically. In addition, two optimizations are proposed to improve efficiency. We have implemented our method as a prototype tool based on the symbolic executor Cloud9. Extensive experiments indicate the effectiveness of our method.
Performance bugs, which degrade system performance without causing fail-stop errors, are among the most fundamental issues on production platforms. Detecting such bugs online and effectively is increasingly urgent for engineers. Performance bugs usually manifest as anomalous call structures in request traces or anomalous latencies of invoked methods. In this paper, we propose CloudDoc, an approach for automatic online detection of performance bugs. CloudDoc maintains a performance model mined from execution traces collected during normal operation. The performance model captures the characteristics of the call structures of request traces together with their corresponding latencies. Using this model, CloudDoc periodically checks whether performance bugs have occurred. All suspicious call structures and methods with abnormal latency are presented to engineers. We report two case studies that demonstrate the effectiveness of CloudDoc in helping engineers identify performance bugs.
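The two-part model described above (known call structures plus per-method latency behaviour) can be sketched in a few lines. Everything specific here is an assumption made for illustration: a trace is flattened to an ordered list of `(method, latency_ms)` pairs, and "abnormal latency" is taken to mean exceeding the normal-period mean by `k` standard deviations. The abstract does not describe CloudDoc's actual trace format or detection rule.

```python
from statistics import mean, pstdev


def build_model(normal_traces):
    """Mine known call structures and per-method latency statistics."""
    structures = set()
    samples = {}
    for trace in normal_traces:
        structures.add(tuple(method for method, _ in trace))
        for method, latency in trace:
            samples.setdefault(method, []).append(latency)
    limits = {m: (mean(v), pstdev(v)) for m, v in samples.items()}
    return structures, limits


def detect(trace, model, k=3.0):
    """Return suspicious findings for one trace (empty list if none)."""
    structures, limits = model
    findings = []
    structure = tuple(method for method, _ in trace)
    if structure not in structures:
        findings.append(("structure", structure))
    for method, latency in trace:
        if method in limits:
            mu, sigma = limits[method]
            if latency > mu + k * sigma:  # hypothetical threshold rule
                findings.append(("latency", method))
    return findings


if __name__ == "__main__":
    normal = [[("a", 10), ("b", 20)], [("a", 12), ("b", 22)], [("a", 11), ("b", 21)]]
    model = build_model(normal)
    print(detect([("a", 11), ("b", 500)], model))
```

Running `detect` periodically over recent traces and surfacing the findings mirrors the workflow in which suspicious call structures and slow methods are handed to engineers.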