Broadband ISDN has made possible a variety of new multimedia services, but it has also created new problems for congestion control because of the bursty nature of traffic sources. Lazar and Pacifici (1991) showed that traffic prediction can alleviate this problem. The traffic prediction model in their framework is a special case of the Box-Jenkins ARIMA model. In this paper, we propose a neural network approach to traffic prediction. A (1,5,1) backpropagation feedforward neural network is trained to capture the linear and nonlinear regularities in several time series. A comparison between the results of the neural network approach and the Box-Jenkins approach is also provided. The nonlinearity used in this paper is chaotic. We have designed a set of experiments to show that a neural network's prediction performance is only slightly affected by the intensity of the stochastic component (noise) in a time series. We have also demonstrated that a neural network's performance should be measured against the variance of the noise in order to gain more insight into its behavior and prediction performance. Based on the experimental results, we conclude that the neural network approach is an attractive alternative to traditional regression techniques as a tool for traffic prediction.
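As a rough illustration of the kind of model this abstract describes, the sketch below trains a (1,5,1) feedforward network with backpropagation to predict the next value of a chaotic series. Only the (1,5,1) topology comes from the abstract; the logistic map with r = 3.9, the sigmoid activations, the learning rate, the epoch count, and all names (`logistic_map`, `Net151`, `mse`) are assumptions chosen for the example, not the authors' setup.

```python
import math
import random


def logistic_map(n, x0=0.3, r=3.9):
    """Generate n points of the logistic map x[t+1] = r * x[t] * (1 - x[t]).

    Assumption: this map stands in for the chaotic series of the paper.
    """
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Net151:
    """One input, five sigmoid hidden units, one sigmoid output."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(5)]  # input -> hidden
        self.b1 = [rng.uniform(-1, 1) for _ in range(5)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(5)]  # hidden -> output
        self.b2 = rng.uniform(-1, 1)

    def forward(self, x):
        h = [sigmoid(w * x + b) for w, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr):
        """One stochastic gradient step on the loss 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)
        for j in range(5):
            # Hidden delta uses the output weight from before its update.
            delta_h = delta_out * self.w2[j] * h[j] * (1.0 - h[j])
            self.w2[j] -= lr * delta_out * h[j]
            self.w1[j] -= lr * delta_h * x
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out


def mse(net, pairs):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in pairs) / len(pairs)


series = logistic_map(300)
pairs = list(zip(series[:-1], series[1:]))  # (x[t], x[t+1]) training pairs
train, test = pairs[:200], pairs[200:]

net = Net151()
mse_before = mse(net, test)
for _ in range(300):
    for x, t in train:
        net.train_step(x, t, lr=0.3)
mse_after = mse(net, test)

print("test MSE before training:", mse_before)
print("test MSE after training:", mse_after)
```

To probe the noise sensitivity discussed in the abstract, one could add zero-mean noise of known variance to the targets and compare the resulting test error against that variance.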
Detecting protein families in large-scale databases is a difficult but important biological problem. Computational clustering methods can address it effectively. Although many clustering algorithms exist, most are simply threshold-based. Their computational performance is strongly affected by the weight distribution, and they are valid only for certain special networks. This paper proposes a new network clustering algorithm, Markov Finding and Clustering (MFC), to cluster proteins accurately into their functionally specific families. The MFC algorithm improves the random walk process and reduces the effect of noise on the clustering result. It performs well on networks that existing noise-sensitive algorithms do not handle well. Finally, experiments on protein sequence datasets demonstrate that the algorithm is effective in detecting protein families and outperforms current algorithms.
ISBN: 9781509036547 (print)
In recent years, erasure codes have become the de facto standard for data protection in large-scale distributed cloud storage systems, at the cost of an affordable storage overhead. Traditional erasure coding schemes, such as Reed-Solomon codes, however, suffer from high reconstruction cost and I/Os. The recent past has seen a plethora of efforts to optimize the tradeoff between reconstruction cost, I/Os, and storage overhead. Quite different from all prior studies, the erasure coding technology in this paper makes the first attempt to take advantage of the unequal failure rates across disks/nodes to optimize reconstruction performance and system reliability. Specifically, our proposed technology, the Unequal Failure Protection based Local Reconstruction Code (UFP-LRC), divides the data blocks into several unequal-sized groups with local parities and assigns the data blocks stored on more failure-prone disks/nodes to the smaller groups, so as to provide unequal failure protection for each group. In this way, by exploiting the nonuniform local parity degrees, UFP-LRC enables the data blocks stored on more failure-prone disks/nodes to tolerate a greater number of failures while incurring a lower repair cost than the others, leading to a substantial improvement in the overall repair performance and reliability of the cloud storage system. We perform numerical analysis and build a prototype storage system to verify our approach. The analytical results show that UFP-LRC increasingly outperforms LRC as the failure rate ratio grows. Extensive experiments also show that, compared to LRC, UFP-LRC achieves a 10% to 13% improvement in throughput and an 8% to 12% reduction in decoding latency, while retaining comparable overall reliability.
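The grouping idea in this abstract can be sketched as follows: blocks on the most failure-prone disks go into the smallest local group, each group gets one local parity, and a single lost block is rebuilt by reading only its own group. This is a simplified illustration under stated assumptions, not the UFP-LRC construction itself: it uses plain XOR for the local parity, omits global parities entirely, and the function names, failure rates, and group sizes are all hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def assign_groups(failure_rates, group_sizes):
    """Place the most failure-prone blocks into the smallest groups.

    Returns a list of groups, each a list of block indices.
    """
    if sum(group_sizes) != len(failure_rates):
        raise ValueError("group sizes must cover every data block exactly once")
    # Block indices ordered from highest to lowest failure rate.
    order = sorted(range(len(failure_rates)),
                   key=lambda i: failure_rates[i], reverse=True)
    groups, start = [], 0
    for size in sorted(group_sizes):  # smallest group is filled first
        groups.append(order[start:start + size])
        start += size
    return groups


def local_parities(data, groups):
    """One XOR parity per group (assumption: stands in for the local parity)."""
    return [xor_blocks([data[i] for i in group]) for group in groups]


def repair(lost, data, groups, parities):
    """Rebuild one lost block from its group; return (block, blocks_read)."""
    for group, parity in zip(groups, parities):
        if lost in group:
            survivors = [data[i] for i in group if i != lost]
            return xor_blocks(survivors + [parity]), len(survivors) + 1
    raise KeyError(lost)


# Hypothetical example: six data blocks, two local groups of sizes 2 and 4.
data = [bytes([i] * 4) for i in range(1, 7)]
rates = [0.01, 0.08, 0.02, 0.09, 0.01, 0.03]  # assumed per-disk failure rates
groups = assign_groups(rates, [2, 4])
parities = local_parities(data, groups)

hot_block, hot_cost = repair(3, data, groups, parities)    # failure-prone disk
cold_block, cold_cost = repair(0, data, groups, parities)  # reliable disk
print("groups:", groups)
print("reads to repair block 3:", hot_cost)
print("reads to repair block 0:", cold_cost)
```

In this toy layout the block on the most failure-prone disk is repaired with fewer reads than a block in the larger group, which is the repair-cost asymmetry the abstract describes.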
ISBN: 9781450384421 (digital)
ISBN: 9781665483902 (print)
Multi-accelerator servers are increasingly being deployed in shared multi-tenant environments (such as cloud data centers) to meet the demands of large-scale compute-intensive workloads. In addition, these accelerators are increasingly interconnected in complex topologies, and workloads exhibit a wider variety of inter-accelerator communication patterns. However, existing allocation policies are ill-suited to these emerging use cases. Specifically, this work identifies that multi-accelerator workloads are commonly fragmented, leading to reduced bandwidth and increased latency for inter-accelerator communication. We propose Multi-Accelerator Pattern Allocation (MAPA), a graph pattern mining approach that provides generalized support for allocating multi-accelerator workloads on multi-accelerator servers. We demonstrate that MAPA improves the execution time of multi-accelerator workloads and provides generalized benefits across various accelerator topologies. Finally, using MAPA, we demonstrate a speedup of 12.4% for the 75th percentile of jobs, with the worst-case execution time reduced by up to 35% against the baseline policy.
ISBN: 0780389328 (print)
Biological data exists all over the world in the form of various Web services, which provide biologists with much useful information. However, heterogeneous data formats present a technical hurdle that keeps biologists from taking full advantage of this information, and powerful tools are needed to handle the issue. Grid technology can give common biology tools high performance and high throughput. Even so, the data formats produced by the various biology tools remain heterogeneous, and integrating heterogeneous biological data is complex and difficult. This paper describes an approach that solves this problem by combining XML technologies with the BioGrid system. We integrate BioJava into our system to translate data into XML format. Finally, an example illustrates how these techniques come together to integrate heterogeneous biological data sources.
A three-dimensional (3D) multilayer model based on the skin physical structure is developed to investigate the transient thermal response of human skin subject to laser heating. The temperature distribution of the ski...
Clusters of workstations are being used extensively to solve computationally intensive scientific problems. However, there is limited support for quality of service (QoS) based distributed computing on commercial off-the-shelf (COTS) clusters. This limitation has restricted successful deployment of distributed real-time high-performance computing applications to customized and dedicated embedded multi-processor systems. This paper describes research that attempts to provide a cluster platform that can guarantee distributed applications access to computational and communication resources. The authors have developed PromisQoS, an architecture that supports execution of hard real-time distributed applications on a Linux cluster while providing high-throughput, low-latency communication using Myrinet. PromisQoS consists of three major components: Hare, BDM-RT, and Turtle. Hare is a prototype implementation of the time-based QoS channels specified by the real-time message passing interface (MPI/RT 1.1) standard. BDM-RT is a low-level messaging library that provides deterministic communication latency and bandwidth on Myrinet. Turtle, a variant of RT-Linux, is the real-time operating system that provides guaranteed computation time. This work demonstrates that it is possible to deploy hard real-time distributed applications on COTS clusters and underlines the significance of the MPI/RT API for distributed high-performance computing applications that require QoS.