The conjugate residual (CR) algorithm is a Krylov subspace method for the fast solution of symmetric linear systems with very large, very sparse coefficient matrices. By changing the computation sequence of the CR algorithm, this paper proposes an improved conjugate residual (ICR) algorithm. The numerical stability of ICR is the same as that of CR, but the synchronization overhead that forms the bottleneck of parallel performance is effectively reduced by a factor of two. Moreover, all inner products within a single iteration step are independent, so the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Theoretical and experimental analysis shows that the advantage of ICR over CR grows as the number of processors increases. Experiments performed on a 64-processor cluster indicate that ICR is approximately 30% faster than CR.
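For reference, the baseline CR recurrence that ICR reorders can be sketched as follows (an illustrative dense-matrix sketch in NumPy; the abstract does not give the ICR reordering itself, so only the standard CR iteration is shown):

```python
import numpy as np

def conjugate_residual(A, b, tol=1e-10, max_iter=1000):
    """Standard CR iteration for a symmetric system A x = b."""
    x = np.zeros_like(b)
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    Ar = A @ r
    Ap = Ar.copy()
    rAr = r @ Ar               # inner product (r, A r)
    for _ in range(max_iter):
        alpha = rAr / (Ap @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        Ar = A @ r
        rAr_new = r @ Ar
        beta = rAr_new / rAr   # this inner product depends on the updated r,
        p = r + beta * p       # which is the serialization ICR's reordering targets
        Ap = Ar + beta * Ap
        rAr = rAr_new
    return x
```

Note that in this formulation the two inner products of an iteration depend on each other through the residual update, forcing two synchronization points per step in a parallel setting; the abstract's claim is that ICR's reordering makes them independent.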
Transactions in service composition have a long-lived feature in which a global transaction is divided into several distributed sub-transactions. The atomicity property is preserved by using compensating transactions, which sema...
ISBN: (Print) 9781509056972
This paper investigates the problem of maximizing uniform multicast throughput (MUMT) for multi-channel dense wireless sensor networks, in which all nodes are located within one-hop transmission range and can communicate with each other on multiple orthogonal channels. Such networks have wide real-world applications, and maximizing their uniform multicast throughput merits deep study. Previous research has proved that the MUMT problem is NP-hard; however, existing approaches are either hard to implement or use too many relay nodes to complete the multicast task, and thus incur high overhead or poor performance. To solve the MUMT problem efficiently, we adopt the concept of a maximum independent set with a size constraint and, based on it, present a novel single-broadcast-based multicast algorithm called SBM. We prove that SBM achieves a constant ratio to the theoretical throughput upper bound. Extensive experimental results demonstrate that SBM outperforms existing work in terms of both uniform multicast throughput and the total number of transmissions.
Due to the characteristics of stream applications and the insufficiency of conventional processors when running stream programs, stream processors that support data-level parallelism have become a research hotspot. This paper presents two techniques, stream partition (SP) and stream compression (SC), to optimize streams on Imagine. Simulation results show that SP and SC enable stream applications to take full advantage of the parallel clusters, pipelines, and three-level memory hierarchy of the Imagine processor, thereby reducing the execution time of stream programs.
To address the resource management problems brought about by large numbers of replicas, this paper proposes a multi-replica clustering management method based on limited coding. In this method, following the process by which new replicas are created from an existing single replica, replicas are partitioned into hierarchies and clusters. Replicas are then coded and managed according to a user-defined limited-coding rule consisting of a replica hierarchy and a replica sequence, which also handles the changes to clusters caused by dynamic adjustments of replicas (replica addition or removal) effectively. A management model that is centralized locally and peer-to-peer across the wide area is then adopted to organize replicas, and, combined with a defined minimal update-propagation time, the cost of reconciling consistency can be greatly reduced. The relationship between the coding rule and the number of replicas, together with solutions for replica failure and replica recovery, is discussed. Performance evaluation shows that the clustering method is an efficient way to manage large numbers of replicas, achieving good scalability, remaining insensitive to moderate node failure, and adapting well to applications with frequent updates.
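The hierarchy-plus-sequence coding idea can be illustrated with a small sketch (the paper's actual limited-coding rule is user-defined and not given in the abstract; the class, method names, and code layout below are illustrative assumptions):

```python
class ReplicaRegistry:
    """Hypothetical sketch of a hierarchy/sequence coding rule: each
    replica's code pairs its hierarchy (its depth in the creation tree,
    with the original replica at hierarchy 0) with a per-hierarchy
    sequence number assigned in creation order."""

    def __init__(self):
        self.codes = {"root": (0, 0)}  # replica id -> (hierarchy, sequence)
        self.next_seq = {0: 1}         # next free sequence number per hierarchy

    def create_replica(self, parent_id, replica_id):
        # A replica created from a parent sits one hierarchy below it.
        h = self.codes[parent_id][0] + 1
        seq = self.next_seq.get(h, 0)
        self.next_seq[h] = seq + 1
        self.codes[replica_id] = (h, seq)
        return (h, seq)

    def remove_replica(self, replica_id):
        # Removal frees the code; sequences are not renumbered, so the
        # codes of surviving replicas stay stable under dynamic adjustment.
        del self.codes[replica_id]
```

Keeping surviving codes stable under addition and removal is one plausible way such a rule could "dispose of the alteration of clusters" cheaply, since no existing replica needs to be re-coded.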
Recent studies of network traffic have shown that self-similarity is prevalent and that this characteristic is preserved through buffering, switching, and transmission. Self-similarity must therefore be considered in network traffic prediction. This paper analyzes and summarizes research results on self-similar network traffic prediction in the areas of self-similar modeling, parameter computation, and performance prediction. A measurement-based equivalent-bandwidth algorithm for self-similar traffic prediction is put forward. Our analysis shows that the algorithm can effectively reduce computational and implementation complexity.
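Self-similarity is conventionally quantified by the Hurst parameter H, and a standard way to compute it from measurements (one of several estimators; not necessarily the method used in the paper) is the aggregated-variance method, which exploits the scaling law Var(X^(m)) ~ m^(2H-2) for the m-aggregated series:

```python
import numpy as np

def hurst_aggregated_variance(x, block_sizes=(2, 4, 8, 16, 32)):
    """Estimate the Hurst parameter H of a series x via the
    aggregated-variance method: the slope of log Var(X^(m)) against
    log m is 2H - 2, so H = slope / 2 + 1."""
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n = len(x) // m
        agg = x[:n * m].reshape(n, m).mean(axis=1)  # m-aggregated series
        log_m.append(np.log(m))
        log_var.append(np.log(agg.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)        # least-squares slope
    return slope / 2 + 1
```

For uncorrelated traffic (white noise) the variance decays as 1/m, giving H ≈ 0.5, while long-range-dependent traffic yields H between 0.5 and 1.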
In the open Internet environment, it is inevitable that multiple ontologies coexist, and centralized service discovery mechanisms become a bottleneck of service-oriented computing (SOC), resulting in poor system scalability. To address these problems, this paper proposes a two-layered P2P-based model for semantic service discovery. The model is built on ontology communities and integrates the core concepts of the Internet-based virtual computing environment (iVCE) into a P2P model. Based on this model, a service discovery algorithm consisting of two stages and three steps is proposed, which matches services both within and across communities. Within a community, the algorithm first locates registers holding service information with a high probability of satisfying a request, and then captures semantic matches between service advertisements and service requests by logical reasoning. Service discovery across communities proceeds according to configurable policies. The model suits open environments in which multiple ontologies coexist. Experimental results show that, given an appropriate setting, the model can trade off recall against response time; in addition, it efficiently reduces the mean load on registers while maintaining recall.
As a new stage in the development of the cloud computing paradigm, serverless computing has the high-level abstraction characteristic of shielding underlying details, which makes it extremely challenging for users to choose a suitable serverless platform. To address this, targeting the jointcloud scenario of heterogeneous serverless platforms across multiple clouds, this paper presents a jointcloud collaboration mechanism called FCloudless that performs cross-cloud detection of the full-lifecycle performance of serverless platforms. Based on a benchmark metric set that probes the performance-critical stages of the full lifecycle, the paper proposes a performance optimization algorithm, driven by the detected performance data, that accounts for all key stages affecting performance during a function's lifecycle and predicts overall performance by combining the scores of local stages with dynamic weights. We evaluate FCloudless on AWS, AliYun, and Azure. The experimental results show that FCloudless can detect the underlying performance of serverless platforms hidden in the black box, and that its optimization algorithm can select the optimal scheduling strategy for various applications in a jointcloud environment. FCloudless reduces runtime by 23.3% and 24.7% for cold and warm invocations, respectively, under cost constraints.
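The prediction step described above, combining per-stage scores with dynamic weights, can be illustrated as a weighted sum (a minimal sketch; the actual FCloudless scoring function, stage names, and weights are not given in the abstract and are assumptions here):

```python
def predict_overall_score(stage_scores, weights):
    """Combine per-lifecycle-stage performance scores with dynamic
    weights into one overall score; weights are renormalized over the
    stages actually observed so they sum to 1."""
    total_w = sum(weights[s] for s in stage_scores)
    return sum(stage_scores[s] * weights[s] / total_w for s in stage_scores)

def pick_platform(per_platform_scores, weights):
    """Select the platform whose predicted overall score is highest."""
    return max(per_platform_scores,
               key=lambda p: predict_overall_score(per_platform_scores[p], weights))
```

Under this sketch, raising the weight of the cold-start stage would steer scheduling toward platforms that score well on cold invocations, which matches the abstract's idea of dynamic weights adapting the prediction to the application.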
To deal with the problem of scalable and fast unbiased sampling in unstructured P2P systems, a sampling method based on multi-peer adaptive random walks (SMARW) is proposed. In this method, based on the multi-peer random ...
This paper introduces a new deep learning approach to approximately solve the Covering Salesman Problem (CSP). In this approach, given the city locations of a CSP as input, a deep neural network model is designed to d...