Fast numerical solutions of the Riesz fractional equation have a computational cost of O(NM log M), where M and N are the numbers of grid points and time steps, respectively. In this paper, we present a GPU-based fast solution for the Riesz space fractional equation. The solution builds on the FFT-based fast method, is implemented with the CUDA programming model, and consists of parallel FFT, vector-vector addition, and vector-vector multiplication on the GPU. The experimental results show that the GPU-based fast solution agrees well with the exact solution. Compared with the known parallel fast solution on an 8-core Intel E5-2670 CPU, the overall speedup reaches 2.12 times on an NVIDIA GTX650 GPU and 10.93 times on an NVIDIA K20C GPU.
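The O(M log M) cost per time step in FFT-based fast methods of this kind usually comes from the Toeplitz structure of the discretized fractional operator: a Toeplitz matrix-vector product can be embedded in a circulant one and evaluated with FFTs and element-wise vector operations. The following is a minimal CPU sketch of that idea in pure Python, not the paper's CUDA implementation. The function names and the sample coefficients are hypothetical and chosen only for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse FFT, including the 1/n normalization."""
    n = len(x)
    return [value / n for value in fft(x, inverse=True)]


def toeplitz_matvec(first_col, first_row, v):
    """Compute T @ v in O(M log M), where T[i][j] = first_col[i - j] for
    i >= j and first_row[j - i] otherwise (real-valued inputs assumed)."""
    m = len(v)
    size = 1
    while size < 2 * m:  # pad to a power of two that fits the embedding
        size *= 2
    # First column of the circulant matrix that contains T in its top-left block.
    c = [0.0] * size
    for i in range(m):
        c[i] = first_col[i]
    for j in range(1, m):
        c[size - j] = first_row[j]
    padded = list(v) + [0.0] * (size - m)
    # Circulant product = inverse FFT of the element-wise product of FFTs.
    spectrum = [a * b for a, b in zip(fft(c), fft(padded))]
    return [value.real for value in ifft(spectrum)[:m]]


def toeplitz_matvec_direct(first_col, first_row, v):
    """Reference O(M^2) product, used only to check the fast version."""
    m = len(v)
    return [
        sum((first_col[i - j] if i >= j else first_row[j - i]) * v[j]
            for j in range(m))
        for i in range(m)
    ]


if __name__ == "__main__":
    coeffs = [2.0, -1.0, 0.25, 0.1]  # illustrative symmetric Toeplitz entries
    vec = [1.0, 2.0, 3.0, 4.0]
    print(toeplitz_matvec(coeffs, coeffs, vec))
    print(toeplitz_matvec_direct(coeffs, coeffs, vec))
```

On a GPU, the three building blocks named in the abstract would correspond to the FFT calls, the element-wise product, and the vector additions of the time-stepping scheme.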
This paper presents an alternative algorithm for integrating the existing knowledge of a supervised learning neural network with new training data. The algorithm allows the existing knowledge to age out at a slow rate as the neural network is gradually retrained with consecutive sets of new samples, resembling the change of application locality under a consistent environment. The algorithm also uses the contour preserving classification algorithm to increase classification accuracy. The experiment is performed on a 2-dimensional partition problem, and the results confirm the effectiveness of the algorithm.
In order to generate local addresses for an array section A(l:h:s) with block-cyclic distribution, an efficient compiling method is required. In this paper, two local address generation methods for the block-cyclic di...
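For orientation, the local address generation problem can be stated with a brute-force baseline: walk the array section A(l:h:s), find the owner of each element under a block-cyclic distribution, and record its local address on that processor. The sketch below is not one of the two methods the paper proposes (the listing truncates their description). It assumes zero-based global indices, a positive stride, block size `k`, and `p` processors; all function and parameter names are hypothetical.

```python
def owner_and_local(i, p, k):
    """Map global index i to (processor, local address) under a
    block-cyclic distribution with block size k over p processors."""
    block = i // k                 # global block that holds element i
    proc = block % p               # blocks are dealt round-robin to processors
    local = (block // p) * k + i % k
    return proc, local


def local_addresses(l, h, s, p, k, proc):
    """Local addresses on `proc` of the elements of A(l:h:s), found by
    enumerating every section element (h is inclusive, s > 0)."""
    result = []
    for i in range(l, h + 1, s):
        owner, local = owner_and_local(i, p, k)
        if owner == proc:
            result.append(local)
    return result


if __name__ == "__main__":
    # Section A(0:20:3) distributed with block size 4 over 2 processors.
    for proc in range(2):
        print(proc, local_addresses(0, 20, 3, p=2, k=4, proc=proc))
```

This baseline visits every section element on every processor, which is the overhead that an efficient compile-time method would aim to avoid.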
In order to identify and schedule jobs that are suitable for determined resources, an execution time estimation model is required. This paper describes a Chronological history-based execution time estimation...
The Data Grid enables the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for solving large-scale data intensive scientific applications. Such tec...
ISBN: 9781479959716 (print)
The rendezvous problem is a fundamental issue in MAC design for cognitive radio networks (CRNs). Based on the concept of blind rendezvous, numerous channel-hopping sequence (CHS) based rendezvous schemes have been proposed to solve this problem. However, little attention has been paid to the design of a CSMA/CA MAC based on these rendezvous schemes. In this paper, we propose a cooperative CSMA/CA MAC (named CoCH-CSMA/CA MAC) that tailors the 802.11 distributed coordination function (DCF) to the slotted operation of existing CHS-based rendezvous schemes. More importantly, we identify the rendezvous de-synchronization problem that arises in MAC design based on these rendezvous schemes. To alleviate its impact on MAC performance, we propose a cooperative control feedback scheme employing correlation-based signal detection, which helps secondary users avoid backoff misbehavior and improves networking performance (i.e., packet delivery delay and network throughput). Extensive simulations are conducted to demonstrate the effectiveness of our MAC design.
ISBN: 9781479909735 (print)
The continued increase in fault rates in integrated circuits makes processors, especially many-core processors, more susceptible to errors. Meanwhile, most systems and applications do not need full fault coverage, which incurs excessive overhead, so on-demand fault tolerance is desirable for these applications. In this paper, we propose an adaptive low-overhead fault tolerance mechanism for many-core systems, called Device View Redundancy (DVR). It treats fault tolerance as a device that can be configured and used by an application when high reliability is needed. Furthermore, DVR exploits idle resources for low-overhead fault tolerance, based on the observation that the utilization of many-core systems is low due to a lack of parallelism in applications. Finally, the experiment shows that the performance overhead of DVR is reduced by 16% to 98% compared with full Dual Modular Redundancy (DMR).
Physics-informed neural networks (PINNs) have emerged as promising surrogate models for solving partial differential equations (PDEs). Their effectiveness lies in the ability to capture solution-related features throug...
The computing power provided by high-performance, low-cost PC-based Cluster and Grid platforms is attractive, and these platforms are equal or superior to the supercomputers and mainframes widely available. In this research paper, w...