Consistency and responsiveness are two important factors in providing a sense of reality in a distributed virtual environment (DVE). However, it is not easy to optimize both aspects because of the trade-off between th...
A jamming-style denial-of-service attack is the transmission of radio signals that disrupt communications by decreasing the signal-to-noise ratio. This kind of attack can be easily launched by a jammer through either bypa...
ISBN: 9781605586267 (print)
In the field of RNA secondary structure prediction, the CYK (Cocke-Younger-Kasami) algorithm is one of the most popular methods using the SCFG (stochastic context-free grammar) model. However, general-purpose parallel computers, including SMP multiprocessors and cluster systems, exhibit low parallel efficiency and are too expensive for many research institutes to use. FPGA chips provide a new approach to accelerating the CYK algorithm by exploiting fine-grained custom design. The CYK algorithm shows complicated data dependences, in which the dependence distance is variable and the dependence direction spans two dimensions. We propose a systolic array structure including one master PE and multiple slave PEs for fine-grained hardware implementation on FPGA. We partition tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load the matrix from external memory. To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete CYK/inside algorithm. The experimental results show a speedup factor of more than 14 over the Infernal-0.55 software running on a PC platform with a Pentium 4 2.66GHz CPU. The computational power of our platform with the FPGA accelerator is comparable to a PC cluster consisting of 20 Intel-Xeon CPUs for RNA secondary structure prediction using SCFGs, but the hardware cost and power consumption are only about 15% and 10% of the latter, respectively. Copyright 2009 ACM.
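To make the data dependence described in this abstract concrete, the following is a minimal software sketch of the CYK recurrence, not the paper's FPGA design and not Infernal's covariance-model implementation. It assumes a grammar in Chomsky normal form; the toy rules, probabilities, and the function name `cyk_best_logprob` are hypothetical and chosen only for illustration. It computes the best-parse (max) score; the inside variant would sum probabilities instead of taking the maximum.

```python
import math

# Hypothetical toy SCFG in Chomsky normal form (illustrative only).
# Terminal rules: (lhs, terminal, probability)
TERMINAL_RULES = [
    ("A", "a", 1.0), ("U", "u", 1.0), ("G", "g", 1.0), ("C", "c", 1.0),
]
# Binary rules: (lhs, left child, right child, probability)
BINARY_RULES = [
    ("S", "A", "X1", 0.2), ("X1", "S", "U", 1.0),  # S -> a S u
    ("S", "G", "X2", 0.2), ("X2", "S", "C", 1.0),  # S -> g S c
    ("S", "A", "U", 0.3),                          # S -> a u
    ("S", "G", "C", 0.3),                          # S -> g c
]


def cyk_best_logprob(seq, start="S"):
    """Return the log-probability of the best parse of seq, or -inf if none."""
    n = len(seq)
    if n == 0:
        return -math.inf
    # best[i][j] maps a nonterminal to its best log-probability over seq[i..j].
    best = [[{} for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(seq):
        for lhs, term, p in TERMINAL_RULES:
            if term == ch:
                best[i][i][lhs] = max(best[i][i].get(lhs, -math.inf), math.log(p))
    # Fill cells in order of increasing span length.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            cell = best[i][j]
            # Cell (i, j) reads (i, k) along its row and (k + 1, j) along its
            # column for every split point k, so the dependence distance varies
            # with k and runs in two dimensions.
            for k in range(i, j):
                left, right = best[i][k], best[k + 1][j]
                for lhs, b, c, p in BINARY_RULES:
                    if b in left and c in right:
                        score = math.log(p) + left[b] + right[c]
                        if score > cell.get(lhs, -math.inf):
                            cell[lhs] = score
    return best[0][n - 1].get(start, -math.inf)
```

Calling `cyk_best_logprob("gauc")` scores the nested pairing g(au)c under the toy grammar. Because every cell in one column needs earlier cells from the same column, partitioning work by columns, as the abstract describes, keeps much of the reused data local to one processing element.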
With the advancement of peer-to-peer technology, media streaming applications have become increasingly popular on the Internet. However, the traditional development methods for this kind of application need developers no...
Efficient mapping of logical processes to physical processes is one of the key technologies for accelerating parallel performance simulation. Aiming at minimizing the communications between SMP nodes and between host physica...
Predicting network latencies between Internet hosts can efficiently support large-scale Internet applications, e.g., file sharing services and overlay construction. Several studies use hyperbolic space to model t...
Proximity ranking according to end-to-end network distances (e.g., Round-Trip Time, RTT) can reveal detailed proximity information, which is important in network management and performance diagnosis in distributed sys...
As an infrastructure for data distribution, overlay networks must provide efficient routing and adequate robustness to achieve fast and accurate data distribution in environments with node churn. Considering tha...
The graphics processing unit (GPU), with many lightweight data-parallel cores, can provide substantial parallel computing power to accelerate several general-purpose applications. Both AMD and NVIDIA provide thei...
Multi-core architectures can deliver high processing power if the multiple levels of parallelism they expose are exploited. However, it is non-trivial to orchestrate the allocation of computational and memory resources. ...