In the field of RNA secondary structure prediction, the CYK (Coche-Younger-Kasami) algorithm is a most popular methods using SCFG (stochastic context-free grammars) model. However, general purpose parallel computers i...
详细信息
ISBN:
(纸本)9781605586267
In the field of RNA secondary structure prediction, the CYK (Coche-Younger-Kasami) algorithm is a most popular methods using SCFG (stochastic context-free grammars) model. However, general purpose parallel computers including SMP multiprocessors or cluster systems exhibit low parallel efficiency and they are too expensive to be used easily for many research institutes. FPGA chips provide a new approach to accelerate the CYK algorithm by exploiting fine-grained custom design. The CYK algorithm shows complicated data dependence, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master PE and multiple slave PEs for fine grain hardware implementation on FPGA. We partition tasks by columns and assign tasks to PEs for load balance. We exploit data reuse schemes to reduce the need to load matrix from external memory. To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete CYK/inside algorithm. The experimental results show a factor of more than 14 speedup over the Infernal-0.55 software running on a PC platform with Pentium 4 2.66GHz CPU. The computational power of our platform with FPGA accelerator is comparable to a PC cluster consisting of 20 Intel-Xeon CPUs for RNA secondary structure prediction using SCFGs, but the hardware cost and power consumption is only about 15% and 10% of the latter respectively. Copyright 2009 ACM.
Satisfiability Modulo Theories (SMT) is an extension of SAT towards FOL. SMT solvers have proven highly scalable and efficient for problems based on some ground theorems. However, SMT problems involving quantifiers an...
详细信息
Nowadays open source software has become an indispensable basis for both individual and industrial software engineering. Various kinds of labeling mechanisms like categories and tags are used in open source communitie...
详细信息
ISBN:
(纸本)9781450315609
Nowadays open source software has become an indispensable basis for both individual and industrial software engineering. Various kinds of labeling mechanisms like categories and tags are used in open source communities to annotate projects and facilitate the discovery of certain software However as large amounts of software are attached with no/few labels or the existing labels are from different ontology space, it is still hard to retrieve potentially topic-relevant software. This paper highlights the valuable semantic information of project descriptions and labels, proposes labeled software topic detection LSTD a hybrid approach combining topic models and ranking mechanisms to detect and enrich the topics of software by mining the large amount of textual software profiles, which can be employed to do software categorization and tag recommendation. LSTD makes use of labeled LDA to capture the semantic correlations between labels and descriptions and then construct the label-based topic-word matrix. Based on the generated matrix and the generality of labels, LSTD designs a simple yet eficient algorithm to detect the latent topics of software that expressed as relevant and popular labels. Comprehensive evaluations are conducted on the large-scale datasets of representative open source communities and the results validate the effectiveness of LSTD.
This work explores the feasibility to implement IEEE-754-2008 standard quadruple precision (Quad) elementary functions on recent FPGAs with plenty of embedded memories and DSP blocks. First, we analysis the implementa...
详细信息
The value range information of program variables is useful in many applications such as compiler optimization and program analysis. In the framework of abstract interpretation, the interval abstract domain infers nume...
详细信息
Efficient designs for intra-session network coding based practical applications largely rely on a better understanding on its queueing behaviors. However, few work devote on this topics. In this paper, we build a mult...
详细信息
With the quick development of open source software, quantity of software is produced in the open source community (OSC) [1]. Lots of researches are launched to study the internal regular patterns of OSC [2], [3]. GitH...
详细信息
ISBN:
(纸本)9789811136719
With the quick development of open source software, quantity of software is produced in the open source community (OSC) [1]. Lots of researches are launched to study the internal regular patterns of OSC [2], [3]. GitHub is one of the most famous open source community which owns thousands software projects. As a result, there are massive and abundant data of software development activities in GitHub. With the purpose to offer an accuracy and efficient dataset of GitHub, this paper proposes Kraken which is a continuous incremental data acquisition system for GitHub. Kraken contains three main modules which are independent with each other. Kraken gets the data of GitHub from two ways: git repositories and rest API. The final result shows that Kraken could extract the commits information of git repositories and get pull requests(PRs) and issues through rest API. The commits information contains the detail development history of software and the feedbacks and wisdom of software engineers are showed through PRs and issues.
Soft errors caused by energetic particle strikes in on-chip cache memories have become a critical challenge for microprocessor design. Architectural vulnerability factor (AVF), which is defined as the probability that...
详细信息
Soft errors caused by energetic particle strikes in on-chip cache memories have become a critical challenge for microprocessor design. Architectural vulnerability factor (AVF), which is defined as the probability that a transient fault in the structure would result in a visible error in the final output of a program, has been widely employed for accurate soft error rate estimation. Recent studies have found that designing soft error protection techniques with the awareness of AVF is greatly helpful to achieve a tradeoff between performance and reliability for several structures (i.e., issue queue, reorder buffer). Considering large on-chip L2 cache, redundancy-based protection techniques (such as ECC) have been widely employed for L2 cache data integrity with high costs. Protecting caches without accurate knowledge of the vulnerability characteristics may lead to the over-protection, thus incurring high overheads. Therefore, designing AVF-aware protection techniques would be attractive for designers to achieve a cost-efficient protection for caches, especially at early design stage. In this paper, we propose an improved AVF estimation framework for conducing comprehensive characterization of dynamic behavior and predictability of L2 cache vulnerability. We propose to employ Bayesian Additive Regression Trees (BART) method to accurately model the variation of L2 cache AVF and to quantitatively explain the important effects of several key performance metrics on L2 cache AVF. Then we employ bump hunting technique to extract some simple selecting rules based on several key performance metrics for a simplified and fast estimation of L2 cache AVF. Using the simplified L2 cache AVF estimator, we develop an AVF-aware ECC technique as an example to demonstrate the cost-efficient advantages of the AVF prediction based dynamic fault tolerant techniques. Experimental results show that compared with traditional full ECC technique, AVF-aware ECC technique reduces the L2 cache acc
Code review is an important process to reduce code defects and improve software quality. In social coding communities like GitHub, as everyone can submit Pull-Requests, code review plays a more important role than eve...
详细信息
Code review is an important process to reduce code defects and improve software quality. In social coding communities like GitHub, as everyone can submit Pull-Requests, code review plays a more important role than ever before, and the process is quite time-consuming. Therefore, finding and recommending proper reviewers for the emerging Pull-Requests becomes a vital task. However, most of the current studies mainly focus on recommending reviewers by checking whether they will participate or not without differentiating the participation types. In this paper, we develop a two-layer reviewer recommendation model to recommend reviewers for Pull-Requests (PRs) in GitHub projects from the technical and managerial perspectives. For the first layer, we recommend suitable developers to review the target PRs based on a hybrid recommendation method. For the second layer, after getting the recommendation results from the first layer, we specify whether the target developer will technically or managerially participate in the reviewing process. We conducted experiments on two popular projects in GitHub, and tested the approach using PRs created between February 2016 and February 2017. The results show that the first layer of our recommendation model performs better than the previous work, and the second layer can effectively differentiate the types of participation.
Code clone is a prevalent activity during the development of softwares. However, it may be harmful to the maintenance and evolution of softwares. Current techniques for detecting code clones are most syntax-based, and...
详细信息
暂无评论