ISBN: (print) 9781467355834; 9781467355827
The growing size and complexity of software source code pose many challenges for bug detection. Data mining based bug detection methods can effectively detect bugs present in source code. Rule violations and copy-paste related defects are the main concerns of a bug detection system. Traditional data mining approaches such as frequent itemset mining and frequent sequence mining perform relatively well, but they fall short in accuracy and pattern recognition. Neural networks have emerged as advanced data mining tools for cases where other techniques may not produce satisfactory predictive models. The neural network is trained on the set of errors that could be present in software source code. From the training data, the neural network learns to predict the correct output. The processing elements of a neural network are associated with weights, which are adjusted during the training period.
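To make the weight-adjustment idea concrete, here is a minimal sketch of a single processing element (a perceptron) trained on toy data. It is not the network described in the abstract, whose architecture and features are not given here. The feature encoding (`calls_lock`, `calls_unlock`), the labels, and the function names `train_perceptron` and `predict` are all assumptions made for illustration.

```python
def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Train one processing element with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = y - predicted  # 0 when the prediction is already correct
            # Adjust each weight in proportion to the error and its input.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (suspected defect) or 0 (no defect) for one feature vector."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical features per function: [calls_lock, calls_unlock].
# Hypothetical label: 1 = suspected defect (lock taken but never released).
samples = [[1, 1], [1, 0], [0, 0], [0, 1]]
labels = [0, 1, 0, 0]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The toy data is linearly separable, so a single unit is enough; a realistic bug detector would need richer features and more than one layer.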
Program source code is substantially structured and contains semantically rich programming constructs such as variables, functions, data structures, and program structures, all of which indicate patterns. Mining source code with different data mining techniques to extract valuable hidden patterns is a new direction in software engineering. Over the last decade, researchers have proposed many tools and techniques to extract pertinent information and uncover relationships and trends from source code for particular Software Engineering (SE) tasks. These efforts have produced a wide body of research, but no comprehensive overview currently exists. This paper surveys the tools and techniques that rely only on data mining methods to determine patterns from source code in the context of programming, bug detection, maintenance, program understanding, and software reuse. The work compares and evaluates the current state-of-the-art source code mining tools and techniques, and organizes the large amount of information into a coherent conceptual framework. The survey thus provides researchers with a concise overview of source code mining techniques and assists practitioners in selecting appropriate techniques for their work. The review shows that existing studies each focus on one specific pattern mined from source code, such as a particular kind of bug. As a result, multiple tools are needed to test software and find potential information in it, which increases the cost and time of development. Hence there is a strong need for a tool that helps develop quality software by automatically detecting different kinds of bugs and generating relevant API code, thereby decreasing overall software development time.
As software grows in size and complexity, detecting defects becomes a challenging problem. This paper proposes a defect detection method that applies data mining techniques to source code to detect two types of defects in one process. The two types are rule-violating defects and copy-paste related defects, which may include semantic *** the process, this method can also extract implicit programming rules without prior knowledge of the software and detect copy-paste segments at different granularities. The method is evaluated on the Linux kernel, which contains more than 4 million lines of C code. The results show that the system can quickly detect many programming rules and violations of those rules. The novel pruning techniques effectively eliminate a large number of false positives, which greatly reduces the effort of manually checking violations. As an illustrative example of its effectiveness, a case study shows that among the top 50 violations reported by the proposed model, 11 defects can be confirmed after examining the source code.
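The idea of extracting implicit programming rules and flagging violations can be sketched with simple pairwise association rules. This is an illustrative simplification, not the method evaluated in the paper: it treats each function as a set of called names, keeps rules of the form "a implies b" that meet assumed support and confidence thresholds, and reports functions that contain `a` but not `b`. The toy call sets, the thresholds, and the helper names `mine_rules` and `find_violations` are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(functions, min_support=3, min_confidence=0.75):
    """Mine pairwise rules (a, b, confidence) meaning 'a is usually used with b'."""
    item_count = Counter()
    pair_count = Counter()
    for items in functions.values():
        item_count.update(items)
        # Ordered pairs, so a => b and b => a are counted separately.
        pair_count.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), both in pair_count.items():
        confidence = both / item_count[a]
        if both >= min_support and confidence >= min_confidence:
            rules.append((a, b, confidence))
    return rules


def find_violations(functions, rules):
    """Report (function, a, b) wherever a rule a => b is broken."""
    violations = []
    for a, b, _confidence in rules:
        for name, items in functions.items():
            if a in items and b not in items:
                violations.append((name, a, b))
    return violations


# Hypothetical call sets: function name -> names it calls.
functions = {
    "f1": {"spin_lock", "spin_unlock"},
    "f2": {"spin_lock", "spin_unlock", "kmalloc"},
    "f3": {"spin_lock", "spin_unlock"},
    "f4": {"spin_lock", "spin_unlock", "kfree"},
    "f5": {"spin_lock"},  # takes the lock but never releases it
}

rules = mine_rules(functions)
violations = find_violations(functions, rules)
```

On this toy input the only reported violation is `f5`, which breaks the mined rule that `spin_lock` is normally paired with `spin_unlock`. A real system would also need pruning to suppress false positives, as the abstract notes.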