Dear editor, Intelligent code generation has become an essential research task for accelerating modern software development. To facilitate effective code generation for programming languages, numerous approaches have been proposed to generate one or more tokens by mining existing open-source software repositories, e.g.,
API recommendation is a promising approach that is widely used during software development. However, the evaluation of API recommendation has not been explored with sufficient rigor. The current evaluation of API recommendation mainly focuses on correctness, which is measured by matching recommended results against ground-truth results. In most cases, there is only one set of ground-truth APIs for each recommendation attempt, but the target code can be implemented in dozens of ways. This neglect of code diversity introduces a possible defect into the evaluation. To address the problem, we invited 15 developers to analyze the unmatched results in a user study. The online evaluation confirms that some unmatched APIs can also benefit programming owing to their functional correlation with the ground-truth APIs. We then measure the API functional correlation based on the relationships extracted from the API knowledge graph, API method names, and API documentation. Furthermore, we propose an approach to improve the measurement of correctness based on API functional correlation. Our measurement is evaluated on a dataset of 6,141 requirements and historical code fragments from related commits. The results show that 28.2% of unmatched APIs contribute to correctness in our experiments.
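The abstract does not specify how the three correlation signals are combined, so the following is a minimal sketch of one plausible reading: it assumes a precomputed knowledge-graph relation set and per-API documentation token sets, uses equal signal weights, and the helper names (functional_correlation, credit_unmatched) and threshold are purely hypothetical choices, not the authors' implementation.

# Sketch: credit an unmatched API as correct when it is functionally
# correlated with some ground-truth API. All weights and names below
# are illustrative assumptions.
import re

def name_tokens(api):
    # Split a qualified/camelCase API name into lowercase tokens.
    return {t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", api)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def functional_correlation(api, truth_api, kg_related, doc_tokens):
    # kg_related: set of (api, api) pairs related in the API knowledge
    # graph (assumed precomputed); doc_tokens: API -> documentation tokens.
    kg_score = 1.0 if (api, truth_api) in kg_related else 0.0
    name_score = jaccard(name_tokens(api), name_tokens(truth_api))
    doc_score = jaccard(doc_tokens.get(api, set()),
                        doc_tokens.get(truth_api, set()))
    # Equal weighting is an assumption; the paper may tune or learn weights.
    return (kg_score + name_score + doc_score) / 3

def credit_unmatched(recommended, ground_truth, kg_related, doc_tokens,
                     threshold=0.5):
    # An API counts as correct if it matches exactly or its correlation
    # with any ground-truth API exceeds the (assumed) threshold.
    hits = 0
    for api in recommended:
        if api in ground_truth or any(
            functional_correlation(api, gt, kg_related, doc_tokens) >= threshold
            for gt in ground_truth
        ):
            hits += 1
    return hits / len(recommended) if recommended else 0.0

# Hypothetical usage with toy data:
kg = {("java.io.BufferedReader.readLine", "java.util.Scanner.nextLine")}
docs = {"java.io.BufferedReader.readLine": {"reads", "line", "text"},
        "java.util.Scanner.nextLine": {"advances", "line", "reads"}}
print(credit_unmatched(["java.io.BufferedReader.readLine"],
                       {"java.util.Scanner.nextLine"}, kg, docs))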
Various software-engineering problems have been solved by crowdsourcing. In many projects, the software outsourcing process is streamlined on cloud-based platforms. Among software-engineering tasks, test-case development is particularly suitable for crowdsourcing, because a large number of test cases can be generated at little monetary cost. However, the numerous test cases harvested from crowdsourcing can be of high or low quality. Owing to the large volume, distinguishing the high-quality tests with traditional techniques is computationally expensive. Therefore, crowdsourced testing would benefit from an efficient mechanism that distinguishes the qualities of the test cases. This paper introduces an automated approach, TCQA, that evaluates the quality of test cases based on the onsite coding history. Quality assessment by TCQA proceeds through three steps: (1) modeling the code history as a time series, (2) extracting multiple relevant features from the time series, and (3) building a model that classifies the test cases by their quality. Step (3) is accomplished by feature-based machine-learning techniques. By leveraging the onsite coding history, TCQA can assess test-case quality without performing expensive source-code analysis or executing the test cases. Using the data of nine test-development tasks involving more than 400 participants, we evaluated TCQA from multiple perspectives. TCQA assessed the quality of the test cases with higher precision, faster speed, and lower overhead than conventional test-case quality-assessment techniques. Moreover, TCQA provided real-time insights into test-case quality before the assessment was finished.
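As a rough illustration of the three-step pipeline, the sketch below models each coding session as a time series of code size, derives a few simple features, and fits a feature-based classifier. The feature set, the synthetic histories and labels, and the random-forest learner are assumptions made for illustration, not the paper's actual design.

# Sketch of the TCQA pipeline under stated assumptions: each session's
# onsite coding history is a per-interval series of code size.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def history_features(series):
    # Step (2): extract simple features from one coding-history time series.
    s = np.asarray(series, dtype=float)
    deltas = np.diff(s) if len(s) > 1 else np.zeros(1)
    return [
        len(s),               # session length
        s[-1] - s[0],         # net code growth
        deltas.mean(),        # average editing pace
        deltas.std(),         # burstiness of edits
        (deltas < 0).mean(),  # fraction of deleting intervals
    ]

# Step (1): each test-development session as a time series of code size.
# These histories and labels are synthetic; real ones come from the platform.
histories = [[0, 10, 25, 40, 60], [0, 50, 45, 48, 47], [0, 5, 5, 6, 80]]
labels = [1, 0, 0]  # 1 = high-quality test case, 0 = low-quality

X = [history_features(h) for h in histories]

# Step (3): a feature-based classifier; the paper's concrete learner may differ.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict([history_features([0, 12, 30, 45, 70])]))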