ISBN: 9781665457019 (print)
Software bugs claim approximately 50% of development time and cost the global economy billions of dollars. Once a bug is reported, the assigned developer attempts to identify and understand the source code responsible for the bug and then corrects the code. Over the last five decades, there has been significant research on automatically finding or correcting software bugs. However, there has been little research on automatically explaining bugs to developers, which is an essential but highly challenging task. In this paper, we propose Bugsplainer, a transformer-based generative model that generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits. Bugsplainer can leverage structural information and buggy patterns from the source code to generate an explanation for a bug. Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard, and can outperform multiple baselines from the literature. We also conduct a developer study involving 20 participants, in which the explanations from Bugsplainer were found to be more accurate, more precise, more concise, and more useful than those of the baselines.
Text labels are extracted from the content of texts; assigning them is a natural language processing problem in which a text may carry multiple labels. Text label classification aims to divide the multiple labels into only one correct c...
ISBN: 9798350324969 (print)
With the development of big data, machine learning, and AI, existing software engineering techniques must be re-imagined to provide the productivity gains that developers desire. Furthermore, specialized hardware accelerators like GPUs or FPGAs have become a prominent part of the current computing landscape. However, developing heterogeneous applications is limited to a small subset of programmers with specialized hardware knowledge. To improve productivity and performance for data-intensive and compute-intensive development, now is the time for the software engineering community to design new waves of refactoring, testing, and debugging tools for big data analytics and heterogeneous application development. In this paper, we give an overview of software development challenges in this new data-intensive scalable computing and heterogeneous computing domain. We describe examples of automated software engineering (debugging, testing, and refactoring) techniques that target this data- and compute-intensive domain and share lessons learned from building these techniques.
With the rapid development of network science and technology, software is a basic part of network system operation, and its quality in practical application directly determines whether the intended functions are realized, so th...
ISBN: 9781665488679 (print)
Machine learning (ML) has been widely used in trace link recovery (TLR) to reduce the cost of manually maintaining trace links. However, the imbalanced distribution of valid links and invalid links seriously affects the performance of classifiers. Although a few studies have applied data balancing techniques (DBT) to ML-based TLR, none of them has systematically analyzed which combinations are more effective. Therefore, we perform an empirical study on three groups of control experiments to explore the impact of combining different ML methods with and without DBT on TLR efficiency. We compare the performance of supervised ML-based TLR and unsupervised ML-based TLR, each with and without DBT. Then, we analyze the performance of the ensemble learning model (EM) with DBT on TLR. The experimental results on the 7 imbalanced datasets of CoEST indicate that DBT has a positive effect on ML-based TLR. Specifically, the recall of the LR model increased by 0.5517 after combining with most DBTs on EasyClinic(ID-TC), while Tomek-link significantly improves the precision of K-Nearest Neighbor (KNN), Decision Tree (DT), LR, and Support Vector Machine (SVM). The precision of LR increased from 0.5036 to 1.0. BalanceRF is best at increasing recall, reaching 1.0 on 4 datasets. Moreover, the degree of improvement of ML-based TLR with DBT differs with the size of the datasets and the proportion of valid links.
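The abstract does not say how the data balancing step is implemented. As a rough illustration only, the following is a minimal sketch of Tomek-link undersampling under its usual definition: two samples of opposite classes that are each other's nearest neighbor form a link, and the majority-class member is dropped. The function names (`tomek_links`, `remove_majority_in_links`) and the toy 2-D points are hypothetical and are not taken from the paper or from the CoEST datasets.

```python
from math import dist


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A pair is a Tomek link when the two samples are mutual nearest
    neighbors (Euclidean distance) and carry different labels.
    """
    def nearest(i):
        others = (j for j in range(len(points)) if j != i)
        return min(others, key=lambda j: dist(points[i], points[j]))

    nn = [nearest(i) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


def remove_majority_in_links(points, labels, majority):
    """Drop the majority-class sample of every Tomek link."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    kept = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in kept], [labels[k] for k in kept]


# Hypothetical toy data: 0 = invalid link (majority), 1 = valid link (minority).
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0), (3.2, 3.1)]
labels = [0, 0, 0, 1, 1, 1]

links = tomek_links(points, labels)
clean_points, clean_labels = remove_majority_in_links(points, labels, majority=0)
print(links)         # index pairs forming Tomek links
print(clean_labels)  # labels remaining after undersampling
```

In this toy set only the borderline pair at indices 2 and 3 forms a link, so one majority sample next to the class boundary is removed. Cleaning that boundary is one plausible reason a classifier's precision could improve, though the paper's own experimental setup may differ.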
Obtaining a new customer is more expensive than predicting the churn probability of an existing customer. A high-performance model in churn prediction can help a company to reduce the cost of obtaining a new customer....
Today, in the crowded and competitive telecom market, countless telecom companies suffer from customer churn. To help telecom companies predict the potential churn rate of their customers, this paper proposes to apply...
Chronic Kidney Disease (CKD) is a common and critical health problem that calls for effective disease treatment and careful kidney donor selection. This research introduces a mobile application made to help medical pr...
ISBN: 9798350381382 (print)
The proceedings contain 44 papers. The topics discussed include: analysis and the preliminary design for backend technology of disaster management information system; development of simulator for learning scrum; improving cultural objects portal application usability using user usability evaluation; a static IDE plugin to detect security hotspot for Laravel framework based web application; achieving high-level software component summarization via hierarchical chain-of-thought prompting and static code analysis; aspect-based sentiment analysis model with local sentiment aggregation for online travel reviews; automated chest x-ray report generator using multi-model deep learning approach; comparative analysis of big data utilization at vocational and productivity training centers (BBPVP) and the domestic job market; and comparison machine learning technique for grade classification of exported Cavendish banana.
Software development is a highly structured process that involves the creation and maintenance of a particular system, ranging from simple applications to complex enterprise software. Despite following a well-defined ...