A pull request (PR) is an event in Git-based development where a contributor asks project maintainers to review code they want to merge into a project. The PR mechanism greatly improves the efficiency of distributed software development in the open-source community. Nevertheless, the massive number of PRs in an open-source software (OSS) project increases the workload of developers. To reduce this burden, many previous studies have investigated factors that affect the chance of a PR being accepted and have built prediction models based on these factors. However, most prediction models rely on data generated only after a PR has been open for a while (e.g., comments on the PR), which limits their usefulness in practice, because integrators must still spend considerable effort inspecting PRs. In this study, we propose an approach named E-PRedictor (earlier PR predictor) to predict whether a PR will be merged at the moment it is created. E-PRedictor combines three dimensions of manual statistic features (i.e., contributor profile, specific pull request, and project profile) with deep semantic features generated by BERT models from the description and code changes of PRs. To evaluate the performance of E-PRedictor, we collect 475,192 PRs from 49 popular open-source projects on GitHub. The experimental results show that our proposed approach can effectively predict whether a PR will be merged. E-PRedictor significantly outperforms baseline models (e.g., Random Forest and VDCNN) built on manual features. In terms of F1@Merge, F1@Reject, and AUC (area under the receiver operating characteristic curve), E-PRedictor achieves 90.1%, 60.5%, and 85.4%, respectively.
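The fusion step the abstract describes (manual statistic features concatenated with BERT-derived semantic features, then scored by a classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature names, weights, and the tiny 4-dimensional stand-in embedding are all hypothetical (a real BERT [CLS] vector would be 768-dimensional, and the weights would be learned from PR history).

```python
import math

def fuse_features(manual, semantic):
    """Concatenate manual statistic features with a semantic embedding.

    `manual`   - e.g. [contributor_merge_rate, files_changed, log_project_stars]
    `semantic` - stand-in for a BERT [CLS] vector of the PR description/diff.
    """
    return list(manual) + list(semantic)

def predict_merge(features, weights, bias=0.0):
    """Logistic scorer: probability that the PR will be merged at creation time."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Toy example with made-up numbers (a real model is trained, not hand-weighted).
manual = [0.8, 3.0, 1.2]            # hypothetical contributor/PR/project features
semantic = [0.1, -0.4, 0.7, 0.2]    # hypothetical 4-dim embedding
x = fuse_features(manual, semantic)
p = predict_merge(x, weights=[0.5, -0.1, 0.2, 0.3, 0.1, 0.4, -0.2], bias=-0.3)
print(round(p, 3))
```

The point of the early-prediction setup is that every input above is available the instant the PR is opened; nothing depends on later review activity such as comments.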
As people become increasingly reliant on the Internet, securely storing and publishing private data has become an important issue. In real life, the release of graph data can lead to privacy breaches, which is a highly challenging problem. Although current research has addressed the issue of identity disclosure, two challenges remain: first, privacy protection for large-scale datasets is not yet comprehensive; second, it is difficult to simultaneously protect the privacy of nodes, edges, and attributes in social networks. To address these issues, this paper proposes a (k,t)-graph anonymity algorithm based on enhanced clustering. The algorithm uses k-means++ clustering to achieve k-anonymity and applies t-closeness to strengthen it. We evaluate the privacy and efficiency of this method on two datasets and achieve good results. This research is of great significance for addressing the privacy breaches that may arise from the publication of graph data.
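As a rough illustration of the two building blocks the abstract names, the sketch below groups records into clusters of at least k members (a greedy stand-in for the paper's k-means++-based clustering) and then measures how far each cluster's sensitive-attribute distribution drifts from the population's (a total-variation stand-in for t-closeness, which is usually defined via Earth Mover's Distance). The toy records are invented.

```python
from collections import Counter

def kgroup(records, k, key=lambda r: r[0]):
    """Greedy stand-in for clustering-based k-anonymity: sort by a
    quasi-identifier and cut into groups of at least k records, so every
    record is indistinguishable from >= k-1 others in its group."""
    rs = sorted(records, key=key)
    groups = [rs[i:i + k] for i in range(0, len(rs), k)]
    if len(groups) > 1 and len(groups[-1]) < k:   # fold a short tail back in
        groups[-2].extend(groups.pop())
    return groups

def tv_distance(group, population, attr=lambda r: r[1]):
    """Stand-in for the t-closeness check: total-variation distance between a
    group's sensitive-attribute distribution and the population's."""
    def dist(rows):
        c = Counter(attr(r) for r in rows)
        return {v: c[v] / len(rows) for v in c}
    g, p = dist(group), dist(population)
    return 0.5 * sum(abs(g.get(v, 0) - p.get(v, 0)) for v in set(g) | set(p))

# Records: (age quasi-identifier, sensitive diagnosis) - invented toy data.
data = [(23, "flu"), (25, "flu"), (31, "cancer"),
        (34, "flu"), (41, "cancer"), (44, "flu")]
groups = kgroup(data, k=2)
ok = all(len(g) >= 2 for g in groups)
worst = max(tv_distance(g, data) for g in groups)
print(ok, round(worst, 3))
```

A (k,t) guarantee would require `worst <= t` for every published group; groups that fail the check would be merged or re-clustered before release.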
Network-on-Chip (NoC) is crucial for modern multicore systems, offering high throughput and low latency. However, its shared memory faces threats like illegal access and DDoS attacks. To enhance security, Memory Prote...
The rapid growth in the storage scale of wide-area distributed file systems (DFS) calls for fast and scalable metadata management. Metadata replication is a widely used technique for improving the performance and scalability of metadata management. Because file systems must satisfy POSIX requirements, many existing metadata management techniques adopt a costly design for the sake of metadata consistency, leading to unacceptable performance overhead. We propose a new metadata consistency maintenance method (ICCG), which includes an incremental consistency guaranteed directory tree synchronization (ICGDT) and a causal consistency guaranteed replica index synchronization (CCGRI), to ensure system performance without sacrificing metadata consistency. ICGDT uses a flexible consistency scheme based on the state of files and directories, maintained through a conflict state tree, to provide incremental consistency for metadata, satisfying both consistency and performance requirements. CCGRI ensures low-latency, consistent access to data by establishing causal consistency for replica indexes through multi-version extent trees and logical time. Experimental results demonstrate the effectiveness of our methods. Compared with the strong consistency policies widely used in modern DFSes, our methods significantly improve system performance. For example, in file creation, ICCG can improve the performance of directory tree operations by at least 36.4 times.
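The causal-consistency idea behind CCGRI (versioned index entries ordered by logical time, with reads served at a causal cut) can be illustrated with a toy multi-version index. The real system uses multi-version extent trees; a plain per-path version list stands in for them here, and all names are hypothetical.

```python
class VersionedIndex:
    """Toy multi-version replica index with a Lamport clock, illustrating
    causally consistent reads: a read at logical time T sees the newest
    version written at or before T, never a later one."""
    def __init__(self):
        self.clock = 0
        self.versions = {}   # path -> list of (logical timestamp, extent)

    def tick(self, seen=0):
        # Lamport rule: advance past any timestamp observed from a peer replica.
        self.clock = max(self.clock, seen) + 1
        return self.clock

    def write(self, path, extent, seen=0):
        ts = self.tick(seen)
        self.versions.setdefault(path, []).append((ts, extent))
        return ts

    def read(self, path, at):
        """Newest version with timestamp <= `at` (the causal cut)."""
        best = None
        for ts, extent in self.versions.get(path, []):
            if ts <= at and (best is None or ts > best[0]):
                best = (ts, extent)
        return best

idx = VersionedIndex()
t1 = idx.write("/a/file", "extent-v1")
t2 = idx.write("/a/file", "extent-v2", seen=t1)
print(idx.read("/a/file", at=t1))   # -> (1, 'extent-v1')
print(idx.read("/a/file", at=t2))   # -> (2, 'extent-v2')
```

Because the second write carries `seen=t1`, its timestamp is guaranteed to exceed t1, so no replica can observe v2 while believing v1 is still current: exactly the ordering property causal consistency needs, without a strongly consistent (and slow) global lock.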
The elliptic curve discrete logarithm problem (ECDLP) is a popular choice for cryptosystems due to its high level of security. However, with the advent of the extended Shor's algorithm, there is concern that ECDLP may soon be broken. While the algorithm does offer hope of solving ECDLP, it is still uncertain whether it can pose a real threat in practice. From the perspective of the algorithm's quantum circuits, this paper analyzes the feasibility of cracking ECDLP on an ion trap quantum computer with improved quantum circuits for the extended Shor's algorithm. We give precise quantum circuits for the extended Shor's algorithm to calculate discrete logarithms on elliptic curves over prime fields, including modular subtraction, three different kinds of modular multiplication, and modular inversion. Additionally, we incorporate and improve upon windowed arithmetic in the circuits to reduce the CNOT-count. Whereas previous studies mostly focused on minimizing the number of qubits or the depth of the circuit, we focus on minimizing the number of CNOT gates, which greatly affects the running time of the algorithm on an ion trap quantum computer. Specifically, we begin by presenting implementations of basic arithmetic operations with the lowest known CNOT-counts, along with improved constructions for modular inverse, point addition, and windowed arithmetic. Then, we precisely estimate that, to execute the extended Shor's algorithm with the improved circuits on an n-bit instance, the CNOT-count required is 1237n^3/log n + 2n^2 + … . Finally, we analyze the running time and feasibility of the extended Shor's algorithm on an ion trap quantum computer.
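The classical arithmetic that those quantum circuits implement reversibly (modular inversion, point addition, scalar multiplication over a prime field) can be stated compactly. The sketch below is ordinary classical code on a 17-element textbook curve, shown only to make explicit what problem the circuits attack; it is unrelated to the paper's circuit constructions.

```python
def inv_mod(x, p):
    """Modular inverse via Fermat's little theorem (p prime) - the operation
    whose reversible circuit is among the most expensive pieces of the
    extended Shor's algorithm."""
    return pow(x, p - 2, p)

def ec_add(P, Q, a, p):
    """Affine point addition on y^2 = x^3 + a*x + b over GF(p); None = infinity."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * inv_mod(2 * y1, p) % p
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P, a, p):
    """Double-and-add scalar multiplication: k copies of P summed."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        k >>= 1
    return R

# Textbook toy curve y^2 = x^3 + 2x + 2 over GF(17), generator G = (5, 1).
p, a, G = 17, 2, (5, 1)
Q = ec_mul(7, G, a, p)
# ECDLP: recover k from Q = kG. Trivial by brute force here; classically
# infeasible at cryptographic sizes, which is what Shor's algorithm changes.
k = next(k for k in range(1, 20) if ec_mul(k, G, a, p) == Q)
print(k)   # -> 7
```

Each of these steps (the subtraction inside `lam`, the multiplications, the inversion) corresponds to one of the modular-arithmetic circuit families whose CNOT-counts the paper optimizes.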
Researchers have recently achieved significant advances in deep learning techniques, which in turn have substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many studies have also been presented at top conferences and in journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on the learning techniques, that is, what kinds of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as the challenges and opportunities in each subarea. To this end, in this study, we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. These subareas span the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, a survey covering as many subareas as possible can help future research push forward the frontier of deep learning-based software engineering more systematically. For each of the selected subareas, we highlight the major advances achieved by applying deep learning techniques, with pointers to the available datasets i…
Both acceleration and pseudo-acceleration response spectra play important roles in structural seismic design. However, only one of them is generally provided in most seismic codes. Therefore, many studies have attempted to develop conversion models between the acceleration response spectrum (SA) and the pseudo-acceleration response spectrum (PSA). Our previous studies found that the relationship between SA and PSA is affected by magnitude, distance, and site class. Accordingly, we developed an SA/PSA model incorporating these factors. However, that model is suitable only for cases with small and moderate magnitudes, and its accuracy is not good enough for cases with large magnitudes. This paper aims to develop an efficient SA/PSA model that considers the influences of magnitude, distance, and site class and can be applied not only to cases with small or moderate magnitudes but also to cases with large magnitudes. For this purpose, regression analyses were conducted using 16,660 horizontal seismic records with a wider range of magnitudes: the magnitudes of these records vary from 4 to 9, and the distances vary from 10 to 200 km. The ground motions were recorded at 338 stations covering four site classes. By comparing it with existing models, we found that the proposed model shows better accuracy for all magnitudes, distances, and site classes considered in this study.
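The distinction the abstract relies on can be made concrete: SA is the peak absolute (total) acceleration of a damped single-degree-of-freedom oscillator, while PSA is the natural frequency squared times its peak relative displacement, so the two differ whenever damping is non-zero. Below is a minimal Newmark-average-acceleration sketch; the sine-pulse "ground motion" is invented rather than a real record, and the implementation is illustrative, not the paper's regression procedure.

```python
import math

def response_spectra(ag, dt, period, zeta=0.05):
    """SA and PSA of one SDOF oscillator under ground acceleration `ag`,
    integrated by Newmark's average-acceleration method (gamma=1/2, beta=1/4).
    SA  = peak |relative accel + ground accel| (total acceleration);
    PSA = omega^2 * peak |relative displacement|.
    They coincide only as damping -> 0, which is why magnitude/distance/site-
    dependent SA/PSA conversion models are needed."""
    w = 2 * math.pi / period
    c, k = 2 * zeta * w, w * w               # unit mass: u'' + c u' + k u = -ag
    gamma, beta = 0.5, 0.25
    u, v, a = 0.0, 0.0, -ag[0]
    keff = k + gamma * c / (beta * dt) + 1 / (beta * dt * dt)
    umax, atot_max = 0.0, abs(a + ag[0])
    for agi in ag[1:]:
        p = (-agi
             + u * (1 / (beta * dt * dt) + gamma * c / (beta * dt))
             + v * (1 / (beta * dt) + c * (gamma / beta - 1))
             + a * (1 / (2 * beta) - 1 + c * dt * (gamma / (2 * beta) - 1)))
        un = p / keff
        vn = (gamma / (beta * dt) * (un - u) + v * (1 - gamma / beta)
              + a * dt * (1 - gamma / (2 * beta)))
        an = (un - u) / (beta * dt * dt) - v / (beta * dt) - a * (1 / (2 * beta) - 1)
        u, v, a = un, vn, an
        umax = max(umax, abs(u))
        atot_max = max(atot_max, abs(a + agi))
    return atot_max, k * umax                 # (SA, PSA)

# Invented ground motion: a 2 s, 2 Hz sine pulse sampled at 100 Hz.
dt = 0.01
ag = [0.3 * math.sin(2 * math.pi * 2.0 * i * dt) for i in range(200)]
sa, psa = response_spectra(ag, dt, period=1.0, zeta=0.05)
print(round(sa, 4), round(psa, 4))
```

Running the same oscillator over a suite of periods yields the two spectra; the ratio SA/PSA computed this way per record is the quantity the paper's regression model predicts from magnitude, distance, and site class.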
Due to the expiration of the design service life of some nuclear facilities, a large number of hazardous radioactive devices need to be dismantled. Therefore, special attention needs to be paid to the safety and relia...
Addressing the issues of complex demolition environments in nuclear facility decommissioning projects, the difficulty of achieving multi-axis linkage for robotic arms, and low work efficiency, an optimization method t...
Deep learning(DL) systems exhibit multiple behavioral characteristics such as correctness, robustness, and fairness. Ensuring that these behavioral characteristics function properly is crucial for maintaining the accu...