A pull request (PR) is an event in Git where a contributor asks project maintainers to review code that the contributor wants to merge into a project. The PR mechanism greatly improves the efficiency of distributed software development in the open-source community. Nevertheless, the massive number of PRs in an open-source software (OSS) project increases the workload of developers. To reduce this burden, many previous studies have investigated factors that affect the chance of a PR being accepted and have built prediction models based on these factors. However, most prediction models rely on data that becomes available only some time after a PR is submitted (e.g., comments on the PR), which makes them of limited use in practice because integrators still need to spend a large amount of effort inspecting PRs. In this study, we propose an approach named E-PRedictor (earlier PR predictor) to predict whether a PR will be merged at the time it is created. E-PRedictor combines three dimensions of manual statistical features (i.e., contributor profile, specific pull request, and project profile) with deep semantic features generated by BERT models from the description and code changes of PRs. To evaluate the performance of E-PRedictor, we collect 475192 PRs from 49 popular open-source projects on GitHub. The experimental results show that the proposed approach can effectively predict whether a PR will be merged. E-PRedictor significantly outperforms the baseline models (e.g., Random Forest and VDCNN) built on manual features. In terms of F1@Merge, F1@Reject, and AUC (area under the receiver operating characteristic curve), E-PRedictor achieves 90.1%, 60.5%, and 85.4%, respectively.
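The two F1 figures above are reported per class. As a minimal sketch, assuming F1@Merge and F1@Reject denote the F1 score computed with "merged" and "rejected" respectively as the positive class (the abstract does not define them), the calculation could look like this. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = merged, 0 = rejected.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]

f1_merge = f1_for_class(y_true, y_pred, positive=1)
f1_reject = f1_for_class(y_true, y_pred, positive=0)
print(f"F1@Merge={f1_merge:.2f}, F1@Reject={f1_reject:.2f}")
```

Reporting both values matters when the classes are imbalanced: a model can score well on the majority class (here, merged PRs) while doing noticeably worse on the minority class.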
Machine learning-assisted retrosynthesis planning aims to utilize machine learning (ML) algorithms to find synthetic pathways for target *** recent years, with the development of artificial intelligence (AI), especially ML, researchers’ interest in ML-assisted retrosynthesis planning has rapidly increased, bringing development and opportunities to the *** this review, we aim to provide a comprehensive understanding of ML-assisted retrosynthesis *** first discuss the formal definition and the objective of retrosynthesis planning, and organize a modular framework which includes four modules: data preparation, data preprocessing, pathway generation and evaluation, and pathway ***, we sequentially review the current status of the first three modules (except pathway verification) in the ML-assisted retrosynthesis planning framework, including ideas, methods, and latest *** that, we specifically discuss large language models in retrosynthesis ***, we summarize the extant challenges that are faced by current ML-assisted retrosynthesis planning research and offer a perspective on future research directions and development.
Tables, typically two-dimensional and structured to store large amounts of data, are essential in daily activities like database queries, spreadsheet manipulations, Web table question answering, and image table information *** these table-centric tasks with Large Language Models (LLMs) or Visual Language Models (VLMs) offers significant public benefits, garnering interest from academia and *** survey provides a comprehensive overview of table-related tasks, examining both user scenarios and technical *** covers traditional tasks like table question answering as well as emerging fields such as spreadsheet manipulation and table data *** summarize the training techniques for LLMs and VLMs tailored for table ***, we discuss prompt engineering, particularly the use of LLM-powered agents, for various table-related ***, we highlight several challenges, including diverse user input when serving and slow thinking using chain-of-thought.
Antenna Group Delay Variation (AGDV) is a hardware error source that affects the performance of Dual-Frequency Multi-Constellation (DFMC) Ground-based Augmentation System (GBAS), and these errors are difficult to distinguish from multipath ***, AGDV is usually modeled as a part of the multipath error, which is called the multipath-AGDV ***, because of the inconsistency of AGDV and multipath when switching among different positioning modes of GBAS, and because the traditional model does not consider the impact of the azimuth on AGDV, using the traditional multipath-AGDV model will cause the protection levels to be inaccurately *** this paper, azimuth-based modeling of AGDV is conducted by using anechoic chamber *** biases and standard deviations of AGDV based on azimuths are analyzed and modeled, and the calculation method for the DFMC GBAS protection level is *** results show that the azimuth-based AGDV model and protection level optimization algorithm can better avoid the error exceeding the protection level than the multipath-AGDV *** with AGDV elevation model, the VPLs of the B1C signal are increased by 0.24 m and 0.06 m, and the VPLs of the B2a signal are reduced by 0.01 m and 0.16 m using the 100 s and 600 s DFree filtering positioning modes, *** changes in the B1C and B2a protection levels reflect the changes in AGDV corresponding to the azimuth for the respective frequencies, further ensuring the integrity of airborne users, especially when they turn near the airport.
As people become increasingly reliant on the Internet, securely storing and publishing private data has become an important issue. In real life, the release of graph data can lead to privacy breaches, which is a highly challenging problem. Although current research has addressed the issue of identity disclosure, two challenges remain: first, privacy protection for large-scale datasets is not yet comprehensive; second, it is difficult to simultaneously protect the privacy of nodes, edges, and attributes in social networks. To address these issues, this paper proposes a (k,t)-graph anonymity algorithm based on enhanced clustering. The algorithm uses k-means++ clustering to achieve k-anonymity and applies t-closeness to strengthen it. We evaluate the privacy and efficiency of this method on two datasets and achieve good results. This research is of great significance for addressing the problem of privacy breaches that may arise from the publication of graph data.
Carbon-based nanomaterials have become a long-term research hotspot in the fields of material science and nanotechnology. Carbon nanotubes as one-dimensional nanomaterials have shown great application value in the fie...
Multimodal Sentiment Analysis (MSA) aims to identify human attitudes from diverse modalities such as visual, audio and text modalities. Recent studies suggest that the text modality tends to be the most effective, whi...
Finding semantic relationships between words in several sentences is the goal of document-level relation extraction (DocRE), a crucial problem in natural language processing. Current research is unable to accurately c...
Authors:
Ding, Zixuan; Wang, Ding
Nankai University, College of Cryptology and Cyber Science, Key Laboratory of Data and Intelligent System Security, Ministry of Education, Tianjin 300350, China
Chinese Academy of Sciences, Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, Beijing 100085, China
One-Time Passwords (OTPs) play a crucial role in Two-Factor Authentication (2FA) and Multi-Factor Authentication (MFA) by adding an additional layer of security. OTPs effectively reduce the risk of static passwords be...
Accurately estimating the State of Health (SOH) and Remaining Useful Life (RUL) of lithium-ion batteries (LIBs) is crucial for the continuous and stable operation of battery management ***, due to the complex internal chemical systems of LIBs and the nonlinear degradation of their performance, direct measurement of SOH and RUL is *** address these issues, the Twin Support Vector Machine (TWSVM) method is proposed to predict SOH and ***, the constant current charging time of the lithium battery is extracted as a health indicator (HI), decomposed using Variational Modal Decomposition (VMD), and feature correlations are computed using Importance of Random Forest Features (RF) to maximize the extraction of critical factors influencing battery performance ***, to enhance the global search capability of the Convolution Optimization Algorithm (COA), improvements are made using Good Point Set theory and the Differential Evolution *** Improved Convolution Optimization Algorithm (ICOA) is employed to optimize TWSVM parameters for constructing SOH and RUL prediction ***, the proposed models are validated using NASA and CALCE lithium-ion battery *** results demonstrate that the proposed models achieve an RMSE not exceeding 0.007 and an MAPE not exceeding 0.0082 for SOH and RUL prediction, with a relative error in RUL prediction within the range of [-1.8%, 2%]. Compared to other models, the proposed model not only exhibits superior fitting capability but also demonstrates robust performance.
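For readers unfamiliar with the two error metrics quoted above, the following is a minimal sketch of the standard RMSE and MAPE definitions. It assumes MAPE is expressed as a fraction rather than a percentage, which the abstract does not state, and the SOH values are made-up illustrative numbers, not data from the NASA or CALCE sets.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


# Hypothetical SOH trajectory (fraction of nominal capacity) and model output.
soh_actual = [1.00, 0.95, 0.90, 0.85]
soh_predicted = [0.99, 0.96, 0.90, 0.84]

soh_rmse = rmse(soh_actual, soh_predicted)
soh_mape = mape(soh_actual, soh_predicted)
print(f"RMSE={soh_rmse:.4f}, MAPE={soh_mape:.4f}")
```

RMSE penalizes large individual errors more heavily, while MAPE scales each error by the true value, so the two can rank models differently near the end of battery life when SOH is low.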