The great success of numerous community-based open source software (OSS) projects rests on volunteers continuously submitting contributions, but ensuring sustainability is a persistent challenge in OSS communities. Although the motivations behind and barriers to OSS contributors' joining and retention have been extensively studied, the impacts of, reasons for, and solutions to contribution abandonment at the individual level have not been well studied, especially for pull-based development. To bridge this gap, we present an empirical study on pull request abandonment based on a sizable dataset. We manually examine 321 abandoned pull requests on GitHub and then quantify the manual observations by surveying 710 OSS developers. We find that while the lack of integrators' responsiveness and the lack of contributors' time and interest remain the main reasons that deter contributors from participation, limitations during the processes of patch updating and consensus reaching can also cause abandonment. We also show the significant impacts of pull request abandonment on project management and maintenance. Moreover, we elucidate the strategies used by project integrators to cope with abandoned pull requests and highlight the need for a practical handover mechanism. We discuss actionable suggestions and implications for OSS practitioners and tool builders, which can help to upgrade the infrastructure and optimize the mechanisms of OSS communities.
Pull requests (PRs) that are neither progressed nor resolved clutter the list of PRs, making it difficult for the maintainers to manage and prioritize unresolved PRs. To automatically track, follow up, and close such inactive PRs, Stale bot was introduced by GitHub. Despite its increasing adoption, there are ongoing debates on whether using Stale bot alleviates or exacerbates the problem of inactive PRs. To better understand if and how Stale bot helps projects in their pull-based development workflow, we perform an empirical study of 20 large and popular open source projects. We find that Stale bot can help deal with a backlog of unresolved PRs, as the projects closed more PRs within the first few months of adoption. Moreover, Stale bot can help improve the efficiency of the PR review process: after adoption, the projects were faster both to review PRs that ended up merged and to resolve PRs that ended up closed. However, Stale bot can also negatively affect the contributors, as the projects experienced a considerable decrease in their number of active contributors after the adoption. Therefore, relying solely on Stale bot to deal with inactive PRs may lead to decreased community engagement and an increased probability of contributor abandonment.
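To make the "track, follow up, and close" behavior described above concrete, here is a minimal sketch of an inactivity-based triage rule. It is an illustration under stated assumptions, not the bot's actual implementation: the `PullRequest` type, the `triage` function, and both day thresholds are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical PR record; real bots read this from the hosting platform."""
    number: int
    last_activity: datetime
    marked_stale_at: Optional[datetime] = None


def triage(pr: PullRequest, now: datetime,
           days_until_stale: int = 60, days_until_close: int = 7) -> str:
    """Return the next action for a PR: active, mark_stale, wait, or close.

    Both thresholds are assumed values for illustration only.
    """
    # The PR was already flagged and nothing has happened since the flag.
    if pr.marked_stale_at is not None and pr.last_activity <= pr.marked_stale_at:
        if now - pr.marked_stale_at >= timedelta(days=days_until_close):
            return "close"
        return "wait"
    # Not flagged (or activity resumed after the flag): check inactivity.
    if now - pr.last_activity >= timedelta(days=days_until_stale):
        return "mark_stale"
    return "active"
```

A rule of this shape closes PRs purely on elapsed time, which may help explain both findings above: backlogs shrink quickly, yet a contributor who is waiting on a maintainer can still see their PR closed.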
The success of a pull request (PR) depends on the responsiveness of the maintainers and the contributor during the review process. Being aware of the expected waiting times can lead to better interactions and managed expectations for both the maintainers and the contributor. In this paper, we propose a machine-learning approach to predict the first response latency of the maintainers following the submission of a PR, and the first response latency of the contributor after receiving the first response from the maintainers. We curate a dataset of 20 large and popular open-source projects on GitHub and extract 21 features to characterize projects, contributors, PRs, and review processes. Using these features, we then evaluate seven types of classifiers to identify the best-performing models. We also conduct permutation feature importance and SHAP analyses to understand the importance and the impact of different features on the predicted response latencies. We find that our CatBoost models are the most effective for predicting the first response latencies of both maintainers and contributors. Compared to a dummy classifier that always returns the majority class, these models achieved an average improvement of 29% in AUC-ROC and 51% in AUC-PR for maintainers, as well as 39% in AUC-ROC and 89% in AUC-PR for contributors across the studied projects. The results indicate that our models can aptly predict the first response latencies using the selected features. We also observe that PRs that are submitted earlier in the week, contain an average number of commits, and have concise descriptions are more likely to receive faster first responses from the maintainers. Similarly, PRs that have a lower first response latency from maintainers, receive the maintainers' first response earlier in the week, and contain an average number of commits tend to receive faster first responses from the contributors. Additionally, contributors with a higher acceptance rate and a history of timel
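The comparison against a majority-class dummy classifier can be illustrated with a small, dependency-free sketch. It assumes a binary label for simplicity (for example, fast versus slow response) and assumes "improvement" means relative gain over the baseline score; the abstract does not confirm either choice. The function names `auc_roc` and `relative_improvement` are hypothetical, and only AUC-ROC is shown.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outranks a random
    negative, counting ties as half a win (pairwise formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_improvement(model_score: float, baseline_score: float) -> float:
    """Relative gain over the baseline (an assumed definition of 'improvement')."""
    return (model_score - baseline_score) / baseline_score


labels = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]   # made-up model outputs
dummy_scores = [0.0] * len(labels)     # a constant prediction ranks nothing

model_auc = auc_roc(labels, model_scores)
dummy_auc = auc_roc(labels, dummy_scores)
gain = relative_improvement(model_auc, dummy_auc)
```

Because a constant predictor ties on every pair, its AUC-ROC is 0.5, so any relative gain measures how much ranking ability the features add beyond chance.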