Cryptography is known as a challenging topic for developers. We studied Stack Overflow posts to identify the problems that developers encounter when using the Java Cryptography Architecture (JCA) for symmetric encryption. We investigated the security risks that are disseminated in these posts, and we examined whether ChatGPT helps avoid cryptography issues. We found that developers frequently struggle with key and IV generation, as well as padding. Security is a top concern among developers, but security issues are pervasive in code snippets. ChatGPT can effectively aid developers when they engage with it properly. Nevertheless, it does not substitute for human expertise, and developers should remain alert.
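To make the key, IV, and padding concerns named in this abstract concrete, the following is a minimal sketch of symmetric encryption with the JCA. It is not taken from the study: the choice of AES in GCM mode (which needs no padding), the 256-bit key, the 12-byte IV, and the class and method names (`AesGcmSketch`, `newKey`, `encrypt`, `decrypt`) are all assumptions made for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

// Illustrative sketch only; names and parameter choices are assumptions.
class AesGcmSketch {
    private static final int IV_BYTES = 12;  // 96-bit IV, a common choice for GCM
    private static final int TAG_BITS = 128; // length of the authentication tag

    // Generate a random AES key with the JCA KeyGenerator instead of deriving it from a string.
    static SecretKey newKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Encrypt with a fresh random IV and return IV || ciphertext || tag.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv); // never reuse an IV with the same key
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] message = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, message, 0, iv.length);
        System.arraycopy(ciphertext, 0, message, iv.length, ciphertext.length);
        return message;
    }

    // Read the IV back from the front of the message, then decrypt and verify the tag.
    static byte[] decrypt(SecretKey key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
        return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        byte[] original = "attack at dawn".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decrypt(key, encrypt(key, original));
        System.out.println(Arrays.equals(original, roundTrip));
    }
}
```

Storing the IV alongside the ciphertext is one common convention; the IV is not secret, but it must be unique for each encryption under the same key.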
ISBN: 9780998133171 (print)
As rapidly expanding digital transformation at many organizations requires the development of a growing number of software solutions, low-code development platforms (LCDPs) have come to be widely used by pretrained business users in use cases such as process automation and rapid application development. Our study explores the challenges that developers face when using LCDPs by investigating 30,000 of their posts on Stack Overflow, one of the most prominent forums. It applies text-mining approaches, primarily Latent Dirichlet Allocation (LDA), to identify these challenges. The challenges fall into the areas of visualization, third-party integration, database and table management, datatype conversion, programming languages, and file handling; we discuss them further and propose possible enhancements for users of LCDPs.
ISBN: 9798331523169; 9798400712494 (print)
Stack Overflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is an online development platform used for storing, tracking, and collaborating on software projects. Prior work relates the information mined from both platforms without carefully inspecting answer-reuse practices. In this paper, we conducted an empirical study by mining the SO answers reused by Java projects available on GitHub. We created a hybrid approach of clone detection, keyword-based search, and manual inspection to identify the answer(s) actually used by developers. Based on those answers, we studied the topics of the discussion threads, answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers' reuse practices. We observed that most reused answers offer programs that implement specific coding tasks. Among all analyzed SO discussion threads, the reused answers often have higher scores, older ages, longer code, and longer text than unused answers. In only 9% of scenarios (40/430) did developers fully copy answer code for reuse. In the remaining scenarios, they reused partial code or created brand new code from scratch. Our study characterized 130 SO discussion threads referred to by Java developers in 357 GitHub projects. Our observations can guide SO answerers to provide better answers, and shed light on future human-centric research that creates better tools to help with code reuse.
ISBN: 9780998133164 (print)
A large part of knowledge evolves outside of the operations of an organization. Online question-and-answer social platforms provide an important source of information for exploring the underlying communities. Stack Overflow (SO) is one of the most popular question-and-answer platforms for developers, with more than 23 million questions asked. Organizing and categorizing data is crucial for managing knowledge in such large quantities. Questions posted on SO are assigned a set of tags, and the textual content of each question may contain code syntax. In this paper, we evaluate the performance of multiple text representation methods in the task of predicting tags for SO questions and empirically demonstrate the impact of code syntax on text representations. The SO dataset was sampled, and questions without code syntax were identified. Two classical text representation methods, BoW and TF-IDF, were selected along with four other methods based on pre-trained models: fastText, USE, Sentence-BERT, and Sentence-RoBERTa. A multi-label k-Nearest Neighbors classifier was used to learn and predict tags based on the similarities between feature-vector representations of the input data. Our results indicate a consistent superiority of the representations generated by Sentence-RoBERTa. Overall, the classifier achieved an improvement of 17% or more in F1 score when predicting tags for questions without any code syntax in their content.
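The tag-prediction setup described above can be sketched with one of the classical representations the abstract names, TF-IDF, combined with a nearest-neighbor vote. This is a simplified illustration, not the authors' pipeline: the smoothed IDF formula, the cosine similarity measure, the "at least half of the k neighbors" voting rule, and all names (`TagKnnSketch`, `predict`, and so on) are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; the weighting and voting rules are assumptions.
class TagKnnSketch {
    // Split text into lowercase word tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase().split("[^a-z0-9#+]+")) {
            if (!token.isEmpty()) tokens.add(token);
        }
        return tokens;
    }

    // Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
    static Map<String, Double> idf(List<String> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : docs) {
            for (String token : new HashSet<>(tokenize(doc))) docFreq.merge(token, 1, Integer::sum);
        }
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            weights.put(e.getKey(), Math.log((1.0 + docs.size()) / (1.0 + e.getValue())) + 1.0);
        }
        return weights;
    }

    // Sparse TF-IDF vector; terms unseen in the training questions are ignored.
    static Map<String, Double> vectorize(String text, Map<String, Double> idf) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : tokenize(text)) {
            Double weight = idf.get(token);
            if (weight != null) vector.merge(token, weight, Double::sum); // adds idf once per occurrence
        }
        return vector;
    }

    // Cosine similarity between two sparse vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (double x : b.values()) normB += x * x;
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Predict every tag carried by at least half of the k most similar training questions.
    static Set<String> predict(String question, List<String> docs, List<Set<String>> tags, int k) {
        Map<String, Double> idf = idf(docs);
        Map<String, Double> query = vectorize(question, idf);
        double[] sims = new double[docs.size()];
        Integer[] order = new Integer[docs.size()];
        for (int i = 0; i < docs.size(); i++) {
            sims[i] = cosine(query, vectorize(docs.get(i), idf));
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> Double.compare(sims[y], sims[x])); // most similar first
        int n = Math.min(k, order.length);
        Map<String, Integer> votes = new HashMap<>();
        for (int i = 0; i < n; i++) {
            for (String tag : tags.get(order[i])) votes.merge(tag, 1, Integer::sum);
        }
        Set<String> predicted = new TreeSet<>();
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (2 * e.getValue() >= n) predicted.add(e.getKey());
        }
        return predicted;
    }
}
```

Swapping `vectorize` for a dense sentence embedding would leave the neighbor search and voting unchanged, which is what makes a comparison across representations of the kind the abstract reports straightforward to set up.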
ISBN: 9781665403375 (print)
Developer forums like Stack Overflow have become essential resources for modern software development practices. However, many code snippets lack a well-defined method declaration, and thus they are often incomplete for immediate reuse. Developers must adapt the retrieved code snippets by parameterizing the variables involved and identifying the return value. This activity, which we call APIzation of a code snippet, can be tedious and time-consuming. In this paper, we present APIzator, which performs APIzations of Java code snippets automatically. APIzator is grounded in four common patterns that we extracted by studying real APIzations on GitHub. APIzator uses a static analysis algorithm that automatically extracts the method parameters and return statements. We evaluated APIzator against a ground truth of 200 APIzations collected from 20 developers. For 113 (56.50%) and 115 (57.50%) APIzations, APIzator and the developers extracted identical parameters and return statements, respectively. For 163 (81.50%) APIzations, either the parameters or the return statements were identical.
Stack Overflow has become an emerging resource for talent recognition in recent years. While users employ technical language on Stack Overflow, recruiters try to find relevant candidates for jobs using their own terminology. This creates a gap between recruiters' and candidates' terms. Because of this gap, state-of-the-art expert finding models cannot effectively address the expert finding problem on Stack Overflow. We propose two translation models to bridge this gap. The first is a statistical method and the second is based on word embeddings. Several translations of a given query are used during the scoring step, and the results of the intermediate queries are blended together to obtain the final ranking. We also propose a new approach that takes the quality of documents into account in the scoring step. We made several observations to visualize the effectiveness of the translation approaches and of the quality-aware scoring approach. Our experiments indicate the following. First, while the statistical and word embedding translation approaches provide different translations for each query, both can considerably improve recall. Second, the quality-aware scoring approach can improve precision remarkably. Finally, our best proposed method can improve the MAP measure by up to 46% on average in comparison with the state-of-the-art expert finding approach. (C) 2019 Elsevier Ltd. All rights reserved.
ISBN: 9781450381697 (print)
Web3D developers often have to decide which technologies are best for their projects. We explore that question from the perspective of community attention and support on Stack Overflow (SO). We focused on i) WebGL, a key low-level JavaScript (JS) API used to render 3D graphics in browsers without plugins, and ii) ***, a higher-level JS library that reuses WebGL and is reputed to be easier and more intuitive. We considered questions from SO tagged with WebGL or *** and extracted all tags used on these questions. Using these, we were able to compare the relative attention (considering the number of questions and views) and support (considering satisfactory answers and how long they take) received by concerns and technologies associated with WebGL and ***. Our results suggest that *** gets significantly more community attention but less community support than WebGL on SO.
The popularity of question-and-answer websites such as Stack Overflow, Ask, and Yahoo! Answers is gradually increasing. Given this increased popularity, the quality of the questions and answers needs to be taken into account, because any question may receive many answers and some of these answers are of low quality. Assessing the credibility and expertise of the questioner and the respondents in the field of the question is one way to address this problem. In other words, individuals with a high level of expertise ask more difficult, higher-quality questions in their field, and individuals with a high level of expertise can answer these questions appropriately. The present paper aims to determine the expertise level of individuals based on the available statistical data about questions and answers. For this purpose, two methods were tested. The first method, referred to in this study as scoring, emphasizes the scores of the questions and answers. Its basic assumption is that the scores of questions and answers are related to each other, because individuals with a higher level of expertise raise questions with higher scores and also provide answers with higher scores. Therefore, it should be possible to determine the level of expertise of individuals by examining the scores of their questions and answers. In the second method, named comment-mining here, the expertise of an individual is determined by scoring positive and negative words in the comments and ultimately obtaining a final score for the questions and answers of a user. Actual data from the Stack Overflow website were used to carry out these methods. The results of the scoring method show that there is no significant relationship between the scores of the questions and answers, so this method cannot be used to determine the level of expertise ...
ISBN: 9781728134215 (print)
Software reuse is a well-established software engineering process that aims at improving development productivity. Although reuse can be performed in a systematic way (e.g., through product lines), in practice it is often performed opportunistically, i.e., by copying small code chunks either from the web or from in-house projects. Knowledge-sharing communities, and especially Stack Overflow, constitute the primary source of code-related information for amateur and professional software developers. Despite the obvious benefit of increased productivity, reuse can have a mixed effect on the quality of the resulting code, depending on the properties of the reused solutions. An efficient concept for capturing a wide range of internal software qualities is the metaphor of Technical Debt, which expresses the impact of shortcuts in software development on maintenance costs. In this paper, we present the results of an empirical study on the relation between reusing code retrieved from Stack Overflow and the technical debt of the target system. In particular, we study several open-source projects and identify non-trivial pieces of code that exhibit a perfect or near-perfect match with code provided in answers on Stack Overflow. Then, we compare the technical debt density of the reused fragments, obtained as the ratio of inefficiencies identified by SonarQube over the lines of reused code, to the technical debt density of the target codebase. The results provide insights into the potential impact of small-scale code reuse on technical debt and highlight the benefits of assessing code quality before committing changes to a repository.
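The density measure defined in this abstract (issues reported by SonarQube divided by lines of code) can be written out as a short sketch. The class and method names are hypothetical, the figures in `main` are made-up inputs rather than results from the study, and expressing the comparison as a ratio of the two densities is an assumption; the abstract only says the two densities are compared.

```java
// Illustrative sketch only; names and the ratio-based comparison are assumptions.
class DebtDensitySketch {
    // Technical debt density: number of reported issues per line of code.
    static double density(int issues, int linesOfCode) {
        if (linesOfCode <= 0) throw new IllegalArgumentException("linesOfCode must be positive");
        return (double) issues / linesOfCode;
    }

    // Ratio of the reused fragment's density to the codebase's density.
    // A value above 1 means the reused fragment carries more issues per line than the codebase.
    static double relativeDensity(int fragmentIssues, int fragmentLines,
                                  int codebaseIssues, int codebaseLines) {
        double codebase = density(codebaseIssues, codebaseLines);
        if (codebase == 0) throw new ArithmeticException("codebase density is zero");
        return density(fragmentIssues, fragmentLines) / codebase;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 2 issues in a 32-line reused fragment,
        // 256 issues in a 16384-line target codebase.
        System.out.println(density(2, 32));
        System.out.println(relativeDensity(2, 32, 256, 16384));
    }
}
```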
ISBN: 9789897583728 (print)
Diversity is being intensively discussed across different areas of society, and discussions in Software Engineering are increasing as well. There is unconscious bias and a lack of representativeness when it comes to characteristics such as ethnicity and gender, to mention a few. How can tech companies support diversity and minimize unconscious bias in their teams? Studies say that diversity builds better teams and delivers better results, among other benefits. Cognitive diversity is linked to better outcomes and is influenced by identity diversity (e.g., gender, race, etc.), mainly when tasks are related to problem-solving and prediction. In this work, we are interested in understanding the pain points in software engineering regarding diversity and in providing insights to support attraction, hiring, and retention policies for more diverse software engineering environments. Stack Overflow is a popular community question-and-answer forum with high engagement from software developers. Every year, it conducts a survey, presents straightforward results, and makes the anonymized results available for download, so it is possible to perform additional analyses beyond the original ones. Using data visualization techniques, we analyzed the 2018 data to derive insights and recommendations. Results show that diversity in companies is not yet a conscious decision-making factor for developers assessing a new job opportunity, and that respondents from underrepresented groups are more likely to believe they are not as good as their peers. We also propose a discussion of unconscious bias, stereotypes, and impostor syndrome, and of how to provide support in these areas.