In programming education, it is crucial to provide instruction tailored to students' proficiency levels. For this purpose, an objective evaluation of each student's coding ability is essential, as is an understanding of the characteristics of source code written by both beginners and advanced students. Previous research has successfully converted the structural information of source code into graphs and assessed coding skills using deep learning, achieving high accuracy. However, it remains unclear which specific structural elements significantly influence these assessments. This study addresses this gap by transforming source code into abstract syntax trees and developing a model that uses Graph Convolutional Networks to classify code as written by beginner or advanced users based on the learned structural information. Furthermore, we apply Integrated Gradients to visualize the decision-making basis of our model and elucidate the structural characteristics that distinguish source code written by beginners from that written by advanced users.
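The pipeline described above (source code → abstract syntax tree → graph → graph convolution) can be sketched in miniature. The following is an illustrative sketch only, not the paper's model: it uses Python's `ast` module to extract nodes and edges, and a single hand-rolled GCN-style propagation step with an invented one-dimensional feature and an identity weight in place of learned parameters.

```python
import ast

def ast_to_graph(source: str):
    """Parse source into an AST and return node type labels plus parent-child edges."""
    tree = ast.parse(source)
    nodes, edges, index = [], [], {}
    for node in ast.walk(tree):
        index[id(node)] = len(nodes)
        nodes.append(type(node).__name__)
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            edges.append((index[id(node)], index[id(child)]))
    return nodes, edges

def gcn_layer(features, edges, weight):
    """One GCN-style propagation step: average each node's features with its
    neighbours', then apply a linear transform (weights would normally be learned)."""
    n = len(features)
    neigh = [[i] for i in range(n)]          # self-loops
    for a, b in edges:
        neigh[a].append(b)
        neigh[b].append(a)
    out = []
    for i in range(n):
        agg = [sum(features[j][k] for j in neigh[i]) / len(neigh[i])
               for k in range(len(features[0]))]
        out.append([sum(a * w for a, w in zip(agg, col)) for col in weight])
    return out

nodes, edges = ast_to_graph("def f(x):\n    return x + 1\n")
# toy one-dimensional feature: 1.0 if the node is a FunctionDef, else 0.0
feats = [[1.0 if name == "FunctionDef" else 0.0] for name in nodes]
hidden = gcn_layer(feats, edges, [[1.0]])   # identity weight for illustration
```

In the actual study the per-node features and layer weights are learned, and the node representations are pooled into a graph-level prediction; the sketch only shows how AST structure becomes the adjacency that convolution operates over.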
Source code classification (SCC) is the task of assigning code to different categories according to a criterion such as functionality, programming language, or vulnerability. Many source code arch...
Over the past few years, the software engineering (SE) community has widely employed deep learning (DL) techniques in many source code processing tasks. As in other domains such as computer vision and natural language processing (NLP), state-of-the-art DL techniques for source code processing can still suffer from adversarial vulnerability, where minor code perturbations mislead a DL model's inference. Efficiently detecting such vulnerability to expose the risks at an early stage is an essential step and of great importance for further enhancement. This paper proposes codeBERT-Attack (CBA), a novel black-box, effective, and high-quality adversarial attack method for DL models of source code processing, based on the powerful large pre-trained model codeBERT. CBA locates vulnerable positions through masking and leverages the power of codeBERT to generate naturalness-preserving textual perturbations. We turn codeBERT against both general DL models and codeBERT models fine-tuned for specific downstream tasks, and successfully mislead these victim models into erroneous outputs. In addition, by harnessing the power of codeBERT, CBA can effectively generate adversarial examples that are less perceptible to programmers. Our in-depth evaluation on two typical source code classification tasks (i.e., functionality classification and code clone detection), against the widely adopted LSTM and powerful fine-tuned codeBERT models, demonstrates the advantages of our technique in terms of both effectiveness and efficiency. Furthermore, our results show (1) that pre-training may help codeBERT gain further resilience against perturbations, and (2) that certain pre-training tasks may be beneficial for adversarial robustness.
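The first stage of the attack, locating vulnerable positions through masking, can be sketched as follows. This is a simplified illustration, not the paper's implementation: `toy_victim` is an invented stand-in for a real victim classifier's confidence in its current prediction, and in the actual attack a fine-tuned codeBERT would then propose natural replacement tokens for the most sensitive positions.

```python
def locate_vulnerable_positions(tokens, victim_confidence, top_k=2):
    """Mask one token at a time and rank positions by the victim's confidence drop."""
    base = victim_confidence(tokens)
    drops = []
    for i in range(len(tokens)):
        masked = tokens[:i] + ["<mask>"] + tokens[i + 1:]
        drops.append((base - victim_confidence(masked), i))
    drops.sort(reverse=True)          # biggest drop first
    return [i for _, i in drops[:top_k]]

# Toy black-box victim: "confident" only while the identifier `total` is present.
def toy_victim(tokens):
    return 0.9 if "total" in tokens else 0.4

tokens = ["def", "f", "(", "x", ")", ":", "return", "total"]
positions = locate_vulnerable_positions(tokens, toy_victim, top_k=2)
# the position holding `total` ranks first, since masking it hurts the victim most
```

The key property illustrated is that the attack never inspects the victim's gradients; it only queries the model's output, which is what makes the method black-box.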
Despite the ubiquity of data science, we are far from rigorously understanding how coding in data science is performed. Even though the scientific literature has hinted at the iterative and explorative nature of data science coding, we need further empirical evidence to understand this practice and its workflows in detail. Such understanding is critical to recognise the needs of data scientists and, for instance, inform tooling support. To obtain a deeper understanding of the iterative and explorative nature of data science coding, we analysed 470 Jupyter notebooks publicly available in GitHub repositories. We focused on the extent to which data scientists transition between different types of data science activities, or steps (such as data preprocessing and modelling), as well as the frequency and co-occurrence of such transitions. For our analysis, we developed a dataset with the help of five data science experts, who manually annotated the data science steps for each code cell within the aforementioned 470 notebooks. Using a first-order Markov chain model, we extracted the transitions and analysed the transition probabilities between the different steps. In addition to providing deeper insights into the implementation practices of data science coding, our results provide evidence that the steps in a data science workflow are indeed iterative and reveal specific patterns. We also evaluated the use of the annotated dataset to train machine-learning classifiers to predict the data science step(s) of a given code cell. We investigated the representativeness of the classification by comparing the workflow analysis applied to (a) the predicted data set and (b) the data set labelled by experts, finding an F1-score of about 71% for the 10-class data science step prediction problem.
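Estimating first-order Markov transition probabilities from per-cell step annotations is straightforward to sketch. The notebook sequences and step labels below are invented for illustration; the study's own annotations use its 10-class step scheme.

```python
from collections import Counter, defaultdict

def transition_probabilities(step_sequences):
    """Estimate first-order Markov transition probabilities from
    per-notebook sequences of data-science step labels."""
    counts = defaultdict(Counter)
    for seq in step_sequences:
        for a, b in zip(seq, seq[1:]):       # consecutive cell pairs
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

# Hypothetical annotated notebooks (one step label per code cell).
notebooks = [
    ["load", "preprocess", "model", "evaluate"],
    ["load", "preprocess", "preprocess", "model"],
]
probs = transition_probabilities(notebooks)
# probs["preprocess"]["preprocess"] > 0 captures the iterative self-loop
# behaviour the study reports for data science workflows.
```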
ISBN: (print) 9783030937331; 9783030937324
Over the years, programmers have improved their programming skills and can now write code in many different languages to solve problems. A large amount of new code is generated all over the world regularly. Since a programming problem can be solved in many different languages, it is quite difficult to identify the problem from the written source code. Therefore, a classification model is needed to help programmers identify problems written in Multi-Programming Languages (MPLs). Such a classification model can help programmers learn programming more effectively. However, deep-learning-based source code classification models are still lacking in the fields of programming education and software engineering. To address this gap, we propose a stacked Bidirectional Long Short-Term Memory (Bi-LSTM) neural-network-based model for classifying source code developed in MPLs. To accomplish this research, we collected a large number of real-world source codes from the Aizu Online Judge (AOJ) system. The proposed model is trained, validated, and tested on the AOJ dataset. Various hyperparameters are fine-tuned to improve the model's performance. Based on the experimental results, the proposed model achieves an accuracy of about 93% and an F1-score of 89.24%. Moreover, it outperforms state-of-the-art models on other evaluation metrics such as precision (90.12%) and recall (89.48%).
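A stacked Bi-LSTM consumes fixed-length sequences of token indices, so source code must first be tokenized, indexed, and padded. The sketch below shows one plausible minimal version of that input pipeline; the tokenizer regex, vocabulary scheme, and sequence length are assumptions, not the paper's actual AOJ preprocessing.

```python
import re

def tokenize(source: str):
    """Split source code into crude identifier/symbol tokens."""
    return re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]", source)

def build_vocab(corpus, pad="<pad>", unk="<unk>"):
    """Assign an integer index to every token seen in the training corpus."""
    vocab = {pad: 0, unk: 1}
    for src in corpus:
        for tok in tokenize(src):
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(source, vocab, max_len=16):
    """Map tokens to indices and pad/truncate to a fixed length --
    the shape an embedding layer feeding a stacked Bi-LSTM expects."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokenize(source)]
    return (ids + [vocab["<pad>"]] * max_len)[:max_len]

corpus = ["int main() { return 0; }", "print('hello')"]
vocab = build_vocab(corpus)
x = encode("int x = 0;", vocab, max_len=8)   # unseen tokens map to <unk>
```

Per-language token distributions (e.g., `{`/`;` in C-family code versus indentation-free calls in Python) are exactly the signal the recurrent layers then learn to separate.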
ISBN: (print) 9781665448970
A benchmark is an activity in which developers assess performance (e.g., program execution time) by preparing and running several test cases over a long period. To reasonably assess the performance of method-level code snippets, developers can use a micro benchmark. Some micro benchmarks for JavaScript are provided as online web services (e.g., jsPerf and ***). Developers can easily search such a service for code snippets with better performance. They will then find many similar code snippets for different functions, because the micro benchmark service hosts a collection of versatile method-level code snippets. To find replaceable code snippets with better performance, we tackle the problem of distinguishing similar code snippets for different functions at a granularity finer than the method level in micro benchmark services. This study proposes an approach to collect diverse code snippets with similar functionality. The approach measures the similarity between code snippets assessed in the micro benchmark service using code2Vec, and finds an appropriate threshold for associating code snippets with similar functionality. Using a jsPerf dataset that the authors collected, this study evaluates the usefulness of our approach. Specifically, we collect code snippets related to the most frequent topics assessed in jsPerf, "innerHTML vs removeChild" and "for vs forEach". Consequently, we find that our approach achieves high precision (98% and 92%) in identifying diverse code snippets with similar functionality.
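The core step, thresholding pairwise similarity between snippet embeddings, can be sketched as below. The embedding vectors and snippet names are invented; in the actual approach the vectors come from code2Vec and the threshold is tuned empirically.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similar_pairs(vectors, threshold=0.8):
    """Associate snippet pairs whose embeddings exceed the similarity threshold."""
    pairs = []
    names = list(vectors)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                pairs.append((a, b))
    return pairs

# Hypothetical code2Vec-style embeddings for three jsPerf snippets.
vecs = {
    "forEach_v1": [0.9, 0.1, 0.0],
    "forEach_v2": [0.8, 0.2, 0.1],
    "innerHTML":  [0.0, 0.1, 0.9],
}
pairs = similar_pairs(vecs, threshold=0.8)   # groups the two forEach variants
```

Choosing the threshold trades precision for recall: too low and snippets with different functions are conflated, too high and genuinely interchangeable variants are missed, which is why the study searches for an appropriate value against labelled topics.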
In the face of Internet attack threats, malware classification is one of the promising solutions in the fields of intrusion detection and digital forensics. In previous work, researchers performed dynamic analysis or static analysis after reverse engineering. However, malware developers use anti-virtual-machine (VM) and obfuscation techniques to evade malware classifiers. By deploying honeypots, malware source code can be collected and analyzed. Source code analysis provides a better classification for understanding the purpose of attackers and for forensics. In this paper, a novel classification approach is proposed, based on content similarity and directory structure similarity. Such a classification avoids re-analyzing known malware and allocates resources to new malware. Malware classification also lets network administrators know the purpose of attackers. The experimental results demonstrate that the proposed system can classify malware efficiently with a small misclassification ratio, and that its performance is better than VirusTotal's.
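One plausible way to combine the two signals is sketched below, assuming nothing about the paper's actual formulas: content similarity is approximated with `difflib` sequence matching, directory similarity with Jaccard overlap of file paths, and the weighting and threshold are invented for illustration.

```python
from difflib import SequenceMatcher

def content_similarity(files_a, files_b):
    """Average best-match content similarity between two samples' files."""
    scores = [max(SequenceMatcher(None, a, b).ratio() for b in files_b.values())
              for a in files_a.values()]
    return sum(scores) / len(scores)

def directory_similarity(files_a, files_b):
    """Jaccard similarity of the two samples' file paths."""
    pa, pb = set(files_a), set(files_b)
    return len(pa & pb) / len(pa | pb)

def classify(sample, family, w=0.5, threshold=0.7):
    """Weighted combination of the two similarities against a known family."""
    score = (w * content_similarity(sample, family)
             + (1 - w) * directory_similarity(sample, family))
    return score >= threshold, score

# Hypothetical honeypot-collected samples: file path -> file contents.
known = {"src/scan.c": "connect(); brute_force();", "Makefile": "all: scan"}
new   = {"src/scan.c": "connect(); brute_force(); log();", "Makefile": "all: scan"}
same_family, score = classify(new, known)   # a lightly modified variant
```

A variant that only tweaks one source file keeps a near-identical directory layout and mostly identical content, so it lands in the known family without being re-analyzed from scratch.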
ISBN: (digital) 9783319302225; (print) 9783319302225; 9783319302218
Various commercial and open-source tools exist, developed by both industry and academic groups, that are able to detect various types of security bugs in applications' source code. However, most of these tools are prone to non-negligible rates of false positives and false negatives, since they are designed to detect a priori specified types of bugs. Their analysis scalability to large programs is also often an issue. To address these problems, we present a new source code analysis technique based on execution path classification. We developed a prototype tool to test our method's ability to detect different types of information-flow-dependent bugs. Our approach is based on classifying the Risk of likely exploits inside source code execution paths using two measuring functions: Severity and Vulnerability. For an Application Under Test (AUT), we analyze every single pair of input vector and program sink in an execution path, which we call an Information Block (IB). Severity quantifies the danger level of an IB using static analysis and a variation of the Information Gain algorithm. An IB's Vulnerability rank, on the other hand, quantifies how certain the tool is that an exploit exists on a given execution path; the Vulnerability function is based on tainted object propagation. The Risk of each IB is the combination of its computed Severity and Vulnerability measurements through an aggregation operation over two fuzzy sets in a Fuzzy Logic system. An IB is characterized as high risk when both its Severity and Vulnerability rankings are above the low zone. In that case, our prototype tool, called Entroine, reports a detected code exploit. The tool was tested on 45 vulnerable Java programs from NIST's Juliet Test Suite, which implement three different types of exploits. All existing code exploits were detected without any false positives.
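The fuzzy aggregation of Severity and Vulnerability into Risk can be sketched as follows. The triangular membership breakpoints and the min t-norm here are generic fuzzy-logic choices made up for illustration; Entroine's actual fuzzy sets and aggregation operator may differ.

```python
def triangular(x, a, b, c):
    """Triangular membership function on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def high(x):
    # Membership in a fuzzy set "above the low zone" on [0, 1];
    # the 0.3 breakpoint stands in for the low-zone boundary.
    return triangular(x, 0.3, 1.0, 1.7)

def assess_ib(severity, vulnerability):
    """Aggregate the two memberships with the min t-norm; an IB is
    reported only when BOTH rankings sit above the low zone."""
    risk = min(high(severity), high(vulnerability))
    return risk, risk > 0.0

risk_value, reported = assess_ib(0.8, 0.9)   # both above the low zone
```

Using min as the aggregator encodes the conjunctive rule from the text: a high Severity alone (a dangerous sink with no evidence of tainted input) or a high Vulnerability alone never produces a report, which is what keeps the false-positive rate down.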
ISBN: (print) 9789897581403
Recent advances in static and dynamic program analysis have resulted in tools capable of detecting various types of security bugs in Applications under Test (AUTs). However, any such analysis is designed for a priori specified types of bugs, is characterized by some rate of false positives or even false negatives, and has certain scalability limitations. We present a new analysis and source code classification technique, and a prototype tool, aiming to aid code reviews in the detection of general information-flow-dependent bugs. Our approach is based on classifying the criticality of likely exploits in the source code using two measuring functions, namely Severity and Vulnerability. For an AUT, we analyse every single pair of input vector and program sink in an execution path, which we call an Information Block (IB). A classification technique is introduced for quantifying the Severity (danger level) of an IB by static analysis and computation of its Entropy Loss. An IB's Vulnerability is quantified using a tainted object propagation analysis along with a Fuzzy Logic system. Possible exploits are then characterized with respect to their Risk by combining the computed Severity and Vulnerability measurements through an aggregation operation over two fuzzy sets. An IB is characterized as high risk when both its Severity and Vulnerability rankings are above the low zone. In that case, a detected code exploit is reported by our prototype tool, called Entroine. The effectiveness of the approach has been tested by analysing 45 Java programs from NIST's Juliet Test Suite, which implement three different common weakness exploits. All existing code exploits were detected without any false positives.
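The Entropy Loss (information gain) computation underlying Severity can be sketched generically. The labels and the partition below are invented toy data; how Entroine actually derives its partitions from static analysis is not shown here.

```python
import math

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in (labels.count(v) for v in set(labels)))

def entropy_loss(labels, split):
    """Information gain: entropy before a split minus the weighted
    entropy of the resulting partitions."""
    n = len(labels)
    after = sum(len(part) / n * entropy(part) for part in split)
    return entropy(labels) - after

# Hypothetical observations at one sink: 1 = exploitable input, 0 = benign.
labels = [1, 1, 0, 0]
split = [[1, 1], [0, 0]]   # e.g., partitioned by a predicate on the input vector
gain = entropy_loss(labels, split)
# A perfectly separating predicate removes all uncertainty (gain = 1 bit),
# marking the IB as highly informative about exploitability, hence more severe.
```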