Detecting design patterns in the source code of software systems helps in understanding the structure and behavior of those systems. A better understanding of software systems is, in turn, helpful in reengineering and refactoring. As software evolves, refactoring becomes more valuable, and one way to reduce refactoring costs is to detect design patterns. The key criterion for accurately detecting design patterns is their signatures, but obtaining precise signatures is not a straightforward task. Instead of improving signatures, more accurate detection can be achieved by taking a probabilistic viewpoint. Since each design pattern has variants or may be implemented differently, a probabilistic approach to detection can increase coverage as well as help in software refactoring. The main purpose of this study is to identify design patterns in source code with a non-crisp approach and to measure the possibility that a design pattern is present in the source code. Considering the main body of the design patterns and their corresponding signatures, design patterns are represented as appropriate features. We try to derive features from design pattern signatures that do not change in the face of the variations that occur during implementation. Then, using these features, the probability of the presence of the roles forming each design pattern is determined with a neural network and regression analysis. After this step, the probability that a design pattern is present in the source code is measured using probabilistic graphical models. The results of the proposed method show the similarity of each piece of code to the design patterns as a value in the range between 0 and 1. The results of other valid methods are a subset of the results of the proposed method. Results that are 50% to 100% similar to the design patterns are presented in the evaluation section.
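The last step of the pipeline described above, turning per-role probabilities into a single pattern score between 0 and 1, can be illustrated with a minimal sketch. The abstract does not specify the graphical model, so the code below assumes the simplest possible one: every role must be present and the roles are treated as independent, which makes the pattern probability the product of the role probabilities. The function name `pattern_probability`, the role names, and the numeric values are all hypothetical and are not taken from the paper.

```python
from math import prod


def pattern_probability(role_probs):
    """Combine per-role probabilities into one pattern score in [0, 1].

    ASSUMPTION: roles are independent and all of them are required, so the
    score is a plain product. The paper's actual graphical model is not
    described in the abstract and may differ.
    """
    if not role_probs:
        raise ValueError("at least one role probability is required")
    for role, p in role_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for role {role!r} is outside [0, 1]")
    return prod(role_probs.values())


# Hypothetical role probabilities, e.g. as produced by a per-role classifier.
adapter_roles = {"Target": 0.9, "Adapter": 0.8, "Adaptee": 0.5}
score = pattern_probability(adapter_roles)
print(f"Adapter similarity: {score:.2f}")
```

With these made-up inputs the score is roughly 0.36, so under the 50% threshold mentioned above this candidate would not be reported. A product is a deliberately strict choice: one weak role pulls the whole score down, which a richer model with dependencies between roles could soften.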
Software reverse engineering plays a crucial role in identifying design patterns and reconstructing software architectures by analyzing system implementations and producing abstract representations across multiple layers. This research introduces a novel feature engineering approach that integrates both behavioral and structural analysis of code, resulting in a feature-rich sequential representation. This transformation enables the effective use of transformers and attention mechanisms to detect design patterns in source code. Our results emphasize the importance of context in distinguishing between various design patterns, demonstrating that the proposed sequence format, with its sensitivity to token order, significantly improves the model's capacity to differentiate between similar patterns. By leveraging attention mechanisms, our approach efficiently discards irrelevant code elements and focuses on the features most critical for accurate pattern detection. Additionally, we show that this sequential code representation can be used to augment training data, leading to enhanced model accuracy. Trained on a diverse set of code samples representing all 23 GoF design patterns, sourced from repositories such as GitHub and Bitbucket, our methodology achieved an accuracy of 92%. Evaluation metrics further validate the robustness of the approach. This study underscores the potential of context-driven, feature-engineered representations in advancing design pattern detection and contributes a comprehensive new dataset that supports behavioral code analysis, setting the stage for future research in this area.
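To make the idea of a sequential representation that mixes structural and behavioral features more concrete, here is a minimal sketch. The abstract does not give the actual sequence format, so everything here is an assumption made for illustration: the input dictionary layout, the marker tokens such as `[EXTENDS]` and `[CALLS]`, and the function name `to_token_sequence` are hypothetical.

```python
def to_token_sequence(class_info):
    """Flatten one class description into an ordered list of tokens.

    ASSUMPTION: the dictionary layout and the bracketed marker tokens are
    invented for this sketch; the paper's real format is not given in the
    abstract. Structural facts (inheritance) come first, behavioral facts
    (which methods call what) follow in declaration order.
    """
    tokens = ["[CLASS]", class_info["name"]]
    for parent in class_info.get("extends", []):
        tokens += ["[EXTENDS]", parent]
    for method in class_info.get("methods", []):
        tokens += ["[METHOD]", method["name"]]
        for callee in method.get("calls", []):
            tokens += ["[CALLS]", callee]
    return tokens


# Hypothetical description of an adapter-like class.
adapter = {
    "name": "PrinterAdapter",
    "extends": ["Printer"],
    "methods": [{"name": "print", "calls": ["LegacyPrinter.printText"]}],
}
print(to_token_sequence(adapter))
```

Because the output is an ordered list, swapping two methods or moving a call from one method to another yields a different sequence, which is the kind of order sensitivity a transformer can exploit. A sequence like this would still need to be mapped to vocabulary indices before being fed to a model.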