ISBN: 9798350329964 (print)
In the face of increased threats within software registries and management systems, we address the critical need for effective malicious code detection. In this paper, we propose an innovative approach that integrates source code slicing, inter-procedural analysis, and cross-file inter-procedural analysis, thereby enhancing detection precision and reducing false positives. This approach has been encapsulated within a multi-analysis-based framework for automatic detection of malicious code in real-world software packages. Applied to major third-party software registries such as PyPI and NPM, our framework has proven effective, identifying 130 malicious packages out of a total of 169,640 monitored over a continuous period of five weeks. This work advances the state of the art in malicious code detection, demonstrating significant practical impact in strengthening software supply chain defense.
A noticeable gap exists in the current acquisition and engineering workforce’s knowledge, skills, and support resources needed to address software and supply chain risk. The growing reliance on software to handle sys...
ISBN: 9798400712487 (digital)
ISBN: 9798400712487 (print)
Detecting and refactoring code smells is a challenging, laborious, and sustained effort. Although large language models have demonstrated potential in identifying various types of code smells, they also have limitations such as input-output token restrictions and difficulty in accessing repository-level knowledge and performing dynamic source code analysis. Existing learning-based methods and commercial expert toolsets have advantages in handling complex smells. They can analyze project structures and contextual information in depth, access global code repositories, and utilize advanced code analysis techniques. However, these toolsets are often designed for specific types and patterns of code smells and can only address fixed smells, lacking flexibility and scalability. To resolve that problem, we propose iSMELL, an ensemble approach that employs various code smell detection toolsets via a Mixture of Experts (MoE) architecture for comprehensive code smell detection, and enhances the LLMs with the detection results from expert toolsets for refactoring the identified code smells. First, we train a MoE model that, based on input code vectors, outputs the most suitable expert tool for identifying each type of smell. Then, we select the recommended toolsets for code smell detection and obtain their results. Finally, we equip the prompts with the detection results from the expert toolsets, thereby enhancing the refactoring capability of LLMs for code with existing smells and enabling them to provide different solutions based on the type of smell. We evaluate our approach on detecting and refactoring three classical and complex code smells, i.e., Refused Bequest, God Class, and Feature Envy. The results show that, by adopting seven expert code smell toolsets, iSMELL achieved an average F1 score of 75.17% on code smell detection, outperforming LLM baselines by 35.05% in F1 score. We further evaluate the code refactored by the enhanced LLM. The quantitative and human eval
Requirements analysis is crucial in software system development. With the growth of Artificial Intelligence (AI)-based solutions, this analysis has gained greater importance to create more robust and accessibility-foc...
ISBN: 9781665490429 (digital)
ISBN: 9781665490429 (print)
An important problem in microrobotics is how to control a large group of microrobots with a global control signal. This paper focuses on controlling a large-scale swarm of MicroStressBots with on-board physical finite-state machines. We introduce the concept of group-based control, which makes it possible to scale up the swarm size while reducing the complexity of both robot fabrication and swarm control. We prove that the group-based control system is locally accessible in terms of the robot positions. We further hypothesize, based on extensive simulations, that the system is globally controllable. A nonlinear optimization strategy is proposed to control the swarm by minimizing control effort. We also propose a probabilistically complete collision avoidance method that is suitable for online use. The paper concludes with an evaluation of the proposed methods in simulations.
Background: The adoption of chatbots into software development tasks has become increasingly popular among practitioners, driven by the advantages of cost reduction and acceleration of the software development process...
ISBN: 9781665490429 (digital)
ISBN: 9781665490429 (print)
Stowage planning is one of the most important stages in the management of container terminals and depends on the sequence of containers to be loaded on the ship. For non-clear containers, which are absent from the terminal, slots are selected manually by stowage planners, making it a time-consuming job. To optimize the slot reservation problem for non-clear containers, a mathematical model based on the knapsack problem is constructed. A Stack Selection Algorithm based on dynamic programming is proposed to solve the model. A case study of the Yangshan automatic container terminal demonstrates that the method can solve the non-clear container reservation problem in a very short time, and that the results are better than those of traditional heuristic approaches.
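As a rough illustration of the knapsack-style dynamic programming mentioned in this abstract, the sketch below solves a generic 0/1 knapsack. It is not the paper's Stack Selection Algorithm. The names `knapsack`, `weights`, `values`, and `capacity` are hypothetical, as is the reading of items as containers and capacity as a limit on one stack.

```python
from typing import List, Tuple


def knapsack(weights: List[int], values: List[int], capacity: int) -> Tuple[int, List[int]]:
    """Generic 0/1 knapsack solved by dynamic programming (illustrative only).

    Returns the best total value and the indices of the chosen items.
    """
    n = len(weights)
    # best[i][c] = best value achievable with the first i items within capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip item i-1
            if w <= c:  # option 2: take item i-1 if it fits
                best[i][c] = max(best[i][c], best[i - 1][c - w] + v)

    # Walk back through the table to recover which items were taken.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen
```

The table has `(n + 1) * (capacity + 1)` entries, so the running time grows with the product of the item count and the capacity, which is why this style of algorithm is fast when capacities are small integers.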
ISBN: 9781665490429 (digital)
ISBN: 9781665490429 (print)
High-end equipment is manufactured using a "main manufacturer-supplier" mode. The supply-demand relationship of high-end equipment involves benefit conflicts among the customer, the main manufacturer, and multiple suppliers, so transaction decisions become very complicated. Traditional optimisation methods are inadequate for revealing the interactions among multiple stakeholders. In this study, based on the transaction process of high-end equipment manufacturing, two correlated Stackelberg game models, namely "main manufacturer-customer" and "main manufacturer-supplier," are constructed, and their Nash equilibria are solved to maximize the profit of each stakeholder. The effects of various parameters on the decision variables of each stakeholder are analyzed through numerical simulation.
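The abstract does not give the paper's game models, so the following is only a textbook illustration of how a Stackelberg (leader-follower) game is solved by backward induction. It assumes a two-firm quantity game with linear inverse demand `p = a - b * (q_leader + q_follower)` and a shared unit cost `c`, with `a > c`. All function names and parameters are hypothetical and are not taken from the paper.

```python
from typing import Tuple


def follower_best_response(q_leader: float, a: float, b: float, c: float) -> float:
    """Follower's profit-maximizing quantity given the leader's quantity.

    Maximizes (a - b * (q_leader + q) - c) * q over q >= 0.
    """
    return max(0.0, (a - c - b * q_leader) / (2 * b))


def stackelberg_quantities(a: float, b: float, c: float) -> Tuple[float, float]:
    """Backward induction: the leader anticipates the follower's best response.

    Substituting the best response into the leader's profit and maximizing
    gives the closed form q_leader = (a - c) / (2 * b), assuming a > c.
    """
    q_leader = (a - c) / (2 * b)
    q_follower = follower_best_response(q_leader, a, b, c)
    return q_leader, q_follower


def profit(q_own: float, q_other: float, a: float, b: float, c: float) -> float:
    """Profit of a firm producing q_own when the rival produces q_other."""
    return (a - b * (q_own + q_other) - c) * q_own
```

With `a = 10`, `b = 1`, and `c = 2`, the leader produces 4 and the follower 2. The leader earns more than the follower, which is the usual first-mover advantage in this textbook setting.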
Several of the current procedures for detecting cancer, such as mammography, ultrasound, MRI, and biopsy, are either expensive, painful, intrusive, or have limitations in accuracy and sensitivity. As a result, there i...
ISBN: 9798400712487 (digital)
ISBN: 9798400712487 (print)
Within the realms of scientific computing, large-scale data processing, and artificial intelligence-powered computation, disparities in performance that originate from differing code implementations directly influence the practicality of the code. Although existing work has tried to use code knowledge to enhance the execution performance of code generated by large language models, it neglects code evaluation outcomes, which directly reflect code execution details, resulting in inefficient computation. To address this issue, we propose DSCT-Decode, an innovative adaptive decoding strategy for large language models that employs a data structure named 'Code Token Tree' (CTT), which guides token selection based on code evaluation outcomes. DSCT-Decode assesses generated code across three dimensions (correctness, performance, and similarity) and utilizes a dynamic penalty-based boundary intersection method to compute multi-objective scores, which are then used to adjust the scores of nodes in the CTT during backpropagation. By maintaining a balance between exploration, through token selection probabilities, and exploitation, through multi-objective scoring, DSCT-Decode effectively navigates the code space to swiftly identify high-performance code solutions. To substantiate our framework, we developed a new benchmark, big-DS-1000, which is an extension of DS-1000. This benchmark is the first of its kind to specifically evaluate code generation methods based on execution performance. Comparative evaluations with leading large language models, such as CodeLlama and GPT-4, show that our framework achieves an average performance enhancement of nearly 30%. Furthermore, 30% of the generated code exhibited a performance improvement of more than 20%, underscoring the effectiveness and potential of our framework for practical applications.
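The abstract does not spell out its scoring formula. For reference, the sketch below implements the standard penalty-based boundary intersection (PBI) scalarization used in decomposition-based multi-objective optimization, with a fixed penalty `theta`. The paper's dynamic penalty schedule and its exact objective definitions are not reproduced. The function name `pbi_score`, the default `theta`, and the treatment of every objective as a cost to minimize (lower score is better) are assumptions.

```python
import math
from typing import Sequence


def pbi_score(
    objectives: Sequence[float],
    weights: Sequence[float],
    ideal: Sequence[float],
    theta: float = 5.0,
) -> float:
    """Standard PBI scalarization with a fixed penalty (illustrative only).

    d1 is the distance from the ideal point along the weight direction.
    d2 is the perpendicular distance from that direction.
    The score is d1 + theta * d2; lower is better.
    """
    diff = [f - z for f, z in zip(objectives, ideal)]
    norm_w = math.sqrt(sum(w * w for w in weights))
    # Length of the projection of diff onto the weight vector.
    d1 = abs(sum(d * w for d, w in zip(diff, weights))) / norm_w
    # Distance between diff and its projection onto the weight direction.
    d2 = math.sqrt(sum((d - d1 * w / norm_w) ** 2 for d, w in zip(diff, weights)))
    return d1 + theta * d2
```

A larger `theta` penalizes candidates that stray from the chosen trade-off direction more heavily, while a smaller one favors closeness to the ideal point regardless of balance among the objectives.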