ISBN: 9780769543475 (print)
Source code analysis and manipulation (SCAM) underpins virtually every operational software system. Despite the impact and ubiquity of SCAM principles and techniques in software engineering, there are still frontiers to be explored. Looking "inward" to existing techniques, one finds frontiers of performance, efficiency, accuracy, and usability; looking "outward," one finds new languages, new problems, and thus new approaches. This paper presents a reflective framework for characterizing source languages and domains. It draws on current research projects in music program analysis, musical score processing, and machine knitting to identify new frontiers for SCAM. The paper also identifies opportunities for SCAM to inspire, and be inspired by, problems and techniques in other domains.
ISBN: 9780769543475 (print)
We propose an approach that derives interactive-visualization scenarios from descriptions of code analysis tasks. The scenario derivation is treated as an optimization process: we evaluate the different ways a given visualization tool can be used to perform the analysis task, and select the scenario that requires the least effort from the analyst. Our approach was applied successfully to various analysis tasks, such as design defect detection and feature location.
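The selection step described above can be pictured as a minimum-effort search over candidate scenarios. The following is only a sketch under stated assumptions: the `Scenario` type, the step names, and the weights in `STEP_EFFORT` are hypothetical and are not taken from the paper, which does not specify its cost model in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A candidate way of using a visualization tool (hypothetical model)."""
    name: str
    steps: tuple  # ordered interaction steps the analyst would perform


# Assumed per-step effort weights, chosen purely for illustration.
STEP_EFFORT = {
    "filter": 1.0,
    "zoom": 0.5,
    "select": 0.5,
    "inspect": 2.0,
    "compare": 3.0,
}


def effort(scenario):
    """Total analyst effort: the sum of the weights of the scenario's steps."""
    return sum(STEP_EFFORT[step] for step in scenario.steps)


def least_effort_scenario(candidates):
    """Return the candidate scenario with the smallest total effort."""
    return min(candidates, key=effort)


candidates = [
    Scenario("inspect-every-class", ("inspect", "inspect", "inspect", "compare")),
    Scenario("filter-then-inspect", ("filter", "zoom", "select", "inspect")),
]
best = least_effort_scenario(candidates)
print(best.name, effort(best))
```

An exhaustive `min` is enough for a handful of candidates; a realistic scenario space would likely need a proper search or optimization strategy.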
ISBN: 9781538632383 (print)
The measurement of software quality, including the preparation and management of the necessary resources and libraries, is a major challenge in continuous software quality measurement and assessment. When code analysis tools are applied to a large number of projects, correct analysis results depend on preparing the source code and its dependencies so that both are complete. Making this preparation efficient and effective requires automating it. We therefore built a tool infrastructure that automates the preparation and analysis process. As part of the code preparation process, we developed the tool LibLoader, which automatically resolves missing dependencies in open source Java projects. This enables the analysis of complete projects in due time and with more accurate results from static code analysis tools.
ISBN: 9780769543475 (print)
Analysis of software is essential to addressing problems of correctness, efficiency, and security. Existing source code analysis tools are very useful for such purposes, but there are many instances where high-level source code is not available for software that needs to be analyzed. A need exists for tools that can analyze assembly code, whether from disassembled binaries or from handwritten sources. This paper describes an equational reasoning system for assembly code for the ubiquitous Intel x86 architecture, focusing on various problems that arise in low-level equational reasoning, such as register-name aliasing, memory indirection, and condition-code flags. Our system has successfully been applied to the problem of simplifying execution traces from obfuscated malware executables.
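To make the register-name aliasing problem concrete, here is a small sketch that propagates known constants through a short x86-style trace while keeping `eax`, `ax`, `al`, and `ah` consistent with one another. It is an illustration only, not the system described in the paper: the tuple-based trace format and the helper names are assumptions, only `mov`, `add`, and `xor` are handled, and memory indirection and condition-code flags are not modeled.

```python
# Each register name maps to (full 32-bit register, bit mask, bit shift).
ALIASES = {
    "eax": ("eax", 0xFFFFFFFF, 0), "ax": ("eax", 0xFFFF, 0),
    "al": ("eax", 0xFF, 0), "ah": ("eax", 0xFF, 8),
    "ebx": ("ebx", 0xFFFFFFFF, 0), "bx": ("ebx", 0xFFFF, 0),
    "bl": ("ebx", 0xFF, 0), "bh": ("ebx", 0xFF, 8),
}


def read(state, reg):
    """Return the known value of reg, or None if it is unknown."""
    full, mask, shift = ALIASES[reg]
    value = state.get(full)
    return None if value is None else (value >> shift) & mask


def write(state, reg, value):
    """Store value into reg, updating only the bits that reg aliases."""
    full, mask, shift = ALIASES[reg]
    old = 0 if mask == 0xFFFFFFFF else state.get(full)
    if value is None or old is None:
        state[full] = None  # conservative: the whole register becomes unknown
        return
    cleared = old & ~(mask << shift) & 0xFFFFFFFF
    state[full] = cleared | ((value & mask) << shift)


def evaluate(trace):
    """Propagate constants through (op, dst, src) tuples; return final state."""
    state = {}
    for op, dst, src in trace:
        if op not in ("mov", "add", "xor"):
            raise ValueError(f"unsupported instruction: {op}")
        mask = ALIASES[dst][1]
        rhs = src if isinstance(src, int) else read(state, src)
        lhs = read(state, dst)
        if op == "mov":
            result = rhs
        elif op == "xor" and dst == src:
            result = 0  # x xor x = 0 holds even when x is unknown
        elif lhs is None or rhs is None:
            result = None
        elif op == "add":
            result = (lhs + rhs) & mask
        else:  # xor with two known operands
            result = lhs ^ rhs
        write(state, dst, result)
    return state


trace = [
    ("xor", "eax", "eax"),  # eax = 0
    ("mov", "al", 0x34),    # low byte of eax
    ("mov", "ah", 0x12),    # second byte of eax
    ("add", "ax", 1),       # 16-bit add on the aliased half
    ("mov", "ebx", "eax"),
    ("xor", "bl", 0xFF),    # flips only the low byte of ebx
]
final_state = evaluate(trace)
print({reg: hex(val) for reg, val in final_state.items()})
```

The whole trace collapses to two constants, which hints at how equational rewriting can shrink an obfuscated trace; a real system would also have to reason symbolically about values that are not constants.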
ISBN: 9780769537931 (print)
Many automated software engineering tools require tight integration of techniques for source code analysis and manipulation. State-of-the-art tools exist for both, but the domains have remained notoriously separate because different computational paradigms fit each domain best. This impedance mismatch hampers the development of new solutions because the desired functionality and scalability can only be achieved by repeated and ad hoc integration of different techniques. RASCAL is a domain-specific language that takes away most of this boilerplate by integrating source code analysis and manipulation at the conceptual, syntactic, semantic and technical level. We give an overview of the language and assess its merits by implementing a complex refactoring.
ISBN: 9781728149370 (print)
When we designed the first version of Rascal in 2009, we jokingly promised ourselves to write only a single paper on the language itself, and to see it as a vehicle for research from then on. That one paper became the SCAM 2009 article [2], which has now received the SCAM most influential paper award. Since then, Rascal has evolved significantly, and has been successfully applied in research, education, and industry. This extended abstract gives an overview of the impact of Rascal over the last 10 years, and looks at current and future developments.
ISBN: 9780769541785 (print)
The optimal number of latent topics required to model the most accurate latent substructure for a source code corpus is an open question in source code analysis. Most estimates about the number of latent topics that exist in a software corpus are based on the assumption that the data is similar to natural language, but there is little empirical evidence to support this. In order to help determine the appropriate number of topics needed to accurately represent the source code, we generate a series of Latent Dirichlet Allocation models with varying topic counts. We use a heuristic to evaluate the ability of the model to identify related source code blocks, and demonstrate the consequences of choosing too few or too many latent topics.
ISBN: 9781665448970 (print)
Code comments play a vital role in source code comprehension and software maintainability. It is common for developers to write comments to explain a code snippet. However, low-quality comments can have a detrimental effect on software quality or be ineffective for code understanding. This study aims to create a taxonomy of inline code comment smells and determine how commonly each smell type occurs in software projects. We conducted a multivocal literature review to define the initial taxonomy of inline comment smells. Afterward, we manually labeled 899 inline comments from three open-source Java projects. We created a taxonomy of 11 inline code comment smell types and found that the smells occur in practice to varying degrees.
ISBN: 9780769537931 (print)
Source code metadata at file-level granularity is too coarse for certain applications. But fine-grained metadata (e.g. line-by-line authorship) easily gets lost due to changes like merging, moving or copying code. Enabling metadata to survive code evolution provides valuable insights into program source code. This helps developers to understand the sources and opens up opportunities for advanced tools. We present a concept that utilizes different search heuristics to identify probable ancestors of source documents, and pair this with clone detection to locate origins of inserted code. Arbitrary kinds of metadata can then be linked to code sections and be preserved automatically while code evolves. We evaluate our approach using code from the Hydra and FreeCol projects, and sketch prospective applications.
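As a rough illustration of the idea, the sketch below picks the most similar earlier document as the probable ancestor and then carries per-line metadata across to the lines that survive unchanged. It makes several assumptions: Python's standard `difflib.SequenceMatcher` stands in for the paper's search heuristics, which the abstract does not spell out; clone detection for inserted code is not implemented; and the file names and authors are invented.

```python
import difflib


def probable_ancestor(new_lines, candidates):
    """Return (name, similarity) of the candidate most similar to new_lines.

    candidates maps a document name to its list of lines.
    """
    best_name, best_ratio = None, 0.0
    for name, lines in candidates.items():
        matcher = difflib.SequenceMatcher(None, lines, new_lines, autojunk=False)
        ratio = matcher.ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name, best_ratio


def carry_metadata(old_lines, old_meta, new_lines, default="unknown"):
    """Copy per-line metadata from old_lines to the matching lines of new_lines."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    new_meta = [default] * len(new_lines)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            new_meta[block.b + offset] = old_meta[block.a + offset]
    return new_meta


# Invented example data: one earlier version with line-by-line authorship.
old_lines = ["int a = 1;", "int b = 2;", "return a + b;"]
old_authors = ["alice", "bob", "alice"]
new_lines = ["int a = 1;", "int c = 3;", "int b = 2;", "return a + b;"]

candidates = {"Util.java": old_lines, "Other.java": ["class Other {", "}"]}
ancestor, similarity = probable_ancestor(new_lines, candidates)
new_authors = carry_metadata(candidates[ancestor], old_authors, new_lines)
print(ancestor, new_authors)
```

The inserted line keeps the `"unknown"` default; this is where a clone detector would search other documents for the origin of the new code.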