Variational execution is a novel dynamic analysis technique for exploring highly configurable systems and accurately tracking information flow. It can efficiently analyze many configurations by aggressively sharing redundancies across program executions. Variational execution has been demonstrated to be effective at exploring variations in a program, especially when the configuration space grows out of control. Existing implementations of variational execution often require heavy modification of the runtime interpreter, which is painstaking and error-prone, and the performance of this approach is suboptimal. For example, VarexJ, the state-of-the-art variational execution interpreter for Java, slows down executions by 100 to 800 times over a single execution for small- to medium-sized Java programs. Instead of modifying existing JVMs, we propose to transform existing bytecode to make it variational, so it can be executed on an unmodified commodity JVM. Our evaluation shows a dramatic performance improvement over the state of the art, with speedups of 2 to 46 times, and high efficiency in sharing computations.
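A minimal, illustrative sketch of the sharing idea can make this concrete: a value that differs across configurations is stored once per distinct alternative, and each operation is applied to all alternatives in a single traversal. The class VInt below is our own illustration, not VarexJ's or the described bytecode transformation's actual representation.

```java
// Sketch of variational execution's core sharing idea: a value split over
// one configuration option holds both alternatives and is operated on once
// for all configurations. VInt is illustrative, not a real API.
class VInt {
    final int whenTrue;   // value in configurations where the option is on
    final int whenFalse;  // value in configurations where the option is off

    VInt(int whenTrue, int whenFalse) {
        this.whenTrue = whenTrue;
        this.whenFalse = whenFalse;
    }

    // Apply an operation to both alternatives in a single traversal,
    // instead of re-running the program once per configuration.
    VInt map(java.util.function.IntUnaryOperator op) {
        return new VInt(op.applyAsInt(whenTrue), op.applyAsInt(whenFalse));
    }

    // If both alternatives agree, the value is fully shared again.
    boolean shared() { return whenTrue == whenFalse; }
}

public class VariationalSketch {
    public static void main(String[] args) {
        VInt price = new VInt(100, 80);             // an option toggles a discount
        VInt taxed = price.map(v -> v * 110 / 100); // one execution covers both configurations
        System.out.println(taxed.whenTrue + " " + taxed.whenFalse); // 110 88
        System.out.println(price.map(v -> v % 2).shared());         // true: both alternatives agree
    }
}
```

When an operation yields the same result for every alternative, the split collapses back to a single shared value, which is where the efficiency of the approach comes from.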
Reflective supertype information (RSI) is useful for many instrumentation-based type-specific analyses on the Java virtual machine (JVM). On the one hand, while such information can be obtained when performing the instrumentation within the same JVM process executing the instrumented program, in-process instrumentation severely limits the bytecode coverage of the analysis. On the other hand, performing the instrumentation in a separate process can achieve full bytecode coverage, but complete RSI is generally not available, often requiring the insertion of expensive runtime type checks in the instrumented program. In this article, we present a novel technique to accurately reify complete RSI in a separate instrumentation process. This is challenging, because the observed application may make use of custom classloaders, and the classes loaded in one application execution are generally only known upon termination of the application. We implement our technique in an extension of the dynamic analysis framework DiSL. The resulting framework guarantees full bytecode coverage while providing RSI. Evaluation results on a task profiler demonstrate that our technique can achieve speedups up to a factor of 6.24x with respect to resorting to runtime type checks in the instrumentation code for an analysis with full bytecode coverage.
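The cost being avoided can be sketched in plain Java: without supertype information at instrumentation time, every inserted callback must guard itself with a runtime check; with complete RSI, the instrumenter knows statically which classes have the supertype of interest and emits an unguarded callback only into those. The names below are illustrative, not DiSL's actual API.

```java
import java.util.ArrayList;
import java.util.Collection;

// Sketch of the RSI trade-off: a guarded callback (the expensive fallback)
// versus an unguarded one that an RSI-aware instrumenter would emit only
// into classes statically known to implement Collection.
class RsiSketch {
    static int collectionEvents = 0;

    // Fallback without RSI: a runtime type check on every event.
    static void onEventChecked(Object o) {
        if (o instanceof Collection) collectionEvents++;
    }

    // With RSI: emitted only where the receiver is known to be a Collection,
    // so no per-event check is needed.
    static void onCollectionEvent(Collection<?> c) {
        collectionEvents++;
    }

    public static void main(String[] args) {
        onEventChecked("not a collection");   // filtered out by the check
        onEventChecked(new ArrayList<>());    // counted
        onCollectionEvent(new ArrayList<>()); // counted, no check executed
        System.out.println(collectionEvents); // 2
    }
}
```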
ISBN (print): 9781450356176
Task granularity, i.e., the amount of work performed by parallel tasks, is a key performance attribute of parallel applications. On the one hand, fine-grained tasks (i.e., small tasks carrying out few computations) may introduce considerable parallelization overheads. On the other hand, coarse-grained tasks (i.e., large tasks performing substantial computations) may not fully utilize the available CPU cores, resulting in missed parallelization opportunities. In this paper, we provide a better understanding of task granularity for applications running on a Java virtual machine. We present a novel profiler which measures the granularity of every executed task. Our profiler collects carefully selected metrics from the whole system stack with little overhead, and helps the developer locate performance problems. We analyze task granularity in the DaCapo and ScalaBench benchmark suites, revealing several inefficiencies related to fine-grained and coarse-grained tasks. We demonstrate that the collected task-granularity profiles are actionable by optimizing task granularity in two benchmarks, achieving speedups up to 1.53x.
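A toy version of per-task granularity measurement can be sketched as follows. The real profiler instruments the whole system stack; this sketch only times a task's run() method, and all names are our own, not the tool's actual API.

```java
import java.util.concurrent.atomic.LongAdder;

// Minimal sketch of per-task granularity profiling: wrap each task so its
// execution time (one proxy for granularity) is recorded with low overhead.
class GranularityProbe {
    static final LongAdder taskCount  = new LongAdder();
    static final LongAdder totalNanos = new LongAdder();

    // Wrap a task so its duration is recorded even if it throws.
    static Runnable profiled(Runnable task) {
        return () -> {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                taskCount.increment();
                totalNanos.add(System.nanoTime() - start);
            }
        };
    }

    // Mean work per task: very small values hint at fine-grained tasks
    // (parallelization overhead dominates), very large ones at coarse-grained
    // tasks (cores left idle).
    static long meanNanosPerTask() {
        long n = taskCount.sum();
        return n == 0 ? 0 : totalNanos.sum() / n;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            profiled(() -> { /* simulated work */ }).run();
        }
        System.out.println(taskCount.sum()); // 4
    }
}
```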
ISBN (print): 9781450349116
Fast, byte-addressable non-volatile memory (NVM) embraces both near-DRAM latency and disk-like persistence, which has generated considerable interest in revolutionizing the system software stack and programming models. However, it is less understood how NVM can be combined with a managed runtime like the Java virtual machine (JVM) to ease persistence management. This paper proposes Espresso, a holistic extension to Java and its runtime, to enable Java programmers to exploit NVM for persistence management with high performance. Espresso first provides a general persistent heap design called Persistent Java Heap (PJH) to manage persistent data as normal Java objects. The heap is then strengthened with a recoverable mechanism to provide crash consistency for heap metadata. Espresso further provides a new abstraction called Persistent Java Object (PJO) to offer an easy-to-use but safe persistence programming model for programmers to persist application data. Evaluation confirms that Espresso significantly outperforms state-of-the-art NVM support for Java (i.e., JPA and PCJ) while being compatible with data structures in existing Java programs.
ISBN (print): 9781728119700
Spark is increasingly becoming the platform of choice for several big-data analyses, mainly due to its fast, fault-tolerant, and in-memory processing model. Despite the popularity and maturity of the Spark framework, tuning Spark applications to achieve high performance remains challenging. In this paper, we present Ipt, a novel tool that assists users in improving the level of parallelism of applications running on top of Spark in the local mode. Ipt helps users tune the level of parallelism of Spark applications to spawn a number of tasks able to fully exploit the available computing resources. Our evaluation results show that optimizations guided by Ipt can achieve speedups up to 2.72x.
ISBN (print): 9781450360067
The Java runtime frees applications from manual memory management through automatic garbage collection (GC), at the cost of stop-the-world pauses. State-of-the-art collectors leverage multiple generations but inevitably suffer from a full GC phase that scans the whole heap and induces a pause tens of times longer than normal collections, which largely affects both the throughput and latency of the entire system. In this paper, we comprehensively analyze the full GC performance of the HotSpot Parallel Scavenge garbage collector and study its algorithm design in depth. We find that heavy dependencies among heap regions cause poor thread utilization. Furthermore, many heap regions contain mostly live objects (referred to as dense regions), which are unnecessary to collect. To solve these problems, we introduce two optimizations: dynamically allocating shadow regions as compaction destinations to eliminate region dependencies, and skipping dense regions to reduce GC workload. Evaluation results show that these optimizations lead to an average 2.6X (up to 4.5X) improvement in full GC throughput and thereby boost application performance by 18.2% on average (58.4% at best).
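The dense-region heuristic can be illustrated with a tiny sketch: compacting a region that is almost entirely live copies nearly all of its bytes for little space gain, so such regions are better left in place. The class shape and threshold below are our own illustration, not HotSpot's actual data structures.

```java
// Illustrative sketch of dense-region skipping during full-GC compaction:
// a region whose live ratio meets a threshold is not compacted.
class HeapRegion {
    final long capacityBytes;
    final long liveBytes;

    HeapRegion(long capacityBytes, long liveBytes) {
        this.capacityBytes = capacityBytes;
        this.liveBytes = liveBytes;
    }

    // Skip compaction when the live ratio is at or above the threshold:
    // evacuating a mostly-live region reclaims almost nothing.
    boolean isDense(double liveRatioThreshold) {
        return (double) liveBytes / capacityBytes >= liveRatioThreshold;
    }
}

public class DenseRegionSketch {
    public static void main(String[] args) {
        HeapRegion sparse = new HeapRegion(1 << 20, 1 << 16); // ~6% live
        HeapRegion dense  = new HeapRegion(1 << 20, 1 << 20); // 100% live
        System.out.println(sparse.isDense(0.9)); // false: worth compacting
        System.out.println(dense.isDense(0.9));  // true: skip it
    }
}
```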
ISBN (print): 9781538652060
Emerging cloud computing has aroused the need for large-scale data processing, which in turn drives vigorous development of big-data platforms running on the Java virtual machine (JVM), such as Hadoop, Spark, and Flink. Storing large amounts of data in memory allows those platforms to benefit from satisfying performance and from Java's powerful memory management and garbage collection services. Non-volatile memory (NVM) provides non-volatility, byte-addressability, and fast access speed, and thus becomes a superior alternative to volatile memory in future cloud systems and the Java world. This paper presents a recoverable garbage collector named DwarfGC to manage Java objects in NVM so as to ensure crash consistency and durability. DwarfGC persists heap-related metadata into NVM at the beginning of GC and relies on it for recovery. The metadata is stored in a space-efficient fashion while incurring little time overhead.
ISBN (print): 9781450355841
The proliferation of applications, frameworks, and services built on Java has led to an ecosystem critically dependent on the underlying runtime system, the Java virtual machine (JVM). However, many applications running on the JVM, e.g., big data analytics, suffer from long garbage collection (GC) time. The long pause time due to GC not only degrades application throughput and causes long latency, but also hurts overall system efficiency and scalability. In this paper, we present an in-depth performance analysis of GC in the widely adopted HotSpot JVM. Our analysis uncovers a previously unknown performance issue: the design of dynamic GC task assignment, the unfairness of mutex lock acquisition in HotSpot, and imperfect operating system (OS) load balancing together cause loss of concurrency in Parallel Scavenge, a state-of-the-art garbage collector and the default in HotSpot. To address these issues, we propose a number of solutions, including enforcing GC thread affinity to aid multicore load balancing and designing a more efficient work-stealing algorithm. Performance evaluation demonstrates that the proposed approaches improve overall completion time, GC time, and application tail latency by as much as 49.6%, 87.1%, and 43%, respectively.
A memory consistency model (or simply memory model) defines the possible values that a shared-memory read may return in a multithreaded programming language. Choosing a memory model involves an inherent performance-programmability tradeoff. The Java language has adopted a relaxed (or weak) memory model that is designed to admit most traditional compiler optimizations and obviate the need for hardware fences on most shared-memory accesses. The downside, however, is that programmers are exposed to a complex and unintuitive semantics and must carefully declare certain variables as volatile in order to enforce program orderings that are necessary for proper behavior. This paper proposes a simpler and stronger memory model for Java through a conceptually small change: every variable has volatile semantics by default, but the language allows a programmer to tag certain variables, methods, or classes as relaxed and provides the current Java semantics for these portions of code. This volatile-by-default semantics provides sequential consistency (SC) for all programs by default. At the same time, expert programmers retain the freedom to build performance-critical libraries that violate the SC semantics. At the outset, it is unclear if the volatile-by-default semantics is practical for Java, given the cost of memory fences on today's hardware platforms. The core contribution of this paper is to demonstrate, through comprehensive empirical evaluation, that the volatile-by-default semantics is arguably acceptable for a predominant use case for Java today - server-side applications running on Intel x86 architectures. We present VBD-HotSpot, a modification to Oracle's widely used HotSpot JVM that implements the volatile-by-default semantics for x86. To our knowledge VBD-HotSpot is the first implementation of SC for Java in the context of a modern JVM. VBD-HotSpot incurs an average overhead versus the baseline HotSpot JVM of 28% for the DaCapo benchmarks, which is significant tho
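The classic message-passing idiom shows what volatile-by-default buys: under the current Java memory model the programmer must remember to mark the flag volatile, whereas under the proposed semantics every field would behave this way unless explicitly tagged relaxed. The class below is our own illustration of that idiom.

```java
// Message passing under the Java memory model. Without `volatile` on
// `ready`, a reader thread may observe ready == true while still seeing the
// default value of `data`, because the JMM permits reordering. Marking
// `ready` volatile restores the ordering; volatile-by-default would make
// this the behavior of every field unless explicitly tagged relaxed.
class MessagePassing {
    int data = 0;
    volatile boolean ready = false;

    void writer() {
        data = 42;    // ordinary write...
        ready = true; // ...ordered before this volatile write
    }

    int reader() {
        // Once the volatile read sees ready == true, data == 42 is guaranteed
        // to be visible (happens-before via the volatile write/read pair).
        return ready ? data : -1;
    }
}
```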
Embedded systems provide limited storage capacity. This limitation conflicts with the demands of modern virtual machine platforms, which require large amounts of library code to be present on each client device. These conflicting requirements are often resolved by providing specialized embedded versions of the standard libraries, but even these stripped-down libraries consume significant resources. We present a solution for "always connected" mobile devices based on a zero-footprint client paradigm. In our approach, all code resides on a remote server. Only those parts of applications and libraries that are likely to be needed are transferred to the mobile client device. Since it is difficult to predict statically which library parts will be needed at run time, we combine static analysis, opportunistic off-target linking, and lazy code loading to transfer code with a high likelihood of execution ahead of time, while the other code, such as exception code, remains on the server and is transferred only on demand. This allows us to perform not only dead code elimination, but also aggressive elimination of unused code. The granularity of our approach is flexible, from class files all the way down to individual basic blocks. Our method achieves total code size reductions of up to 95%. (C) 2010 Elsevier B.V. All rights reserved.