ISBN (print): 9781595936325
The performance of an embedded Java virtual machine can be improved by an ahead-of-time compiler (AOTC), which translates bytecode into machine code on a server and installs the machine code on the client device. Although AOTC has an advantage over a just-in-time compiler (JITC) in that it avoids translation overhead at runtime, it cannot be applied to classes downloaded dynamically at runtime. This paper proposes client-AOTC (c-AOTC), which performs AOTC on the client device using the JITC module already installed on the device, complementing server-AOTC. The machine code of a method translated by the JITC is cached in persistent memory on the device, and when the method is invoked again in a later run of the program, the machine code is loaded and executed directly, without the translation and interpretation overhead. One of the major issues in c-AOTC is relocation: some of the addresses used by the cached machine code are not correct when the code is loaded and used in a different run, so those addresses must be corrected before they are used. Constant pool resolution and inlining complicate the relocation problem, and we propose solutions for both. We developed c-AOTC on Sun's CDC VM reference implementation (RI), and our evaluation results indicate that c-AOTC can improve performance significantly, by as much as an average of 12%. We also experiment with reducing the number of c-AOTC methods to be cached when persistent space is tight, with a graceful degradation of performance.
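As a rough illustration of the caching-with-relocation scheme this abstract describes, the following Java sketch stores JIT-produced machine code together with relocation entries and patches those entries when the code is reloaded in a later run. All class and field names are hypothetical; the actual c-AOTC work is done natively inside the CDC RI.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Machine code emitted by the JITC plus the spots that need run-specific addresses. */
final class CachedMethod {
    final byte[] machineCode;
    final List<Relocation> relocations;

    CachedMethod(byte[] machineCode, List<Relocation> relocations) {
        this.machineCode = machineCode;
        this.relocations = relocations;
    }
}

/** One relocation entry: an offset in the code image and the symbol it refers to. */
final class Relocation {
    final int codeOffset;   // where in the code the address is embedded
    final String symbol;    // e.g. a constant-pool entry or VM helper name

    Relocation(int codeOffset, String symbol) {
        this.codeOffset = codeOffset;
        this.symbol = symbol;
    }
}

final class ClientAotCache {
    private final Map<String, CachedMethod> cache = new HashMap<>();

    void store(String methodSignature, CachedMethod m) {
        cache.put(methodSignature, m);   // in the paper this goes to persistent memory
    }

    /** Reload cached code and patch every relocation with this run's addresses. */
    byte[] load(String methodSignature, Map<String, Integer> currentAddresses) {
        CachedMethod m = cache.get(methodSignature);
        if (m == null) return null;      // fall back to interpretation or the JITC
        byte[] code = m.machineCode.clone();
        for (Relocation r : m.relocations) {
            int addr = currentAddresses.get(r.symbol);   // assumed to be resolvable this run
            // patch a 32-bit little-endian address into the code image
            for (int i = 0; i < 4; i++) {
                code[r.codeOffset + i] = (byte) (addr >>> (8 * i));
            }
        }
        return code;
    }
}
```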
In recent times, the subject of interoperability has become very popular. In large-scale software application development, it is common practice to combine multiple languages to solve particular problems and develop robust solutions. The ability to combine multiple languages allows an easy migration of an existing project from one language to another, or the use of existing libraries written in another language. This makes interoperability a force to be reckoned with when developing new programming languages. The Eolang programming language is a new research and development initiative aimed at achieving true Object-Oriented Programming by having all components of the program be objects. As such, the constructs and syntax of Eolang are vastly different from those of Java, which makes integration and interoperability between these two languages challenging with respect to method/object naming conventions, keywords and operators, and so on. In this paper we explore the potential of Eolang interoperability with Java by examining the interoperability mechanisms that some other languages provide for Java, describe ways to overcome these challenges in Eolang, and develop a solution. Specifically, we focus on the possibility of calling Java code from Eolang while the semantics of both languages remain preserved. Our solution allows Java code to be called from Eolang through wrappers that turn Java classes and methods into Eolang objects. (C) 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://***/licenses/by-nc-nd/4.0). Peer-review under responsibility of the scientific committee of KES International.
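To make the wrapper idea concrete, the sketch below uses Java reflection to expose a single Java method as a standalone object that can be applied with arguments, which is roughly the shape a Java-to-Eolang wrapper would take. This is not the actual Eolang runtime API; the class name and apply-style interface are illustrative assumptions.

```java
import java.lang.reflect.Method;

/** Illustrative wrapper: a Java method exposed as an applicable "object". */
final class JavaMethodObject {
    private final Object target;
    private final Method method;

    JavaMethodObject(Object target, String name, Class<?>... parameterTypes)
            throws NoSuchMethodException {
        this.target = target;
        this.method = target.getClass().getMethod(name, parameterTypes);
    }

    /** Applying the wrapper object invokes the underlying Java method. */
    Object apply(Object... args) throws Exception {
        return method.invoke(target, args);
    }

    public static void main(String[] args) throws Exception {
        // Wrap String.substring(int) and call it through the object interface.
        JavaMethodObject substring =
                new JavaMethodObject("hello-eolang", "substring", int.class);
        System.out.println(substring.apply(6));   // prints "eolang"
    }
}
```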
ISBN (print): 9781450356176
Task granularity, i.e., the amount of work performed by parallel tasks, is a key performance attribute of parallel applications. On the one hand, fine-grained tasks (i.e., small tasks carrying out few computations) may introduce considerable parallelization overheads. On the other hand, coarse-grained tasks (i.e., large tasks performing substantial computations) may not fully utilize the available CPU cores, resulting in missed parallelization opportunities. In this paper, we provide a better understanding of task granularity for applications running on a Java virtual machine. We present a novel profiler which measures the granularity of every executed task. Our profiler collects carefully selected metrics from the whole system stack with little overhead and helps the developer locate performance problems. We analyze task granularity in the DaCapo and ScalaBench benchmark suites, revealing several inefficiencies related to fine-grained and coarse-grained tasks. We demonstrate that the collected task-granularity profiles are actionable by optimizing task granularity in two benchmarks, achieving speedups of up to 1.53x.
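The per-task measurement at the heart of such a profiler can be sketched in plain Java as follows: each submitted task is wrapped so that its execution time (one simple notion of granularity) is recorded. The paper's profiler instruments the whole system stack; this sketch only illustrates the basic idea, and all names are made up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Toy task-granularity profiler: wraps tasks and records their running time. */
final class GranularityProfiler {
    private final LongAdder tasks = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    Runnable profile(Runnable task) {
        return () -> {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                tasks.increment();
                totalNanos.add(System.nanoTime() - start);
            }
        };
    }

    void report() {
        long n = tasks.sum();
        System.out.printf("tasks=%d, mean granularity=%.2f us%n",
                n, n == 0 ? 0.0 : totalNanos.sum() / (n * 1000.0));
    }

    public static void main(String[] args) throws InterruptedException {
        GranularityProfiler profiler = new GranularityProfiler();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 1000; i++) {
            pool.submit(profiler.profile(() -> Math.sqrt(42.0)));  // fine-grained tasks
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        profiler.report();   // a very small mean hints at parallelization overhead
    }
}
```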
ISBN (print): 9781605581040
Most Java just-in-time compilers (JITCs) try to compile only hot methods, since the compilation overhead is part of the running time. This requires precise and efficient hot spot detection, which includes distinguishing hot methods from cold methods, detecting them as early as possible, and paying only a small runtime overhead for detection. A hot method could be identified by measuring its running time during interpretation, since a long-running method is likely to be a hot method. However, precise measurement of running time during execution is too expensive, especially in embedded systems, so many counter-based heuristics have been proposed to estimate it. The Simple heuristic counts only method invocations, without any consideration of loops [1], while Sun's HotSpot heuristic counts loop iterations as well but does not consider loop sizes or method sizes [2,14]. The static analysis heuristic estimates the running time of a method by statically analyzing loops or heavy-cost bytecodes, but does not measure their dynamic counts [3]. Although the overhead of these heuristics is low, they do not estimate the running time precisely, which may lead to imprecise hot spot detection. This paper proposes a new hot spot detection heuristic which can estimate the running time more precisely than the others with relatively low overhead. It dynamically counts only the important bytecodes interpreted, yet with a simple arithmetic calculation it can obtain a precise count of all interpreted bytecodes. We also propose employing a static analysis technique to predict those hot methods which spend a huge execution time once invoked. This static prediction allows such methods to be compiled at their first invocation, complementing the proposed dynamic estimation technique. We implemented both techniques, which led to a performance benefit of 10% compared to the HotSpot heuristic.
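A counter-based heuristic in this spirit can be sketched as follows: instead of counting every interpreted bytecode, the interpreter counts only method entries and loop back-edges, weighting each event by the statically known size of the corresponding code region so the total approximates the number of bytecodes actually interpreted. The threshold and names below are illustrative, not those of the paper's implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a weighted counter heuristic for hot spot detection. */
final class HotSpotDetector {
    private static final long COMPILE_THRESHOLD = 100_000;   // estimated bytecodes
    private final Map<String, Long> estimatedWork = new HashMap<>();

    /** Called by the interpreter on method entry; bodySize is known statically. */
    boolean onMethodEntry(String method, int bodySize) {
        return bump(method, bodySize);
    }

    /** Called on each loop back-edge; loopBodySize is known statically. */
    boolean onBackEdge(String method, int loopBodySize) {
        return bump(method, loopBodySize);
    }

    private boolean bump(String method, int weight) {
        long total = estimatedWork.merge(method, (long) weight, Long::sum);
        return total >= COMPILE_THRESHOLD;   // true => hand the method to the JITC
    }
}
```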
ISBN (print): 9781450349116
Fast, byte-addressable non-volatile memory (NVM) offers both near-DRAM latency and disk-like persistence, which has generated considerable interest in revolutionizing the system software stack and programming models. However, it is less well understood how NVM can be combined with a managed runtime such as the Java virtual machine (JVM) to ease persistence management. This paper proposes Espresso, a holistic extension to Java and its runtime, to enable Java programmers to exploit NVM for persistence management with high performance. Espresso first provides a general persistent heap design called Persistent Java Heap (PJH) to manage persistent data as normal Java objects. The heap is then strengthened with a recovery mechanism to provide crash consistency for heap metadata. Espresso further provides a new abstraction called Persistent Java Object (PJO) to offer an easy-to-use but safe persistence programming model for programmers to persist application data. Evaluation confirms that Espresso significantly outperforms state-of-the-art NVM support for Java (i.e., JPA and PCJ) while remaining compatible with data structures in existing Java programs.
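The kind of programming model PJO aims for can be approximated, purely for illustration, by an interface in which application objects are persisted and retrieved by name. In the sketch below plain serialization to a file stands in for the persistent heap, so none of Espresso's NVM performance or crash-consistency properties are reproduced; the class and method names are invented.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy stand-in for a persistent heap: objects persisted and looked up by name. */
final class ToyPersistentHeap {
    private final Path root;

    ToyPersistentHeap(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    <T extends Serializable> void persist(String name, T object) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                Files.newOutputStream(root.resolve(name)))) {
            out.writeObject(object);   // the real PJH keeps the object graph on NVM
        }
    }

    @SuppressWarnings("unchecked")
    <T> T load(String name) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                Files.newInputStream(root.resolve(name)))) {
            return (T) in.readObject();
        }
    }
}
```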
ISBN (print): 9781595937865
The stream programming paradigm aims to expose coarse-grained parallelism in applications that must process continuous sequences of events. The appeal of stream programming comes from its conceptual simplicity: a program is a collection of independent filters which communicate by means of uni-directional data channels. This model lends itself naturally to concurrent and efficient implementations on modern multiprocessors. As the output behavior of filters is determined by the state of their input channels, stream programs have fewer opportunities for the errors (such as data races and deadlocks) that plague shared-memory concurrent programming. This paper introduces STREAMFLEX, an extension to Java which marries streams with objects and thus makes it possible to combine, in the same Java virtual machine, stream processing code with traditional object-oriented components. STREAMFLEX targets high-throughput, low-latency applications with stringent quality-of-service requirements. To achieve these goals, it must at the same time both extend and restrict Java. To allow for program optimization and provide latency guarantees, the STREAMFLEX compiler restricts Java by imposing a stricter typing discipline on filters. On the other hand, STREAMFLEX extends the Java virtual machine with real-time capabilities, transactional memory, and type-safe region-based allocation. The result is a rich and expressive language that can be implemented efficiently.
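The filters-and-channels model described here can be approximated in plain Java with bounded queues and threads, as in the minimal sketch below. It illustrates only the communication pattern; STREAMFLEX's typing restrictions, real-time scheduling, transactional memory, and region-based allocation are not modeled.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Two filters connected by one uni-directional channel. */
public final class TinyPipeline {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(64);

        // Producer filter: pushes a finite sequence of events downstream.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 10; i++) channel.put(i);
                channel.put(-1);                      // end-of-stream marker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer filter: its output depends only on its input channel.
        Thread consumer = new Thread(() -> {
            try {
                for (int v = channel.take(); v != -1; v = channel.take()) {
                    System.out.println("squared: " + v * v);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```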
Java is becoming the main software platform for consumer and embedded devices such as mobile phones, PDAs, TV set-top boxes, and in-vehicle systems. Since many of these systems are memory constrained, it is extremely important to keep the memory footprint of Java applications under control. The goal of this work is to enable the execution of Java applications using a smaller heap footprint than is possible with current embedded JVMs. We propose a set of memory management strategies to reduce the heap footprint of embedded Java applications that execute under severe memory constraints. Our first contribution is a new garbage collector, referred to as the Mark-Compact-Compress (MCC) collector, that allows an application to run with a heap smaller than its footprint. An important characteristic of this collector is that it compresses objects when heap compaction is not sufficient to create space for the current allocation request. In addition to employing compression, we also consider a heap management strategy and associated garbage collector, called MCL (Mark-Compact-Lazy Allocate), based on lazy allocation of object portions. This new collector operates like the conventional Mark-Compact (MC) collector, but takes advantage of the observation that many Java applications create large objects of which only a small portion is actually used. We also combine MCC and MCL into MCCL (Mark-Compact-Compress-Lazy Allocate), which outperforms both MCC and MCL. We have implemented these collectors using KVM and performed extensive experiments with a set of ten embedded Java applications. We have found our new garbage collection strategies to be useful in two main respects. First, they reduce the minimum heap size necessary to execute an application without an out-of-memory exception. Second, our strategies reduce heap occupancy; that is, at a given time, they reduce the heap memory requirement of the application being executed. We have also conducted exp
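The observation behind MCL, that large objects are often only partially used, can be illustrated at the application level with a toy lazily allocated array whose backing chunks are materialized only on first write. The real collectors work inside the VM on raw heap memory; the class name and chunk size below are arbitrary.

```java
/** Toy illustration of lazy allocation of object portions. */
final class LazyIntArray {
    private static final int CHUNK = 256;
    private final int[][] chunks;
    private final int length;

    LazyIntArray(int length) {
        this.length = length;
        this.chunks = new int[(length + CHUNK - 1) / CHUNK][];
    }

    void set(int index, int value) {
        checkBounds(index);
        int c = index / CHUNK;
        if (chunks[c] == null) chunks[c] = new int[CHUNK];  // allocate on first write
        chunks[c][index % CHUNK] = value;
    }

    int get(int index) {
        checkBounds(index);
        int[] chunk = chunks[index / CHUNK];
        return chunk == null ? 0 : chunk[index % CHUNK];    // untouched portions read as 0
    }

    private void checkBounds(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException(String.valueOf(index));
        }
    }
}
```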
The ability to observe the internal operation of the J9 virtual machine is essential for effective performance tuning. To this end, tracing, the recording of events from a running system with minimal performance overhead for online or offline analysis, is an important technique. In this paper, we propose the integration of LTTng, an effective open-source tracing toolset, with J9 to improve its tracing functions. With this integration, the tracing component is not only decoupled from the virtual machine but also performed efficiently at both the user and kernel levels to achieve high throughput. To validate the integration and its performance impact, some empirical study results based on SPECjbb2005 and SQLBenchmark (supported by an instrumented MariaDB) are also presented. Copyright (c) 2014 John Wiley & Sons, Ltd.
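The low-overhead event recording that motivates this work can be sketched as a preallocated ring buffer written with a single atomic increment on the fast path and drained later for offline analysis. This does not model LTTng's user/kernel integration with J9; all names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy ring-buffer tracer: cheap recording on the fast path, offline dump later. */
final class RingTracer {
    private final long[] timestamps;
    private final int[] eventIds;
    private final AtomicLong cursor = new AtomicLong();

    RingTracer(int capacity) {
        this.timestamps = new long[capacity];
        this.eventIds = new int[capacity];
    }

    /** Fast path: record an event id with a timestamp, overwriting old entries. */
    void trace(int eventId) {
        int slot = (int) (cursor.getAndIncrement() % timestamps.length);
        timestamps[slot] = System.nanoTime();
        eventIds[slot] = eventId;
    }

    /** Slow path, run offline: dump whatever the buffer currently holds. */
    void dump() {
        long recorded = Math.min(cursor.get(), timestamps.length);
        for (int i = 0; i < recorded; i++) {
            System.out.printf("event=%d t=%d%n", eventIds[i], timestamps[i]);
        }
    }
}
```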
ISBN (print): 9781450302036
Accurately predicting program behaviors (e.g., locality, dependency, method calling frequency) is fundamental for program optimizations and runtime adaptation. Despite decades of remarkable progress, prior studies have not systematically exploited program inputs, a deciding factor for program behaviors. Triggered by the strong and predictive correlations between program inputs and behaviors that recent studies have uncovered, this work proposes to bring program inputs into the focus of program behavior analysis, cultivating a new paradigm named input-centric program behavior analysis. The new approach consists of three components, forming a three-layer pyramid. At the base is program input characterization, a component for resolving the complexity of raw program inputs and extracting important features. In the middle is input-behavior modeling, a component for recognizing and modeling the correlations between characterized input features and program behaviors. These two components constitute input-centric program behavior analysis, which (ideally) is able to predict the large-scope behaviors of a program's execution as soon as the execution starts. The top layer of the pyramid is input-centric adaptation, which capitalizes on the novel opportunities that the first two components create to facilitate proactive adaptation for program optimization. By centering on program inputs, the new approach resolves the proactivity-adaptivity dilemma inherent in previous techniques. Its benefits are demonstrated through proactive dynamic optimizations and version selection, yielding significant performance improvements on a set of Java and C programs.
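A drastically simplified sketch of the three layers: the input is characterized by a single feature (its size), a behavior (here, iterations of a hot loop) is modeled as a linear function of that feature from past runs, and at the start of a new run the prediction drives proactive version selection. The feature, model, and threshold are all illustrative assumptions, not those of the paper.

```java
/** Toy input-centric predictor: one input feature, one least-squares model. */
final class InputCentricPredictor {
    private double slope, intercept;

    /** Layer 2: fit behavior = slope * feature + intercept from profiled runs. */
    void train(double[] features, double[] behaviors) {
        int n = features.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += features[i]; sy += behaviors[i];
            sxx += features[i] * features[i]; sxy += features[i] * behaviors[i];
        }
        slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        intercept = (sy - slope * sx) / n;
    }

    /** Layer 3: proactive version selection as soon as the input is known. */
    String chooseVersion(double inputFeature) {
        double predictedIterations = slope * inputFeature + intercept;
        return predictedIterations > 1_000_000 ? "aggressively-optimized" : "baseline";
    }

    public static void main(String[] args) {
        InputCentricPredictor p = new InputCentricPredictor();
        // Layer 1 (input characterization) reduced here to "input size in elements".
        p.train(new double[]{1e3, 1e4, 1e5}, new double[]{5e4, 5e5, 5e6});
        System.out.println(p.chooseVersion(2e5));   // prints "aggressively-optimized"
    }
}
```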
ISBN (print): 9781424449965
Sensor networks have been referred to as part of the background infrastructure required to achieve ubiquitous computing. This has recently attracted a considerable amount of attention from the research community, which has concluded that existing protocols and techniques for service discovery, such as Jini or UPnP, are not suitable for resource-poor, battery-powered sensor nodes. We do not entirely agree with this conclusion. We think that those protocols could be a good starting point to "power up" resource-poor sensor nodes for ubiquitous computing support. Starting from this principle, and recognizing that existing sensor node system software is not suitable for our purpose, we decided to build a new sensor node software stack. The result is a stand-alone Java virtual machine suitable for resource-poor sensor nodes, an implementation of the ubiquitous TCP/IP communication stack, and Jini-based middleware that achieves automatic service discovery and usage. This software stack was tailored to fit the state-of-the-art Mica2 class of sensor nodes.