This paper presents the JAFARDD (a Java Architecture based on a Folding Algorithm, with Reservation stations, Dynamic translation, and Dual processing) processor. JAFARDD dynamically translates Java stack-dependent bytecodes into RISC-style stack-independent instructions to facilitate the use of a general-purpose RISC core. JAFARDD exploits instruction-level parallelism among the translated instructions through bytecode folding coupled with Tomasulo's algorithm. We detail the JAFARDD architecture and the global design principles observed while designing each pipeline module. We also illustrate the flow of Java bytecodes through each of the processing phases. Benchmarking of JAFARDD using SPECjvm98 has shown a performance improvement factor between 1.10 and 2.25.
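To picture the folding idea at the heart of JAFARDD, the sketch below collapses the classic stack pattern iload, iload, iadd, istore into a single register-style three-operand instruction. The class and instruction names are illustrative assumptions, not the paper's actual pipeline structures.

```java
import java.util.List;

// Minimal sketch of bytecode folding: a load/load/arith/store window
// (e.g. iload_1, iload_2, iadd, istore_3) is collapsed into one RISC-style
// three-operand instruction. All names here are illustrative.
public class FoldingSketch {

    record Bytecode(String op, int operand) {}            // e.g. ("iload", 1)
    record RiscInstr(String op, int dst, int src1, int src2) {}

    // Returns a folded instruction if the 4-bytecode window matches the
    // classic pattern, otherwise null (no folding opportunity).
    static RiscInstr fold(List<Bytecode> window) {
        if (window.size() == 4
                && window.get(0).op().equals("iload")
                && window.get(1).op().equals("iload")
                && window.get(2).op().equals("iadd")
                && window.get(3).op().equals("istore")) {
            return new RiscInstr("add",
                    window.get(3).operand(),   // destination local variable
                    window.get(0).operand(),   // first source local variable
                    window.get(1).operand());  // second source local variable
        }
        return null;
    }

    public static void main(String[] args) {
        List<Bytecode> seq = List.of(
                new Bytecode("iload", 1), new Bytecode("iload", 2),
                new Bytecode("iadd", 0), new Bytecode("istore", 3));
        System.out.println(fold(seq));   // RiscInstr[op=add, dst=3, src1=1, src2=2]
    }
}
```

Once folded into register form, such instructions can be dispatched to reservation stations and scheduled by Tomasulo's algorithm without the artificial dependences that the shared operand stack would otherwise impose.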
Over the last 20 years, object-oriented programming languages and managed runtimes like Java have been very popular because of their software engineering benefits. Despite their popularity in many application areas, they have not been considered suitable for real-time programming. Among many other factors, one of the barriers preventing their acceptance in the development of real-time systems is the long pause times that may arise during large object allocation. This paper examines the different kinds of solutions that have been developed so far and introduces a switchable approach to large object allocation in real-time Java. A synthetic benchmark application developed to evaluate the effectiveness of the presented technique against other currently implemented techniques is also described.
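The abstract does not detail the switchable mechanism, but one hypothetical way to picture a switchable large object allocation policy is a per-request choice between a contiguous allocation and a chunked ("arraylet"-style) allocation that bounds the work done in any single step. The sketch below, with its threshold and chunk size, is purely an assumed illustration.

```java
// Hypothetical sketch of a switchable large-object allocation policy.
// Below a threshold the object is allocated contiguously; above it, the
// object is split into fixed-size chunks, bounding the pause contributed
// by any single allocation step. All names and sizes are illustrative.
public class SwitchableLargeObjectAllocator {

    private static final int CHUNK_BYTES = 4 * 1024;     // chunk ("arraylet") size
    private final int contiguousThreshold;               // switch point

    public SwitchableLargeObjectAllocator(int contiguousThreshold) {
        this.contiguousThreshold = contiguousThreshold;
    }

    // Stand-in for the heap: an allocation is simply a list of chunks here.
    public java.util.List<byte[]> allocate(int sizeBytes) {
        java.util.List<byte[]> chunks = new java.util.ArrayList<>();
        if (sizeBytes <= contiguousThreshold) {
            chunks.add(new byte[sizeBytes]);              // one contiguous block
        } else {
            for (int off = 0; off < sizeBytes; off += CHUNK_BYTES) {
                chunks.add(new byte[Math.min(CHUNK_BYTES, sizeBytes - off)]);
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        SwitchableLargeObjectAllocator alloc = new SwitchableLargeObjectAllocator(64 * 1024);
        System.out.println(alloc.allocate(16 * 1024).size());   // 1 contiguous chunk
        System.out.println(alloc.allocate(1 << 20).size());     // 256 chunked pieces
    }
}
```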
Java embedded systems often include Java middleware classes installed on the client device. For higher performance, they can be compiled into machine code before runtime using an ahead-of-time compiler (AOTC). Among the many approaches to AOTC, a bytecode-to-C (b-to-C) AOTC, which translates the bytecode into C code and then compiles it with an existing optimizing compiler such as gcc, is the most straightforward. This paper explores a few important design and optimization issues of a b-to-C AOTC, including the compilation form of the translated C code, the call interfaces between translated and interpreted Java methods, and Java-specific optimizations by the AOTC that can complement the gcc optimizations. We evaluate these issues with our b-to-C AOTC, implemented for Sun's CDC VM on the MIPS platform, to understand their performance impact.
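To make the bytecode-to-C idea concrete, the toy translator below maps a short bytecode sequence onto C statements that manipulate an explicit operand stack. The actual AOTC's compilation form and call interfaces are exactly the design issues the paper studies, so everything here is an assumed simplification; a real b-to-C AOTC would typically map stack slots to C locals so the C compiler can register-allocate them.

```java
import java.util.List;

// Toy sketch of bytecode-to-C translation: each stack bytecode becomes a C
// statement operating on an explicit operand stack (s) and stack pointer (sp).
// Illustrative only; not the CDC AOTC's actual naming or calling convention.
public class BytecodeToC {

    static String translate(List<String> bytecodes) {
        StringBuilder c = new StringBuilder("int s[8]; int sp = 0;\n");
        for (String bc : bytecodes) {
            switch (bc) {
                case "iload_1" -> c.append("s[sp++] = local1;\n");
                case "iload_2" -> c.append("s[sp++] = local2;\n");
                case "iadd"    -> c.append("sp--; s[sp-1] = s[sp-1] + s[sp];\n");
                case "ireturn" -> c.append("return s[--sp];\n");
                default        -> throw new IllegalArgumentException(bc);
            }
        }
        return c.toString();
    }

    public static void main(String[] args) {
        // Translates the body of "return local1 + local2;"
        System.out.print(translate(List.of("iload_1", "iload_2", "iadd", "ireturn")));
    }
}
```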
Method calls in object-oriented languages such as Java are bound at run time, making the method binding technique very important for the performance of the language. Efficient implementations can rely on having additional memory and/or processing power available, either to store lookup tables or to allow for the construction of caches or the rewriting of instructions at runtime. These are luxuries not always available on mobile devices such as phones and tablets. In this paper we describe a novel way of tokenising and compressing method dispatch tables to provide an efficient dispatch process that could be implemented in hardware in only a few operations. We demonstrate this in the context of Java, also showing a significant reduction in the size of the resulting class files. (C) 2011 Elsevier B.V. All rights reserved.
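As a rough illustration of the general idea (not the paper's specific encoding), method selectors can be tokenised to small integers and each class table can store only the entries it overrides, delegating the rest to its superclass table. The sketch below shows such a compressed lookup; a hardware dispatcher could perform an equivalent lookup over a tokenised table in a few operations.

```java
import java.util.HashMap;
import java.util.Map;

// Generic sketch of a tokenised, compressed dispatch table: selectors become
// small integer tokens, and each class stores only its overridden entries,
// falling back to the superclass table for everything else. Illustrative only.
public class CompressedDispatch {

    static final Map<String, Integer> SELECTOR_TOKENS = new HashMap<>();

    static int token(String selector) {          // tokenise a method name once
        return SELECTOR_TOKENS.computeIfAbsent(selector, s -> SELECTOR_TOKENS.size());
    }

    static class DispatchTable {
        final DispatchTable parent;                             // superclass table
        final Map<Integer, String> overrides = new HashMap<>(); // token -> impl

        DispatchTable(DispatchTable parent) { this.parent = parent; }

        void define(String selector, String impl) { overrides.put(token(selector), impl); }

        // Walk up until an entry is found for the selector token.
        String lookup(int selectorToken) {
            for (DispatchTable t = this; t != null; t = t.parent) {
                String impl = t.overrides.get(selectorToken);
                if (impl != null) return impl;
            }
            throw new IllegalStateException("no such method");
        }
    }

    public static void main(String[] args) {
        DispatchTable object = new DispatchTable(null);
        object.define("toString", "Object.toString");
        DispatchTable point = new DispatchTable(object);
        point.define("getX", "Point.getX");
        System.out.println(point.lookup(token("toString"))); // Object.toString
        System.out.println(point.lookup(token("getX")));     // Point.getX
    }
}
```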
Bytecode verification is a crucial security component for Java applets, on the Web and on embedded devices such as smart cards. This paper reviews the various bytecode verification algorithms that have been proposed, recasts them in a common framework of dataflow analysis, and surveys the use of proof assistants to specify bytecode verification and prove its correctness.
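The common dataflow framework can be pictured as an abstract interpreter that executes bytecodes over types rather than values. The straight-line checker below shows only the basic transfer function; real verifiers additionally merge abstract states at branch targets and iterate to a fixpoint, which is omitted here along with object types and subroutines.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative straight-line bytecode verifier: bytecodes are executed over an
// abstract stack of types instead of values. Branches, merges, and reference
// types are deliberately left out for brevity.
public class TinyVerifier {

    static void verify(List<String> bytecodes) {
        Deque<String> stack = new ArrayDeque<>();     // abstract operand stack
        for (String bc : bytecodes) {
            switch (bc) {
                case "iconst_0", "iload_1" -> stack.push("int");
                case "fconst_0"            -> stack.push("float");
                case "iadd" -> {                       // consumes two ints, produces int
                    expect(stack.pop(), "int");
                    expect(stack.pop(), "int");
                    stack.push("int");
                }
                case "ireturn" -> expect(stack.pop(), "int");
                default -> throw new IllegalArgumentException("unhandled " + bc);
            }
        }
    }

    static void expect(String actual, String wanted) {
        if (!actual.equals(wanted))
            throw new VerifyError("expected " + wanted + " but found " + actual);
    }

    public static void main(String[] args) {
        verify(List.of("iload_1", "iconst_0", "iadd", "ireturn"));      // passes
        try {
            verify(List.of("fconst_0", "iconst_0", "iadd", "ireturn")); // type error
        } catch (VerifyError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```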
The Java programming language is being increasingly used for application development on mobile and embedded devices. Limited energy and memory resources are important constraints for such systems. Compression is a useful and widely employed mechanism for reducing the memory requirements of a system. Because the leakage energy of a memory system increases with its size, and because leakage contributes a growing share of overall system energy, compression also has a significant effect on reducing energy consumption. However, storing compressed data/instructions has a performance and energy overhead associated with decompression at runtime. The underlying compression algorithm, the corresponding implementation of the decompression, and the ability to reuse decompressed information critically impact this overhead. In this paper, we explore the influence of compression on overall memory energy using a commercial embedded Java virtual machine (JVM) and a customized compression algorithm. Our results show that, for most applications, compression is effective in reducing energy even when the runtime decompression overheads are considered. Further, we show a mechanism that selectively compresses portions of the memory to enhance energy savings. Finally, a scheme for clustering the code and data to improve the reuse of the decompressed data is presented.
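A simple way to picture the trade-off is a store that keeps cold entries compressed and caches decompressed copies so that repeated use amortises the inflation cost. In the sketch below, java.util.zip merely stands in for the paper's customized compression algorithm, and the class and key names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of selective compression with reuse of decompressed data: entries are
// stored compressed, and a cache keeps recently inflated copies so repeated
// accesses avoid paying the decompression overhead again.
public class CompressedStore {

    private final Map<String, byte[]> compressed = new HashMap<>();
    private final Map<String, byte[]> decompressedCache = new HashMap<>();

    public void put(String key, byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[data.length + 64];        // head-room for small inputs
        int n = deflater.deflate(buf);
        deflater.end();
        compressed.put(key, java.util.Arrays.copyOf(buf, n));
    }

    public byte[] get(String key, int originalLength) throws DataFormatException {
        byte[] cached = decompressedCache.get(key);      // reuse if already inflated
        if (cached != null) return cached;
        Inflater inflater = new Inflater();
        inflater.setInput(compressed.get(key));
        byte[] out = new byte[originalLength];
        inflater.inflate(out);
        inflater.end();
        decompressedCache.put(key, out);
        return out;
    }

    public static void main(String[] args) throws DataFormatException {
        CompressedStore store = new CompressedStore();
        byte[] body = "some rarely executed method body".repeat(8).getBytes();
        store.put("Foo.bar", body);
        byte[] first = store.get("Foo.bar", body.length);   // pays decompression cost
        byte[] second = store.get("Foo.bar", body.length);  // served from the cache
        System.out.println(first.length + " bytes, cached copy reused: " + (first == second));
    }
}
```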
JESSICA stands for Java-enabled single-system-image computing architecture, a middleware that runs on top of the standard UNIX operating system to support parallel execution of multithreaded Java applications in a cluster of computers. JESSICA hides the physical boundaries between machines and makes the cluster appear to applications as a single computer, i.e., a single system image. JESSICA supports preemptive thread migration, which allows a thread to move freely between machines during its execution, and global object sharing with the help of a distributed shared-memory subsystem. JESSICA implements location transparency through a message-redirection mechanism. The result is a parallel execution environment where threads are automatically redistributed across the cluster to achieve the maximum possible parallelism. A JESSICA prototype that runs on a Linux cluster has been implemented, and considerable speedups have been obtained for all the experimental applications tested. (C) 2000 Academic Press.
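The message-redirection idea can be sketched, very loosely, as an indirection that tracks where a logical object (or the thread operating on it) currently lives and forwards every call there. The in-process redirector below is only an assumed illustration; the real middleware ships requests over the network and relies on its distributed shared-memory subsystem.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Highly simplified sketch of location-transparent message redirection: each
// logical object has a current home node, and callers always go through the
// redirector, so migration is invisible to them. Nodes are faked locally.
public class RedirectionSketch {

    interface Node { String invoke(long objectId, String method); }

    static class Redirector {
        private final Map<Long, Node> home = new ConcurrentHashMap<>();

        void migrate(long objectId, Node newHome) { home.put(objectId, newHome); }

        String invoke(long objectId, String method) {
            return home.get(objectId).invoke(objectId, method);  // redirect to current home
        }
    }

    public static void main(String[] args) {
        Node nodeA = (id, m) -> "nodeA handled " + m + " on #" + id;
        Node nodeB = (id, m) -> "nodeB handled " + m + " on #" + id;

        Redirector redirector = new Redirector();
        redirector.migrate(42L, nodeA);
        System.out.println(redirector.invoke(42L, "run"));   // served by nodeA
        redirector.migrate(42L, nodeB);                      // thread/object migrates
        System.out.println(redirector.invoke(42L, "run"));   // same call, now nodeB
    }
}
```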
The traditional Java code generation and instruction fetch path is not efficient: dynamically generated Java binary code is typically written into the data cache first and then loaded into the instruction cache through the shared L2 cache or memory, which takes both time and energy. In this paper, we study three hardware-based code caching strategies, which attempt to write and read the dynamically generated Java code faster and more energy-efficiently. Our experimental results indicate that, with proper architectural support, writing code directly into the instruction cache can improve performance for a variety of Java applications by 9.6% on average, and by up to 42.9%. Also, the average energy dissipation of these Java programs can be reduced by 6% with efficient code caching.
The fundamental challenge of garbage collector (GC) design is to maximize the recycled space with minimal time overhead. For efficient memory management, many GC designs divide the heap into a large object space (LOS) and a normal object space (non-LOS). When either space is full, garbage collection is triggered even though the other space may still have plenty of room, leading to inefficient space utilization. Moreover, space partitioning in existing GC designs implies different GC algorithms for different spaces, which not only prolongs the garbage collection pause time but also makes collection over multiple spaces inefficient. To address these problems, we propose Packer, a parallel garbage collection algorithm based on the novel concept of virtual spaces. Instead of physically dividing the heap into multiple spaces, Packer manages multiple virtual spaces in one physical space. With multiple virtual spaces, Packer offers efficient memory management; with one physical space, Packer avoids the problem of inefficient space utilization. To reduce the garbage collection pause time, we also propose a novel parallelization method that is applicable to multiple virtual spaces. Specifically, we reduce the compacting GC parallelization problem to a directed acyclic graph (DAG) traversal parallelization problem, and apply it to both normal and large object compaction.
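One simple way to visualise multiple virtual spaces sharing a single physical space is to let the normal-object space grow from one end of the region and the large-object space from the other, so that collection is triggered only when the whole region is exhausted rather than when one fixed partition fills. Packer's real design is block-based and parallel, so the sketch below is only an assumed simplification of the space-sharing idea.

```java
// Simplified sketch of two "virtual spaces" in one physical space: normal
// objects grow up from the bottom, large objects grow down from the top, and
// a GC is needed only when the shared region is exhausted. Illustrative only.
public class VirtualSpacesSketch {

    private int nonLosTop;      // next free offset for normal objects (grows up)
    private int losBottom;      // boundary of the large-object area (grows down)

    public VirtualSpacesSketch(int physicalSize) {
        this.nonLosTop = 0;
        this.losBottom = physicalSize;
    }

    // Returns the start offset of the allocation, or -1 when a GC is needed.
    public int allocateNormal(int size) {
        if (nonLosTop + size > losBottom) return -1;       // whole region exhausted
        int addr = nonLosTop;
        nonLosTop += size;
        return addr;
    }

    public int allocateLarge(int size) {
        if (losBottom - size < nonLosTop) return -1;        // whole region exhausted
        losBottom -= size;
        return losBottom;
    }

    public static void main(String[] args) {
        VirtualSpacesSketch heap = new VirtualSpacesSketch(1 << 20);
        System.out.println(heap.allocateNormal(4096));       // 0
        System.out.println(heap.allocateLarge(512 * 1024));  // 524288
        System.out.println(heap.allocateLarge(600 * 1024));  // -1: collect both spaces together
    }
}
```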
Dynamic flexibility, i.e., reacting to evolving context or application requirements, is a major challenge in modern system design. Adapting behaviors may require substantial code modification across the whole system, in the field, without service interruption and without state loss. This paper presents the JnJVM, a full Java virtual machine (JVM) that satisfies these needs by using dynamic aspect weaving techniques and a component architecture. It supports adding or replacing its own code, while it is running, with no overhead on unmodified code execution. Our measurements reveal similar performance when compared with the monolithic JVM Kaffe. Three illustrative examples show different extension scenarios: (i) modifying the JVM's behavior; (ii) adding capabilities to the JVM; and (iii) modifying applications' behavior. Copyright (C) 2008 John Wiley & Sons, Ltd.
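As a generic illustration of swapping a VM subsystem at run time (not JnJVM's actual aspect-weaving or component machinery, whose interfaces are not given in the abstract), the sketch below routes call sites through a single indirection so the implementation behind it can be replaced while the system keeps running.

```java
import java.util.concurrent.atomic.AtomicReference;

// Generic illustration of hot-swapping a VM component behind an indirection:
// unmodified call sites keep going through the same reference, so replacing
// the implementation needs no change on the fast path. Names are assumptions.
public class SwappableComponentSketch {

    interface Allocator { Object allocate(int size); }

    // The single indirection through which all call sites reach the component.
    static final AtomicReference<Allocator> ALLOCATOR =
            new AtomicReference<>(size -> new byte[size]);            // default behaviour

    static Object newObject(int size) {                                // unmodified call site
        return ALLOCATOR.get().allocate(size);
    }

    public static void main(String[] args) {
        System.out.println(newObject(16).getClass().getSimpleName());  // byte[]

        // Swap in an instrumented allocator while the "VM" keeps running.
        ALLOCATOR.set(size -> {
            System.out.println("allocating " + size + " bytes");
            return new byte[size];
        });
        newObject(32);                                                  // now instrumented
    }
}
```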