Babel is a high-performance, n-way language interoperability tool for the HPC community that now includes support for distributed computing via Remote Method Invocation (RMI). We describe the design and implementation...
Multicasting of compressed video streams over wireless networks demands significantly different approaches to error control than those used in wired networks, due to high packet loss rate. This paper describes an expe...
In this paper we demonstrate the use of the Graphics Processing Unit (GPU) to accelerate Evolutionary Computation applications, in particular Genetic Programming approaches. We show that it is possible to get speed in...
ISBN: (print) 9781479943944
DRAM is a major contributor to the total power consumption of modern computing systems. Consequently, reducing DRAM power is critical to improving system-level power efficiency. Fine-grained DRAM architectures [1, 2] have been proposed to reduce the activation/precharge power. However, that prior work either incurs significant performance degradation or introduces large area overhead. In this paper, we propose a novel memory architecture, Half-DRAM, in which the DRAM array is reorganized so that only half of a row is activated. The half-row activation effectively reduces activation power while sustaining the full bandwidth one bank can provide. In addition, the half-row activation in Half-DRAM relaxes the power constraint in DRAM and opens up opportunities for further performance gains. Furthermore, two half-row accesses can be issued in parallel by integrating sub-array-level parallelism to improve memory-level parallelism. The experimental results show that Half-DRAM achieves both significant performance improvement and power reduction, with negligible design overhead.
ISBN: (print) 9780769549699; 9781467360050
Efficient elementary function implementations require primitives optimized for modern FPGAs. Fixed-point function generators are one such type of primitive. When built around piecewise polynomial approximations, they make use of memory blocks and embedded multipliers, mapping well to contemporary FPGAs. Another type of primitive, which can exploit the power series expansions of some elementary functions, is floating-point polynomial evaluation. The high costs traditionally associated with floating-point arithmetic made this primitive unattractive for elementary function implementation on FPGAs. In this work we present a novel and efficient way of implementing floating-point polynomial evaluators on a restricted input range. We show on the atan(x) function in double precision that this very different technique reduces memory block count by up to 50% while only slightly increasing DSP count compared to the best implementation built around polynomial approximation fixed-point primitives.
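As a software illustration only of what polynomial evaluation on a restricted input range means, the sketch below evaluates the truncated power series of atan(x) with Horner's rule. It is not the hardware design described in the abstract; the function name `atan_poly`, the term count, and the range |x| <= 0.125 are assumptions chosen for the example.

```python
import math


def atan_poly(x, terms=8):
    """Approximate atan(x) for small |x| (assumed |x| <= 0.125).

    Uses the truncated power series
        atan(x) = x - x^3/3 + x^5/5 - ...
    evaluated with Horner's rule in the variable x*x.
    """
    x2 = x * x
    acc = 0.0
    # Walk the coefficients from the highest-order term down to the constant.
    for k in range(terms - 1, -1, -1):
        coeff = (-1.0) ** k / (2 * k + 1)
        acc = acc * x2 + coeff
    return x * acc


if __name__ == "__main__":
    for x in (0.0, 0.05, 0.1, 0.125):
        print(x, atan_poly(x), math.atan(x))
```

Restricting the range is what keeps the polynomial short: with |x| <= 0.125 the first omitted term is on the order of x^17/17, which is below double-precision resolution for results of this magnitude.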
ISBN: (print) 0818621508
A description is given of Integrity S2, a fault-tolerant, Unix-based computing system designed and implemented to provide a highly available, fault-tolerant computing platform for Unix-based applications. Unlike some other fault-tolerant computing systems, no additional coding at the user level is required to take advantage of the fault-tolerant capabilities inherent in the platform. The hardware is a RISC-based triple-modular-redundant processing core, with duplexed global memory and I/O subsystems. The goals for this machine, the system architecture, its implementation and resulting performance, and the hardware and software techniques incorporated to achieve fault tolerance are discussed. Fault tolerance has been accomplished without compromising the programmatic interface, operating system or system performance.
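The general idea behind triple modular redundancy can be sketched as a 2-of-3 majority vote over the outputs of three replicas. The abstract does not describe how Integrity S2 performs its voting, so the function below is a generic illustration with a hypothetical name, not the system's mechanism.

```python
def majority_vote(a, b, c):
    """Return the bitwise 2-of-3 majority of three integer words.

    Each output bit is 1 exactly when at least two of the three inputs
    have that bit set, so a fault confined to one replica is masked.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010
    faulty = 0b0110  # one replica disagrees in two bit positions
    print(bin(majority_vote(good, good, faulty)))
```

A single faulty replica cannot change the voted result, which is why no application-level code is needed to benefit from this style of redundancy.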
ISBN: (print) 9783030316761; 9783030316754
Partial discharges represent one of the main mechanisms of ageing of dielectrics, especially when an alternating current is used. This problem is particularly severe when partial discharges are associated with degenerative phenomena such as electrical treeing. In this work, we present the latest development of our simulation codes, which are capable of simulating the evolution of discharges in complex three-dimensional geometries through parallel high-performance-computing technologies. The code is capable of both predicting the evolution of some macroscopic quantities that can be measured directly and estimating the progression of the internal ageing. For instance, it is possible to simulate the creation of chemically active species in the gas and their interactions with the surfaces of the branches. Some examples of the results obtained in a set of test cases will be discussed here.
As a novel network paradigm, Software Defined Networking (SDN) decouples control logic functions from data forwarding devices, and introduces a separate control plane to manipulate underlying switches via southbound i...
Container-managed persistence is an essential technology as it dramatically simplifies the implementation of enterprise data access. However it can also impose a significant overhead on the performance of the applicat...
ISBN: (print) 9781479905904; 9781479905898
As computer hardware becomes increasingly powerful, there is an ongoing trend towards integrating QoS-critical systems as virtual machines (domains) on a common, virtualized computing platform. Given the lower latency of local inter-domain communication (IDC) on the same host (compared to inter-host communication), system administrators may prefer to colocate domains so that they can communicate locally. When multiple IDC flows contend on the same host, it is important to properly prioritize IDC flows among domains to meet their respective QoS requirements. This paper examines the limitations of IDC in Xen, a widely used open-source virtual machine monitor (VMM) that recently has been extended to support real-time domain scheduling. We find that both the VMM scheduler and the manager domain can significantly impact IDC QoS under different conditions, and show that improving the VMM scheduler alone cannot effectively prevent priority inversion for local IDC. To address those limitations, we present RTCA, a Real-Time Communication Architecture within the manager domain in Xen, along with experimental results that demonstrate that the latency of high-priority IDC can be improved dramatically, from milliseconds to microseconds, by a combination of the RTCA and a real-time VMM scheduler.