We conduct an experimental analysis of a distributed randomized algorithm for edge coloring simple undirected graphs. The algorithm is extremely simple yet, according to the probabilistic analysis, it computes nearly ...
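The abstract above is truncated at the source. For context only, the sketch below illustrates one classic round-based scheme in this family of algorithms, in which every uncolored edge tentatively picks a random color from its free palette and keeps the color only when no adjacent edge made the same pick that round. This is an assumed, generic variant, not necessarily the algorithm analyzed in the paper, and all names in it are illustrative.

```python
import random
from collections import defaultdict

def randomized_edge_coloring(edges, num_colors):
    """Round-based randomized edge coloring (illustrative sketch).

    Each round, every uncolored edge tentatively picks a random color
    not already used by an adjacent colored edge; the pick is kept only
    if no adjacent edge made the same tentative pick this round.
    """
    color = {}                   # edge -> final color
    incident = defaultdict(set)  # vertex -> incident edges
    for u, v in edges:
        incident[u].add((u, v))
        incident[v].add((u, v))

    def neighbors(e):
        u, v = e
        return (incident[u] | incident[v]) - {e}

    uncolored = set(edges)
    while uncolored:
        tentative = {}
        for e in uncolored:
            used = {color[n] for n in neighbors(e) if n in color}
            free = [c for c in range(num_colors) if c not in used]
            if free:
                tentative[e] = random.choice(free)
        for e, c in tentative.items():
            if all(tentative.get(n) != c for n in neighbors(e)):
                color[e] = c
                uncolored.discard(e)
    return color

# Example: color a 5-cycle; an odd cycle needs Delta + 1 = 3 colors.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(randomized_edge_coloring(cycle, 3))
```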
Previously developed constitutive models and solution algorithms for continuum-level anisotropic elastoplastic material strength and an isotropic damage model, TEPLA, have been implemented in the three-dimensional Eulerian hydrodynamics code CONEJO. The anisotropic constitutive modeling is posed in an unrotated material frame of reference, using the polar decomposition theorem to compute the rigid-body rotation. TEPLA is based upon the Gurson flow surface (a potential function used in conjunction with the associated flow law). The original TEPLA equation set has been extended to include anisotropic elastoplasticity and recast into a new implicit solution algorithm based upon an eigenvalue scheme to accommodate the anisotropy. The algorithm solves a two-by-two system of nonlinear equations using Newton-Raphson iteration. Simulations of shaped-charge jet formation, a Taylor cylinder impact, and an explosively loaded hemishell were selected to demonstrate the utility of this modeling capability. The predicted deformation topology, plastic strain, and porosity distributions are shown for all three simulations.
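The abstract mentions a two-by-two nonlinear system solved by Newton-Raphson iteration. The sketch below shows the generic shape of such an iteration; the residual and Jacobian here are placeholder functions, not the actual TEPLA equation set.

```python
import numpy as np

def newton_2x2(residual, jacobian, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration for a 2x2 nonlinear system r(x) = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            return x
        # Solve J(x) dx = r(x), then take the Newton step x <- x - dx.
        dx = np.linalg.solve(jacobian(x), r)
        x = x - dx
    raise RuntimeError("Newton iteration did not converge")

# Placeholder residuals standing in for the coupled plasticity/porosity
# equations (NOT the actual TEPLA equation set):
r = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
J = lambda x: np.array([[2 * x[0], 1.0], [1.0, 2 * x[1]]])
print(newton_2x2(r, J, x0=[1.0, 1.0]))  # converges to (1, 2)
```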
ISBN: (Print) 9780818676413
Path-oriented scheduling methods, such as trace scheduling and hyperblock scheduling, use speculation to extract instruction-level parallelism from control-intensive programs. These methods predict important execution paths in the current scheduling scope using execution profiling or frequency estimation. Aggressive speculation is then applied to the important execution paths, possibly at the cost of degraded performance along other paths. Therefore, the speed of the output code can be sensitive to the compiler's ability to accurately predict the important execution paths. Prior work in this area has utilized Fisher's speculative yield function, coupled with dependence height, to distribute instruction priority among execution paths in the scheduling scope. While this technique provides more stable performance by attending to the needs of all paths, it does not directly address the mismatch between compile-time prediction and run-time behavior. The work presented in this paper extends the speculative yield and dependence height heuristic to explicitly minimize the penalty suffered by other paths when instructions are speculated along a path. Since the execution time of a path is determined by the number of cycles spent between a path's entrance and exit in the scheduling scope, the heuristic attempts to eliminate unnecessary speculation that delays any path's exit. Such control of speculation makes performance much less sensitive to the actual path taken at run time. The proposed method strongly emphasizes minimizing the delay to all exits, hence the name speculative hedge. This paper presents the speculative hedge heuristic and shows how it controls over-speculation in a superblock/hyperblock scheduler. The stability of output code performance in the presence of execution variation is demonstrated with six programs from the SPEC CINT92 benchmark suite.
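To make the flavor of a speculative-yield-style priority concrete, the sketch below weights an instruction's dependence height toward each exit by that exit's profile probability. The data structures and weighting are simplified assumptions for illustration, not the paper's exact heuristic.

```python
def priority(instr, exits, needed_by, dep_height):
    """Profile-weighted priority of one instruction (illustrative).

    instr:      instruction id
    exits:      {exit_id: profile probability of taking that exit}
    needed_by:  {exit_id: set of instructions required before that exit}
    dep_height: {(instr, exit_id): cycles on the critical path to exit}
    """
    total = 0.0
    for ex, prob in exits.items():
        if instr in needed_by[ex]:
            total += prob * dep_height.get((instr, ex), 0)
    return total

# Hypothetical scheduling scope with two exits:
exits = {"exit_A": 0.7, "exit_B": 0.3}
needed_by = {"exit_A": {"i1", "i2"}, "exit_B": {"i1", "i3"}}
dep_height = {("i1", "exit_A"): 4, ("i1", "exit_B"): 2,
              ("i2", "exit_A"): 1, ("i3", "exit_B"): 5}
for i in ("i1", "i2", "i3"):
    print(i, priority(i, exits, needed_by, dep_height))
```

Here `i1`, needed by both exits, scores highest (3.4), so it is scheduled early without delaying either exit; speculating an instruction needed by only one exit ahead of it would risk delaying the other exit, which is the over-speculation the heuristic penalizes.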
ISBN: (Print) 9781605589114
This paper describes a multi-FPGA-based platform for emulating the Loongson-2G microprocessor on different motherboards. The platform was developed for verification and evaluation of the Loongson-2G, the next generation of the Loongson-2 family, which consists of a four-issue, out-of-order 64-bit MIPS-compatible processor core named GS464, a 1 MB secondary cache, a HyperTransport IO interface, a DDR2/3 memory interface, and several other low-speed IO interfaces. Most parts of the microprocessor are mapped onto the platform, which consists of two Virtex-5 330 FPGA chips. Semi-custom partitioning tactics spanning the entire design flow were developed to synthesize the whole design onto the multi-FPGA platform, and architectural modifications were applied to the original chip design to make it easier to partition into two parts. The high-speed SerDes of the HyperTransport IO link and the DDR2/3 memory interface are emulated using several clocks with different phases. To address the difficulty of debugging within an FPGA system, a software-probe method assisted by hardware modules injected into the FPGA was developed; it was used to debug a problem caused by behavioral mismatches between the ASIC RAM block and the FPGA RAM block. Performance evaluation of the Loongson-2G was carried out on the platform as a pre-silicon test. To the authors' knowledge, no previous work has applied a design of this size to verification and evaluation in this way.
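The multi-phase-clock idea can be pictured as follows: N clocks, each at 1/N of the link rate and offset in phase, together sample every bit of the high-speed stream, and re-interleaving the per-phase samples recovers the full-rate data. The sketch below is a purely conceptual software model of that idea, not the platform's actual implementation.

```python
def emulate_serdes(bitstream, num_phases):
    """Model N phase-shifted slow clocks sampling one fast serial stream."""
    # Each "phase domain" samples every num_phases-th bit, offset by its phase.
    lanes = [bitstream[p::num_phases] for p in range(num_phases)]
    # Re-interleave the per-phase samples to recover the full-rate stream.
    recovered = []
    for i in range(max(len(lane) for lane in lanes)):
        for lane in lanes:
            if i < len(lane):
                recovered.append(lane[i])
    return lanes, recovered

bits = [1, 0, 1, 1, 0, 0, 1, 0]
lanes, recovered = emulate_serdes(bits, num_phases=4)
print(lanes)              # four quarter-rate phase-domain streams
assert recovered == bits  # interleaving the phases recovers the link data
```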
The increasing software content of battery-powered embedded systems has fueled much interest in techniques for developing energy-efficient embedded software. Source code transformations have previously been considered for application software to reduce its energy consumption. For complex embedded software applications, which consist of multiple concurrent processes running with the support of an embedded operating system (OS), it is known that the OS and the application-OS interaction significantly affect energy consumption. However, source code transformations explicitly targeting these effects have not been sufficiently studied. This paper proposes novel transformations for the source code of OS-driven multi-process embedded software programs in order to reduce their energy consumption. The key features of our optimizations are that they span process boundaries and that they minimize the energy consumed in the execution of OS functions and services, opportunities that are beyond the reach of conventional compiler optimizations and source code transformation techniques. We propose four types of transformations: process-level concurrency management, message vectorization, computation migration, and inter-process communication mechanism selection. We discuss how to systematically identify opportunities for the proposed transformations and apply them directly to the program source code. We have applied the proposed techniques to several multi-process software benchmark programs and evaluated their applicability in the context of an embedded system containing an Intel StrongARM processor and the embedded Linux OS. Our techniques achieve up to 37.9% (23.8% on average) energy reduction compared to highly compiler-optimized implementations.
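Of the four transformations, message vectorization is the easiest to illustrate in isolation: several small inter-process sends are accumulated and flushed as one larger transfer, so the per-call OS overhead is paid once per batch rather than once per message. The sketch below is a minimal, assumed illustration (the pipe transport, class name, and batch size are all hypothetical), not the paper's implementation.

```python
import os

class VectorizedSender:
    """Batch small IPC messages into one write() per batch (illustrative)."""

    def __init__(self, fd, batch_size=16):
        self.fd = fd
        self.batch_size = batch_size
        self.pending = []

    def send(self, msg: bytes):
        self.pending.append(msg)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            # One system call for the whole batch instead of one per message.
            os.write(self.fd, b"".join(self.pending))
            self.pending = []

r, w = os.pipe()
tx = VectorizedSender(w, batch_size=4)
for i in range(4):
    tx.send(f"msg{i};".encode())
print(os.read(r, 1024))  # b'msg0;msg1;msg2;msg3;' delivered in one write
```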
Stability analysis is a fundamental problem in power system operation and control. Traditional stability analysis methods are based on deterministic initial parameters of the power system. In practice, deterministic values of initial parameters such as load often cannot be determined; frequently only their probability distributions are available, and the traditional methods cannot be applied under these conditions. This paper proposes a method to compute the probability distribution of the fault critical clearing time. The method uses the Gram-Charlier expansion of a random variable and the properties of cumulants and, on the basis of sensitivity computation, converts the computation of the probability distribution of the fault critical clearing time into the computation of the cumulants of the initial parameters. The method was tested on a 39-machine system. Compared with Monte-Carlo simulation, it does not need a large number of simulations as statistical samples; only a single simulation is required to compute the sensitivity of the fault critical clearing time. The results show that the method accurately approximates the cumulative distribution function of the fault critical clearing time while reducing the computational burden and improving computation speed. Using this method, the probability that the fault critical clearing time lies in an interval close to its expected value can be determined, and it can serve as a tool for stability analysis.
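For orientation, the sketch below shows a Gram-Charlier A-series approximation of a CDF built from the first few cumulants. The cumulant values are hypothetical and the code is illustrative only, not the paper's implementation.

```python
import math

def gram_charlier_cdf(x, k1, k2, k3, k4=0.0):
    """Gram-Charlier A-series CDF approximation from cumulants k1..k4.

    F(x) ~ Phi(z) - phi(z) * [g1/6 * He2(z) + g2/24 * He3(z)], where
    z = (x - k1)/sqrt(k2), skewness g1 = k3/k2**1.5, excess kurtosis
    g2 = k4/k2**2, and He_n are the probabilists' Hermite polynomials.
    """
    sigma = math.sqrt(k2)
    z = (x - k1) / sigma
    phi = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    g1, g2 = k3 / k2**1.5, k4 / k2**2
    he2, he3 = z * z - 1, z**3 - 3 * z
    return Phi - phi * (g1 / 6 * he2 + g2 / 24 * he3)

# Hypothetical cumulants of a critical clearing time (seconds):
# mean 0.25 s, variance 0.0004 s^2, slight positive skew.
for t in (0.22, 0.25, 0.28):
    print(t, round(gram_charlier_cdf(t, 0.25, 0.0004, 1e-6), 4))
```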
Reservoir lifetime can be interpreted as the number of years a reservoir can be used to fulfil its purpose. This study proposes an approach to predict the remaining life of the Semenyih reservoir using an empirical method; the result can support important decisions about the water supply. The paper focuses on Semenyih dam, one of the Klang Valley's major dams in Selangor, Malaysia, built in 1985 with a planned lifetime of 100 years. The watershed contains one of the main rivers in the state of Selangor, which has been negatively affected by industrial and urban wastes since the early 1990s. The estimated sediment volume in Semenyih Reservoir grew from 13,165,400 m³ in 2004 to 13,511,900 m³ in 2010; as calculated in 2016, sediment delivery in the Semenyih catchment had reached 13,665,450 m³. With the empirical method, the remaining lifetime of Semenyih Reservoir can be estimated in years, and this figure can then be used as a reference to predict the remaining dead-storage volume of the reservoir using the sediment approach. In this paper the remaining lifetime of the reservoir was estimated at 65 years.
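The rate-based arithmetic behind such an estimate can be illustrated as follows. Only the sediment volumes come from the abstract; the remaining dead-storage capacity below is a hypothetical placeholder chosen purely for illustration.

```python
# Worked example of the rate-based lifetime arithmetic. The sediment
# volumes are from the abstract; REMAINING_CAPACITY_M3 is a hypothetical
# placeholder (the actual dead-storage figure is not given above).

sediment_2004 = 13_165_400   # m^3, estimated cumulative sediment
sediment_2016 = 13_665_450   # m^3, calculated cumulative sediment
annual_rate = (sediment_2016 - sediment_2004) / (2016 - 2004)
print(f"average sedimentation rate: {annual_rate:,.0f} m^3/year")

REMAINING_CAPACITY_M3 = 2_700_000    # hypothetical remaining dead storage
remaining_life = REMAINING_CAPACITY_M3 / annual_rate
print(f"estimated remaining life: {remaining_life:.0f} years")
```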