In processor architectures such as MIPS, Alpha, SPARC and PowerPC, global and static variables are always accessed through indirect addressing. Because the addresses of these variables and their values reside in different data sections of the binary file, the data locality of the program is poor, and repeatedly loading the read-only addresses tends to cause a non-trivial number of redundant memory accesses and data cache misses. Moreover, indirect addressing generates two sequential load instructions with a data dependence between them, which reduces the instruction-level parallelism (ILP) of the program. The authors present a feedback-based address register promotion method (ARPF) to solve these problems. The ARPF algorithm reduces redundant accesses to the read-only addresses of global and static variables, increases the instruction-level parallelism of a program, and avoids the performance loss caused by the additional register pressure that register promotion introduces. The algorithm has been implemented in the Loongson compiler for the MIPS architecture. Experiments on the SPEC CPU2000INT benchmarks show that ARPF improves the performance of all benchmarks by 1%-6%.
ISBN (print): 9781424449729; 9780769538242
SHACAL-1, one of the finalists of the NESSIE project, originates from the compression component of the widely used hash function SHA-1. Its confusion and diffusion requirements are met through mixing operations and rotations rather than substitution and permutation, so there is little literature on its immunity against fault attacks. In this paper, we apply differential fault analysis to SHACAL-1 in a synthetic approach. We introduce the random word fault model, present some theoretical arguments, and give an efficient fault attack based on the characteristics of the cipher. Both theoretical predictions and experimental results demonstrate that 72 random faults are needed to obtain the 512-bit key with a success probability of more than 60%, while 120 random faults are enough to obtain the 512-bit key with a success probability of more than 99%.
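The relationship between SHACAL-1 and the SHA-1 compression function described above can be sketched in a few lines. This is a minimal illustration of the cipher's mixing operations and rotations, not the fault attack from the paper; the function names (`expand_key`, `shacal1_encrypt`, `sha1_single_block`) are hypothetical and chosen only for this example. The last helper checks the sketch against `hashlib` by using a padded message block as the 512-bit key and adding the SHA-1 feed-forward.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
SHA1_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def expand_key(key):
    """Expand a 512-bit key into 80 round words (the SHA-1 message schedule)."""
    w = list(struct.unpack(">16I", key))
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w


def shacal1_encrypt(state, key):
    """Encrypt a 160-bit block (five 32-bit words) with 80 SHA-1 style rounds."""
    a, b, c, d, e = state
    for i, w in enumerate(expand_key(key)):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        t = (rotl(a, 5) + f + e + k + w) & MASK
        a, b, c, d, e = t, a, rotl(b, 30), c, d
    return (a, b, c, d, e)


def sha1_single_block(message):
    """Rebuild SHA-1 for a message of at most 55 bytes from the cipher above."""
    assert len(message) <= 55
    padded = (message + b"\x80" + bytes(55 - len(message))
              + struct.pack(">Q", 8 * len(message)))
    out = shacal1_encrypt(SHA1_IV, padded)
    # SHA-1 adds the input chaining value back in; SHACAL-1 itself omits this.
    return struct.pack(">5I", *((x + y) & MASK for x, y in zip(SHA1_IV, out)))


if __name__ == "__main__":
    print(sha1_single_block(b"abc") == hashlib.sha1(b"abc").digest())
```

Every round uses only additions modulo 2^32, bitwise Boolean functions and rotations, which is the property the abstract contrasts with substitution-permutation designs.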
With the wide application of EDA techniques, the development cycle of electronic products has been shortened. EDA allows hardware to be designed in software and reduces costs. Based on an analysis of the principle of the digital logic analyzer circuit, this paper discusses the working principle of its flip-flop circuit module and its implementation on an FPGA, and presents the program design and simulation results for part of the circuits.
Cellular Automaton (CA) based traffic flow models have been extensively studied in recent years due to their effectiveness and simplicity. This paper develops a discrete time Markov chain (DTMC) analytical framework for a combined Nagel-Schreckenberg and Fukui-Ishibashi CA model (the W^2H traffic flow model) from a microscopic point of view to capture the macroscopic steady state speed distributions. The inter-vehicle spacing Markov chain and the steady state speed Markov chain are proved to be irreducible and ergodic. The theoretical speed probability distributions, which depend on the traffic density and the stochastic delay probability, are in good accordance with numerical simulations. The fundamental diagram of the average speed derived from the theoretical speed distributions is equivalent to the results of previous work.
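A numerical simulation of the kind the theoretical distributions are compared against can be sketched as follows. This is a minimal sketch of the plain Nagel-Schreckenberg update on a ring road only, not the combined W^2H model of the paper, whose exact rules are not given in the abstract; the function names and all parameter values (`road_len`, `n_cars`, `vmax`, `p_slow`) are assumptions made for illustration.

```python
import random
from collections import Counter


def nasch_step(positions, speeds, road_len, vmax, p_slow, rng):
    """One parallel Nagel-Schreckenberg update on a ring road.

    positions[i + 1] is the car directly ahead of positions[i] (cyclically).
    """
    n = len(positions)
    new_speeds = []
    for i in range(n):
        # Empty cells between this car and the car ahead.
        gap = (positions[(i + 1) % n] - positions[i] - 1) % road_len
        v = min(speeds[i] + 1, vmax)   # 1. accelerate
        v = min(v, gap)                # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_slow:
            v -= 1                     # 3. random delay with probability p_slow
        new_speeds.append(v)
    # 4. move all cars simultaneously
    new_positions = [(x + v) % road_len for x, v in zip(positions, new_speeds)]
    return new_positions, new_speeds


def speed_distribution(road_len=200, n_cars=40, vmax=5, p_slow=0.3,
                       warmup=500, samples=500, seed=0):
    """Estimate the steady state speed distribution by simulation."""
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(road_len), n_cars))
    speeds = [0] * n_cars
    counts = Counter()
    for step in range(warmup + samples):
        positions, speeds = nasch_step(positions, speeds, road_len,
                                       vmax, p_slow, rng)
        if step >= warmup:
            counts.update(speeds)
    total = sum(counts.values())
    return {v: counts[v] / total for v in range(vmax + 1)}


if __name__ == "__main__":
    for v, prob in speed_distribution().items():
        print(v, round(prob, 3))
```

Varying `n_cars / road_len` (the traffic density) and `p_slow` (the stochastic delay probability) gives empirical speed distributions that an analytical Markov chain result could be checked against.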
ISBN (print): 9781424443666
In this paper, space charge in an oil-paper insulation system has been investigated using the pulsed electroacoustic (PEA) technique. A series of measurements was carried out with the insulation system subjected to different applied voltages. Charge dynamics in the insulation system during volts-on, volts-off and decay have been analyzed. It has been found that homocharge injection occurred at both the anode and the cathode. Positive charges are observed to accumulate in the layers, which indicates that the oil-paper layer interfaces act as a barrier for positive charges. The decay tests showed that after 30 min about a quarter of the space charge remained in the sample, and that about 90% of the charge had disappeared after 2 hours. Finally, the total charge variation in these tests was analyzed.
Speculation is an important method for overcoming control flow constraints during instruction scheduling. On the one hand, speculation can exploit more instruction-level parallelism and improve performance. On the other hand, it may lengthen the live ranges of variables and increase register pressure, which in turn causes some variables to be spilled to memory and degrades performance. Previous work on register pressure sensitive instruction scheduling generally scheduled instructions conservatively when there were too many live variables in the scheduling region. However, different variables have different spilling costs and different impacts on performance. Here a register pressure sensitive speculative instruction scheduling technique is presented, which not only considers the number of live variables, but also analyzes the benefits and the spilling costs brought by the speculative motion of instructions. The reduction in cycles on the critical path is taken as the benefit, while the spilled variables are predicted and their spilling cost is taken as the cost. Only speculative motions whose benefit is greater than their cost are permitted in our method. The algorithm has been implemented in the Godson compiler for the MIPS architecture. Experimental results show that the method obtains a 1.44% speedup on average relative to its register pressure insensitive counterpart on the SPEC CPU2000INT benchmarks.
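The benefit-versus-cost test described above can be illustrated with a small decision function. This is a hedged sketch, not the Godson compiler's implementation: the spill prediction used here (when live variables exceed the available registers, the cheapest ones are assumed to be spilled) and the names `predicted_spill_cost` and `allow_speculation` are assumptions made for this example only.

```python
def predicted_spill_cost(live_vars, num_regs, spill_cost):
    """Predict the total spill cost for a set of simultaneously live variables.

    Assumption for this sketch: when more variables are live than registers
    are available, the variables with the lowest spill cost are spilled.
    """
    live = set(live_vars)
    excess = len(live) - num_regs
    if excess <= 0:
        return 0.0
    cheapest = sorted(spill_cost[v] for v in live)[:excess]
    return float(sum(cheapest))


def allow_speculation(cycles_saved, live_vars, extra_vars, num_regs, spill_cost):
    """Permit a speculative motion only if its benefit exceeds its cost.

    cycles_saved: predicted reduction of the critical path, in cycles.
    extra_vars:   variables whose live ranges the motion extends into the region.
    """
    before = predicted_spill_cost(live_vars, num_regs, spill_cost)
    after = predicted_spill_cost(set(live_vars) | set(extra_vars),
                                 num_regs, spill_cost)
    return cycles_saved > (after - before)


if __name__ == "__main__":
    costs = {"a": 10, "b": 8, "c": 6, "d": 3, "e": 5}
    # Four registers: adding d and e to {a, b, c} forces one predicted spill.
    print(allow_speculation(2, ["a", "b", "c"], ["d", "e"], 4, costs))
    print(allow_speculation(4, ["a", "b", "c"], ["d", "e"], 4, costs))
```

Comparing against the spill cost rather than the raw count of live variables is what lets a motion proceed when only cheap variables would be spilled.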
Huffman codes are widely used as a very efficient technique for compressing data. To achieve a high compression ratio, some properties of encoding and decoding for the canonical Huffman table are discussed. A study an...
Particle swarm optimization (PSO) is a new stochastic population-based search methodology that simulates animal social behaviors such as birds flocking and fish *** improvements have been proposed within the framework of this biological assumption. However, in this paper, the search pattern of PSO is used to model the branch growth process of natural *** provides a different potential manner from artificial *** illustrate the effectiveness of this new model, the apical dominance phenomenon is introduced to construct a novel variant by emphasizing the influence of the *** this improvement, the population is divided into three different kinds of buds associated with their ***, a mutation strategy is applied to enhance the ability to escape from a local *** Simulation results demonstrate good performance of the new method when solving high-dimensional multi-modal problems.
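For orientation, the search pattern that the paper reinterprets can be sketched with the canonical global-best PSO. This is a minimal sketch of the standard algorithm only; it does not implement the apical dominance variant, the three kinds of buds, or the mutation strategy, whose details the abstract does not give. The function name `pso_minimize` and the parameter values (`w`, `c1`, `c2`, swarm size) are assumptions chosen for illustration.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical global-best PSO for minimizing f over a box [lo, hi]^dim."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + attraction to personal best + attraction to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple unimodal test function with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    best, value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
    print(best, value)
```

The sphere function is used here only because it is easy to check; the paper's claims concern high-dimensional multi-modal problems, which this sketch does not test.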
We report a passively mode-locked, high repetition rate, erbium-doped femtosecond fiber laser based on nonlinear polarization rotation, with a fundamental repetition rate of 101.94 MHz. The output power is 34 mW when pumped by a single-mode fiber-coupled laser diode at 370 mW. The spectral width is 25 nm, corresponding to a transform-limited pulse width of 105 fs.
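The figures quoted above can be related with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the center wavelength (1560 nm) and the sech² time-bandwidth product (0.315) are assumptions, since the abstract states neither, so the estimated pulse width is only expected to land near the reported 105 fs.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def pulse_energy_nj(avg_power_mw, rep_rate_mhz):
    """Energy per pulse in nJ from average power and repetition rate."""
    return avg_power_mw * 1e-3 / (rep_rate_mhz * 1e6) * 1e9


def optical_efficiency(output_mw, pump_mw):
    """Ratio of output power to pump power."""
    return output_mw / pump_mw


def transform_limited_fwhm_fs(bandwidth_nm, center_nm, tbp=0.315):
    """Transform-limited FWHM pulse width in fs.

    tbp is the time-bandwidth product; 0.315 assumes a sech^2 pulse shape
    (about 0.441 would apply to a Gaussian). center_nm is an assumed value.
    """
    delta_nu = C * bandwidth_nm * 1e-9 / (center_nm * 1e-9) ** 2  # Hz
    return tbp / delta_nu * 1e15


if __name__ == "__main__":
    print(round(pulse_energy_nj(34, 101.94), 3), "nJ per pulse")
    print(round(100 * optical_efficiency(34, 370), 1), "% efficiency")
    print(round(transform_limited_fwhm_fs(25, 1560)), "fs (assumed 1560 nm, sech^2)")
```

With the reported 34 mW at 101.94 MHz, the pulse energy comes out at roughly 0.33 nJ, and the output-to-pump ratio at roughly 9%.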
Dataflow predication provides lightweight full support for predicated execution in dataflow-like architectures. One of its major overheads is the large number of fanout trees for distributing predicates to all depen...