This paper presents a learning activity designed to improve student motivation towards learning certain topics in computer architecture, particularly assembly-level machine organization. This activity applies a pedagogical method suitable for computer architecture courses. The method improves motivation by allowing students to verify by themselves, on real platforms, that they can apply their architecture knowledge to improve software performance and also by showing them the improvements achieved in complete applications. The activity is compared to other such activities based on different motivation and learning techniques.
ISBN: 0889865078 (print)
This work describes the optimization of a JPEG2000 encoder implementation on an embedded VLIW CPU core. We demonstrate that source code designed specifically for the target leads to significant performance improvements by carefully exploiting the hardware resources of the underlying processor architecture. Conversely, we show that the generic JPEG2000 source code yields performance comparable to that obtained on a single-issue RISC CPU (ARM). The dedicated algorithm implementation, with over 40% savings in total clock cycle count, demonstrates the effectiveness of VLIW CPU cores in high-performance, low-cost, and low-power embedded computing platforms.
ISBN: 9783319571416 (print)
The Internet of Things (IoT) is undoubtedly a current topic for both the private and public sectors. Today's communication technologies make it possible to connect even the simplest physical objects, which often have very limited resources. The IoT also covers many areas that involve sensitive data and in which such constrained devices are used. Using these constrained devices makes security a difficult task. Symmetric ciphers are often considered the best way to encrypt communication on constrained devices. Although many hardware-optimized solutions exist, there are still areas where they cannot be used, e.g. because of price or power limits. This paper focuses on software optimization of a symmetric cipher on a constrained microcontroller. Two main implementations are introduced. Further, we provide experimental measurements and suggestions for reducing time consumption and memory use.
ISBN: 9781467357425; 9781467357401 (print)
This paper presents general software optimization techniques to improve software performance and energy consumption in embedded systems. Software optimization can be categorized into three levels, among them the algorithmic level and the source-code level. These techniques are then applied to optimize our tactical message processing software, a message encoder-decoder for tactical communication installed on weapon systems. The optimized software achieved a performance increase of about 12%, a 72% decrease in memory accesses, and a 35% decrease in memory usage over the original software.
Audio coding is indispensable in daily life, and MPEG AAC is one of the most popular audio coding standards. It has been widely used in various applications. In this paper, we propose VLIW-aware software optimization techniques for the AAC decoding blocks on the parallel architecture core DSP (PACDSP) processor. This approach provides the flexibility to add new extensions and addresses two important issues for DSPs in portable devices: low power consumption and limited resources. We convert the traditional sequential algorithms into parallel processes and minimize the memory utilization of each block. The resulting decoder can operate at a frequency as low as 15 MHz and needs only 27 Kbytes of program memory and 27 Kbytes of data memory.
This paper proposes general software optimization techniques for processor-based embedded systems, which mainly comprise general optimization methods in high-level language and software and hardware co-optimization in assembly language. These techniques are then applied to optimize our MP3 decoder, which is based on RISC32, a RISC core compatible with the MIPS1 instruction set. The final optimized decoder requires 48 MIPS and 49 Kbytes of memory to decode 128 Kbps, 44.1 kHz joint stereo MP3 in real time with a CPI of 1.15, and we have achieved a performance increase of 46.7% and a memory space decrease of 38.8% over the original decoding software.
Multiple-injection technology is widely used in common-rail injection (CRI) systems for diesel engines. Precise control of the injected fuel quantity (FQ) and the timing of every injection directly affects engine efficiency and exhaust performance. As engine speed increases and more than five injections occur per combustion cycle, the injection interval (between consecutive injections) becomes increasingly short. In experiments based on the green diesel (GD)-CRI system applied in the YC6112 engine, the FQ of the main injection (MI) was found to be unbalanced when the pilot injection (PiI) is very close to it. Experimental analysis confirmed that this phenomenon results mainly from the electrohydraulic delay and the fuel pressure waves in the line leading from the rail to the injector. To minimize this influence, a software compensation strategy that applies an additive correction to the injection quantity was developed. With this optimized software, the MI FQ remains stable even when the PiI is very close to it, and the injection interval, conventionally restricted by the system, can be made as short as possible. This helps realize multiple-injection technology in the GD-CRI system and provides further benefit for triple injection and beyond. It also provides a successful example of how up-to-date CRI systems can improve multiple-injection performance in other engine types through this software optimization method.
While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models. Furthermore, optimized software execution on parallel computing systems demands consideration of many parameters at compile-time and run-time. Determining the optimal set of parameters in a given execution context is a complex task, and therefore to address this issue researchers have proposed different approaches that use heuristic search or machine learning. In this paper, we undertake a systematic literature review to aggregate, analyze and classify the existing software optimization methods for parallel computing systems. We review approaches that use machine learning or meta-heuristics for software optimization at compile-time and run-time. Additionally, we discuss challenges and future research directions. The results of this study may help to better understand the state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of software optimization for parallel computing systems. Furthermore, it may aid in understanding the limitations of existing approaches and identification of areas for improvement.
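As a rough illustration of the meta-heuristic search the abstract above refers to, the sketch below tunes a small set of execution parameters with random-restart hill climbing. It is not taken from the reviewed literature: the parameter names (`threads`, `tile`, `unroll`), their value ranges, and the `synthetic_cost` function are all hypothetical stand-ins, and a real tuner would measure actual run time instead of evaluating a formula.

```python
import itertools
import random

# Hypothetical tuning space: names and values are illustrative only.
SPACE = {
    "threads": [1, 2, 4, 8, 16],
    "tile": [8, 16, 32, 64, 128],
    "unroll": [1, 2, 4, 8],
}


def synthetic_cost(cfg):
    """Stand-in for a measured run time; a real tuner would time the program."""
    return (abs(cfg["threads"] - 8) * 3.0
            + abs(cfg["tile"] - 32) / 8.0
            + abs(cfg["unroll"] - 4) * 0.5
            + 10.0)


def neighbors(cfg):
    """Yield configurations that differ from cfg by one step in one parameter."""
    for name, values in SPACE.items():
        i = values.index(cfg[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**cfg, name: values[j]}


def hill_climb(cost, rng, restarts=5):
    """Random-restart hill climbing: move to a better neighbor until none exists."""
    best_cfg, best_cost = None, float("inf")
    for _ in range(restarts):
        cfg = {name: rng.choice(values) for name, values in SPACE.items()}
        current = cost(cfg)
        improved = True
        while improved:
            improved = False
            for cand in neighbors(cfg):
                c = cost(cand)
                if c < current:
                    cfg, current, improved = cand, c, True
                    break
        if current < best_cost:
            best_cfg, best_cost = cfg, current
    return best_cfg, best_cost


def exhaustive(cost):
    """Brute-force baseline: evaluate every configuration in SPACE."""
    names = list(SPACE)
    combos = itertools.product(*SPACE.values())
    best = min((dict(zip(names, combo)) for combo in combos), key=cost)
    return best, cost(best)


if __name__ == "__main__":
    print(hill_climb(synthetic_cost, random.Random(0)))
    print(exhaustive(synthetic_cost))
```

On this toy cost surface the local search and the brute-force baseline agree, because the cost has a single minimum. Measured performance landscapes are usually noisier and may have many local minima, which is one reason for using restarts or learned models rather than a single greedy descent.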
ISBN: 9781538617014 (print)
Energy efficiency and computational time have become key usability factors for low-power computational systems. In this sense, optimizing the source code is a major challenge in making low-power computational systems competitive with others and affordable for customers. The paper describes several experiments with a modern low-power system that show the influence of program loop optimization on the power consumption of a microprocessor system. Conclusions about the positive effect of automatic software optimization are presented.
ISBN: 9783037852866 (print)
The TI C6000 series digital signal processors combine a VLIW structure with a Harvard structure, and an electronic system based on these DSPs can meet real-time requirements by making full use of these structures in its software. The execution efficiency of the software therefore directly affects the real-time behavior of the whole system. In this article, several software optimization methods for C6000 DSPs are summarized, including the use of intrinsics, data access bandwidth, and software pipelining. Applying these methods to C code can resolve most real-time problems in processing, so that the system meets its real-time requirements.