ISBN:
(print) 3540364102
This paper proposes a rescheduling of the SHA-1 hash function operations for hardware implementations. The proposal is mapped onto Xilinx Virtex-II Pro technology. The proposed rescheduling allows manipulation of the critical path of the SHA-1 computation, facilitating a more parallelized structure without an increase in the required hardware resources. Two cores have been developed: one that uses a constant initialization vector, and a second that allows different Initialization Vectors (IV), so it can be used in HMAC and in the processing of fragmented messages. A hybrid software/hardware implementation is also proposed. Experimental results indicate a throughput of 1.4 Gbit/s, requiring only 533 slices for a constant IV and 596 for a loadable IV. Comparisons to related SHA-1 art suggest improvements in the throughput/slice metric of 29% over the most recent commercial cores and 59% over current academic proposals.
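For background, the critical path the abstract refers to can be seen in the standard SHA-1 round itself. The sketch below (pure Python, following the FIPS 180 round definition, not the paper's hardware design) shows that each round's new working value `a` depends on the previous `a` through a rotation plus a chain of four additions; shortening that addition chain by precomputing the `a`-independent terms is the kind of manipulation a rescheduling enables.

```python
# One SHA-1 compression round, illustrating the loop-carried
# dependency that forms the critical path in hardware.

MASK = 0xFFFFFFFF

def rotl(x, n):
    """32-bit left rotation."""
    return ((x << n) | (x >> (32 - n))) & MASK

def sha1_round(a, b, c, d, e, f_t, k_t, w_t):
    """Produce the next (a, b, c, d, e) working variables.

    The new `a` (named `temp`) depends on the just-produced `a`
    through rotl(a, 5) plus four additions; e, k_t, w_t and f_t can
    be summed ahead of time, since they do not depend on `a`.
    """
    temp = (rotl(a, 5) + f_t + e + k_t + w_t) & MASK
    return temp, a, rotl(b, 30), c, d
```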
This paper presents a modeling approach based on deterministic and stochastic Petri nets (DSPN) for analyzing the performance of node architectures for MIMD multiprocessor systems with distributed memory. DSPN are a numerically solvable modeling formalism with a graphical representation. The modeling approach supports design decisions for node architectures by providing quantitative results on processor and memory utilization for several design alternatives. To illustrate the proposed approach, DSPN of two node architectures are presented and employed in a comparative performance study.
ISBN:
(print) 354026969X
The Stream model is a high-level Intermediate Representation that can be mapped to a range of parallel architectures. The Stream model has a limited scope because it is aimed at architectures that reduce the control overhead of programmable hardware to improve overall computing efficiency. Despite these limitations, the performance-critical parts of embedded and media applications can often be compiled to this model. Automatic compilation of Stream programs from C code is demonstrated.
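The core idea of stream-style IRs can be sketched as follows (a minimal illustration, not the paper's actual representation): a kernel is a pure function applied element-wise over input streams, which is what lets a compiler map it onto parallel hardware without per-element control overhead.

```python
# Minimal sketch of the stream abstraction: apply a kernel across
# corresponding elements of one or more input streams.

def stream_map(kernel, *streams):
    """Apply `kernel` element-wise over the input streams."""
    return [kernel(*elems) for elems in zip(*streams)]

# Example kernel: a saturating 8-bit add, common in media code.
def sat_add8(x, y):
    return min(x + y, 255)

out = stream_map(sat_add8, [100, 200, 50], [100, 100, 10])
```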
ISBN:
(print) 3540364102
The main goal of an overtaking monitor system is the segmentation and tracking of the overtaking vehicle. This application can be addressed through an optic-flow-driven scheme. We can focus on the rear-view mirror's visual field by placing a camera on top of it. When driving, the ego-motion optic flow pattern is more or less unidirectional, i.e. all static objects and landmarks move backwards while overtaking cars move forward towards our vehicle. This well-structured motion scenario facilitates the segmentation of regular motion patterns that correspond to the overtaking vehicle. Our approach is based on two main processing stages: first, the computation of optical flow using a novel superpipelined and fully parallelized architecture capable of extracting the motion information at frame rates of up to 148 frames per second at VGA resolution (640x480 pixels); second, a tracking stage based on motion pattern analysis that provides an estimated position of the overtaking car. We analyze the system's performance and resource usage and show some promising results on a bank of overtaking-car sequences.
ISBN:
(print) 3540364102
We present a highly efficient automated clock gating platform for rapidly developing power-efficient hardware architectures. Our language, called CoDeL, allows hardware description at the algorithm level and thus dramatically reduces design time. We have extended CoDeL to automatically insert clock gating at the behavioral level to reduce dynamic power dissipation in the resulting architecture. This is, to our knowledge, the first hardware design environment that allows an algorithmic description of a component and yet produces a power-aware design. To estimate the power savings, we have developed an estimation framework, which is shown to be consistent with the savings obtained using statistical power analysis with Synopsys tools. To evaluate our platform, we use the CoDeL implementation of a counter and of various integer transforms used in the realm of DSP (Digital Signal Processing): the discrete wavelet transform, the discrete cosine transform, and an integer transform used in the H.264 (MPEG-4 Part 10) video compression standard. These designs are then clock gated using CoDeL and Synopsys. A simulation-based power analysis of the designed circuits shows that CoDeL's clock gating performs better than Synopsys' automated clock gating: CoDeL reduces power dissipation by 83% on average, while Synopsys gives 81% savings.
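The intuition behind such savings can be sketched with a first-order estimate (invented numbers, not the paper's estimation framework): if a register bank's clock only needs to toggle on cycles where its enable is asserted, gating removes roughly the idle-cycle fraction of that register's clock-tree dynamic power.

```python
# First-order clock-gating estimate: the saved fraction of a gated
# register's clock power equals the fraction of cycles it is idle.

def gated_power_saving(enable_trace):
    """Fraction of cycles in which the gated clock is held off."""
    idle = sum(1 for en in enable_trace if not en)
    return idle / len(enable_trace)

# Hypothetical example: a counter stage that updates on 1 of every
# 8 cycles idles 87.5% of the time.
trace = [(cycle % 8 == 0) for cycle in range(800)]
saving = gated_power_saving(trace)
```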
Modern multimedia applications usually have real-time constraints, and they are implemented using application-domain-specific embedded processors. Dimensioning a system requires accurate estimates of the resources needed by the applications; overestimation leads to over-dimensioning. For a good resource estimation, all the cases in which an application can run must be considered. To avoid an explosion in the number of different cases, those that are similar with respect to required resources are combined into so-called application scenarios. This paper presents a methodology and a tool that can automatically detect the most important variables of an application and use them to select and dynamically predict scenarios, with respect to the necessary time budget, for soft real-time multimedia applications. The tool was tested on three multimedia applications. Using a proactive scenario-based dynamic voltage scheduler built on the scenarios and the runtime predictor generated by our tool, energy consumption decreases by up to 19%, while guaranteeing a frame-deadline miss ratio close to zero.
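A scenario-based voltage/frequency choice can be sketched like this (all scenario names, cycle budgets, and frequency levels below are invented for illustration; the paper's variables and scenarios are detected automatically per application): a control variable maps each frame to a scenario with a known worst-case cycle budget, and the scheduler picks the lowest frequency that still meets the frame deadline.

```python
# Hypothetical scenario table: control-variable value -> cycle budget.
SCENARIOS = {
    "I_frame": 9_000_000,
    "P_frame": 5_000_000,
    "B_frame": 3_000_000,
}
DEADLINE_S = 1 / 30                 # one frame period at 30 fps
FREQS_HZ = [100e6, 200e6, 300e6]    # available DVS levels

def pick_frequency(frame_type):
    """Lowest frequency whose per-frame cycle capacity covers the budget."""
    budget = SCENARIOS[frame_type]
    for f in FREQS_HZ:
        if f * DEADLINE_S >= budget:
            return f
    return FREQS_HZ[-1]             # fall back to the highest level
```

Running cheap scenarios at a lower frequency is where the energy saving comes from, since dynamic power drops superlinearly with voltage and frequency.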
ISBN:
(print) 3540364102
Many telecommunication applications, especially baseband processing, and digital signal processing (DSP) applications call for high-performance implementations due to the complexity of the algorithms and high throughput requirements. In general, the required performance is obtained with the aid of parallel computational resources. In these application domains, software implementations are often preferred over fixed-function ASICs due to their flexibility and ease of development. Application-specific instruction-set processor (ASIP) architectures can be used to exploit the inherent parallelism of the algorithms efficiently while still maintaining flexibility. Using high-level languages to program processor architectures with parallel resources can lead to inefficient resource utilization; on the other hand, parallel assembly programming is error-prone and tedious. In this paper, the inherent problems of parallel programming and software pipelining are mitigated with a parallel language syntax and automatic generation of software-pipelined code for the iteration kernels. With the aid of the developed tool support, the underlying performance of a processor architecture with parallel resources can be exploited, and full utilization of the main processing resources is obtained for pipelined loop kernels. The given examples show that this efficiency can be obtained without sacrificing performance.
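The software-pipelining transformation itself can be illustrated with a toy kernel (the loop body below is invented; the paper's tool targets real ASIP kernels): a 3-stage body of load, compute, and store is split so the steady-state kernel overlaps the compute of one iteration with the load of the next, preceded by a prologue that fills the pipeline and followed by an epilogue that drains it.

```python
# Reference loop: each iteration does load -> compute -> store.
def sequential(xs):
    out = [0] * len(xs)
    for i in range(len(xs)):
        v = xs[i]          # stage 1: load
        v = v * v + 1      # stage 2: compute
        out[i] = v         # stage 3: store
    return out

# Software-pipelined version with explicit prologue and epilogue.
def pipelined(xs):
    n = len(xs)
    out = [0] * n
    if n == 0:
        return out
    loaded = xs[0]                                     # prologue: fill
    for i in range(1, n):
        computed, loaded = loaded * loaded + 1, xs[i]  # overlap stages
        out[i - 1] = computed                          # store iter i-1
    out[n - 1] = loaded * loaded + 1                   # epilogue: drain
    return out
```

On hardware with parallel functional units, the two halves of the kernel's tuple assignment would issue in the same cycle; in Python the transformation only demonstrates the schedule, not the speedup.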
ISBN:
(print) 1424401550
In embedded multiprocessors, cache partitioning is a known technique to eliminate inter-task cache conflicts and thus increase predictability. On such systems, the partitioning ratio is a parameter that should be tuned to optimize performance. In this paper we propose a Simulated Annealing (SA) based heuristic to determine the cache partitioning ratio that maximizes an application's throughput. At its core, the SA method iterates many times over many partitioning ratios, checking the resulting throughput. Hence the throughput of the system has to be estimated very fast, so we use a light simulation strategy. The light simulation derives the throughput from task timings gathered off-line. This is possible because in an environment where tasks don't interfere with each other, their performance figures can be used in any possible combination. An application of industrial relevance (an H.264 decoder) running on a parallel homogeneous platform is used to demonstrate the proposed method. For the H.264 application, a 9% throughput improvement is achieved compared to the throughput obtained when partitioning for the least number of misses. This is a significant improvement, as it represents 45% of the theoretical throughput improvement achievable when assuming an infinite cache.
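The shape of such an SA loop can be sketched as follows. Everything application-specific is invented here: the real method evaluates each candidate ratio with the light simulation, which the stand-in `throughput()` function below only mimics with an arbitrary concave shape over an 8-way split between two tasks.

```python
import math
import random

WAYS = 8                    # hypothetical: cache ways split between 2 tasks

def throughput(ways_a):
    """Stand-in for the light simulation (invented, concave shape)."""
    ways_b = WAYS - ways_a
    return min(10 + 3 * ways_a, 14 + 2 * ways_b)

def anneal(steps=500, t0=5.0, seed=0):
    """Simulated annealing over partition ratios, tracking the best seen."""
    rng = random.Random(seed)
    cur = rng.randrange(1, WAYS)
    best = cur
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9          # cooling schedule
        cand = min(WAYS - 1, max(1, cur + rng.choice((-1, 1))))
        delta = throughput(cand) - throughput(cur)
        # Accept improvements always; worsening moves with prob e^(d/t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            cur = cand
        if throughput(cur) > throughput(best):
            best = cur
    return best
```

Because each candidate evaluation is just a table/model lookup rather than a full simulation, the loop can afford hundreds of iterations, which is the point the abstract makes about needing a fast throughput estimate.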
ISBN:
(print) 1424401550
The cache memory plays a crucial role in the performance of any processor. The cache memory (SRAM), especially the on-chip cache, is 3-4 times faster than the main memory (DRAM), so it can vastly improve processor performance and speed. The cache also consumes much less energy than the main memory, which leads to large power savings, very important for embedded applications. In today's processors, although the cache reduces the energy consumption of the processor overall, the on-chip cache accounts for almost 40% of the processor's total energy consumption. In this paper, we propose a cache architecture for the instruction cache that is a modification of the hotspot architecture. Our proposed architecture consists of a small filter cache in parallel with the hotspot cache, between the L1 cache and the main memory. The small filter cache holds the code that was not captured by the hotspot cache. We also propose a prediction mechanism to steer each memory access to either the hotspot cache, the filter cache, or the L1 cache. Our design has both a faster access time and lower energy consumption than both the filter cache and the hotspot cache architectures. We use the MiBench and MediaBench benchmarks together with the SimpleScalar simulator to evaluate the performance of our proposed architecture and compare it with the filter cache and hotspot cache architectures. The simulation results show that our design outperforms both the filter cache and the hotspot cache in both average memory access time and energy consumption.
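The average-memory-access-time comparison the abstract reports follows the standard AMAT decomposition. A back-of-the-envelope sketch (all latencies and miss rates below are invented, not the paper's measurements): a small structure is probed first, and its misses fall through to the L1 path.

```python
def amat(t_small, miss_small, t_l1, miss_l1, t_mem):
    """Average memory access time with a small cache probed before L1.

    AMAT = t_small + miss_small * (t_l1 + miss_l1 * t_mem)
    """
    return t_small + miss_small * (t_l1 + miss_l1 * t_mem)

# Hypothetical numbers: better capture in the small structure (via
# steering) lowers the miss rate seen there and thus the AMAT.
plain_filter = amat(1, 0.30, 3, 0.05, 100)   # unsteered filter cache
steered      = amat(1, 0.20, 3, 0.05, 100)   # hotspot + filter steering
```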
ISBN:
(print) 9780769528410
Advances in the design, modeling, and simulation of parallel processing systems provide significant research opportunities which lead to improvements in the speed, performance, fault tolerance, flexibility, and cost-effectiveness of distributed systems. Several parameters determine the suitability of a system architecture for a given application; the Average Routing Distance (ARD), however, is perhaps one of the most important parameters in the performance evaluation of parallel processing systems. To this end, mathematical modeling and simulation of the ARD and visit ratio for a class of parallel processing systems are presented.
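As a worked example of what ARD measures (a standard textbook result, not taken from the paper): for a d-dimensional hypercube, the routing distance between two nodes is the Hamming distance of their labels, and the average over all distinct node pairs has the closed form ARD = d * 2^(d-1) / (2^d - 1). The sketch below computes it by brute force so it can be checked against that formula.

```python
def ard_hypercube(d):
    """Average routing distance of a d-dimensional hypercube.

    Node labels are 0..2^d - 1; the hop count between two nodes is
    the Hamming distance of their labels (one hop per differing bit).
    """
    n = 2 ** d
    total = sum(bin(a ^ b).count("1")
                for a in range(n) for b in range(n))
    return total / (n * (n - 1))    # average over ordered pairs, a != b
```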