Search Results - Inner Mongolia University Library

European Conference on Circuit Theory Design

Authors: Nicolosi, Leonardo; Tetzlaff, Ronald; Abt, Felix; Blug, Andreas; Hoefler, Heinrich; Carl, Daniel

Affiliations: Tech Univ Dresden, Inst Grundlagen Elektrotech & Elekt, Dresden, Germany; FGSW, Stuttgart, Germany; Fraunhofer Inst Physikalische Messtechn IPM, Freiburg, Germany

ISBN: (print) 9781424438952

In this paper, a new Cellular Neural Network (CNN) based visual algorithm for welding processes is proposed. The idea described in [1] can be used in processes whose welding direction has a constant orientation that is well known a priori. The algorithm proposed here is omnidirectional in the sense that it does not depend on the welding direction. This enables closed-loop control systems for welding processes with curved seams. On Eye-RIS systems [2], processing times of about 110 µs are achievable for both acquisition and evaluation of full-frame images.

Keywords: component; SIMD processor; laser welding; closed loop system; system application and experience; CNN

A fast parallel Reed-Solomon decoder on a reconfigurable architecture

1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis

Authors: Koohi, A; Bagherzadeh, N; Pan, CZ

Affiliations: Univ Calif Irvine, Dept EECS, Irvine, CA 92717, USA

ISBN: (print) 1581137427

This paper presents a software implementation of a very fast parallel Reed-Solomon decoder on the second generation of the MorphoSys reconfigurable computation platform, which targets streamed applications such as multimedia and DSP. Numerous modifications to the first-generation architecture have produced a scalable, computation- and communication-intensive architecture capable of extracting fine-grained instruction-level parallelism. Many algorithms, as well as the whole Digital Video Broadcasting base-band receiver, have been mapped onto the second-generation architecture with impressive performance. The mapping of a Reed-Solomon decoder proposed in this paper highly parallelizes all of its sub-algorithms, including Syndrome Computation, Berlekamp Algorithm, Chien Search, and Error Value Computation, in a SIMD fashion. The mapping is tested on a cycle-accurate simulator, "Mulate", and the performance is encouragingly better than that of other architectures. The decoding speed of the RS (255,239,16) decoder using two different methods of GF multiplication can be 1.319 Gbps and 2.534 Gbps, respectively. Furthermore, since no functionality is specifically tailored to the Reed-Solomon decoder, the result demonstrates the capability of the MorphoSys architecture to extract instruction-level parallelism from streamed applications.

Keywords: reconfigurable architecture; SIMD processor; Reed-Solomon codes; Berlekamp algorithm; Chien search
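
To illustrate the first sub-algorithm named in the abstract, the following is a minimal sketch of syndrome computation for a code with 255 symbols and 255 - 239 = 16 parity symbols. It is not the MorphoSys mapping. The field polynomial `0x11D`, the primitive element 2, the first root alpha^0, and the function names are all assumptions chosen for illustration; the abstract does not state them, nor which two GF multiplication methods were compared.

```python
# Sketch only: GF(2^8) arithmetic and Reed-Solomon syndrome computation.
# PRIM_POLY, the primitive element 2 and the first root alpha^0 are assumptions.
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements with the shift-and-XOR method."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # reduce modulo the field polynomial
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a GF(2^8) element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def syndromes(received, num_parity=16):
    """Evaluate the received polynomial at alpha^0 .. alpha^(num_parity-1).

    received[0] is the coefficient of the highest power of x.
    Each syndrome is independent of the others, which is what makes
    this step a natural fit for SIMD execution.
    """
    out = []
    for i in range(num_parity):
        root = gf_pow(2, i)
        acc = 0
        for symbol in received:            # Horner's rule
            acc = gf_mul(acc, root) ^ symbol
        out.append(acc)
    return out


if __name__ == "__main__":
    word = [0] * 255                       # the all-zero word is a codeword
    word[254 - 3] = 0x5A                   # inject one error at x^3
    print(syndromes(word))                 # non-zero syndromes flag the error
```

With these conventions, an error-free word yields sixteen zero syndromes, and a single error of value e at position j yields S_i = e * alpha^(i*j).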

The Jubi Approach: a Tool Set for Hardware Acceleration in Sensor Network Applications Field

5th International Conference on Ph.D. Research in Microelectronics and Electronics

Authors: Brousse, O.; Sassatelli, G.; Grize, F.

Affiliations: Univ Montpellier 2, CNRS, LIRMM, 161 Rue Ada, F-34392 Montpellier, France; Univ Lausanne, HEC, ISI, CH-1015 Lausanne, Switzerland

ISBN: (print) 9781424437320

This paper presents a unified design flow that aims at accelerating parallelizable data-intensive applications in the context of ubiquitous computing. This contribution relies on the JubiTool: a set of integrated tools (JubiSplitter, JubiCompiler) that respectively extract and compile the parallelizable parts of applications described in an extended Java language called Jubi. By appending hardware directives to a software agent description, the inherent flexibility of software is combined with the runtime performance of hardware execution. In the case of typical Perplexus applications, such as a biologically plausible neural network simulator, this contribution takes advantage of the intrinsic parallelism of the Perplexus Ubichip, resulting in an expected speedup of one order of magnitude. Finally, we show that this original flow for HW acceleration can be modified to support other types of distributed platforms.

Keywords: development flow; compiler; SIMD processor; Java; distributed computing

Design of a Coarse-grained Processing Element for Matrix Multiplication on FPGA

8th IEEE International Symposium on Embedded Multicore/Manycore Systems-on-Chip (MCSoC)

Authors: Okuyama, Yuichi; Takano, Shigeyuki; Shirai, Tokimasa

Affiliations: Univ Aizu, Dept Comp Sci & Engn, Aizu Wakamatsu, Fukushima, Japan; On Semicond, Gunma, Japan; NJK Corp, Tokyo, Japan

ISBN: (print) 9781479943050

In this paper, we discuss and evaluate the grain size of the PE of RapidMatriX, a matrix-operation-specific architecture with fused multiply-add (FMA) units, on FPGAs. Recent FPGAs have many DSP blocks, which are high-performance arithmetic units. By implementing functional units for matrix operations in the array structure of RapidMatriX, we propose to use DSP blocks efficiently by increasing the grain size of the FMA unit. We implement RapidMatriX using the refined PEs on an FPGA. In addition, we evaluate the clock frequencies and the clock cycles of calculation. As a result, the throughput of the PE for 4 x 4 matrix FMA is 3.14 times that of the original scalar-FMA PEs for 8 x 8 matrix multiplication.

Keywords: FPGA; matrix multiplication; SIMD processor

A Reconfigurable ASIP for High-Throughput and Flexible FFT Processing in SDR Environment

6th International Conference on Digital Image Processing (ICDIP)

Authors: Chen, Ting; Liu, Hengzhu; Zhang, Botao

Affiliations: Natl Univ Def Technol, Changsha, HN, Peoples R China

ISBN: (print) 9781628411867

This paper presents a high-throughput and reconfigurable processor for fast Fourier transform (FFT) processing based on SDR methodology. It adopts an application-specific instruction-set processor (ASIP) and single instruction multiple data (SIMD) architecture to exploit the parallelism of butterfly operations in the FFT algorithm. Moreover, a novel 3-dimensional multi-bank memory is proposed for parallel conflict-free accesses. The overall throughput and power efficiency are greatly enhanced by parallel and streamlined processing. A test chip supporting 64- to 2048-point FFT is set up for experiment. Logic synthesis reveals a maximum clock frequency of 500 MHz and an area of 0.49 mm² for the processor's logic using a low-power 45-nm technology, and the dynamic power estimate is about 96.6 mW. Compared with previous works, our FFT ASIP achieves a higher energy efficiency with relatively low area cost.

Keywords: FFT; SDR; SIMD processor
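
The butterfly parallelism mentioned in the abstract can be seen in a plain radix-2 FFT. The sketch below is a generic textbook decimation-in-time algorithm, not the paper's ASIP or its memory scheme; the function names are hypothetical. Within one stage, every butterfly reads and writes its own pair of elements, so all butterflies of that stage could run at the same time on SIMD lanes.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_radix2(samples):
    """Iterative radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Reorder the input so that each stage works on contiguous blocks.
    data = [0j] * n
    for i, value in enumerate(samples):
        data[bit_reverse(i, bits)] = complex(value)

    size = 2
    while size <= n:                       # one pass per stage
        half = size // 2
        twiddle_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            twiddle = 1 + 0j
            for k in range(half):
                # One butterfly: independent of all others in this stage.
                upper = data[start + k]
                lower = data[start + k + half] * twiddle
                data[start + k] = upper + lower
                data[start + k + half] = upper - lower
                twiddle *= twiddle_step
        size *= 2
    return data


if __name__ == "__main__":
    print(fft_radix2([1, 0, 0, 0, 0, 0, 0, 0]))  # impulse input
```

A 64-point transform has 6 stages of 32 butterflies each, and a 2048-point transform has 11 stages of 1024 butterflies each.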

Maximizing the Potential of Custom RISC-V Vector Extensions for Speeding up SHA-3 Hash Functions

Design, Automation and Test in Europe Conference and Exhibition (DATE)

Authors: Li, Huimin; Mentens, Nele; Picek, Stjepan

Affiliations: Delft Univ Technol, Delft, Netherlands; Leiden Univ, Leiden, Netherlands; Katholieke Univ Leuven, Leuven, Belgium; Radboud Univ Nijmegen, Nijmegen, Netherlands

ISBN: (print) 9798350396249

SHA-3 is considered to be one of the most secure standardized hash functions. It relies on the Keccak-f[1 600] permutation, which operates on an internal state of 1 600 bits, mostly represented as a 5 x 5 x 64-bit matrix. While existing implementations process the state sequentially in chunks of typically 32 or 64 bits, the Keccak-f[1 600] permutation can benefit a lot from speedup through parallelization. This paper is the first to explore the full potential of parallelization of Keccak-f[1 600] in RISC-V based processors through custom vector extensions on 32-bit and 64-bit architectures. We analyze the Keccak-f[1 600] permutation, composed of five different step mappings, and propose ten custom vector instructions to speed up the computation. We realize these extensions in a SIMD processor described in SystemVerilog. We compare the performance of our designs to existing architectures based on vectorized application-specific instruction set processors (ASIP). We show that our designs outperform all related work in throughput due to our carefully selected custom vector instructions.

Keywords: Keccak; SHA-3; vector extensions; SIMD processor; RISC-V
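
As a small illustration of the 5 x 5 x 64-bit state layout, the sketch below implements theta, the first of the five Keccak-f step mappings, on a state stored as 25 lanes of 64 bits. It follows the standard definition of theta and does not reproduce any of the paper's custom vector instructions; the indexing convention `state[x][y]` and the function names are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1


def rotl64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def theta(state):
    """Apply the theta step to a 5 x 5 array of 64-bit lanes, state[x][y]."""
    # Parity of each of the five columns (XOR over y).
    parity = [
        state[x][0] ^ state[x][1] ^ state[x][2] ^ state[x][3] ^ state[x][4]
        for x in range(5)
    ]
    # Each column is mixed with the parities of its two neighbouring columns.
    mix = [
        parity[(x - 1) % 5] ^ rotl64(parity[(x + 1) % 5], 1)
        for x in range(5)
    ]
    # All 25 lane updates are independent, so they can be done in parallel.
    return [[state[x][y] ^ mix[x] for y in range(5)] for x in range(5)]


if __name__ == "__main__":
    state = [[0] * 5 for _ in range(5)]
    state[0][0] = 1                        # a single set bit
    for row in theta(state):
        print([hex(lane) for lane in row])
```

The five column parities and the 25 lane updates have no dependencies among themselves, which is the kind of structure that vector execution can exploit.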

Fractally-structured CMOS processor for quantum-circuit emulation

JAPANESE JOURNAL OF APPLIED PHYSICS PART 1-REGULAR PAPERS SHORT NOTES & REVIEW PAPERS, 2002, Vol. 41, No. 4B, pp. 2329-2334

Authors: O'uchi, S; Fujishima, M; Hoh, K

Affiliations: Univ Tokyo, Sch Engn, Bunkyo Ku, Tokyo 1138656, Japan; Univ Tokyo, Sch Frontier Sci, Bunkyo Ku, Tokyo 1138656, Japan; Japan Sci & Technol Corp, CREST, Bunkyo Ku, Tokyo 1138656, Japan

A novel complementary-metal-oxide-semiconductor (CMOS) processor labelled quantum-circuit processor (QCP) for the high-performance emulation of quantum computing is presented. The QCP performs the emulation of the calculation performed in the quantum circuit by simple matrix calculations based on single-instruction-stream-multiple-data-stream (SIMD) parallel processing. Using the parallel operation of an enormous number of devices in LSI, it executes quantum algorithms at a speed comparable to that of the quantum computer. A 5-qubit processor was implemented using a programmable logic device (PLD), and the quantum Fourier transformation was demonstrated by this processor.

Keywords: quantum computing; quantum circuit; parallel computing; CMOS; SIMD processor; hypercube network; fractal structure
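
The "simple matrix calculations" behind quantum-circuit emulation can be sketched in software as follows. This is a generic state-vector emulation, not the QCP hardware design; the function names and the choice of the Hadamard gate are assumptions for illustration. A 5-qubit register is a vector of 2^5 = 32 amplitudes, and a single-qubit gate is a 2 x 2 matrix applied to 16 independent amplitude pairs, which is why the update suits SIMD processing.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2 x 2 gate to qubit `target` of a state vector."""
    stride = 1 << target
    new_state = list(state)
    for base in range(len(state)):
        if base & stride == 0:
            # Amplitudes whose indices differ only in the target bit form a pair.
            low = state[base]
            high = state[base | stride]
            new_state[base] = gate[0][0] * low + gate[0][1] * high
            new_state[base | stride] = gate[1][0] * low + gate[1][1] * high
    return new_state


INV_SQRT2 = 1 / math.sqrt(2)
HADAMARD = [[INV_SQRT2, INV_SQRT2], [INV_SQRT2, -INV_SQRT2]]


def uniform_superposition(num_qubits):
    """Start in |0...0> and apply a Hadamard gate to every qubit."""
    state = [0.0] * (1 << num_qubits)
    state[0] = 1.0
    for qubit in range(num_qubits):
        state = apply_single_qubit_gate(state, HADAMARD, qubit)
    return state


if __name__ == "__main__":
    amplitudes = uniform_superposition(5)
    print(len(amplitudes), amplitudes[0])
```

Applied to the all-zero input, this circuit produces 32 equal amplitudes, which is also what the quantum Fourier transformation yields for that particular input.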

A novel spatter detection algorithm based on typical cellular neural network operations for laser beam welding processes

MEASUREMENT SCIENCE AND TECHNOLOGY, 2012, Vol. 23, No. 1, Art. 015401

Authors: Nicolosi, L.; Abt, F.; Blug, A.; Heider, A.; Tetzlaff, R.; Hoefler, H.

Affiliations: Tech Univ Dresden, D-01062 Dresden, Germany; Univ Stuttgart, Inst Strahlwerkzeuge, D-7000 Stuttgart, Germany; Fraunhofer Inst Phys Messtech IPM, Freiburg, Germany

Real-time monitoring of laser beam welding (LBW) has increasingly gained importance in several manufacturing processes ranging from automobile production to precision mechanics. In the latter, a novel algorithm for the real-time detection of spatters was implemented in a camera based on cellular neural networks. The camera can be connected to the optics of commercially available laser machines, leading to real-time monitoring of LBW processes at rates up to 15 kHz. Such high monitoring rates allow the integration of other image evaluation tasks, such as the detection of the full penetration hole for real-time control of process parameters.

Keywords: cellular neural networks; imaging systems; laser welding; monitoring systems; SIMD processor; system application and experience; time resolved imaging; spatters
