Search Results - Inner Mongolia University Library

26th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2024

Authors: Chehaitly, Mouhamad; Querol, Jorge; Vummadisetty, Praveen; Chatzinotas, Symeon (University of Luxembourg, SnT, Luxembourg, Luxembourg)

ISBN: (print) 9798331532833

Artificial Intelligence has emerged as a transformative technology, revolutionizing numerous industries by enabling advanced automation, predictive analytics, and decision-making capabilities. It now reaches many domains, including telecommunications, smart manufacturing, autonomous machines, automated disease diagnosis in medical imaging, defense, and others. However, the hardware implementation of Artificial Intelligence comes with certain challenges and constraints, especially in critical areas that leverage machine learning algorithms and real-time data analysis to optimize production processes and improve overall efficiency. Statistical operations play a crucial role in various machine learning algorithms, helping to understand and process data or to make predictions that optimize models. In this work, we develop a high-speed, low-area design and implement statistical operations for image or signal processing on an FPGA device. To enhance performance, we develop different hardware architectures, based on different levels of parallelism, that compute the Mean, Variance, and RMS (Root Mean Square). These generic architectures operate in parallel/pipelined form, with and without memory. The proposed architectures are implemented on an FPGA target (Intel/Altera Agilex 7: AGMH039R47A2E1V) using Altera Quartus Prime Pro Edition version 23.4 and achieve ultra-high throughput with low area consumption compared to state-of-the-art methods. For a 480×640 image, the mean calculation architecture uses 1498 logic registers, 1912 slice LUTs, and just 29 kbits of memory, and it operates at a maximum frequency of 406.5 MHz. Additionally, for an 8×8 image, the mean calculation takes 33 clock cycles and the variance calculation 33+1 clock cycles, compared to other approaches that require more than 64 clock cycles. © 2024 IEEE.

Keywords: Digital storage
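The abstract above names three statistics (Mean, Variance, RMS) without giving the datapath. As a rough software illustration only, the sketch below computes all three in a single pass from two running sums, a common formulation for streaming hardware. It is not taken from the paper: the function name `streaming_stats` and the `lanes` parameter (a hypothetical parallelism factor that consumes several samples per loop step) are assumptions made for this example.

```python
import math


def streaming_stats(samples, lanes=2):
    """One-pass mean, population variance, and RMS from two running sums.

    `lanes` is a hypothetical parallelism factor: each loop step consumes
    up to `lanes` samples, loosely mimicking one accumulate step of a
    parallel datapath. It does not change the numerical result.
    """
    if lanes < 1:
        raise ValueError("lanes must be >= 1")
    total = 0      # running sum of x
    total_sq = 0   # running sum of x * x
    count = 0
    steps = 0      # accumulate steps in this software model only
    for start in range(0, len(samples), lanes):
        chunk = samples[start:start + lanes]
        total += sum(chunk)
        total_sq += sum(x * x for x in chunk)
        count += len(chunk)
        steps += 1
    if count == 0:
        raise ValueError("need at least one sample")
    mean = total / count
    mean_sq = total_sq / count
    # Var(x) = E[x^2] - (E[x])^2; clamp tiny negative rounding residue.
    variance = max(0.0, mean_sq - mean * mean)
    rms = math.sqrt(mean_sq)
    return mean, variance, rms, steps


if __name__ == "__main__":
    pixels = list(range(64))  # stand-in for an 8x8 image, flattened
    print(streaming_stats(pixels, lanes=2))
```

Because the variance and RMS reuse the same sum of squares, they cost only a little extra arithmetic once the mean's accumulation is finished. The `steps` counter belongs to this model and should not be read as the paper's clock-cycle count.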

Generic High-Speed Design with Low-Area Implementations of Statistical Operations Based on an FPGA Device

International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)

Authors: Mouhamad Chehaitly; Jorge Querol; Praveen Vummadisetty; Symeon Chatzinotas (SnT, University of Luxembourg, Luxembourg)

ISBN: (digital) 9798331532833

ISBN: (print) 9798331532840

Keywords: Industries; Machine learning algorithms; Computer architecture; Throughput; Hardware; Real-time systems; Artificial intelligence; Root mean square; Field programmable gate arrays; Clocks

Advanced Baseband Processing Algorithms, Circuits, and Implementations for 5G Communication

IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2017, vol. 7, no. 4, pp. 477-490

Authors: Zhang, Chuan; Huang, Yuan-Hao; Sheikh, Farhana; Wang, Zhongfeng (Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 210096, Jiangsu, Peoples R China; Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 30013, Taiwan; Natl Tsing Hua Univ, Inst Commun Engn, Hsinchu 30013, Taiwan; Intel Corp, Circuit Res Lab, Intel Labs, Hillsboro, OR 97124, USA; Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210046, Jiangsu, Peoples R China)

The rapid emergence of 5G communications technology and standardization has seen an accelerated transfer of theoretical concepts to advanced development and implementation. Not only are 5G baseband signal processing algorithms becoming more important, but also the co-design and implementation of corresponding circuits, architectures, and platforms are becoming necessary due to rapid standardization of 5G communications. This timely overview paper introduces circuits and systems (CAS) for key enabling technologies for the new 5G era: massive MIMO, mmWave baseband systems, NOMA schemes, advanced channel coding, and so on. The state-of-the-art research progress in these areas is summarized for interested readers to initiate discussion on limitations of existing solutions and open research problems that are looking for innovative solutions, especially in the CAS area. We hope this paper can bridge the gap between the theoretical investigation and application implementation for 5G communications.

Keywords: 5G baseband implementations massive MIMO mmWave precoder non-orthogonal multiple-access (NOMA) schemes channel decoders

Dynamically Reconfigurable Architectures for Real-time Baseband Processing

Author: Chenxin Zhang (University of Lund)

Degree level: Doctoral

Motivated by challenges from today's fast-evolving wireless communication standards and soaring silicon design cost, it is important to design a flexible hardware platform that can be dynamically reconfigured to adapt to current operating scenarios, provide seamless handover between different communication networks, and extend the longevity of advanced systems. Moreover, increasingly sophisticated baseband processing algorithms pose stringent requirements of real-time processing for hardware implementations, especially for power-budget limited mobile terminals. With existing hardware platforms such as Application-Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), and Digital Signal Processors (DSPs), the contradictory design requirements of flexibility, computational performance, and hardware efficiency cannot be attained at the same time. To achieve a balance between the aforementioned design requirements, a coarse-grained dynamically reconfigurable cell array architecture is proposed. The architecture is constructed from an array of heterogeneous function units interconnected through a hierarchical on-chip network. The adopted in-cell configuration scheme enables fast context switching between standards and between computational tasks during run-time. Although cell array is a generic hardware platform, this thesis focuses on the architectural development of the cell array tailored specifically for digital baseband processing of contemporary wireless communication systems. Various degrees of flexibilities among operating scenarios, algorithms, tasks, and supporting standards are exploited. Besides, high hardware efficiency is attained by conducting algorithm-architecture, hardware-software, and processing-memory co-design. In this thesis, flexibility, performance and efficiency of the proposed architecture are demonstrated through two case studies. First, the cell array is deployed in a digital front-end receiver, aiming t

NInFEA: an embedded framework for the real-time evaluation of fetal ECG extraction algorithms

Biomedical Engineering-Biomedizinische Technik, 2013, vol. 58, no. 1, pp. 13-26

Authors: Pani, Danilo; Barabino, Gianluca; Raffo, Luigi (Univ Cagliari, Dept Elect & Elect Engn DIEE, I-09123 Cagliari, Italy)

Fetal electrocardiogram (ECG) extraction from non-invasive biopotential recordings is a long-standing research topic. Despite the significant number of algorithms presented in the scientific literature, it is difficult to find information about embedded hardware implementations able to provide real-time support for the required features, bridging the gap between theory and practice. This article presents the NInFEA (non-invasive fetal ECG analysis) tool, an embedded hardware/software framework based on the hybrid dual-core OMAP-L137 low-power processor for the real-time evaluation of fetal ECG extraction algorithms. The hybrid platform, including a digital signal processor (DSP) and a general-purpose processor (GPP), allows achieving the best performance compared with single-core architectures. The GPP provides a portable graphical user interface, whereas the DSP is extensively used for advanced signal processing tasks. As a case study, three state-of-the-art fetal ECG extraction algorithms have been ported onto NInFEA, along with some support routines needed to provide the additional information required by the clinicians and supported by the user interface. NInFEA can be regarded both as a reference design for similar applications and as a common embedded low-power testbed for real-time fetal ECG extraction algorithms.

Keywords: biomedical signal processing; DSP platforms; non-invasive fetal ECG; real-time processing

Energy-Efficient Synthetic-Aperture Radar Processing on a Manycore Architecture

42nd Annual International Conference on Parallel Processing (ICPP)

Authors: Zain-ul-Abdin; Ahlander, Anders; Svensson, Bertil (Halmstad Univ, Ctr Res Embedded Syst CERES, Halmstad, Sweden; Saab AB, Gothenburg, Sweden)

ISBN: (print) 9780769551173

Next-generation radar systems place high performance demands on the signal processing chain. Examples include advanced image-creating sensor systems in which complex calculations must be performed on huge data sets in real time. Manycore architectures are gaining attention as a means to meet the computational requirements of complex radar signal processing by exploiting the massive parallelism inherent in the algorithms in an energy-efficient manner. In this paper, we evaluate a manycore architecture, namely a 16-core Epiphany processor, by implementing two significantly large case studies, viz. an autofocus criterion calculation and the fast factorized back-projection algorithm, both key components in modern synthetic aperture radar systems. The implementation results from the two case studies are compared on the basis of achieved performance and programmability. One of the Epiphany implementations demonstrates the usefulness of the architecture for the streaming-based algorithm (the autofocus criterion calculation) by achieving a speedup of 8.9x over a sequential implementation on a state-of-the-art general-purpose processor of a later silicon technology generation and operating at a 2.7x higher clock speed. In the other case study, a highly memory-intensive algorithm (fast factorized back-projection), the Epiphany architecture shows a speedup of 4.25x. For embedded signal processing, low power dissipation is as important as computational performance. In our case studies, the Epiphany implementations of the two algorithms are, respectively, 78x and 38x more energy efficient.

Keywords: Manycore architecture; Parallel programming; Radar signal processing

Energy-Efficient Synthetic-Aperture Radar Processing on a Manycore Architecture

International Conference on Parallel Processing (ICPP)

Authors: Zain-Ul-Abdin; Anders Åhlander; Bertil Svensson (Centre for Research on Embedded Systems (CERES), Halmstad University, Halmstad, Sweden; Saab AB, Gothenburg, Sweden)

Next-generation radar systems place high performance demands on the signal processing chain. Examples include advanced image-creating sensor systems in which complex calculations must be performed on huge data sets in real time. Manycore architectures are gaining attention as a means to meet the computational requirements of complex radar signal processing by exploiting the massive parallelism inherent in the algorithms in an energy-efficient manner. In this paper, we evaluate a manycore architecture, namely a 16-core Epiphany processor, by implementing two significantly large case studies, viz. an autofocus criterion calculation and the fast factorized back-projection algorithm, both key components in modern synthetic aperture radar systems. The implementation results from the two case studies are compared on the basis of achieved performance and programmability. One of the Epiphany implementations demonstrates the usefulness of the architecture for the streaming-based algorithm (the autofocus criterion calculation) by achieving a speedup of 8.9x over a sequential implementation on a state-of-the-art general-purpose processor of a later silicon technology generation and operating at a 2.7x higher clock speed. In the other case study, a highly memory-intensive algorithm (fast factorized back-projection), the Epiphany architecture shows a speedup of 4.25x. For embedded signal processing, low power dissipation is as important as computational performance. In our case studies, the Epiphany implementations of the two algorithms are, respectively, 78x and 38x more energy efficient.

Keywords: Signal processing algorithms; Computer architecture; Synthetic aperture radar; Parallel processing; Radar imaging; Image resolution

Correctly rounded floating-point division for DSP-enabled FPGAs

International Conference on Field Programmable Logic and Applications

Author: Bogdan Pasca (Altera European Technology Centre, High Wycombe, UK)

Floating-point division is a very costly operation in FPGA designs. High-frequency implementations of the classic digit-recurrence algorithms for division have long latencies (of the order of the number of fraction bits) and consume large amounts of logic. Additionally, these implementations require significant routing resources, making timing closure difficult in complete designs. In this paper we present two multiplier-based architectures for division which make efficient use of the DSP resources in recent Altera FPGAs. By balancing resource usage between logic, memory and DSP blocks, the presented architectures maintain high frequencies in full designs. Additionally, compared to classical algorithms, the proposed architectures have significantly lower latencies. The architectures target faithfully rounded results, similar to most elementary function implementations for FPGAs, but can also be transformed into correctly rounded architectures with a small overhead. The presented architectures are built using the Altera DSP Builder Advanced framework and will be part of the default blockset.

Keywords: Digital signal processing; Polynomials; Field programmable gate arrays; Approximation error; Memory management
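The abstract above contrasts digit-recurrence division with multiplier-based division but does not say which multiplier-based scheme the paper uses (the keywords suggest polynomial approximation). To illustrate the general idea only, the sketch below uses Newton-Raphson reciprocal refinement, a well-known multiplier-based technique that is not claimed to be the paper's architecture. The function name `nr_divide` and the `iterations` parameter are assumptions for this example, and the result carries no faithful or correct rounding guarantee.

```python
import math


def nr_divide(a, b, iterations=4):
    """Approximate a / b with multiplications and subtractions only,
    after a linear initial estimate of the reciprocal (Newton-Raphson).
    Illustrative sketch: no rounding guarantee is made.
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    a, b = abs(a), abs(b)
    # Split b = m * 2**e with m in [0.5, 1), much as a hardware divider
    # handles significand and exponent separately.
    m, e = math.frexp(b)
    # Standard linear starting estimate of 1/m on [0.5, 1].
    x = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(iterations):
        # Each step roughly doubles the number of correct bits.
        x = x * (2.0 - m * x)
    # a / b = a * (1/m) * 2**(-e)
    return sign * math.ldexp(a * x, -e)


if __name__ == "__main__":
    print(nr_divide(1.0, 3.0), 1.0 / 3.0)
```

Each iteration costs two multiplications and one subtraction, which is why schemes of this family map naturally onto DSP blocks, whereas digit-recurrence methods produce a fixed number of quotient bits per step.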

Proceedings of SPIE - Advanced Signal Processing Algorithms, Architectures, and Implementations XVI

Advanced Signal Processing Algorithms, Architectures, and Implementations XVI

ISBN: (print) 0819463922

The proceedings contain 29 papers. The topics discussed include: optimization of spanning tree adders; estimating adders for a low density parity check decoder; sublinear constant multiplication algorithms; new identities and transformations for hardware power operators; interconnection scheme for networks of online modules; reconfigurable architecture for the efficient solution of large-scale non-Hermitian eigenvalue problems; high-resolution iris image reconstruction from low-resolution imagery; using mean-squared error to assess visual image quality; time-frequency analysis of classical and quantum noise; application of time-frequency analysis methods to speaker verification; time-frequency decomposition based on information; time-frequency approximations with applications to filtering, modulation, and propagation; and on the development of a high-order texture analysis using the PWD and Rényi entropy.

Keywords: Signal processing

A high-speed implementation of manifold coordinate representations of hyperspectral imagery: A GPU-based approach to rapid nonlinear modeling

Conference on Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVI

Authors: Topping, T. Russell; French, James; Hancock, Monte F., Jr. (Celestech Inc, Phoenix, AZ 85048, USA)

ISBN: (print) 9780819481597

Working with the Naval Research Laboratory, Celestech has implemented advanced non-linear hyperspectral image (HSI) processing algorithms optimized for Graphics Processing Units (GPUs). These algorithms have demonstrated performance improvements of nearly 2 orders of magnitude over optimal CPU-based implementations. The paper briefly covers the architecture of the NVIDIA GPU to provide a basis for discussing GPU optimization challenges and strategies. The paper then covers optimization approaches employed to extract performance from the GPU implementation of Dr. Bachmann's algorithms, including memory utilization and process thread optimization considerations. The paper goes on to discuss strategies for deploying GPU-enabled servers into enterprise service-oriented architectures. Also discussed is Celestech's ongoing work in the area of middleware frameworks to provide an optimized multi-GPU utilization and scheduling approach that supports multiple GPUs both in a single computer and across multiple computers. This paper is a complementary work to the paper submitted by Dr. Charles Bachmann entitled "A Scalable Approach to Modeling Nonlinear Structure in Hyperspectral Imagery and Other High-Dimensional Data Using Manifold Coordinate Representations". Dr. Bachmann's paper covers the algorithmic and theoretical basis for the HSI processing approach.

Keywords: Hyperspectral Image Processing Graphics Processing Units GPU Non-linear
