Search Results - Inner Mongolia University Library

A Comparative Survey of Big Data Computing and HPC: From a Parallel Programming Model to a Cluster Architecture

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2022, Vol. 50, No. 1, pp. 27-64

Authors: Yin, Fei; Shi, Feng. Beijing Inst Technol, Sch Comp Sci, Beijing 100081, Peoples R China

With the rapid growth of artificial intelligence (AI), the Internet of Things (IoT) and big data, emerging applications that cross stacks with different techniques bring new challenges to parallel computing systems. These cross-stack functionalities require one system to possess multiple characteristics, such as the ability to process data under high throughput and low latency, the ability to carry out iterative and incremental computation, transparent fault tolerance, and the ability to perform heterogeneous tasks that evolve dynamically. However, high-performance computing (HPC) and big data computing, as two categories of parallel computing architecture, are incapable of meeting all these requirements. Therefore, by performing a comparative analysis of HPC and big data computing from the perspective of the parallel programming model layer, middleware layer, and infrastructure layer, we explore the design principles of the two architectures and discuss a converged architecture to address the abovementioned challenges.

Keywords: High-performance computing; Big data computing; Parallel computing architecture; Iterative computation; Heterogeneous tasks; Converged architecture; Cross-stack functionality

SPCTRE: sparsity-constrained fully-digital reservoir computing architecture on FPGA

INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2024, Vol. 39, No. 2, pp. 197-213

Authors: Abe, Yuki; Nishida, Kohei; Ando, Kota; Asai, Tetsuya. Hokkaido Univ, Grad Sch Informat Sci & Technol, Sapporo, Japan; Hokkaido Univ, Fac Engn, Sapporo, Hokkaido, Japan; Hokkaido Univ, Fac Informat Sci & Technol, Sapporo, Japan; Hokkaido Univ, Fac Informat Sci & Technol, Grad Sch, Kita 14 Nishi 9, Kita Ku, Sapporo 0600814, Japan

This paper proposes an unconventional architecture and algorithm for implementing reservoir computing on an FPGA. An architecture-oriented algorithm with improved throughput and an architecture designed to reduce memory and hardware resource requirements are presented. The proposed architecture exhibits good performance in terms of benchmarks for reservoir computing. A prediction accelerator for reservoir computing that operates on 55.45 mW at 450 K fps with <3000 LEs is realized by implementing the architecture on an FPGA. The proposed approach presents a novel FPGA implementation of reservoir computing focussing on both algorithms and architecture that may serve as a basis for applications of AI at the network edge.

Keywords: Reservoir computing; Parallel computing architecture; FPGA implementation

2D hierarchical fuzzy clustering using kernel-based membership functions

ELECTRONICS LETTERS, 2016, Vol. 52, No. 3, pp. 193-194

Authors: Proietti, A.; Liparulo, L.; Panella, M. Univ Roma La Sapienza, Dept Informat Engn Elect & Telecommun, Via Eudossiana 18, I-00184 Rome, Italy

2D clustering aims at solving problems concerning bi-dimensional datasets in several application fields, such as medical imaging, image retrieval and computer vision. A novel approach for 2D hierarchical fuzzy clustering is proposed, which relies on the use of kernel-based membership functions. This new metric makes it possible to obtain unconstrained structures for data modelling. The performed tests show that the proposed approach can outperform well-known hierarchical clustering algorithms on different benchmarks, and that it can also be deployed on parallel computing architectures.

Keywords: data models; image processing; pattern clustering; 2D hierarchical fuzzy clustering algorithm; bi-dimensional datasets; computer vision; data modelling; image retrieval; kernel-based membership functions; medical imaging; parallel computing architecture; unconstrained structures

Parallel implementation of the modified subset sum problem in CUDA

22nd Telecommunications Forum, TELFOR 2014

Authors: Ristovski, Zlatko; Mishkovski, Igor; Gramatikov, Sasho; Filiposka, Sonja. Faculty of Computer Sciences and Engineering, P.O. Box 393, Skopje, Macedonia

ISBN: (print) 9781479961900

In recent years, computing has been shifting from 'central processing' on the CPU to 'co-processing' on the CPU and GPU. This paradigm shift is due to the development of the CUDA (Compute Unified Device Architecture) parallel computing architecture. CUDA is a programming model for parallel computing on Graphics Processing Units (GPUs). In this work, we have implemented a parallel solution of the NP-complete modified subset sum algorithm using CUDA. With our implementation, for a certain problem size, we have obtained a speedup of 20 times compared to the CPU version. © 2014 IEEE.

Keywords: graphics processing units; mathematics computing; optimisation; parallel architectures; set theory; CPU; CUDA; GPU; NP-complete modified subset sum algorithm; central processing; compute unified device architecture; computing paradigm shift; coprocessing; parallel computing architecture; Central Processing Unit; Computer architecture; Graphics processing units; Instruction sets; Peer-to-peer computing; Programming; Vectors; GPGPU; Modified subset sum algorithm; Parallel; Speedup

Resource-efficient and scalable solution to problem of real-data polyphase discrete Fourier transform channelisation with rational over-sampling factor

IET SIGNAL PROCESSING, 2013, Vol. 7, No. 4, pp. 296-305

Author: Jones, Keith John. L3 TRL Technol, Head Off, Unit 19, Tewkesbury GL20 8DN, Glos, England

The study describes the results of research carried out into the design of a parallel and resource-efficient solution to the real-data polyphase discrete Fourier transform (DFT), or PDFT. The solution is able to exploit both the real-valued nature of the data and the parallel processing capabilities of the computing technology - assumed to be a field-programmable gate array - to yield a solution with a low size, weight and power requirement. A parallel computing architecture has been devised, based upon batch processing, whereby pipelined operation of the polyphase filter bank (PFB) is achieved using shared resources and pipelined operation of the real-data DFT using the resource-efficient regularised fast Hartley transform (RFHT). The PFB outputs are appropriately re-ordered for input to the RFHT by means of a suitably defined finite state machine. The resulting design, which includes a flexible up-sampling capability (with rational over-sampling factor) to address the problem of adjacent channel interference, trades off time complexity against space complexity in order to satisfy the associated timing constraints. The solution is also scalable, in terms of the number of channels, so that it might be easily adapted, for new or multiple applications, at minimal re-design effort and cost.

Keywords: adjacent channel interference; channel bank filters; computational complexity; discrete Fourier transforms; finite state machines; Hartley transforms; parallel processing; resource-efficient solution; scalable solution; real-data polyphase discrete Fourier transform channelisation; rational over-sampling factor; PDFT; parallel processing capabilities; computing technology; field programmable gate array; parallel computing architecture; batch processing; pipelined operation; polyphase filter bank; PFB; real-data DFT; resource-efficient regularised fast Hartley transform; resource-efficient RFHT; finite state machine; flexible up-sampling capability; trade-off time complexity; space complexity; timing constraints

The Virtual Marathon: Parallel Computing Supports Crowd Simulations

IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2009, Vol. 29, No. 4, pp. 26-33

Authors: Yilmaz, Erdal; Isler, Veysi; Cetin, Yasemin Yardimci. Middle E Tech Univ, Inst Informat, Ankara, Turkey

To be realistic, an urban model must include appropriate numbers of pedestrians, vehicles, and other dynamic entities. Using a parallel-computing architecture, researchers simulated a marathon with more than a million participants. To simulate participant behavior, they used fuzzy logic on a GPU to perform millions of inferences in real time.

Keywords: fuzzy logic parallel algorithms road vehicles sport traffic engineering computing virtual reality GPU algorithm crowd simulation dynamic entity parallel computing architecture participant behavior simulation pedestrian simulation urban model vehicle simulation virtual athlete virtual marathon simulation parallel populace simulation virtual Urban modeling Fuzzy logic virtual reality road vehicles simulating Sports engineering computing Marathons parallel algorithms parallel Lines Land transportation parallel PROCESSING (COMPUTERS) Virtual
