This paper describes a parallel implementation of cone-beam CT reconstruction using MPI (Message Passing Interface) on workstations; it also analyzes the FDK algorithm and its parallel implementation.
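The usual way to parallelize FDK over MPI is to split the projection angles across ranks, filter and back-project each subset locally, and then sum the partial volumes with a reduction. A minimal sketch of the load-partitioning step (illustrative only, not the paper's code; the function name is assumed):

```python
# Illustrative sketch: balanced split of cone-beam projection angles
# across MPI ranks, a typical decomposition for parallel FDK.
def partition_projections(n_proj: int, n_ranks: int) -> list[range]:
    """Return one contiguous range of projection indices per rank,
    with sizes differing by at most one."""
    base, extra = divmod(n_proj, n_ranks)
    ranges, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

# Each rank filters and back-projects only its own angles; the partial
# volumes are then combined with an MPI reduction (e.g. MPI_Reduce).
print(partition_projections(10, 3))  # → [range(0, 4), range(4, 7), range(7, 10)]
```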
ISBN:
(Print) 9781509015948
The emergence of the new standard HEVC (High Efficiency Video Coding) is accompanied by serious problems related to resource consumption and encoding time. New tools and optimizations are strongly recommended to ensure the integration of this new encoder in various platforms and multimedia applications. In this context, the Kvazaar HEVC encoder was introduced to overcome the problems of the HEVC test model (HM) reference software. This academic open-source encoder is tailored to fit the programmer's needs by enabling high-level parallel processing. This paper presents different parallel implementations of the Kvazaar HEVC encoder on a powerful octa-core CubieBoard4 platform comprising a quad-core ARM A7 and a quad-core ARM A15 for power efficiency and high performance in a single chip. A performance comparison of different parallelization strategies is performed. For the single-threaded implementation, experimental results show that the high-speed preset (RD1) can save up to 48% and 91% of encoding time for the Random Access (RA) and All-Intra (AI) configurations, respectively. When moving to the multi-threaded implementation, the time saving is about 65% to 75% for the AI configuration. Moreover, experiments show that Wavefront Parallel Processing (WPP) outperforms tile-level parallelization in terms of encoding speed without inducing video quality degradation or bitrate increase.
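The WPP scheme mentioned above exploits the fact that a CTU row may start encoding once the row above is two CTUs ahead (the offset required for CABAC context inheritance). This yields the classic diagonal wavefront schedule, sketched below (a conceptual illustration, not Kvazaar code):

```python
def wpp_start_step(row: int, col: int) -> int:
    """Earliest step at which CTU (row, col) can start under Wavefront
    Parallel Processing: each CTU row lags the one above by two CTUs,
    giving the 2*row + col diagonal schedule."""
    return 2 * row + col

# CTUs sharing a step value are independent and can run on different threads.
schedule = [[wpp_start_step(r, c) for c in range(4)] for r in range(3)]
print(schedule)  # → [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
```

The diagonal structure explains why WPP ramps up and drains gradually: full parallelism is only available in the middle of the frame.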
The parallel implementation of a novel mesh simplification method, based on a Beowulf cluster, is introduced in detail in this paper. Taking full advantage of the distributed memory and high-performance network, we can simplify out-of-core models quickly and avoid thrashing the virtual memory. In addition, file I/O and load balancing are also considered to ensure near-optimal utilization of the computational resources as well as high-quality output. A set of numerical experiments has demonstrated that our parallel implementation can not only reduce the execution time greatly but also obtain high parallel efficiency.
We present a parallel and linear-scaling implementation of the calculation of the electrostatic potential arising from an arbitrary charge distribution. The approach makes use of the multi-resolution basis of multiwavelets. The potential is obtained as the direct solution of the Poisson equation in its Green's function integral form. In the multiwavelet basis, the formally non-local integral operator decays rapidly to negligible values away from the main diagonal, yielding an effectively banded structure whose bandwidth is dictated only by the requested accuracy. This sparse operator structure has been exploited to achieve linear scaling and a parallel implementation, using both shared memory (OpenMP) and the Message Passing Interface (MPI). The implementation has been tested by computing the electrostatic potential of the electronic density of long-chain alkanes and diamond fragments, showing (sub)linear scaling with the system size and efficient parallelization.
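The linear scaling claimed above follows directly from the banded structure: applying a banded operator costs O(n · b) rather than O(n²), with b fixed by the accuracy rather than the system size. A generic sketch (not the paper's multiwavelet code; `band(i, j)` is an assumed callback returning the matrix element):

```python
def apply_banded(band, n, b, x):
    """y = A x for an effectively banded operator: band(i, j) returns
    A[i][j], assumed negligible for |i - j| > b.  Cost is O(n * b),
    i.e. linear in n for fixed bandwidth b."""
    y = [0.0] * n
    for i in range(n):
        for j in range(max(0, i - b), min(n, i + b + 1)):
            y[i] += band(i, j) * x[j]
    return y

# Tridiagonal all-ones operator on [1, 1, 1]:
print(apply_banded(lambda i, j: 1.0, 3, 1, [1.0, 1.0, 1.0]))  # → [2.0, 3.0, 2.0]
```

Rows are independent, so the outer loop also parallelizes trivially over OpenMP threads or MPI ranks.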
This work focuses on the control of a camera mounted on a differential drive robot via a VPC (Visual Predictive Control) scheme. First, an exact model of the visual feature prediction is presented for this robotic system. Next, relying on the equivalent command vector concept, a parallel implementation on a GPU (Graphics Processing Unit) of the computation of the cost function and its gradient is presented. Finally, results show that the proposed approach is more accurate than the ones classically used and can be up to six times faster than the CPU-based (Central Processing Unit) one for large prediction horizons and numerous visual features. It then becomes possible to implement a VPC controller running fast enough to perform navigation tasks, while guaranteeing closed-loop stability by relying on large prediction horizons.
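The cost function being parallelized is, in essence, a sum of squared visual-feature errors over the prediction horizon; every per-step, per-feature term is independent, which is what makes a one-thread-per-term GPU mapping possible. A minimal CPU reference of that structure (an assumed quadratic cost, not the paper's exact formulation):

```python
def vpc_cost(predicted, reference):
    """Sum of squared visual-feature errors over the prediction horizon.
    predicted/reference: list (per horizon step) of lists of feature values.
    Each term is independent, hence trivially parallelizable (one GPU
    thread per term, followed by a reduction)."""
    return sum((p - r) ** 2
               for feats_p, feats_r in zip(predicted, reference)
               for p, r in zip(feats_p, feats_r))

print(vpc_cost([[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [2.0, 2.0]]))  # → 6.0
```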
The watershed transform has been used as a powerful morphological segmentation tool in a variety of image processing applications. This is because it gives a good segmentation result if a topographical relief and markers are suitably chosen for different types of images. This paper proposes a parallel implementation of the watershed transform on the cellular neural network (CNN) universal machine, called cellular watersheds. Owing to its fine-grain architecture, the watershed transform can be parallelized using local information. Our parallel implementation is based on a simulated immersion process. To evaluate our implementation, we have experimented on the CNN universal chip, ACE16k, with synthetic and real images.
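The simulated-immersion idea can be illustrated on a 1-D signal: pixels are flooded in increasing gray-level order, each joining the basin of an already-flooded neighbor, while unflooded pixels with no labeled neighbor seed new basins at minima. This toy sketch ignores plateau and tie handling, which the CNN implementation must treat properly:

```python
def watershed_1d(signal):
    """Toy immersion watershed on a 1-D signal (simplified: ties and
    plateaus resolved arbitrarily; boundary pixels joining two basins
    are assigned to one of them rather than marked as watershed lines)."""
    n = len(signal)
    label = [0] * n                      # 0 = not yet flooded
    next_label = 1
    for _, i in sorted((v, i) for i, v in enumerate(signal)):
        flooded = [label[j] for j in (i - 1, i + 1) if 0 <= j < n and label[j]]
        if flooded:
            label[i] = flooded[0]        # join an existing basin
        else:
            label[i] = next_label        # new minimum -> new basin
            next_label += 1
    return label
```

Only neighbor labels are inspected at each step, which is exactly the locality the abstract exploits on the fine-grain CNN architecture.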
Modeling complex material failure with competing mechanisms is a difficult task that often leads to mathematical and numerical challenges. This work contributes to the study of localized failure mechanisms by means of phase fields in a variational framework: in addition to the treatment of brittle and ductile fracture, done in previous work, we consider the case of shear band formation followed by ductile fracture. To achieve this, a new degradation function is introduced, which distinguishes between two successive failure mechanisms: (i) plastic strain localization and (ii) ductile fracture. Specifically, the onset of elastic damage is delayed to allow for the formation of shear bands driven by plastic deformations, thus accounting for the mechanisms that precede the coalescence of voids and microcracks into macroscopic ductile fractures. Once a critical degradation value has been reached, a phase-field model is introduced to capture the (regularized) kinematics of macroscopic cracks. To tackle the issue of potentially high computational cost, we propose a parallel implementation of the phase-field approach based on an iterative algorithm. The algorithm was implemented within the Alya system, a high performance computational mechanics code. Several examples show the capabilities of our implementation. We pay special attention to the ability to capture different failure mechanisms.
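The key ingredient described above is a degradation function that stays inert while plastic strain localization develops and only degrades stiffness past a critical value. A hypothetical sketch of such a two-stage function (the threshold name, rescaling, and quadratic form are illustrative assumptions, not the paper's exact definition):

```python
def degradation(d, d_crit=0.5):
    """Hypothetical two-stage degradation function g(d):
    g = 1 while d < d_crit, so shear bands can form first by plastic
    localization; beyond d_crit, a standard quadratic phase-field
    degradation of the remaining range drives ductile fracture."""
    if d < d_crit:
        return 1.0
    s = (d - d_crit) / (1.0 - d_crit)    # rescale [d_crit, 1] to [0, 1]
    return (1.0 - s) ** 2

print(degradation(0.2), degradation(0.75), degradation(1.0))  # → 1.0 0.25 0.0
```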
Parallel implementations of RLS algorithms over systolic architectures are considered and their efficiency in terms of estimate updating rate is discussed. New implementations are proposed, which allow higher throughputs (up to O(1) estimate updates per time unit). Since some of them introduce a distortion with respect to exact RLS, their performance is investigated both analytically and experimentally. Tradeoffs between complexity and performance are discussed.
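For reference, the recursion that the systolic architectures pipeline is the standard RLS update. A minimal scalar (single-weight) version, shown only to fix notation; the paper's contribution is in distributing these updates across processing cells, not in the recursion itself:

```python
def rls_scalar(xs, ds, lam=0.99, delta=100.0):
    """Minimal scalar RLS with forgetting factor lam and
    initialization p = delta (standard textbook recursion)."""
    w, p = 0.0, delta                       # estimate, inverse correlation
    for x, d in zip(xs, ds):
        g = p * x / (lam + x * p * x)       # gain
        e = d - w * x                       # a priori error
        w += g * e
        p = (p - g * x * p) / lam
    return w
```

With a constant input x = 1 and desired output d = 2, the estimate converges quickly to the true weight 2.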
Background: The huge quantity of data produced in biomedical research needs sophisticated algorithmic methodologies for its storage, analysis, and processing. High Performance Computing (HPC) appears as a magic bullet in this challenge. However, several hard-to-solve parallelization and load balancing problems arise in this context. Here we discuss the HPC-oriented implementation of a general-purpose learning algorithm, originally conceived for DNA analysis and recently extended to treat uncertainty on data (U-BRAIN). The U-BRAIN algorithm is a learning algorithm that finds a Boolean formula in disjunctive normal form (DNF), of approximately minimum complexity, that is consistent with a set of data (instances) which may have missing bits. The conjunctive terms of the formula are computed iteratively by identifying, from the given data, a family of sets of conditions that must be satisfied by all the positive instances and violated by all the negative ones; such conditions allow the computation of a set of coefficients (relevances) for each attribute (literal), forming a probability distribution that guides the selection of the term literals. Its great versatility makes U-BRAIN applicable in many fields in which there are data to be analyzed. However, the memory and execution time required are of order O(n^3) and O(n^5), respectively, so the algorithm is unaffordable for huge data sets. Results: We find mathematical and programming solutions able to lead us towards the implementation of the U-BRAIN algorithm on parallel computers. First we give a Dynamic Programming model of the U-BRAIN algorithm, then we minimize the representation of the relevances. When the data are of great size we are forced to use mass memory, and depending on where the data are actually stored, the access times can be quite different. According to the evaluation of algorithmic efficiency based on the Disk Model, in order to r
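The relevance-driven selection step described above can be sketched as follows: per-literal relevances are normalized into a probability distribution, and the most relevant literal is added to the current conjunctive term. This is a simplified illustration of the selection step only, with assumed names; it omits the condition sets and missing-bit handling:

```python
def select_literal(relevances):
    """Normalize per-literal relevances into a probability distribution
    and pick the index of the most relevant literal (greedy choice)."""
    total = sum(relevances)
    probs = [r / total for r in relevances]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

print(select_literal([1.0, 3.0, 1.0]))  # → (1, [0.2, 0.6, 0.2])
```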
Satellite onboard processing for hyperspectral imaging applications is characterized by large data sets, limited processing resources and limited bandwidth of communication links. The CCSDS-123 algorithm is a specialized compression standard assembled for space-related applications. In this paper, a parallel FPGA implementation of CCSDS-123 compression algorithm is presented. The proposed design can compress any number of samples in parallel allowed by resource and I/O bandwidth constraints. The CCSDS-123 processing core has been placed on Zynq-7035 SoC and verified against the existing reference software. The estimated power use scales approximately linearly with the number of samples processed in parallel. Finally, the proposed implementation outperforms the state-of-the-art implementations in terms of both throughput and power.
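The CCSDS-123 predictor works on a causal neighborhood of each sample. As a flavor of the kind of per-sample arithmetic the FPGA pipelines in parallel, here is a neighbor-oriented local sum in the style of the standard, restricted to interior samples (edge handling and the full local-difference/weight machinery of the standard are omitted, so this is a sketch, not a conformant implementation):

```python
def local_sum(img, y, x):
    """Neighbor-oriented local sum of the W, NW, N, NE neighbors of
    sample (y, x), valid for interior samples only (simplified sketch
    in the style of CCSDS-123; the standard defines edge cases too)."""
    return img[y][x - 1] + img[y - 1][x - 1] + img[y - 1][x] + img[y - 1][x + 1]

img = [[1, 2, 3],
       [4, 5, 6]]
print(local_sum(img, 1, 1))  # → 10  (4 + 1 + 2 + 3)
```

Because each sample's neighborhood is small and causal, independent samples can be streamed through replicated hardware pipelines, which is what lets the design scale throughput with the number of parallel samples.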