Search Results - Inner Mongolia University Library

Development of a vector algorithm of three-dimensional crystal lattice parametric identification based on estimation of the spacing between adjacent lattice planes

Procedia Engineering, 2017, vol. 201, pp. 690-697

Authors: Alexandr Shirokanev, Dmitriy Kirsh, Alexandr Kupriyanov (Samara National Research Institute, 34 Moskovskoye shosse, Samara 443086, Russia; Image Processing Systems Institute - Branch of the Federal Scientific Research Center "Crystallography and Photonics" of Russian Academy of Sciences, Molodogvardeyskaya Street 151, Samara 443001, Russia)

Analysis of a crystal nanostructure relies on information obtained from electron microscopy. Mathematically, a crystal structure is described by unit cells, the minimal building blocks that form the entire crystal lattice by parallel translation. Parametric identification is an important problem in three-dimensional crystal lattice research. Applying the constant step size gradient descent method to this problem ensured a sufficient increase in identification accuracy. However, the computational complexity of that algorithm significantly exceeds that of the existing parametric identification algorithms, which has substantially increased the execution time. To eliminate this disadvantage, this work proposes a vector algorithm of crystal lattice parametric identification implemented with CUDA technology.

Keywords: Bravais lattice; unit cell; gradient descent method; parametric identification; parallel algorithm; CUDA

High Performance LDA through Collective Model Communication Optimization

Procedia Computer Science, 2016, vol. 80, pp. 86-97

Authors: Bingjing Zhang, Bo Peng, Judy Qiu (Indiana University, Bloomington, Indiana, U.S.A.; Peking University, Beijing, China)

LDA is a widely used machine learning technique for big data analysis. The application includes an inference algorithm that iteratively updates a model until it converges. A major challenge is the scaling issue in parallelization owing to the fact that the model size is huge and parallel workers need to communicate the model continually. We identify three important features of the model in parallel LDA computation: 1. The volume of model parameters required for local computation is high; 2. The time complexity of local computation is proportional to the required model size; 3. The model size shrinks as it converges. By investigating collective and asynchronous methods for model communication in different tools, we discover that optimized collective communication can improve the model update speed, thus allowing the model to converge faster. The performance improvement derives not only from accelerated communication but also from reduced iteration computation time as the model size shrinks during the model convergence. To foster faster model convergence, we design new collective communication abstractions and implement two Harp-LDA applications, “lgs” and “rtt”. We compare our new approach with Yahoo! LDA and Petuum LDA, two leading implementations favoring asynchronous communication methods in the field, on a 100-node, 4000-thread Intel Haswell cluster. The experiments show that “lgs” can reach higher model likelihood with shorter or similar execution time compared with Yahoo! LDA, while “rtt” can run up to 3.9 times faster compared with Petuum LDA when achieving similar model likelihood.

Keywords: Latent Dirichlet Allocation; parallel algorithm; Big Model; Communication Model; Communication Optimization

CONSTRUCTING AN EXACT PARITY BASE IS IN RNC^2

Parallel Processing Letters, 1992, vol. 2, no. 4, pp. 301-309

Authors: G. GALBIATI, F. MAFFIOLI (Dipartimento di Informatica e Sistemistica, Università di Pavia, Italy; Dipartimento di Elettronica e Informazione, Politecnico di Milano, Italy)

In this work we address the parallel complexity of two combinatorial problems, specifically the problems of the existence and of the construction of a parity base of preassigned weight (exact parity base for short) in a 0-1 weighted, represented matroid, subject to parity conditions. We prove that these problems lie in the parallel complexity class RNC^2, i.e. they are solvable with one-sided error by a logspace uniform family of bounded fan-in circuits of polynomial size and quadratic logarithmic depth which receive, in addition to the problem input, a polynomial number of random input bits. We also show that the more general cases of these problems, defined over matroids weighted with integral instead of 0-1 weights, also belong to RNC^2, as long as the weights are given in unary notation. As a consequence some special cases of these problems, which are of independent interest, belong to the same parallel complexity class: examples of these are the problem of the construction of a perfect matching of preassigned weight in a 0-1 weighted graph, recently addressed in [1], or that of the construction of a base of preassigned weight, in the intersection of two 0-1 weighted represented matroids.

Keywords: parallel algorithm; parallel complexity classes; exact combinatorial problems; matching and matroid theory

PREFIX-SUMS ALGORITHMS ON RECONFIGURABLE MESHES

Parallel Processing Letters, 1995, vol. 5, no. 1, pp. 23-35

Authors: KOJI NAKANO (Advanced Research Laboratory, Hitachi Ltd., Hatoyama, Saitama 350-03, Japan)

This paper shows that the prefix-sums of n binary values can be computed in time on an n × m reconfigurable mesh of the word model. It also shows that prefix-sums of n binary values can be computed in time on an n × m reconfigurable mesh of the word model if the reconfigurable mesh has communication capability that allows simultaneous sending to the same bus.
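For readers unfamiliar with the problem: the prefix-sums of binary values b_1, ..., b_n are the running counts s_i = b_1 + ... + b_i. The following is a minimal Python sketch of the generic doubling (Hillis-Steele) scan, with the parallel steps simulated sequentially. It is not the reconfigurable-mesh algorithm of the paper, and the function name `prefix_sums` is a hypothetical choice for illustration.

```python
def prefix_sums(bits):
    """Inclusive prefix sums via the doubling (Hillis-Steele) scan.

    Each pass of the while loop stands for one data-parallel step;
    here the steps are simulated sequentially on an ordinary list.
    """
    sums = list(bits)
    n = len(sums)
    step = 1
    while step < n:
        # Every position i >= step adds the value `step` places to its left.
        sums = [sums[i] + (sums[i - step] if i >= step else 0) for i in range(n)]
        step *= 2
    return sums


if __name__ == "__main__":
    print(prefix_sums([1, 0, 1, 1, 0, 1]))  # prints [1, 1, 2, 3, 3, 4]
```

The loop runs about log2(n) passes, which is the step count a machine with one processor per element would need for this generic scheme.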

Keywords: Reconfigurable mesh; parallel algorithm; prefix-sums algorithm; binary values

AN EFFICIENT ALGORITHM FOR SUMMING-UP BINARY VALUES ON A RECONFIGURABLE MESH

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1994, vol. E77A, no. 4, pp. 652-657

Authors: NAKANO, K (Hitachi Ltd, Saitama-ken, Japan)

This paper presents an algorithm which sums up n binary values on an n x m reconfigurable mesh in O(log n/square-root m log m) time. This algorithm also yields a corollary which states that n binary values can be summ...

Keywords: RECONFIGURABLE MESH; parallel algorithm; SUMMING algorithm

OPTIMAL MULTISELECTION IN HYPERCUBES (This work was partially supported by ARC Grant under its Large Research Grant (1996-98) A849602031.)

Parallel Algorithms and Applications, 2000, vol. 14, no. 3, pp. 203-212

Authors: Hong Shen (School of Computing and Information Technology, Griffith University, Nathan, Australia)

We study efficient parallel solutions to the problem of selecting r elements at specified ranks from a set of n arbitrary elements, known as multiselection, in a hypercube with p<n processors. We propose two parallel algorithms based on different approaches, where one requires processors to operate in the SIMD mode, and the other in the MIMD mode. Our SIMD algorithm runs in O(n^ϵ min{r, log p}) time when p = n^(1−r) for any 0<ϵ<1, which is cost-optimal when r ≥ p. With the same number of processors, our MIMD algorithm runs in O(n^ϵ log r) time and is cost-optimal for any value of r. Both algorithms are more efficient than straightforward solutions and that of direct simulation of the optimal EREW algorithm.
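To make the multiselection problem concrete, here is a minimal sequential Python sketch: it returns the elements at several requested ranks by sharing one quickselect-style partition across all ranks, so no full sort is needed. It is not the hypercube SIMD or MIMD algorithm from the paper; the names `multiselect` and `_solve` and the 1-based rank convention are assumptions made for illustration.

```python
import random


def multiselect(items, ranks):
    """Return {rank: element} for each 1-based rank in `ranks`.

    Ranks outside 1..len(items) are silently omitted from the result.
    """
    result = {}
    _solve(list(items), sorted(set(ranks)), 0, result)
    return result


def _solve(items, ranks, offset, result):
    # `ranks` are global ranks; `offset` elements precede `items` in sorted order.
    if not ranks or not items:
        return
    pivot = random.choice(items)
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    lo = offset + len(less)   # ranks <= lo fall in `less`
    hi = lo + len(equal)      # ranks in (lo, hi] are the pivot itself
    for r in ranks:
        if lo < r <= hi:
            result[r] = pivot
    _solve(less, [r for r in ranks if r <= lo], offset, result)
    _solve(greater, [r for r in ranks if r > hi], hi, result)


if __name__ == "__main__":
    print(multiselect([7, 2, 9, 4, 1], [1, 3, 5]))
```

Each recursive call only descends into a part that still contains a requested rank, which is what distinguishes multiselection from running a separate selection per rank.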

Keywords: Hypercube; Multiselection; parallel algorithm; Selection; Sorting

A Study on Sequence Generation Powers of Small Cellular Automata

SICE Journal of Control, Measurement, and System Integration, 2012, vol. 5, no. 4, pp. 191-199

Authors: Naoki Kamikawa, Hiroshi Umeo (Media Communication Center, Osaka Electro-Communication University; Faculty of Information and Communication Engineering, Osaka Electro-Communication University)

A model of cellular automata (CA) is considered to be a well-studied non-linear model of complex systems in which an infinite one-dimensional array of finite state machines (cells) updates itself in a synchronous manner according to a uniform local rule. A sequence generation problem on CAs has been studied, and many scholars have proposed real-time sequence generation algorithms for a variety of non-regular sequences such as the prime, Fibonacci, and {2^n | n = 1, 2, 3, ...} sequences. The paper describes the sequence generation powers of CAs having a small number of states, focusing on CAs with one, two, and three internal states, respectively. The authors enumerate all of the sequences generated by two-state CAs and present several non-regular sequences that can be generated in real time by three-state CAs, but not by any two-state CA. It is shown that there exists a sequence generation gap among the powers of those small CAs.

Keywords: cellular automata; real-time sequence generation problem; parallel algorithm; computational complexity

A Multithreaded Algorithm for the Computation of Sample Entropy

Algorithms, 2023, vol. 16, no. 6, p. 299

Authors: Manis, George; Bakalis, Dimitrios; Sassi, Roberto (Univ Ioannina, Sch Engn, Dept Elect & Comp Engn, Ioannina 45110, Greece; Univ Milan, Dipartimento Informat, I-20133 Milan, Italy)

Many popular entropy definitions for signals, including approximate and sample entropy, are based on the idea of embedding the time series into an m-dimensional space, aiming to detect complex, deeper and more informative relationships among samples. However, for both approximate and sample entropy, the high computational cost is a severe limitation. Especially when large amounts of data are processed, or when parameter tuning is employed premising a large number of executions, the necessity of fast computation algorithms becomes urgent. In the past, our research team proposed fast algorithms for sample, approximate and bubble entropy. In the general case, the bucket-assisted algorithm was the one presenting the lowest execution times. In this paper, we exploit the opportunities given by the multithreading technology to further reduce the computation time. Without special requirements in hardware, since today even our cost-effective home computers support multithreading, the computation of entropy definitions can be significantly accelerated. The aim of this paper is threefold: (a) to extend the bucket-assisted algorithm for multithreaded processors, (b) to present updated execution times for the bucket-assisted algorithm since the achievements in hardware and compiler technology affect both execution times and gain, and (c) to provide a Python library which wraps fast C implementations capable of running in parallel on multithreaded processors.
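As background for the quantity being accelerated, the sketch below computes sample entropy directly from its usual definition, SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the same for length m+1. This is a plain quadratic-time Python illustration, not the bucket-assisted or multithreaded algorithm of the paper, and it does not use the authors' library. The function name, the use of `<=` for the tolerance test, and returning infinity when no matches exist are assumptions.

```python
import math


def sample_entropy(series, m, r):
    """Naive O(N^2) sample entropy of `series` for embedding dimension m
    and tolerance r. Self-matches are excluded (only pairs i < j count)."""
    n = len(series)

    def count_matches(length):
        count = 0
        # The first n - m start positions are used for both template lengths,
        # so that A and B are counted over the same set of template pairs.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= r
                       for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # assumption: report "undefined" as infinity
    return -math.log(a / b)


if __name__ == "__main__":
    # A perfectly regular signal: every length-m match extends to m + 1.
    print(sample_entropy([0.0, 1.0] * 10, m=2, r=0.5))
```

The doubly nested loop over template pairs is the cost that fast algorithms aim to reduce; it is also naturally divisible among threads, since each pair is tested independently.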

Keywords: entropy; sample entropy; fast algorithm; parallel algorithm; bucket-assisted algorithm

A Study of Parallel Algorithms for Dynamic Lot Sizing

Journal of Industrial and Production Engineering, 1999, vol. 16, no. 2, pp. 173-182

Authors: Jr-Jung Lyu, Ming-Chang Lee (Department of Industrial Management, National Cheng Kung University, China)

Because lot-sizing decisions greatly affect the efficiency of a production system, lot sizing has long played an important role in the MRP framework. Although considerable research exists in this area, the use of most optimal lot-sizing algorithms is hindered by the huge amount of computer resources required to solve the models, even for a modest problem. Since powerful parallel computers are becoming cost-effective nowadays, it is necessary to explore parallel algorithms that can be used to solve these laborious computational problems. This paper presents two parallel algorithms for solving the dynamic lot sizing problem using the cost path concept. Given that n is the size of the problem, it is shown that the first proposed parallel algorithm is O(n^2) with n processors and the second proposed parallel algorithm is O(n^3/p + np^2) with p processors (p<... parallel algorithms and some future research directions are also provided.
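For orientation, the dynamic lot-sizing problem asks in which periods to place orders so that known per-period demand is met at minimum total setup plus holding cost. The sketch below is the classic sequential dynamic program in the style of Wagner and Whitin, written in Python. It is not either of the paper's parallel algorithms and does not use the cost path concept. The function name, the single fixed `setup_cost`, and the per-unit, per-period `holding_cost` are simplifying assumptions.

```python
def lot_sizing(demand, setup_cost, holding_cost):
    """Minimum total cost to meet `demand` (one entry per period).

    best[t] is the minimum cost of covering periods 0..t-1. Assumption:
    every batch pays `setup_cost`, and a unit ordered in period j for use
    in period k is held for k - j periods at `holding_cost` per period.
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n
    for j in range(n):
        # An order placed in period j covers periods j..t-1.
        holding = 0.0
        for t in range(j + 1, n + 1):
            holding += holding_cost * (t - 1 - j) * demand[t - 1]
            best[t] = min(best[t], best[j] + setup_cost + holding)
    return best[n]


if __name__ == "__main__":
    # Order in period 0 for periods 0-1, then again in period 2.
    print(lot_sizing([10, 20, 30], setup_cost=50, holding_cost=1))  # prints 120.0
```

The two nested loops give quadratic sequential work for problem size n, which is the baseline against which parallel running times with n or p processors are usually compared.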

Keywords: material requirements planning; parallel algorithm; complexity of parallel computation; MRP; parallel algorithm; complexity

MODELLING A MORPHOLOGICAL THINNING ALGORITHM FOR SHARED MEMORY SIMD COMPUTERS

Parallel Processing Letters, 1991, vol. 1, no. 1, pp. 59-65

Authors: ABHIJIT DATTA, SHIRISH V. JOSHI, RABI N. MAHAPATRA (Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur 721302, India)

This letter presents the modelling of a morphological thinning algorithm suggested by Jang and Chin [1] on the four models of shared memory SIMD computers. The time and cost complexity analyses for the models have been given. The performance of this algorithm on SIMD computers has been compared with the performance of a conventional thinning algorithm [2] proposed recently.

Keywords: Shared memory computers; mathematical morphology; thinning; parallel algorithm; pixels
