Search Results - Inner Mongolia University Library

Parallel Multi-objective Memetic Algorithm for Competitive Facility Location

10th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Lancinskas, Algirdas; Zilinskas, Julius (Vilnius Univ, Inst Math & Informat, LT-08663 Vilnius, Lithuania)

ISBN: (digital) 9783642551956

ISBN: (print) 9783642551956

A hybrid genetic algorithm for global multi-objective optimization is parallelized and applied to solve competitive facility location problems. The impact of local search on the performance of the parallel algorithm is investigated. An asynchronous version of the parallel genetic algorithm with local search is proposed and evaluated on a competitive facility location problem, using a hybrid distributed- and shared-memory parallel programming model on a high-performance computing system.

Keywords: Facility location; Multi-objective optimization; Memetic algorithms
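
The abstract of this record does not describe the algorithm's internals, so the following is only a minimal sketch of the general idea behind multi-objective optimization: comparing candidate solutions by Pareto dominance and keeping the non-dominated ones. The function names and the sample objective vectors are hypothetical, and both objectives are assumed to be minimized; this is not the authors' memetic algorithm.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical objective vectors, invented for illustration only.
    candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.5)]
    print(non_dominated(candidates))
```

In a genetic or memetic algorithm, a filter of this kind would typically be applied to the population after each generation; the quadratic scan here is chosen for readability, not speed.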

Algorithms and Architectures for Parallel Processing: 13th International Conference, ICA3PP 2013, Vietri sul Mare, Italy, December 18-20, 2013, Proceedings, Part I

2013

Authors: Joanna Kołodziej; Beniamino Di Martino; Domenico Talia

Performance-Aware Data Placement in Hybrid Parallel File Systems

14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: He, Shuibing; Sun, Xian-He; Feng, Bo; Feng, Kun (Wuhan Univ, Sch Comp, Wuhan 430072, Hubei, Peoples R China; Illinois Inst Technol, Dept Comp Sci, Chicago, IL, USA)

ISBN: (print) 9783319111971; 9783319111964

Hybrid parallel file systems (PFS), which consist of both HDD and SSD servers, provide a promising solution for data-intensive applications. In this study, we propose a performance-aware data placement (PADP) strategy to enable efficient data layout in hybrid PFSs. The basic idea of PADP is to distribute data across file servers using adaptive, varied-size file stripes based on each server's storage performance. Using a data access cost model and a linear programming optimization method, the appropriate stripe size for each file server is determined. We have implemented PADP within OrangeFS, a parallel file system widely used in the HPC domain. Experimental results on a representative benchmark show that PADP can significantly improve the I/O performance of hybrid PFSs.

Keywords: Parallel I/O system; Parallel file system; Solid state drive
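
To make the idea of varied-size stripes concrete, here is a minimal sketch that sizes each server's stripe in proportion to its bandwidth, so that all servers finish one striping round in about the same time. This proportional rule is a deliberate simplification and an assumption of this example; the abstract states that PADP itself uses a data access cost model and linear programming, neither of which is reproduced here. Server names and bandwidth figures are hypothetical.

```python
from typing import Dict


def proportional_stripes(bandwidth_mb_s: Dict[str, float], round_size_kb: int) -> Dict[str, int]:
    """Split one striping round across servers in proportion to bandwidth.

    Simplified stand-in for a cost-model-based placement: a server that is
    three times faster receives a stripe three times larger.
    """
    total = sum(bandwidth_mb_s.values())
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    return {
        server: int(round_size_kb * bw / total)
        for server, bw in bandwidth_mb_s.items()
    }


if __name__ == "__main__":
    # Hypothetical servers: two HDD servers and one faster SSD server.
    servers = {"hdd0": 100.0, "hdd1": 100.0, "ssd0": 300.0}
    print(proportional_stripes(servers, round_size_kb=1000))
```

With equal-size stripes, the HDD servers in this example would take three times as long as the SSD server on every round; proportional sizing removes that imbalance under the stated assumption that bandwidth alone determines access time.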

A Quantum Algorithm Processor Architecture Based on Register Reordering

22nd IFIP WG 10.5/IEEE International Conference on Very Large Scale Integration (VLSI-SoC)

Authors: Nakanishi, Masaki; Matsuyama, Miki; Yokoo, Yumi (Yamagata Univ, Fac Educ Art & Sci, Yamagata 9908560, Japan; Kirayaka Bank Ltd, Yamagata, Japan)

ISBN: (print) 9781479960163

Quantum computer simulators play an important role in the evaluation of quantum algorithms. Quantum computation can be regarded as parallel computation in some sense, and it is therefore well suited to a simulator implemented in hardware that can process many operations in parallel. In this paper, we propose a processor architecture dedicated to simulating quantum algorithms. The proposed architecture is based on the register reordering method, which shifts and swaps registers containing probability amplitudes so that the probability amplitudes of target basis states can be quickly selected. This reduces the number of large multiplexers and improves clock frequency. We implemented the processor on an FPGA. Experimental results show that the proposed processor scales with the number of quantum bits and can simulate quantum algorithms faster than software simulators.

Keywords: field programmable gate arrays; microprocessor chips; parallel architectures; quantum computing; FPGA; clock frequency; large multiplexers; parallel computation; probability amplitudes; quantum algorithm processor architecture; quantum bits; quantum computation; quantum computer simulators; register reordering method; software simulators; target basis states; Arrays; Computers; Logic gates; Registers; Vectors; QUBITS; parallel processing (COMPUTERS); Processor architectures; simulator; quanta
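
The abstract refers to selecting the probability amplitudes of target basis states. As a software illustration of why such selection is needed, the sketch below applies a single-qubit gate to a state vector: each update touches a pair of amplitudes whose basis-state indices differ only in the target qubit's bit. This is a plain state-vector simulation written for this listing, not the register reordering hardware described in the paper; the function name and the bit-ordering convention (qubit 0 is the least significant bit) are assumptions.

```python
import math
from typing import List


def apply_single_qubit_gate(
    state: List[complex], gate: List[List[complex]], target: int
) -> List[complex]:
    """Apply a 2x2 gate to qubit `target` of a state vector."""
    new_state = list(state)
    mask = 1 << target
    for i in range(len(state)):
        if (i & mask) == 0:
            j = i | mask  # partner index: same bits, target bit set to 1
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


if __name__ == "__main__":
    h = 1.0 / math.sqrt(2.0)
    hadamard = [[h, h], [h, -h]]
    state = [1 + 0j, 0j, 0j, 0j]  # two qubits, both initially 0
    state = apply_single_qubit_gate(state, hadamard, target=0)
    print([round(abs(a) ** 2, 3) for a in state])  # measurement probabilities
```

Every pair in the loop is independent of the others, which is the parallelism the abstract alludes to; a hardware design has to route the two partner amplitudes to the same arithmetic unit, and the stride between partners changes with the target qubit.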

Algorithms and Architectures for Parallel Processing: 13th International Conference, ICA3PP 2013, Vietri sul Mare, Italy, December 18-20, 2013, Proceedings, Part II

2013

Authors: Rocco Aversa; Joanna Kołodziej; Jun Zhang

A power modelling approach for many-core architectures

10th International Conference on Semantics, Knowledge and Grids, SKG 2014

Authors: Lai, Zhiquan; Lam, King Tin; Wang, Cho-Li; Su, Jinshu (Changsha, China; College of Computer, National University of Defense Technology, Changsha, China; Department of Computer Science, University of Hong Kong, Hong Kong)

ISBN: (print) 9781479967155

Many-core architectures play an important role in HPC systems, but they deliver high performance at the cost of high electrical power consumption. On the Tianhe-2 supercomputer, the Xeon Phi many-core processors contribute nearly 80% of the system power. Power models are important for guiding the design of dynamic power management (DPM) algorithms, because they predict power consumption with respect to power states and program execution patterns. However, the complexity of many-core hardware design makes power modelling challenging. These concerns lead us to propose a power modelling approach for many-core architectures based on performance monitoring counters (PMC). The key insight comes from a large number of micro-benchmarks run on a real many-core platform, from which we identify some essential rules that determine the chip power. Following this approach, we develop an accurate chip power model for the Intel SCC many-core chip. Experimental comparison shows that our model is much more accurate than others. © 2014 IEEE.

Keywords: Supercomputers
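
The abstract gives no formula for the power model, so the following is only a sketch of one common way a counter-based model can be fitted: ordinary least squares relating measured power to a single counter rate. The linear form, the function name, and the data points are all assumptions made for illustration; the numbers are synthetic and are not measurements from the paper.

```python
from typing import Sequence, Tuple


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Synthetic samples: counter events per microsecond vs. power in watts.
    counter_rate = [0.0, 1.0, 2.0, 3.0]
    power_watts = [20.0, 25.0, 30.0, 35.0]
    idle_power, watts_per_unit = fit_linear(counter_rate, power_watts)
    print(idle_power, watts_per_unit)
```

In this reading, the intercept plays the role of static (idle) power and the slope the dynamic cost per unit of counted activity; a realistic model would combine several counters and power states.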

Parallel super-resolution reconstruction based on neighbor embedding technique

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, vol. 9156, pp. 134-143

Authors: Moustafa, Marwa; Ebied, Hala M.; Helmy, Ashraf; Nazamy, Taymoor M.; Tolba, Mohamed F. (Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt; Data Reception Analysis and Receiving Station Affairs, National Authority for Remote Sensing and Space Science, Cairo, Egypt)

Super resolution (SR) is a technique for recovering a high-resolution (HR) image from several noisy low-resolution (LR) images. The high-frequency components missing from the LR images should be restored correctly in the HR image. Because satellite images are very large, parallel algorithms can produce accurate results more quickly. This paper proposes an accelerated parallel GPU implementation of an example-based super-resolution algorithm, Neighbor Embedding (NE). In the training phase, NE builds its dictionary from patches obtained from a single image. Euclidean distances are used to obtain the optimal weights for constructing the high-resolution image. NVIDIA's Compute Unified Device Architecture (CUDA) is used to implement the proposed parallel NE. Experiments were carried out on a synthetic test image and a satellite test image, and the proposed GPU implementation was benchmarked against the serial implementation. The experimental results show that the speedup depends on the image size: relative to the serial CPU implementation, the GPU implementation ranged from 20× faster for small images to more than 30× faster for large images. © Springer International Publishing Switzerland 2015.

Keywords: Graphics processing unit

Low complexity and efficient architecture of 1D-DCT based Cordic-Loeffler for wireless endoscopy capsule

IEEE SSD International Multi-Conference on Systems, Signals and Devices

Authors: N. Jarray; M. Elhaji; A. Zitouni (Faculty of Sciences of Monastir, Electronic and Micro-Electronic Laboratory, Tunisia; Monastir University, Tunisia; University of Dammam, Saudi Arabia)

Because of the power and size constraints of wireless capsule endoscopy, the principal challenge is to reduce area and power consumption while preserving acceptable image reconstruction and coding quality. In this paper, we present a low-complexity and efficient architecture for the 1D-DCT based on the Cordic-Loeffler technique for wireless capsule endoscopy. Our improvement over the original algorithm is made in the CORDIC part, where it reduces the number of addition operations from 18 to 10. As a result, the number of additions in the main algorithm is reduced from 38 to 30. To further improve our results, we used a Modified Carry Look Ahead adder (MCLA) and a Carry Save Adder (CSA), which are characterized by low power and high speed compared to the classical Carry Look Ahead adder (CLA). Our aim is to provide an architecture optimized in terms of area and power consumption. The proposed design has been implemented on an FPGA. Compared to other architectures, the proposed architecture reduces not only the computational complexity but also the area and the power consumption. The proposed DCT architecture is well suited to low-power, high-quality codecs, especially in battery-powered systems.

Keywords: Discrete cosine transforms; Computer architecture; Adders; Signal processing algorithms; Wireless communication; Endoscopes; Complexity theory
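
For readers unfamiliar with CORDIC, the sketch below shows the textbook rotation-mode iteration that replaces a multiplier-based rotation with a sequence of add and scale-by-power-of-two steps. It is written in floating point for clarity (in hardware the scaling is a bit shift) and is the generic algorithm, not the reduced-adder variant proposed in the paper. The function name, the iteration count, and the example angle are assumptions chosen for illustration.

```python
import math
from typing import Tuple


def cordic_cos_sin(angle: float, iterations: int = 16) -> Tuple[float, float]:
    """Approximate (cos(angle), sin(angle)) for |angle| <= pi/2 with CORDIC."""
    # Pre-compute the constant gain so the result has unit length.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0   # rotate toward the remaining angle
        step = 2.0 ** -i                # a right shift by i bits in hardware
        x, y = x - d * y * step, y + d * x * step
        z -= d * math.atan(step)        # table lookup in hardware
    return x, y


if __name__ == "__main__":
    theta = 3.0 * math.pi / 16.0  # example angle, chosen for illustration
    c, s = cordic_cos_sin(theta)
    print(c, math.cos(theta))
    print(s, math.sin(theta))
```

Each iteration costs two additions for the vector update plus one for the angle accumulator, which is why reducing the number of additions, as the abstract reports, translates directly into area and power savings.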

Parallel Graph Coloring Algorithms on the GPU Using OpenCL

8th International Conference on Computing for Sustainable Global Development (INDIACom)

Author: Sengupta, Shilpi (JSSATE, Dept Comp Sci & Engn, Noida, India)

ISBN: (print) 9789380544120

GPUs (Graphics Processing Units) are designed to solve large data-parallel problems encountered in image processing, scene rendering, video playback, and gaming. GPUs are therefore designed to handle a higher degree of parallelism than conventional CPUs. GPGPU (General-Purpose computing on Graphics Processing Units) enables users to do parallel computing on the graphics hardware commonly available in current personal computers. Today's systems ship with multi-core GPUs that provide the necessary hardware infrastructure, enabling high-performance computing on personal computers. NVIDIA's CUDA (Compute Unified Device Architecture) and the industry-standard OpenCL (Open Computing Language) provide the software platform required to use the graphics hardware to solve computational problems with parallel algorithms, problems otherwise solvable mostly in supercomputing environments. This paper presents two parallel CREW (Concurrent Read Exclusive Write) PRAM algorithms for optimal coloring of general graphs on stream processing architectures such as the GPU. The algorithms are implemented using OpenCL. The first algorithm presents techniques for computing vertex independent sets on the GPU and then assigns colors to them. The second algorithm focuses on optimizing the vertex independent set computation for edge-transitive graphs by taking advantage of the structure of such graphs, and then assigns a color to each of the normalized independent sets.

Keywords: graph; GPU; OpenCL; vertex color; VIS
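
The link between vertex independent sets and coloring can be sketched sequentially: repeatedly extract a maximal independent set from the uncolored vertices and give the whole set one color. The code below is a greedy, single-threaded illustration written for this listing; it is not the paper's CREW PRAM or OpenCL implementation, and a greedy pass of this kind does not guarantee the minimum number of colors. The function name and the example graph are hypothetical.

```python
from typing import Dict, Set


def color_by_independent_sets(adj: Dict[int, Set[int]]) -> Dict[int, int]:
    """Color a graph by peeling off one maximal independent set per color."""
    uncolored = set(adj)
    colors: Dict[int, int] = {}
    color = 0
    while uncolored:
        independent: Set[int] = set()
        for v in sorted(uncolored):
            # v joins the set only if none of its neighbours is already in it.
            if adj[v].isdisjoint(independent):
                independent.add(v)
        for v in independent:
            colors[v] = color
        uncolored -= independent
        color += 1
    return colors


if __name__ == "__main__":
    # Hypothetical example: a cycle on five vertices.
    cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
    print(color_by_independent_sets(cycle5))
```

Because vertices inside one independent set share no edges, they can all be assigned their color at the same time, which is what makes this formulation attractive for data-parallel hardware.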

Improving Speculation Accuracy with Inter-thread Fetching Value Prediction

14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Xu, Fan; Shen, Li; Wang, Zhiying; Guo, Hui; Su, Bo; Chen, Wei (Natl Univ Def Technol, Coll Comp, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China)

ISBN: (digital) 9783319111940

ISBN: (print) 9783319111940; 9783319111933

Conventional software speculative parallel models face challenges due to the increasing number of processor cores and the diversification of applications. Speculation accuracy is one of the key factors in the performance of a software speculative parallel model. In this paper, we propose a novel value prediction mechanism named Inter-thread Fetching Value Prediction (IFVP). It allows a speculative thread to speculatively read the values of conflicting variables from another speculative thread. This method can remarkably reduce the misspeculation rate in a parallelized loop with cross-iteration dependencies. We show that IFVP improves speculation accuracy by about 19.1% on average and performance by about 37.1% on average, compared with conventional models without value prediction.

Keywords: computer architecture; thread level speculation; parallel computing
