Search Results - Inner Mongolia University Library

Proceedings - International Conference on Quality Software, 2010, p. xi

Authors: Wang, Ji; Chan, Wing Kwong; Kuo, Fei-Ching. Affiliations: National Laboratory for Parallel and Distributed Processing, Changsha, China; City University of Hong Kong, Hong Kong; Swinburne University of Technology, Australia

A GPU-based Fast Solution for Riesz Space Fractional Reaction-Diffusion Equation

International Conference on Network-Based Information Systems (NBIS)

Authors: Qinglin Wang; Jie Liu; Chunye Gong; Yang Zhang; Zuocheng Xing. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China; College of Aerospace Science and Engineering, National University of Defense Technology, Changsha, China; Science and Technology on Space Physics Laboratory, Beijing, China

Fast numerical solutions of the Riesz fractional equation have a computational cost of O(NM log M), where M and N are the numbers of grid points and time steps, respectively. In this paper, we present a GPU-based fast solution for the Riesz space fractional equation. The solution builds on the FFT-based fast method, is implemented with the CUDA programming model, and consists of parallel FFT, vector-vector addition, and vector-vector multiplication on the GPU. The experimental results show that the GPU-based fast solution agrees well with the exact solution. Compared with the known parallel fast solution on an 8-core Intel E5-2670 CPU, the overall speedup reaches 2.12 times on an NVIDIA GTX650 GPU and 10.93 times on an NVIDIA K20C GPU.

Keywords: Graphics processing units; Instruction sets; Mathematical model; Arrays; Yttrium; Approximation methods; Parallel processing
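
The O(M log M) cost per time step in the abstract above typically comes from replacing a dense matrix-vector product with FFTs. The following is a minimal CPU-side sketch of that idea, not the paper's CUDA implementation: it assumes the discretized fractional operator is a Toeplitz matrix (common for such schemes, but not stated in the abstract) and multiplies it by a vector through circulant embedding. The names `fft`, `ifft`, and `toeplitz_matvec` are hypothetical helpers written for this illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def ifft(x):
    """Inverse FFT: unnormalized inverse transform divided by the length."""
    n = len(x)
    return [value / n for value in fft(x, inverse=True)]


def toeplitz_matvec(col, row, v):
    """Multiply the Toeplitz matrix with first column `col` and first row `row` by `v`.

    Assumption for illustration: the matrix is embedded in a circulant matrix
    whose size is a power of two, so the product becomes
    FFT -> element-wise (vector-vector) multiplication -> inverse FFT.
    """
    m = len(v)
    size = 1
    while size < 2 * m:
        size *= 2
    # First column of the circulant: col, zero padding, then row[1:] reversed.
    c = list(col) + [0.0] * (size - 2 * m + 1) + list(reversed(row[1:]))
    v_pad = list(v) + [0.0] * (size - m)
    spectrum = [a * b for a, b in zip(fft(c), fft(v_pad))]
    return [y.real for y in ifft(spectrum)[:m]]


# Usage: a 3x3 Toeplitz matrix [[4, 3, 5], [1, 4, 3], [2, 1, 4]] times [1, 2, 3].
result = toeplitz_matvec([4, 1, 2], [4, 3, 5], [1, 2, 3])
```

On a GPU, the three stages map naturally onto a parallel FFT and element-wise vector kernels, which matches the building blocks the abstract lists.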

An incremental learning algorithm for supervised neural network with contour preserving classification

International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, ECTI-CON

Authors: Piyabute Fuangkhon; Thitipong Tanprasert. Affiliation: Parallel and Distributed Computing Research Laboratory, Faculty of Science and Technology, Assumption University, Bangkok, Thailand

This paper presents an alternative algorithm for integrating the existing knowledge of a supervised learning neural network with new training data. The algorithm allows the existing knowledge to age out at a slow rate as the neural network is gradually retrained with consecutive sets of new samples, resembling the change of application locality under a consistent environment. The algorithm also uses the contour preserving classification algorithm to increase classification accuracy. The experiment is performed on a 2-dimensional partition problem, and the results confirm the effectiveness of the algorithm.

Keywords: Neural networks; Support vector machines; Support vector machine classification; Supervised learning; Partitioning algorithms; Neurons; Classification algorithms; Speech recognition; Data mining; Distributed computing

An efficient local address generation for the block-cyclic distribution

3rd International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 1997

Authors: Kwon, Oh-Young; Kim, Tae-Geun; Han, Tack-Don; Yang, Sung-Bong; Kim, Shin-Dug. Affiliations: Distributed Computing Lab., Systems Engineering Research Institute, 1 Ueun-dong, Yusong-gu, Taejon 305-333, Republic of Korea; Parallel Processing System Laboratory, Dept. of Computer Science, Yonsei University, Seoul 120-749, Republic of Korea

ISBN: (print) 0780342291

To generate local addresses for an array section A(l:h:s) with block-cyclic distribution, an efficient compiling method is required. In this paper, two local address generation methods for the block-cyclic distribution are presented. One is a simple local address generation method modified from the virtual-block scheme. The other is a linear-time ΔM table construction method. The array elements of A(l:h:s) to be accessed at run-time form a family of lines. By using the equations of these lines, a ΔM table can be generated in O(k) time. Experimental results show that the simple local address generation method performs poorly, but the linear-time ΔM table generation method is faster than other algorithms in both ΔM table generation time and access time for 10,000 array elements. © 1997 IEEE.

Keywords: Virtual addresses
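
To make the terms in the abstract above concrete, here is a small sketch of what "local addresses" and a ΔM table mean for a block-cyclic distribution. It is a brute-force enumeration for illustration only, not the paper's O(k) line-equation method. It assumes 0-based global indices, block size `k`, and `p` processors; the function names are hypothetical.

```python
def owner_and_local(g, k, p):
    """Map global index g to (owning processor, local address) under block-cyclic(k) over p processors."""
    block = g // k                      # which block of size k holds g
    owner = block % p                   # blocks are dealt out round-robin
    local = (block // p) * k + g % k    # local block number times k, plus offset in block
    return owner, local


def local_addresses(l, h, s, k, p, proc):
    """Local addresses on processor `proc` of the section elements l, l+s, ..., up to h."""
    addresses = []
    for g in range(l, h + 1, s):
        owner, local = owner_and_local(g, k, p)
        if owner == proc:
            addresses.append(local)
    return addresses


def delta_m(addresses):
    """Gaps between consecutive local addresses, i.e. the memory strides a ΔM table stores."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


# Usage: section A(0:20:3), block size 4, 2 processors, viewed from processor 0.
addrs = local_addresses(0, 20, 3, 4, 2, 0)   # [0, 3, 5, 10]
gaps = delta_m(addrs)                        # [3, 2, 5]
```

The enumeration visits every element of the section, which is what a table-based method avoids: once the gap pattern is known, a processor can step through its local memory using the stored strides alone.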

A chronological history-based execution time estimation model for embarrassingly parallel applications on grids

3rd International Symposium on Parallel and Distributed Processing and Applications, ISPA 2005

Authors: Yang, Chao-Tung; Shih, Po-Chi; Lin, Cheng-Fang; Hsu, Ching-Hsien; Li, Kuan-Ching. Affiliations: High Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiwan; Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu 300, Taiwan; Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan

ISBN: (print) 3540297693

To identify and schedule jobs that are suitable for given resources, an execution time estimation model is required. In this paper, we describe a chronological history-based execution time estimation model that predicts the current execution time from previous execution results. We built a heterogeneous computational Grid environment using the Globus Toolkit. Our research focuses on Grid computing environments, where we execute parallel jobs on multiple resources to measure the model's accuracy. The experimental results show that our model can accurately predict the execution time of embarrassingly parallel applications. © Springer-Verlag Berlin Heidelberg 2005.

Keywords: Distributed computer systems

Performance analysis of applying replica selection technology for Data Grid environments

8th International Conference on Parallel Computing Technologies, PaCT 2005

Authors: Yang, Chao-Tung; Chen, Chun-Hsiang; Li, Kuan-Ching; Hsu, Ching-Hsien. Affiliations: High-Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiwan; Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan; Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu 300, Taiwan

The Data Grid enables the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for solving large-scale, data-intensive scientific applications. Such technology efficiently manages and transfers terabytes or even petabytes of data for data-intensive, high-performance computing applications in wide-area, distributed computing environments. The replica selection process allows an application to choose a replica from a replica catalog, based on its performance and data access features. In this paper, we build a Grid environment based on three existing PC Cluster environments and analyze the performance of data transfers using the GridFTP protocol over these systems. In addition, based on the experimental results, we propose a cost model to pick the best replica in real and dynamic network situations. © Springer-Verlag Berlin Heidelberg 2005.

Keywords: Distributed computer systems

A cooperative CSMA/CA MAC for channel-hopping rendezvous based cognitive radio networks

International Conference on Communications and Networking in China (CHINACOM)

Authors: Quan Liu; Gang Hu; Xiaodong Wang; Xingming Zhou. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, Hunan, China; Department of Network Engineering, National University of Defense Technology, Changsha, Hunan, China

ISBN: (print) 9781479959716

The rendezvous problem is known as the fundamental issue in MAC design for cognitive radio networks (CRNs). With the concept of blind rendezvous, numerous channel-hopping sequence (CHS) based rendezvous schemes have been proposed to solve this problem. However, little attention has been paid to the design of a CSMA/CA MAC based on these rendezvous schemes. In this paper, we propose a cooperative CSMA/CA MAC (named CoCH-CSMA/CA MAC), which tailors the 802.11 distributed coordination function (DCF) to the slotted operation of existing CHS-based rendezvous schemes. More importantly, we identify the rendezvous de-synchronization problem in MAC design based on these rendezvous schemes. To alleviate its impact on MAC performance, a cooperative control feedback scheme employing correlation-based signal detection is proposed to help secondary users avoid backoff misbehavior and improve networking performance (i.e., packet delivery delay and network throughput). Extensive simulations are conducted to demonstrate the effectiveness of our MAC design.

Keywords: Delays; Multiaccess communication; Correlation; Radiation detectors; IEEE 802.11 Standards; Receivers

Device View Redundancy: an adaptive low-overhead fault tolerance mechanism for many-core system

International Workshop on Intelligent Communication and Social Networks

Authors: Wentao Jia; Chunyuan Zhang; Jian Fu. Affiliations: National Key Laboratory of Parallel and Distributed Processing, College of Computer, National University of Defense Technology; Institute for Informatics, University of Amsterdam

ISBN: (print) 9781479909735

The continued increase of fault rates in integrated circuits makes processors, especially many-core processors, more susceptible to errors. Meanwhile, most systems and applications do not need full fault coverage, which incurs excessive overhead, so on-demand fault tolerance is desirable for these applications. In this paper, we propose an adaptive low-overhead fault tolerance mechanism for many-core systems, called Device View Redundancy (DVR). It treats fault tolerance as a device that can be configured and used by an application when high reliability is needed. Moreover, DVR exploits idle resources for low-overhead fault tolerance, based on the observation that the utilization of many-core systems is low due to a lack of parallelism in applications. Finally, the experiments show that the performance overhead of DVR is reduced by 16% to 98% compared with full Dual Modular Redundancy (DMR).

Keywords: On-demand redundancy; Idle resource exploitation; Dynamic coupling; Low-overhead; Many core system

AUXILIARY-TASKS LEARNING FOR PHYSICS-INFORMED NEURAL NETWORK-BASED PARTIAL DIFFERENTIAL EQUATIONS SOLVING

arXiv, 2023

Authors: Yan, Junjun; Chen, Xinhai; Wang, Zhichao; Zhou, Enqiang; Liu, Jie. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China; Laboratory of Digitizing Software for Frontier Equipment, National University of Defense Technology, Changsha 410073, China

Physics-informed neural networks (PINNs) have emerged as promising surrogate models for solving partial differential equations (PDEs). Their effectiveness lies in the ability to capture solution-related features through neural networks. However, original PINNs often suffer from bottlenecks, such as low accuracy and non-convergence, limiting their applicability in complex physical contexts. To alleviate these issues, we propose auxiliary-task learning-based physics-informed neural networks (ATL-PINNs), which provide four different auxiliary-task learning modes, and investigate their performance compared with original PINNs. We also employ the gradient cosine similarity algorithm to integrate the auxiliary problem loss with the primary problem loss in ATL-PINNs, which aims to enhance the effectiveness of the auxiliary-task learning modes. To the best of our knowledge, this is the first study to introduce auxiliary-task learning modes in the context of physics-informed learning. We conduct experiments on three PDE problems across different fields and scenarios. Our findings demonstrate that the proposed auxiliary-task learning modes can significantly improve solution accuracy, achieving a maximum performance boost of 96.62% (averaging 28.23%) compared to the original single-task PINNs. The code and dataset are open source at https://***/junjun-yan/ATL-PINN. Copyright © 2023, The Authors. All rights reserved.

Keywords: Partial differential equations
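
The abstract above mentions combining auxiliary and primary losses through gradient cosine similarity but does not give the rule. The sketch below shows one common variant of that idea, offered as an assumption rather than the ATL-PINN implementation: the auxiliary gradient is added only when it does not point against the primary gradient. Gradients are plain Python lists here, and `cosine_similarity` and `combine_gradients` are hypothetical names.

```python
import math


def cosine_similarity(g_main, g_aux):
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(g_main, g_aux))
    norm = math.sqrt(sum(a * a for a in g_main)) * math.sqrt(sum(b * b for b in g_aux))
    return dot / norm if norm > 0 else 0.0


def combine_gradients(g_main, g_aux):
    """Assumed rule: use the auxiliary gradient only if it does not conflict with the main one."""
    if cosine_similarity(g_main, g_aux) >= 0:
        return [a + b for a, b in zip(g_main, g_aux)]
    return list(g_main)


# Usage: an aligned auxiliary gradient is added, a conflicting one is dropped.
aligned = combine_gradients([1.0, 0.0], [1.0, 1.0])        # [2.0, 1.0]
conflicting = combine_gradients([1.0, 0.0], [-1.0, 0.5])   # [1.0, 0.0]
```

In a real training loop the same test would be applied to the parameter gradients of the primary PDE loss and each auxiliary loss before the optimizer step.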

Visuel: A novel performance monitoring and analysis toolkit for cluster and grid environments

6th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP

Authors: Li, Kuan-Ching; Cheng, Hsiang-Yao; Yang, Chao-Tung; Hsu, Ching-Hsien; Wang, Hsiao-Hsi; Hsu, Chia-Wen; Hung, Sheng-Shiang; Chang, Chia-Fu; Liu, Chun-Chieh; Pan, Yu-Hwa. Affiliations: Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan; High Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiwan; Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu 300, Taiwan

ISBN: (print) 3540292357

The computing power provided by high-performance, low-cost PC-based Cluster and Grid platforms is attractive, as it equals or exceeds that of widely available supercomputers and mainframes. In this paper, we present the design rationale and implementation of Visuel, a toolkit for performance measurement and analysis of MPI parallel programs and for real-time resource monitoring in cluster and grid computing environments. The toolkit provides a web-based interface that shows the performance activities of all computing nodes involved in the execution of an MPI parallel program, such as the CPU and memory usage levels of each node, and monitors all computing nodes of a platform by displaying real-time performance data. In addition, the toolkit can display comparative performance charts for multiple executions of the MPI parallel application under investigation, which facilitates "what-if" analysis. Use of the toolkit shows that it eases the investigation of parallel applications. © Springer-Verlag Berlin Heidelberg 2005.

Keywords: Distributed computer systems
