检索结果-内蒙古大学图书馆

IEEE International Conference on Parallel and Distributed computing, Applications and Technologies (PDCAT)

作者： Qiang Li Zhigang Huo Ninghui Sun Graduate University of Chinese Academy of Sciences Beijing China Key Laboratory of Computer System and Architecture Chinese Academy of Sciences Beijing China Institute of Computing Technology Chinese Academy of Sciences Beijing China Institute of Computing Technology Chinese Academy of Sciences Beijing CN

MPI All to all communication is widely used in many high performance computing (HPC) applications. In All to all communication, each process sends a distinct message to all other participating processes. In multicore clusters, processes within a node simultaneously contend for the same network resource of the node in All to all communication. However, many small synchronization messages are required in All to all communication of large messages. With the contention, their latency is orders of magnitude larger than that without contention. As a result, the synchronization overhead is significantly increased and accounts for a large proportion to the whole latency of All to all communication. In this paper, we analyse the considerable overhead of synchronization messages. Base on the analysis, an optimization is presented to reduce the number of synchronization messages from 3N to 2¡ÌN. Evaluations on a 240-core cluster show that the performance is improved by almost constant ratio, which is mainly determined by message size and independent of system scale. The performance of All to all communication is improved by 25% for 32K and 64K bytes messages. For FFT application, performance is improved by 20%.

关键词： Synchronization Protocols Multicore processing Receivers Bandwidth Program processors Benchmark testing

来源：评论

学校读者我要写书评

暂无评论

Particle swarm optimization for automatic parameters determination of pulse coupled neural network

引用

Journal of computers 2011年第8期6卷 1546-1553页

作者： Xinzheng, Xu Shifei, Ding Zhongzhi, Shi Zuopeng, Zhao Hong, Zhu School of Computer Science and Technology China University of Mining and Technology Xuzhou China Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing China

Pulse coupled neural network (PCNN), a wellknown class of neural networks, has original advantage when applied to image processing because of its biological background. However, when PCNN is used, the main problem is that its parameters aren't self-adapting according to different image which limits the application range of PCNN. Considering that, this paper proposed a new method based on particle swarm optimization (PSO) to determine automatically the parameters of PCNN. In this method, the algorithm of PSO is applied to search automatically optimum in the solution space of PCNN's parameters until finding global optimal solution. Experimental results demonstrate that the proposed method is accurate and robust for image segmentation, and its performance is better than the methods of Otsu, manual adjustment of parameters when mutual information is adopted as evaluation criteria. © 2011 ACADEMY PUBLISHER.

关键词： Image segmentation

来源：评论

学校读者我要写书评

暂无评论

A rough RBF neural networks optimized by the genetic algorithm

引用

Advances in Information Sciences and Service Sciences 2011年第7期3卷 332-339页

作者： Ding, Shifei Ma, Gang Xu, Xinzheng School of Computer Science and Technology China University of Mining and Technology Xuzhou China Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Science Beijing China

The large-scale data parallelism processing is an inherent characteristic of artificial neural networks, but the networks bring the efficiency problems of data processing. As one of the artificial neural networks, Radial Basis Function (RBF) neural networks have the same problem. Therefore, how to reduce the scale of data to improve the efficiency of data processing has been a hot issue among the artificial intelligence scholars. Based on the traditional RBF neural networks, this paper puts forward a method which determines the important degree of the sample attributes based on knowledge entropy of Rough set by analyzing the relationship between the knowledge entropy and the weight of the sample attributes, and assesses the importance of the sample attributes between the input layer and the hidden layer, namely the attribution reduction, so as to achieve reduce the scale of data processing. The ultimate aim of training RBF neural networks is to seek a set of suitable networks parameters which makes the sample output error achieve the minimum or required accuracy, while Genetic Algorithm (GA) has the properties of finding out the optimal solution through multiplepoint random search in the solution space, so Genetic Algorithm is used to optimize the centers, the widths and the weights between the hidden layer and the output layer of RBF neural networks in training the networks. Finally, a model about A Rough RBF Neural Networks Optimized by the Genetic Algorithm (GA-RS-RBF) is proposed in this paper. The simulation results show that the rough RBF neural network optimized by the Genetic Algorithm is better than the traditional RBF neural networks in classification about Iris datasets.

关键词： Genetic algorithms

来源：评论

学校读者我要写书评

暂无评论

A fine-grained component-level power measurement method

A fine-grained component-level power measurement method

引用

International Green computing Conference

作者： Zehan Cui Yan Zhu Yungang Bao Mingyu Chen Chinese Academy and Sciences Beijing China State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy and Sciences Beijing China Chinese Academy of Sciences Beijing Beijing CN

ISBN: (纸本)9781457712227

The ever growing energy consumption of computer systems have become a more and more serious problem in the past few years. Power profiling is a fundamental way for us to better understand where, when and how energy is consumed. This paper presents a direct measurement method to measure the power of main computer components with fine time granularity. To achieve this goal, only small amount of extra hardware are employed. An approach to synchronize power dissipation with program phases has also been proposed in this paper. Based on the preliminary version of our tools, we measure the power of CPU, memory and disk when running SPEC CPU2006 benchmarks, and prove that measurement with fine time granularity is essential. The phenomenon we observe from memory power may be served as a guide for memory management or architecture design towards energy efficiency.

关键词： Power measurement Current measurement Wires Voltage measurement Connectors Central Processing Unit Time measurement

来源：评论

学校读者我要写书评

暂无评论

A cost-effective substantial-impact-filter based method to tolerate voltage emergencies

A cost-effective substantial-impact-filter based method to t...

引用

Design, Automation and Test in Europe Conference and Exhibition

作者： Songjun Pan Yu Hu Xing Hu Xiaowei Li Chinese Academy and Sciences Beijing China Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy and Sciences Beijing China Chinese Academy of Sciences Beijing Beijing CN

Supply voltage fluctuation caused by inductive noises has become a critical problem in microprocessor design. A voltage emergency occurs when supply voltage variation exceeds the acceptable voltage margin, jeopardizing the microprocessor reliability. Existing techniques assume all voltage emergencies would definitely lead to incorrect program execution and prudently activate rollbacks or flushes to recover, and consequently incur high performance overhead. We observe that not all voltage emergencies result in external visible errors, which can be exploited to avoid unnecessary protection. In this paper, we propose a substantial-impact-filter based method to tolerate voltage emergencies, including three key techniques: 1) Analyze the architecture-level masking of voltage emergencies during program execution; 2) Propose a metric intermittent vulnerability factor for intermittent timing faults (IV F itf ) to quantitatively estimate the vulnerability of microprocessor structures (load/store queue and register file) to voltage emergencies; 3) Propose a substantial-impact-filter based method to handle voltage emergencies. Experimental results demonstrate our approach gains back nearly 57% of the performance loss compared with the once-occur-then-rollback approach.

关键词： Microprocessors Delay computer architecture Computational modeling Sensors Benchmark testing

来源：评论

学校读者我要写书评

暂无评论

Visual saliency detection by spatially weighted dissimilarity

Visual saliency detection by spatially weighted dissimilarit...

引用

作者： Duan, Lijuan Wu, Chunpeng Miao, Jun Qing, Laiyun Fu, Yu College of Computer Science and Technology Beijing University of Technology Beijing 100124 China Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China School of Information Science and Engineering Graduate University Chinese Academy of Sciences Beijing 100049 China Department of Computing University of Surrey Guildford Surrey GU2 7XH United Kingdom

ISBN: (纸本)9781457703942

In this paper, a new visual saliency detection method is proposed based on the spatially weighted dissimilarity. We measured the saliency by integrating three elements as follows: the dissimilarities between image patches, which were evaluated in the reduced dimensional space, the spatial distance between image patches and the central bias. The dissimilarities were inversely weighted based on the corresponding spatial distance. A weighting mechanism, indicating a bias for human fixations to the center of the image, was employed. The principal component analysis (PCA) was the dimension reducing method used in our system. We extracted the principal components (PCs) by sampling the patches from the current image. Our method was compared with four saliency detection approaches using three image datasets. Experimental results show that our method outperforms current state-of-the-art methods on predicting human fixations. © 2011 IEEE.

关键词： Principal component analysis

来源：评论

学校读者我要写书评

暂无评论

Optimizing Web Browser on Many-Core architectures

Optimizing Web Browser on Many-Core Architectures

引用

IEEE International Conference on Parallel and Distributed computing, Applications and Technologies (PDCAT)

作者： Lingjun Fan Weisong Shi Shibin Tang Chenggang Yan Dongrui Fan Graduate University of Chinese Academy of Sciences Beijing China Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing China Department of Computer Science Wayne State University Detroit USA

As more and more Web applications emerging on sever end today, the Web browser on client end has become a host of a variety of applications other than just rendering static Web pages. This leads to more and more performance requirements of a Web browser, for which user experience is very important. This situation may become more urgency when on handheld devices. Some efforts like redesign a new Web browser have been made to overcome this problem. In this paper, we address this issue by optimizing the main processes of the Web browser on a state-of-the-art 64-core architecture, Godson-T, which was developed at Chinese Academy of Sciences, as multi-/many-core architecture to be the mainstream processor in the upcoming years. We start a new core to process a new tab when facing up to intensive URL requests, and we use scratch-pad memory (SPM) of each core as a local buffer to store the HTML source data to be processed to reduce off-chip memory access and exploit more data locality, otherwise, we use DTA to transfer HTML data for backup. Experiments conducted on the cycle-accurate simulator show that, starting each tab process by a new core could obtain 5.7% to 50% speedup with different number of cores used to process corresponding URL requests, with on-chip scratchpad memory of each core used to store the HTML data, more speedup could be achieved when number of cores increase. Also, as Data Transfer Agent (DTA) used to transfer the HTML data, the backup of HTML data can get 2X to 5X speedups according to different data amount.

关键词： Browsers HTML Web pages Layout Multicore processing System-on-a-chip

来源：评论

学校读者我要写书评

暂无评论

Modeling and analysis of dual-arm cluster tools for wafer fabrication with revisiting

Modeling and analysis of dual-arm cluster tools for wafer fa...

引用

2011 7th IEEE International Conference on Automation Science and Engineering, CASE 2011

作者： Qiao, Yan Wu, NaiQi Zhou, MengChu Department of Industrial Engineering School of Mechatronics Engineering Guangdong University of Technology Guangzhou 510006 China Key Laboratory of Embedded System and Service Computing Ministry of Education Tongji University Shanghai 200092 China Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark NJ 07102-1982 United States

ISBN: (纸本)9781457717307

Some wafer fabrication processes are repeated processes, e.g. atomic layer deposition (ALD) process. For such processes, the wafers need to visit some processing modules for a number of times, which complicates the cycle time analysis. This paper studies the cycle time analysis problem for such processes. With a Petri net model, it is found that such processes contain local cycles involving only the revisiting PMs and global cycles involving both revisiting and non-revisiting PMs. The process switches between these two types of cycles such that the process never reaches a steady state. Based on this finding, the mechanism underlying such processes is revealed and analytical expressions are given for the calculation of their cycle time. Illustrative examples are presented to show the application of the proposed approach. © 2011 IEEE.

关键词： Atomic layer deposition

来源：评论

学校读者我要写书评

暂无评论

On a generalization of cubic spline interpolation

Journal of Software

引用

Journal of Software 2011年第9期6卷 1632-1639页

作者： Gao, Shang Zhang, Zaiyue Cao, Cungen School of Computer Science and Technology Jiangsu University of Science and Technology Zhenjiang 212003 China Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing 100080 China

Based on analysis of basic cubic spline interpolation, the clamped cubic spline interpolation is generalized in this paper. The methods are presented on the condition that the first derivative and second derivative of arbitrary node are given. The Clamped spline and Curvature-adjusted cubic spline are also generalized. The methods are presented on the condition that the first derivatives of arbitrary two nodes or second derivatives of arbitrary two node are given. At last, these calculation methods are illustrated through examples. © 2011 ACADEMY PUBLISHER.

关键词： Interpolation

来源：评论

学校读者我要写书评

暂无评论

Rough neural networks: A review

引用

Journal of Computational Information Systems 2011年第7期7卷 2338-2346页

作者： Ding, Shifei Chen, Jinrong Xu, Xinzheng Li, Jianying School of Computer Science and Technology China University of Mining and Technology Xuzhou 221116 China Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Science Beijing 100080 China

The rough neural networks (RNNs) are the neural networks based on rough set and one kind of hot research in the artificial intelligence in recent years, which synthesize the advantage of rough set to process uncertainly question: attributes reduce by none information losing then extract rule, and the neural networks have the strongly fault tolerance, self-organization, massively parallel processing and self-adapted. So that RNNs can process the massively and uncertainly information, which is widespread applied in our life. This article summarizes the recent research development of RNNs. First introduce the theory of rough set and the rough neuron;next analyze the RNNs in following four aspects: the neural network based on using rough set in preprocessing information, the neural networks base on rough logic, the neural networks based on rough neuron and the neural networks based on rough-granular;then give the flow chart of the RNNs processing question and the application of the classical neural network based on rough set;last give some advice to the development of RNNs in the future. Copyright © 2011 Binary Information Press.

关键词： Rough set theory

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：