Search Results - Inner Mongolia University Library

2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011

Authors: Xu, Shixiong Han, Dongni Chen, Li Key Laboratory of Computer System and Architecture Institute of Computing Technology University of Chinese Beijing China

ISBN: (print) 9780769545646

The wide application of General Purpose Graphics Processing Units (GPGPUs) results in large manual efforts on porting and optimizing algorithms on them. However, most existing automatic ways of generating GPGPU code fail to conduct optimization strategies regarding a specific computation and to reuse constantly evolving manual optimizations. In this paper, we present a computation pattern driven approach for computation-specific GPGPU code generation and optimization, which in turn reuses manual optimizations to a certain extent. We suggest language extensions to OpenMP, high-level data structure attributes, in order to assist the process of computation pattern matching and to help give users intuitive performance tuning parameters in the view of data structure attributes. We illustrate the feasibility of this approach through three important computation dwarfs, which are dense matrix, sparse matrix, and structured mesh computation in scientific computing. We also build a prototype OpenMP-to-CUDA translator that consists of computation pattern recognition and code template instantiation. The experimental results demonstrate the performance benefits of the computation pattern driven method. To the best of our knowledge, it is the first work on reusing manual optimizations for GPGPUs with a computation pattern driven approach. © 2011 IEEE.

Keywords: Data structures

Battery discharge characteristics of wireless sensors in building applications

IEEE International Conference on Networking, Sensing and Control

Authors: Wenqi Guo William M. Healy Meng Chu Zhou Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark NJ USA National Institute for Standards and Technology Gaithersburg MD USA Key Laboratory of Embedded System and Service Computing Ministry of Education University of Tongji Shanghai China ECE Department New Jersey Institute of Technology Newark NJ USA

Sensor nodes in wireless networks often use batteries as their source of energy, but replacing or recharging exhausted batteries in a deployed network can be difficult and costly. Therefore, prolonging battery life becomes a principal objective in the design of wireless sensor networks (WSNs). There is little published data that quantitatively analyzes a sensor node's lifetime under different operating conditions. This paper presents several experiments to quantify the impact of key wireless sensor network design and environmental parameters on battery performance. Our testbed consists of MicaZ motes, commercial alkaline batteries, and a suite of techniques for measuring battery performance. We evaluate known parameters, such as communication distance, working channel, and operating power, that play key roles in battery performance. Through extensive real battery discharge measurements, we expect our results to serve as a quantitative basis for future research in designing and implementing battery-efficient sensing applications and protocols.

Keywords: Batteries Wireless sensor networks Discharges (electric) Sensor phenomena and characterization Battery charge measurement Voltage measurement

Using Data-Level Parallelism to Accelerate Instruction-Level Redundancy

World Automation Congress

Authors: Yu Hu Zhongliang Chen Xiaowei Li Key Laboratory of Computer System and Architecture Institute of Computing Technology CAS Beijing 100190 China Department of Electrical and Computer Engineering Northeastern University Boston MA 02115 USA

ISBN: (print) 9781467344975

Instruction-level redundancy is an effective scheme to reduce the susceptibility of microprocessors to soft errors, offering high error detection and recovery capability; however, it usually incurs significant performance degradation due to resource racing. Motivated by the fact that narrow-width operands are commonly seen in applications, we exploit data-level parallelism to accelerate instruction-level redundancy. For the instructions within the sphere of replication (SoR) of data-level redundancy, normal and redundant versions of the narrow-width operand of the instruction are folded into one register to share the same functional unit during execution, hence alleviating resource racing. The other instructions are all protected by instruction-level redundancy. We run SPECint2000 benchmarks on a modified version of the SimpleScalar simulator, and synthesize the extra hardware to evaluate the area overhead of the proposed pipeline. Experimental results show that our acceleration scheme outperforms conventional instruction-level redundancy by 13% in IPC. Besides, the extra area overhead is negligible.

Keywords: Data-level redundancy Instruction-level redundancy Narrow-width value Sphere of replication
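
The folding idea described in this abstract can be illustrated in software. What follows is a minimal sketch and not the authors' hardware design: it assumes 16-bit narrow operands and a 32-bit register, and the names `fold`, `folded_add`, and `lanes_agree` are hypothetical. The normal and redundant copies of each operand occupy the two halves of one word, a single masked addition updates both halves without letting a carry cross between them, and a mismatch between the halves signals an error.

```python
MASK16 = 0xFFFF          # one 16-bit lane (assumed narrow-operand width)
HIGH_BITS = 0x80008000   # top bit of each lane in a 32-bit word
LOW_BITS = 0x7FFF7FFF    # all other bits of each lane


def fold(value: int) -> int:
    """Pack a 16-bit operand and its redundant copy into one 32-bit word."""
    value &= MASK16
    return (value << 16) | value


def folded_add(x: int, y: int) -> int:
    """Add two folded words lane by lane using a single wide addition.

    The top bit of each lane is masked off before adding so that no carry
    can spill into the neighbouring lane, then restored with XOR.
    """
    partial = (x & LOW_BITS) + (y & LOW_BITS)
    return partial ^ ((x ^ y) & HIGH_BITS)


def lanes_agree(word: int) -> bool:
    """Return True when the normal and redundant lanes hold the same value."""
    return (word >> 16) & MASK16 == word & MASK16


result = folded_add(fold(1000), fold(2345))
wrapped = folded_add(fold(0xFFFF), fold(1))   # each lane wraps around to 0
faulty = result ^ 0x4                          # simulate a bit flip in one lane
```

Here `lanes_agree(result)` holds while `lanes_agree(faulty)` does not, which is the detection step in this simplified model.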

Activity recognition based on semantic spatial relation

International Conference on Pattern Recognition

Authors: Lingxun Meng Laiyun Qing Peng Yang Jun Miao Xilin Chen Dimitris N. Metaxas University of the Chinese Academy of Sciences Beijing Beijing CN Computer Science Department Rutgers University Piscataway NJ USA Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing China

We propose an approach to recognize group activities that involve several persons, based on modeling the interactions between human bodies. Benefiting from the recent progress in pose estimation [1], we model the activities as the interactions between the parts belonging to the same person (intra-person) and those between the parts of different persons (inter-person). Then a unified, discriminative model that integrates both types of interactions is developed. The experiments on the UT-Interaction Dataset [2] show promising results and demonstrate the power of the interaction models.

Keywords: Joints Humans Semantics Testing Estimation Biological system modeling Image recognition

Virtual resource monitoring in cloud computing

Journal of Shanghai University (English Edition), 2011, Vol. 15, No. 5, pp. 381-385

Authors: 韩芳芳 彭俊杰 张武 李青 李建敦 江钦龙 袁勤 School of Computer Engineering and Science Shanghai University Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Science

Cloud computing is a new computing model. The resource monitoring tools are immature compared to those of traditional distributed computing and grid computing. In order to better monitor the virtual resources in cloud computing, a periodically and event-driven push (PEP) monitoring model is proposed. Taking advantage of the push and event-driven mechanisms, the model can provide comparatively adequate information about the usage and status of the resources. It can simplify the communication between the Master and Work Nodes without missing important issues that happen during the push interval. Besides, we develop "mon" to make up for the deficiency of Libvirt in monitoring virtual CPU and memory.

Keywords: virtual resource monitoring cloud computing virtualization
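
The combination of periodic and event-driven pushes described in this abstract can be sketched as follows. This is an illustrative model under stated assumptions, not the paper's implementation: the `WorkNode` class, the CPU-usage threshold used as the triggering event, and the choice to restart the periodic timer after an event push are all hypothetical. Time is passed in explicitly so the example runs without threads or real clocks.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Report = Tuple[float, str, float]  # (time, reason, cpu_usage)


@dataclass
class WorkNode:
    """Hypothetical work node that pushes status reports to a master."""
    push: Callable[[Report], None]      # how a report reaches the master
    period: float = 10.0                # seconds between periodic pushes
    threshold: float = 0.9              # CPU usage that counts as an event
    last_push: float = float("-inf")    # time of the most recent push
    over_threshold: bool = False        # last known threshold state

    def sample(self, now: float, cpu: float) -> None:
        """Inspect one measurement and push it if the model requires it."""
        over = cpu >= self.threshold
        if over != self.over_threshold:
            # Event-driven push: the state changed inside the push interval.
            self.over_threshold = over
            self._send(now, "event", cpu)
        elif now - self.last_push >= self.period:
            # Periodic push: the regular interval has elapsed.
            self._send(now, "periodic", cpu)

    def _send(self, now: float, reason: str, cpu: float) -> None:
        self.last_push = now
        self.push((now, reason, cpu))


master_log: List[Report] = []
node = WorkNode(push=master_log.append)
for t, cpu in [(0, 0.2), (4, 0.3), (6, 0.95), (8, 0.4), (10, 0.3), (18, 0.3)]:
    node.sample(t, cpu)

reasons = [reason for _, reason, _ in master_log]
```

In this run the spike at time 6 and the recovery at time 8 are reported immediately, even though both fall inside a single push interval.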

New methodologies for parallel architecture

Authors: Fan, Dong-Rui Li, Xiao-Wei Li, Guo-Jie Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China

Moore's law continues to grant computer architects ever more transistors in the foreseeable future, and parallelism is the key to continued performance scaling in modern microprocessors. In this paper, the achievements in our research project, which is supported by the National Basic Research 973 Program of China, on parallel architecture, are systematically presented. The innovative approaches and techniques to solve the significant problems in parallel architecture design are summarized, including architecture level optimization, compiler and language-supported technologies, reliability, power-performance efficient design, test and verification challenges, and platform building. Two prototype chips, a multi-heavy-core Godson-3 and a many-light-core Godson-T, are described to demonstrate the highly scalable and reconfigurable parallel architecture designs. We also present some of our achievements appearing in ISCA, MICRO, ISSCC, HPCA, PLDI, PACT, IJCAI, Hot Chips, DATE, IEEE Trans. VLSI, IEEE Micro, IEEE Trans. Computers, etc. © 2011 Springer Science+Business Media, LLC & Science Press, China.

Keywords: Parallel architectures

Fast implementation of DGEMM on Fermi GPU

2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC11

Authors: Tan, Guangming Li, Linchuan Triechle, Sean Phillips, Everett Bao, Yungang Sun, Ninghui Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Science China Nvidia Corporation China

ISBN: (print) 9781450307710

In this paper we present a thorough experience on tuning double-precision matrix-matrix multiplication (DGEMM) on the Fermi GPU architecture. We choose an optimal algorithm with blocking in both shared memory and registers to satisfy the constraints of the Fermi memory hierarchy. Our optimization strategy is further guided by performance modeling based on micro-architecture benchmarks. Our optimizations include software pipelining, use of vector memory operations, and instruction scheduling. Our best CUDA algorithm achieves performance comparable with the latest CUBLAS library. We further improve upon this with an implementation in the native machine language, leading to a 20% increase in performance. That is, the achieved peak performance (efficiency) is improved from 302 Gflop/s (58%) to 362 Gflop/s (70%). Copyright 2011 ACM.

Keywords: Graphics processing unit
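
The blocking mentioned in this abstract can be illustrated by its loop structure alone. The sketch below is plain Python rather than the authors' CUDA or machine-code kernel, and it does not model shared memory or registers. It only shows how the multiplication is split into tiles so that a small working set is reused before moving on. The function name `blocked_matmul` and the `tile` parameter are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def blocked_matmul(a: Matrix, b: Matrix, tile: int = 2) -> Matrix:
    """Multiply a (n x k) by b (k x m), visiting the operands tile by tile."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):              # tile rows of C
        for j0 in range(0, m, tile):          # tile columns of C
            for k0 in range(0, k, tile):      # tiles along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]         # running sum kept in a local
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
product = blocked_matmul(a, b)
```

The `min(...)` bounds let the tile size differ from the matrix dimensions, as with the 2 x 3 input here.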

Petri net-based real-time scheduling of time-constrained single-arm cluster tools with activity time variation

IEEE International Conference on Robotics and Automation (ICRA)

Authors: Yan Qiao NaiQi Wu MengChu Zhou Department of Industrial Engineering School of Mechatronics Engineering Guangdong University of Technology Guangzhou China The Key Laboratory of Embedded System and Service Computing Ministry of Education University of Tongji Shanghai China Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark NJ USA

It is challenging to schedule time-constrained cluster tools subject to activity time variation. With the help of their Petri net model, a real-time control policy is used to offset the activity time variation. Based on it, the schedulability conditions and scheduling algorithms are presented for single-arm cluster tools. The schedulability conditions can be analytically checked. The algorithms are developed based on analytical expressions, so they are also computationally efficient. The schedule obtained by the scheduling algorithms, together with a real-time control policy, forms the real-time schedule. It is optimal in terms of cycle time.

Keywords: Robots Semiconductor device modeling Schedules Load modeling Real time systems Manganese Bismuth

Petri net-based scheduling analysis of dual-arm cluster tools with wafer revisiting

IEEE International Conference on Automation Science and Engineering (CASE)

Authors: Yan Qiao NaiQi Wu MengChu Zhou Department of Industrial Engineering School of Electro-Mechanical Engineering Guangdong University of Technology Guangzhou China The Key Laboratory of Embedded System and Service Computing Ministry of Education Tongji University Shanghai China Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark NJ USA

With wafer revisit, it is complicated to schedule cluster tools in semiconductor fabrication. In wafer fabrication processes, such as atomic layer deposition (ALD), the wafers need to visit some process modules for a number of times. The existing swap-based strategy can be used to operate a dual-arm cluster tool for such a process. It results in a 3-wafer cyclic schedule. However, it is not optimal in the sense of cycle time. Thus, to search for a better schedule, a Petri net model is developed for a dual-arm cluster tool with wafer revisit. With it, the properties of the 3-wafer schedule are analyzed. It is found that, to improve the performance, it is necessary to reduce the number of wafers completed in a cycle. Thus, a 1-wafer schedule is developed by using a new swap-based strategy.

Keywords: Robots Schedules Semiconductor device modeling Firing Steady-state Load modeling Transient analysis

Left conjugate product of polynomial matrices and solutions to the dual Sylvester-conjugate matrix equations

Australian Control Conference (AUCC)

Authors: Ai-Guo Wu Ye Chen Victor Sreeram Wanquan Liu Tyrone Fernando Shenzhen Key Laboratory of Wind Power and Smart Grid Harbin Institute of Technology Shenzhen Graduate School Shenzhen China School of Electrical Electronic and Computer Engineering University of West Australia Perth Australia Department of Computing Curtin University Perth Australia

In this paper, the concept of left conjugate product is first presented. Some interesting properties of the concept are then derived. Using left conjugate product as a tool, we investigate dual Sylvester-conjugate matrix equations which include Lyapunov matrix equations and generalized Sylvester-observer matrix equations as special cases. An explicit solution of this matrix equation is presented with a free parameter matrix.

Keywords: Polynomials Mathematical model Algorithm design and analysis Australia Educational institutions Linear systems
