Search results - Inner Mongolia University Library

ACM/IEEE International Symposium on Networks-on-Chip (NOCS)

Authors: Rodrigo, S.; Flich, J.; Roca, A.; Medardoni, S.; Bertozzi, D.; Camacho, J.; Silla, F.; Duato, J.

Affiliations: Univ Politecn Valencia, Parallel Architectures Grp, Valencia 46022, Spain; Integrated Syst Lab, Minatec, F-38000 Grenoble, France; Univ Ferrara, Dept Engn, I-44100 Ferrara, Italy; Simula Res Lab, N-1364 Oslo, Norway

The high-performance computing domain is being enriched by the inclusion of networks-on-chip (NoCs) as a key component of many-core (CMP or MPSoC) architectures. NoCs face the communication scalability challenge while meeting tight power, area, and latency constraints. Designers must address new challenges that were not present before. Defective components, the enhancement of application-level parallelism, or power-aware techniques may break topology regularity; thus, efficient routing becomes a challenge. This paper presents universal logic-based distributed routing (uLBDR), an efficient logic-based mechanism that adapts to any irregular topology derived from 2-D meshes, instead of using routing tables. uLBDR requires a small set of configuration bits, making it more practical than large routing tables implemented in memories. Several implementations of uLBDR are presented, highlighting the tradeoff between routing cost and coverage. The alternatives span from the previously proposed LBDR approach (with 30% coverage) to the uLBDR mechanism, which achieves full coverage. This comes with a small performance cost, thus exhibiting the tradeoff between fault tolerance and performance. Power consumption, area, and delay estimates are also provided, highlighting the efficiency of the mechanism. To do this, different router models (one for CMPs and one for MPSoCs) have been designed as a proof of concept.

Keywords: Fault-tolerance; logic design; networks-on-chip; routing
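The abstract above says routing decisions are computed by logic driven by a few configuration bits per router rather than looked up in a table. The following is a minimal sketch of that general idea for a 2-D mesh, loosely modelled on how the earlier LBDR scheme is usually described. The names `candidate_ports`, `conn`, and `turn` are assumptions for illustration and do not come from the abstract, and the extra mechanisms that give uLBDR full coverage are not modelled.

```python
def candidate_ports(cur, dst, conn, turn):
    """Return the set of output ports a packet may take at router `cur`.

    cur, dst: (x, y) coordinates; y grows southward (assumed convention).
    conn:     connectivity bits, e.g. {"N": True, ...}; False means the
              link in that direction is missing or faulty.
    turn:     routing bits, e.g. turn["NE"] is True if a packet leaving
              through the north port may turn east at the next router.
    """
    if cur == dst:
        return {"LOCAL"}
    (cx, cy), (dx, dy) = cur, dst
    # Comparator stage: in which quadrant does the destination lie?
    n, s = dy < cy, dy > cy
    e, w = dx > cx, dx < cx
    ports = set()
    # Each port is allowed if the destination is straight ahead, or if the
    # turn needed at the next hop is permitted by the routing bits.
    if n and ((not e and not w) or (e and turn["NE"]) or (w and turn["NW"])):
        ports.add("N")
    if e and ((not n and not s) or (n and turn["EN"]) or (s and turn["ES"])):
        ports.add("E")
    if s and ((not e and not w) or (e and turn["SE"]) or (w and turn["SW"])):
        ports.add("S")
    if w and ((not n and not s) or (n and turn["WN"]) or (s and turn["WS"])):
        ports.add("W")
    # Connectivity bits mask out ports whose link does not exist.
    return {p for p in ports if conn[p]}


# Example configuration: XY routing (travel along X first, then along Y).
XY_TURNS = {"NE": False, "NW": False, "SE": False, "SW": False,
            "EN": True, "ES": True, "WN": True, "WS": True}
ALL_LINKS = {"N": True, "E": True, "S": True, "W": True}
```

With these bits, a packet at (1, 1) heading to (2, 0) is offered only the east port. Clearing `conn["E"]` leaves it with no port at all, which illustrates why a scheme limited to these bits cannot cover every irregular topology.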

Variable Granularity Access Tracking Scheme for Improving the Performance of Software Transactional Memory

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Sandya S. Mannarswamy; Ramaswamy Govindarajan

Affiliations: CSA, IISc and Hewlett Packard, Bangalore, India; SERC, Indian Institute of Science, Bangalore, India

Software transactional memory (STM) has been proposed as a promising programming paradigm for shared-memory multi-threaded programs, as an alternative to conventional lock-based synchronization primitives. Typical STM implementations employ a conflict detection scheme that works with a uniform access granularity, tracking shared data accesses either at word/cache-line level or at object level. It is well known that a single fixed access tracking granularity cannot meet the conflicting goals of reducing false conflicts without impacting concurrency adversely. A fine-grained granularity, while improving concurrency, can have an adverse impact on performance due to lock aliasing, lock validation overheads, and additional cache pressure. On the other hand, a coarse-grained granularity can hurt performance due to reduced concurrency. Thus, in general, a fixed or uniform granularity access tracking (UGAT) scheme is application-unaware and rarely matches the access patterns of individual applications or parts of an application, leading to sub-optimal performance for different parts of the application(s). To mitigate the disadvantages of the UGAT scheme, we propose a Variable Granularity Access Tracking (VGAT) scheme in this paper. We propose a compiler-based approach in which the compiler uses inter-procedural whole-program static analysis to select the access tracking granularity for different shared data structures of the application, based on the application's data access pattern. We describe our prototype VGAT scheme, using TL2 as our STM implementation. Our experimental results reveal that the VGAT-STM scheme can improve the performance of STAMP benchmarks by 1.87% to 21.2%.

Keywords: Benchmark testing; Concurrent computing; Arrays; Program processors; Random access memory
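The granularity trade-off described in the abstract above can be made concrete with a toy lock table. This is a sketch under stated assumptions, not the paper's compiler analysis or TL2's actual implementation. The table size, the granularities, and the names `lock_index`, `conflicts`, and `GRANULARITY_BY_STRUCTURE` are all hypothetical.

```python
LOCK_TABLE_SIZE = 1024   # hypothetical number of versioned locks
WORD = 8                 # bytes tracked per lock at word granularity
CACHE_LINE = 64          # bytes tracked per lock at cache-line granularity


def lock_index(addr, granularity):
    """Map a byte address to a lock-table slot at the given granularity."""
    return (addr // granularity) % LOCK_TABLE_SIZE


def conflicts(addr_a, addr_b, granularity):
    """Two accesses are treated as conflicting if they share a lock slot."""
    return lock_index(addr_a, granularity) == lock_index(addr_b, granularity)


# Stand-in for a per-data-structure choice such as a compiler might make;
# the structure names and the values chosen are illustrative only.
GRANULARITY_BY_STRUCTURE = {"hot_counter_array": WORD, "big_read_mostly_tree": CACHE_LINE}

# Two different fields of one 64-byte object.
field_a, field_b = 0x1000, 0x1008
# Two unrelated words that are exactly one table-span apart at word granularity.
far_a, far_b = 0x1000, 0x1000 + WORD * LOCK_TABLE_SIZE
```

Here `conflicts(field_a, field_b, CACHE_LINE)` is true although the fields are distinct, which is a false conflict from coarse tracking. `conflicts(far_a, far_b, WORD)` is also true, which is lock aliasing from fine tracking over a bounded table. Choosing the granularity per structure is one way to avoid paying both costs everywhere.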

Vitis: A Gossip-based Hybrid Overlay for Internet-scale Publish/Subscribe Enabling Rendezvous Routing in Unstructured Overlay Networks

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Fatemeh Rahimian; Sarunas Girdzijauskas; Amir H. Payberah; Seif Haridi

Affiliations: Royal Institute of Technology, Stockholm, Sweden; Swedish Institute of Computer Science, Stockholm, Sweden

Peer-to-peer overlay networks are attractive solutions for building Internet-scale publish/subscribe systems. However, scalability comes at a cost: a message published on a certain topic often needs to traverse a large number of uninterested (unsubscribed) nodes before reaching all its subscribers. This might sharply increase resource consumption for such relay nodes (in terms of bandwidth, transmission cost, CPU, etc.) and could ultimately lead to rapid deterioration of the system's performance once the relay nodes start dropping messages or choose to permanently abandon the system. In this paper, we introduce Vitis, a gossip-based publish/subscribe system that significantly decreases the number of relay messages and scales to an unbounded number of nodes and topics. This is achieved by the novel approach of enabling rendezvous routing on unstructured overlays. We construct a hybrid system by injecting structure into an otherwise unstructured network. The resulting structure resembles a navigable small-world network, which spans clusters of nodes that have similar subscriptions. The properties of such an overlay make it an ideal platform for efficient data dissemination in large-scale systems. We perform extensive simulations and evaluate Vitis by comparing its performance against two baseline publish/subscribe systems: one that is oblivious to node subscriptions, and another that exploits subscription similarities. Our measurements show that Vitis significantly outperforms the baseline solutions in various subscription and churn scenarios, using both synthetic models and real-world traces.

Keywords: Peer to peer computing; Routing; Subscriptions; Relays; Logic gates; Correlation; Protocols
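Rendezvous routing, as mentioned in the abstract above, generally means that messages for a topic are steered toward the node whose identifier is closest to a hash of the topic. The sketch below shows greedy forwarding on a ring-shaped identifier space. It is an illustration of that general technique only, not the Vitis protocol. The identifier width, the use of SHA-1, and the names `topic_id`, `ring_distance`, and `greedy_route` are assumptions.

```python
import hashlib

RING_BITS = 16            # hypothetical identifier width
RING = 1 << RING_BITS


def topic_id(topic):
    """Hash a topic name onto the identifier ring."""
    digest = hashlib.sha1(topic.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def ring_distance(a, b):
    """Shortest distance between two identifiers on the ring."""
    d = abs(a - b) % RING
    return min(d, RING - d)


def greedy_route(start, target, neighbors):
    """Forward hop by hop to the neighbor closest to `target`.

    neighbors: dict mapping node id -> iterable of neighbor ids.
    Stops at the first node with no neighbor strictly closer to the
    target; that node acts as the rendezvous point for the target id.
    """
    path = [start]
    current = start
    while True:
        best = min(neighbors.get(current, ()),
                   key=lambda n: ring_distance(n, target),
                   default=current)
        if ring_distance(best, target) >= ring_distance(current, target):
            return path
        current = best
        path.append(current)


# A tiny hand-built overlay used as an example.
OVERLAY = {10: [20, 60000], 20: [10, 40], 40: [20, 45], 45: [40], 60000: [10]}
```

In this overlay, routing from node 10 toward identifier 44 visits 10, 20, 40, and 45. In a real system the target would be `topic_id("some-topic")`, and both publishers and subscribers would route toward it so that their paths meet.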

Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010

2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2010

ISBN: (print) 9781424465347

The proceedings contain 239 papers. The topics discussed include: characterizing heterogeneous computing environments using singular value decomposition; statistical predictors of computing power in heterogeneous clusters; a first step to the evaluation of SimGrid in the context of a real application; dynamic adaptation of DAGs with uncertain execution times in heterogeneous computing systems; robust resource allocation of DAGs in a heterogeneous multicore system; decentralized dynamic scheduling across heterogeneous multi-core desktop grids; a configurable-hardware document-similarity classifier to detect web attacks; a configurable high-throughput linear sorter system; hardware implementation for scalable lookahead regular expression detection; reducing grid energy consumption through choice of resource allocation method; and scheduling parallel tasks on multiprocessor computers with efficient power management.

LSAP'11 - Proceedings of the 3rd International Workshop on Large-Scale System and Application Performance: Foreword

LSAP'11 - Proceedings of the 3rd International Workshop on Large-Scale System and Application Performance, 2011, p. iii

Authors: Arlitt, Martin; Epema, Dick; Moreira, Jose

Affiliations: HP Labs., United States; University of Calgary, Canada; Delft University of Technology, Netherlands; IBM T.J. Watson Research Lab., United States

Advances in parallel and distributed computing models - APDCM

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, 2011, p. 531

Authors: Ibarra, Oscar H.; Nakano, Koji; Bordim, Jacir L.; Fujiwara, Akihiro; Caillouet, Christelle; Fujita, Satoshi; Ichikawa, Shuichi; Inoguchi, Yasushi; Ito, Yasuaki; Iwamoto, Chuzo; Jiang, Xiaohong; Kakugawa, Hirotsugu; Li, Guoqiang; Li, Keqin; Marowka, Ami; Matsumae, Susumu; Miyano, Eiji; Motoki, Mitsuo; Ono, Hirotaka; Rajasekaran, Sanguthevar; Stojmenovic, Ivan; Sun, Wei; Takenaga, Yasuhiko; Trahan, Jerry L.; Yamagiwa, Shinichi; Zhang, Jingyuan; Ja'Ja, Joseph; Rosenberg, Arnold L.; Sahni, Sartaj K.; Wu, Jie; Yew, Pen-Chung; Zomaya, Albert Y.

Affiliations: University of California, Santa Barbara, United States; Hiroshima University, Japan; University of Brasilia, Brazil; Kyushu Institute of Technology, Japan; LIG Lab., France; Toyohashi University of Technology, Japan; JAIST, Japan; Future University Hakodate, Japan; Osaka University, Japan; Shanghai Jiao Tong University, China; State University of New York, New Paltz, United States; Bar-Ilan University, Israel; Saga University, Japan; Kanazawa Technical College, Japan; Kyushu University, Japan; University of Connecticut, United States; University of Ottawa, Canada; NEC, Japan; University of Electro-Communications, Japan; Louisiana State University, United States; Kochi University of Technology, JST PRESTO, Japan; University of Alabama, United States; University of Maryland, United States; Northeastern University, United States; Colorado State University, United States; University of Florida, United States; Temple University, United States; University of Minnesota, United States; University of Sydney, Australia

High-performance grid and cloud computing workshop - HPGC

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, 2011, p. 880

Authors: Aubanel, Eric; Bhavsar, Virendra C.; Frumkin, Michael Alex; Aggarwal, Akshai; Bacigalupo, David; Chrisochoides, Nikos P.; Chronopoulos, Anthony T.; Du, Weichang; Huedo, Eduardo; Krishnamurthy, Diwakar; Krishnan, Sriram; Lastovetsky, Alexey; Lu, Paul; Mateescu, Gabriel; Montero, Rubén S.; Podlipnig, Stefan; Prasad, Sushil; Ranka, Sanjay; Rau-Chaplin, Andrew; Rauber, Thomas; Shaw, Ruth; Van Der Wijngaart, Rob F.; Yang, Laurence T.

Affiliations: University of New Brunswick, Canada; Google, United States; University of Windsor, Windsor, Canada; University of Southampton, United Kingdom; College of William and Mary, Williamsburg, VA, United States; Univ. of Texas San Antonio, United States; Universidad Complutense de Madrid, Spain; University of Calgary, Canada; San Diego Supercomputer Center, United States; University College Dublin, Ireland; University of Alberta, Canada; Virginia Bioinformatics Institute, Blacksburg, VA, United States; University of Innsbruck, Austria; Georgia State University, United States; Univ. of Florida, United States; Dalhousie University, Canada; University of Bayreuth, Germany; Intel Corporation, United States; St. Francis Xavier University, Canada

Workshop on desktop grids and volunteer computing systems - PCGrid

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, 2011, p. 1838

Authors: Fedak, Gilles; Kondo, Derrick; Heien, Eric; Abramson, David; Anderson, David; Andrzejak, Artur; Araujo, Filipe; Bal, Henri; Balaton, Zoltan; Beberg, Adam; Brasileiro, Francisco; Canonico, Massimo; Casanova, Henri; Chandra, Abhishek; Gabriel, Edgar; He, Haiwu; Javadi, Bahman; Kee, Yang-Suk; Legrand, Arnaud; Malewicz, Grzegorz; Sussman, Alan; Taufer, Michela; Toth, David; Traversat, Bernard; Varela, Carlos; Varrette, Sebastien; Weissman, Jon; Zhan, Zhiyuan

Affiliations: INRIA, France; UC Davis, CA, United States; Monash University, Australia; University of California, Berkeley, United States; University of Heidelberg, Germany; University of Coimbra, Portugal; Vrije Universiteit, Netherlands; SZTAKI, Hungary; Stanford University, United States; Federal University of Campina Grande, Brazil; University of Piemonte Orientale, Italy; University of Hawaii, Manoa, United States; University of Minnesota, United States; University of Houston, United States; University of Melbourne, Australia; University of Southern California, United States; CNRS, France; University of Alabama, United States; University of Maryland, United States; University of Delaware, United States; Merrimack College, United States; Oracle Corporation, United States; Rensselaer Polytechnic Institute, United States; University of Luxembourg, Luxembourg; Microsoft, United States

Distributed Skycube Computation with Anthill

International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)

Authors: Renê R. Veloso; Loïc Cerf; Chedy Raïssi; Wagner Meira Jr.

Affiliations: DCC - UFMG, Belo Horizonte, Brazil; INRIA Nancy Grand-Est, Nancy Grand-Est, France

Recently, skyline queries have gained considerable attention and are among the most important tools for multi-criteria analysis. In order to process all possible combinations of criteria along with their inherent analysis, researchers introduced and studied the notion of skycube. Simply put, a skycube is a pre-materialization of all possible subspaces with their associated skylines. An efficient skycube computation relies on the detection of redundancies in the different processing steps and enhanced result sharing between subspaces. Lately, the Orion algorithm was proposed to compute the skycube in a very efficient way. The approach relies on the derivation of skyline points over different subspaces. Nevertheless, because there are $2^{|D|} - 1$ subspaces (where $D$ is the set of dimensions) in a skycube, the running time still grows exponentially with the number of dimensions and easily becomes intractable on real-world datasets. In this study, we detail the distribution of Orion within a filter-stream framework and we conduct an extensive set of experiments on large datasets collected from Twitter to demonstrate the efficiency of our method.

Keywords: Parallel processing; Computational modeling; Asynchronous communication; Computer architecture; Face; Load management; Hardware
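The abstract above defines a skycube as the skylines of all non-empty subspaces of the dimension set. A naive sketch makes the exponential count visible. This is a brute-force illustration and not the Orion algorithm, which shares work between subspaces. It assumes smaller values are better in every dimension, and the function names are chosen here for illustration.

```python
from itertools import combinations


def dominates(p, q, dims):
    """p dominates q on `dims`: no worse everywhere, strictly better somewhere."""
    return (all(p[d] <= q[d] for d in dims)
            and any(p[d] < q[d] for d in dims))


def skyline(points, dims):
    """Points that no other point dominates on the given dimensions."""
    return [p for p in points
            if not any(dominates(q, p, dims) for q in points)]


def skycube(points, num_dims):
    """Skyline of every non-empty subspace, keyed by a tuple of dimension indices."""
    cube = {}
    for k in range(1, num_dims + 1):
        for dims in combinations(range(num_dims), k):
            cube[dims] = skyline(points, dims)
    return cube


# Small example with three dimensions, hence 2**3 - 1 = 7 subspaces.
POINTS = [(1, 5, 3), (2, 2, 2), (3, 1, 4), (4, 4, 4)]
CUBE = skycube(POINTS, 3)
```

Each skyline here costs a quadratic scan, and the outer loops visit every subspace independently. Both costs are what redundancy detection and result sharing between subspaces aim to reduce.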

A Waterfall Model to Achieve Energy Efficient Tasks Mapping for Large Scale GPU Clusters

IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW)

Authors: Wenjie Liu; Zhihui Du; Yu Xiao; David A. Bader; Chen Xu

Affiliations: Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China; Beijing University of Posts and Telecommunications, China; College of Computing, Georgia Institute of Technology, Atlanta, GA, USA

High energy consumption has become a critical problem for supercomputer systems. GPU clusters are becoming an increasingly popular architecture for building supercomputers because of their great improvement in performance. In this paper, we first formulate the task mapping problem as a minimal energy consumption problem with a deadline constraint. Its optimization objective is very different from that of the traditional mapping problem, which often aims at minimizing makespan or response time. Then a Waterfall Energy Consumption Model, which abstracts the energy consumption of one GPU cluster system into several levels from high to low, is proposed to achieve an energy-efficient task mapping for large-scale GPU clusters. Based on our Waterfall Model, a new task mapping algorithm is developed which applies different energy-saving strategies to keep the system at lower energy levels. Our mapping algorithm adopts Dynamic Voltage Scaling, Dynamic Resource Scaling, and β-migration for GPU sub-tasks to significantly reduce energy consumption and achieve a better load balance for GPU clusters. A task generator based on real task traces is developed, and the simulation results show that our mapping algorithm based on the Waterfall Model can reduce energy consumption by nearly 50% compared with traditional approaches, which can only run at a high energy level. Not only can the task deadline be satisfied, but the task execution time of our mapping algorithm can also be reduced.

Keywords: Graphics processing unit; Energy consumption; Heuristic algorithms; Clustering algorithms; Computational modeling; Energy efficiency; Algorithm design and analysis
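The abstract above frames mapping as minimizing energy under a deadline and names Dynamic Voltage Scaling as one of the strategies. The sketch below shows the textbook form of that one idea for a single task: run at the lowest frequency that still meets the deadline. It is not the paper's Waterfall Model or mapping algorithm. The cubic power model, the frequency levels, and the function names are assumptions made for illustration.

```python
def pick_frequency(cycles, deadline_s, levels_hz):
    """Return the lowest available frequency that finishes `cycles`
    of work within `deadline_s` seconds, or None if none can."""
    for f in sorted(levels_hz):
        if cycles / f <= deadline_s:
            return f
    return None


def relative_energy(cycles, f_hz):
    """Energy in arbitrary units under an assumed model where voltage
    scales with frequency: power ~ f**3, time = cycles / f, so
    energy ~ cycles * f**2."""
    return cycles * f_hz ** 2


# Hypothetical frequency levels and task size.
LEVELS = [1.5e9, 0.5e9, 1.0e9]
TASK_CYCLES = 2e9
chosen = pick_frequency(TASK_CYCLES, 2.5, LEVELS)
```

With a 2.5 second deadline the 0.5 GHz level would take 4 seconds, so the 1.0 GHz level is chosen. Under the assumed model it uses less than half the energy of running the same task at 1.5 GHz.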
