ISBN (digital): 9781450362290
ISBN (print): 9781450362290
Suffix arrays and trees are important and fundamental string data structures which lie at the foundation of many string algorithms, with important applications in computational biology, text processing, and information retrieval. Recent work enables the efficient parallel construction of suffix arrays and trees requiring at most O(n/p) memory per process in distributed memory. However, querying these indexes in distributed memory has not been studied extensively. Querying common string indexes such as suffix arrays, enhanced suffix arrays, and the FM-index all requires random accesses into O(n) memory, which in distributed-memory settings becomes prohibitively expensive. In this paper, we introduce a novel distributed string index, the Distributed Enhanced Suffix Array (DESA). We present efficient algorithms for the construction and querying of this distributed data structure, all while requiring only O(n/p) memory per process. We further provide a scalable parallel implementation and demonstrate its performance and scalability.
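For context, this is what a suffix-array pattern query looks like on a single node: a minimal sketch with illustrative names, using the naive sort-based construction rather than the scalable parallel algorithm the paper builds on. The distributed problem the paper solves is exactly that each binary-search probe below is a random access into the O(n) array.

```python
# A minimal single-node sketch of suffix-array pattern search; names are
# illustrative and the construction is the naive sort, not the paper's method.

def build_suffix_array(text):
    """Sort all suffix start positions lexicographically (naive O(n^2 log n))."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """Binary-search the sorted suffixes for the range prefixed by pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:                        # leftmost suffix with prefix >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:                        # leftmost suffix with prefix > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

text = "banana"
print(find_occurrences(text, build_suffix_array(text), "ana"))  # [1, 3]
```

Each of the O(log n) probes `text[sa[mid]...]` touches an arbitrary position of the index; on one machine this is a cache miss at worst, but across p processes it becomes a remote access, which is the cost the DESA is designed to avoid.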
The computational power of self-stabilizing distributed systems is examined. Assuming the availability of any number of processors, each with (small) constant-size memory, we show that any computable problem can be realized in a self-stabilizing fashion. The result is derived by presenting a distributed system which tolerates transient faults and simulates the execution of a Turing machine. The total amount of memory required by the distributed system is equal to the memory used by the Turing machine (up to a constant factor).
The concept of a structured distributed shared memory in which memory units are objects is introduced. The coherence of object replicas is maintained by type-specific coherence protocols that are based on the semantic...
ISBN (print): 0769510779
A compute farm is a pool of clustered workstations that provides high-performance computing services for CPU-intensive, memory-intensive, and I/O-active jobs in a batch mode. Existing load sharing schemes with memory considerations assume that jobs' memory demand sizes are known in advance or predictable based on users' hints. This assumption can greatly simplify the designs and implementations of load sharing schemes, but is not desirable in practice. In order to address this concern, we present three new results and contributions in this study. (1) Conducting Linux kernel instrumentation, we have collected different types of workload execution traces to quantitatively characterize job interactions, and modeled page fault behavior as a function of the overloaded memory sizes and the amount of jobs' I/O activities. (2) Based on experimental results and collected dynamic system information, we have built a simulation model which accurately emulates the memory system operations and job migrations with virtual memory considerations. (3) We have proposed a memory-centric load sharing scheme and its variations to effectively process dynamic memory allocation demands, aiming at minimizing the execution time of each individual job by dynamically migrating and remotely submitting jobs to eliminate or reduce page faults and to reduce the queuing time for CPU services. Conducting trace-driven simulations, we have examined these load sharing policies to show their effectiveness.
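The core of a memory-centric placement rule can be sketched in a few lines. This is an illustrative toy, not the paper's scheme: the `Node` fields and the fallback rule are hypothetical, and it shows only the idea of preferring memory feasibility (no paging) over CPU queue length.

```python
# Hypothetical sketch of memory-centric job placement: prefer a node whose
# free memory covers the job's demand (so it runs without page faults),
# then break ties by CPU queue length; fall back to least memory overload.

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_mem_mb: int
    run_queue: int          # jobs currently queued for CPU service

def place_job(nodes, job_mem_mb):
    """Return the name of the node chosen for a job with the given demand."""
    fits = [n for n in nodes if n.free_mem_mb >= job_mem_mb]
    if fits:
        # Among memory-feasible nodes, minimize CPU queuing delay.
        best = min(fits, key=lambda n: n.run_queue)
    else:
        # No node avoids paging: pick the one that overloads memory least.
        best = max(nodes, key=lambda n: n.free_mem_mb)
    best.free_mem_mb -= job_mem_mb
    best.run_queue += 1
    return best.name

nodes = [Node("a", 512, 2), Node("b", 2048, 3), Node("c", 1024, 1)]
print(place_job(nodes, 900))   # "c": fits in memory, shortest run queue
print(place_job(nodes, 4096))  # "b": nothing fits, most free memory left
```

The paper's contribution is precisely that `job_mem_mb` is not known in advance; the scheme reacts to observed page-fault behavior instead of a declared demand.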
ISBN (print): 9780889866386
The memory consistency model is crucial to the performance of shared-memory multiprocessors, and in current architectures several different models are adopted. In this paper, using graph algorithms for illustrative purposes, we consider the impact of the memory model on the implementation and performance of parallel algorithms on shared-memory multiprocessors. We show that the implementation of PRAM algorithms is largely "oblivious" of the underlying memory model, and has good performance on relaxed models. More importantly, we show that different memory models can favor drastically different algorithm designs.
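The "oblivious" claim can be illustrated with a level-synchronous, PRAM-style BFS: between barriers, each worker either touches a disjoint slice of the frontier or goes through an explicit lock, so correctness does not depend on what ordering the memory model guarantees inside a level. This is a sketch in Python threads, not code from the paper.

```python
# Level-synchronous BFS with p workers: all cross-thread communication is
# confined to the barriers between levels (the PRAM synchronization points).

import threading

def pram_bfs(adj, source, p=2):
    n = len(adj)
    dist = [-1] * n
    dist[source] = 0
    state = {"frontier": [source], "level": 0}
    next_frontier = []
    lock = threading.Lock()
    barrier = threading.Barrier(p)

    def worker(tid):
        while True:
            frontier = state["frontier"]
            if not frontier:
                return                     # every worker sees the same frontier
            for u in frontier[tid::p]:     # static partition of the frontier
                for v in adj[u]:
                    with lock:             # serialize the concurrent writes
                        if dist[v] == -1:
                            dist[v] = state["level"] + 1
                            next_frontier.append(v)
            barrier.wait()                 # level done: all writes complete
            if tid == 0:                   # one worker publishes the swap
                state["frontier"] = list(next_frontier)
                next_frontier.clear()
                state["level"] += 1
            barrier.wait()                 # new frontier visible to all

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dist

# Square 0-1-3-2-0: vertex 3 is two hops from source 0.
adj = [[1, 2], [0, 3], [0, 3], [1, 2]]
print(pram_bfs(adj, 0))  # [0, 1, 1, 2]
```

On a relaxed-consistency machine the barriers are exactly where fences are needed; the code between them requires no further ordering, which is the sense in which the algorithm is oblivious of the model.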
There are two distinct types of MIMD (Multiple Instruction, Multiple Data) computers: the shared memory machine, e.g. the Butterfly, and the distributed memory machine, e.g. Hypercubes and Transputer arrays. Typically these utilize different programming models: the shared memory machine has monitors, semaphores and fetch-and-add, whereas the distributed memory machine uses message passing. Moreover there are two popular types of operating systems: a multi-tasking, asynchronous operating system and a crystalline, loosely synchronous operating system.
In this paper I firstly describe the Butterfly, Hypercube and Transputer array MIMD computers, and review monitors, semaphores, fetch-and-add and message passing; then I explain the two types of operating systems and give examples of how they are implemented on these MIMD computers. Next I discuss the advantages and disadvantages of shared memory machines with monitors, semaphores and fetch-and-add, compared to distributed memory machines using message passing, answering questions such as “is one model ‘easier’ to program than the other?” and “which is ‘more efficient’?”. One may think that a shared memory machine with monitors, semaphores and fetch-and-add is simpler to program and runs faster than a distributed memory machine using message passing, but we shall see that this is not necessarily the case. Finally I briefly discuss which type of operating system to use and on which type of computer. This of course depends on the algorithm one wishes to compute.
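The two programming models being compared can be put side by side in a toy reduction. This sketch uses Python threads only as a stand-in: the lock-protected counter mimics the monitor/semaphore style of a shared memory machine, while the queue mimics message passing, where each worker owns its data and "sends" a partial result.

```python
# Toy contrast of the two models on the same task: summing a list with p workers.

import threading
import queue

def shared_memory_sum(values, p=4):
    """Shared-memory style: workers update one counter inside a critical section."""
    total = [0]
    lock = threading.Lock()
    def worker(chunk):
        s = sum(chunk)
        with lock:                     # monitor/semaphore-style mutual exclusion
            total[0] += s
    threads = [threading.Thread(target=worker, args=(values[i::p],))
               for i in range(p)]
    for t in threads: t.start()
    for t in threads: t.join()
    return total[0]

def message_passing_sum(values, p=4):
    """Message-passing style: workers own their data and send partial sums."""
    inbox = queue.Queue()              # stands in for the interconnect
    def worker(chunk):
        inbox.put(sum(chunk))          # "send" the local partial sum
    threads = [threading.Thread(target=worker, args=(values[i::p],))
               for i in range(p)]
    for t in threads: t.start()
    for t in threads: t.join()
    return sum(inbox.get() for _ in range(p))

data = list(range(100))
print(shared_memory_sum(data), message_passing_sum(data))  # 4950 4950
```

Both versions compute the same answer; the difference the paper examines is which style is easier to reason about and which maps more efficiently onto the underlying hardware.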
ISBN (print): 9780889866386
Distributed object-oriented middleware technologies have been adopted for ubiquitous communicating real-time and embedded systems. Although Java plays an important role in building distributed object-oriented middleware because of its portability and productivity, it is not popular among real-time system developers because it is slow and hard to keep real-time guarantees due to garbage collection. This paper presents a novel memory classification and measurement approach for modern programming languages such as Java to analyze the memory utilization of middleware technologies, in order to improve them for embedded systems with limited computing resources. We classified Java memory into four categories (static, quasi-static, quasi-dynamic and dynamic) and show how to measure the size of each. Intensive case studies using popular Java middleware technologies such as Web Services, CORBA, RMI and HORB revealed that reducing the use of dynamic memory has contributed far more to efficiency and performance than reducing the use of quasi-static and quasi-dynamic memory.
ISBN (print): 0769515851
The notion of invariant consistency was proposed, allowing programmers to specify inter-process ordering requirements. Data consistency protocols provided a consistent view of the shared memory in the presence of multiple copies, implemented in message passing systems. The combination of invariant consistency and sequential consistency, known as InvSC consistency, was studied, and a systematic way to modify the Lazy Cache algorithm for implementing InvSC consistency was proposed.
ISBN (print): 9781538668719
The von Neumann architecture has been the dominant computing paradigm ever since its inception in the mid-forties. It revolves around the concept of a "stored program" in memory, and a central processing unit that executes the program. As an alternative, Processing-In-Memory (PIM) ideas have been around for at least two decades, however with very limited adoption. Today, three trends are creating a compelling motivation to take a second look. Novel devices such as the memristor blur the boundary between memory and compute, effectively providing both in the same element. Power efficiency has become very important, both in the datacenter and at the edge. Machine learning applications driven by a data-flow model have become ubiquitous. In this paper, we sketch our Computing-In-Memory (CIM) vision, and its substantial performance and power improvement potential. Compared to PIM models, CIM more clearly separates computing from memory. We then discuss the programming model, which we consider the biggest challenge. We close by describing how CIM impacts non-functional characteristics, such as reliability, scale, and configurability.
ISBN (digital): 9781665497862
ISBN (print): 9781665497862
As data scales continue to increase, studying the porting and implementation of shared memory parallel algorithms for distributed-memory architectures becomes increasingly important. We consider the problem of biconnectivity for this current study, which identifies cut vertices and cut edges in a graph. As part of our study, we implemented and optimized a shared memory biconnectivity algorithm based on color propagation within a distributed-memory context. This algorithm is neither work- nor time-efficient. However, when we compare to distributed implementations of theoretically efficient algorithms, we find that simple non-optimal algorithms can greatly outperform time-efficient algorithms in practice when implemented for real distributed-memory environments and real data. Overall, our distributed implementation for computing graph biconnectivity demonstrates an average strong scaling speedup of 15x across 64 MPI ranks on a suite of irregular real-world inputs. We also note an average of 11x and 7.3x speedup relative to the optimal serial algorithm and the fastest shared-memory implementation for the biconnectivity problem, respectively.
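For reference, the optimal serial baseline mentioned above is the classic DFS low-link method. Below is a minimal sketch of its cut-vertex half (Hopcroft-Tarjan articulation points) for simple undirected graphs; it is an illustration of the baseline, not the distributed color-propagation algorithm the paper implements.

```python
# Classic serial articulation-point (cut-vertex) finder via DFS low-links.
# A non-root vertex u is a cut vertex if some DFS child v has low[v] >= disc[u];
# the root is a cut vertex if it has more than one DFS child.

def cut_vertices(adj):
    n = len(adj)
    disc = [-1] * n          # DFS discovery times
    low = [0] * n            # lowest discovery time reachable via one back edge
    cuts = set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:              # skip the tree edge back (simple graph)
                continue
            if disc[v] != -1:            # back edge to an ancestor
                low[u] = min(low[u], disc[v])
            else:                        # tree edge
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent != -1 and low[v] >= disc[u]:
                    cuts.add(u)
        if parent == -1 and children > 1:
            cuts.add(u)

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return cuts

# Two triangles joined at vertex 2: vertex 2 is the only cut vertex.
adj = [[1, 2], [0, 2], [0, 1, 3, 4], [2, 4], [2, 3]]
print(cut_vertices(adj))  # {2}
```

This baseline is inherently sequential (it walks one DFS tree), which is why shared-memory and distributed implementations turn to alternatives such as color propagation despite their theoretical inefficiency.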