ISBN: 9780769538372 (print)
The proceedings contain 35 papers. The topics discussed include: the Nornir run-time system for parallel programs using Kahn process networks; network performance of pruned hierarchical Torus network; super-peer availability prediction strategy in unstructured P2P network; VirtualNet: mapping distributed communication on a single node; DPM: a demand-driven virtual disk prefetch mechanism for mobile personal computing environments; explorations of honeycomb topologies for network-on-chip; adaptive energy-efficient packet transmission for voice delivering in wireless sensor networks; sleep scheduling and gradient query in sensor networks for target monitoring; re-exploring the potential of using tree structure in P2P live streaming networks; reliable downloading algorithms for BitTorrent-like systems; a theoretical model of lock-keeper data exchange and its practical verification; and system level speedup oriented cache partitioning for multi-programmed systems.
Malicious traffic constantly threatens the security and stability of networks. However, most existing identification methods based on traditional machine learning and deep learning leverage statistical features of an ...
Auto-optimization of tensor programs is a crucial technique in deep learning compilers. Traditional tensor program optimization methods rely on the known fixed shape of the tensor, but when inputs of tensor program ar...
The Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) algorithm is an effective algorithm for solving large-scale sparse eigenvalue problems in various scientific and engineering applications. Based on ...
ISBN: 9781424449903 (print)
Correctness and performance are the principal requirements of a parallel system. Because such systems are complicated and uncertain, it is necessary to model them. The hierarchical TCPN model proposed in this paper supports investigation at various levels of abstraction and analysis of performance, functional validity and correctness. It describes the parallel program and the resources separately, so that changes in the running environment require fewer modifications to the program structure.
ISBN: 9781424449903 (print)
Emergent computation is a relatively new approach for understanding the behaviors of complex systems. Central to this approach is the idea that system-level behavior emerges from interaction among individual elements. This paper proposes a network emergent computation model based on cellular automata, which introduces network operating mechanisms, such as store-and-forward, adjacency interaction, rate adjustment, resource competition and delayed feedback, as the interactions among individual cells. The simulation results show that system-level behaviors, such as power-law, self-similarity and 1/f(r) noise, can be generated spontaneously by local nonlinear interaction among cells, which leads to a better understanding of the macro-behavior of the system from the perspective of microscopic mechanisms.
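The idea of cells interacting only with their neighbors can be illustrated with a toy model. The sketch below is not the paper's model: the ring layout, the per-cell buffer `capacity`, the random choice of neighbor, and the function names `step` and `simulate` are all assumptions made for illustration. It covers only store-and-forward, adjacency interaction and resource competition; rate adjustment and delayed feedback are left out.

```python
import random


def step(queues, capacity, rng):
    """One synchronous update of a ring of cells, each holding a packet count."""
    n = len(queues)
    # Adjacency interaction: every non-empty cell tries to forward one packet
    # to a randomly chosen neighbor on the ring (an assumption of this sketch).
    wants = {}
    for i, q in enumerate(queues):
        if q > 0:
            target = (i + rng.choice((-1, 1))) % n
            wants.setdefault(target, []).append(i)

    new = list(queues)
    for target, senders in wants.items():
        # Resource competition: senders compete for the free buffer slots
        # the target had at the start of the step.
        free = max(capacity - queues[target], 0)
        rng.shuffle(senders)
        for s in senders[:free]:
            # Store-and-forward: the packet leaves one queue and joins the next.
            new[s] -= 1
            new[target] += 1
    return new


def simulate(n_cells=16, capacity=4, steps=100, seed=0):
    """Run the toy model and return the list of queue states over time."""
    rng = random.Random(seed)
    queues = [rng.randint(0, capacity) for _ in range(n_cells)]
    history = [queues]
    for _ in range(steps):
        queues = step(queues, capacity, rng)
        history.append(queues)
    return history
```

Packets are neither created nor destroyed here, so the total count stays constant; a model meant to reproduce the reported power-law or self-similar behavior would need traffic sources and the remaining mechanisms from the abstract.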
ISBN: 9781424449903 (print)
The complexity of an interconnection network often determines the size of the parallel computer, and thus the attainable performance of a parallel computer is limited by the characteristics of the interconnection network. The pruning technique reduces the complexity and hence increases the performance. In this paper, we apply the pruning technique to the Hierarchical Torus Network (HTN) and study the architectural details of the pruned HTN. We have explored the network diameter, average distance, bisection width, peak number of vertical links, and VLSI layout area of different HTNs. It is shown that the pruned HTN possesses several attractive features, including small diameter, small average distance, a small number of wires, a particularly small number of vertical links, and an economic layout area, as compared to its non-pruned counterpart.
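Two of the metrics named above, network diameter and average distance, can be computed for any topology by breadth-first search. The sketch below is a minimal illustration on a plain k x k 2D torus, not on the HTN or its pruned variant, whose link structure the abstract does not give. The function names are hypothetical. Because the metric function takes a generic adjacency map, a reader could remove links from the map to experiment with a pruned topology.

```python
from collections import deque


def torus_adjacency(k):
    """Adjacency map of a k x k 2D torus (wrap-around links in both dimensions)."""
    adj = {}
    for x in range(k):
        for y in range(k):
            adj[(x, y)] = {
                ((x + 1) % k, y), ((x - 1) % k, y),
                (x, (y + 1) % k), (x, (y - 1) % k),
            }
    return adj


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter_and_average_distance(adj):
    """Diameter and mean hop count over all ordered pairs of distinct nodes.

    Assumes the graph is connected.
    """
    diameter = 0
    total = 0
    for source in adj:
        dist = bfs_distances(adj, source)
        diameter = max(diameter, max(dist.values()))
        total += sum(dist.values())
    pairs = len(adj) * (len(adj) - 1)
    return diameter, total / pairs
```

For a 4 x 4 torus this gives a diameter of 4 and an average distance of 32/15, roughly 2.13.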
Shared-memory concurrency is the prevalent paradigm used for developing parallel applications targeted towards small- and middle-sized machines, but experience has shown that it is hard to use. This is largely caused ...
ISBN: 9781424449903 (print)
Rectangular mesh and torus are the most widely used topologies in network-on-chip (NoC) based systems. In this paper, we quantitatively illustrate that the honeycomb topology is an advantageous design alternative in terms of network cost, which is one of the most important parameters reflecting both network performance and implementation cost. Compared with the rectangular mesh and torus, honeycomb mesh and torus topologies lead to a 40% decrease in network cost. We then explore the NoC-related topological properties of both honeycomb mesh and torus topologies. By transforming the honeycomb topologies into rectangular brick shapes, we demonstrate that the honeycomb topologies can feasibly be implemented with rectangular devices. We also propose a 3D honeycomb topology, since 3D IC has become an emerging and promising technique. Another contribution of this paper is the proposal of deadlock-free routing algorithms. Based on either the turn model or the logical network concept, deadlock-free routing can be achieved for all the discussed honeycomb topologies.
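The roughly 40% figure can be checked with a short calculation. The abstract does not define network cost, so the sketch below assumes the common definition, node degree multiplied by network diameter. It also assumes the leading-order diameter figures usually quoted for n-node networks: a honeycomb mesh of size t has 6t^2 nodes and diameter 4t - 1, and a honeycomb torus of the same size has diameter 2t. The table and function names are hypothetical.

```python
import math

# Assumed (degree, c) per topology, where diameter is approximately c * sqrt(n).
# The honeycomb coefficients follow from n = 6 t^2 with diameters ~4t and 2t.
TOPOLOGIES = {
    "rectangular mesh": (4, 2.0),
    "rectangular torus": (4, 1.0),
    "honeycomb mesh": (3, 4 / math.sqrt(6)),
    "honeycomb torus": (3, 2 / math.sqrt(6)),
}


def network_cost(name, n):
    """Leading-order network cost (degree x diameter) for an n-node network."""
    degree, coeff = TOPOLOGIES[name]
    return degree * coeff * math.sqrt(n)


def cost_reduction(honeycomb, rectangular, n):
    """Fractional decrease in network cost when moving to the honeycomb variant."""
    return 1 - network_cost(honeycomb, n) / network_cost(rectangular, n)
```

Under these assumptions the reduction is about 38.8% for both the mesh and the torus pairing, independent of n, which is consistent with the approximately 40% stated in the abstract.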
ISBN: 9781424449903 (print)
In this paper, we propose DPM, a demand-driven virtual disk (VD) prefetch mechanism, to improve the performance of a virtual machine (VM) at the destination site in mobile personal computing environments. DPM uses an optimized COW (Copy-on-Write) virtual block device to split the traditional one-piece, large VD image into multiple small SVDs (Software Virtual Disks), with a single piece of software as the basic granularity. Based on this fine-grained VD splitting, DPM uses an access frequency and priority-based prefetch target identifying (APTI) algorithm to identify, in real time at the destination site, the SVDs of the software the user is currently using, and prefetches those SVDs in the background using a P2P transport mechanism. We have built a prototype of DPM on the Xen virtual machine monitor (VMM). Experiments on the prototype show that DPM can effectively improve VM performance at an unexpected destination site without any cached VD state, supporting agile mobility of personal computing environments.
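The abstract names access frequency and priority as the inputs for choosing prefetch targets but does not say how APTI combines them. The sketch below is therefore only one possible reading, not the APTI algorithm itself: the `SVD` record, the `min_accesses` threshold, the `limit` parameter, and the ordering (priority first, then access count) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SVD:
    name: str
    priority: int  # hypothetical field: a larger value means more important


def pick_prefetch_targets(access_log, svds, min_accesses=1, limit=2):
    """Choose which SVDs to prefetch from a log of recently accessed SVD names.

    Assumed policy: keep SVDs seen at least min_accesses times, then order
    them by priority, breaking ties by access count and finally by name.
    """
    counts = Counter(access_log)
    candidates = [s for s in svds if counts[s.name] >= min_accesses]
    candidates.sort(key=lambda s: (-s.priority, -counts[s.name], s.name))
    return [s.name for s in candidates[:limit]]
```

With SVDs `editor` (priority 1), `browser` (2), `mail` (2) and `games` (3), and an access log of three `editor`, two `browser` and one `mail` accesses, the function selects `browser` and `mail`: `games` is never accessed, and `editor` loses on priority despite being accessed most often.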