Software development of high-performance graph algorithms is difficult on modern parallel computers. To simplify this task, we have designed and implemented a collection of C++ graph primitives, basic building blocks,...
Next-generation e-Science applications will require the ability to transfer information at high data rates between distributed computing centers and data repositories. To support such applications, lambda grid network...
ISBN: 0818678836 (print)
The paper presents the Abstract Configuration Language (ACL), implemented within the Parallel Objects object-oriented parallel programming environment. ACL defines a set of directives that allow users to specify the allocation needs of their application components without being aware of the architectural details. ACL directives drive the allocation decisions of the run-time support by adapting its general-purpose behaviour to the application's particular allocation needs. The effectiveness of the ACL approach in increasing the performance of parallel applications is confirmed by a testbed application.
Current estimates of mobile data traffic in the years to come foresee a 1,000-fold increase of mobile data traffic in 2020 with respect to 2010, or, equivalently, a doubling of mobile data traffic every year. This unprecedented growth demands a significant increase of wireless network capacity. Even if the current evolution of fourth-generation (4G) systems and, in particular, the advancements of the long-term evolution (LTE) standardization process foresee a significant capacity improvement with respect to third-generation (3G) systems, the European Telecommunications Standards Institute (ETSI) has established a roadmap toward the fifth-generation (5G) system, with the aim of deploying a commercial system by the year 2020 [1]. The European project named "Mobile and Wireless Communications Enablers for the 2020 Information Society" (METIS), launched in 2012, represents one of the first international and large-scale research projects on fifth generation (5G) [2]. In parallel with this unparalleled growth of data traffic, our everyday life experience shows an increasing habit of running a plethora of applications specifically devised for mobile devices (smartphones, tablets, laptops) for entertainment, health care, business, social networking, traveling, news, etc. However, the spectacular growth in wireless traffic generated by this lifestyle is not matched by a parallel improvement in mobile handsets' batteries, whose lifetime is not improving at the same pace [3]. This determines a widening gap between the energy required to run sophisticated applications and the energy available on the mobile handset. A possible way to overcome this obstacle is to enable the mobile devices, whenever possible and convenient, to offload their most energy-consuming tasks to nearby fixed servers. This strategy has been studied for a long time and is reported in the literature under different names, such as cyberforaging [4] or computation offloading [5], [6]. In recent years, a strong impul
Artificial neural networks can solve complex problems such as time series prediction, handwritten pattern recognition or speech processing. Though software simulations are essential when one sets about to study a new ...
ISBN: 0818634421 (print)
The proceedings contain 128 papers. The topics discussed include: C parallelizing compiler on local-network-based computer environment; OCCAM prototyping of massively parallel applications from colored Petri-nets; performance characteristics of the iPSC/860 and CM-2 I/O systems; automatic parallelization of LINPACK routines on distributed memory parallel processors; transformation of doacross loops on distributed memory systems; an efficient atomic multicast protocol for client-server models; a new horizon for sorting on mesh architectures; mapping of uniform dependence algorithm onto fixed size processor arrays; and towards understanding block partitioning for sparse Cholesky factorization.
ISBN: 0818677589 (print)
Distributed computing involves systems that operate across networks transparently, using the resources of multiple machines. The Open Software Foundation's Distributed Computing Environment (DCE) has evolved to address the need for a vendor-neutral platform for which distributed applications can be developed, and upon which they can run. Central to the design philosophy of DCE is its reliance on the Remote Procedure Call (RPC) to facilitate communication among the entities in the distributed environment. Since it profoundly affects the performance of both the DCE environment and applications running on top of it, the performance of RPCs is very much a concern of both application developers and system managers in a DCE installation. This short paper reports some results from an ongoing empirical investigation of the OS/2 DCE RPC facility. Our interest in this project is the effect on end-to-end RPC performance of protocol processing, flow control mechanisms within DCE, other load on the network, and interoperation with multiple DCE platforms.
ISBN: 9781467311595 (print)
Vertex component analysis (VCA) has become a very popular and useful tool to linearly unmix large hyperspectral datasets without the use of any a priori knowledge of the constituent spectra. Although VCA is a fast method, many hyperspectral imagery applications require a response in real time or near-real time. This paper proposes two different optimizations for accelerating the computational performance of VCA: the first focuses on a parallel implementation based on graphics processing units (GPUs) to alleviate the VCA computational burden; the second focuses on the development of a strategy to remove a large proportion of mixed pixels that have no effect on the VCA functioning. Experiments are conducted using simulated and real hyperspectral datasets. The results reveal considerable acceleration factors, which satisfy the real-time constraints given by the data acquisition rate.
Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second...
ISBN: 0818675829 (print)
This paper describes the design and implementation of a solution to the constrained 2-D cutting stock problem on a cluster of workstations. The constrained 2-D cutting stock problem is an irregular problem with a dynamically modified global data set and irregular amounts and patterns of communication. A replicated data structure is used for the parallel solution since the ratio of reads to writes is known to be large. Mutual exclusion and consistency are maintained using a token-based lazy consistency mechanism, and a randomized protocol for dynamically balancing the distributed work queue is employed. Speedups are reported for three benchmark problems executed on a cluster of workstations interconnected by a 10 Mbps Ethernet.