ISBN: 0769516777 (print)
This paper presents a novel approach, called WebGOP, for architecture modeling and programming of web-based distributed applications. WebGOP uses the graph-oriented programming (GOP) model, under which the components of a distributed program are configured as a logical graph and implemented using a set of operations defined over the graph. WebGOP extends the application of GOP to the World Wide Web environment and provides more powerful architectural support. In WebGOP, the architecture graph is reified as an explicit object that is itself distributed over the network, providing a graph-oriented context for the execution of distributed applications. The programmer can specialize the type of a graph to represent a particular architecture style tailored for an application. WebGOP also has built-in support for flexible and dynamic architectures, including both planned and unplanned dynamic reconfiguration of distributed applications. We describe the WebGOP framework, a prototypical implementation of the framework on top of SOAP, and a performance evaluation of the prototype. The evaluation showed that the overhead introduced by WebGOP over SOAP is reasonable and acceptable.
ISBN: 9781728165820 (print)
Context-based Adaptive Binary Arithmetic Coding (CABAC) is the only compute-intensive task in the High Efficiency Video Coding (HEVC) standard that does not contain significant data-level parallelism. As a result, it is often a throughput bottleneck for the overall decoding process, especially for high-quality videos. Consequently, high-level parallelization techniques are needed to reach the throughput requirements for CABAC decoding. Multiple high-level parallelization tools are specified in HEVC, among which Wavefront Parallel Processing (WPP) incurs only small losses in coding efficiency. However, its parallel efficiency suffers because the number of active parallel threads ramps up and down within a frame. This is a serious problem for systems that cannot process multiple frames at the same time due to performance or memory constraints (e.g., mobile devices), and also for low-delay applications such as video conferencing. To address this issue, we present three improved WPP implementations for HEVC CABAC decoding. They differ in the granularity at which dependency checks are performed. The improvement comes from increased parallel efficiency of the WPP implementation while using the same number of threads as conventional WPP. The proposed implementations allow speedups of up to 1.83x with very little implementation overhead.
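The ramp-up and ramp-down effect described above can be illustrated with a small model. The following is only a sketch under simplifying assumptions that are not taken from the paper: one thread per CTU row, a uniform decoding time per CTU, and the standard WPP dependency in which a CTU waits for its left neighbor and for the CTU above and to the right. The function names are hypothetical, and the code models conventional WPP, not the three improved implementations.

```python
def wpp_finish_times(width, height, ctu_time=1):
    """Finish time of every CTU when each CTU row has its own thread."""
    finish = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # A CTU waits for its left neighbor in the same row.
            left = finish[r][c - 1] if c > 0 else 0
            # It also waits for the CTU above and to the right
            # (clamped at the end of the row).
            above = finish[r - 1][min(c + 1, width - 1)] if r > 0 else 0
            finish[r][c] = max(left, above) + ctu_time
    return finish


def wpp_parallel_efficiency(width, height, ctu_time=1):
    """Useful work divided by (threads x makespan), with one thread per row."""
    makespan = wpp_finish_times(width, height, ctu_time)[-1][-1]
    total_work = width * height * ctu_time
    return total_work / (height * makespan)


# A frame of 10 x 4 CTUs: each row starts two CTU times after the row above.
efficiency = wpp_parallel_efficiency(10, 4)
print(efficiency)
```

In this model every additional row delays the end of the frame by two CTU times, so threads sit idle at the start and end of each frame, which is the inefficiency the abstract refers to.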
ISBN: 9781665414555 (print)
Data movement between the memory subsystem and the processor is a crippling performance and energy bottleneck for data-intensive applications. Near Memory Processing (NMP) is a promising solution to alleviate this bottleneck. The introduction of 3D-stacked memories and, more importantly, hybrid memory systems enables the long-sought NMP capability. This work explores the feasibility and efficacy of NMP on a hybrid memory system for a given set of applications. We first redefine a set of NMP-centric performance metrics in order to analyze the efficacy of a given processing unit. Leveraging the proposed metrics, we characterize various sets of applications to assess the suitability of a processing unit in terms of performance. Specifically, we make the case for the efficiency of NMP subsystems in processing memory-intensive applications when 3D-NVM technologies are employed.
ISBN: 1932415262 (print)
In this paper we address a very important issue in parallel rendering systems: reliability. Distributed systems, such as clusters of PCs, are low-cost alternatives for running parallel rendering systems. However, distributed systems are usually not reliable: machines can fail during the rendering process, resulting in incomplete final images. Our goal is therefore to take advantage of specific features of parallel rendering applications, such as tile-based computation, to include mechanisms that dynamically detect machine failure and automatically handle task retrieval, with low overhead and no extra hardware. We developed three different parallel rendering systems, all based on the parallel ZSweep algorithm [5], that provide fault tolerance in different ways. Our experimental results show that the three systems incur a small overhead to detect failures and that, when a failure occurs, the redistribution of work does not degrade system performance. We conclude that it is possible to provide fault tolerance at low cost in a cluster of PCs.
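The general idea of exploiting tile-based computation for recovery can be sketched as a queue of tiles that are handed back when a worker is found to have failed. This is a minimal, hypothetical simulation and not one of the three ZSweep-based systems: the function name, the round-robin assignment, and the way a failure is "detected" (a worker name appearing in `failed_workers`) are all assumptions made for illustration.

```python
from collections import deque


def render_tiles(num_tiles, workers, failed_workers):
    """Assign tiles round-robin and requeue tiles held by failed workers.

    Returns a dict mapping each tile index to the worker that finished it.
    """
    pending = deque(range(num_tiles))
    alive = list(workers)
    done = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("no workers left to render the remaining tiles")
        tile = pending.popleft()
        worker = alive[turn % len(alive)]
        if worker in failed_workers:
            # Failure detected (for example through a missed heartbeat):
            # drop the worker and put its tile back on the queue.
            alive.remove(worker)
            pending.append(tile)
            continue
        done[tile] = worker
        turn += 1
    return done


result = render_tiles(6, ["a", "b", "c"], failed_workers={"b"})
print(result)
```

Because a tile is an independent unit of work, recovering from a failure only requires rendering the lost tiles again on the surviving machines, so the final image is still complete.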
ISBN: 1892512416 (print)
Particle tracking methods are central to a wide spectrum of scientific computing applications. To support such applications, this paper presents a compact software architecture that can be used to interface parallel particle tracking software to computational mesh management systems. We propose the in-element particle tracking framework, which can enable most particle tracking applications and is supported by this software architecture. The use of the parallel software architecture is demonstrated through the implementation of two differential equation solvers, forward Euler and an implicit trapezoidal method, on a distributed unstructured computational mesh. A design goal of this software effort has been to interface to legacy software libraries, such as the Portable Extensible Toolkit for Scientific Computing (PETSc) library and the Scalable Unstructured Mesh Algorithms and Applications (SUMAA3d) library, as well as application codes (e.g., FEMWATER). We discuss how this goal is achieved through a software architecture that specifies a lightweight functional interface while maintaining the functionality required by particle-mesh methods. The utility of the system is demonstrated through its interface with different parallel programming environments written in C and Fortran.
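The two solvers named above can be sketched for a single particle. This is a minimal illustration under assumptions that are not from the paper: the position is a scalar, the velocity field `f` depends only on position, and the implicit trapezoidal equation is solved by fixed-point iteration, which converges only for sufficiently small steps. The function names are hypothetical, and no mesh or parallel decomposition is modeled.

```python
def forward_euler_step(f, x, h):
    """Explicit update: x_next = x + h * f(x)."""
    return x + h * f(x)


def implicit_trapezoidal_step(f, x, h, tol=1e-12, max_iter=100):
    """Solve x_next = x + (h / 2) * (f(x) + f(x_next)) by fixed-point iteration."""
    guess = forward_euler_step(f, x, h)  # explicit predictor as starting guess
    for _ in range(max_iter):
        new_guess = x + 0.5 * h * (f(x) + f(guess))
        if abs(new_guess - guess) < tol:
            return new_guess
        guess = new_guess
    return guess


def velocity(x):
    """Hypothetical velocity field that pulls the particle toward the origin."""
    return -x


euler_x = forward_euler_step(velocity, 1.0, 0.1)
trapezoidal_x = implicit_trapezoidal_step(velocity, 1.0, 0.1)
print(euler_x, trapezoidal_x)
```

For this linear field the trapezoidal step has the closed form x(1 - h/2)/(1 + h/2), which gives a quick check on the iteration.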
ISBN: 1892512416 (print)
With the advance of parallel processors, the Generalized Task System (GTS) has been proposed to capture the notion of parallelism within a single task. Based on this model, minimizing the total (or average) flow time under both the preemptive and non-preemptive scheduling disciplines has been proved to be binary NP-hard, even when there are only two processors and each task has exactly two independent subtasks. In this paper, we propose a preemptive approximation algorithm to minimize the total flow time, and we analyze its worst-case and average performance.
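To make the objective concrete, the following sketch computes the total flow time of a simple non-preemptive list schedule. It is not the approximation algorithm proposed in the paper and it does not model GTS subtasks: it assumes indivisible tasks that are all released at time 0, so a task's flow time equals its completion time. The function name is hypothetical, and shortest-processing-time ordering is used only as a familiar point of comparison.

```python
import heapq


def total_flow_time(durations, processors=2):
    """Sum of completion times when tasks are placed greedily in list order."""
    free_at = [0] * processors  # min-heap of times at which processors are free
    heapq.heapify(free_at)
    total = 0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available processor
        finish = start + duration
        total += finish
        heapq.heappush(free_at, finish)
    return total


tasks = [4, 1, 3, 2]
in_given_order = total_flow_time(tasks)
shortest_first = total_flow_time(sorted(tasks))
print(in_given_order, shortest_first)
```

The order in which tasks are placed changes the total even on two processors, which is the quantity the scheduling problem in the abstract seeks to minimize.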
ISBN: 9781509060580 (print)
This paper presents a framework to easily build and execute parallel applications on container-based distributed computing platforms in a user-transparent way. The proposed framework is a combination of COMP Superscalar and Docker. We have built a prototype and evaluated its overhead in the building, deployment, and execution phases. We observed an important gain compared with cloud environments during the building and deployment phases. In contrast, we detected extra overhead during execution, which is mainly due to multi-host Docker networking.
Image processing is often considered a good candidate for the application of parallel processing because of the large volumes of data and the complex algorithms commonly encountered. This paper presents a tutorial introduction to the field of parallel image processing. After introducing the classes of parallel processing, a brief review of architectures for parallel image processing is presented. Software design for low-level image processing and parallelism in high-level image processing are discussed, and an application of parallel processing to handwritten postcode recognition is described. The paper concludes with a look at future technology and market trends.
ISBN: 9781932415605 (print)
This paper presents the design and implementation of PDBS, a peer-to-peer distributed database system. In PDBS, peers in a peer group have their own local databases, and distributed information in these local databases can be shared through user queries. PDBS lets peer users create queries and update queries and send them to other peers, and it integrates the distributed query results. Because PDBS is a fully decentralized P2P distributed database information sharing application, each node has both server and client functions. The peer server component provides a query service and a schema service, and the peer client component provides a user interface and a client controller. PDBS is implemented on the Java-JXTA P2P platform; it is a complete JXTA application, with the query service and schema service built on top of the underlying JXTA services.
ISBN: 1892512416 (print)
The main objective of this research is to provide a generic and autonomous fault recovery service in a distributed environment. This service detects faults, diagnoses their causes, and provides ways for the system to recover from them. For this purpose we have reviewed mobile agent technology and its advantages. As a result, a mobile agent-based solution is proposed. The fault recovery service has four main components, namely the Fault Detector, Agent Manager, Agents Repository, and Recovery Agents. Each component has a distinct role: detecting faults, managing agent activities, storing all recovery agents, and diagnosing the causes of faults and recovering from them. This paper shows the feasibility of an autonomous, scalable, and flexible fault recovery service for widely distributed environments.