ISBN:
(Print) 9783642297403; 9783642297397
Although the use of virtual environments provided by cloud computing infrastructures is gaining consensus from the scientific community, running applications in these environments is still far from reaching the maturity of more usual computing facilities such as clusters or grids. Indeed, current solutions for managing virtual environments are mostly based on centralized approaches that barter large-scale concerns such as scalability, reliability and reactivity for simplicity. However, considering current trends in cloud infrastructures in terms of size (larger and larger) and usage (cross-federation), these large-scale concerns must be addressed as soon as possible to efficiently manage the next generation of cloud computing platforms. In this work, we propose to investigate an alternative approach leveraging distributed and COoperative mechanisms to manage Virtual EnviRonments autonomicallY (DISCOVERY). This initiative aims at overcoming the main limitations of traditional server-centric solutions while integrating all mandatory mechanisms into a unified distributed framework. The system we propose to implement relies on a peer-to-peer model in which each agent can efficiently deploy, dynamically schedule and periodically checkpoint the virtual environments it manages. The article introduces the global design of the DISCOVERY proposal and gives a preliminary description of its internals.
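To make the peer-to-peer model concrete, the following Python sketch mimics what a DISCOVERY-style agent would be responsible for: deploying a virtual environment on the least-loaded peer, migrating one when the local node is overloaded, and checkpointing periodically. The class and method names (PeerAgent, deploy, schedule, checkpoint) and the load model are illustrative assumptions, not the project's actual interfaces.

    # Illustrative sketch (not the DISCOVERY implementation): a peer agent that
    # deploys, schedules and checkpoints the virtual environments it manages,
    # cooperating with neighbours instead of relying on a central server.
    import random
    import time

    class VirtualEnvironment:
        def __init__(self, name, load):
            self.name = name
            self.load = load              # abstract resource demand
            self.last_checkpoint = None

    class PeerAgent:
        def __init__(self, node_id, capacity):
            self.node_id = node_id
            self.capacity = capacity
            self.neighbours = []          # other PeerAgent instances (P2P overlay)
            self.hosted = []              # virtual environments managed locally

        def used(self):
            return sum(ve.load for ve in self.hosted)

        def deploy(self, ve, ttl=3):
            """Host the VE locally if capacity allows, otherwise forward it to
            the least-loaded neighbour (ttl guards against endless forwarding)."""
            if self.used() + ve.load <= self.capacity:
                self.hosted.append(ve)
                return self.node_id
            if ttl == 0 or not self.neighbours:
                return None
            target = min(self.neighbours, key=PeerAgent.used)
            return target.deploy(ve, ttl - 1)

        def schedule(self):
            """Cooperative scheduling step: migrate one VE to a neighbour
            when the local node is overloaded."""
            if self.used() > self.capacity and self.hosted and self.neighbours:
                ve = max(self.hosted, key=lambda v: v.load)
                target = min(self.neighbours, key=PeerAgent.used)
                if target.used() + ve.load <= target.capacity:
                    self.hosted.remove(ve)
                    target.hosted.append(ve)

        def checkpoint(self):
            """Periodic checkpoint of every hosted virtual environment."""
            for ve in self.hosted:
                ve.last_checkpoint = time.time()

    if __name__ == "__main__":
        agents = [PeerAgent(i, capacity=10) for i in range(4)]
        for a in agents:
            a.neighbours = [b for b in agents if b is not a]
        for i in range(8):
            agents[0].deploy(VirtualEnvironment(f"ve-{i}", random.randint(1, 5)))
        for a in agents:
            a.schedule()
            a.checkpoint()
        print({a.node_id: [ve.name for ve in a.hosted] for a in agents})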
Writing parallel programs that can take advantage of non-dedicated processors is much more difficult than writing such programs for networks of dedicated processors. In a non-dedicated environment such programs must u...
This report presents empirical results of fine-grain communication on the 80-processor EM-X distributed-memory multiprocessor. EM-X has hardware support for low-latency, high-throughput fine-grain communication: packet generation is integrated into the instruction execution pipeline for single-cycle communication overhead, remote references use direct memory access, and rapid context switching provides latency tolerance. We study the fine-grain communication performance of integer radix sort, a code with irregular communication, on EM-X, and compare it to the Fujitsu AP1000+ and the Cray Superserver CS6400. Our experimental results indicate that EM-X achieves high throughput and low overhead for fine-grain communication. Whereas EM-X's communication performance scales perfectly as we increase the number of processors, the other, coarse-grain message-passing machines exhibit fluctuation and performance degradation for larger configurations due to network contention.
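As a rough illustration of the communication pattern this benchmark exercises, the Python sketch below models one pass of a distributed integer radix sort in which every key is forwarded as an individual message to the processor owning its digit range; on EM-X such per-key transfers map to hardware-generated fine-grain packets, whereas the sketch simply simulates them with lists. The partitioning scheme and constants are assumptions made for illustration.

    # Minimal sketch of the communication pattern in a distributed integer radix
    # sort pass (illustrative only; EM-X performs this with hardware-supported
    # fine-grain messages rather than Python lists).
    RADIX_BITS = 8
    RADIX = 1 << RADIX_BITS

    def radix_pass(partitions, shift):
        """One pass: every key is sent as an individual fine-grain message to
        the processor that owns its digit range; order is kept stable."""
        n_procs = len(partitions)
        inboxes = [[] for _ in range(n_procs)]
        for keys in partitions:                     # each source processor
            for key in keys:                        # one message per key
                digit = (key >> shift) & (RADIX - 1)
                owner = digit * n_procs // RADIX    # block distribution of digits
                inboxes[owner].append(key)          # "remote write" of the key
        # each destination stably orders its received keys by the current digit
        return [sorted(box, key=lambda k: (k >> shift) & (RADIX - 1))
                for box in inboxes]

    def radix_sort(partitions, key_bits=32):
        for shift in range(0, key_bits, RADIX_BITS):
            partitions = radix_pass(partitions, shift)
        return [k for part in partitions for k in part]

    if __name__ == "__main__":
        import random
        data = [random.randrange(1 << 16) for _ in range(64)]
        parts = [data[i::4] for i in range(4)]      # 4 simulated processors
        assert radix_sort(parts, key_bits=16) == sorted(data)
        print("sorted", len(data), "keys across 4 simulated processors")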
DCT/IDCT-based source coding and decoding techniques are widely accepted in HDTV systems and other MPEG-based multimedia applications. In this paper, we propose a new direct 2-D DCT algorithm based on a parallel divide-and-conquer approach for real-time computation. The algorithm distributes computation by considering one time-domain coefficient at a time, performing a partial computation and updating the result as each coefficient arrives. A novel parallel and fully pipelined architecture with an effective processing time of one cycle per pixel for an N×N block is designed to implement the algorithm. A unique feature of this architecture is that it integrates shuffling and source coding into a single compact data path, so no FIFO needs to be inserted between the motion estimator and the compression engine. The entire block of frequency coefficients is sampled in a single cycle for statistical encoding after compression. Moreover, the design uses only N² multipliers and N² adders.
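The per-pixel update scheme can be summarized with a small scalar model: as each sample f(x, y) arrives, every accumulator F(u, v) receives one multiply-accumulate, and the proposed architecture performs those N² updates in parallel to reach one cycle per pixel. The Python sketch below is an assumed software rendering of that data flow, not the paper's hardware design.

    # Illustrative scalar model of the "one coefficient at a time" 2-D DCT
    # accumulation (assumed notation, not the paper's architecture): as each
    # pixel f(x, y) arrives, all N*N frequency accumulators F(u, v) receive a
    # partial update. The hardware performs these N^2 multiply-accumulates in
    # parallel, giving an effective rate of one cycle per pixel.
    import math

    def dct_basis(n, k, x):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        return scale * math.cos((2 * x + 1) * k * math.pi / (2 * n))

    def incremental_dct2(block):
        n = len(block)
        F = [[0.0] * n for _ in range(n)]
        for x in range(n):                 # pixels arrive one per "cycle"
            for y in range(n):
                sample = block[x][y]
                for u in range(n):         # parallel in hardware, serial here
                    for v in range(n):
                        F[u][v] += sample * dct_basis(n, u, x) * dct_basis(n, v, y)
        return F

    if __name__ == "__main__":
        block = [[(x + y) % 8 for y in range(8)] for x in range(8)]
        F = incremental_dct2(block)
        print("DC coefficient:", round(F[0][0], 3))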
ISBN:
(Print) 3540290672
The proceedings contain 128 papers. The topics discussed include: fast and reliable random number generators for scientific computing; large-scale computations with the unified Danish Eulerian model; a chemical engineering challenge problem that can benefit from interval methods; interval-based Markov decision processes; a verification method for solutions of linear programming problems; on the approximation of interval functions; the distributed interval geometric machine model; new algorithms for statistical analysis of interval data; on efficiency of tightening bounds in interval global optimization; applying software testing metrics to LAPACK; parallel algorithms for balanced truncation model reduction of sparse systems; and applying high performance computing techniques in astrophysics.
ISBN:
(Print) 9783642297373; 9783642297366
One approach to fully exploiting the potential of Cloud technologies consists in leveraging the Autonomic Computing paradigm. It can be exploited to put in place reconfiguration strategies spanning the whole protocol stack, starting from the infrastructure and going up to platform- and application-level protocols. On the other hand, the very basis for the design and development of Cloud-oriented Autonomic Managers is represented by monitoring sub-systems able to provide audit data related to any layer within the stack. In this article we present the approach taken in designing and implementing the monitoring sub-system for the Cloud-TM FP7 project, which aims at realizing a self-adapting, Cloud-based middleware platform providing transactional data access to generic customer applications.
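To illustrate the role such a monitoring sub-system plays, the hypothetical Python sketch below collects metrics tagged with the stack layer they originate from and lets a toy autonomic manager base a reconfiguration decision on them. The class names, metrics and thresholds are invented for illustration and are not the Cloud-TM API.

    # Hypothetical sketch (not the Cloud-TM API) of a monitoring sub-system that
    # stores audit data per stack layer so an autonomic manager can query any
    # layer when deciding on a reconfiguration.
    import time
    from collections import defaultdict

    class MonitoringRegistry:
        def __init__(self):
            self._samples = defaultdict(list)   # (layer, metric) -> [(ts, value)]

        def report(self, layer, metric, value):
            """Called by probes at the infrastructure, platform or application layer."""
            self._samples[(layer, metric)].append((time.time(), value))

        def latest(self, layer, metric):
            series = self._samples.get((layer, metric))
            return series[-1][1] if series else None

    class AutonomicManager:
        def __init__(self, registry, cpu_threshold=0.8):
            self.registry = registry
            self.cpu_threshold = cpu_threshold

        def decide(self):
            """Toy reconfiguration rule driven purely by monitoring data."""
            cpu = self.registry.latest("infrastructure", "cpu_utilization")
            aborts = self.registry.latest("platform", "tx_abort_rate")
            if cpu is not None and cpu > self.cpu_threshold:
                return "scale-out"
            if aborts is not None and aborts > 0.2:
                return "switch-replication-protocol"
            return "no-op"

    if __name__ == "__main__":
        registry = MonitoringRegistry()
        registry.report("infrastructure", "cpu_utilization", 0.92)
        registry.report("platform", "tx_abort_rate", 0.05)
        print(AutonomicManager(registry).decide())   # -> scale-out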
ISBN:
(Print) 9781509004263
The proceedings contain 26 papers. The topics discussed include: parallel Lepp-bisection algorithm over distributed memory systems; analysis of cost-aware policies for intersection caching in search nodes; an overview of syntactic and semantic issues in software product lines; a conceptual framework to develop mobile recommender systems of points of interest; exploring the trust and knowledge obsolescence relation; toward a more generalized quantum-inspired evolutionary algorithm for combinatorial optimization problems; wind speed forecast under a distributed learning approach; new version of Davies-Bouldin index for clustering validation based on cylindrical distance; combining techniques to find the number of bins for discretization; anomaly detection using forecasting methods ARIMA and HWDS; a descriptor for handwritten strokes off-line analysis: a preliminary study; OVMMSOM: a variation of MMSOM and VMSOM as a clusterization technique; digital image processing by a parallel selective model; combining descriptors obtained through different sampling techniques in image retrieval; network anomaly detection by IP flow graph analysis: a DDoS attack case study; an approach based on language ontology and serious play methodologies to improve the participation and validation of enterprise architecture structural models; method for applying process mining to the distribution of non-alcoholic beverages; and a feasibility analysis for the application of design patterns in search based product line design.
ISBN:
(Print) 9781450371964
Performance prediction has always been important in the domain of parallel computing. For programs executed on workstation clusters and supercomputing systems, precise prediction of execution time can help task scheduling and resource management. A practical and effective type of prediction method is the skeleton-based method: it extracts an executable code snippet, called a skeleton, from the traces of program executions, and uses the skeleton to replay the behavior and predict the performance of the original program. However, traditional skeleton-based methods require fixed inputs to construct reliable skeletons, which limits their application scope. In this paper, we present a novel method to construct skeletons for parallel programs. Our method combines code instrumentation and machine learning techniques, which enables skeletons to respond dynamically to varying inputs and make the corresponding performance predictions. In our evaluations on three benchmarks, MCB, LULESH and STREAM, the proposed method achieves average prediction error rates of 27%, 7% and 9%, respectively.
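The combination of instrumentation and learning can be pictured with a deliberately simplified sketch: instrumented runs record how an input parameter maps to the loop trip count a skeleton must replay, a small regression model is fitted to those observations, and the skeleton is then sized from the model for an unseen input. The Python code below is an assumed illustration of that idea, not the authors' implementation; all numbers are invented.

    # Hedged illustration (not the paper's method): instrumented runs relate an
    # input parameter to the work a skeleton must replay; a learned model then
    # lets the skeleton adapt to unseen inputs before replaying.
    import time

    def fit_linear(xs, ys):
        """Ordinary least squares for y ~ a*x + b, written out by hand."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
        return a, my - a * mx

    def skeleton(iterations):
        """Executable snippet standing in for the original program's hot loop."""
        acc = 0.0
        for i in range(int(iterations)):
            acc += (i % 7) * 0.5
        return acc

    if __name__ == "__main__":
        # Instrumented training runs: (input size, observed loop trip count).
        train = [(1_000, 51_000), (2_000, 101_000), (4_000, 201_000)]
        a, b = fit_linear([x for x, _ in train], [y for _, y in train])

        # Prediction for an unseen input: the model sizes the skeleton, and the
        # skeleton's replay time serves as the performance estimate.
        new_input = 3_000
        predicted_iters = a * new_input + b
        start = time.perf_counter()
        skeleton(predicted_iters)
        print(f"predicted trip count: {predicted_iters:.0f}, "
              f"replay time: {time.perf_counter() - start:.4f}s")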
ISBN:
(Print) 9781509029914
Due to the flexibility of its data operations and the scalability of its in-memory cache, Spark has shown the potential to replace Hadoop as the standard distributed framework for data-intensive processing in both industry and academia. However, we observe that the built-in scheduling algorithms in Spark (i.e., FIFO and FAIR) are not optimized for applications with multiple parallel and independent branches in their stages. Specifically, a child stage needs to wait for and collect data from all of its parent branches, but this wait has no guaranteed upper bound since it is tightly coupled with each branch's workload characteristics, stage order, and the corresponding allocated computing resources. To address this challenge, we investigate a solution that ensures all branches acquire resources suited to their workload demands, so that the finish times of the branches are as close as possible. Based on this, we propose a novel scheduling policy, named AutoPath, which effectively reduces the overall makespan of such applications by detecting and leveraging parallel paths and by adaptively assigning computing resources based on workload demands estimated at runtime. We implemented the new scheduling scheme in Spark v1.5.0 and evaluated it with selected representative workloads. The experiments demonstrate that our new scheduler effectively reduces the makespan and improves resource utilization for these applications compared to the current FIFO and FAIR schedulers.
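The core of such a policy can be illustrated by the allocation rule alone: if each parallel branch receives executors in proportion to its estimated remaining work, the branches' estimated finish times coincide. The Python sketch below implements that proportional split with invented numbers; it is a simplified stand-in, not Spark's or AutoPath's actual scheduler code.

    # Simplified sketch of the resource-assignment idea behind a policy like
    # AutoPath (names and formula are illustrative, not Spark internals): give
    # each independent branch a share of executors proportional to its estimated
    # remaining work so that estimated finish times line up.
    def assign_executors(branch_demands, total_executors):
        """branch_demands: estimated remaining work per branch (arbitrary units).
        Returns an executor count per branch; every branch gets at least one."""
        total_demand = sum(branch_demands)
        shares = [max(1, round(total_executors * d / total_demand))
                  for d in branch_demands]
        # Repair rounding so the shares sum exactly to the available executors.
        while sum(shares) > total_executors:
            shares[shares.index(max(shares))] -= 1
        while sum(shares) < total_executors:
            shares[shares.index(min(shares))] += 1
        return shares

    if __name__ == "__main__":
        demands = [120.0, 60.0, 20.0]        # three parallel parent branches
        shares = assign_executors(demands, total_executors=10)
        finish = [d / s for d, s in zip(demands, shares)]
        print("executors per branch:", shares)              # e.g. [6, 3, 1]
        print("estimated finish times:", [round(t, 1) for t in finish])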
A single-chip video codec with an embedded display controller for videotelephony applications is described. It simultaneously encodes and decodes up to 30 CIF pictures per second according to the video-conferencing Recommendations H.261, H.263 (all five options), and H.263+ (six additional options). The die area is 132 mm² in a 0.35-μm technology, and the power consumption is 1.4 W. The chip uses a distributed dedicated multiprocessor architecture, where computation-intensive functions are performed by dedicated hardware, and picture-quality-related or standard-dependent parts are executed in software on dedicated programmable processors. The main architectural choices are discussed, with emphasis on hardware/software partitioning and codesign.