ISBN: 9781457706783 (print)
The rapid development of web service technology raises a number of crucial requirements for the design of a service computing runtime, such as supporting multiple message exchange patterns, switching among different transports, integrating various extended web service protocols, and achieving robust performance under high concurrency. Based on the staged event-driven architecture (SEDA), we propose a novel architecture for an adaptive, web-service-centric service computing runtime, named SEDA4SC. In SEDA4SC, the processing of basic and extended web service protocols is divided into four primary event-driven stages to enable system independence and module isolation. Moreover, this architecture allows messages to be handled in two independent pipelines: the input pipeline and the output pipeline. Arbitrary message exchange patterns can be supported through a combination of the two pipelines. With SEDA4SC, we design and implement a service computing runtime system. The performance evaluation results show that our system exhibits robust performance under high concurrency.
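The staged, event-driven structure described in this abstract can be sketched as follows. This is a minimal illustration, not the SEDA4SC implementation: the abstract does not name its four stages, so the stage names (`transport`, `protocol`, `dispatch`, `collect`), the `Stage` class, and the handlers are all hypothetical. Each stage owns an event queue and one worker thread, and a pipeline is a chain of such stages.

```python
import queue
import threading

STOP = object()  # sentinel event that shuts a stage down


class Stage:
    """One event-driven stage: an event queue drained by a single worker thread."""

    def __init__(self, name, handler, downstream=None):
        self.name = name
        self.handler = handler
        self.downstream = downstream
        self.events = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name)

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()

    def _run(self):
        while True:
            event = self.events.get()
            if event is STOP:
                # Propagate shutdown so downstream stages drain and exit in order.
                if self.downstream is not None:
                    self.downstream.events.put(STOP)
                return
            result = self.handler(event)
            if self.downstream is not None:
                self.downstream.events.put(result)


def build_pipeline(named_handlers):
    """Chain stages so that each stage feeds the next one in the list."""
    stages = []
    downstream = None
    for name, handler in reversed(named_handlers):
        downstream = Stage(name, handler, downstream)
        stages.append(downstream)
    stages.reverse()
    return stages


def run_input_pipeline(messages):
    """Push raw messages through a hypothetical four-stage input pipeline."""
    results = []
    stages = build_pipeline([
        ("transport", lambda raw: raw.strip()),     # hypothetical: receive and trim
        ("protocol", lambda text: text.upper()),    # hypothetical: protocol processing
        ("dispatch", lambda text: {"body": text}),  # hypothetical: build a request
        ("collect", results.append),                # hypothetical: hand off to the service
    ])
    for stage in stages:
        stage.start()
    for message in messages:
        stages[0].events.put(message)
    stages[0].events.put(STOP)
    for stage in stages:
        stage.join()
    return results
```

In this sketch an output pipeline would be a second chain built the same way. A request-response exchange would then run the input chain followed by the output chain, while a one-way exchange would use only one of them.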
ISBN: 9781467303057; 9781467303064 (print)
The solution of large and complex coupled electromechanical problems requires high-performance computing resources. In recent years, the use of Graphics Processing Units (GPUs) has gained increasing popularity in scientific computing because of their low cost and parallel architecture. In this paper the authors report the main results of a GPU approach to the parallelization of a research code for electromagnetic launcher analysis. Programming in a GPU-based environment poses a number of critical issues that have to be carefully addressed in order to fully exploit the potential of the system. Data have to be properly organized to fit the Single Instruction Multiple Data scheme; the data transfer between the host and the device, as well as the memory management of the GPU, deserves careful programming. Two examples of application of the parallelized code are reported to show the performance improvements that can be obtained in the numerical analysis of both rail and induction launchers.
Future high-performance computing will undoubtedly reach Petascale and beyond. Today's HPC is tomorrow's Personal computing. What are the evolving processor architectures towards Multi-core and Many-core for t...
ISBN: 0769522750 (print)
The last decade has seen several changes in the structure and emphasis of enterprise IT systems. Specific infrastructure trends have included the emergence of large consolidated data centers, the adoption of virtualization and modularization, and an increased commoditization of hardware. At the application level, both the workload mix and usage patterns have evolved toward an increased emphasis on service-centric computing and SLA-driven performance tuning. These often dramatic changes in the enterprise IT landscape motivate equivalent changes in the emphasis of architecture research. In this paper, we summarize some recent trends in enterprise IT systems and discuss the implications for architecture research, suggesting some high-level challenges and open questions for the community to address.
ISBN: 9781538637906 (print)
In recent years, the Deep Neural Network (DNN) has been successfully used in image classification. Most existing DNNs need to learn a very large set of parameters, and training these parameters with gradient descent and back-propagation requires a huge amount of computational resources and time. To address this issue, PCANet has been developed for highly efficient design and training of DNNs. Compared with traditional DNNs, PCANet has a simpler structure and better performance, which makes it attractive for hardware design. To overcome the limitations of PCANet and significantly improve its performance, we have proposed a novel model named Constrained High Dispersal Network (CHDNet), which is a variant of PCANet. In this paper, we implement CHDNet on the Xilinx ZYNQ FPGA to ensure real-time operation of the system at lower power than a personal computer requires, by taking advantage of the algorithmic parallelism and the ZYNQ architecture. Our experimental results on two major datasets, the MNIST dataset for handwritten digit recognition and the Extended Yale B dataset for face recognition, demonstrate that our FPGA implementation is more than 15x faster than a software implementation on a PC (Intel i7-4720HQ, 2.6 GHz).
ISBN: 9781538677698 (print)
The growth in data-intensive scientific applications places strong demands on the HPC storage subsystem, as data needs to be copied from compute nodes to I/O nodes and vice versa for jobs to run. The emerging trend of adding denser, NVM-based burst buffers to compute nodes, however, offers the possibility of using these resources to build temporary filesystems with specific I/O optimizations for a batch job. In this work, we present echofs, a temporary filesystem that coordinates with the job scheduler to preload a job's input files into node-local burst buffers. We present results measured with NVM emulation and with different FS backends using DAX/FUSE on a local node, to show the benefits of our proposal and of such coordination.
ISBN: 9781509012336 (print)
Cloud computing allows for elasticity, as users can dynamically benefit from new virtual resources when their workload increases. Such a feature requires highly reactive resource provisioning mechanisms. In this paper, we propose two new workload prediction models, based on constraint programming and neural networks, that can be used for dynamic resource provisioning in Cloud environments. We also present two workload trace generators that can help to extend an experimental dataset in order to test resource optimization heuristics more widely. Our models are validated using real traces from a small Cloud provider. Both approaches are shown to be complementary, as neural networks give better prediction results, while constraint programming is more suitable for trace generation.
ISBN: 0769520464 (print)
A radiative transfer solver that implements the LTSn method was optimized and parallelized using the MPI message passing communication library. Timing and profiling information was obtained for the sequential code in order to identify performance bottlenecks. Performance tests were executed on a distributed-memory parallel machine, a multi-computer based on the IA-32 architecture. The radiative transfer equation was solved for a cloud test case to evaluate the parallel performance of the LTSn method. The LTSn code includes spatial discretization of the domain and Fourier decomposition of the radiances, leading to independent azimuthal modes. This yields an independent radiative transfer equation for each mode, which can be solved by a different processor in a parallel implementation. Speed-up results show that the parallel implementation is suitable for the architecture used.
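Because each azimuthal mode gives an independent equation, the work can be split by mode. The sketch below shows one possible split, a round-robin assignment of modes to processes. The abstract does not say how modes were assigned, so the round-robin rule, the function names, and the placeholder `solve_mode` are all assumptions, and the ranks are simulated sequentially rather than run under MPI.

```python
def modes_for_rank(num_modes, num_procs, rank):
    """Round-robin split: rank r handles modes r, r + P, r + 2P, ... (assumed rule)."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    if not 0 <= rank < num_procs:
        raise ValueError("rank must satisfy 0 <= rank < num_procs")
    return list(range(rank, num_modes, num_procs))


def solve_mode(mode):
    """Placeholder for solving one mode's equation; this is NOT the LTSn method."""
    return float(mode * mode)


def solve_all_modes(num_modes, num_procs):
    """Simulate every rank in turn, then gather the results in mode order."""
    partial = {}
    for rank in range(num_procs):
        for mode in modes_for_rank(num_modes, num_procs, rank):
            partial[mode] = solve_mode(mode)
    return [partial[mode] for mode in range(num_modes)]
```

In a real MPI run, `rank` and `num_procs` would come from the communicator, each process would solve only its own modes, and the per-mode results would be gathered on one process.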
ISBN: 0769520464 (print)
Researchers are constantly looking for ways to improve the execution time of parallel applications on distributed systems. Although compile-time static scheduling heuristics employ complex mechanisms, the quality of their schedules is limited by their reliance on estimated run-time costs. On the other hand, while dynamic schedulers use actual run-time costs, they have to be of low complexity in order to reduce the scheduling overhead. This paper investigates the viability of integrating these two approaches into a hybrid scheduling framework. The relationships between static schedulers, dynamic heuristics, and scheduling events are examined. The results show that a hybrid scheduler can indeed improve the schedules produced by good traditional static list scheduling algorithms.
ISBN: 0769520464 (print)
We consider the implementation of a parallel Monte Carlo code for high-performance simulations on PC clusters with MPI. We carry out tests of speedup and efficiency. The code is used for numerical simulations of pure SU(2) lattice gauge theory at very large lattice volumes, in order to study the infrared behavior of gluon and ghost propagators. This problem is directly related to the confinement of quarks and gluons in the physics of strong interactions.
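The speedup and efficiency mentioned above are conventionally defined as S = T1 / Tp and E = S / p, where T1 is the serial run time, Tp the run time on p processes. A minimal sketch of these standard definitions follows; the function names are illustrative and the timings in the usage note are made-up values, not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T1 / Tp for a run that took t_parallel on p processes."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_procs):
    """Parallel efficiency E = S / p; 1.0 corresponds to ideal linear scaling."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return speedup(t_serial, t_parallel) / num_procs
```

For example, with hypothetical timings of 100 s serially and 25 s on 8 processes, `speedup(100.0, 25.0)` gives 4.0 and `efficiency(100.0, 25.0, 8)` gives 0.5.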