ISBN (Print): 9780769502878
We are developing a system for collaborative research and development for a distributed group of researchers at different institutions around the world. In a new paradigm for collaborative computational science, the computer code and supporting infrastructure itself becomes the collaborating instrument, just as an accelerator becomes the collaborating tool for large numbers of distributed researchers in particle physics. The design of this "Collaboratory" allows many users, with very different areas of expertise, to work coherently together on distributed computers around the world. Different supercomputers may be used separately, or, for problems exceeding the capacity of any single system, multiple supercomputers may be networked together through high-speed gigabit networks. Central to this Collaboratory is a new type of community simulation code, called "Cactus". The scientific driving force behind this project is the simulation of Einstein's equations for studying black holes, gravitational waves, and neutron stars, which has brought together researchers in very different fields from many groups around the world to make advances in the study of relativity and astrophysics. But the system is also being developed to provide scientists and engineers, without expert knowledge of parallel or distributed computing, mesh refinement, and so on, with a simple framework for solving any system of partial differential equations on many parallel computer systems, from traditional supercomputers to networks of workstations.
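The abstract gives no code, but the class of problem Cactus targets can be illustrated with a short sketch. The NumPy example below, an assumed illustration and not the Cactus API, evolves a 1D wave equation with explicit finite differences; a framework like Cactus wraps this kind of evolution routine with parallel drivers, mesh refinement, and I/O:

    import numpy as np

    # Illustrative sketch only: the 1D wave equation u_tt = c^2 u_xx solved
    # with explicit finite differences. All parameters are invented for the
    # example; this standalone loop is not Cactus code.
    c, nx, nt = 1.0, 200, 400
    dx = 1.0 / (nx - 1)
    dt = 0.5 * dx / c                          # CFL-stable time step
    x = np.linspace(0.0, 1.0, nx)
    u_prev = np.exp(-100.0 * (x - 0.5) ** 2)   # initial Gaussian pulse
    u = u_prev.copy()                          # zero initial velocity
    r2 = (c * dt / dx) ** 2
    for _ in range(nt):
        u_next = np.empty_like(u)
        u_next[1:-1] = (2 * u[1:-1] - u_prev[1:-1]
                        + r2 * (u[2:] - 2 * u[1:-1] + u[:-2]))
        u_next[0] = u_next[-1] = 0.0           # fixed (Dirichlet) boundaries
        u_prev, u = u, u_next
    print("max |u| after evolution:", np.abs(u).max())

In a production framework the spatial grid would be decomposed across processors, with ghost-zone exchange at the subdomain boundaries; the serial loop above shows only the numerical kernel.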
ISBN (Digital): 9798331531539
ISBN (Print): 9798331531546
The exponential growth of technological advancements in satellite and airborne remote sensing is giving rise to large volumes of high-dimensional hyperspectral image data. Apache Spark is one of the most popular and extensively used open-source distributed processing frameworks, and such frameworks have proven effective in processing large volumes of remotely sensed hyperspectral images in a time-efficient manner. While computational power has been increasing, data is accumulating faster than it can be processed. Therefore, more efficient algorithms, such as dimensionality reduction, are needed to process the data and obtain accurate performance for the application. This paper proposes an efficient and parallel spectral dimensionality reduction approach based on feature-partitioning principal component analysis, called scalable SubXPCA. We implemented scalable SubXPCA in a distributed Spark cluster environment and compared it against other distributed feature-partitioning methods and various non-feature-partitioning dimensionality reduction methods. Our experiments on different real and synthetic hyperspectral image datasets confirm that SubXPCA not only achieves better classification performance than its competitors but also runs faster in distributed processing than in serial processing. As the size of the hyperspectral image dataset increased, SubXPCA showed a speedup factor of 5.7× and more in the Spark cluster compared to the serial version.
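A minimal sketch of the feature-partitioning idea behind SubXPCA, with partition counts and component numbers chosen arbitrarily; the paper's scalable version distributes the per-block PCA across Spark executors, whereas this serial NumPy version only illustrates the two-stage structure:

    import numpy as np

    def pca_project(X, k):
        # Center the data, then project onto the top-k principal
        # directions obtained via SVD.
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T

    def subxpca(X, n_parts=4, k_local=5, k_final=10):
        # Stage 1: split the spectral bands into blocks and run PCA per
        # block. In a scalable deployment each block would be handled by
        # a separate Spark executor; the loop here is serial for clarity.
        blocks = np.array_split(np.arange(X.shape[1]), n_parts)
        local_feats = [pca_project(X[:, b], k_local) for b in blocks]
        # Stage 2: a second PCA over the concatenated block-level
        # components recovers cross-block correlations at a fraction of
        # the cost of a full PCA over all bands.
        return pca_project(np.hstack(local_feats), k_final)

    # Toy stand-in for a (pixels x bands) hyperspectral matrix.
    X = np.random.rand(1000, 64)
    print(subxpca(X).shape)   # (1000, 10)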
In this paper, we develop a multithreaded algorithm for pricing simple options and implement it on an 8-node SMP machine using MIT's parallel programming language Cilk. The algorithm dynamically creates many threads to exploit parallelism and relies on the Cilk runtime system to distribute the computational load. We present both analytical and experimental results, and our results explain how Cilk can be used effectively to exploit parallelism in the given problem. The analytical results show that our algorithm has very high average parallelism, and hence Cilk is a well-suited paradigm for implementing the algorithm. We conclude from our implementation results that the size of the threads, the number of threads created, the load balancer, and the cost of spawning a thread are parameters that must be considered when designing the algorithm on the Cilk platform.
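As an assumed illustration of the kind of divide-and-conquer valuation Cilk parallelizes, the sketch below prices a European call on a recombining binomial lattice; the comments mark where Cilk's spawn/sync would go, and all market parameters are invented for the example:

    from functools import lru_cache
    import math

    # Assumed Cox-Ross-Rubinstein parameters for a European call; the
    # paper's exact model and figures are not given in the abstract.
    S0, K, r, sigma, T, N = 100.0, 100.0, 0.05, 0.2, 1.0, 20
    dt = T / N
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
    disc = math.exp(-r * dt)

    @lru_cache(maxsize=None)
    def value(step, ups):
        # Option value at the lattice node reached after `ups` up-moves
        # in `step` steps.
        if step == N:
            return max(S0 * u**ups * d**(step - ups) - K, 0.0)
        # In Cilk the two recursive calls below would be spawned as
        # independent threads, with a sync before combining them; because
        # the lattice recombines, memoization keeps the work polynomial.
        up = value(step + 1, ups + 1)
        down = value(step + 1, ups)
        return disc * (p * up + (1.0 - p) * down)

    print("European call price:", round(value(0, 0), 4))

The thread-granularity trade-off the abstract mentions shows up directly here: spawning a thread per node is cheap work per thread, so a real implementation would cut off recursion below some depth and price small subtrees serially.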
ISBN (Digital): 9798331523527
ISBN (Print): 9798331523534
To enhance the accuracy of ultra-short-term load forecasting while accounting for the impact of distributed energy resources, this paper presents a novel TCN-LSTM cascaded model that integrates photovoltaic penetration rates and multi-level temporal feature extraction for distribution transformer area load prediction. The proposed methodology first establishes a relationship model between PV generation and load consumption to quantify the influence of distributed PV systems on local power demand. Subsequently, a multi-level temporal feature extraction framework is introduced to capture multi-scale temporal patterns within the load data. The correlation between load and temporal features is quantified using mutual information, with strongly correlated features selected as inputs for the Temporal Convolutional Network (TCN) model. The architecture culminates in a TCN-LSTM cascade model, where TCN predictions and load data serve as inputs to the Long Short-Term Memory (LSTM) network. This design leverages the TCN's advantages in parallel processing and multi-scale feature extraction while harnessing the LSTM's capability to capture long-term dependencies. The empirical validation is conducted using load data from a distribution network in a city in southern China. Comparative analyses against standalone machine learning models demonstrate that the proposed method achieves superior load forecasting accuracy, robust stability, and enhanced dynamic response capabilities.
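A minimal PyTorch sketch of such a TCN-LSTM cascade; the layer sizes, dilations, and the way TCN features are concatenated with the raw load series are assumptions, since the abstract specifies only the overall architecture:

    import torch
    import torch.nn as nn

    class TCNLSTM(nn.Module):
        # Minimal cascade: dilated causal 1-D convolutions (the TCN stage)
        # followed by an LSTM that consumes the TCN features together with
        # the raw load series, mirroring the structure described above.
        def __init__(self, n_features, hidden=64):
            super().__init__()
            self.tcn = nn.Sequential(
                nn.Conv1d(n_features, 32, kernel_size=3, padding=2, dilation=1),
                nn.ReLU(),
                nn.Conv1d(32, 32, kernel_size=3, padding=4, dilation=2),
                nn.ReLU(),
            )
            self.lstm = nn.LSTM(32 + 1, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x, load):
            # x: (batch, time, n_features); load: (batch, time, 1)
            h = self.tcn(x.transpose(1, 2))          # conv expects (B, C, T)
            h = h[..., : x.size(1)].transpose(1, 2)  # trim causal padding
            out, _ = self.lstm(torch.cat([h, load], dim=-1))
            return self.head(out[:, -1])             # next-step forecast

    model = TCNLSTM(n_features=6)
    x, load = torch.randn(8, 96, 6), torch.randn(8, 96, 1)
    print(model(x, load).shape)   # torch.Size([8, 1])

The dilated convolutions process the whole window in parallel, while the LSTM runs sequentially over it, which is exactly the division of labor the abstract attributes to the two stages.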
ISBN (Print): 9781467362337
This paper introduces a scalable solution for distributing content-based video analysis tasks using the emerging MapReduce programming model. Scalable and efficient solutions are needed for this type of task, as the amount of multimedia content is growing at an increasing rate. We present a novel implementation utilizing the popular Apache Hadoop MapReduce framework for both analysis job scheduling and video data distribution. We employ face detection as a case example because it represents a popular visual content analysis task. The main contribution of this paper is the performance evaluation of distribution models for video content processing in various configurations. In our experiments, we compared the performance of our video data distribution method against two alternative solutions on a seven-node cluster. Hadoop's performance overhead in video content analysis was also evaluated. We found Hadoop to be a data-efficient solution with minimal computational overhead for the face detection task.
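The paper's implementation targets Hadoop's MapReduce framework directly; the sketch below is only an assumed Hadoop-Streaming-style stand-in, with OpenCV's Haar cascade as a placeholder face detector and one video split path per input line:

    import sys
    import cv2

    # Assumed illustration in the style of Hadoop Streaming: each input
    # line names one video split on shared storage, and the mapper emits
    # a face count per split. The paper's actual implementation (Java
    # API, custom video input formats) is more elaborate than this.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    for line in sys.stdin:
        path = line.strip()
        if not path:
            continue
        cap = cv2.VideoCapture(path)
        faces = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break                  # end of this video split
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces += len(detector.detectMultiScale(gray, 1.1, 5))
        cap.release()
        # Tab-separated key/value pairs are what Streaming reducers expect.
        print(f"{path}\t{faces}")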