Software Architecture Recovery (SAR) techniques analyze dependencies between software modules and automatically cluster them to achieve high modularity. Many of these approaches employ genetic algorithms (GAs) for clustering software modules. A major drawback of these algorithms is their lack of scalability. In this paper, we address this drawback by introducing generic software components that can encapsulate subroutines (operators) of a GA to execute them in parallel. We use these components to implement a novel hybrid GA for SAR that exploits parallelism to find better solutions faster. We compare the effectiveness of the parallel algorithms against the sequential counterparts previously proposed for SAR. We observe that parallelization enables a greater number of iterations to be performed in the search for high-quality solutions, allowing faster convergence towards optimal solutions by harnessing multiple processing units in a coordinated manner. The improvement in modularity exceeds 50% and grows further for large-scale systems. Our algorithm can scale to recover the architecture of a large system, Chromium, which has more than 18,500 modules and 750,000 dependencies among these modules.
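The loop such approaches parallelize can be illustrated with a toy sketch (not the paper's algorithm): a GA that clusters modules of a small hypothetical dependency graph to maximize a simplified TurboMQ-style modularization quality, with the fitness evaluations of a generation farmed out to a thread pool.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy dependency graph: module index -> modules it depends on.
DEPS = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3, 5}, 5: {4}}
N_MODULES, N_CLUSTERS = 6, 2

def mq(assignment):
    """Simplified TurboMQ: sum over clusters of intra/(intra + inter) edges."""
    score = 0.0
    for c in set(assignment):
        intra = inter = 0
        for m, targets in DEPS.items():
            if assignment[m] == c:
                for t in targets:
                    if assignment[t] == c:
                        intra += 1
                    else:
                        inter += 1
        if intra + inter:
            score += intra / (intra + inter)
    return score

def evolve(pop_size=20, generations=30, seed=7):
    rng = random.Random(seed)
    pop = [[rng.randrange(N_CLUSTERS) for _ in range(N_MODULES)]
           for _ in range(pop_size)]
    with ThreadPoolExecutor() as pool:
        for _ in range(generations):
            scores = list(pool.map(mq, pop))        # parallel fitness step
            ranked = sorted(zip(scores, pop), key=lambda p: -p[0])
            parents = [ind for _, ind in ranked[:pop_size // 2]]
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, N_MODULES)    # one-point crossover
                child = a[:cut] + b[cut:]
                if rng.random() < 0.2:               # mutation
                    child[rng.randrange(N_MODULES)] = rng.randrange(N_CLUSTERS)
                children.append(child)
            pop = parents + children
    return max(pop, key=mq)
```

Here a perfect split of the two dependency chains scores MQ = 2.0, while a single all-in-one cluster scores only 1.0, so the penalty for merging unrelated modules is built into the fitness.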
The realized performance (error-cost tradeoff) of three computational electromagnetic (CEM) methods, which use parallel algorithms on a supercomputer to predict the radar cross section (RCS) of complex targets, is quantified using the Austin RCS Benchmark Suite. The article demonstrates how modern benchmark suites can be used to evaluate CEM methods empirically and compare their performances objectively. The Austin RCS Benchmark Suite [1], [2] has recently been populated with 20 carefully selected problem sets that span a wide range in six dimensions of computational difficulty [3].
ISBN (print): 9783031645280; 9783031645297
Computational birational geometry is one of the key playing fields in an algorithmic approach to algebraic geometry, since birational maps are the fundamental way to relate algebraic varieties (or schemes). An important application is an algorithmic approach to the Minimal Model Program (MMP), which aims to classify algebraic varieties with mild singularities by finding simple birational models of such varieties in their birational equivalence class. This note presents work towards parallel methods to solve problems in birational geometry. Making use of a representation of algebraic schemes in terms of charts allows for a parallel computational approach for handling both the varieties and rational maps between them. In this note, we illustrate this approach on examples.
ISBN (print): 9798350387117; 9798350387124
Graph coloring problems are among the most fundamental problems in parallel and distributed computing, and have been studied extensively in both settings. In this context, designing efficient deterministic algorithms for these problems has been found particularly challenging. In this work we consider this challenge, and design a novel framework for derandomizing algorithms for coloring-type problems in the Massively Parallel Computation (MPC) model with sublinear space. We give an application of this framework by showing that a recent (degree + 1)-list coloring algorithm by Halldorsson et al. (STOC'22) in the LOCAL model of distributed computation can be translated to the MPC model and efficiently derandomized. Our algorithm runs in O(log log log n) rounds, which matches the complexity of the state-of-the-art algorithm for the (Delta + 1)-coloring problem.
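The combinatorial core is easy to state sequentially (this sketch shows only the sequential invariant, not the MPC machinery): in (degree + 1)-list coloring, each vertex gets a palette of deg(v) + 1 colors, so a greedy pass can always find a palette color unused by already-colored neighbors.

```python
def greedy_d1lc(adj, palettes):
    """Sequential (degree + 1)-list coloring: since |palettes[v]| = deg(v) + 1
    and v has only deg(v) neighbors, some palette color is always free."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in palettes[v] if c not in used)
    return color

# usage: a 5-cycle, each vertex with a palette of deg(v) + 1 = 3 colors
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
palettes = {i: [0, 1, 2] for i in range(5)}
coloring = greedy_d1lc(cycle, palettes)
```

The difficulty the paper addresses is doing this without the sequential scan order, deterministically, in O(log log log n) MPC rounds.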
We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We call the algorithm random parallel because it utilizes multiple parallel processors to operate on a randomly chosen subset of blocks of the feature vector. RAPSA is doubly stochastic since each processor utilizes a random set of functions to compute the stochastic gradient associated with a randomly chosen set of variable coordinates. Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first attempt at a methodology that is parallel in both the selection of blocks and the selection of elements of the training set. In RAPSA, processors utilize the randomly chosen functions to compute the stochastic gradient component associated with a randomly chosen block. The technical contribution of this paper is to show that this minimally coordinated algorithm converges to the optimal classifier when the training objective is strongly convex. Moreover, we present an accelerated version of RAPSA (ARAPSA) that incorporates the objective function curvature information by premultiplying the descent direction by a Hessian approximation matrix. We further extend the results to asynchronous settings and show that if the processors perform their updates without any coordination, the algorithms still converge to the optimal argument. RAPSA and its extensions are then numerically evaluated on a linear estimation problem and a binary image classification task using the MNIST handwritten digit dataset.
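The doubly stochastic update can be sketched in a few lines (a toy, single-process rendition on least squares; the step size and block layout are illustrative, not from the paper): each iteration draws a random minibatch of examples and a random block of coordinates, and descends only along that block.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_blocks = 200, 10, 2           # examples, features, coordinate blocks
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true                        # noiseless, strongly convex least squares

blocks = np.array_split(np.arange(d), n_blocks)
w = np.zeros(d)
step = 0.05
for _ in range(3000):
    batch = rng.choice(n, size=10, replace=False)  # random subset of functions
    blk = blocks[rng.integers(n_blocks)]           # random coordinate block
    resid = X[batch] @ w - y[batch]
    grad_blk = X[batch][:, blk].T @ resid / len(batch)
    w[blk] -= step * grad_blk                      # update only this block
```

In RAPSA proper, each of the parallel processors runs an update of this shape on its own randomly drawn block and minibatch; the sketch serializes them into one loop.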
Authors: Nigro, Libero (Univ Calabria, Engn Dept Informat Modelling Elect & Syst Sci DIM, I-87036 Arcavacata di Rende, Italy)
K-means is a well-known clustering algorithm often used for its simplicity and potential efficiency. Its properties and limitations have been investigated by many works reported in the literature. K-means, though, suffers from computational problems when dealing with large datasets with many dimensions and a great number of clusters. Therefore, many authors have proposed and experimented with different techniques for the parallel execution of K-means. This paper describes a novel approach to parallel K-means which, today, is based on commodity multicore machines with shared memory. Two reference implementations in Java are developed and their performances are compared. The first one is structured according to a map/reduce schema that leverages the built-in multi-threaded concurrency automatically provided by Java to parallel streams. The second one, allocated on the available cores, exploits the parallel programming model of the Theatre actor system, which is control-based, totally lock-free, and purposely relies on threads as coarse-grain "programming-in-the-large" units. The experimental results confirm that good execution performance can be achieved through the implicit and intuitive use of Java concurrency in parallel streams. However, better execution performance can be guaranteed by the modular Theatre implementation, which proves more effective at exploiting the computational resources.
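The paper's implementations are in Java; the map/reduce schema itself is language-neutral and can be sketched in Python (illustrative, not the paper's code): each worker assigns its chunk of points to the nearest centroid and emits partial sums and counts (map); the partials are summed and centroids recomputed (reduce).

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def assign_partial(chunk, centroids):
    """Map step: nearest-centroid labels plus partial sums and counts."""
    d2 = ((chunk[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    k = len(centroids)
    sums = np.array([chunk[labels == j].sum(axis=0) for j in range(k)])
    counts = np.array([(labels == j).sum() for j in range(k)])
    return sums, counts

def parallel_kmeans(X, k, iters=20, workers=4):
    centroids = X[:k].astype(float).copy()   # naive init; k-means++ in practice
    chunks = np.array_split(X, workers)
    with ThreadPoolExecutor(workers) as pool:
        for _ in range(iters):
            parts = pool.map(assign_partial, chunks, [centroids] * workers)
            sums = np.zeros_like(centroids)
            counts = np.zeros(k)
            for s, c in parts:               # reduce step
                sums += s
                counts += c
            nonempty = counts > 0
            centroids[nonempty] = sums[nonempty] / counts[nonempty][:, None]
    return centroids
```

Because each map task touches a disjoint chunk and only the small reduce is serial, the scheme maps naturally both to Java parallel streams and to per-core actors.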
We describe the parallel algorithms for studying the structural features of the anomalies in the gravity and magnetic fields of the lithosphere, which are based on the height transformations of the data. The algorithms are numerically implemented on the Uran supercomputer. The suggested computer technology is used for constructing the maps of the regional and local anomalies of the magnetic and gravity fields for the northeastern sector of Europe within an area confined between 48°–62° E and 60°–68° N.
Contour tracing is a critical technique in image analysis and computer vision, with applications in medical imaging, big data analytics, machine learning, and robotics. We introduce a novel hardware accelerator based on the adapted and segmented (AnS) vertex following (VF) and run-data-based-following (RDBF) families of fast contour tracing algorithms implemented on the Zynq-7000 field-programmable gate array (FPGA) platform. Our algorithmic implementation utilizing a mesh-interconnected multiprocessor architecture is at least 55x faster than the existing implementations. With input-output overheads, it is up to 12.5x faster. Our hardware accelerator for contour tracing is benchmarked on mesh-interconnected hardware, all three families of contour tracing algorithms, and a random image from the Imagenet database. Our implementation is, thus, faster for FPGA, application-specific integrated circuit (ASIC), graphics processing unit (GPU), and supercomputer hardware in comparison to the central processing unit (CPU)-GPU collaborative approach and offers a better solution for those systems where the input-output overheads can be minimized, such as parallel processing arrays and mesh-connected sensor networks.
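For reference, the software baseline these hardware families descend from is pixel/vertex-following tracing; a minimal Moore-neighbor sketch (classic algorithm, illustrative only; not the AnS or RDBF variants) walks the outer boundary of a blob by scanning the 8-neighborhood clockwise from the previous backtrack pixel.

```python
def trace_contour(img):
    """Moore-neighbor tracing of the outer contour of the first foreground
    (1-valued) blob found in raster order; returns (row, col) pixels
    in clockwise order. Simple start-revisit stopping criterion."""
    rows, cols = len(img), len(img[0])
    # 8-neighborhood offsets in clockwise order, starting from "west"
    nbrs = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
            (0, 1), (1, 1), (1, 0), (1, -1)]
    start = next((r, c) for r in range(rows) for c in range(cols) if img[r][c])
    contour = [start]
    p, backtrack = start, (start[0], start[1] - 1)
    while True:
        i = nbrs.index((backtrack[0] - p[0], backtrack[1] - p[1]))
        for k in range(1, 9):                      # scan clockwise
            dr, dc = nbrs[(i + k) % 8]
            q = (p[0] + dr, p[1] + dc)
            if 0 <= q[0] < rows and 0 <= q[1] < cols and img[q[0]][q[1]]:
                pr, pc = nbrs[(i + k - 1) % 8]
                backtrack = (p[0] + pr, p[1] + pc)  # pixel just before the hit
                p = q
                break
        else:
            break                                   # isolated start pixel
        if p == start:
            break
        contour.append(p)
    return contour
```

The inherently sequential neighbor scan is exactly what the mesh-interconnected multiprocessor designs restructure for parallel hardware.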
Modern power grids incorporate renewable energy at an increased pace, placing greater stress on the power grid equipment and shifting their operational conditions towards their limits. As a result, failures of any network component, such as a transmission line or power generator, can be critical to the overall grid operation. The security constrained optimal power flow (SCOPF) aims for the long-term precontingency operating state, such that in the event of any contingency, the power grid will remain secure. For a realistic power network, however, with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in the short time frames established by real-time industrial operations. We propose a parallel, distributed-memory, structure-exploiting framework, BELTISTOS-SC, which accelerates the solution of SCOPF problems over state-of-the-art techniques. The acceleration on single-core execution is achieved by a structure-exploiting interior point method, employing successive Schur complement evaluations to further reduce the size of the systems solved at each iteration while maintaining sparsity, resulting in lower computational resources for the linear system solution. Additionally, the parallel, distributed-memory implementation of the proposed framework is presented in detail and validated through several large-scale examples, demonstrating its efficiency for large-scale SCOPF problems.
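The Schur-complement idea can be illustrated on a dense toy block-arrowhead system (a sketch; BELTISTOS-SC works on sparse interior-point KKT systems): each contingency block is eliminated independently, which is the part a distributed implementation runs in parallel, leaving only a small coupled system in the shared variables.

```python
import numpy as np

def solve_arrowhead(As, Bs, C, rs, r0):
    """Solve [diag(A_i) B; B^T C][x; x0] = [r; r0] by Schur-complement
    elimination. Each diagonal block A_i is factorized independently
    (in parallel in a distributed setting); only the small Schur system
    in the coupling variables x0 is solved centrally."""
    S = C.astype(float).copy()
    rhs = r0.astype(float).copy()
    Ai_inv_B, Ai_inv_r = [], []
    for A, B, r in zip(As, Bs, rs):        # independent per-contingency work
        AB = np.linalg.solve(A, B)
        Ar = np.linalg.solve(A, r)
        S -= B.T @ AB                      # S = C - sum_i B_i^T A_i^{-1} B_i
        rhs -= B.T @ Ar
        Ai_inv_B.append(AB)
        Ai_inv_r.append(Ar)
    x0 = np.linalg.solve(S, rhs)           # small coupled system
    xs = [Ar - AB @ x0 for AB, Ar in zip(Ai_inv_B, Ai_inv_r)]
    return xs, x0
```

Only the Schur matrix S and the reduced right-hand side need to be communicated, which is what keeps the distributed-memory version scalable as the number of contingencies grows.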
Achieving lifelike atmospheric effects, such as fog, is essential in creating immersive environments and poses a formidable challenge in real-time rendering. Highly realistic rendering of complex lighting interacting with dynamic fog can be very resource-intensive, due to light bouncing through complex participating media multiple times. We propose an approach that uses a multi-layered spherical harmonics probe grid to share computations temporally. In addition, this world-space storage enables the sharing of radiance data between multiple viewers. In the context of cloud rendering, this means faster rendering and a significant enhancement in overall rendering quality with efficient resource utilization.