ISBN: 9781450342063 (print)
A number of papers have emerged in the last two years that apply and study asynchronous master-slave evolutionary algorithms based on a steady-state model. These efforts are largely motivated by the observation that, unlike traditional (synchronous) EAs, asynchronous EAs are able to make maximal use of many parallel processors, even when some individuals evaluate more slowly than others. Asynchronous EAs do not behave the same as their synchronous counterparts, however, and as yet there is very little theory that makes it possible to predict how they will perform on new problems. Of some concern is evidence suggesting that the steady-state versions tend to be biased toward regions of the search space where fitness evaluation is cheaper. This has led some authors to suggest a so-called 'quasi-generational' asynchronous EA as an intermediate solution that incurs neither idle time nor significant bias toward fast solutions. We perform experiments with the quasi-generational EA, and show that it does not deliver the promised benefits: it is, in fact, just as biased toward fast solutions as the steady-state approach is, and it tends to converge even more slowly than the traditional, generational EA.
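To make the steady-state asynchronous model concrete, the following is a minimal simulation sketch, not the algorithm studied in the paper. The fitness function, the cost model `eval_time`, the replace-worst rule, and the binary tournament are all assumptions chosen for illustration. A priority queue stands in for the workers: whenever one evaluation finishes, its result is merged into the population and a new individual is dispatched immediately, so no worker waits for a generation barrier.

```python
import heapq
import random


def fitness(x):
    # Hypothetical objective: maximize the negated sphere function.
    return -sum(v * v for v in x)


def eval_time(x):
    # Hypothetical cost model: evaluation time depends on the individual.
    return 1.0 + abs(x[0])


def mutate(x, rng, sigma=0.3):
    return [v + rng.gauss(0.0, sigma) for v in x]


def async_steady_state_ea(pop_size=10, workers=4, budget=200, dim=3, seed=1):
    rng = random.Random(seed)
    population = []   # list of (fitness, individual)
    pending = []      # heap of (finish_time, tiebreak, individual)
    clock = 0.0       # simulated wall-clock time
    dispatched = 0    # number of evaluations started so far

    def dispatch(ind):
        nonlocal dispatched
        heapq.heappush(pending, (clock + eval_time(ind), dispatched, ind))
        dispatched += 1

    # Keep every simulated worker busy from the start.
    for _ in range(workers):
        dispatch([rng.uniform(-5, 5) for _ in range(dim)])

    while pending:
        # The next evaluation to finish returns to the master.
        clock, _, ind = heapq.heappop(pending)
        f = fitness(ind)
        if len(population) < pop_size:
            population.append((f, ind))
        else:
            worst = min(range(pop_size), key=lambda i: population[i][0])
            if f > population[worst][0]:
                population[worst] = (f, ind)
        # Refill the freed worker immediately (no generation barrier).
        if dispatched < budget:
            if dispatched < pop_size:
                child = [rng.uniform(-5, 5) for _ in range(dim)]
            else:
                a, b = rng.choice(population), rng.choice(population)
                parent = max(a, b, key=lambda p: p[0])
                child = mutate(parent[1], rng)
            dispatch(child)
    return population, clock
```

Because individuals with a small `eval_time` return sooner and therefore reproduce sooner, a simulation of this kind is one way to observe the bias toward cheaply evaluated regions that the abstract describes.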
ISBN: 9781450334693 (print)
The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine-grained clustering has not been previously demonstrated. Previous approaches clustered a sample, which limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine-grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and infeasible.
ISBN: 3540393684 (print)
In this paper we present a scalable, distributed architecture that allocates idle CPUs for task execution, where any node may request the execution of a group of tasks by other nodes. A fast, scalable discovery protocol is an essential component. In addition, up-to-date information about free nodes is efficiently managed in each node by an availability protocol. Both protocols exploit a tree-based peer-to-peer network that adds fault-tolerant capabilities. Results from experiments and simulation tests, using a simple allocation method, show discovery and allocation costs scaling logarithmically with the number of nodes, even with low communication overhead and little, bounded state in each node.
ISBN: 9781665435772 (print)
We observe a continuously increasing use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., 'big data') that requires powerful computing resources with equally increasing performance. Consequently, innovative heterogeneous High-Performance Computing (HPC) systems based on multi-core CPUs and many-core GPUs require an architectural design that addresses the requirements of end-user communities that take advantage of ML and DL. Still, the workloads of end-user communities in the simulation sciences (e.g., using numerical methods based on known physical laws) need to be equally supported in those architectures. This paper offers insights into the Modular Supercomputer Architecture (MSA) developed in the Dynamic Exascale Entry Platform (DEEP) series of projects to address the requirements of both simulation sciences and data-intensive sciences such as High Performance Data Analytics (HPDA). It shares insights into implementing the MSA at the Julich Supercomputing Centre (JSC), which hosts Europe's No. 1 supercomputer, the Julich Wizard for European Leadership Science (JUWELS). We augment the technical findings with experience and lessons learned from two application-community case studies (i.e., remote sensing and health sciences) using the MSA with JUWELS and the DEEP systems in practice. Thus, the paper provides details on specific MSA design elements that enable significant performance improvements of ML and DL algorithms. While this paper focuses on MSA-based HPC systems and application experience, we do not lose sight of advances in Cloud Computing (CC) and Quantum Computing (QC) relevant for ML and DL.
ISBN: 9781467343831 (print)
We consider the problem of balancing load items (tokens) on networks. Starting with an arbitrary load distribution, we allow in each round nodes to exchange tokens with their neighbors. The goal is to achieve a distribution where all nodes have nearly the same number of tokens. For the continuous case where tokens are arbitrarily divisible, most load balancing schemes correspond to Markov chains whose convergence is fairly well-understood in terms of their spectral gap. However, in many applications load items cannot be divided arbitrarily and we need to deal with the discrete case where the load is composed of indivisible tokens. This discretization entails a non-linear behavior due to its rounding errors, which makes the analysis much harder than in the continuous case. Therefore, it has been a major open problem to understand the limitations of discrete load balancing and its relation to the continuous case. We investigate several randomized protocols for different communication models in the discrete case. Our results demonstrate that there is almost no difference between the discrete and continuous case. For instance, for any regular network in the matching model, all nodes have the same load up to an additive constant in (asymptotically) the same number of rounds required in the continuous case. This generalizes and tightens the previous best result, which only holds for expander graphs.
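The matching model with indivisible tokens can be illustrated with a small simulation. This is a sketch under stated assumptions, not one of the protocols analyzed in the paper: the cycle topology, the greedy random matching, and the fair coin that decides which endpoint keeps the extra token are all choices made here for illustration. In each round, matched neighbors split their combined load as evenly as integer tokens allow.

```python
import random


def random_matching(edges, rng):
    # Greedy random matching: shuffle the edges and keep each edge
    # whose two endpoints are both still unmatched.
    shuffled = edges[:]
    rng.shuffle(shuffled)
    matched, used = [], set()
    for u, v in shuffled:
        if u not in used and v not in used:
            matched.append((u, v))
            used.update((u, v))
    return matched


def balance_round(load, edges, rng):
    # Matched pairs average their load; an odd total leaves one extra
    # token, which goes to a uniformly random endpoint (randomized rounding).
    for u, v in random_matching(edges, rng):
        total = load[u] + load[v]
        low, high = total // 2, total - total // 2
        if rng.random() < 0.5:
            load[u], load[v] = low, high
        else:
            load[u], load[v] = high, low


def discrepancy(load):
    return max(load) - min(load)


def simulate(n=16, tokens=1000, rounds=200, seed=0):
    rng = random.Random(seed)
    edges = [(i, (i + 1) % n) for i in range(n)]  # cycle: a 2-regular graph
    load = [0] * n
    load[0] = tokens  # worst case: all tokens start on one node
    history = [discrepancy(load)]
    for _ in range(rounds):
        balance_round(load, edges, rng)
        history.append(discrepancy(load))
    return load, history
```

Calling `simulate()` returns the final load vector and the per-round discrepancy (maximum minus minimum load). Tokens are never created or destroyed, and since each pairwise split stays between the two previous loads, the discrepancy cannot increase from one round to the next.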
ISBN: 9781479961238 (print)
The Hurricane Weather Research and Forecasting (HWRF) model is one of the premier models in NOAA's operational suite of severe weather forecasting systems. An axiom in numerical weather prediction suggests that modeling the environment at high resolution optimizes forecast accuracy. However, due to operational time constraints, only the region immediately surrounding a single hurricane can be modeled in high resolution. Currently, this is achieved by embedding a relatively small, high-resolution, storm-following pair of grids within a larger and coarser grid. In previous work, we extended HWRF to support multiple such independent storm-following pairs of grids. The result was improved forecast accuracy by virtue of modeling storm-to-storm interactions in high resolution. However, some shortcomings in the underlying WRF framework cause these independent pairs of grids to be simulated sequentially. This limits the model's scalability and makes it impossible to harness this novel capability within the operational time constraints. In this paper, we address this issue by modifying the underlying WRF framework to simulate these independent pairs of storm-following grids in parallel. This is the first approach to be successfully implemented in the history of the WRF framework.
In recent years, a number of efficient and scalable frequent itemset mining algorithms for big data analytics have been proposed by many researchers. Initially, MapReduce-based frequent itemset mining algorith...
ISBN: 9781665403696 (print)
Using computationally efficient techniques for transforming the massive amount of Remote Sensing (RS) data into scientific understanding is critical for Earth science. The utilization of efficient techniques through innovative computing systems in RS applications has become more widespread in recent years. The continuously increasing use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., 'big data') requires powerful computing resources with equally increasing performance. This paper reviews recent advances in High-Performance Computing (HPC), Cloud Computing (CC), and Quantum Computing (QC) applied to RS problems. It thus represents a snapshot of the state of the art in ML in the context of the most recent developments in those computing areas, including our lessons learned over the last years. Our paper also includes some recent challenges and good experiences gained by using Europe's fastest supercomputer for hyper-spectral and multi-spectral image analysis with state-of-the-art data analysis tools. It offers a thoughtful perspective on the potential and emerging challenges of applying innovative computing paradigms to RS problems.
Due to the explosive growth in the variety of smart mobile terminals in wireless networks, the increasing computing capability of mobile chips, and the public's growing concern for personal privacy, it is a better...
The scale of modern datasets necessitates the development of efficient distributed optimization methods for machine learning. We present a general-purpose framework for distributed computing environments, CoCoA, that has an efficient communication scheme and is applicable to a wide variety of problems in machine learning and signal processing. We extend the framework to cover general non-strongly-convex regularizers, including L1-regularized problems like lasso, sparse logistic regression, and elastic net regularization, and show how earlier work can be derived as a special case. We provide convergence guarantees for the class of convex regularized loss minimization objectives, leveraging a novel approach in handling non-strongly-convex regularizers and non-smooth loss functions. The resulting framework has markedly improved performance over state-of-the-art methods, as we illustrate with an extensive set of experiments on real distributed datasets.