Search Results - Inner Mongolia University Library

Genetic and Evolutionary Computation Conference (GECCO)

Authors: Scott, Eric O.; De Jong, Kenneth A. Affiliations: George Mason Univ, Dept Comp Sci, Fairfax, VA 22030, USA

ISBN: (print) 9781450342063

A number of papers have emerged in the last two years that apply and study asynchronous master-slave evolutionary algorithms based on a steady-state model. These efforts are largely motivated by the observation that, unlike traditional (synchronous) EAs, asynchronous EAs are able to make maximal use of many parallel processors, even when some individuals evaluate more slowly than others. Asynchronous EAs do not behave the same as their synchronous counterparts, however, and as of yet there is very little theory that makes it possible to predict how they will perform on new problems. Of some concern is evidence suggesting that the steady-state versions tend to be biased toward regions of the search space where fitness evaluation is cheaper. This has led some authors to suggest a so-called 'quasi-generational' asynchronous EA as an intermediate solution that incurs neither idle time nor significant bias toward fast solutions. We perform experiments with the quasi-generational EA, and show that it does not deliver the promised benefits: it is, in fact, just as biased toward fast solutions as the steady-state approach is, and it tends to converge even more slowly than the traditional, generational EA.

Keywords: Evolutionary algorithms; parallel and distributed algorithms; asynchronous algorithms
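To make the asynchronous steady-state model in the abstract above more concrete, the following is a minimal single-process simulation, not the authors' implementation. It assumes a one-dimensional real-valued genome in [0, 1], binary tournament selection, Gaussian mutation, and replace-worst-if-better insertion; the function name `async_steady_state_ea` and all parameters are hypothetical. Worker parallelism is imitated with an event queue ordered by simulated finish time, so individuals that evaluate quickly re-enter the population sooner.

```python
import heapq
import random


def async_steady_state_ea(fitness, eval_time, pop_size=10, workers=4,
                          budget=200, seed=0):
    """Simulate an asynchronous steady-state EA (illustrative sketch only).

    fitness(x)   -> score to maximize (assumed signature)
    eval_time(x) -> simulated evaluation cost of x (assumed signature)
    budget       -> number of offspring evaluations; assumed >= workers
    """
    rng = random.Random(seed)
    # Initial population: random genomes in [0, 1), evaluated up front.
    population = [(fitness(x), x)
                  for x in (rng.random() for _ in range(pop_size))]

    def breed():
        # Binary tournament selection followed by clipped Gaussian mutation.
        a, b = rng.sample(population, 2)
        parent = max(a, b)[1]
        return min(1.0, max(0.0, parent + rng.gauss(0.0, 0.1)))

    running = []   # heap of (finish_time, launch_index, genome)
    launched = 0
    for _ in range(workers):
        child = breed()
        heapq.heappush(running, (eval_time(child), launched, child))
        launched += 1

    clock = 0.0
    while running:
        # The next worker to finish returns its individual to the master.
        clock, _, child = heapq.heappop(running)
        score = fitness(child)
        worst = min(range(pop_size), key=lambda i: population[i][0])
        if score > population[worst][0]:
            population[worst] = (score, child)
        # The freed worker immediately starts on a new offspring (no idle time).
        if launched < budget:
            nxt = breed()
            heapq.heappush(running, (clock + eval_time(nxt), launched, nxt))
            launched += 1
    return population, clock


if __name__ == "__main__":
    # Hypothetical setup: fitter genomes are also slower to evaluate.
    pop, elapsed = async_steady_state_ea(lambda x: x, lambda x: 1.0 + x)
    print(len(pop), elapsed)
```

Varying `eval_time` while holding `fitness` fixed is one way to probe the evaluation-time bias that the abstract discusses; this sketch makes no claim about what such an experiment would show.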

Parallel Streaming Signature EM-tree: A Clustering Algorithm for Web Scale Applications

24th International Conference on World Wide Web (WWW)

Authors: De Vries, Christopher M.; De Vine, Lance; Geva, Shlomo; Nayak, Richi. Affiliations: Game Analyt ApS, Berlin, Germany; Queensland Univ Technol, Brisbane, Qld, Australia

ISBN: (print) 9781450334693

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine grained clustering has not been previously demonstrated. Previous approaches clustered a sample that limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and unfeasible.

Keywords: Document Clustering; Large Scale Learning; parallel and distributed algorithms; Random Projections; Hashing; Document Signatures; Search Tree; Compressed Learning

Scalable architecture for allocation of idle CPUs in a P2P network

2nd International Conference on High Performance Computing and Communications (HPCC 2006)

Authors: Celaya, Javier; Arronategui, Unai. Affiliations: Univ Zaragoza, Dept Comp Sci & Syst Engn, Zaragoza 50018, Spain

ISBN: (print) 3540393684

In this paper we present a scalable, distributed architecture that allocates idle CPUs for task execution, where any node may request the execution of a group of tasks by other ones. A fast, scalable discovery protocol is an essential component. Also, up to date information about free nodes is efficiently managed in each node by an availability protocol. Both protocols exploit a tree-based peer-to-peer network that adds fault-tolerant capabilities. Results from experiments and simulation tests, using a simple allocation method, show discovery and allocation costs scaling logarithmically with the number of nodes, even with low communication overhead and little, bounded state in each node.

Keywords: parallel and distributed architectures; networking protocols and routing and algorithms; reliability and fault-tolerance; grid computing; peer-to-peer computing; parallel and distributed algorithms

Practice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures

35th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Riedel, Morris; Sedona, Rocco; Barakat, Chadi; Einarsson, Petur; Hassanian, Reza; Cavallaro, Gabriele; Book, Matthias; Neukirchen, Helmut; Lintermann, Andreas. Affiliations: Univ Iceland, Dept Comp Sci, Reykjavik, Iceland; Forschungszentrum Julich, Julich Supercomp Ctr, Julich, Germany

ISBN: (print) 9781665435772

We observe a continuously increased use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., 'big data') that requires powerful computing resources with equally increasing performance. Consequently, innovative heterogeneous High-Performance Computing (HPC) systems based on multi-core CPUs and many-core GPUs require an architectural design that addresses end user communities' requirements that take advantage of ML and DL. Still the workloads of end user communities of the simulation sciences (e.g., using numerical methods based on known physical laws) needs to be equally supported in those architectures. This paper offers insights into the Modular Supercomputer Architecture (MSA) developed in the Dynamic Exascale Entry Platform (DEEP) series of projects to address the requirements of both simulation sciences and data-intensive sciences such as High Performance Data Analytics (HPDA). It shares insights into implementing the MSA in the Julich Supercomputing Centre (JSC) hosting Europe No. 1 Supercomputer Julich Wizard for European Leadership Science (JUWELS). We augment the technical findings with experience and lessons learned from two application communities case studies (i.e., remote sensing and health sciences) using the MSA with JUWELS and the DEEP systems in practice. Thus, the paper provides details into specific MSA design elements that enable significant performance improvements of ML and DL algorithms. While this paper focuses on MSA-based HPC systems and application experience, we are not losing sight of advances in Cloud Computing (CC) and Quantum Computing (QC) relevant for ML and DL.

Keywords: High performance computing; cloud computing; quantum computing; machine learning; deep learning; parallel and distributed algorithms; remote sensing; health sciences; modular supercomputer architecture

Tight Bounds for Randomized Load Balancing on Arbitrary Network Topologies

IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS)

Authors: Sauerwald, Thomas; Sun, He. Affiliations: Max Planck Inst Informat, D-66123 Saarbrucken, Germany

ISBN: (print) 9781467343831

We consider the problem of balancing load items (tokens) on networks. Starting with an arbitrary load distribution, we allow in each round nodes to exchange tokens with their neighbors. The goal is to achieve a distribution where all nodes have nearly the same number of tokens. For the continuous case where tokens are arbitrarily divisible, most load balancing schemes correspond to Markov chains whose convergence is fairly well-understood in terms of their spectral gap. However, in many applications load items cannot be divided arbitrarily and we need to deal with the discrete case where the load is composed of indivisible tokens. This discretization entails a non-linear behavior due to its rounding errors, which makes the analysis much harder than in the continuous case. Therefore, it has been a major open problem to understand the limitations of discrete load balancing and its relation to the continuous case. We investigate several randomized protocols for different communication models in the discrete case. Our results demonstrate that there is almost no difference between the discrete and continuous case. For instance, for any regular network in the matching model, all nodes have the same load up to an additive constant in (asymptotically) the same number of rounds required in the continuous case. This generalizes and tightens the previous best result, which only holds for expander graphs.

Keywords: randomized algorithms; parallel and distributed algorithms; graph expansion; Markov chains; load balancing
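The discrete matching model described in the abstract above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the protocol analyzed in the paper: the network is a cycle (a 2-regular graph), each round's matching is built greedily from a random edge order, and when a matched pair holds an odd number of tokens the extra token goes to a uniformly random endpoint. The names `random_matching`, `balance_round`, `discrepancy`, and `simulate` are hypothetical.

```python
import random


def random_matching(edges, rng):
    """Greedy random matching: shuffle edges, keep those whose endpoints are free."""
    shuffled = list(edges)
    rng.shuffle(shuffled)
    used, matching = set(), []
    for u, v in shuffled:
        if u not in used and v not in used:
            matching.append((u, v))
            used.update((u, v))
    return matching


def balance_round(load, edges, rng):
    """One round: every matched pair splits its indivisible tokens as evenly as possible."""
    new = list(load)
    for u, v in random_matching(edges, rng):
        total = new[u] + new[v]
        low, high = total // 2, total - total // 2
        # Randomized rounding (assumed rule): the odd token goes to a random endpoint.
        if rng.random() < 0.5:
            new[u], new[v] = low, high
        else:
            new[u], new[v] = high, low
    return new


def discrepancy(load):
    """Difference between the most and least loaded node."""
    return max(load) - min(load)


def simulate(n=16, tokens=1000, rounds=200, seed=0):
    """Start with all tokens on node 0 of an n-cycle and run the given number of rounds."""
    rng = random.Random(seed)
    edges = [(i, (i + 1) % n) for i in range(n)]
    load = [tokens] + [0] * (n - 1)
    for _ in range(rounds):
        load = balance_round(load, edges, rng)
    return load


if __name__ == "__main__":
    final = simulate()
    print(final, discrepancy(final))
```

Because each pair ends up with the floor and ceiling of its average, the total number of tokens is conserved and the discrepancy never increases from one round to the next. How quickly it shrinks on a given topology is the question the paper addresses.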

Improving the Scalability of a Hurricane Forecast System in Mixed-Parallel Environments: Advancing the WRF framework toward faster and more accurate forecasts

16th IEEE Int Conf on High Performance Computing and Communications / 11th IEEE Int Conf on Embedded Software and Systems / 6th Int Symposium on Cyberspace Safety and Security

Authors: Quirino, Thiago Santos; Delgado, Javier; Zhang, Xuejin. Affiliations: NOAA, Hurricane Res Div, US DOC, OAR/AOML/HRD, Miami, FL 33165, USA; Univ Miami, Cooperat Inst Marine & Atmospher Studies, Miami, FL, USA

ISBN: (print) 9781479961238

The Hurricane Weather Research and Forecasting (HWRF) model is one of the premier models in NOAA's operational suite of severe weather forecasting systems. An axiom in numerical weather prediction suggests that modeling the environment at high resolution optimizes forecast accuracy. However, due to operational time constraints, only the region immediately surrounding a single hurricane can be modeled in high resolution. Currently, this is achieved by embedding a relatively small high resolution, storm-following pair of grids within a larger and coarser grid. In a previous work, we extended HWRF to support multiple such independent storm-following pair of grids. The result was improved forecast accuracy by virtue of modeling storm-to-storm interactions in high resolution. However, some shortcomings in the underlying WRF framework cause these independent pairs of grids to be simulated sequentially. This limits the model's scalability and makes it impossible to harness this novel capability within the operational time constraints. In this paper, we address this issue by modifying the underlying WRF framework to simulate these independent pairs of storm-following grids in parallel. This is the first approach to be successfully implemented in the history of the WRF framework.

Keywords: High-performance scientific and engineering computing; HWRF; WRF; parallel and distributed algorithms

A data structure perspective to the RDD-based Apriori algorithm on Spark

International Journal of Information Technology (Singapore), 2022, Vol. 14, No. 3, pp. 1585-1594

Authors: Singh, Pankaj; Singh, Sudhakar; Mishra, P.K.; Garg, Rakhi. Affiliations: Department of Computer Science, Banaras Hindu University, Varanasi, India; Department of Electronics and Communication, University of Allahabad, Prayagraj, India; Mahila Maha Vidyalaya, Banaras Hindu University, Varanasi, India

During the recent years, a number of efficient and scalable frequent itemset mining algorithms for big data analytics have been proposed by many researchers. Initially, MapReduce-based frequent itemset mining algorithms on Hadoop cluster were proposed. Although, Hadoop has been developed as a cluster computing system for handling and processing big data, but the performance of Hadoop does not meet the expectation for the iterative algorithms of data mining, due to its high I/O, and writing and then reading intermediate results in the disk. Consequently, Spark has been developed as another cluster computing infrastructure which is much faster than Hadoop due to its in-memory computation. It is highly suitable for iterative algorithms and supports batch, interactive, iterative, and stream processing of data. Many frequent itemset mining algorithms have been re-designed on the Spark, and most of them are Apriori-based. All these Spark-based Apriori algorithms use Hash Tree as the underlying data structure. This paper investigates the efficiency of various data structures for the Spark-based Apriori. Although, the data structure perspective has been investigated previously, but for the MapReduce-based Apriori, and it must be re-investigated in the distributed computing environment of Spark. The considered underlying data structures are Hash Tree, Trie, and Hash Table Trie. The experimental results on the benchmark datasets show that the performance of Spark-based Apriori with Trie and Hash Table Trie are almost similar but both perform many times better than Hash Tree in the distributed computing environment of Spark. © 2019, Bharati Vidyapeeth's Institute of Computer Applications and Management.

Keywords: Apriori; Big data analytics; Frequent itemset mining; parallel and distributed algorithms; RDD; Spark
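As a rough illustration of the Trie data structure that the abstract above compares against Hash Tree, the sketch below stores Apriori candidate itemsets in a prefix tree and counts their support in a list of transactions. It is a single-machine example in plain Python, not the paper's RDD-based Spark code; the class names `TrieNode` and `CandidateTrie` and their methods are hypothetical.

```python
class TrieNode:
    """One item on a path; a node flagged is_candidate ends a candidate itemset."""
    __slots__ = ("children", "count", "is_candidate")

    def __init__(self):
        self.children = {}
        self.count = 0
        self.is_candidate = False


class CandidateTrie:
    """Prefix tree over candidate itemsets, each stored as a sorted path of items."""

    def __init__(self, candidates):
        self.root = TrieNode()
        for itemset in candidates:
            node = self.root
            for item in sorted(itemset):
                node = node.children.setdefault(item, TrieNode())
            node.is_candidate = True

    def count_transaction(self, transaction):
        """Increment the count of every candidate contained in the transaction."""
        items = sorted(set(transaction))
        self._walk(self.root, items, 0)

    def _walk(self, node, items, start):
        if node.is_candidate:
            node.count += 1
        # Follow only children whose item occurs later in the sorted transaction.
        for i in range(start, len(items)):
            child = node.children.get(items[i])
            if child is not None:
                self._walk(child, items, i + 1)

    def supports(self):
        """Return {itemset tuple: support count} for all stored candidates."""
        out = {}

        def collect(node, prefix):
            if node.is_candidate:
                out[prefix] = node.count
            for item, child in node.children.items():
                collect(child, prefix + (item,))

        collect(self.root, ())
        return out


if __name__ == "__main__":
    transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"], ["a", "c", "d"]]
    trie = CandidateTrie([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])
    for t in transactions:
        trie.count_transaction(t)
    print(trie.supports())
```

In a Spark setting, one plausible arrangement is to build such a trie from the broadcast candidate set and apply the counting step per partition; that wiring is an assumption and is not shown here.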

PRACTICE AND EXPERIENCE IN USING PARALLEL AND SCALABLE MACHINE LEARNING IN REMOTE SENSING FROM HPC OVER CLOUD TO QUANTUM COMPUTING

IEEE International Geoscience and Remote Sensing Symposium (IGARSS)

Authors: Riedel, Morris; Cavallaro, Gabriele; Benediktsson, Jon Atli. Affiliations: Univ Iceland, Sch Engn & Nat Sci, Reykjavik, Iceland; Forschungszentrum Julich, Julich Supercomp Ctr, Julich, Germany

ISBN: (print) 9781665403696

Using computationally efficient techniques for transforming the massive amount of Remote Sensing (RS) data into scientific understanding is critical for Earth science. The utilization of efficient techniques through innovative computing systems in RS applications has become more widespread in recent years. The continuously increased use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., 'big data') requires powerful computing resources with equally increasing performance. This paper reviews recent advances in High-Performance Computing (HPC), Cloud Computing (CC), and Quantum Computing (QC) applied to RS problems. It thus represents a snapshot of the state-of-the-art in ML in the context of the most recent developments in those computing areas, including our lessons learned over the last years. Our paper also includes some recent challenges and good experiences by using Europeans fastest supercomputer for hyper-spectral and multi-spectral image analysis with state-of-the-art data analysis tools. It offers a thoughtful perspective of the potential and emerging challenges of applying innovative computing paradigms to RS problems.

Keywords: High performance computing; cloud computing; quantum computing; machine learning; deep learning; parallel and distributed algorithms; remote sensing

Efficiency Optimization Method of Wireless Federated Learning Considering Computational Capability and Channel State

23rd IEEE International Conference on Communication Technology, ICCT 2023

Authors: Pang, Guohao; Li, Fengguo; Zhu, Xiaorong. Affiliations: College of Portland, Nanjing University of Posts and Telecommunications, Nanjing, China; China Mobile Company, Shandong, China; College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China

ISBN: (print) 9798350325959

Due to the explosive growth in the variety of smart mobile terminals in wireless networks, the increasing computing capability of mobile chips, and the public's growing concern for personal privacy, it is a better solution to decentralize the deep learning framework for mobile services that can enhance user experience to the mobile terminal layer. In this paper, we study the joint optimization problem of processor performance and channel state in a non-independent distribution scenario (non-IID), while considering the user's device experience problem to improve the battery efficiency of the terminal device (TD) as much as possible and maximize the efficiency of the federated learning (FL) system while ensuring low local upload latency. To improve the efficiency of wireless federated learning (WFL), we propose a specific and complete scheduling strategy involving both computational and communication aspects. First, we model the total problem and decouple it into several sub-problems to solve according to the nature of the variables. Then, we propose a Reduced Load algorithm (RL) to solve the task allocation problem and a dynamic bandwidth allocation strategy to solve the bandwidth allocation problem. Simulation results show that the proposed scheduling strategy can achieve higher learning performance with lower training latency and is capable of adaptively adjusting the bandwidth allocation to decrease upload latency. © 2023 IEEE.

Keywords: Federated learning; parallel and distributed algorithms; resource allocation; scheduling policies; wireless communication

CoCoA: A General Framework for Communication-Efficient Distributed Optimization

JOURNAL OF MACHINE LEARNING RESEARCH, 2018, Vol. 18

Authors: Smith, Virginia; Forte, Simone; Ma, Chenxin; Takac, Martin; Jordan, Michael I.; Jaggi, Martin. Affiliations: Stanford Univ, Dept Comp Sci, Stanford, CA 94305, USA; Swiss Fed Inst Technol, Dept Comp Sci, CH-8006 Zurich, Switzerland; Lehigh Univ, Ind & Syst Engn Dept, Bethlehem, PA 18015, USA; Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720, USA; Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720, USA; Ecole Polytech Fed Lausanne, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland

The scale of modern datasets necessitates the development of efficient distributed optimization methods for machine learning. We present a general-purpose framework for distributed computing environments, CoCoA, that has an efficient communication scheme and is applicable to a wide variety of problems in machine learning and signal processing. We extend the framework to cover general non-strongly-convex regularizers, including L1-regularized problems like lasso, sparse logistic regression, and elastic net regularization, and show how earlier work can be derived as a special case. We provide convergence guarantees for the class of convex regularized loss minimization objectives, leveraging a novel approach in handling non-strongly-convex regularizers and non-smooth loss functions. The resulting framework has markedly improved performance over state-of-the-art methods, as we illustrate with an extensive set of experiments on real distributed datasets.

Keywords: Convex optimization; distributed systems; large-scale machine learning; parallel and distributed algorithms

