ISBN (Print): 9781450355810
In the age of network sciences and machine learning, efficient algorithms are in higher demand than ever before. Big Data fundamentally challenges the classical notion of efficient algorithms: algorithms that used to be considered efficient, according to the polynomial-time characterization, may no longer be adequate for solving today's problems. It is not just desirable, but essential, that efficient algorithms be scalable: their complexity should be nearly linear or sub-linear with respect to the problem size. Thus scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation. In this talk, using several basic tasks in network analysis, machine learning, and optimization as examples, I will highlight a family of fundamental algorithmic techniques for designing provably-good scalable algorithms.
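To make the sub-linear notion concrete, here is a minimal sketch (not from the talk; the function and data names are hypothetical) of a sampling-based estimator whose cost depends on the desired accuracy rather than on the input size:

```python
import random

def estimate_fraction(data, predicate, num_samples=2000, seed=0):
    """Estimate the fraction of items satisfying `predicate` by sampling.

    Sub-linear in spirit: the cost depends on num_samples (the accuracy
    target), not on len(data), which is the point of scalability as a
    complexity notion.
    """
    rng = random.Random(seed)
    hits = sum(predicate(data[rng.randrange(len(data))])
               for _ in range(num_samples))
    return hits / num_samples

big = list(range(10_000_000))
print(estimate_fraction(big, lambda x: x % 3 == 0))  # ~0.333 from 2000 samples
```

By standard concentration bounds, the estimation error shrinks like one over the square root of the sample count, independent of the data size.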
This editorial outlines the research context and the needs and challenges on the route to exascale. In particular, the focus is on novel mathematical methods and mathematical modeling approaches, together with scalable scientific algorithms, that are needed to enable key science applications at extreme scale. This is especially true as HPC systems continue to scale up in compute node and processor core count. These extreme-scale systems require novel mathematical methods that lead to scalable scientific algorithms which hide network and memory latency, achieve very high computation/communication overlap, require minimal communication, and have fewer synchronization points. The editorial stresses the need for scalability at all levels, from the mathematical-methods level through the algorithmic level down to the systems level, in order to achieve overall scalability. It also points out that, with the advances of Data Science in the past few years, the need for such scalable mathematical methods and algorithms, able to handle data- and compute-intensive applications at scale, becomes even more important. The papers in the special issue were selected to address one or several key challenges on the route to exascale. (C) 2016 Published by Elsevier B.V.
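As one concrete illustration of computation/communication overlap (a sketch under the assumption that mpi4py is available and the script runs on exactly two MPI ranks; it is not taken from the editorial or the special-issue papers):

```python
# Run with: mpirun -n 2 python overlap.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                       # assumes exactly two ranks

halo_out = np.full(1024, float(rank))
halo_in = np.empty(1024)

# 1) Post non-blocking sends/receives first ...
reqs = [comm.Isend(halo_out, dest=peer), comm.Irecv(halo_in, source=peer)]

# 2) ... then do interior computation that needs no remote data, so the
#    transfer proceeds "under" the arithmetic (computation/communication
#    overlap with a single synchronization point).
interior = np.random.rand(1000, 1000)
local_result = (interior @ interior.T).trace()

# 3) Synchronize only when the boundary data is actually needed.
MPI.Request.Waitall(reqs)
print(f"rank {rank}: {local_result + halo_in.sum():.3f}")
```

The pattern, post communication early, compute on local data, wait late, is exactly the latency hiding and reduced synchronization the editorial calls for.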
Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees while enabling them to better capture linear relationships, which is hard for standard decision trees. However, most existing methods for fitting linear model trees are time-consuming and therefore do not scale to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable, and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an L2 boosting approach and a model selection rule for fitting the linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where 'organic' refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data are generated by a linear model, the convergence rate is polynomial.
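The abstract contains no code; the following toy sketch (hypothetical names, plain least squares, no L2 boosting or model selection rule) shows only the core idea of a greedy split whose children hold linear models rather than constants:

```python
import numpy as np

def rss_of_linear_fit(x, y):
    """Residual sum of squares of a least-squares line y ~ a + b*x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(((y - X @ coef) ** 2).sum())

def best_linear_split(x, y, min_leaf=5):
    """Greedy 1-D split where each child fits a line, not a constant."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = (np.inf, None)
    for i in range(min_leaf, len(x) - min_leaf):
        total = rss_of_linear_fit(x[:i], y[:i]) + rss_of_linear_fit(x[i:], y[i:])
        if total < best[0]:
            best = (total, 0.5 * (x[i - 1] + x[i]))
    return best  # (total RSS, split threshold)

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, 300)
y = np.where(x < 1, 2 * x, 4 - 2 * x) + rng.normal(0, 0.05, 300)  # kink at x=1
print(best_linear_split(x, y))  # threshold recovered near 1.0
```

A constant-leaf tree would need many splits to approximate each linear segment; a single linear split captures the kink exactly, which is the motivation the abstract states.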
Scalable algorithms are conceived for obtaining the sum-rate capacity of the reconfigurable intelligent surface (RIS)-aided multiuser (MU) multiple-input multiple-output (MIMO) broadcast channel (BC), in which a multi-antenna base station (BS) transmits signals to multi-antenna users with the help of an RIS equipped with a massive number of finite-resolution programmable reflecting elements (PREs). As a byproduct, scalable path-following algorithms emerge for determining the sum-rate capacity of conventional MIMO BCs, closing a long-standing open problem of information theory. The paper also develops scalable algorithms for maximizing the minimum rate (max-min rate optimization) of the users, achieved by the joint design of the RIS's PREs and the transmit beamforming for such an RIS-aided BC. The simulations provided confirm the high performance achieved by the developed algorithms despite their low computational complexity.
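The paper's path-following algorithms are not reproduced here; as background intuition for sum-rate power allocation, below is the classical single-link water-filling solution, a textbook ingredient rather than the authors' method (names and parameters are illustrative):

```python
import numpy as np

def water_filling(gains, total_power, iters=60):
    """Classical water-filling: maximize sum(log2(1 + p_i * g_i))
    subject to sum(p_i) = P, p_i >= 0, by bisection on the water level mu."""
    lo, hi = 0.0, total_power + 1.0 / gains.min()
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        p = np.maximum(0.0, mu - 1.0 / gains)
        if p.sum() > total_power:   # water level too high
            hi = mu
        else:
            lo = mu
    return p

g = np.array([2.0, 1.0, 0.25, 0.1])
p = water_filling(g, total_power=4.0)
print(p, np.log2(1 + p * g).sum())  # stronger channels receive more power
```

The MIMO BC capacity problem layered on top of RIS phase design is far harder, which is why scalable path-following methods are the paper's contribution.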
Delaunay refinement is a useful tool for sampling and meshing. Pioneered by Ruppert and Chew for piecewise-linear complexes in R^2, Delaunay refinement has been extended to myriad classes of shapes including smooth 1- and 2-manifolds, volumes bounded by smooth 2-manifolds, and piecewise-smooth complexes. Delaunay refinement algorithms often carry certain guarantees regarding the geometric and topological closeness of output to input, as well as guarantees of the quality of mesh elements, making meshes generated via Delaunay refinement a natural choice for simulation and rendering. A major shortcoming of Delaunay refinement is its scalability: as the size of the mesh grows, the data structures necessary to carry out Delaunay refinement efficiently (such as the Delaunay triangulation and its dual, the Voronoi diagram) also grow, and this incurs memory thrashing when generating dense meshes. In this dissertation, we improve Delaunay refinement in two main respects: (1) we improve the memory scalability of Delaunay refinement to allow for the generation of truly huge meshes, and (2) we improve the time scalability of Delaunay refinement by developing a novel parallel algorithm. To address the issue of memory scalability, we developed a localized refinement method embodying a divide-and-conquer paradigm for meshing smooth surfaces. The algorithm divides the sampling of the input domain via octree and meshes one node of the octree at a time, thus reducing memory requirements. Our theoretical results show that the algorithm terminates, and that the output is not only consistent, but also is geometrically close to the input and is a subcomplex of the Delaunay triangulation of the sampling restricted to the input surface. Our initial work nicely addresses the aforementioned shortcoming of Delaunay refinement, but only for smooth 2-manifolds. In later work, we extended this technique to another important input class: volumes bounded by smooth 2-manifolds. It is not immediat...
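As a rough illustration of the divide step only (a hypothetical class over point samples, with no Delaunay refinement or surface restriction), a minimal octree that subdivides until each leaf is small enough to process in-core:

```python
import numpy as np

class OctreeNode:
    """Minimal point-set octree: subdivide until each leaf holds few
    samples, mirroring the divide step that lets each node be meshed
    independently with bounded memory."""

    def __init__(self, points, center, half, max_points=32, depth=0, max_depth=8):
        self.center, self.half = center, half
        self.children = []
        if len(points) > max_points and depth < max_depth:
            for octant in range(8):
                sign = np.array([(octant >> k) & 1 for k in range(3)]) * 2 - 1
                c = center + sign * half / 2          # child box center
                mask = np.all((points >= c - half / 2) &
                              (points <  c + half / 2), axis=1)
                if mask.any():
                    self.children.append(OctreeNode(points[mask], c, half / 2,
                                                    max_points, depth + 1,
                                                    max_depth))
        else:
            self.points = points  # a leaf: small enough to mesh in-core

rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(5000, 3))
root = OctreeNode(pts, center=np.zeros(3), half=np.ones(3))
print(len(root.children))  # 8 populated octants for uniform samples
```

The dissertation's harder part, guaranteeing that per-node meshes stitch into a consistent subcomplex of the restricted Delaunay triangulation, is not captured by this sketch.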
Genome sequencing projects are rapidly contributing to the rise of high-dimensional protein sequence datasets, and extracting features from such datasets poses many challenges. Many feature extraction methods exist, but extracting features from millions of protein sequences with these approaches is impractical because they are not scalable. Therefore, to design an efficient, scalable feature extraction approach that extracts significant features, we propose two Apache Spark-based scalable feature extraction approaches that extract statistically important features from huge protein sequence collections, termed 60d-SPF (60-dimensional scalable Protein Feature) and 6d-SCPSF (6-dimensional scalable Co-occurrence-based Probability-Specific Feature). The proposed 60d-SPF and 6d-SCPSF approaches capture the statistical properties of amino acids to create a fixed-length numeric feature vector that represents each protein sequence in terms of 60-dimensional and 6-dimensional features, respectively. The preprocessed protein sequences are used as input to four clustering algorithms: scalable random sampling with iterative optimization fuzzy c-means (SRSIO-FCM), scalable literal fuzzy c-means (SLFCM), kernelized SRSIO-FCM (KSRSIO-FCM), and kernelized SLFCM (KSLFCM). We conduct extensive experiments on various soybean protein datasets to demonstrate the effectiveness of the proposed feature extraction methods, 60d-SPF and 6d-SCPSF, against existing feature extraction methods on the SRSIO-FCM, SLFCM, KSRSIO-FCM, and KSLFCM clustering algorithms. The results, reported in terms of the Silhouette index and the Davies-Bouldin index, show that the proposed 60d-SPF extraction method achieves significantly better results on all four clustering algorithms than the proposed 6d-SCPSF and the existing feature extraction approaches.
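The exact 60d-SPF and 6d-SCPSF statistics are defined in the paper; the sketch below (hypothetical names, plain 20-dimensional amino-acid frequencies only) illustrates the general Spark pattern of mapping each sequence to a fixed-length feature vector in parallel:

```python
# A minimal PySpark sketch of scalable, fixed-length sequence features.
from pyspark.sql import SparkSession

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def frequency_vector(seq):
    """Map one protein sequence to a fixed-length numeric feature vector."""
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]

spark = SparkSession.builder.appName("protein-features").getOrCreate()
sequences = spark.sparkContext.parallelize(
    ["MKTAYIAKQR", "GAVLIMCFYW", "MKKLLPTAAA"])  # stand-ins for a huge dataset
features = sequences.map(frequency_vector).collect()
print(features[0])  # one 20-dimensional row per sequence
spark.stop()
```

Because each sequence maps to its vector independently, the job parallelizes trivially across a cluster, which is the scalability property the proposed approaches exploit.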
ISBN (Print): 9798400704369
Edge-computing methods allow devices to efficiently train high-performing, robust, and personalized models for predictive tasks. However, these methods suffer from privacy and scalability concerns such as adversarial data recovery and expensive model communication. Furthermore, edge-computing methods unrealistically assume that all devices train an identical model. In practice, edge devices have varying computational and memory constraints, which may not allow certain devices the space or speed to train a specific model. To overcome these issues, we propose MIDDLE, a model-independent distributed learning algorithm that allows heterogeneous edge devices to assist each other in training while communicating only non-sensitive information. MIDDLE unlocks the ability of edge devices, regardless of computational or memory constraints, to assist each other even with completely different model architectures. Furthermore, MIDDLE does not require model or gradient communication, significantly reducing communication size and time. We prove that MIDDLE attains the optimal convergence rate O(1/√(TM)) of stochastic gradient descent for convex and non-convex smooth optimization (for total iterations T and batch size M). Finally, our experimental results demonstrate that MIDDLE attains robust and high-performing models without model or gradient communication.
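MIDDLE itself is not sketched here, since the abstract does not specify what non-sensitive information is exchanged; the toy below (hypothetical names) only illustrates the O(1/√(TM)) behaviour the rate statement refers to, with the error floor of minibatch SGD shrinking as the total sample count T·M grows:

```python
import numpy as np

def minibatch_sgd(grad_oracle, x0, steps, batch_size):
    """Plain minibatch SGD on a smooth objective; averaging M noisy
    gradients per step is what the O(1/sqrt(T*M)) rate reflects."""
    x, lr = x0, 1.0 / np.sqrt(steps * batch_size)
    for _ in range(steps):
        g = np.mean([grad_oracle(x) for _ in range(batch_size)], axis=0)
        x = x - lr * g
    return x

rng = np.random.default_rng(0)
target = np.array([3.0, -1.0])
noisy_grad = lambda x: 2.0 * (x - target) + rng.normal(0.0, 2.0, size=2)

for T, M in [(100, 1), (100, 16), (1600, 16)]:
    err = np.linalg.norm(minibatch_sgd(noisy_grad, np.zeros(2), T, M) - target)
    print(f"T={T:5d} M={M:3d} error={err:.3f}")  # error falls as T*M grows
```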
ISBN (Print): 9798400704901
Today, as increasingly complex predictive models are developed, simple rule sets remain a crucial tool for obtaining interpretable predictions and driving high-stakes decision making. However, a single rule set provides only a partial representation of a learning task. An emerging paradigm in interpretable machine learning aims to explore the Rashomon set of all models exhibiting near-optimal performance. Existing work on Rashomon-set exploration focuses on exhaustive search of the Rashomon set for particular classes of models, which can be a computationally challenging task. Moreover, exhaustive enumeration introduces redundancy that is often unnecessary: a representative sample, or an estimate of the size of the Rashomon set, is sufficient for many applications. In this work, we propose, for the first time, efficient methods to explore the Rashomon set of rule-set models with or without exhaustive search. Extensive experiments demonstrate the effectiveness of the proposed methods in a variety of scenarios.
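To make the Rashomon-set notion concrete, here is a deliberately brute-force toy (hypothetical names; the paper's contribution is precisely to avoid such exhaustive search) that collects every small conjunctive rule within epsilon of the best accuracy:

```python
from itertools import combinations
import numpy as np

def rashomon_rules(X, y, epsilon=0.05, max_literals=2):
    """Brute-force Rashomon set over tiny conjunctive rules: return every
    rule whose accuracy is within epsilon of the best rule found."""
    scored = []
    for k in range(1, max_literals + 1):
        for feats in combinations(range(X.shape[1]), k):
            pred = X[:, list(feats)].all(axis=1)   # rule: AND of the literals
            scored.append((float(np.mean(pred == y)), feats))
    best = max(acc for acc, _ in scored)
    return best, sorted((a, f) for a, f in scored if a >= best - epsilon)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 6)).astype(bool)
X[:, 2] = X[:, 1] ^ (rng.random(500) < 0.05)       # a near-duplicate feature
y = X[:, 0] & X[:, 1]
print(rashomon_rules(X, y))  # more than one near-optimal rule survives
```

Even in this tiny example, correlated features produce several near-optimal rules, which is why a single "best" rule set can be a misleading summary of the task.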
This paper considers a cell-free massive multiple-input multiple-output (cfm-MIMO) network with a massive number of access points (APs) distributed across an area to deliver information to multiple users. Based on local channel state information only, conjugate beamforming is used under both proper and improper Gaussian signaling. To accomplish the mission of cfm-MIMO in providing fair service to all users, the problem of power allocation to maximize the geometric mean (GM) of the users' rates (GM-rate) is considered. A new scalable algorithm, which iterates linear-complexity closed-form expressions and is thus practical regardless of the scale of the network, is developed for its solution. The problem of quality-of-service (QoS)-aware network energy efficiency is also addressed, by maximizing the ratio of the GM-rate to the total power consumption, again via iterating linear-complexity closed-form expressions. Extensive simulations demonstrate the ability of the GM-rate based optimization to achieve multiple targets simultaneously: a uniform QoS, a good sum rate, and a fair power allocation across the APs.
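The paper's closed-form iterations are not given in the abstract; as a generic stand-in, the sketch below maximizes the log of the GM-rate by projected gradient ascent over a total power budget (hypothetical names, single-antenna toy channels, arbitrary step size):

```python
import numpy as np

def project_to_budget(p, total):
    """Euclidean projection onto {p >= 0, sum(p) = total}."""
    u = np.sort(p)[::-1]
    css = np.cumsum(u) - total
    rho = np.nonzero(u > css / (np.arange(len(p)) + 1))[0][-1]
    return np.maximum(p - css[rho] / (rho + 1), 0.0)

def gm_rate_power(gains, total, steps=2000, lr=0.05):
    """Projected gradient ascent on sum(log(rate_k)), i.e. the log of the
    GM-rate; a generic stand-in for the paper's closed-form iterations."""
    p = np.full(len(gains), total / len(gains))
    for _ in range(steps):
        snr = p * gains
        grad = gains / ((1.0 + snr) * np.log1p(snr))  # d/dp log(ln(1+p*g))
        p = project_to_budget(p + lr * grad, total)
        p = np.maximum(p, 1e-6)                       # keep every rate > 0
    return p

g = np.array([4.0, 1.0, 0.2])
p = gm_rate_power(g, total=3.0)
print(p)  # all users get positive power, unlike sum-rate water-filling here
```

Because the GM vanishes if any single rate vanishes, the objective inherently protects the weakest users, which is the fairness property the paper exploits.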
ISBN (Print): 9798400701030
Graphs are a representation of structured data that captures the relationships between sets of objects. With the ubiquity of available network data, there is an increasing industrial and academic need to quickly analyze graphs with billions of nodes and trillions of edges. A common first step in network understanding is graph embedding, the process of creating a continuous representation of the nodes of a graph. A continuous representation is often more amenable, especially at scale, to solving downstream machine learning tasks such as classification, link prediction, and clustering. We present a high-performance graph embedding architecture that leverages Tensor Processing Units (TPUs) with configurable amounts of high-bandwidth memory, simplifies the graph embedding problem, and scales to graphs with billions of nodes and trillions of edges. We verify the quality of the embedding space on real and synthetic large-scale datasets.
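The TPU architecture itself is not reproducible from the abstract; the numpy toy below (hypothetical names) shows only the kind of embedding objective such systems scale, pulling endpoints of observed edges together and pushing negative-sampled pairs apart:

```python
import numpy as np

def embed_graph(edges, num_nodes, dim=8, epochs=200, lr=0.1, neg=5, seed=0):
    """Toy skip-gram-style node embedding with negative sampling. Production
    systems shard exactly this workload across accelerator memory."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(0, 0.1, size=(num_nodes, dim))
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
    for _ in range(epochs):
        for u, v in edges:
            pairs = [(u, v, 1.0)] + [(u, rng.integers(num_nodes), 0.0)
                                     for _ in range(neg)]
            for a, b, label in pairs:
                grad = sigmoid(Z[a] @ Z[b]) - label   # logistic-loss gradient
                Z[a], Z[b] = Z[a] - lr * grad * Z[b], Z[b] - lr * grad * Z[a]
    return Z

edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]  # two triangles
Z = embed_graph(edges, num_nodes=6)
print(Z[0] @ Z[1], Z[0] @ Z[4])  # within-cluster dot product comes out larger
```

The embedding table is the scaling bottleneck: with billions of nodes it no longer fits on one machine, which motivates the configurable high-bandwidth-memory design the paper describes.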