ISBN: (print) 9798350395679; 9798350395662
With the advent and growing usage of Machine Learning as a Service (MLaaS), cloud and network systems now offer the possibility to deploy ML tasks on heterogeneous clusters. Network and cloud operators then have to schedule these tasks, determining both when and on which devices to execute them. In parallel, several solutions, such as neural network compression, have been proposed to build small models that can run on limited hardware. These solutions allow choosing the model size at inference time for any targeted processing time, without having to re-train the network. In this work, we consider the Deadline Scheduling with Compressible Tasks (DSCT) problem: a novel scheduling problem with task deadlines in which the tasks can be compressed. Each task can be executed with a certain compression, presenting a trade-off between its compression level (and hence its processing time) and the utility it yields. The objective is to maximize the total utility of the tasks. We propose an approximation algorithm with provable guarantees to solve the problem, and we validate its efficiency with extensive simulations, obtaining near-optimal results. As an application scenario, we study the problem when the tasks are Deep Learning classification jobs and the objective is to maximize their global accuracy, but we believe that this framework and its solutions apply to a wide range of application cases.
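To make the compression trade-off concrete, here is a minimal sketch (not the paper's approximation algorithm) that solves toy DSCT instances exactly with a dynamic program, assuming a single machine, integer processing times, and a discrete menu of (processing time, utility) pairs per task; skipping a task is allowed for simplicity:

```python
# Exact DP for toy DSCT instances: single machine, integer processing
# times, one (processing_time, utility) pair per compression level.
def schedule_compressible(tasks):
    """tasks: list of (deadline, [(proc_time, utility), ...]).
    Any feasible task set can be run in earliest-deadline-first order,
    so we scan tasks by deadline and track achievable finish times."""
    tasks = sorted(tasks, key=lambda t: t[0])
    best = {0: 0.0}  # best[t] = max utility with total busy time exactly t
    for deadline, levels in tasks:
        new_best = dict(best)                    # option: skip this task
        for finish, util in best.items():
            for proc, gain in levels:
                t = finish + proc
                if t <= deadline and new_best.get(t, -1.0) < util + gain:
                    new_best[t] = util + gain
        best = new_best
    return max(best.values())

# Toy instance: heavier compression shortens a task but lowers its utility.
tasks = [(4, [(4, 1.0), (2, 0.8)]), (5, [(3, 0.9), (1, 0.6)])]
print(schedule_compressible(tasks))  # 1.7: task 1 compressed, task 2 full
```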
ISBN: (print) 9781665455190
In a weighted directed graph G = (V, E, ω) with m edges and n vertices, we are interested in basic graph parameters such as diameter, radius and eccentricities under the nonstandard min-distance measure, defined for every pair of vertices u, v ∈ V as the minimum of the shortest-path distances from u to v and from v to u. As with standard shortest-path distances, computing these parameters exactly in terms of min-distances essentially requires Ω̃(mn) time under plausible hardness conjectures. Hence, for faster running times we have to tolerate approximations. Abboud, Vassilevska Williams and Wang [SODA 2016] were the first to study min-distance problems, and they obtained constant-factor approximation algorithms in acyclic graphs, with running time Õ(m) and Õ(m√n) for diameter and radius, respectively. The time complexity of radius in acyclic graphs was recently improved to Õ(m) by Dalirrooyfard and Kaufmann [ICALP 2021], but at the cost of an O(log n) approximation ratio. For general graphs, the authors of [DWV+, ICALP 2019] gave the first constant-factor approximation algorithm for diameter, radius and eccentricities, which runs in time Õ(m√n); moreover, for the diameter problem, the running time can be improved to Õ(m) while blowing up the approximation ratio to O(log n). A natural question is whether constant approximation and near-linear time can be achieved simultaneously for diameter, radius and eccentricities; so far this is only possible for diameter in the restricted setting of acyclic graphs. In this paper, we answer this question in the affirmative by presenting near-linear-time algorithms for all three parameters in general graphs.
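For intuition, the Õ(mn)-time exact baseline that these approximation algorithms are designed to beat is straightforward, assuming non-negative edge weights: run Dijkstra from every vertex on the graph and on its reverse, then combine the two distances. A minimal sketch:

```python
import heapq

def dijkstra(adj, src, n):
    dist = [float('inf')] * n
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def min_distance_parameters(n, edges):
    adj = [[] for _ in range(n)]   # forward graph
    radj = [[] for _ in range(n)]  # reverse graph
    for u, v, w in edges:
        adj[u].append((v, w))
        radj[v].append((u, w))
    ecc = []
    for u in range(n):
        fwd, bwd = dijkstra(adj, u, n), dijkstra(radj, u, n)
        # d_min(u, v) = min(d(u, v), d(v, u))
        ecc.append(max(min(fwd[v], bwd[v]) for v in range(n) if v != u))
    return max(ecc), min(ecc), ecc  # diameter, radius, eccentricities

edges = [(0, 1, 2.0), (1, 2, 1.0), (2, 0, 5.0)]
print(min_distance_parameters(3, edges))  # (3.0, 2.0, [3.0, 2.0, 3.0])
```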
ISBN: (print) 9781611977936
Given a set of points P, the k-center problem is to find a minimum radius r* and an associated set C of k centers such that the distance from each point in P to its closest center is at most r*. While this problem is NP-hard to solve exactly, there exists a famous O(kn)-time 2-approximation algorithm due to Gonzalez. It works by repeatedly adding the point of P furthest from the current center set C until C contains exactly k centers. The dynamic version of the problem is to maintain P over insertions and deletions of points in a way that permits efficiently solving the k-center problem for the current P. There are various specialized (2 + ε)-approximation algorithms for this. Another dynamic problem, which does not seem to have been previously studied, is how to dynamically maintain a set of points P so as to efficiently solve the approximate furthest-neighbor problem: given a second point set C′, find a point p ∈ P that is a (1 + ε)-approximate furthest neighbor from C′. We show that, for points in bounded doubling dimension, the approximate furthest-neighbor problem can be solved using the known navigating-nets data structure in a new way. This immediately provides a new algorithm for the dynamic k-center problem: replace the search for the furthest neighbor in P from C in Gonzalez's algorithm with an approximate furthest-neighbor search. Unlike some of the older algorithms, this new approach does not require knowing k or ε in advance. It can also be used to solve the dynamic Euclidean k-center problem.
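Gonzalez's algorithm, as described above, is short enough to state directly. The sketch below is a plain static implementation; the paper's contribution is to replace the exact furthest-neighbor search with an approximate one over a dynamic navigating-nets structure:

```python
import math

def gonzalez(points, k):
    """O(kn) 2-approximation: repeatedly add the furthest point."""
    centers = [points[0]]                       # arbitrary first center
    dist = [math.dist(p, centers[0]) for p in points]
    for _ in range(k - 1):
        i = max(range(len(points)), key=lambda j: dist[j])
        centers.append(points[i])               # furthest point joins C
        dist = [min(d, math.dist(p, points[i]))
                for d, p in zip(dist, points)]
    return centers, max(dist)                   # centers, achieved radius

pts = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 5)]
print(gonzalez(pts, 2))   # two centers and radius about 6.4
```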
ISBN: (print) 9781665460231
In this paper, a novel data-driven approximation of the Koopman operator is proposed to simulate coherent, spatio-temporally correlated sea clutter. The evolution of the sea clutter's in-phase and quadrature (I/Q) components is simulated in the state space, while the amplitude distribution and the retrieved phase of the sea clutter are extracted and modeled. Experimental sea clutter data measured by the IPIX radar are used to demonstrate that the proposed algorithm can simulate sea clutter with the retrieved phase, the expected amplitude information, and the spatio-temporal relationship. Our work provides a useful and powerful simulation scheme for sea clutter, especially when Doppler information is needed.
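The generic mechanism behind such data-driven Koopman approximations can be illustrated with standard dynamic mode decomposition (DMD) on a toy complex-valued (I/Q-like) signal; the paper's specific algorithm and the IPIX data are not reproduced here, and the delay-embedding and rank choices are illustrative assumptions:

```python
import numpy as np

def dmd_operator(X, Y, rank):
    """Fit a rank-limited linear operator A with Y ~ A X (exact DMD)."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    return U @ A_tilde @ U.conj().T

rng = np.random.default_rng(0)
t = np.arange(400)
# Toy coherent I/Q signal: two Doppler tones plus noise.
sig = (np.exp(1j * 0.3 * t) + 0.5 * np.exp(1j * 0.8 * t)
       + 0.05 * rng.standard_normal(400))
emb = np.array([sig[i:i + 300] for i in range(20)])  # delay embedding
A = dmd_operator(emb[:, :-1], emb[:, 1:], rank=6)    # snapshot pairs

state, synth = emb[:, -1], []
for _ in range(100):                # roll the linear model forward
    state = A @ state
    synth.append(state[0])          # synthesized continuation samples
print(np.abs(np.array(synth))[:5])
```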
ISBN: (print) 1577358872
The maximum Nash social welfare (NSW), which maximizes the geometric mean of agents' utilities, is a fundamental solution concept with remarkable fairness and efficiency guarantees. The computational aspects of NSW have been extensively studied for one-sided preferences, where a set of agents have preferences over a set of resources. Our work deviates from this trend and studies NSW maximization for two-sided preferences, wherein a set of workers and firms, each having a cardinal valuation function, are matched with each other. We provide a systematic study of the computational complexity of maximizing NSW for many-to-one matchings under two-sided preferences. Our main negative result is that maximizing NSW is NP-hard even in a highly restricted setting where each firm has capacity 2, all valuations are in the range {0, 1, 2}, and each agent positively values at most three other agents. In search of positive results, we develop approximation algorithms as well as parameterized algorithms in terms of natural parameters such as the number of workers, the number of firms, and the firms' capacities. We also provide algorithms for restricted domains such as symmetric binary valuations and bounded-degree instances.
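For tiny instances, the objective itself is easy to evaluate by brute force. The sketch below enumerates all capacity-feasible worker-to-firm assignments and returns the NSW-maximizing one, assuming (our choice, not necessarily the paper's model) that firm utilities are additive over assigned workers:

```python
from itertools import product
from math import prod

def max_nsw(worker_vals, firm_vals, capacity):
    """worker_vals[w][f]: worker w's value for firm f;
    firm_vals[f][w]: firm f's value for worker w (additive over workers)."""
    n_w, n_f = len(worker_vals), len(firm_vals)
    best, best_assign = -1.0, None
    for assign in product(range(n_f), repeat=n_w):   # worker -> firm map
        if any(assign.count(f) > capacity for f in range(n_f)):
            continue                                 # capacity violated
        utils = [worker_vals[w][assign[w]] for w in range(n_w)]
        utils += [sum(firm_vals[f][w] for w in range(n_w) if assign[w] == f)
                  for f in range(n_f)]
        nsw = prod(utils) ** (1.0 / len(utils))      # geometric mean
        if nsw > best:
            best, best_assign = nsw, assign
    return best, best_assign

worker_vals = [[2, 1], [1, 2], [2, 2]]   # 3 workers, 2 firms
firm_vals = [[1, 2, 1], [2, 1, 2]]
print(max_nsw(worker_vals, firm_vals, capacity=2))
```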
ISBN: (print) 1577358872
Combinatorial Optimization (CO) problems over graphs appear routinely in many applications, such as optimizing traffic, viral marketing in social networks, and matching for job allocation. Due to their combinatorial nature, these problems are often NP-hard. Existing approximation algorithms and heuristics search the solution space and become time-consuming when this space is large. In this paper, we design a neural method called COMBHELPER to reduce this space and thus improve the efficiency of traditional CO algorithms based on node selection. Specifically, it employs a Graph Neural Network (GNN) to identify promising nodes for the solution set; this pruned search space is then fed to the traditional CO algorithms. COMBHELPER also uses a Knowledge Distillation (KD) module and a problem-specific boosting module for further efficiency and efficacy. Our extensive experiments show that traditional CO algorithms with COMBHELPER are at least twice as fast as their original versions.
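Schematically, the pipeline looks as follows for a node-selection problem such as greedy dominating set; the trained GNN is replaced by a degree-based stub scorer so the sketch stays self-contained, and all names are ours rather than the paper's code:

```python
import networkx as nx

def stub_gnn_scores(G):
    """Stand-in for the trained GNN scorer: normalized degree."""
    dmax = max(d for _, d in G.degree) or 1
    return {v: G.degree(v) / dmax for v in G}

def greedy_dominating_set(G, candidates):
    """Classic greedy, restricted to the pruned candidate set."""
    uncovered, solution = set(G), []
    while uncovered:
        best = max(candidates,
                   key=lambda v: len(uncovered & ({v} | set(G[v]))))
        if not uncovered & ({best} | set(G[best])):
            best = next(iter(uncovered))  # pruning left no useful candidate
        solution.append(best)
        uncovered -= {best} | set(G[best])
    return solution

G = nx.erdos_renyi_graph(60, 0.08, seed=1)
scores = stub_gnn_scores(G)
kept = sorted(G, key=scores.get, reverse=True)[:len(G) // 2]  # prune half
print(len(greedy_dominating_set(G, kept)), "nodes from", len(kept),
      "candidates")
```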
ISBN: (digital) 9781665496209; (print) 9781665496209
Hyperspectral super-resolution based on coupled Tucker decomposition has recently been considered in the remote sensing community. State-of-the-art approaches do not fully exploit the coupling of information contained in hyperspectral and multispectral images of the same scene. This paper proposes a new algorithm that overcomes this limitation: it accounts for both the high-resolution and the low-resolution information in the model by solving a set of least-squares problems. In addition, we provide exact recovery conditions for the super-resolution image in the noiseless case. Our simulations show that the proposed algorithm achieves very good reconstruction quality with very low computational complexity.
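A compact way to see the coupled least-squares structure is the following toy reconstruction in the style of SVD-based coupled-Tucker methods: spatial factors are estimated from the multispectral image (MSI), the spectral factor from the hyperspectral image (HSI), and the core is fit by least squares through known degradation operators. Dimensions, operators, and ranks are toy assumptions, not the paper's setup; in the noiseless run below the recovery is exact, mirroring the exact-recovery regime mentioned above:

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

rng = np.random.default_rng(0)
I, J, K = 24, 24, 16          # super-resolution image (SRI) dimensions
d, km = 4, 4                  # spatial downsampling factor, MSI band count
R1, R2, R3 = 6, 6, 4          # Tucker ranks

# Ground-truth low-rank SRI and its two degraded observations.
core = rng.standard_normal((R1, R2, R3))
U = [np.linalg.qr(rng.standard_normal((n, r)))[0]
     for n, r in [(I, R1), (J, R2), (K, R3)]]
sri = np.einsum('abc,ia,jb,kc->ijk', core, *U)
P = np.kron(np.eye(I // d), np.ones(d) / d)              # blur + decimate
S = np.kron(np.eye(km), np.ones(K // km) / (K // km))    # spectral response
hsi = np.einsum('ip,jq,pqk->ijk', P, P, sri)             # low spatial res
msi = np.einsum('ks,ijs->ijk', S, sri)                   # few bands

# Spatial factors from the MSI, spectral factor from the HSI.
U1 = np.linalg.svd(unfold(msi, 0))[0][:, :R1]
U2 = np.linalg.svd(unfold(msi, 1))[0][:, :R2]
U3 = np.linalg.svd(unfold(hsi, 2))[0][:, :R3]

# Core by least squares on the HSI, through the degradation operator P.
A = np.kron(np.kron(P @ U1, P @ U2), U3)
g = np.linalg.lstsq(A, hsi.reshape(-1), rcond=None)[0]
est = np.einsum('abc,ia,jb,kc->ijk', g.reshape(R1, R2, R3), U1, U2, U3)
print("relative error:", np.linalg.norm(est - sri) / np.linalg.norm(sri))
```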
ISBN: (digital) 9781665451963; (print) 9781665451963
There exists an approximation-based distributed optimization algorithm that solves univariate nonconvex problems to arbitrary precision. The key idea is to construct approximations of the local objectives and address a more structured approximate version of the problem. By representing diverse local objectives with compressed coefficient vectors, such algorithms enjoy gradient-free iterations but face severe security issues when adversaries are present. In this paper, we propose a resilient approximation-based distributed nonconvex optimization algorithm, termed R-ADOA, to defend against attacks from malicious nodes. First, errors caused by adversaries are quantified and unified as perturbations of the coefficient vectors of the approximations. Next, we propose a filtering mechanism and a resilient stopping mechanism to limit the errors arising in consensus-based iterations. Finally, an upper bound on the deviation of the obtained solutions from the optimal solutions is derived using the eigenvalue perturbation theory of matrices. Numerical experiments illustrate the effectiveness of our algorithm. Compared to existing resilient distributed optimization algorithms, R-ADOA addresses nonconvex problems, converges exponentially fast, and provides explicit bounds on the deviation of its solutions.
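The flavor of the filtering mechanism can be conveyed with a generic coordinate-wise trimmed-mean step: each node discards the f largest and f smallest received values per coordinate before averaging, which bounds the influence of up to f malicious coefficient vectors. This is a standard resilient-consensus filter, not the paper's exact R-ADOA rule:

```python
import numpy as np

def trimmed_mean_step(vectors, f):
    """Coordinate-wise trimmed mean: drop the f largest and f smallest
    entries in each coordinate, then average the rest."""
    V = np.sort(np.asarray(vectors), axis=0)
    return V[f:len(vectors) - f].mean(axis=0)

rng = np.random.default_rng(0)
truth = np.array([1.0, -2.0, 0.5])               # honest coefficient vector
honest = [truth + 0.01 * rng.standard_normal(3) for _ in range(8)]
malicious = [np.array([100.0, 100.0, -100.0])]   # adversarial injection

print("filtered:", trimmed_mean_step(honest + malicious, f=1))  # near truth
print("naive mean:", np.mean(honest + malicious, axis=0))       # skewed
```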
ISBN: (digital) 9781665484947; (print) 9781665484947
The Transformer architecture is one of the most remarkable recent breakthroughs in neural networks, achieving state-of-the-art (SOTA) performance on various natural language processing (NLP) and computer vision tasks. Self-attention is the key enabling operation for transformer-based models. However, its computational complexity, quadratic in the sequence length, makes this operation the major performance bottleneck for these models. We therefore propose a novel self-attention accelerator that skips most of the computation by utilizing an approximate candidate selection algorithm. Implemented in a 40 nm CMOS technology, our 5.64 mm² chip operates at 100-600 MHz consuming 483-685 mW, achieving an energy efficiency of 0.354-5.61 TOPS/W and an area efficiency of 239 GOPS/mm².
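In software terms, approximate candidate selection for attention can be sketched as a cheap low-precision screening pass that picks the top-k candidate keys per query, followed by exact attention over the survivors only. The 4-bit quantization and the value of k below are illustrative choices, not the chip's actual design:

```python
import numpy as np

def quantize(x, bits=4):
    """Uniform symmetric quantization, standing in for low-cost hardware."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def approx_attention(Q, K, V, k=8):
    coarse = quantize(Q) @ quantize(K).T          # cheap screening pass
    cand = np.argsort(coarse, axis=1)[:, -k:]     # top-k keys per query
    out = np.zeros((Q.shape[0], V.shape[1]))
    for i, idx in enumerate(cand):                # exact math on survivors
        s = Q[i] @ K[idx].T / np.sqrt(Q.shape[1])
        w = np.exp(s - s.max())
        out[i] = (w / w.sum()) @ V[idx]           # softmax over candidates
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((64, 32)) for _ in range(3))
print(approx_attention(Q, K, V).shape)            # (64, 32)
```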
Artificial intelligence (AI) processors offer significant progress for edge computing devices in the Industrial Internet of Things (IIoT) domain, promoting a revolution in computational intelligence and efficiency. A hardware-efficient leaky rectified linear unit (ReLU) activation function with polynomial approximation and a shifter-based implementation is proposed to facilitate the deployment of AI processors in edge devices. The constant parameters of the leaky ReLU are approximated using second-order or higher-order polynomials and transformed into arithmetic operations that can be realized by a combination of shift, addition, or subtraction operations. The approximate circuit modules of the leaky ReLU are then built from shifters, adders, and subtractors instead of multipliers and dividers. The experimental results show that cubic or higher-order polynomial approximation of the leaky ReLU has a negligible impact on the accuracy of cutting-edge deep neural networks. Moreover, approximate leaky ReLU modules with low power consumption, low latency, and a small circuit footprint are demonstrated by substituting a multiplier or divider with a combination of shifters and adders or subtractors. The proposed techniques have the potential to produce energy-efficient, low-latency, and compact circuit solutions for AI processors in IIoT.
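As a concrete instance of the shift-and-add idea, the negative-branch slope 0.1 of a leaky ReLU can be replaced by the power-of-two sum 2^-4 + 2^-5 + 2^-8 + 2^-9 = 0.099609375, so the circuit needs only shifts and additions on fixed-point inputs. The particular shift set is our illustration, not the paper's synthesized module:

```python
def leaky_relu_shift(x):
    """x: integer (fixed-point) input; negative slope ~0.1 via shifts."""
    if x >= 0:
        return x
    mag = -x
    # 0.1 ~ 2^-4 + 2^-5 + 2^-8 + 2^-9 = 0.099609375 (shift-add only)
    return -((mag >> 4) + (mag >> 5) + (mag >> 8) + (mag >> 9))

for x in [-1024, -300, 77]:
    exact = x if x >= 0 else 0.1 * x
    print(x, leaky_relu_shift(x), exact)
# -1024 -> -102 (exact -102.4); -300 -> -28 (exact -30.0); 77 -> 77
```

Truncation in the right shifts adds a small extra error on top of the slope approximation itself, which is one reason accuracy impact is evaluated at the network level rather than per activation.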