Deep neural networks (DNNs) have been widely used for learning various wireless communication policies. While DNNs have demonstrated the ability to reduce the time complexity of inference, their training often incurs a high computational cost. Since practical wireless systems require retraining due to operating in open and dynamic environments, it is crucial to analyze the factors affecting the training complexity, which can guide the DNN architecture selection and the hyper-parameter tuning for efficient policy learning. As a metric of time complexity, the number of floating-point operations (FLOPs) for inference has been analyzed in the literature. However, the time complexity of training DNNs for learning wireless communication policies has only been evaluated in terms of runtime. In this paper, we introduce the number of serial FLOPs (se-FLOPs) as a new metric of time complexity, accounting for the ability of parallel computing. The se-FLOPs metric is consistent with actual runtime, making it suitable for measuring the time complexity of training DNNs. Since graph neural networks (GNNs) can learn a multitude of wireless communication policies efficiently and their architectures depend on specific policies, no universal GNN architecture is available for analyzing complexities across different policies. Thus, we first use precoder learning as an example to demonstrate the derivation of the numbers of se-FLOPs required to train several DNNs. Then, we compare the results with the se-FLOPs for inference of the DNNs and for executing a popular numerical algorithm, and provide the scaling laws of these complexities with respect to the numbers of antennas and users. Finally, we extend the analyses to the learning of general wireless communication policies. We use simulations to validate the analyses and compare the time complexity of each DNN trained for achieving the best learning performance and achieving an expected performance.
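One way to see what a serial-FLOPs-style metric captures, as a toy sketch: for a dense layer, total FLOPs count every multiply-add, while under unlimited parallel computing only the critical path matters. The reduction-tree reading below is an illustrative assumption, not the paper's definition of se-FLOPs:

```python
import math

def layer_flops(n_in, n_out):
    # total multiply-add operations of a dense layer
    return 2 * n_in * n_out

def layer_se_flops(n_in):
    # Serial FLOPs under ideal parallelism: all output neurons run
    # concurrently, and each n_in-term dot product reduces in a
    # binary tree, so the critical path is one multiply step plus
    # ceil(log2 n_in) addition steps.
    return 1 + math.ceil(math.log2(n_in))

widths = [256, 512, 512, 64]
flops = sum(layer_flops(a, b) for a, b in zip(widths, widths[1:]))
se = sum(layer_se_flops(a) for a, _ in zip(widths, widths[1:]))
print(flops, se)  # 851968 29
```

The gap between the two numbers (roughly five orders of magnitude here) is why a parallelism-aware count can track runtime far better than raw FLOPs.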
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Automatic speech recognition (ASR) with an encoder equipped with self-attention, whether streaming or non-streaming, takes quadratic time in the length of the speech utterance. This slows down training and decoding, increases the cost, and limits the deployment of ASR on constrained devices. SummaryMixing is a promising linear-time-complexity alternative to self-attention for non-streaming speech recognition that, for the first time, preserves or outperforms the accuracy of self-attention models. Unfortunately, the original definition of SummaryMixing is not suited to streaming speech recognition. Hence, this work extends SummaryMixing to a Conformer Transducer that works in both a streaming and an offline mode. It shows that this new linear-time-complexity speech encoder outperforms self-attention in both scenarios while requiring less compute and memory during training and decoding.
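The linear-time idea can be sketched as follows: each frame gets a local transform, and a single mean-pooled summary vector, shared by all frames, stands in for pairwise attention, so cost grows linearly in T instead of quadratically. The shapes, activations, and concatenation layout here are illustrative assumptions, not the exact SummaryMixing architecture:

```python
import numpy as np

def summary_mixing(x, Wl, Ws, Wc):
    """One SummaryMixing-style layer (illustrative shapes).

    x: (T, d) frames; Wl, Ws: (d, d); Wc: (2d, d).
    Cost is O(T * d^2): no T x T attention matrix is formed.
    """
    local = np.tanh(x @ Wl)             # per-frame features, (T, d)
    summary = np.tanh(x @ Ws).mean(0)   # one global summary vector, (d,)
    tiled = np.broadcast_to(summary, local.shape)
    return np.concatenate([local, tiled], axis=1) @ Wc  # (T, d)

rng = np.random.default_rng(0)
T, d = 50, 8
out = summary_mixing(rng.normal(size=(T, d)),
                     0.1 * rng.normal(size=(d, d)),
                     0.1 * rng.normal(size=(d, d)),
                     0.1 * rng.normal(size=(2 * d, d)))
print(out.shape)  # (50, 8)
```

A streaming variant would replace the global mean with a causal running mean over past frames only, which keeps the per-frame cost constant.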
ISBN (digital): 9798331527549
ISBN (print): 9798331527556
In the field of computer science, sorting algorithms are crucial because they facilitate the effective processing and arrangement of data in a variety of scenarios, including data analysis, searching, and optimal system operation. The objective of this study is to examine and compare different sorting algorithms to assess their efficiency and usefulness in different situations. The study tests popular sorting algorithms such as Quick Sort, Merge Sort, and Bubble Sort, implemented in the C++ programming language, on different input sizes and types of data (sorted, reversed, and random). Through a systematic analysis, the study characterizes their behaviour in terms of time complexity, resource use, and real applications. The most important results show that the algorithms perform differently depending on the input: for example, Quick Sort does better in most situations, while Merge Sort stays stable in the worst cases. This work also identifies situations where simpler algorithms, such as Bubble Sort, are suitable for small data sets. This paper is distinctive in that it provides real-life cases and recommendations for choosing the best algorithm based on specific needs. These results aim to help programmers, teachers, and practitioners pick the right sorting algorithms for their computing tasks.
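A benchmark of the kind the study describes can be sketched in a few lines; Python is used here instead of the paper's C++ purely for brevity, and the input size and pivot choice are arbitrary assumptions:

```python
import random
import time

def bubble_sort(a):
    a = a[:]
    for i in range(len(a)):
        swapped = False
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:          # early exit: sorted input costs O(n)
            break
    return a

def merge_sort(a):
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def quick_sort(a):
    if len(a) <= 1:
        return a[:]
    pivot = a[len(a) // 2]       # middle pivot avoids the sorted-input worst case
    return (quick_sort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quick_sort([x for x in a if x > pivot]))

n = 1000
cases = {"sorted": list(range(n)),
         "reversed": list(range(n, 0, -1)),
         "random": random.sample(range(n), n)}
for name, data in cases.items():
    for sorter in (quick_sort, merge_sort, bubble_sort):
        t0 = time.perf_counter()
        assert sorter(data) == sorted(data)
        print(f"{name:8s} {sorter.__name__:11s} {time.perf_counter() - t0:.4f}s")
```

On the sorted case Bubble Sort's early exit makes it the fastest of the three, while on the reversed and random cases its quadratic cost dominates, matching the study's point that the right choice depends on the input.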
Most work on the time complexity analysis of evolutionary algorithms has focused on artificial binary problems, and the time complexity of these algorithms for combinatorial optimisation is not well understood. This paper considers the time complexity of an evolutionary algorithm for a classical combinatorial optimisation problem: finding a maximum-cardinality matching in a graph. It is shown that the evolutionary algorithm can produce a matching with nearly maximum cardinality in average polynomial time.
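As a hedged illustration of the setting (not necessarily the exact algorithm the paper analyzes), a (1+1) evolutionary algorithm over edge subsets that rejects infeasible offspring can be sketched as:

```python
import random

def is_matching(edges, selected):
    """True if the selected edges share no endpoint."""
    used = set()
    for (u, v), sel in zip(edges, selected):
        if sel:
            if u in used or v in used:
                return False
            used.update((u, v))
    return True

def fitness(edges, s):
    # infeasible edge sets score below the empty matching
    return sum(s) if is_matching(edges, s) else -1

def one_plus_one_ea(edges, steps=20000, seed=1):
    """(1+1) EA: flip each bit with probability 1/m and keep the
    child if it is no worse than the parent."""
    rng = random.Random(seed)
    m = len(edges)
    cur = [0] * m
    for _ in range(steps):
        child = [b ^ (rng.random() < 1 / m) for b in cur]
        if fitness(edges, child) >= fitness(edges, cur):
            cur = child
    return cur

# path 0-1-2-3-4-5: the maximum matching has 3 edges
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
best = one_plus_one_ea(edges)
print(sum(best), is_matching(edges, best))
```

Because infeasible children are always rejected, the current solution stays a matching throughout; escaping a locally maximal matching requires a multi-bit flip, which is where the average-polynomial-time analysis does its work.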
We analyze the time complexity of iterative-deepening-A* (IDA*). We first show how to calculate the exact number of nodes at a given depth of a regular search tree, and the asymptotic brute-force branching factor. We then use this result to analyze IDA* with a consistent, admissible heuristic function. Previous analyses relied on an abstract analytic model and characterized the heuristic function in terms of its accuracy, but did not apply to concrete problems. In contrast, our analysis allows us to accurately predict the performance of IDA* on actual problems such as the sliding-tile puzzles and Rubik's Cube. The heuristic function is characterized by the distribution of heuristic values over the problem space. Contrary to conventional wisdom, our analysis shows that the asymptotic heuristic branching factor is the same as the brute-force branching factor. Thus, the effect of a heuristic function is to reduce the effective depth of search by a constant, relative to a brute-force search, rather than to reduce the effective branching factor. (C) 2001 Elsevier Science B.V. All rights reserved.
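The node-counting idea can be illustrated on the Eight Puzzle by classifying states by blank position (corner, edge, center): exact level sizes come from powering the type-transition matrix, and the asymptotic brute-force branching factor is its dominant eigenvalue. This toy model skips the pruning of the inverse of the last move used in the paper's exact analysis, so its branching factor (2√2) is larger than the pruned value:

```python
import numpy as np

# Eight Puzzle blank-position types: 0 = corner, 1 = edge, 2 = center.
# M[i, j] = number of type-i children generated by a type-j node
# (no parent pruning, unlike the paper's exact analysis).
M = np.array([[0, 2, 0],
              [2, 0, 4],
              [0, 1, 0]])

def nodes_at_depth(root_type, depth):
    """Exact node count at the given depth of the regular search tree."""
    v = np.zeros(3)
    v[root_type] = 1
    for _ in range(depth):
        v = M @ v
    return int(v.sum())

print([nodes_at_depth(2, d) for d in range(6)])  # [1, 4, 12, 32, 96, 256]
bf = max(abs(np.linalg.eigvals(M)))   # asymptotic brute-force branching factor
print(round(float(bf), 3))            # 2*sqrt(2) ~ 2.828
```

The level-by-level ratios oscillate (3, 8/3, 3, ...) while their geometric mean converges to the dominant eigenvalue, which is exactly why an eigenvalue, not a simple average of move counts, gives the asymptotic branching factor.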
In this paper we deal with the time complexity of single- and identical parallel-machine scheduling problems in which the durations and precedence constraints of the activities are stochastic. The stochastic precedence constraints are given by GERT networks. First, we sketch the basic concepts of GERT networks and machine scheduling with GERT network precedence constraints. Second, we discuss the time complexity of some open single-machine scheduling problems with GERT network precedence constraints. Third, we investigate the time complexity of identical parallel-machine scheduling problems with GERT network precedence constraints. Finally, we present an efficient reduction algorithm for the problem of computing the expected makespan for the latter type of scheduling problem.
The problem of fast identification of continuous-time systems is formulated in the metric complexity theory setting. It is shown that the two key steps to achieving fast identification, i.e., optimal input design and optimal model selection, can be carried out independently when the true system belongs to a general a priori set. These two optimization problems can be reduced to standard Gel'fand and Kolmogorov n-width problems in metric complexity theory. It is shown that although arbitrarily accurate identification can be achieved on a small time interval by reducing the noise-to-signal ratio and designing the input carefully, identification speed is limited by the metric complexity of the a priori uncertainty set when the noise-to-signal ratio is fixed.
In-memory computing (IMC) with cross-point resistive memory arrays has been shown to accelerate data-centric computations, such as the training and inference of deep neural networks, due to the high parallelism endowed by physical rules in the electrical circuits. By connecting cross-point arrays with negative feedback amplifiers, it is possible to solve linear algebraic problems, such as linear systems and matrix eigenvectors, in just one step. Based on the theory of feedback circuits, we study the dynamics of the solution of linear systems within a memory array, showing that the time complexity of the solution is free of any direct dependence on the problem size N; rather, it is governed by the minimal eigenvalue of an associated matrix of the coefficient matrix. We show that when the linear system is modeled by a covariance matrix, the time complexity is O(log N) or O(1). In the case of sparse positive-definite linear systems, the time complexity is solely determined by the minimal eigenvalue of the coefficient matrix. These results demonstrate the high speed of the circuit for solving linear systems in a wide range of applications, thus supporting IMC as a strong candidate for future big data and machine learning accelerators.
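A software analogue of the circuit's transient (not a circuit model) makes the eigenvalue dependence concrete: integrating dx/dt = b - Ax for a symmetric positive-definite A, the slowest mode decays as exp(-λ_min t), so the settling time tracks λ_min rather than the problem size N. The spectra and tolerances below are illustrative assumptions:

```python
import numpy as np

def settle_time(A, b, tol=1e-3, dt=1e-3):
    """Integrate dx/dt = b - A x (forward Euler) and return the time
    at which x is within relative tolerance tol of A^{-1} b."""
    x_true = np.linalg.solve(A, b)
    x = np.zeros_like(b)
    t = 0.0
    while np.linalg.norm(x - x_true) > tol * np.linalg.norm(x_true):
        x = x + dt * (b - A @ x)
        t += dt
    return t

rng = np.random.default_rng(0)
times = []
for n in (4, 8, 16):
    Q = np.linalg.qr(rng.normal(size=(n, n)))[0]
    A = Q @ np.diag(np.linspace(1.0, 3.0, n)) @ Q.T  # SPD, lambda_min = 1
    times.append(settle_time(A, rng.normal(size=n)))
    print(n, round(times[-1], 2))
```

With λ_min pinned at 1 for every size, the settle times stay in the same narrow band as n grows, the software counterpart of the abstract's claim that solution time has no direct dependence on N.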
Matrix-vector multiplication (MVM) is the core operation of many important algorithms. A crosspoint resistive memory array naturally calculates an MVM in one operation, thus representing a highly promising computing accelerator for various applications. To evaluate the computing performance as well as the scalability of in-memory MVM, the fundamental issue of the time complexity of the circuit must be addressed. Based on the most common MVM circuit, which uses transimpedance amplifiers to read out the current product in the crosspoint array, we analyze its dynamic response and the corresponding time complexity. The result shows that the computing time is governed by the maximal row sum of the implemented matrix, which leads to an explicit time complexity for a specific dataset, e.g., O(N^(1/2)) and O(ln N) for the discrete cosine transformation and a Toeplitz matrix, respectively. By adjusting the feedback conductance of the transimpedance amplifier accordingly for different matrix sizes, it is possible to reduce the time complexity to O(1). The impact of non-ideal factors of the circuit on computing time is also studied. This work provides insight into the performance, and the improvement, of MVM computation for efficient in-memory computing accelerators.
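The claimed row-sum scalings can be checked numerically: the maximal absolute row sum of an orthonormal DCT matrix grows like √N, while that of a Toeplitz matrix with slowly decaying diagonals grows like ln N. The specific Toeplitz instance below is an assumed example; the paper's may differ:

```python
import numpy as np

def dct_matrix(n):
    # orthonormal DCT-II matrix
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * np.outer(k, 2 * k + 1) / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def toeplitz_decay(n):
    # Toeplitz matrix with slowly decaying diagonals (assumed example)
    i, j = np.indices((n, n))
    return 1.0 / (1.0 + np.abs(i - j))

ratios = []
for n in (64, 256, 1024):
    r_dct = np.abs(dct_matrix(n)).sum(axis=1).max()
    r_toe = toeplitz_decay(n).sum(axis=1).max()
    ratios.append((r_dct / np.sqrt(n), r_toe / np.log(n)))
    print(n, round(ratios[-1][0], 3), round(ratios[-1][1], 3))
```

Both normalized ratios stay roughly constant as N grows, consistent with computing time scalings of O(N^(1/2)) and O(ln N) when time is proportional to the maximal row sum.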