In this paper, we give three new parallel sorting algorithms on a mesh-connected computer with wraparound connections (i.e., a torus). These three algorithms, with the minimum queue size of 1, sort n^2 random input data items into a blocked snakelike row major order, a row major order, and a snakelike row major order, in 1.5n + o(n), 2n + o(n), and 2n + o(n) average steps, respectively. These results improve the previous results of 2n + o(n), 2.5n + o(n), and 2.5n + o(n), respectively. In addition, we prove that the distance bound n on a torus is an average-time lower bound for sorting random input data items on it, independent of the indexing scheme.
Effective visuals are essential for solving computational issues and comprehending complicated algorithms. Through the use of a web-based sorting Algorithm Visualizer and associated components, this study presents a f...
This paper presents a new sorting algorithm that sorts input data elements without any comparison operations between the data (comparison-free sorting). Our algorithm's time complexity is on the order of O(N) for both single- and multi-threaded CPU and many-core GPU implementations. Our results show average speedups of 4.6x, 4x, and 3.5x for single-threaded CPU, 8-threaded CPU, and many-threaded GPU implementations, respectively, for input sizes ranging from 2^7 to 2^30 elements, as compared to common sorting algorithms across a wide variation of element distributions, ranging from all unique elements to a single repeated element. In addition, our proposed algorithm utilizes the GPU architecture more efficiently than a multi-core CPU architecture, showing a speedup of approximately 4x for input sizes ranging from 2^7 to 2^30 elements.
Sorting is the task of ordering n elements using pairwise comparisons. It is well known that m = Θ(n log n) comparisons are both necessary and sufficient when the outcomes of the comparisons are observed with no noise. In this paper, we study the sorting problem when each comparison is incorrect with some fixed yet unknown probability p. Unlike the common approach in the literature, which aims to minimize the number of pairwise comparisons m needed to achieve a given desired error probability, we consider randomized algorithms with expected number of queries E[M] and aim at characterizing the maximal sorting rate n log n / E[M] such that the ordering of the elements can be estimated with an asymptotically vanishing error probability. The maximal rate is referred to as the noisy sorting capacity. In this work, we derive upper and lower bounds on the noisy sorting capacity. The two lower bounds, one for fixed-length algorithms and one for variable-length algorithms, are established by combining the insertion sort algorithm with the well-known Burnashev-Zigangirov algorithm for channel coding with feedback. Compared with existing methods, the proposed algorithms are universal in the sense that they do not require knowledge of p, while maintaining a strictly positive sorting rate. Moreover, we derive a general upper bound on the noisy sorting capacity, along with an upper bound on the maximal rate that can be achieved by sorting algorithms that are based on insertion sort.
We present a new sorting algorithm, called adaptive ShiversSort, that exploits the existence of monotonic runs to sort partially sorted data efficiently. This algorithm is a variant of the well-known algorithm TimSort, which is the sorting algorithm used in standard libraries of programming languages, such as Python or Java (for non-primitive types). More precisely, adaptive ShiversSort is a so-called k-aware merge-sort algorithm, a class that captures 'TimSort-like' algorithms and that was introduced by Buss and Knop. In this article, we prove that, although adaptive ShiversSort is simple to implement and differs only slightly from TimSort, its computational cost, in number of comparisons performed, is optimal within the class of natural merge-sort algorithms, up to a small additive linear term. This makes adaptive ShiversSort the first k-aware algorithm to benefit from this property, which is also a 33% improvement over TimSort's worst-case. This suggests that adaptive ShiversSort could be a strong contender for being used instead of TimSort. Then, we investigate the optimality of k-aware algorithms. We give lower and upper bounds on the best approximation factors of such algorithms, compared to optimal stable natural merge-sort algorithms. In particular, we design generalisations of adaptive ShiversSort whose computational costs are optimal up to arbitrarily small multiplicative factors. CCS Concepts: • Theory of computation → Sorting and searching
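The "monotonic runs" that natural merge sorts exploit can be illustrated with a small sketch. This is not adaptive ShiversSort or TimSort itself, only the run-decomposition step such algorithms start from, written under the assumption that runs are either non-decreasing or strictly decreasing (the latter reversed in place of being merged). The function name `natural_runs` is hypothetical.

```python
def natural_runs(data):
    """Split a list into maximal monotonic runs, each returned in sorted order.

    Assumption: a run is either non-decreasing or strictly decreasing.
    Strictly decreasing runs are reversed; strictness keeps equal
    elements in their original relative order.
    """
    runs = []
    i, n = 0, len(data)
    while i < n:
        j = i + 1
        if j < n and data[j] < data[i]:
            # Extend a strictly decreasing run, then reverse it.
            while j < n and data[j] < data[j - 1]:
                j += 1
            runs.append(data[i:j][::-1])
        else:
            # Extend a non-decreasing run.
            while j < n and data[j] >= data[j - 1]:
                j += 1
            runs.append(data[i:j])
        i = j
    return runs
```

The fewer runs the input contains, the less merging remains; the merge policy applied to these runs is where TimSort-like algorithms differ from one another.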
This paper presents a new parallel structured lookahead multidimensional sorting algorithm. Our algorithm can be based on any sequential sorting algorithm. The amount of parallelism can be controlled using several parameters such as the number of threads, word size, memory/processor communication overhead, and the dimension of the algorithm. The proposed technique is ideally suited for general-purpose graphics processing units and shared-memory massively parallel processor systems. It ensures that the data being processed exhibits temporal and spatial locality to maximize the utilization of processor cache. The algorithm achieves a speedup even when a single processor is used. A lookahead algorithm is also proposed to achieve even higher speedup. The performance of the proposed algorithm is verified numerically and experimentally.
A king in a tournament is a player who beats any other player directly or indirectly. Based on the existence of a king in every tournament, Wu and Sheng [Inform. Process. Lett. 79 (2001) 297-299] recently presented an algorithm for finding a sorted sequence of kings in a tournament of size n, i.e., a sequence of players u_1, u_2, ..., u_n such that u_i → u_{i+1} (u_i beats u_{i+1}) and u_i is a king in the sub-tournament induced by {u_i, u_{i+1}, ..., u_n} for each i = 1, 2, ..., n - 1. For each pair u, v of players in a tournament, let b(u, v) denote the number of third players used for u to beat v indirectly. Then, a king u is called a strong king if the following condition is fulfilled: if v → u then b(u, v) > b(v, u). In the sequel, we will show that the algorithm proposed by Wu and Sheng indeed generates a sorted sequence of strong kings, which is more restricted than the previous one. (C) 2003 Elsevier B.V. All rights reserved.
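The definitions of king and strong king can be made concrete with a short sketch. It only checks the definitions and is not the Wu and Sheng algorithm. It assumes that "indirectly" means through exactly one third player, and it represents a tournament as a hypothetical dictionary `beats` mapping each player to the set of players it beats.

```python
def b(beats, u, v):
    """Number of third players w with u -> w -> v (assumed meaning of b(u, v))."""
    return sum(
        1 for w in beats
        if w != u and w != v and w in beats[u] and v in beats[w]
    )


def is_king(beats, u):
    """u is a king if it beats every other player directly or via one third player."""
    return all(
        v in beats[u] or b(beats, u, v) > 0
        for v in beats if v != u
    )


def is_strong_king(beats, u):
    """A king u is strong if b(u, v) > b(v, u) for every v that beats u."""
    if not is_king(beats, u):
        return False
    return all(
        b(beats, u, v) > b(beats, v, u)
        for v in beats if u in beats[v]
    )


# A 3-cycle: a beats b, b beats c, c beats a.
cycle = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
# A transitive tournament: a beats b and c, b beats c.
transitive = {"a": {"b", "c"}, "b": {"c"}, "c": set()}
```

In the 3-cycle every player is a strong king, while in the transitive tournament only the top player `a` is a king.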
In this paper, we propose a novel sorting algorithm that sorts input integer elements on-the-fly without any comparison operations between the data (comparison-free sorting). We present a complete hardware structure, associated timing diagrams, and a formal mathematical proof, which show an overall sorting time, in terms of clock cycles, that is linearly proportional to the number of inputs, giving a speed complexity on the order of O(N). Our hardware-based sorting algorithm precludes the need for SRAM-based memory or complex circuitry, such as pipelining structures, but rather uses simple registers to hold the binary elements and the elements' associated number of occurrences in the input set, and uses matrix-mapping operations to perform the sorting process. Thus, the total transistor count complexity is on the order of O(N). We evaluate an application-specific integrated circuit design of our sorting algorithm for a sample sorting of N = 1024 elements of size K = 10-bit using 90-nm Taiwan Semiconductor Manufacturing Company (TSMC) technology with a 1 V power supply. Results verify that our sorting requires approximately 4-6 μs to sort the 1024 elements with a clock frequency of 0.5 GHz, consumes 1.6 mW of power, and has a total transistor count of less than 750 000.
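As a software analogy for sorting without comparing elements to each other, the classic counting sort tracks each value's number of occurrences and then emits values in index order. This is a standard textbook technique swapped in for illustration, not the hardware design or matrix-mapping scheme described above; the function name and the `k_bits` parameter are assumptions.

```python
def counting_sort(values, k_bits):
    """Sort non-negative integers below 2**k_bits without element-to-element comparisons.

    Assumption: every value fits in k_bits bits. Each value is used as an
    index into a table of occurrence counts.
    """
    size = 1 << k_bits
    counts = [0] * size
    for v in values:
        if not 0 <= v < size:
            raise ValueError(f"value {v} does not fit in {k_bits} bits")
        counts[v] += 1  # record one more occurrence of v

    result = []
    for value, occurrences in enumerate(counts):
        # Emit each value as many times as it occurred, in index order.
        result.extend([value] * occurrences)
    return result
```

The running time is linear in the number of inputs plus the size of the value range, which is why this family of methods suits fixed-width integers such as 10-bit elements.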
The quest for efficient sorting is ongoing, and we explore a graph-based stable sorting strategy, in particular one employing comparison graphs. We use the topological sort to map the comparison graph to a linear domain, and we can manipulate our graph such that the resulting topological sort is the sorted array. By taking advantage of the many relations between Hamiltonian paths and topological sorts in comparison graphs, we design a divide-and-conquer algorithm that runs in the optimal O(n log n) time. In the process, we construct a new merge process for graphs with relevant invariant properties for our use. Furthermore, this method is more space-efficient than the famous MERGESORT since we modify our fixed graph only. (C) 2020 Elsevier B.V. All rights reserved.
In this paper we present a novel parallel sorting algorithm, which works through a cascade of elementary sorting units and leads to a scalable architecture. The algorithm's complexity is analyzed and compared with that of a classical parallel algorithm. It turns out that, although it may be less efficient than classical approaches, the proposed algorithm is highly suited for VLSI implementation because of its simplicity and scalability. The paper describes the application of such a device to asynchronous data acquisition for a gamma-ray telescope.