This paper offers an exercise for revisiting the main sorting algorithms after they have been taught to students. This is done in a way that emphasizes the relationships between them, and shows how considering abstrac...
详细信息
ISBN:
(纸本)9781450305006
This paper offers an exercise for revisiting the main sorting algorithms after they have been taught to students. This is done in a way that emphasizes the relationships between them, and shows how considering abstraction and extreme cases can lead to the generation of new algorithms. A number of authors (including textbook authors) have noted particular relationships between algorithms, such as an uneven split in merge sort being equivalent to insertion sort. In this paper we use a flexible priority queue, the d-heap, to derive three common sorting algorithms. We combine this with using a BST as a priority queue, plus prior observations in the literature, to show strong relationships between the main sorting algorithms that appear in textbooks. In the process students are able to revisit a number of algorithms and data structures and explore elegant relationships between them. This approach can also lead to exercises and exam questions that go beyond desk-checking to evaluate students' understanding of these algorithms.
In this paper, we investigate neutron-induced errors in three implementations of sort algorithms (QuickSort, Merge-Sort, and RadixSort) executed on modern graphics processing units designed for high-performance comput...
详细信息
In this paper, we investigate neutron-induced errors in three implementations of sort algorithms (QuickSort, Merge-Sort, and RadixSort) executed on modern graphics processing units designed for high-performance computing and large server applications. We measure the radiation-induced error rate of sort algorithms taking advantage of the neutron beam available at the Los Alamos Neutron Science Center facility. We also analyze output error criticality by identifying specific output error patterns. We found that radiation can cause wrong elements to appear in the sorted array, misalign values as well as application crashes or system hangs. This paper presents results showing that the criticality of the radiation-induced output error pattern depends on the application. Additionally, an extensive fault-injection campaign has been performed. This campaign allows for better understanding of the observed phenomena. We take advantage of SASS-assembly Intrumentator Fault Injector developed by NVIDIA, which can inject faults into all the useraccessible architectural state. Comparing fault-injection results with radiation experiments data provides an understanding that not all the output errors observed under radiation can be replicated in fault injection. However, fault injection is useful in identifying possible root causes of the output errors observed in radiation testing. Finally, we take advantage of our experimental and analytical study to design efficient experimentally tuned hardening strategies. We detect the error patterns that are critical to the final application and find the more efficient way to detect them. With an overhead as low as 16% of the execution time, we are able to reduce the output error rate of sort of about one order of magnitude.
Mobile devices performance and uptime heavily depend on energy consumed at the hardware and software level. Hence implementation of efficient algorithms has become a crucial aspect for increasing the performance of su...
详细信息
Mobile devices performance and uptime heavily depend on energy consumed at the hardware and software level. Hence implementation of efficient algorithms has become a crucial aspect for increasing the performance of such systems and battery life for mobile devices. sorting algorithms are implicitly the building block of many program implementation. Over the past years, researchers have spent more time optimizing hardware components to reduce their energy consumption. However, it has not been so clear which sorting algorithm is more energy efficient. In this study, we conduct a meta-analytical comparison of the energy consumed by the two most common sorting algorithms namely quick sort and merge sort. Our study mainly focused on energy consumption for mobile devices and embedded systems. For our meta-analysis and literature review, we took into consideration studies published not more than 20 years ago. The meta-analytical results show that there is no significant difference between both algorithms in terms of energy efficiency. (c) 2021 Elsevier Inc. All rights reserved.
Seven internal methods for sorting a set ( a 1 , a 2 ,…, a n ) of real numbers into non-descending order are compared with regard to their performance on the vector computers CRAY-1S, CRAY-1M, CRAY X-MP, AMDAHL 1100,...
详细信息
Seven internal methods for sorting a set ( a 1 , a 2 ,…, a n ) of real numbers into non-descending order are compared with regard to their performance on the vector computers CRAY-1S, CRAY-1M, CRAY X-MP, AMDAHL 1100, AMDAHL 1200 and the AMDAHL 470/V7. The algorithms considered are: Bubble sort, odd-even transposition sort, Batcher's parallel merge-exchange sort, heapsort, quicksort, vector quicksort and diamond sort. Moreover, certain variants of some of these algorithms are also considered. The suitability of the algorithms with respect to vector machine implementation is discussed and the FORTRAN Cray codes for Batcher's parallel merge-exchange sort as well as Diamond sort are given.
In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort a...
详细信息
In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm, with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. By replacing it with faster sorting networks, we can improve the performance of these algorithms. We show that by considering the number of comparisons and swaps alone we are not able to predict any real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. Furthermore, a close control of how often registers have to be spilled to memory gives us a complete explanation of the performance of different sorting networks, allowing us to choose an optimal one for each particular architecture. Our experimental results show that using code synthesized from these efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
We present three parallel sorting algorithms suitable for implementation on tightly coupled multiprocessors and compare their performance on the Denelcor HEP. Two of the algorithms implemented—parallel Shellsort and ...
详细信息
We present three parallel sorting algorithms suitable for implementation on tightly coupled multiprocessors and compare their performance on the Denelcor HEP. Two of the algorithms implemented—parallel Shellsort and quickmerge—are new. Shellsort is amenable to parallelization; however, since Shellsort has higher complexity than quicksort, parallel Shellsort is inferior to parallel quicksort. A second new parallel algorithm, called quickmerge, is based upon both quicksort and mergesort. Our implementation of quickmerge achieves significantly higher speedup than occur implementation of parallel quicksort.
We define a family of maps on lattice paths, called sweep maps, that assign levels to each step in the path and sort steps according to their level. Surprisingly, although sweep maps act by sorting, they appear to be ...
详细信息
We define a family of maps on lattice paths, called sweep maps, that assign levels to each step in the path and sort steps according to their level. Surprisingly, although sweep maps act by sorting, they appear to be bijective in general. The sweep maps give concise combinatorial formulas for the q, t-Catalan numbers, the higher q, t-Catalan numbers, the q, t-square numbers, and many more general polynomials connected to the nabla operator and rational Catalan combinatorics. We prove that many algorithms that have appeared in the literature (including maps studied by Andrews, Egge, Gorsky, Haglund, Hanusa, Jones, Killpatrick, Krattenthaler, Kremer, Orsina, Mazin, Papi, Vaille, and the present authors) are all special cases of the sweep maps or their inverses. The sweep maps provide a very simple unifying framework for understanding all of these algorithms. We explain how inversion of the sweep map (which is an open problem in general) can be solved in known special cases by finding a "bounce path" for the lattice paths under consideration. We also define a generalized sweep map acting on words over arbitrary alphabets with arbitrary weights, which is also conjectured to be bijective. (C) 2015 Elsevier Inc. All rights reserved.
Background and purpose: To investigate the impact of Toshiba phase- and amplitude-sorting algorithms on the margin strategies for free-breathing lung radiotherapy treatments in the presence of breathing variations. Ma...
详细信息
Background and purpose: To investigate the impact of Toshiba phase- and amplitude-sorting algorithms on the margin strategies for free-breathing lung radiotherapy treatments in the presence of breathing variations. Material and methods: 4D CT of a sphere inside a dynamic thorax phantom was acquired. The 4D CT was reconstructed according to the phase- and amplitude-sorting algorithms. The phantom was moved by reproducing amplitude, frequency, and a mix of amplitude and frequency variations. Artefact analysis was performed for Mid-Ventilation and ITV-based strategies on the images reconstructed by phase and amplitude-sorting algorithms. The target volume deviation was assessed by comparing the target volume acquired during irregular motion to the volume acquired during regular motion. Results: The amplitude-sorting algorithm shows reduced artefacts for only amplitude variations while the phase-sorting algorithm for only frequency variations. For amplitude and frequency variations, both algorithms perform similarly. Most of the artefacts are blurring and incomplete structures. We found larger artefacts and volume differences for the Mid-Ventilation with respect to the ITV strategy, resulting in a higher relative difference of the surface distortion value which ranges between maximum 14.6% and minimum 4.1%. Conclusions: The amplitude- is superior to the phase-sorting algorithm in the reduction of motion artefacts for amplitude variations while phase-sorting for frequency variations. A proper choice of 4D CT sorting algorithm is important in order to reduce motion artefacts, especially if Mid-Ventilation strategy is used. (C) 2016 Elsevier Ireland Ltd. All rights reserved.
The timing results on the performance of some sorting algorithms on vector computers presented recently by Rönsch and Strauss, and by Carnevali are supplemented with data for the new ETA 10-P supercomputer. The D...
详细信息
The timing results on the performance of some sorting algorithms on vector computers presented recently by Rönsch and Strauss, and by Carnevali are supplemented with data for the new ETA 10-P supercomputer. The DIAMONDSORT and RADIXSORT vector algorithms in their FORTRAN 200 implementation due to Krieger are tested and compared against the CYBER 205. The MAGEV library routine QSORT is also tested. For the RADIXSORT routine without keeping index list, the ETA 10-P outperforms all the machines but the CYBER 205 (for all previously tested algorithms) at vector lengths equal or longer than 25 600 elements.
The problem of sorting a sequence of n elements on a parallel computer with k processors is considered. The algorithms we present can all be run on a single instruction stream multiple data stream computer. For large ...
详细信息
The problem of sorting a sequence of n elements on a parallel computer with k processors is considered. The algorithms we present can all be run on a single instruction stream multiple data stream computer. For large n, each achieves an asymptotic speed-up ratio of k with respect to the best sequential algorithm, which is optimal in the number of processors used.
暂无评论