Two-dimensional high-resolution inviscid and viscous detonations were conducted in the supersonic combustible mixture with the open-source program AMROC. The results show that as the grid resolution increases, more sm...
详细信息
Convolution operation is the most important and time consuming step in a convolution neural network *** this work,we analyze the computing complexity of direct convolution and fast-Fourier-transform-based(FFT-based) *...
详细信息
ISBN:
(纸本)9781510835368
Convolution operation is the most important and time consuming step in a convolution neural network *** this work,we analyze the computing complexity of direct convolution and fast-Fourier-transform-based(FFT-based) *** creatively propose CS-unit,which is equivalent to a combination of a convolutional layer and a pooling layer but more *** computing complexity of and some other similar operation is demonstrated,revealing an advantage on computation of ***,practical experiments are also performed and the result shows that CS-unit holds a real superiority on run time.
In this research, we apply the Green's theory for converting the partial differential equation to the boundary integral equation for geometric transformation. Green's theory is designed specifically for integr...
详细信息
Fault resilience has became a major issue for HPC systems, particularly, in the perspective of future E-scale systems, which will consist of millions of CPU cores and other components. MPI-level fault tolerant constru...
详细信息
Adaptivity is the capacity of software to adjust itself to changes in its environment. A common approach to achieving adaptivity is to introduce dedicated code during software development stage. However,since those co...
详细信息
Adaptivity is the capacity of software to adjust itself to changes in its environment. A common approach to achieving adaptivity is to introduce dedicated code during software development stage. However,since those code fragments are designed a priori, self-adaptive software cannot handle situations adequately when the contextual changes go beyond those that are originally anticipated. In this case, the original builtin adaptivity should be tuned. For example, new code should be added to provide the capacity to sense the unexpected environment or to replace outdated adaptation decision logic. The technical challenges in this process, especially that of tuning software adaptivity at runtime, cannot be understated. In this paper,we propose an architecture-centric application framework for self-adaptive software named Auxo. Similar to existing work, our framework supports the development and running of self-adaptive software. Furthermore,our framework supports the tuning of software adaptivity without requiring the running self-adaptive software to be terminated. In short, the architecture style that we are introducing can encapsulate not only general functional logic but also the concerns in the self-adaptation loop(such as sensing, decision, and execution)as architecture elements. As a result, a third party, potentially the operator or an augmented software entity equipped with explicit domain knowledge, is able to dynamically and flexibly adjust the self-adaptation concerns through modifying the runtime software architecture. To truly exercise, validate, and evaluate our approach,we describe a self-adaptive application that was deployed on the framework, and conducted several experiments involving self-adaptation and the online tuning of software adaptivity.
It is widely believed that Shor's factoring algorithm provides a driving force to boost the quantum computing ***, a serious obstacle to its binary implementation is the large number of quantum gates. Non-binary quan...
详细信息
It is widely believed that Shor's factoring algorithm provides a driving force to boost the quantum computing ***, a serious obstacle to its binary implementation is the large number of quantum gates. Non-binary quantum computing is an efficient way to reduce the required number of elemental gates. Here, we propose optimization schemes for Shor's algorithm implementation and take a ternary version for factorizing 21 as an example. The optimized factorization is achieved by a two-qutrit quantum circuit, which consists of only two single qutrit gates and one ternary controlled-NOT gate. This two-qutrit quantum circuit is then encoded into the nine lower vibrational states of an ion trapped in a weakly anharmonic potential. Optimal control theory(OCT) is employed to derive the manipulation electric field for transferring the encoded states. The ternary Shor's algorithm can be implemented in one single step. Numerical simulation results show that the accuracy of the state transformations is about 0.9919.
Interconnect network plays an important role in high performance computing systems. And its manageability directly affects the RAS (i.e., Reliability, Availability, and Serviceability) of the whole system. The Tianhe-...
详细信息
Interconnect network plays an important role in high performance computing systems. And its manageability directly affects the RAS (i.e., Reliability, Availability, and Serviceability) of the whole system. The Tianhe-2 system located in NSCC-gz (i.e., National Supercomputing Center of China in Guangzhou) uses proprietary interconnect network, which includes 5,856 high-radix network router chips (i.e., NRC) and 18,304 network interface chips (i.e., NIC). For such a very large-scale interconnect network, it is a great challenge to manage (such as configure, monitor, and debug) the numerous network chips and its network ports in an efficient way. By implementing the in-band management with very few hardware resources, the interconnect network in Tianhe-2 system achieves a highly efficient network management. In this paper, we introduce the design and implementation of the in-band management for interconnect network in Tianhe-2 system, especially emphasizing on several key features, including the set of achieved management functionalities, the architecture of network management, the format of management packets, the data flow and processing of management packets, etc. In this paper, we also evaluate the performance of in-band management by mainly comparing with out-band management scheme. The preliminary results demonstrate the efficiency of the in-band management for interconnect network in Tianhe-2 system.
Audio matching automatically retrieves all excerpts that have the same content as the query audio clip from given audio recordings. The extracted feature is critical for audio matching and the Chroma Energy Normalized...
详细信息
Audio matching automatically retrieves all excerpts that have the same content as the query audio clip from given audio recordings. The extracted feature is critical for audio matching and the Chroma Energy Normalized Statistics(CENS) feature is the state-of-the-arts. However, CENS might behave unsatisfactorily on some audio because it is a handcraft feature. In this paper, we propose to utilize the features learned by Convolutional Deep Belief Network(CDBN) to enhance the performance of audio matching. Benefit from the strong generalization ability of CDBN, our method works better than CENS based methods on most audio datasets. Since the features learned by CDBN are binary-valued, we can develop a more efficient audio matching algorithm by taking the advantage of this property. Experimental results on both TIMIT dataset and a simulated music dataset confirm effectiveness of the proposed CDBN based method comparing with the traditional CENS feature based algorithm.
Edge extraction is an indispensable task in digital image processing. With the sharp increase in the image data, real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSo...
详细信息
Edge extraction is an indispensable task in digital image processing. With the sharp increase in the image data, real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSobel, a novel quantum image edge extraction algorithm is designed based on the flexible representation of quantum image(FRQI) and the famous edge extraction algorithm Sobel. Because FRQI utilizes the superposition state of qubit sequence to store all the pixels of an image, QSobel can calculate the Sobel gradients of the image intensity of all the pixels simultaneously. It is the main reason that QSobel can extract edges quite fast. Through designing and analyzing the quantum circuit of QSobel, we demonstrate that QSobel can extract edges in the computational complexity of O(n2) for a FRQI quantum image with a size of2 n × 2n. Compared with all the classical edge extraction algorithms and the existing quantum edge extraction algorithms, QSobel can utilize quantum parallel computation to reach a significant and exponential ***, QSobel would resolve the real-time problem of image edge extraction.
A novel framework for parallel subgraph isomorphism on GPUs is proposed, named GPUSI, which consists of GPU region exploration and GPU subgraph matching. The GPUSI iteratively enumerates subgraph instances and solves ...
详细信息
A novel framework for parallel subgraph isomorphism on GPUs is proposed, named GPUSI, which consists of GPU region exploration and GPU subgraph matching. The GPUSI iteratively enumerates subgraph instances and solves the subgraph isomorphism in a divide-and-conquer fashion. The framework completely relies on the graph traversal, and avoids the explicit join operation. Moreover, in order to improve its performance, a task-queue based method and the virtual-CSR graph structure are used to balance the workload among warps, and warp-centric programming model is used to balance the workload among threads in a warp. The prototype of GPUSI is implemented, and comprehensive experiments of various graph isomorphism operations are carried on diverse large graphs. The experiments clearly demonstrate that GPUSI has good scalability and can achieve speed-up of 1.4–2.6 compared to the state-of-the-art solutions.
暂无评论