In this paper, we extend our previous research on the coherent observer-based pole placement approach to study the synthesis of robust decoherence-free (DF) modes for linear quantum passive systems, which is aimed at prese...
ISBN: 9781450397339 (print)
The ever-growing size of CNNs introduces substantial redundancy in model parameters, which in turn places a considerable burden on hardware. Unstructured pruning is widely used to reduce model size by increasing sparsity. However, the irregularity introduced by unstructured pruning makes it difficult to accelerate sparse CNNs on systolic arrays. To address this issue, a variety of accelerators have been proposed. SIGMA, the state-of-the-art sparse GEMM accelerator, achieves significant speedup over a systolic array. However, SIGMA has two disadvantages: 1) it supports only one-side sparsity, leaving potential for further performance gains; 2) it improves the utilization of large systolic arrays at the cost of extra overhead. In this paper, we propose DSSA, a dual-side sparse systolic array, to accelerate CNN training. DSSA bases its design on a small systolic array, which naturally achieves higher cell utilization without additional overhead. To facilitate dual-side sparsity processing, DSSA uses a cross-cycle reduction module to accumulate partial sums that belong to the same column but are processed in different cycles. A comprehensive design space exploration is performed to find locally optimal configurations for DSSA. We implement the logic design of DSSA in Verilog at the RTL level and evaluate its performance using a C++-based cycle-accurate performance simulator that we built. Experimental results show that DSSA delivers, on average, speedups of 2.13x and 13.81x over SIGMA and a basic systolic array with the same number of cells, respectively. Compared to SIGMA, DSSA incurs 16.59% area overhead and 25.49% power overhead when the sparse filter is excluded, as was done for SIGMA.
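As a rough software analogue of dual-side sparsity (not the DSSA hardware design itself), the sketch below multiplies two matrices while skipping every product in which either operand is zero, and accumulates partial sums that target the same output column even though they arrive at different steps. The function name `dual_side_sparse_matmul` and the list-of-lists matrix layout are assumptions made for illustration only.

```python
def dual_side_sparse_matmul(A, B):
    """Multiply A (m x k) by B (k x n), skipping products with a zero operand.

    Hypothetical illustration: returns (C, macs), where macs counts the
    multiply-accumulate operations that were actually performed.
    """
    m, n = len(A), len(B[0])
    # Compress both operands: keep only (index, value) pairs of nonzeros.
    a_nz = [[(p, v) for p, v in enumerate(row) if v != 0] for row in A]
    b_nz = [[(j, v) for j, v in enumerate(row) if v != 0] for row in B]

    C = [[0] * n for _ in range(m)]
    macs = 0
    for i in range(m):
        for p, a in a_nz[i]:          # nonzero A[i][p]
            for j, b in b_nz[p]:      # nonzero B[p][j]
                # Partial sums for the same output column j are produced at
                # different steps and accumulated here (software stand-in
                # for accumulating across cycles).
                C[i][j] += a * b
                macs += 1
    return C, macs
```

For a 2x3 by 3x2 product a dense schedule performs 12 multiplications; with both operands compressed, only the nonzero pairs that actually meet are multiplied, which is the saving that one-side sparsity support leaves partly unused.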
Scene text retrieval aims to find all images containing the query text from an image gallery. Current efforts tend to adopt an Optical Character Recognition (OCR) pipeline, which requires complicated text detection an...
ISBN: 9781665424509 (print)
SM4-GCM is an encryption algorithm with an authentication function, providing both data security and information integrity. When implemented with traditional software methods, SM4-GCM has low throughput and high resource consumption. To improve performance, this paper uses an FPGA to accelerate SM4-GCM with a fully pipelined parallel design. First, the SM4 module is optimized using pipelining techniques. Then, the GHASH module is optimized using the Karatsuba algorithm and fast reduction. Finally, a loosely coupled architecture connects the modules through asynchronous FIFOs, which improves the resource utilization and throughput of the FPGA circuit. Experimental results show that the optimized SM4-GCM implementation reaches a throughput of 28.8 Gbps, which is higher than that of other schemes and meets practical application requirements.
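The Karatsuba step can be sketched in software as follows. This is a minimal illustration, not the paper's FPGA design: it splits a 128-bit carry-less multiplication into three 64-bit ones and then reduces modulo x^128 + x^7 + x^2 + x + 1 with a plain bit-serial loop, which stands in for (and is slower than) the fast reduction the abstract mentions. The function names are hypothetical, and the bit-reflected operand ordering that GCM uses is deliberately ignored.

```python
MASK64 = (1 << 64) - 1
# Field polynomial x^128 + x^7 + x^2 + x + 1, in natural (non-reflected) bit order.
GCM_POLY = (1 << 128) | 0x87


def clmul(a: int, b: int) -> int:
    """Schoolbook carry-less multiplication, i.e. multiplication in GF(2)[x]."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba_clmul128(a: int, b: int) -> int:
    """128 x 128-bit carry-less multiply using three 64 x 64 multiplies."""
    a_lo, a_hi = a & MASK64, a >> 64
    b_lo, b_hi = b & MASK64, b >> 64
    lo = clmul(a_lo, b_lo)
    hi = clmul(a_hi, b_hi)
    # In GF(2), addition and subtraction are both XOR.
    mid = clmul(a_lo ^ a_hi, b_lo ^ b_hi) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo


def reduce_poly(p: int) -> int:
    """Bit-serial reduction of p modulo GCM_POLY (illustrative, not 'fast')."""
    for i in range(p.bit_length() - 1, 127, -1):
        if (p >> i) & 1:
            p ^= GCM_POLY << (i - 128)
    return p


def gf128_mul(a: int, b: int) -> int:
    """Multiply two field elements of GF(2^128) given as 128-bit integers."""
    return reduce_poly(karatsuba_clmul128(a, b))
```

The Karatsuba split trades one of the four half-width multiplications for a few XORs, which in hardware generally translates into fewer multiplier resources per GHASH block.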
ISBN: 9781450384155 (print)
This paper studies the optimal scheme for dynamically adjusting the active mirror in the FAST system by establishing a mathematical model. Subject to constraints such as the allowed range of distance variation, a mathematical model describing the dynamic trajectory of the position and angle of the active mirror is established, and the correctness of the model's solution is verified by simulation. The optimal surface model is selected from all the results. When the measured object is directly above the FAST system, the surface is modeled on the ideal parabola so that the expansion constraints and threshold conditions of the actuators are met. Because the distribution of actuators and cables across the FAST system is not a uniform sphere, the change in the radial distance of a cable can be treated as the expansion of the actuator. In addition, the model is optimized in two ways: adjusting the position of the focus on the horizontal plane and changing the size of the zoom alignment. From these two adjustments, the parabolic focusing range that satisfies the expansion range is obtained. Finally, the optimization strategy for the dynamic adjustment of the active mirror in the FAST system is given.
ISBN: 9781450384094 (print)
Knowledge graph representation learning supports downstream tasks such as knowledge graph completion, information retrieval, and intelligent question answering. By representing the knowledge graph as low-dimensional dense vectors, these tasks can be carried out significantly more efficiently. However, because real knowledge graphs are sparse and very large, representation learning from a purely structural perspective can no longer meet research needs. To improve performance, researchers have introduced auxiliary information into representation learning. This paper focuses on models that use text as auxiliary information and divides all text-combined models into two categories: closed-world assumption models and open-world assumption models. The former are limited by the model's need for a structural representation of the graph itself and can only predict entities and relationships that already exist in the knowledge graph. The latter can handle entities that were not seen during model training and connect brand-new entities to the knowledge graph, which better matches the dynamic nature of real-world knowledge graphs. Open-world assumption models can be further subdivided into multiple types according to their joint functions, such as alignment functions, fusion functions, score functions, and transformation functions. This paper summarizes existing methods in detail and discusses possible future research directions.
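To make the closed-world versus open-world distinction concrete, the sketch below uses a TransE-style translation score as the structural scoring function; TransE is chosen here only as a familiar example and is not named in the abstract. The text encoder is an assumed, deliberately simple stand-in (averaging word vectors), and the names `transe_score` and `embed_from_text` are hypothetical.

```python
import math


def transe_score(h, r, t):
    """TransE-style plausibility score: higher (closer to 0) is more plausible.

    Scores a triple (head, relation, tail) as -||h + r - t||_2.
    """
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def embed_from_text(tokens, word_vecs):
    """Assumed text encoder: average the vectors of the known tokens.

    An entity never seen during training has no learned structural embedding,
    but it can still be placed in the vector space from its description.
    """
    dim = len(next(iter(word_vecs.values())))
    vecs = [word_vecs[w] for w in tokens if w in word_vecs]
    if not vecs:
        return [0.0] * dim
    return [sum(component) / len(vecs) for component in zip(*vecs)]
```

A closed-world model can only score triples whose entities already have learned vectors; in an open-world setting, a vector produced from text, as in `embed_from_text`, is passed to the same scoring function so that a new entity can be linked into the graph.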
Existing deep learning-based encrypted traffic recognition methods can achieve high precision identification performance while protecting user privacy, but almost all of them focus on closed sets, in which training da...
Software vulnerability mining is an important component of network attack and defense technology. To address the high miss (false negative) rate and high false positive rate of existing static analysis methods, this paper pr...
In recent years, artificial intelligence has fueled the development of numerous applications [1, 2]. Person re-identification (re-ID) is a typical artificial intelligence system designed to automatically retrieve images of specific individuals from galleries captured by different cameras [3].