Search Results - Inner Mongolia University Library

Parallel Implementation of PIPO and Its Application for Format Preserving Encryption

IEEE ACCESS, 2022, Vol. 10, pp. 99963-99972

Authors: Kim, Hyunji; Kim, Hyunjun; Eum, Siwoo; Kwon, Hyeokdong; Yang, Yujin; Seo, Hwajeong (Hansung Univ IT Convergence Div Seoul 02876 South Korea)

The PIPO block cipher, a domestic lightweight block cipher, was announced at ICISC'20. In particular, the bitslicing technique is implemented in the S-Layer of the PIPO block cipher. Because this part can be operated in parallel, we implemented the PIPO block cipher efficiently in a parallel approach through AVX2 instructions, and provide implementations for ECB and CTR modes. Compared to the existing PIPO implementation, we achieved a 7.345x performance improvement. In addition, we applied the AVX2-PIPO implementation to the round function of format-preserving encryption. When repeatedly encrypting 128-byte plaintext, we achieved performance similar to that of the existing FF1-AES implementation. The FF1-AVX2-PIPO implementation successfully encrypted the database and enabled efficient database management in terms of memory space and speed. Finally, AVX2-PIPO-CTR and FF1-AVX2-PIPO were applied to image processing. In the case of CTR mode, the encryption performance was better than that of ECB mode. Partial encryption with object detection and FF1-AVX2-PIPO was successfully performed, and it is expected that privacy protection in CCTV or image processing can be improved.

Keywords: PIPO block cipher parallel implementation format preserving encryption

Parallel implementation of an aggregation/disaggregation method for evaluating quasi-stationary behavior in continuous-time Markov chains

PARALLEL COMPUTING, 1997, Vol. 23, No. 10, pp. 1545-1559

Author: Bebbington, MS (Department of Statistics, Massey University, Private Bag 11222, Palmerston North, New Zealand)

We describe how an aggregation/disaggregation method for finding quasi-stationary distributions of continuous-time Markov chains can be implemented on a massively parallel computer. The method is similar to an algebraic multigrid, using restriction operators that depend on the current iteration of the solution, and Jacobi smoothers at each level of the multigrid. The method is illustrated using a simple epidemic model, and the performance compared to a sequential implementation as the size of the population increases. (C) 1997 Elsevier Science B.V.

Keywords: aggregation/disaggregation methods algebraic multigrid epidemic parallel implementation

Parallel Implementation Strategy for CoHOG-Based Pedestrian Detection Using a Multi-Core Processor

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2011, Vol. E94A, No. 11, pp. 2315-2322

Authors: Miyamoto, Ryusuke; Sugano, Hiroki (Nara Inst Sci & Technol Grad Sch Informat Sci Ikoma 6300192 Japan)

Pedestrian detection from visual images, which is used for driver assistance or video surveillance, is a challenging recent problem. Co-occurrence histograms of oriented gradients (CoHOG) is a powerful feature descriptor for pedestrian detection and achieves the highest detection accuracy. However, its computational cost is too high for real-time calculation on state-of-the-art processors. In this paper, to obtain an optimal parallel implementation for an NVIDIA GPU, several kinds of parallelism in CoHOG-based detection are shown and evaluated for their suitability for implementation. The experimental results show that the detection process can be performed at 16.5 fps on QVGA images on an NVIDIA Tesla C1060 with the optimized parallel implementation. Our evaluation shows that the optimal parallel implementation strategy for an NVIDIA GPU differs from that for an FPGA. We discuss the reason and show the advantages of each device. To show the scalability and portability of the GPU implementation, the same object code is executed on other NVIDIA GPUs. The experimental results show that a GTX570 can perform CoHOG-based pedestrian detection at 21.3 fps on QVGA images.

Keywords: pedestrian detection parallel implementation CoHOG GPU computing

A PARALLEL IMPLEMENTATION OF THE KALMAN FILTER WITH APPLICATION TO MEASUREMENTS ON ELECTRICAL DRIVES

TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 1994, Vol. 16, No. 2, pp. 108-116

Authors: BUCCI, G; GERMANO, A; TOFONI, T (ADV SCH TELECOMMUN G REISS ROMOLI, LAQUILA, ITALY)

In this paper a parallel implementation of the Kalman filter is proposed, to speed up computation using concurrent calculus techniques and factorisation methods, that help avoid numerical instability problems. The algorithm has been implemented on a measuring system based on the use of a transputer network and a data acquisition board, and applied to measurement on asynchronous motors. Some experimental results obtained with the proposed real system are also shown and the performance is reported.

Keywords: KALMAN FILTER PARALLEL IMPLEMENTATION ELECTRICAL DRIVE MEASUREMENTS

Parallel Implementation of K-Means Algorithm on FPGA

IEEE ACCESS, 2020, Vol. 8, pp. 41071-41084

Authors: Dias, Leonardo A.; Ferreira, Joao C.; Fernandes, Marcelo A. C. (Univ Fed Rio Grande do Norte Lab Machine Learning & Intelligent Instrumentat nPITI IMD BR-59078970 Natal RN Brazil; Univ Porto INESC TEC P-4200465 Porto Portugal; Univ Porto Fac Engn P-4200465 Porto Portugal; Univ Fed Rio Grande do Norte Dept Comp Engn & Automat BR-59078970 Natal RN Brazil; Harvard Univ John A Paulson Sch Engn & Appl Sci Cambridge MA 02138 USA)

The K-means algorithm is widely used to find correlations between data in different application domains. However, given the massive amount of data stored, known as Big Data, the need for high-speed processing to analyze data has become even more critical, especially for real-time applications. A solution that has been adopted to increase the processing speed is the use of parallel implementations on FPGA, which has proved to be more efficient than sequential systems. Hence, this paper proposes a fully parallel implementation of the K-means algorithm on FPGA to optimize the system's processing time, thus enabling real-time applications. This proposal, unlike most implementations proposed in the literature, even parallel ones, does not have sequential steps, a limiting factor of processing speed. Results related to processing time (or throughput) and FPGA area occupancy (or hardware resources) were analyzed for different parameters, reaching performances higher than 53 million data points processed per second. Comparisons to the state of the art are also presented, showing speedups of more than over a partially serial implementation.

Keywords: parallel implementation FPGA K-means algorithm reconfigurable computing

Parallel implementation of a three-dimensional cellular automaton model of the electrochemical oxidation of carbon "Ketjenblack EC-600JD"

JOURNAL OF SUPERCOMPUTING, 2019, Vol. 75, No. 12, pp. 7790-7798

Authors: Kireeva, A. E.; Sabelfeld, K. K.; Gribov, E. N.; Maltseva, N. V. (RAS Inst Computat Math & Math Geophys SB Pr Lavrentjeva 6 Novosibirsk Russia; RAS Boreskov Inst Catalysis SB Pr Lavrentieva 5 Novosibirsk Russia; Novosibirsk State Univ Pirogova Str 2 Novosibirsk Russia)

The paper presents a three-dimensional cellular automaton model of the electrochemical oxidation of carbon. A sample of the electro-conductive carbon black "Ketjenblack EC-600JD", consisting of carbon granules, is simulated. The electrochemical oxidation of the carbon granules occurs through a few successive stages. A parallel implementation of the three-dimensional cellular automaton model of carbon corrosion is developed. The efficiency and speedup of the parallel code are analyzed. The portions of surface carbon atoms and of atoms with different degrees of oxidation are computed by the parallel code. Based on the obtained values of atom portions, the electrochemical capacity is calculated. The results of the computer simulation are compared with experimental data.

Keywords: Parallel implementation Cellular automaton Domain decomposition Connected component Electrochemical oxidation Carbon corrosion

PARALLEL IMPLEMENTATION OF THE SCHUR BERLEKAMP-MASSEY ALGORITHM ON A LINEARLY CONNECTED PROCESSOR ARRAY

IEEE TRANSACTIONS ON COMPUTERS, 1995, Vol. 44, No. 7, pp. 930-933

Author: ZAROWSKI, CJ (Dept. of Electr. Eng., Queen's Univ., Kingston, Ont., Canada)

The Berlekamp-Massey algorithm (BMA) is important in the decoding of Reed-Solomon (RS), and more generally, Bose-Chaudhuri-Hocquenghem (BCH) block error-control codes. For a t-error correcting code the BMA has time complexity O(t^2) when implemented on a sequential computer. However, the BMA does not run efficiently on a parallel computer. The BMA can be mapped into the Schur BMA. This paper presents the implementation of the BMA and Schur BMA together on a linearly connected array of 2t processors. The resulting machine computes the error-locator polynomial with a time complexity of O(t).

Keywords: SCHUR ALGORITHM BERLEKAMP-MASSEY ALGORITHM PARALLEL IMPLEMENTATION LINEARLY CONNECTED PROCESSOR ARRAY

PARALLEL IMPLEMENTATION OF THE EXTENDED SQUARE-ROOT COVARIANCE FILTER FOR TRACKING APPLICATIONS

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1993, Vol. 4, No. 4, pp. 446-457

Authors: LEE, EKB; HAYKIN, S (MCMASTER UNIV, COMMUN APPL RES LAB, HAMILTON L8S 4L8, ONTARIO, CANADA)

Parallel implementations of the extended square-root covariance filter (ESRCF) for tracking applications are developed in this paper. The decoupling technique and special properties in the tracking Kalman filter (KF) are explored to reduce computational requirements and to increase parallelism. The application of the decoupling technique to the ESRCF results in the time and measurement updates of m decoupled (n/m)-dimensional matrices instead of 1 coupled n-dimensional matrix, where m denotes the tracking dimension and n denotes the number of state elements. The updates of m decoupled matrices are found to require approximately m times fewer processing elements and clock cycles than the updates of 1 coupled matrix. The transformation of the Kalman gain which accounts for the decoupling technique is found to be straightforward to implement. The sparse nature of the measurement matrix and the sparse, band nature of the transition matrix are explored to simplify matrix multiplications.

Keywords: DECOUPLING TECHNIQUE EXTENDED SQUARE-ROOT COVARIANCE FILTER KALMAN FILTER PARALLEL IMPLEMENTATION SYSTOLIC ARRAY TRACKING KF PROPERTIES VLSI

PARALLEL IMPLEMENTATION OF 3D CONVEX-HULL ALGORITHM

COMPUTER-AIDED DESIGN, 1991, Vol. 23, No. 3, pp. 177-188

Author: DAY, AM (School of Information Systems, University of East Anglia, Norwich, UK)

The paper presents a parallel implementation of a 3D convex-hull algorithm on a Meiko Computing Surface using OCCAM and C. The parallel program is adapted from a serial divide-and-conquer version; the outline of the serial version is also given. Details relating to the practical problems involved in the parallelization of such a geometric algorithm are reported. The performance of the parallel program is monitored for several different sizes of network, and compared with the performance of the serial version running on a Sun workstation. Experimental results are presented, and suggestions for further developments of the implementation are also discussed.

Keywords: COMPUTATIONAL GEOMETRY PARALLEL IMPLEMENTATION CONVEX HULL TRANSPUTER

Parallel Implementation of the Nonlinear Semi-NMF Based Alternating Optimization Method for Deep Neural Networks

NEURAL PROCESSING LETTERS, 2018, Vol. 47, No. 3, pp. 815-827

Authors: Imakura, Akira; Inoue, Yuto; Sakurai, Tetsuya; Futamura, Yasunori (Univ Tsukuba Tennodai 1-1-1 Tsukuba Ibaraki 3058573 Japan; CREST JST Kawaguchi Saitama Japan)

For computing weights of deep neural networks (DNNs), the backpropagation (BP) method has been widely used as a de-facto standard algorithm. Since the BP method is based on a stochastic gradient descent method using derivatives of objective functions, the BP method has some difficulties finding appropriate parameters such as learning rate. As another approach for computing weight matrices, we recently proposed an alternating optimization method using linear and nonlinear semi-nonnegative matrix factorizations (semi-NMFs). In this paper, we propose a parallel implementation of the nonlinear semi-NMF based method. The experimental results show that our nonlinear semi-NMF based method and its parallel implementation have competitive advantages over the conventional DNNs with the BP method.

Keywords: Deep neural networks Nonlinear semi-nonnegative matrix factorizations Parallel implementation
