High-level synthesis (HLS) enables the automated conversion of high-level language algorithms into synthesizable register-transfer level code, allowing computation-intensive algorithms to be accelerated on FPGAs. Most HLS tools have C++ as their input language, as it is widely known in both the software and hardware industries. However, even though C++ receives a new standard every three years, the HLS tool vendors have mostly provided support and examples using C++98/03. Limiting designs to early C++ standards imposes a productivity penalty, since the newer standards provide both compilation time reductions and a more concise, expressive, and maintainable way of writing code. In this study, we make the case for adopting modern C++ in HLS. We inspect the language features of C++11 and later, and consider their benefits for HLS. We also test the present support for the modern language features with two state-of-the-art commercial HLS tools. Finally, we provide an extended example, demonstrating the increased clarity of code achieved using the newer standards. We note that the investigated HLS tools already have good support for modern C++ features, and urge their adoption to increase designer productivity.
Optimization is essential in almost all fields of engineering, economics, and the sciences. With the advent of high-end computers and the gradual increase in the complexity of optimization problems, algorithms for numerical optimization have been developed. Many existing numerical optimization algorithms suffer from premature convergence, poor local/global search abilities, and high computational complexity. A chaotic optimization algorithm and a chaotic map could help overcome most of these setbacks. This paper offers a detailed study and analysis of five chaotic maps used for global optimization, namely the Chebyshev, Cubic, ICMIC, Neuron, and Sine maps. This work also proposes a new global optimization method, the Hybrid Chaotic Pattern Search Algorithm (HCPSA), for finding the global minimum of multivariable unconstrained optimization problems. Numerical results over 12 benchmark functions and comparative results (accuracy and computational time) against some popular algorithms demonstrate the effectiveness of the proposed algorithm for higher-dimensional non-linear functions. The efficient usage of chaotic maps has helped reduce the computational time needed to evaluate the optimum for higher-dimensional non-linear functions. To showcase the use of HCPSA in a real-world problem, we consider the problem of analyzing financial ratios for predicting bankruptcy. Banks predict bankruptcy from the start of their businesses to determine their financial stability. In this work, we first perform Logistic Regression (LR) on the data obtained from the banks to get the reliability function with financial ratios as decision variables. This function is then maximized using HCPSA and a Chebyshev map. This methodology helps decision-makers within a bank to maximize the reliability of the financial ratios and, most importantly, to protect the bank from disasters. Comparative results of reliability prediction using HCPSA and PSO and a non-par
Real-time exchange of traffic information and channel selection results among users in a dynamic spectrum access (DSA) system consumes scarce spectrum resources. However, without this information it is difficult to avoid collisions and improve the system-wide global utility at the same time in a distributed way. To solve this problem, we propose a multi-agent deep reinforcement learning (RL) based, traffic priority-aware, multi-user distributed DSA scheme for a scenario with multiple orthogonal channels. Unlike conventional approaches that maximize the sum throughput, we maximize a total network utility parameterized by the state of each user's traffic buffer queue. The scheme consists of offline centralized training and distributed execution. The deep Q-learning neural network (DQN) of each user is trained by an offline simulator with global information to learn near-optimal channel selection policies from the transition history. Each DQN takes only the user's local observation as input, so that the scheme based on the trained DQNs can be executed in a distributed way. Simulation results show that the proposed scheme achieves about 90% or more of the performance of a Genie-aided algorithm based on global information, and performs much better than random-type algorithms.
Efficient panoramic video coding plays a crucial role in the metaverse and Web 3.0 by enhancing content delivery, accessibility, and scalability. However, panoramic video is viewed in the spherical domain, while it is coded in the typical two-dimensional plane. Such a framework renders the compression-distortion metric unable to align well with the spherical-distortion perceived by viewers, resulting in inefficient rate-distortion optimization (RDO) in the coding process. Additionally, independent RDO on an individual panoramic video frame is also inefficient as it disregards the distortion propagation caused by the inter-prediction of video coding. To address these issues, a temporal-dependent spherical-distortion model is proposed for efficient panoramic video coding. Using the geometric projection principle, an independent mapping model between the spherical-distortion and the compression-distortion is first established for individual frames. Subsequently, the temporal-dependent spherical-distortion model for consecutive frames is deduced based on the inter-prediction structure. This model is then employed to guide the RDO process for panoramic video coding. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art methods, achieving an average bitrate reduction of 4.2% compared to the reference software VTM with the 360Lib extension.
The Karatsuba algorithm is an effective way to accelerate large integer multiplications through recursive function calls. However, existing hardware implementations of Karatsuba multipliers are limited to fixed operand sizes. To enable their application in diverse domains, including homomorphic encryption with varying multiplicative depths, it is necessary to support variable operand sizes. In this paper, we propose a novel Karatsuba multiplier design, named FlexKA, which supports variable operand sizes through a state machine that manages the dynamic call states of the operation. We evaluate FlexKA on the Xilinx ZynqMP FPGA and demonstrate that it supports variable operand sizes up to 256K bits, achieving a 9.2x speedup compared to a highly-optimized software library running on a CPU. Our results show that FlexKA is an efficient and effective solution for large integer multiplications with flexible operand sizes in hardware.
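For readers unfamiliar with the recursion that such a multiplier has to manage, the following is a minimal software sketch of textbook Karatsuba multiplication. It is not the FlexKA design: the function name `karatsuba` and the `threshold_bits` cutoff are assumptions made for illustration, and the recursion here relies on the Python call stack rather than on a hardware state machine.

```python
def karatsuba(x: int, y: int, threshold_bits: int = 64) -> int:
    """Multiply two non-negative integers using recursive Karatsuba splitting.

    threshold_bits is a hypothetical cutoff: operands at or below this width
    are multiplied directly instead of being split further.
    """
    if x < 0 or y < 0:
        raise ValueError("operands must be non-negative")
    if threshold_bits < 1:
        raise ValueError("threshold_bits must be at least 1")

    # Base case: at least one operand is small enough to multiply directly.
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y

    # Split both operands at the same bit position: v = v_hi * 2**half + v_lo.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive multiplications instead of four.
    low = karatsuba(x_lo, y_lo, threshold_bits)
    high = karatsuba(x_hi, y_hi, threshold_bits)
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, threshold_bits) - low - high

    # Recombine: x*y = high * 2**(2*half) + cross * 2**half + low.
    return (high << (2 * half)) + (cross << half) + low
```

Because the split point is derived from the operand widths at each call, the same routine handles operands of any size, which is the property a fixed-size hardware multiplier lacks.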
The Real-Time Iteration (RTI) is an online nonlinear model predictive control algorithm that performs a single Sequential Quadratic Programming (SQP) iteration per sampling time. The algorithm is split into a preparation phase and a feedback phase, where the latter performs as few computations as possible by solving a single prepared quadratic program. To further improve the accuracy of this method, the Advanced-Step RTI (AS-RTI) performs additional Multi-Level Iterations (MLI) in the preparation phase, such as inexact or zero-order SQP iterations on a problem with a predicted state estimate. This letter extends and streamlines the existing local convergence analysis of AS-RTI by analyzing MLI of levels A and B for the first time and by significantly simplifying the proofs for levels C and D. Moreover, this letter provides an efficient open-source implementation in acados, making it widely accessible to practitioners.
Side-channel attacks are powerful attacks for retrieving secret data by exploiting physical measurements, such as power consumption or electromagnetic emissions. Masking is a popular countermeasure as it can be proven secure against an attacker model. In practice, software-masked implementations suffer from a security reduction due to a mismatch between the considered leakage sources in the security proof and the real ones, which depend on the microarchitecture. We propose ARMISTICE, a framework for formally verifying the absence of leakage in first-order masked implementations taking into account modeled microarchitectural sources of leakage. As a proof of concept, we present the modeling of an Arm Cortex-M3 core from its RTL description and leakage test vectors, as well as the modeling of the memory of an STM32F1 board, exclusively using leakage test vectors. We show that, with these models, ARMISTICE pinpoints vulnerable instructions in real-world masked implementations and helps the design of masked software implementations which are practically secure.
Computational models lie at the heart of computational science, yet few scientists have a clear idea of what a computational model actually is. Is it software? Or an algorithm? How does it relate to mathematical models? What are suitable languages or notations for expressing a computational model in the literature? And will AI make computational models obsolete?
Emerging event-based social networks (EBSNs), such as Meetup, have grown rapidly and become popular in recent years. EBSNs differ from conventional social networks such as Facebook in that they involve not only online social interactions but also offline, in-person interactions. Thus, EBSNs are naturally heterogeneous and possess more valuable social information. Group recommendations in EBSNs are typically based only on the interest information filled in by users, or on friends' group information. Neither method may reflect users' real intentions well. In this study, we propose a recommender system to predict groups that may interest EBSN users, based on a novel heterogeneous augmented graph method and a random walk with restart algorithm. In this approach, online and offline social interactions are combined into a single heterogeneous augmented graph capturing all useful relationships, including user-to-group, user-to-event, user-to-attribute, and group-to-attribute relationships, among others. To our knowledge, this work is the first attempt to apply a random walk algorithm to group recommendation in EBSNs. Extensive experiments on Meetup datasets demonstrate that our proposed recommender system achieves better results in terms of recall, precision, F-Measure, and MRR than four commonly used algorithms: random recommendation, interest-based recommendation, interest- and neighborhood-based recommendation, and Katz Centrality. The strong recommendation performance of our approach may further enhance user satisfaction with EBSNs. Moreover, our approach to group recommendation may also be extended to other recommendation-related applications such as event or friend recommendation.
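The core of a random walk with restart can be sketched in a few lines. The version below is a generic power-iteration formulation on an unweighted graph, not the authors' implementation; the function name, the `restart_prob` value of 0.15, and the toy graph are all assumptions made for illustration. The abstract does not say how the heterogeneous augmented graph is built or weighted.

```python
def random_walk_with_restart(adjacency, start, restart_prob=0.15,
                             tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that, at every
    step, jumps back to `start` with probability `restart_prob` and otherwise
    moves to a uniformly chosen neighbor.

    adjacency maps each node to a list of neighbors; every neighbor must also
    appear as a key.
    """
    nodes = list(adjacency)
    scores = {node: 0.0 for node in nodes}
    scores[start] = 1.0

    for _ in range(max_iter):
        updated = {node: 0.0 for node in nodes}
        for node, neighbors in adjacency.items():
            walk_mass = (1.0 - restart_prob) * scores[node]
            if not neighbors:
                # Dangling node: return its mass to the start node.
                updated[start] += walk_mass
                continue
            share = walk_mass / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        # Restart mass: the scores always sum to 1, so this adds restart_prob.
        updated[start] += restart_prob

        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy graph mixing users, groups, and an event.
toy_graph = {
    "alice": ["hiking", "event1"],
    "bob": ["hiking", "chess", "event1"],
    "carol": ["chess", "cooking"],
    "hiking": ["alice", "bob"],
    "chess": ["bob", "carol"],
    "cooking": ["carol"],
    "event1": ["alice", "bob"],
}
toy_scores = random_walk_with_restart(toy_graph, start="alice")
```

In this toy graph, candidate groups for `alice` would be ranked by their scores. `chess`, reachable through a user who shares a group and an event with her, scores higher than `cooking`, which is one hop further away.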
Constrained multiobjective optimization problems widely exist in real-world applications. To handle them, the balance between constraints and objectives is crucial, but remains challenging due to the non-negligible impact of problem types. In our context, the problem types refer particularly to those determined by the relationship between the constrained Pareto-optimal front (PF) and the unconstrained PF. Unfortunately, there has been little awareness of how to achieve this balance when faced with different types of problems. In this article, we propose a new constraint handling technique (CHT) by taking into account potential problem types. Specifically, inspired by prior work, problems are classified into three primary types: 1) I; 2) II; and 3) III, with the constrained PF being made up of the entire, part, and none of the unconstrained counterpart, respectively. Clearly, any problem must be one of the three types. For each possible type, there is a tailored mechanism for handling the relationship between constraints and objectives (i.e., constraint priority, objective priority, or the switch between them). It is worth mentioning that exact problem types are not required because we just consider their possibilities in the new CHT. Conceptually, we show that the new CHT can make a tradeoff among different types of problems. This argument is confirmed by experimental studies performed on 38 benchmark problems, whose types are known, and a real-world problem (with unknown types) in search-based software engineering. Results demonstrate that within both decomposition-based and nondecomposition-based frameworks, the new CHT can indeed achieve a good tradeoff among different problem types, being better than several state-of-the-art CHTs.
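The three-way taxonomy can be made concrete with a small sketch. It assumes both fronts are available as finite samples of objective vectors that can be compared exactly, which holds for benchmark problems with known fronts but not in general. The function name is hypothetical, and the proposed CHT itself does not require this classification.

```python
def classify_problem_type(unconstrained_pf, constrained_pf):
    """Classify a problem as type "I", "II", or "III" by how much of the
    unconstrained Pareto front also lies on the constrained Pareto front.

    Both arguments are iterables of objective vectors (tuples). Exact
    equality of sampled points is an illustrative simplification.
    """
    unconstrained = set(unconstrained_pf)
    constrained = set(constrained_pf)
    if not unconstrained or not constrained:
        raise ValueError("both fronts must contain at least one point")

    shared = unconstrained & constrained
    if shared == unconstrained:
        return "I"    # the entire unconstrained PF remains optimal
    if shared:
        return "II"   # only part of the unconstrained PF remains optimal
    return "III"      # none of the unconstrained PF remains optimal
```

For example, with an unconstrained front sampled as `[(0, 1), (1, 0)]`, a constrained front of `[(0, 1), (2, 0.5)]` shares one point and is classified as type II.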