The indirect solution of optimal control problems (OCPs) with inequality constraints and parameters is obtained by solving the two-point boundary value problem (BVP) involving index-1 differential-algebraic equations (DAEs) associated with their first-order optimality conditions. This paper introduces an adaptive mesh refinement method based on a collocation method for solving the index-1 BVP-DAEs. The paper first derives a method to estimate the relative error between the numerical solution and the exact solution. The relative error estimate is then used to guide the mesh refinement process. When the estimated error within a mesh interval exceeds the numerical tolerance, the mesh size is increased by either increasing the order of the approximating polynomial or dividing the interval into multiple subintervals. In mesh intervals where the error tolerance has been met, the mesh size is reduced by either decreasing the degree of the approximating polynomial or merging adjacent mesh intervals. An efficient parallel implementation of the method is developed using Python and CUDA. The paper presents three examples which show that the approach is more computationally efficient and robust than fixed-order methods.
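The refine-or-coarsen decision described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the method chooses between changing the polynomial degree and changing the interval layout, so the rule below (raise the degree first and split only at a degree cap; coarsen only when the error is far below tolerance) is an assumption, as are the names `Interval`, `adapt_mesh`, `min_degree`, `max_degree`, `n_split`, and `slack`.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    left: float
    right: float
    degree: int   # degree of the approximating polynomial on this interval
    error: float  # estimated relative error on this interval


def adapt_mesh(mesh, tol, min_degree=2, max_degree=6, n_split=2, slack=0.1):
    """One adaptation pass over a list of Interval objects (hypothetical rule).

    The stored errors are carried over unchanged; in practice they would be
    re-estimated after the problem is re-solved on the new mesh.
    """
    out = []
    for iv in mesh:
        if iv.error > tol:
            if iv.degree < max_degree:
                # Refine by raising the polynomial degree.
                out.append(Interval(iv.left, iv.right, iv.degree + 1, iv.error))
            else:
                # Degree is capped: split into n_split equal subintervals.
                h = (iv.right - iv.left) / n_split
                for k in range(n_split):
                    a = iv.left + k * h
                    b = iv.right if k == n_split - 1 else iv.left + (k + 1) * h
                    out.append(Interval(a, b, min_degree, iv.error))
        elif iv.error < slack * tol:
            prev = out[-1] if out else None
            if (iv.degree == min_degree and prev is not None
                    and prev.degree == min_degree
                    and prev.error < slack * tol):
                # Coarsen by merging with the previous over-resolved interval.
                out[-1] = Interval(prev.left, iv.right, min_degree,
                                   max(prev.error, iv.error))
            elif iv.degree > min_degree:
                # Coarsen by lowering the polynomial degree.
                out.append(Interval(iv.left, iv.right, iv.degree - 1, iv.error))
            else:
                out.append(iv)
        else:
            # Tolerance met without much slack: leave the interval alone.
            out.append(iv)
    return out
```

Calling `adapt_mesh(mesh, tol=1e-6)` repeatedly, with a re-solve and a fresh error estimate between passes, would give the outer adaptive loop under these assumptions.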
ISBN: (print) 9781665401746
With the development of the Internet of Things (IoT), where humans and machines interact, healthcare that measures and diagnoses bio-signals is advancing. The electrocardiogram (ECG) signal has different normal beat characteristics for each person, and detecting abnormalities requires long-term data. In this paper, we increase the detection rate of normal signals by learning the reference signal, which is the standard for diagnosing ECG signals, as individual-specific signals from existing fixed data. In addition, we propose an OpenCL-based FPGA-GPU hybrid cooperative platform to efficiently diagnose long-term, large-capacity ECG signals.
3D mesh watermarking in the transform domain is computationally expensive, mainly because of the growing use of high-resolution meshes, which require ever more resources. This cost harms the commercial chain of low-computational-cost applications that require content protection or enrichment. To tackle this issue, we propose a high-capacity, blind watermarking scheme for 3D multiresolution semi-regular meshes that maintains a trade-off between efficiency and robustness. For this purpose, our solution uses an unlifted butterfly wavelet transform that exploits the computing power of the graphics processing unit (GPU) architecture and the Open Computing Language (OpenCL) framework. Robustness is optimized by generating a turbo-encoded watermark. The latter is embedded in the wavelet coefficients, after their spherical parametrization, at various levels of detail using the least significant bit technique. The method improves the imperceptibility of the watermark and is invariant to affine transformations. It also shows comparable robustness against most geometric attacks, including additive noise, quantization, smoothing and compression. Moreover, comparison with other serial watermarking schemes demonstrates the effectiveness of our method in terms of computational complexity. The OpenCL embedding implementation offers 3x to 9x speedups on a low-power GPU architecture for different mesh sizes. For the extraction procedure, the speedups vary between 2x and 12x.
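The least significant bit step mentioned in this abstract can be illustrated with a small Python sketch. It covers only LSB embedding and blind extraction on a flat list of coefficients. The uniform quantization step `step` and the function names `embed_lsb` and `extract_lsb` are assumptions made for illustration, and the wavelet transform, spherical parametrization, and turbo coding of the actual scheme are omitted.

```python
def embed_lsb(coeffs, bits, step=1e-3):
    """Embed one bit per coefficient in the LSB of its quantized value.

    `step` is a hypothetical quantization step; the abstract gives none.
    """
    out = list(coeffs)
    for i, bit in enumerate(bits):
        q = round(coeffs[i] / step)   # quantize to an integer index
        q = (q & ~1) | bit            # overwrite the least significant bit
        out[i] = q * step             # map back to a coefficient value
    return out


def extract_lsb(coeffs, n_bits, step=1e-3):
    """Blind extraction: only the marked coefficients and `step` are needed."""
    return [round(c / step) & 1 for c in coeffs[:n_bits]]
```

Under this assumption each marked coefficient moves by at most 1.5 times `step`, which is the usual trade-off between imperceptibility (small step) and robustness (large step).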
The powerful parallel computation ability of a graphics processing unit (GPU) makes it feasible to perform dynamic receive beamforming. However, a real-time GPU-based beamformer requires a high data rate to transfer radio-frequency (RF) data from hardware to software memory, as well as from central processing unit (CPU) to GPU memory. Data compression methods (e.g. Joint Photographic Experts Group (JPEG)) are available for the hardware front end to reduce data size, alleviating the data transfer requirement of the hardware interface. Nevertheless, the required decoding time may be even larger than the transmission time of the original data, in turn degrading the overall performance of the GPU-based beamformer. This article proposes and implements a lossless compression-decompression algorithm that enables parallel compression and decompression of data. By this means, the data transfer requirement of the hardware interface and the transmission time of CPU-to-GPU data transfers are reduced without sacrificing image quality. In simulation results, the compression ratio reached around 1.7. The encoder design of our lossless compression approach requires low hardware resources and reasonable latency in a field programmable gate array. In addition, the time needed to transfer data from CPU to GPU with the parallel decoding process improved threefold compared with transferring the original uncompressed data. These results show that our proposed lossless compression plus parallel decoder approach not only mitigates the transmission bandwidth requirement for transferring data from the hardware front end to the software system but also reduces the transmission time for CPU-to-GPU data transfer.
When a camera rotates rapidly or shakes severely, a conventional KLT (Kanade-Lucas-Tomasi) feature tracker becomes vulnerable to large inter-image appearance changes. Tracking fails in the KLT optimization step, mainly because the initial condition, taken as the final image warp from the previous frame, is inadequate. In this paper, we present a gyro-aided feature tracking method that remains robust under fast camera ego-rotation. Knowledge of the camera's inter-frame rotation, obtained from gyroscopes, provides an improved initial warping condition, which is more likely to lie within the convergence region of the original KLT. Moreover, the use of an eight-degree-of-freedom affine photometric warping model enables the KLT to cope with camera rolling and illumination change in an outdoor setting. For automatic incorporation of sensor measurements, we also propose a novel camera/gyro auto-calibration method which can be applied in an in-situ or on-the-fly fashion. Only a set of feature tracks of natural landmarks is needed to simultaneously recover intrinsic and extrinsic parameters for both sensors. We provide a simulation evaluation for our auto-calibration method and demonstrate enhanced tracking performance for real scenes with aid from low-cost microelectromechanical system gyroscopes. To alleviate the heavy computational burden required for high-order warping, we discuss our publicly available GPU implementation for tracker parallelization.
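The idea of using the gyro-measured rotation as an initial warp can be sketched in Python under simplifying assumptions that are not taken from the paper: a pure-rotation camera model, a constant angular rate over the frame interval, gyro and camera axes already aligned (identity extrinsics), and a pinhole intrinsic matrix. With these, the predicted inter-frame warp is the homography H = K R K^{-1}. The function names and the parameters `fx`, `fy`, `cx`, `cy` are illustrative.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rodrigues(rx, ry, rz):
    """Rotation matrix for the rotation vector (rx, ry, rz), in radians."""
    I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        return I
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]  # skew matrix of the axis
    K2 = matmul3(K, K)
    s, v = math.sin(theta), 1.0 - math.cos(theta)
    return [[I[i][j] + s * K[i][j] + v * K2[i][j] for j in range(3)]
            for i in range(3)]


def predicted_homography(omega, dt, fx, fy, cx, cy):
    """Homography mapping pixels of frame 1 to frame 2 for a rotating camera.

    omega: gyro rate (rad/s) expressed in the camera frame (assumed aligned).
    The camera rotates by exp([omega*dt]x), so scene coordinates rotate by
    the inverse, which is the rotation for the negated rotation vector.
    """
    R21 = rodrigues(-omega[0] * dt, -omega[1] * dt, -omega[2] * dt)
    Kc = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    Kc_inv = [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy],
              [0.0, 0.0, 1.0]]
    return matmul3(matmul3(Kc, R21), Kc_inv)


def warp_pixel(H, u, v):
    """Apply homography H to pixel (u, v) and dehomogenize."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w
```

A tracker could seed each feature's position in the new frame with `warp_pixel(H, u, v)` instead of its previous position. The abstract's affine photometric model has more parameters than this sketch covers.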
A one-sided Jacobi hyperbolic singular value decomposition (HSVD) algorithm, using a massively parallel graphics processing unit (GPU), is developed. The algorithm also serves as the final stage of solving a symmetric indefinite eigenvalue problem. Numerical testing demonstrates the gains in speed and accuracy over sequential and MPI-parallelized variants of similar Jacobi-type HSVD algorithms. Finally, possibilities of hybrid CPU-GPU parallelism are discussed.
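For readers unfamiliar with one-sided Jacobi methods, the following Python sketch shows the ordinary (non-hyperbolic) variant, which orthogonalizes pairs of columns with plane rotations until all columns are mutually orthogonal. It is a sequential illustration of the general technique only. It is not the hyperbolic algorithm or the GPU implementation described in the abstract, and the function name and parameters are illustrative.

```python
import math


def one_sided_jacobi_singular_values(A, tol=1e-12, max_sweeps=30):
    """Singular values of a dense matrix A (list of rows) by one-sided Jacobi."""
    m, n = len(A), len(A[0])
    G = [row[:] for row in A]  # work on a copy
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(G[i][p] ** 2 for i in range(m))
                beta = sum(G[i][q] ** 2 for i in range(m))
                gamma = sum(G[i][p] * G[i][q] for i in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                # Rotation angle that makes columns p and q orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    gp, gq = G[i][p], G[i][q]
                    G[i][p] = c * gp - s * gq
                    G[i][q] = s * gp + c * gq
        if not rotated:
            break  # a full sweep found nothing left to rotate
    # The singular values are the norms of the orthogonalized columns.
    norms = [math.sqrt(sum(G[i][j] ** 2 for i in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)
```

Rotations on disjoint column pairs are independent of one another, which is what makes this family of methods a natural fit for massively parallel hardware.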