ISBN: (print) 9798350383638; 9798350383645
The field of artificial intelligence (AI) encompasses a variety of algorithms designed to achieve high accuracy at low computational cost and latency. One popular algorithm is the vision transformer (ViT), which excels at various computer vision tasks owing to its ability to capture long-range dependencies effectively. This paper analyzes a computing paradigm, namely spatial transformer networks (STN), in terms of accuracy and hardware complexity for image classification tasks. The paper reveals that for 2D applications, such as image recognition and classification, STN is a strong backbone for AI algorithms because of its efficiency and fast inference time. This framework offers a promising solution for efficient and accurate AI on resource-constrained Internet of Things (IoT) and edge devices. A comparative analysis of STN implementations on central processing unit (CPU), Raspberry Pi (RPi), and resistive random-access memory (RRAM) architectures reveals nuanced performance variations, providing insights into their respective computational efficiency and energy utilization.
With the continuous progress of machine learning technology, autonomous-driving perception technology is gradually maturing. As a guarantee of normal traffic operation, signal lights can effectively guide vehicl...
Deep learning has been widely applied in image enhancement, but traditional deep learning methods have not taken into account the uncertainty and reliability of image enhancement. We propose a new method of using gen...
ISBN: (print) 9791188428106
MIMO systems, which consist of multiple transmitting and receiving antennas, are used to make effective use of frequency bandwidth. In general, signal separation in MIMO requires a known channel matrix. However, with independent component analysis (ICA), signal separation is possible even when the channel matrix is unknown. Conversely, if the channel matrix is known, MIMO can be used efficiently by precoding on the reverse-direction link in TDD. In this study, we examined whether ICA can be used for signal separation and channel information estimation. We report that signal detection and channel estimation in large-scale MIMO can be performed without any problems.
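To illustrate why conventional separation needs a known channel matrix, here is a minimal sketch of zero-forcing detection for a 2x2 complex channel. Zero forcing is chosen here only as a simple illustration and is not stated in the abstract; the function name `zero_forcing_2x2` and the example channel values are hypothetical. It does not implement the ICA-based method the paper studies.

```python
def zero_forcing_2x2(H, y):
    """Recover the two transmitted symbols from y = H x by inverting H.

    H is a 2x2 channel matrix given as nested lists of complex numbers,
    y is the list of two received samples. Noise is ignored in this sketch.
    """
    (a, b), (c, d) = H
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is singular; streams cannot be separated")
    # Closed-form inverse of a 2x2 matrix applied to y.
    return [(d * y[0] - b * y[1]) / det, (-c * y[0] + a * y[1]) / det]


# Hypothetical channel and transmitted symbols, for illustration only.
H = [[1 + 1j, 0.5 + 0j], [0.2j, 1 - 0.5j]]
x = [1 + 0j, -1 + 0j]
y = [H[0][0] * x[0] + H[0][1] * x[1], H[1][0] * x[0] + H[1][1] * x[1]]
x_hat = zero_forcing_2x2(H, y)
```

Without knowledge of `H` this inversion is not available, which is the gap that blind methods such as ICA aim to close.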
ISBN: (print) 9781728198354
Adaptive block partitioning is responsible for large gains in current image and video compression systems. This method is able to compress large stationary image areas with only a few symbols, while maintaining a high level of quality in more detailed areas. Current state-of-the-art neural-network-based image compression systems, however, use only one scale to transmit the latent space. In previous publications, we proposed RDONet, a scheme to transmit the latent space in multiple spatial resolutions. Following this principle, we extend a state-of-the-art compression network by a second hierarchical latent-space level to enable multi-scale processing. We extend the existing rate variability capabilities of RDONet by a gain unit. With that, we are able to outperform an equivalent traditional autoencoder, achieving rate savings of 7%. Furthermore, we show that even though we add an additional latent space, the complexity increases only marginally and the decoding time can potentially even be decreased.
Gait recognition technology, as a key branch in the field of identity recognition, has significant value but also faces challenges such as discomfort from wearable devices, the need for synchronized processing of gait...
ISBN: (print) 9798350350463; 9798350350456
The image deblurring problem is an active area of research in image processing. The Fast Iterative Shrinkage Thresholding Algorithm (FISTA) has garnered significant attention for solving deblurring problems with ℓ1-based sparsity constraints. This paper proposes a new ℓ1-based algorithm called Enhanced FISTA (EFISTA) that incorporates accelerated gradient descent and an appropriate proximal operation. We have studied the impact of accelerated gradient descent in noisy conditions, which helps us identify the importance of a well-designed proximal operation to mitigate noise interference. The experimental results show that EFISTA exhibits superior execution speed while maintaining reconstruction performance comparable to its predecessors. This highlights the robustness and efficiency of EFISTA in addressing image deblurring challenges, particularly at high noise levels.
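For context, the baseline that EFISTA builds on can be sketched as follows. This is a minimal sketch of standard FISTA for the problem min 0.5*||Ax - b||^2 + lam*||x||_1, not the EFISTA variant, whose details the abstract does not give. The names `fista`, `soft_threshold`, `matvec`, and `rmatvec`, and the parameters `lam`, `step`, and `iters`, are illustrative assumptions; `step` is assumed to be at most 1/L, where L is the largest eigenvalue of A^T A.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def fista(A, b, lam, step, iters=200):
    """Standard FISTA: proximal gradient step plus momentum extrapolation."""
    n = len(A[0])
    x = [0.0] * n      # current iterate
    y = x[:]           # extrapolated point where the gradient is evaluated
    t = 1.0            # momentum parameter
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)  # gradient of 0.5 * ||A y - b||^2
        x_new = soft_threshold(
            [yi - step * g for yi, g in zip(y, grad)], step * lam
        )
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

In a deblurring setting, `A` would represent the blur operator and `b` the blurred, noisy observation; with `A` equal to the identity the result reduces to soft thresholding of `b`.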
ISBN: (print) 9781665491907
Most existing moving object segmentation (MOS) methods regard MOS as an independent task. In this paper, we associate the MOS task with semantic segmentation and propose a semantics-guided network for moving object segmentation (LiDAR-SGMOS). We first transform the range image and semantic features of the past scan into the range view of the current scan based on the relative pose between scans. The residual image is obtained by calculating the normalized absolute difference between the current and transformed range images. Then, we apply a Meta-Kernel based cross scan fusion (CSF) module to adaptively fuse the range images and semantic features of the current scan, the residual image, and the transformed features. Finally, the fused features with rich motion and semantic information are processed to obtain reliable MOS results. We also introduce a residual image augmentation method to further improve the MOS performance. Our method outperforms most LiDAR-MOS methods with only two sequential LiDAR scans as inputs on the SemanticKITTI MOS dataset.
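The residual-image step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the difference is normalized by the current range value, and pixels with a non-positive range in either image are treated as invalid and set to zero. The abstract does not specify the normalization or the invalid-pixel handling, and the function name `residual_image` is hypothetical.

```python
def residual_image(current, transformed):
    """Per-pixel |current - transformed| / current for two range images.

    Both inputs are lists of rows of range values with the same shape.
    Assumption: a range value <= 0 marks an invalid pixel, and the residual
    is set to 0.0 wherever either image is invalid.
    """
    residual = []
    for cur_row, prev_row in zip(current, transformed):
        row = []
        for cur, prev in zip(cur_row, prev_row):
            if cur > 0.0 and prev > 0.0:
                row.append(abs(cur - prev) / cur)
            else:
                row.append(0.0)
        residual.append(row)
    return residual


# Hypothetical 2x2 range images (in meters), for illustration only.
current_scan = [[10.0, 5.0], [0.0, 8.0]]
past_scan_in_current_view = [[10.0, 4.0], [3.0, -1.0]]
res = residual_image(current_scan, past_scan_in_current_view)
```

Static pixels yield residuals near zero, while pixels whose range changed between scans yield larger values, which is the motion cue the fusion module consumes.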
When the radar observation angle is sparse and the target is stationary, it is difficult to image by traditional methods. Therefore, we propose a dual-angle radar image reconstruction method to achieve dual-angle rada...
A novel approach for establishing clear regions in multi-pulse repetition frequency (PRF) pulse Doppler radar is proposed. The clear regions are constructed using the sigmoid function, resulting in a continuously diff...