ISBN: 9781665464956 (print)
Image segmentation is critical to object-oriented image processing. Many conventional segmentation algorithms are based on superpixels, which group pixels with similar colors and locations in advance and thereby benefit segmentation. Recently, several segmentation algorithms based on deep learning have been developed. However, because superpixels have irregular shapes and sizes, it is hard to apply them directly in a learning-based segmentation algorithm. In this paper, we propose a novel segmentation method that integrates deep neural networks (DNNs), superpixels, adaptive loss functions, and multi-layer feature extraction. First, unlike other learning-based algorithms, which take an image or its bounding boxes as input, we use the mean and histogram differences of the features of two superpixels as the input of the DNN to determine whether they should be merged. Moreover, to account for both large-scale and small-scale features, a hierarchical architecture is adopted. Different layers use DNN models with different loss functions: a larger penalty for over-merging is applied in the first layer, and a larger penalty for over-segmentation is applied in the following layer. In addition, in accordance with human perception, the features used are color, area, the gradient at the boundary, and the texton, which is highly related to texture. Experiments show that the proposed method outperforms other state-of-the-art image segmentation methods and produces highly accurate segmentation results.
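The pairwise input and the asymmetric penalty described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `pair_features` and `weighted_bce`, the bin count, the weight values, and the use of raw color channels as the only feature are all assumptions made for the example.

```python
import numpy as np

def pair_features(image, labels, a, b, bins=8):
    """Mean and histogram differences between superpixels a and b.

    image:  (H, W, C) float array with values in [0, 1] (assumed feature map)
    labels: (H, W) integer superpixel label map
    """
    pix_a = image[labels == a]  # (n_a, C)
    pix_b = image[labels == b]  # (n_b, C)
    mean_diff = np.abs(pix_a.mean(axis=0) - pix_b.mean(axis=0))
    hist_diffs = []
    for c in range(image.shape[-1]):
        ha, _ = np.histogram(pix_a[:, c], bins=bins, range=(0.0, 1.0))
        hb, _ = np.histogram(pix_b[:, c], bins=bins, range=(0.0, 1.0))
        # Normalize so superpixels of different sizes are comparable.
        ha = ha / max(ha.sum(), 1)
        hb = hb / max(hb.sum(), 1)
        hist_diffs.append(np.abs(ha - hb))
    return np.concatenate([mean_diff, *hist_diffs])

def weighted_bce(p_merge, should_merge, w_over_merge=2.0, w_over_seg=1.0):
    """Binary cross-entropy with separate (hypothetical) penalty weights.

    Predicting "merge" when the label is 0 is over-merging; predicting
    "keep apart" when the label is 1 is over-segmentation.
    """
    p = np.clip(p_merge, 1e-7, 1 - 1e-7)
    y = np.asarray(should_merge, dtype=float)
    loss = -(w_over_seg * y * np.log(p) + w_over_merge * (1 - y) * np.log(1 - p))
    return loss.mean()
```

Under this reading, a first-layer model would be trained with `w_over_merge > w_over_seg` and a later layer with the weights reversed; the actual loss functions used in the paper are not given in the abstract.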
ISBN: 9798350360868 (digital)
ISBN: 9798350360875 (print)
This paper introduces a novel approach to video prediction and object recognition based on treating image signals as dynamic system operators. We develop algorithms that extract invariant features from pixel patches to construct numerical matrices for image frame transitions. Our method diverges from conventional 2D signal processing by viewing images as operators rather than initial conditions. This perspective aligns more closely with biological visual systems and offers potential for efficient electronic implementations. We demonstrate the efficacy of our approach through experiments in mental rotation, affine transformations, and rotated MNIST digit recognition. Our results show that deformation invariance can be obtained without prior knowledge of the transformation. This work contributes to the broader goal of developing biologically plausible computer vision systems, with implications for video prediction, object recognition, and abstract concept formation for AI.
ISBN: 9781510654174; 9781510654167 (print)
The SpectRx system has been developed to measure sphero-cylindrical spectacle lens power as an alternative to clinical lensmeters. This work was inspired by the ongoing global pandemic, which limited physical access to eye care facilities for regular eye exams. The SpectRx system aims to bypass this limitation by providing at-home prescription measurements. The power and orientation of the spectacle lenses are obtained using readily available objects: a cell phone camera, a displayed or printed target, and a fixed-dimension magnetic stripe card. The magnification of the lenses can be calculated by examining an image of the target captured through the lens at a fixed distance. The magnification may vary spatially because of the cylinder component of the lens. The pictures captured with the cell phone camera are processed automatically with standard image processing algorithms. The processed images, in turn, are used to calculate a clinical prescription, i.e., Sph/Cyl x Axis. The SpectRx system may expand access to quality eye care not only in the current pandemic situation but also in locations where eye care may not be easily accessible, such as some rural or remote areas. The image processing and clinical prescription calculation are discussed here.
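To see why a cylinder component makes the measured magnification depend on direction, it helps to recall the standard relation for the power of a sphero-cylindrical lens along a given meridian, P(θ) = Sph + Cyl · sin²(θ − Axis). The sketch below only evaluates that textbook relation; the function name `meridian_power` is hypothetical, and how SpectRx converts magnification into power is not described in the abstract and is not modeled here.

```python
import math

def meridian_power(sph, cyl, axis_deg, meridian_deg):
    """Power in diopters of a Sph/Cyl x Axis lens along one meridian.

    The cylinder contributes nothing along its axis and its full power
    along the meridian perpendicular to the axis.
    """
    delta = math.radians(meridian_deg - axis_deg)
    return sph + cyl * math.sin(delta) ** 2
```

For a hypothetical prescription of -2.00/-1.00 x 90, the power is -2.00 D along the 90 degree meridian and -3.00 D along the 0 degree meridian, so a target viewed through the lens is scaled differently in the two directions.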
The paper describes an approach to inertial measurement unit estimation that uses an image processing algorithm to determine the position of an object in space. The results of measurements of the angular velocities of a ...
ISBN: 9781510652095 (digital)
ISBN: 9781510652095; 9781510652088 (print)
By combining optical systems and image processing, wavefront coding can greatly expand the depth of focus and depth of field of optical systems. It has been widely used in iris detection, high-power microscope objectives, athermalization of infrared optical systems, and so on. At present, the image restoration algorithms commonly used in wavefront coding are based on deconvolution, Wiener filtering, and similar techniques. Although these algorithms can restore images well, they also introduce boundary ringing and artifacts. When the image is corrupted by strong noise, restoration quality also degrades seriously. To solve these problems, a wavefront-coded image restoration algorithm based on compressed sensing is proposed in this paper. The strong data reconstruction ability of the compressed sensing restoration algorithm is used to restore the encoded image obtained by the wavefront coding system. This method can effectively suppress noise and reconstruct the image without artifacts or boundary ringing. A comparison of simulation results verifies the effectiveness of the proposed method.
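The abstract does not say which compressed sensing solver is used. As a generic illustration of sparse reconstruction, the sketch below implements the iterative shrinkage-thresholding algorithm (ISTA) for the problem min over x of 0.5·||Ax − y||² + λ·||x||₁. Here `A` stands in for the combined blur and sparsifying operator, which is an assumption of this example; `lam` and `n_iter` are arbitrary illustrative values.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step for the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, x, y, lam):
    """Least-squares data term plus L1 sparsity penalty."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam=0.05, n_iter=200):
    """Generic ISTA solver; not the restoration algorithm from the paper."""
    # A step of 1/L, with L the squared spectral norm of A, keeps the
    # objective from increasing between iterations.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)  # gradient of the data term
        x = soft_threshold(x - step * grad, lam * step)
    return x
```

The soft-thresholding step is what suppresses small, noise-like coefficients, which is one plausible reason a sparsity-based method would show less ringing than a plain inverse filter.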
ISBN: 9789811910579; 9789811910562 (print)
3D room layout reconstruction from a single RGB panoramic image has been an emerging research topic in recent years. To achieve better prediction accuracy, in this paper, we propose a new approach to predict 3D room layout from a single panoramic image. Our reconstruction flow follows the same common framework as LayoutNet [9] and HorizonNet [4]; however, we design a new deep learning architecture with a recurrent neural network (RNN) encoder-decoder as an extension for keypoint refinement and use a gradient ascent optimization algorithm to minimize the similar loss. Experiments on both cuboid-shaped and general Manhattan layouts show that the proposed work outperforms recent algorithms in prediction accuracy.
ISBN: 9798331518752 (digital)
ISBN: 9798331518769 (print)
Channel estimation overhead reduction is one of the main problems for 6G XL-MIMO systems. As the number of antennas and subcarriers grows, so does the overhead of traditional channel estimation methods. High overhead limits user mobility and increases latency. Most popular tensor-based channel estimation algorithms combine reference signal transmission and channel tensor estimation into one problem. Such an approach typically imposes limitations on the reference signals, making backward compatibility with the existing 5G standard challenging. We propose to separate tensor completion and channel tensor element estimation into different tasks. We show that tensor completion algorithms from image processing and fMRI scanning can be reused for sub-Nyquist completion of OFDM MIMO tensors to reduce the overhead. Furthermore, we extend one of these algorithms from CPD to Tensor Train (TT) and demonstrate that the TT-based algorithm reduces channel estimation error by up to a factor of 2. With the proposed approach, other algorithms can be further developed based on tensor completion theory.
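For readers unfamiliar with the Tensor Train (TT) format mentioned above, the sketch below shows the standard TT-SVD decomposition of a dense tensor into a chain of three-way cores, plus the reverse contraction. It illustrates only the representation; it is not the completion algorithm from the paper, which works from partially observed entries. The function names and the `max_rank` cap are assumptions of this example.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a dense tensor into TT cores of shape (r_prev, n_k, r_next)."""
    shape = tensor.shape
    cores = []
    rank_prev = 1
    mat = np.asarray(tensor, dtype=float)
    for k in range(len(shape) - 1):
        # Unfold so that the current mode (times the incoming rank) indexes rows.
        mat = mat.reshape(rank_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(rank_prev, shape[k], rank))
        # Carry the remaining factor forward to the next mode.
        mat = s[:rank, None] * vt[:rank]
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into a dense tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])
```

A channel tensor indexed by, say, receive antenna, transmit antenna, and subcarrier (a hypothetical layout) with small TT ranks needs far fewer parameters than the dense array, which is the kind of structure a completion algorithm can exploit.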
ISBN: 9798350349115 (digital)
ISBN: 9798350349122 (print)
Few-shot image classification is a critical issue in the field of computer vision, facing challenges related to data scarcity and model generalization. Transformer models, which are built on self-attention mechanisms, have made significant strides in recent years in the domain of few-shot classification. This paper begins with an introduction to the background and challenges of few-shot classification, along with a description of the principles and structure of the Transformer model. Subsequently, the paper categorizes Transformer-based few-shot image classification methods into meta-learning-based, metric-learning-based, fine-tuning-based, and feature-enhancement-based approaches; the theoretical foundations of each category are expounded, and a comparative analysis of representative algorithms is provided. Finally, the paper discusses prospective research directions in this field.
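As a rough illustration of the metric-learning-based category, the sketch below classifies query embeddings by their nearest class prototype, where each prototype is the mean of that class's few labeled support embeddings. The embeddings are assumed to come from some pretrained backbone (for instance a Transformer encoder), which is not modeled here; the function name and the Euclidean metric are choices made for this example rather than anything specified in the survey.

```python
import numpy as np

def classify_by_prototype(support, support_labels, queries):
    """Assign each query embedding to the class with the nearest prototype.

    support:        (n_support, d) embeddings of the labeled examples
    support_labels: (n_support,) class labels
    queries:        (n_query, d) embeddings to classify
    """
    classes = np.unique(support_labels)
    # One prototype per class: the mean of its support embeddings.
    prototypes = np.stack(
        [support[support_labels == c].mean(axis=0) for c in classes]
    )
    # Pairwise Euclidean distances, shape (n_query, n_classes).
    dists = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```

With only a handful of support examples per class, no classifier weights are trained at test time; all of the learning effort goes into producing embeddings in which this distance is meaningful.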
Smart agriculture is an emerging innovative sector that harnesses technological advancements like sensor monitoring, image processing, soil quality evaluation, and automated water systems, all of which contribute towa...
This paper presents and discusses the development of a procedure required for the future application of machine learning (ML) in the review and investigation of infrared (IR) thermography images for building envelopes...