Compressive sensing (CS) is a technique that enables the recovery of sparse signals using fewer measurements than traditional sampling methods. To address the computational challenges of CS reconstruction, our objective is to develop an interpretable and concise neural network model for reconstructing natural images using CS. We achieve this by mapping one iteration of the iterative shrinkage thresholding algorithm (ISTA) to a deep network block. To enhance learning ability and incorporate structural diversity, we integrate aggregated residual transformations (ResNeXt) and squeeze-and-excitation mechanisms into the ISTA block. This block serves as a deep equilibrium layer connected to a semi-tensor product network, which provides convenient sampling and an initial reconstruction. The resulting model, called MsDC-DEQ-Net, exhibits competitive performance compared to state-of-the-art network-based methods. It significantly reduces storage requirements compared to deep unrolling methods, using only one iteration block instead of multiple iterations. Unlike deep unrolling models, MsDC-DEQ-Net can be applied iteratively, gradually improving reconstruction accuracy while trading off against computation. Additionally, the model benefits from multiscale dilated convolutions, further enhancing performance.
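For orientation, the classical ISTA iteration that the abstract maps to a network block can be sketched as below. This is a minimal, generic sketch and not the MsDC-DEQ-Net block: the helper names (`soft_threshold`, `ista_step`, `ista`), the plain-list matrix representation, and the fixed `step` and `lam` values are assumptions made for illustration, and nothing here is learned. Repeating one and the same step until the estimate stops changing is the fixed-point view that a deep equilibrium layer builds on.

```python
def soft_threshold(v, tau):
    """Shrink each entry toward zero by tau (proximal operator of tau * L1 norm)."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi >= 0 else -1.0) for vi in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def ista_step(x, A, y, step, lam):
    """One ISTA iteration: gradient step on 0.5*||Ax - y||^2, then shrinkage."""
    residual = [ri - yi for ri, yi in zip(matvec(A, x), y)]
    grad = matvec(transpose(A), residual)
    z = [xi - step * gi for xi, gi in zip(x, grad)]
    return soft_threshold(z, lam * step)


def ista(A, y, step, lam, iters=200):
    """Apply the same step repeatedly, starting from the zero vector.

    Assumes step <= 1 / (largest eigenvalue of A^T A) so the iteration converges.
    """
    x = [0.0] * len(A[0])
    for _ in range(iters):
        x = ista_step(x, A, y, step, lam)
    return x


# Toy usage with an identity measurement matrix (illustrative values only).
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
x_hat = ista(A, y, step=1.0, lam=0.5)
```

With the identity matrix the iteration reduces to soft-thresholding the measurements, so the small second entry is driven to zero while the large one is shrunk by `lam`.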
Compared to natural images, hyperspectral images (HSIs) consist of a large number of bands, with each band capturing spectral information at a certain wavelength, some of them beyond the visible spectrum. These characteristics make HSIs highly effective for remote sensing applications. However, existing hyperspectral imaging devices introduce severe degradation in HSIs. Hence, hyperspectral image denoising has recently attracted considerable attention from the community. While recent deep HSI denoising methods have provided effective solutions, their performance under real-life complex noise remains suboptimal, as they lack adaptability to new data. To overcome these limitations, we introduce a self-modulating convolutional neural network, referred to as SM-CNN, that utilizes correlated spectral and spatial information. At the core of the model lies a novel block, which we call the spectral self-modulating residual block (SSMRB), that allows the network to transform features adaptively based on the adjacent spectral data, enhancing the network's ability to handle complex noise. In particular, the introduction of SSMRB turns our denoising network into a dynamic network that adapts its predicted features to the spatio-spectral characteristics of every input HSI. Experimental analysis on both synthetic and real data shows that the proposed SM-CNN outperforms other state-of-the-art HSI denoising methods both quantitatively and qualitatively on public benchmark datasets. Our code will be available at https://***/orhan-t/SM-CNN.
With the rapid development of entity recognition technology, animal recognition has gradually become essential in modern society, supporting labour-intensive agriculture and animal husbandry tasks. Serious challenges such as maintaining biodiversity can also benefit from animal identification technology. However, certain invasive recognition systems have resulted in permanent harm to animals, while noninvasive identification methods also exhibit certain drawbacks. This paper conducts a systematic literature review (SLR), presenting a comprehensive overview of various animal recognition technologies and their applications. Specifically, it examines methodologies such as deep learning, image processing and acoustic analysis used for different animal characteristics and identification purposes. The contribution of machine learning to animal feature extraction is highlighted, emphasising its significance for animal taxonomy and wild species monitoring. Additionally, this review addresses the challenges and limitations of current technologies, including data scarcity, model accuracy and computational requirements, and suggests opportunities for future research to overcome these obstacles.
This paper presents a novel artificial intelligence (AI)-based phase shift system for a beamforming system, implemented on field programmable gate array (FPGA)-based hardware by integrating a conventional convolutional neural network (CNN) algorithm. The position of the target can be determined through a phase shifter in a beamforming system using artificial intelligence. In a system that emits a beam from a radio frequency (RF) transmitter and receives a beam at an RF receiver, artificial intelligence can control the phase. It controls the phase of the transmitter for beam scanning and the phase of the receiver to optimize the signal-to-noise ratio (SNR). The position of the target was detected by learning from the signal input data of the receiver. Targets were detected through two beam-scanning processes in 3D space. The first is a coarse process that detects the approximate position of the target in the entire space, and the second is a fine process that examines the area around that approximate position in detail. The phases of the individual antennas must be controlled for optimal beamforming with the 5x5 antenna array; in the first, coarse scan, a large phase step is used so that the target is located quickly. The second scan covers a narrow range with a small phase step so that the target is located both quickly and accurately. This study shows that, with an FPGA, AI beamforming can be implemented through two scanning methods without image sensors. Based on the receiver's 5x5 antenna array, the CNN takes a 35x35 input feature and classifies the target class with high accuracy.
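The coarse-then-fine scanning idea described above can be illustrated in one dimension. The sketch below is only an illustration under stated assumptions: the function names (`scan`, `coarse_to_fine`, `toy_response`), the single steering angle, the angular range, and the step sizes are all hypothetical, and the received-signal model is a toy stand-in rather than the CNN-driven FPGA system from the abstract.

```python
def scan(measure, lo, hi, step):
    """Return the angle in [lo, hi], sampled every `step`, with the largest response."""
    count = int((hi - lo) / step + 1e-9) + 1
    angles = [lo + i * step for i in range(count)]
    return max(angles, key=measure)


def coarse_to_fine(measure, lo, hi, coarse_step, fine_step):
    """Locate the peak with a wide coarse scan, then refine around the coarse hit."""
    coarse = scan(measure, lo, hi, coarse_step)
    # Restrict the fine scan to one coarse step on either side of the coarse hit.
    fine_lo = max(lo, coarse - coarse_step)
    fine_hi = min(hi, coarse + coarse_step)
    return scan(measure, fine_lo, fine_hi, fine_step)


def toy_response(angle):
    """Hypothetical received-signal strength, peaking at 17 degrees."""
    return -(angle - 17.0) ** 2


coarse_hit = scan(toy_response, -60.0, 60.0, 10.0)
best = coarse_to_fine(toy_response, -60.0, 60.0, 10.0, 1.0)
```

In this toy setup the coarse pass takes 13 measurements and the fine pass 21, compared with 121 for a single pass at the fine step over the whole range, which is the kind of saving the two-stage scan is meant to provide.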
ISBN: 9798350343557 (print)
In this study, various machine learning and image analysis approaches such as Template Matching, HOG, SVM, Faster RCNN and YOLO are examined and compared for the symbol recognition problem in color maps. Several difficulties were identified, relating to the forms of the symbols, the complexity of the maps and the placement of the symbols on the map. Based on the experiments, observations on how each method succeeds or fails against these difficulties are presented. Methods involving artificial neural networks were observed to be more successful at symbol recognition on color maps. The highest result, 91%, was obtained with Faster RCNN.
ISBN: 9798350344868; 9798350344851 (print)
Gaze estimation is a fundamental aspect of many visual tasks. However, the high cost of acquiring gaze datasets with 3D annotations hinders the optimization and application of gaze estimation models. In this work, we propose a novel head-eye redirection parametric model based on Neural Radiance Fields. This model allows for dense gaze data generation with view consistency and accurate gaze direction. Furthermore, our head-eye redirection parametric model can decouple the face and eyes for separate neural rendering, which enables us to separately control the attributes of the face, identity, illumination, and eye gaze direction. As a result, diverse 3D-aware gaze datasets can be obtained by manipulating the latent codes belonging to different face attributes in an unsupervised manner. Our method achieves state-of-the-art performance in image quality and gaze annotation accuracy compared with existing gaze data synthesis methods. Extensive experiments on several benchmarks demonstrate that our method can effectively improve domain generalization and domain adaptation in the gaze estimation task.
ISBN: 9798350349405; 9798350349399 (print)
Deep learning in image classification has achieved remarkable success but at the cost of high resource demands. Model compression through automatic joint pruning-quantization addresses this issue, yet most existing techniques overlook a critical aspect: layer correlations. These correlations are essential as they expose redundant computations across layers, and leveraging them facilitates efficient design space exploration. This study employs Graph Neural Networks (GNNs) to learn these inter-layer relationships, thereby optimizing the pruning-quantization strategy for the targeted model. This approach yields a 99.36% reduction in complexity for ResNet20 on CIFAR-10, with only a 0.11% drop in accuracy. Furthermore, integrating the GNN speeds up convergence, reducing the number of iterations by a factor of 2.46 on average compared to methods without a GNN.
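To make the two compression operations concrete, the sketch below applies generic magnitude pruning followed by symmetric uniform quantization to a single weight vector. It is a textbook-style illustration and not the GNN-guided search from the abstract: the function names, the fixed `sparsity` and `bits` arguments, and the toy weights are assumptions, whereas the paper's method is described as choosing such settings automatically per layer.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights, smallest magnitudes first."""
    count = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(by_magnitude[:count])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def quantize_uniform(weights, bits):
    """Round weights to a symmetric uniform grid with 2**(bits-1) - 1 positive levels."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / levels
    if scale == 0.0:
        return list(weights)  # all-zero input: nothing to quantize
    return [round(w / scale) * scale for w in weights]


# Toy layer weights (illustrative values only).
layer = [0.1, -0.9, 0.05, 0.5]
pruned = prune_by_magnitude(layer, sparsity=0.5)
compressed = quantize_uniform(pruned, bits=2)
```

A joint strategy has to pick a sparsity and a bit width for every layer at once, which is why the size of that design space matters in the abstract's argument.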
ISBN: 9781728198354 (print)
We introduce NeRD, a new demosaicking method for generating full-color images from Bayer patterns. Our approach leverages advancements in neural fields to perform demosaicking by representing an image as a coordinate-based neural network with sine activation functions. The inputs to the network are spatial coordinates and a low-resolution Bayer pattern, while the outputs are the corresponding RGB values. An encoder network, which is a blend of ResNet and U-Net, enhances the implicit neural representation of the image to improve its quality and ensure spatial consistency through prior learning. Our experimental results demonstrate that NeRD outperforms traditional and state-of-the-art CNN-based methods and significantly closes the gap to transformer-based methods.
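As background on the input side of demosaicking, the sketch below shows how a Bayer mosaic keeps only one color sample per pixel. It assumes an RGGB layout, which the abstract does not specify, and the function names `bayer_channel` and `mosaic` are hypothetical; the NeRD network that inverts this sampling is not reproduced here.

```python
def bayer_channel(row, col):
    """Channel index (0=R, 1=G, 2=B) sampled at a pixel, assuming an RGGB layout."""
    if row % 2 == 0:
        return 0 if col % 2 == 0 else 1
    return 1 if col % 2 == 0 else 2


def mosaic(rgb):
    """Keep one color sample per pixel from an image given as rows of (R, G, B) tuples."""
    return [
        [pixel[bayer_channel(r, c)] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb)
    ]


# Toy 2x2 image in which every pixel has the same color (illustrative values only).
image = [[(10, 20, 30), (10, 20, 30)],
         [(10, 20, 30), (10, 20, 30)]]
raw = mosaic(image)
```

Demosaicking is the inverse task: estimating the two missing color values at every pixel of `raw`.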
ISBN: 9798350300673 (print)
Inverse halftoning is an ill-posed problem in which a continuous-tone image is restored from a halftone image. Many conventional inverse halftoning methods have tried to solve this problem, yet the recovered images still suffer from unwanted artifacts and loss of fine detail. Recent deep neural network-based approaches have shown advantages in restoring high-quality images with rich textures and detailed information. However, it remains challenging for these deep learning methods to reconstruct a variety of different halftone patterns. For instance, a model trained on halftone patterns with a homogeneous distribution does not perform well on patterns with high structural information. To solve this problem, an inverse halftoning method based on a deep residual neural network (DRNN) and variance classification is proposed. The proposed method uses a progressive learning concept with two main stages. First, the DRNN extracts numerous intrinsic features of an image and substantially removes the halftone patterns. Subsequently, consecutive deep residual blocks are integrated into the network to restore the fine details with good accuracy. The proposed model therefore integrates several DRNNs, each trained on a different statistical range of halftone patches. Comprehensive experimental results demonstrate that the proposed deep learning-based technique significantly outperforms not only the conventional methods but also other deep learning approaches.
Most deep learning-based single image dehazing methods use convolutional neural networks (CNNs) to extract features; however, CNNs can only capture local features. To address this limitation, we propose a basic module that combines a CNN and a graph convolutional network (GCN) to capture both local and non-local features. The basic module consists of a CNN with triple attention modules (CAM) and a dual GCN module (DGM). CAM, which combines channel attention, spatial attention and pixel attention, is designed to give more weight to important local features. DGM combines spatial coherence computing and channel correlation computing to extract non-local information. The architecture of the network is similar to U-Net, and the skip connections used in the symmetrical network pass image details from shallow layers to deep layers. Experimental results on several datasets indicate that the proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.