Traditional object detection models often lose the detailed outline information of the object. To address this problem, we propose Fourier Series Object Detection (FSD), which encodes the closed curve of an object's outline as two one-dimensional periodic Fourier series. The Fourier Series Model (FSM) is constructed to regress the Fourier series for each object in the image, so that the detailed outline of each object can be retrieved during inference. We introduce Rolling Optimization Matching for the Fourier loss to ensure that learning is not affected by which labeled contour point the sequence starts from, which speeds up training. The FSM demonstrates improved feature extraction and descriptive capabilities for non-rectangular or elongated object regions. The model achieves AP50 = 73.3% on the DOTA 1.5 dataset, surpassing the state-of-the-art (SOTA) result of 66.86% by 6.44 percentage points. On the UCAS dataset, the model achieves AP50 = 97.25%, also surpassing the SOTA methods. Furthermore, we introduce the object's Fourier power spectrum to describe outline features and the Fourier vector to indicate its direction. This enhances the scene semantic representation of the object detection model and opens a new pathway for the evolution of object detection methodologies.
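As a rough illustration of the two ideas in this abstract, the sketch below encodes a closed contour as two truncated Fourier series (one for x, one for y) and computes a loss that is invariant to the starting point of the labeled contour. It is a minimal sketch under stated assumptions, not the authors' FSM: the function names, the plain DFT fit, and the "minimum over all cyclic shifts" reading of Rolling Optimization Matching are all hypothetical.

```python
import cmath
import math


def fourier_coeffs(values, order):
    """Complex Fourier coefficients c_0..c_order of one periodic 1-D sequence.

    Assumes order < len(values) / 2 so the real-valued reconstruction below is valid.
    """
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values)) / n
        for k in range(order + 1)
    ]


def evaluate_series(coeffs, t):
    """Evaluate the real-valued truncated series at parameter t in [0, 1)."""
    total = coeffs[0].real
    for k in range(1, len(coeffs)):
        # For a real signal, c_{-k} is the conjugate of c_k, hence the factor 2 * Re(...).
        total += 2.0 * (coeffs[k] * cmath.exp(2j * math.pi * k * t)).real
    return total


def encode_contour(points, order):
    """Encode a closed contour as two 1-D Fourier series: one for x(t), one for y(t)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return fourier_coeffs(xs, order), fourier_coeffs(ys, order)


def decode_contour(x_coeffs, y_coeffs, num_samples):
    """Sample the contour described by the two series at evenly spaced parameters."""
    return [
        (evaluate_series(x_coeffs, i / num_samples), evaluate_series(y_coeffs, i / num_samples))
        for i in range(num_samples)
    ]


def rolling_min_loss(pred, target):
    """Smallest mean squared point distance over all cyclic shifts of the target.

    Hypothetical stand-in for a start-point-invariant matching loss.
    """
    n = len(target)
    best = float("inf")
    for shift in range(n):
        loss = sum(
            (px - target[(i + shift) % n][0]) ** 2 + (py - target[(i + shift) % n][1]) ** 2
            for i, (px, py) in enumerate(pred)
        ) / n
        best = min(best, loss)
    return best


if __name__ == "__main__":
    # A circle of radius 2 centred at (3, 1), labeled with 16 contour points.
    circle = [
        (3 + 2 * math.cos(2 * math.pi * i / 16), 1 + 2 * math.sin(2 * math.pi * i / 16))
        for i in range(16)
    ]
    x_c, y_c = encode_contour(circle, order=1)
    decoded = decode_contour(x_c, y_c, 16)
    rolled = circle[5:] + circle[:5]  # same contour, different starting point
    print(rolling_min_loss(decoded, rolled))
```

Because the loss takes the minimum over shifts, relabeling the same contour from a different starting point leaves it unchanged, which is the property the abstract attributes to Rolling Optimization Matching.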
Measuring the position and speed of rotating machinery is essential in the manufacturing industry in order to control a process effectively. Traditional measurement methods such as encoders, resolvers, and tachometers are limited to a single rotating element, require precise (mechanical) installation, and measure on the motor shaft rather than the load. This article proposes a novel methodology to overcome the shortcomings of conventional sensors with a triangular pattern vision encoder. The approach uses single pixel lines in combination with a triangular pattern applied to a roller to capture the position and speed of one or more axes simultaneously. First, a preprocessing step detects the position of a roller in the frame. Second, the width of the triangle is detected and mapped to accurately measure the position of the roller. The main contributions are improved robustness compared to state-of-the-art techniques, straightforward multitarget velocity analysis, and the integration of FDZP as a subpixel technique. The proposed design is validated on an industrial web processing machine (WPM) and achieves an accuracy of 480 µrad, an improvement of approximately 38% over the benchmark incremental optical encoder, which achieves an accuracy of 770 µrad.
Convolutional neural networks (CNNs), renowned for their efficiency in image analysis, have revolutionized pattern and structure recognition in visual data. Despite their success in image-based applications, CNNs face challenges when applied to tabular data due to the lack of inherent spatial relationships among features. This weakness can be overcome if the original tabular data is expanded to create an enhanced image that exhibits pseudo-spatial relationships. This paper introduces an original approach that transforms tabular data into a format suitable for CNN processing. The Novel Algorithm for Convolving Tabular Data (NCTD) applies mathematical transformations, including rotation, translation, and reflection, to simulate spatial relationships within the data, thereby constructing a data structure analogous to a 2D synthetic image. This transformation enables CNNs to process tabular data efficiently by leveraging automated feature extraction and enhanced pattern recognition. The NCTD algorithm was extensively evaluated and compared with traditional machine learning algorithms and existing methods on ten benchmark datasets. The results showed that NCTD surpassed the majority of competing algorithms on nine out of ten datasets, indicating its potential as a robust tool for extending CNN applicability beyond conventional image-based domains, particularly in complex classification and prediction.
Spectral superresolution (SSR) is a technique aimed at reconstructing hyperspectral images (HSIs) from images with low spectral resolution. Previous methods combining mathematical models with deep learning have shown promising performance for HSI reconstruction. However, these methods still have limitations when dealing with complex scenes, especially in terms of data consistency and realness. To address these issues, we propose a model-driven SSR network that integrates range-null space decomposition with deep learning. Specifically, we solve for the range space (R-Space) part and the null space (N-Space) part to reconstruct the desired HSI with consistency and realness. The R-Space part is derived, primarily iteratively, from the input multispectral image to ensure reliable data consistency, while the N-Space part reflects the true distribution of the target HSI, and its proper representation helps improve visual quality. To enhance N-Space exploration, we construct a frequency-oriented N-Space learning module that leverages Mamba and self-attention to separately extract spatial and spectral information in the frequency domain. In addition, we introduce a structure tensor term and a multikernel maximum mean discrepancy term in the loss function to constrain the R-Space and N-Space parts, respectively. Experimental results show that the proposed method achieves excellent performance.
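For readers unfamiliar with range-null space decomposition, the standard closed form can be written as follows. This is only a sketch of the general identity: the symbols (the spectral degradation operator A, the observation y = Ax, the pseudo-inverse A^†, and an arbitrary estimate x̄ such as a network output) are assumptions for illustration, and the abstract indicates that the paper derives the R-Space part iteratively rather than in this closed form.

```latex
% Assumed notation: y = A x is the observed multispectral image,
% A^{\dagger} is the Moore-Penrose pseudo-inverse of A,
% \bar{x} is any estimate of the HSI (for example, a network output).
\hat{x} = \underbrace{A^{\dagger} y}_{\text{R-Space part}}
        + \underbrace{(I - A^{\dagger} A)\,\bar{x}}_{\text{N-Space part}}

% Data consistency follows from the identity A A^{\dagger} A = A:
A\hat{x} = A A^{\dagger} y + (A - A A^{\dagger} A)\,\bar{x}
         = A A^{\dagger} A x
         = A x
         = y
```

Whatever x̄ is, the N-Space term is invisible to A, so it can only change how realistic the reconstruction looks, never whether it agrees with the observation.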
One of the most common tasks in histopathology is the visual comparison of the images of successive multiply stained tissue sections. Automatic image registration is crucial to perform this analysis. Although the...
In this paper, tensile tests of specimens with a pattern of holes made of fiber-glass plastic based on combined epoxy and phenol-formaldehyde resins are carried out in order to study the processes of damage accumulati...
In recent decades, there has been increasing interest from the research community in various scientific and engineering fields, including robotic control, signal processing, image processing, feature selection, classification, clustering, and other problems. Many optimization problems are inherently complex and cannot be solved by traditional optimization methods such as mathematical programming, because most conventional methods rely on evaluating first derivatives. Metaheuristic algorithms, by contrast, are highly adaptable and can find near-optimal solutions in reasonable time for different optimization problems thanks to parallel search and a balance between exploration and exploitation. This paper presents a complete review of the Golden Jackal Optimization (GJO) algorithm for various optimization problems. GJO is a metaheuristic algorithm introduced in 2022 and inspired by the behavior of jackals in nature. The study discusses the basic principles and mechanisms of GJO and its challenges, and classifies the GJO literature into four fields: hybrid, improved, variants of GJO (binary, multi-objective), and optimization problems. The analysis shows that these four fields account for 11%, 44%, 9%, and 36% of the studies, respectively. Studies have shown that the algorithm performs well on real-world challenges and is a flexible, powerful tool for solving scientific and engineering problems. This review aims to provide valuable insights into the potential of GJO for real-world and scientific optimization tasks.
ISBN: 9798331511241 (digital)
ISBN: 9798331511258 (print)
This paper describes and demonstrates a comprehensive analysis of structured criteria of formalized conditions for creating universal images that computer vision algorithms misclassify, known as adversarial examples, based on YOLO neural network models. Using the mathematical model of the proposed algorithm, the paper identifies and studies a pattern in the successful creation of a universal destructive image as a function of the generated dataset on which the neural networks were trained, with a fast gradient sign attack. This pattern is demonstrated for YOLO 8, YOLO 9, YOLO 10, and YOLO 11 classifier models trained on the standard COCO dataset.
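The gradient-sign attack named in this abstract (commonly called the fast gradient sign method, FGSM) perturbs each input value by a fixed step in the direction that increases the loss. The sketch below shows a single such step on a toy logistic classifier. It is an assumption-laden illustration: the paper attacks YOLO models, whereas the classifier, the function names, and the parameter values here are hypothetical and chosen only so the example runs without a deep learning framework.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient_wrt_input(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input x for p = sigmoid(w . x + b)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_step(x, y, w, b, epsilon):
    """One gradient-sign step: x_adv = clip(x + epsilon * sign(dL/dx), 0, 1)."""
    grad = loss_gradient_wrt_input(x, y, w, b)
    return [min(1.0, max(0.0, xi + epsilon * sign(g))) for xi, g in zip(x, grad)]


if __name__ == "__main__":
    # Toy "image" with two pixels in [0, 1] and a toy linear classifier (hypothetical values).
    x, y = [0.5, 0.5], 1
    w, b = [2.0, -1.0], 0.0
    x_adv = fgsm_step(x, y, w, b, epsilon=0.1)
    score = lambda v: sigmoid(sum(wi * vi for wi, vi in zip(w, v)) + b)
    print(x_adv, score(x), score(x_adv))
```

Each pixel moves by exactly epsilon, so the perturbation stays small in the max norm while the classifier's confidence in the true class drops.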
Image completion is a challenging task, particularly when ensuring that generated content seamlessly integrates with existing parts of an image. While recent diffusion models have shown promise, they often struggle wi...
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Deep convolutional neural networks have significantly advanced color image denoising. However, existing models often apply grayscale denoising techniques to color images without accounting for inter-channel correlations, resulting in color distortion, detail loss, and visual artifacts. Moreover, these models frequently neglect salient features within convolutional maps. To address these issues, we propose a quaternion CNN model that captures channel correlations and extracts salient features, thereby enhancing color image denoising performance. Specifically, we convert color images into quaternion matrices to better capture these correlations and design a quaternion convolutional network to learn relevant features. Furthermore, an aggregated feature block is introduced to enhance the extraction of salient features and further refine the denoising process. Experimental results on multiple datasets demonstrate that the proposed model achieves superior performance compared to recent state-of-the-art methods.
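To make the quaternion representation concrete, the sketch below maps an RGB pixel to a pure quaternion and applies the Hamilton product, the operation that lets a single quaternion weight mix all three color channels at once. This is a minimal sketch, not the authors' network: the function names and the mapping (zero real part, with R, G, and B on the i, j, and k axes) are a common convention assumed here for illustration.

```python
def pixel_to_quaternion(rgb):
    """Represent an RGB pixel as the pure quaternion 0 + R*i + G*j + B*k."""
    r, g, b = rgb
    return (0.0, float(r), float(g), float(b))


def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


if __name__ == "__main__":
    pixel = pixel_to_quaternion((0.2, 0.4, 0.6))
    weight = (0.5, 0.5, 0.5, 0.5)  # hypothetical quaternion weight
    # Every output component depends on all three color channels of the input.
    print(hamilton_product(weight, pixel))
```

A real-valued convolution would treat each channel with an independent scalar weight, whereas the Hamilton product couples the channels through shared weight components, which is how a quaternion layer can account for inter-channel correlations.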