ISBN (print): 9798350349405; 9798350349399
Image Coding for Machines (ICM) compresses images with a focus on machine vision tasks rather than human perception. For ICM, it is important to develop a universal codec that adapts to different machine tasks. In this paper, we propose novel parallel task-prompts that can be easily adapted to various machine vision tasks without requiring new networks or training from scratch. In addition, our parallel prompts are compatible with mainstream backbones such as transformers and convolutional neural networks, making them widely applicable across model architectures. To fine-tune our task-prompts, we use a machine task network as the teacher net, guiding our student ICM network to efficiently compress feature maps for downstream machine tasks. Through extensive experiments on object detection and segmentation, we demonstrate that our proposed method surpasses traditional image compression techniques and state-of-the-art learning-based feature compression techniques in terms of rate-accuracy performance.
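The abstract does not state the training objective, so the following is only a minimal sketch of how a teacher-guided rate-accuracy loss is commonly written: a bitrate term plus a weighted feature-matching term. The function names, the mean-squared-error distortion, and the weight `lam` are assumptions made for illustration, not the authors' formulation.

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_feat, teacher_feat, bits, num_pixels, lam=1.0):
    """Hypothetical rate-distortion style loss for a student ICM network.

    bits / num_pixels is the rate in bits per pixel; the distortion is the
    MSE between the student's decoded features and the teacher's features.
    lam (assumed hyperparameter) trades rate against feature fidelity.
    """
    rate_bpp = bits / num_pixels
    distortion = mse(student_feat, teacher_feat)
    return rate_bpp + lam * distortion


# Example: 512 bits spent on a 1024-pixel image, features differ in one entry.
loss = distillation_loss([1.0, 2.0], [1.0, 4.0], bits=512, num_pixels=1024, lam=0.5)
print(loss)
```

Raising `lam` in this sketch favours matching the teacher's features at the cost of a higher bitrate; lowering it favours compression.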
ISBN (print): 9798331518783; 9798331518776
This research develops a deep-learning-based method that enables a drone equipped with a stereo vision camera to accurately detect tree branches and measure their spatial positions. YOLO is employed for branch segmentation, while two depth estimation approaches, monocular and stereo, are investigated. Compared with Semi-Global Block Matching (SGBM), deep learning techniques produce more refined and accurate depth maps. In the absence of ground-truth data, a fine-tuning process with deep neural networks is applied to generate the depth map that most closely approximates the ground truth. This methodology achieves accurate branch detection and precise distance measurement, addressing key challenges in automating pruning operations. The results indicate substantial improvements in accuracy and demonstrate the potential of deep learning to advance automation in agricultural systems, although further optimization is required to improve processing speed.
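For a rectified stereo pair, depth follows from disparity through the standard relation Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels). The sketch below applies that relation to the pixels selected by a segmentation mask and takes their median as a distance estimate. The function names, the median aggregation, and the example camera values are illustrative assumptions; the abstract does not describe how the authors combine segmentation and depth.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # no valid match, or a point at infinity
    return focal_length_px * baseline_m / disparity_px


def median_branch_depth(disparities, mask, focal_length_px, baseline_m):
    """Median depth over pixels flagged by a (hypothetical) branch mask.

    disparities and mask are 2D lists of the same shape; pixels with
    non-positive disparity are ignored. Returns None if nothing is valid.
    """
    depths = sorted(
        disparity_to_depth(d, focal_length_px, baseline_m)
        for row_d, row_m in zip(disparities, mask)
        for d, m in zip(row_d, row_m)
        if m and d > 0
    )
    if not depths:
        return None
    mid = len(depths) // 2
    if len(depths) % 2:
        return depths[mid]
    return 0.5 * (depths[mid - 1] + depths[mid])


# Assumed camera: 700 px focal length, 0.12 m baseline.
disparity_map = [[50, 0], [25, 100]]
branch_mask = [[1, 1], [1, 0]]
print(median_branch_depth(disparity_map, branch_mask, 700, 0.12))
```

The median is used here because it is less sensitive than the mean to a few mismatched pixels along branch edges.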
With the continuous progress of science and technology, image processing techniques have been used increasingly in recent years. Image processing plays an indispensable role in computer vision, artificial intelligence, pattern recognition, and related fields. Improvements in basic algorithms and the development of new algorithms have resulted in considerable innovation and progress. This paper is devoted to finding new game applications in a branch of image processing. It introduces an analysis model proposed by the author and discusses the relationship between roughness in the frequency domain and visual image interpretation. Using the concept of roughness, we separated the image features into meaningful information and residual information and analysed the image in the frequency domain. The results were compared with those of traditional image processing methods. The starting point is the visual identification of a feature based on human interpretation. The image information was separated into meaningful features and a residual component to reduce the redundancy of the model, which allowed for a sparse representation of the feature information in the image. By analysing the meaningful features and residual components of an image separately, we established a relationship between the results and the original images. Parameters such as texture, morphology, and the degree of blurring were considered, and we developed a parameter called "frequency roughness". The algorithm combines the concepts of frequency and roughness, with the roughness determined in the frequency domain. The frequency roughness algorithm successfully separated the rough features in the frequency domain and calculated the residual value in an image. This model provided more accurate image processing results than comparable methods. This paper includes an analysis and game applications of the proposed model for de-blurring, image enhancement, recognition, and other image proces
ISBN (print): 9798350350494; 9798350350500
In machine and computer vision, cameras play a major role in image acquisition. Surveillance scenarios typically rely on Closed-Circuit Television (CCTV) cameras. This study evaluates industrial cameras in a surveillance application and contrasts their performance with that of CCTV cameras. We compare CCTV and industrial cameras for vehicle attribute recognition, concentrating on the recognition of vehicle color and model using deep learning techniques. To train and evaluate the models, we created datasets from images captured by both a CCTV camera and an industrial camera. Our findings indicate that the industrial camera outperforms the CCTV camera. However, advanced processing algorithms have the potential to narrow the performance gap between the two. Our research is one of the first comparative analyses of these camera types and offers guidance for selecting the most suitable camera for a specific application.
Capturing and presenting exciting moments is crucial for the audience’s experience in basketball game broadcast cameras. However, traditional radar image processing techniques are limited by various factors and canno...
Vision-language models, such as the Contrastive Language-Image Pre-Training (CLIP) model, have achieved significant success in image classification tasks. CLIP demonstrates high expressive power in few-shot learning scenarios due to its pairing of text and image encoders. However, CLIP still over-fits when trained with a limited number of samples. To mitigate this, image augmentation techniques have been proposed for few-shot learning tasks to prevent over-fitting by enriching the dataset. Existing image augmentation methods, primarily designed for single-modal image models, focus solely on transformations within the image itself. For CLIP, however, merely increasing visual variety without considering textual content can reduce generalization ability and may even mislead the model. To address this issue, we introduce a novel image augmentation approach, Integrated Image-Text Augmentation (ITA), for the CLIP model in few-shot learning tasks. This method generates new and diverse augmented images to increase the diversity of the training data and reduce over-fitting. Additionally, ITA establishes an alignment between the augmented images and their textual descriptions. Through this alignment, the model not only learns to recognize visual elements in the images but also learns the semantic connections between these elements and the text descriptions. This dual-modal approach enhances the model's flexibility and accuracy in few-shot learning tasks. Extensive experiments in few-shot image classification scenarios demonstrate that ITA yields significant improvements over various image augmentation techniques.
In the image processing domain, the growth of digital data has intensified the need for efficient and robust optimization techniques. This research study aims to develop and evaluate advanced optimization methods tail...
The rapid evolution of wireless communication technologies has underscored the critical role of antennas in ensuring seamless *** defects, ranging from manufacturing imperfections to environmental wear, pose significant challenges to the reliability and performance of communication *** review paper navigates the landscape of antenna defect detection, emphasizing the need for a nuanced understanding of various defect types and the associated challenges in visual *** review paper serves as a valuable resource for researchers, engineers, and practitioners engaged in the design and maintenance of communication *** insights presented here pave the way for enhanced reliability in antenna systems through targeted defect detection *** this study, a comprehensive literature analysis on computer vision algorithms that are employed in end-of-line visual inspection of antenna parts is *** PRISMA principles will be followed throughout the review, and its goals are to provide a summary of recent research, identify relevant computer vision techniques, and evaluate how effective these techniques are in discovering defects during *** contains articles from scholarly journals as well as papers presented at conferences up until June *** research utilized search phrases that were relevant, and papers were chosen based on whether or not they met certain inclusion and exclusion *** this study, several different computer vision approaches, such as feature extraction and defect classification, are broken down and ***, their applicability and performance are *** review highlights the significance of utilizing a wide variety of datasets and measurement *** findings of this study add to the existing body of knowledge and point researchers in the direction of promising new areas of investigation, such as real-time inspection systems and multispectral *** review, on its whole, of
ISBN (print): 9798331541859; 9798331541842
Semantic image segmentation is a fundamental task in computer vision, frequently addressed using deep learning techniques. Nevertheless, these methods often struggle to fully capture the structural details and semantic relationships present within an image. We propose a new approach based on a multi-view graph neural network that exploits several kinds of structural information, each related to a particular view. We perform experiments on both a synthetic dataset and a real-world dataset and demonstrate that our model is superior to conventional graph neural networks and resilient to small training datasets. Moreover, our method outperforms classic methods when only a small amount of training data is available. Additionally, the integration of views appears to improve convergence in training. Our findings highlight the potential of multi-view representations for image segmentation, paving the way for more advanced and accurate computer vision systems.
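The abstract gives no architectural detail, so the following is only a toy sketch of the general multi-view idea: the same node features are aggregated over several graphs (one adjacency structure per view) and the per-view results are then fused. The mean aggregation, the mean fusion, and the absence of learned weights are simplifying assumptions, and the function names are hypothetical.

```python
def aggregate_view(features, adjacency):
    """Mean of each node's own feature and its neighbours' features in one view.

    features: list of feature vectors, one per node.
    adjacency: dict mapping a node index to a list of neighbour indices.
    """
    out = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in adjacency.get(node, [])]
        out.append([sum(vals) / len(group) for vals in zip(*group)])
    return out


def multiview_layer(features, views):
    """Aggregate the features in every view, then average across views."""
    per_view = [aggregate_view(features, adj) for adj in views]
    return [
        [sum(vals) / len(per_view) for vals in zip(*node_versions)]
        for node_versions in zip(*per_view)
    ]


# Three nodes (e.g. image regions) with 1-D features and two assumed views:
# view 1 links nodes 0 and 1, view 2 links nodes 0 and 2.
features = [[0.0], [2.0], [4.0]]
views = [{0: [1], 1: [0]}, {0: [2], 2: [0]}]
print(multiview_layer(features, views))
```

A trained model would replace the plain averages with learned transformations and a learned fusion step; the sketch only shows how each view contributes a different neighbourhood for the same node.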
Plankton are an important component of life on Earth. Since the 19th century, scientists have attempted to quantify species distributions using many techniques, such as direct counting, sizing, and classification with microscopes. Since then, extraordinary work has been performed regarding the development of plankton imaging systems, producing a massive backlog of images that await classification. Automatic image processing and classification approaches are opening new avenues for avoiding time-consuming manual procedures. While some algorithms have been adapted from many other applications for use with plankton, other exciting techniques have been developed exclusively for this issue. Achieving higher accuracy than that of human taxonomists is not yet possible, but an expeditious analysis is essential for discovering the world beyond plankton. Recent studies have shown the imminent development of real-time, in situ plankton image classification systems, which have only been slowed down by the complex implementations of algorithms on low-power processing hardware. This article compiles the techniques that have been proposed for classifying marine plankton, focusing on automatic methods that utilize image processing, from the beginnings of this field to the present day.