One of the major challenges of AI is the misuse of images generated by generative models. Advances in this field have reached a point where distinguishing between real and fake images can be impossible for humans and challenging even for machines. Although significant work has been done on detecting fake images, there is an ongoing competition between content generation and detection methods. However, a significant challenge for detection methods is their limitation to content generated by specific models. This study aims to enhance the generalization of fake image detection methods. Experimental results indicate that modifications made to the base model have contributed to improving its generalizability.
This paper introduces a vision-based dynamic positioning (DP) control system and develops a hardware-in-the-loop (HIL) platform to validate the controller's performance on a work-class remotely operated vehicle (ROV). The proposed platform consists of three main parts: the hardware, the image-processing part, and the controller. The hardware includes a calibrated camera connected to a dedicated computer via USB 2.0. In the image-processing part, after pre-processing, a circular Hough transform is used to detect the target and determine its position in the image plane. The paper also proposes a feedforward proportional-integral-derivative (PID) controller. To evaluate the proposed controller, two scenarios were implemented. In the first scenario, the target was stationary and a disturbance was applied to the ROV in the simulation environment. In the second scenario, the target moved along a rectangular path, and the objective was to stabilize the ROV at the desired points. In both scenarios, the reference signal was acquired from the target by the calibrated camera and sent to the controller. The results showed that the proposed controller achieved the desired performance.
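As a rough illustration of the feedforward PID idea mentioned in this abstract, the following Python sketch combines PID feedback on the position error with an additive feedforward term. It is not the authors' controller: the class name `FeedforwardPID`, the gains, the one-dimensional integrator plant, and the assumption that the feedforward term exactly cancels a measured constant disturbance are all hypothetical choices made for the example.

```python
class FeedforwardPID:
    """PID feedback on the tracking error plus an additive feedforward term."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, reference, measurement, feedforward, dt):
        error = reference - measurement
        self.integral += error * dt
        # No derivative action on the first sample, since no previous error exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        feedback = self.kp * error + self.ki * self.integral + self.kd * derivative
        return feedback + feedforward


def simulate(steps=2000, dt=0.01, reference=1.0, disturbance=0.5):
    """Hypothetical 1-D plant: velocity = command + disturbance (Euler integration)."""
    controller = FeedforwardPID(kp=2.0, ki=1.0, kd=0.05)
    position = 0.0
    for _ in range(steps):
        # Assumption: the disturbance is measured exactly, so feedforward cancels it.
        feedforward = -disturbance
        command = controller.update(reference, position, feedforward, dt)
        position += dt * (command + disturbance)
    return position


if __name__ == "__main__":
    print(f"final position: {simulate():.4f}")
```

In this toy setup the feedforward term removes the known disturbance, so the feedback loop only has to drive the remaining tracking error toward zero. In a vision-based system the `reference` and `measurement` values would come from the camera pipeline rather than from a simulated state.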
With the rapid advancement of multi-source imaging technology, image fusion has become crucial in image processing. This technique combines data from various modalities to produce rich fused images for deeper analysis...
Eliminating defects of missing strips and mislabeling is an important task for cigarette enterprises to ensure product quality. There are various shortcomings in the defect detection methods used in the past, and mach...
ISBN: (print) 9798350318920; 9798350318937
The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OPENCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results. DINOv2 consistently outperforms the standard practice of ImageNet pretraining. However, the other foundation models failed to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.
In machine/computer vision, cameras serve a major role in image acquisition. Surveillance scenarios typically rely on Closed-Circuit Television (CCTV) cameras. This study aims to evaluate industrial cameras within a surveillance application, contrasting their performance with that of CCTV cameras. We explore the comparative analysis of CCTV and industrial cameras for vehicle attribute recognition, specifically concentrating on the recognition of vehicle color and model using deep learning techniques. To train and evaluate the models, we have created datasets from images captured by both a CCTV and an industrial camera. Our findings indicate that the industrial camera outperforms the CCTV. However, employing advanced processing algorithms has the potential to minimize the performance gap between these two cameras. Our research represents one of the initial comparative analyses between these camera types, offering valuable guidance in selecting the most suitable camera for specific applications.
Accurate segmentation of retinal vessels is vital for clinical diagnosis, yet challenges like complex vascular structures, noise, and low contrast persist. While deep learning has enhanced segmentation through context...
ISBN: (print) 9798350377040; 9798350377033
With the continuous progress of image processing and machine vision technology, the demand for efficient, real-time processing is growing, especially for high-noise images. This study proposes an adaptive Gaussian filtering algorithm, implemented on an FPGA, that aims to improve the computational efficiency and real-time performance of image processing systems. Compared with a traditional fixed-weight filter, the algorithm dynamically adjusts its filtering parameters according to the noise environment, balancing noise suppression against image detail retention. We coded the algorithm in the Verilog hardware description language and verified it on the PYNQ-Z2 FPGA platform. The experimental results show that the adaptive algorithm outperforms the fixed-weight filtering method, especially in noise suppression and detail preservation. The FPGA implementation also reduces filtering delay and optimizes resource consumption, making it well suited to real-time applications. This study demonstrates the promise of FPGA-based adaptive filtering for medical imaging, remote sensing, and intelligent surveillance, which have stringent requirements for high-performance, high-efficiency processing, and it provides a new hardware solution for real-time, high-quality image processing in constrained environments.
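To make the contrast with a fixed-weight filter concrete, here is a minimal software sketch of one possible adaptive Gaussian filter. It is not the authors' Verilog design, and the adaptation rule is an assumption made for illustration: the kernel width `sigma` is chosen per pixel from the ratio of an assumed noise variance (`noise_var`) to the local 3x3 variance, so flat regions are smoothed strongly and high-variance regions (likely edges or detail) are smoothed lightly. The function names and the `sigma_min`/`sigma_max` defaults are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius=1):
    """Return a normalized (2*radius+1)^2 Gaussian kernel as a list of rows."""
    offsets = range(-radius, radius + 1)
    weights = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) for dx in offsets]
        for dy in offsets
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def window(image, row, col, radius=1):
    """Collect the neighborhood in row-major order, clamping at the borders."""
    height, width = len(image), len(image[0])
    values = []
    for dy in range(-radius, radius + 1):
        r = min(max(row + dy, 0), height - 1)
        for dx in range(-radius, radius + 1):
            c = min(max(col + dx, 0), width - 1)
            values.append(image[r][c])
    return values


def adaptive_gaussian_filter(image, noise_var, sigma_min=0.3, sigma_max=1.5):
    """Smooth each pixel with a 3x3 Gaussian whose sigma depends on local variance."""
    if noise_var <= 0:
        raise ValueError("noise_var must be positive")
    output = []
    for row in range(len(image)):
        out_row = []
        for col in range(len(image[0])):
            values = window(image, row, col)
            mean = sum(values) / len(values)
            local_var = sum((v - mean) ** 2 for v in values) / len(values)
            # Assumed rule: blend is 1 in flat regions and falls toward 0 as
            # local variance grows well beyond the assumed noise variance.
            blend = noise_var / (local_var + noise_var)
            sigma = sigma_min + (sigma_max - sigma_min) * blend
            kernel = [w for kernel_row in gaussian_kernel(sigma) for w in kernel_row]
            out_row.append(sum(w * v for w, v in zip(kernel, values)))
        output.append(out_row)
    return output
```

A hardware implementation would typically replace the per-pixel kernel computation with a small set of precomputed fixed-point kernels selected by a variance threshold. That is a common design choice for FPGA filters, not something stated in the abstract.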
[Objective] This study aims to design an automatic agricultural machinery identification system based on computer vision, to address the issue of disorganized and chaotic labeling of some agricultural machinery produc...
In the process of real-time tracking of multiple moving targets, there is a big gap between the tracking effect and the ideal effect due to the influence of the objective environment state. Therefore, a real-time trac...