ISBN (print): 9781665456708
High-speed 3D shape sensing is an essential technology for three-dimensional recognition and manipulation in dynamic scenes. However, conventional high-speed sensing methods focus mainly on image capture speed; the actual processing time needed to obtain a point cloud is not optimized in either the measurement scheme or the sensing pattern configuration. Yet measurement latency is critical for responding in real time and physically handling dynamic scenes. This paper introduces a physically stereo-rectified projector-camera system for high-speed, low-latency 3D sensing with fast sequential memory access in the decoding process. Moreover, we configure a structured light pattern, named the "Parallel-bus pattern", with a De Bruijn torus and clock lines to maximize information density and to decode the pattern robustly, in the manner of data transfer over a parallel bus interface. We measured dynamically moving and deforming objects with the proposed system and evaluated the measurement performance. The developed 3D sensing system with the parallel-bus pattern achieved 27 K-point measurement at more than 1000 fps with 0.336 ms latency and 0.838 mm accuracy on average.
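The abstract does not give the construction of the De Bruijn torus, so the following is only a minimal sketch of the one-dimensional idea behind it: in a De Bruijn sequence B(k, n), every length-n window of symbols occurs exactly once (cyclically), so observing any n consecutive symbols identifies the position in the pattern. The function names `de_bruijn` and `locate_window` are hypothetical, and the generator is the standard Lyndon-word construction, not the authors' method.

```python
def de_bruijn(k: int, n: int) -> list[int]:
    """Return a De Bruijn sequence B(k, n) over the alphabet {0, ..., k-1}.

    Uses the standard recursive construction that concatenates Lyndon words
    whose length divides n. The result has length k**n.
    """
    a = [0] * (k * n)
    sequence: list[int] = []

    def db(t: int, p: int) -> None:
        if t > n:
            # A Lyndon word of length p contributes only if p divides n.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def locate_window(sequence: list[int], window: list[int]) -> int:
    """Return the cyclic index at which `window` occurs, or -1 if absent."""
    n = len(window)
    cyclic = sequence + sequence[:n - 1]  # unroll the wrap-around
    for i in range(len(sequence)):
        if cyclic[i:i + n] == list(window):
            return i
    return -1


if __name__ == "__main__":
    seq = de_bruijn(3, 3)               # 3 symbols (e.g. colors), window of 3
    print(len(seq))                     # 27 distinct window positions
    print(locate_window(seq, seq[5:8]))
```

In a structured-light setting the symbols would correspond to projected stripe or dot codes, and the lookup would give the projector coordinate for a decoded camera neighbourhood; a torus extends the same uniqueness property to two-dimensional windows.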
In the realm of agriculture and horticulture, machine vision and soft computing approaches have shown promise in overcoming the limitations of traditional methods for identifying plant illnesses utilizing various plan...
Image aesthetics assessment (IAA) aims to estimate the aesthetics of images. Depending on the content of an image, diverse criteria need to be selected to assess its aesthetics. Existing works utilize pre-trained vision backbones based on content knowledge to learn image aesthetics. However, training those backbones is time-consuming and suffers from attention dispersion. Inspired by learnable queries in vision-language alignment, we propose the Image Aesthetics Assessment via Learnable Queries (IAA-LQ) approach. It adapts learnable queries to extract aesthetic features from pre-trained image features obtained from a frozen image encoder. Extensive experiments on real-world data demonstrate the advantages of IAA-LQ, which outperforms the best state-of-the-art method by 2.2% and 2.1% in terms of SRCC and PLCC, respectively.
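The two reported metrics can be illustrated with a short sketch. SRCC (Spearman rank correlation) measures whether predicted scores order the images the same way as the ground truth, while PLCC (Pearson linear correlation) measures linear agreement. The helper names and the toy scores below are hypothetical and are not taken from the paper.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]       # toy model scores
    ground_truth = [1.0, 4.0, 9.0, 16.0, 25.0]  # toy mean opinion scores
    print(spearman(predicted, ground_truth))    # perfect ordering
    print(pearson(predicted, ground_truth))     # below 1: relation is not linear
```

A model can therefore score well on SRCC while scoring lower on PLCC, which is why the two are usually reported together.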
ISBN (digital): 9798350394474
ISBN (print): 9798350394481
Deep learning has a variety of uses and can solve many real-world problems, but it also has limitations. The growing use of AI Morphed videos is one of the most recent and complicated issues. "AI Morphed videos," which are digitally manipulated still or moving visuals, are made using deep learning techniques. In an AI Morphed video, the target’s face is superimposed over the original image so that the altered digital data can be used for online fraud, extortion, pornography, etc. As deep learning develops, it is becoming harder and harder to distinguish manually between real and fake content. Therefore, research and development in the field of AI Morphed video detection are crucial. This paper gives an overview of AI Morphed video detection strategies, classified as feature-based, temporal-based, and deep feature-based. The comparison is based mostly on the key features used, including the face detection architecture, the deep learning architecture, whether the method is video-based or image-based, the dataset used, the frame size, and the dataset size. Along with the comparison, a semi-supervised GAN architecture is proposed and built to recognize AI Morphed videos.
ISBN (digital): 9798331542887
ISBN (print): 9798331542894
To meet the real-time processing requirements of edge computing devices, we construct in-sensor computing units based on magnetic tunnel junction (MTJ) arrays, which realize in-situ computation of vector-matrix multiplication with parameters dynamically programmed to adapt to different tasks. Unlike the previously used back-propagation (BP) algorithm, we introduce the forward-forward (FF) algorithm, which uses a “goodness” optimization mechanism over positive and negative samples to reduce the computational complexity. In the noisy image classification task, the FF algorithm reduces the training time by more than 90% compared to the BP algorithm and converges significantly faster to perfect classification results.
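The abstract does not define its "goodness" measure. As a minimal sketch, the code below assumes the choice made in the original forward-forward proposal: goodness is the sum of squared activations of a layer, and each layer is trained locally so that goodness exceeds a threshold for positive samples and falls below it for negative samples. The function names and the threshold `theta` are hypothetical.

```python
import math


def goodness(activations: list[float]) -> float:
    """Assumed goodness: sum of squared layer activations."""
    return sum(a * a for a in activations)


def prob_positive(activations: list[float], theta: float) -> float:
    """Probability that the input is a positive sample: sigmoid(goodness - theta)."""
    return 1.0 / (1.0 + math.exp(-(goodness(activations) - theta)))


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def ff_layer_loss(pos: list[float], neg: list[float], theta: float) -> float:
    """Local layer loss for one positive and one negative sample.

    -log p(pos) = softplus(theta - goodness(pos))
    -log (1 - p(neg)) = softplus(goodness(neg) - theta)
    No error signal from later layers is needed, which is the contrast
    with back-propagation.
    """
    return softplus(theta - goodness(pos)) + softplus(goodness(neg) - theta)


if __name__ == "__main__":
    theta = 2.0
    pos_act = [1.5, 1.0, 0.5]   # toy activations for a real (positive) input
    neg_act = [0.2, 0.1, 0.3]   # toy activations for a corrupted (negative) input
    print(prob_positive(pos_act, theta))
    print(ff_layer_loss(pos_act, neg_act, theta))
```

Because each layer's objective depends only on its own activations, the update can be computed in a single forward pass per sample, which is consistent with the reduced computational complexity the abstract reports.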
ISBN (digital): 9798350352368
ISBN (print): 9798350352375
Traffic sign recognition is crucial for the safe and efficient operation of autonomous vehicles. While previous research has primarily focused on traffic sign recognition in foreign countries, these studies often face limitations such as differing traffic sign designs, language barriers in textual information, and varying environmental conditions. In this paper, we propose a traffic sign detection and recognition system tailored for Malaysia, utilizing Convolutional Neural Networks (CNNs) and Optical Character Recognition (OCR). Specifically, the system uses You Only Look Once (YOLO) V8 for object detection and EasyOCR to process textual information on selected traffic signs. Our system achieves a mean Average Precision (mAP) of 0.824 and an average processing time of 1.2 seconds per frame, which is comparable to results in the existing literature. Furthermore, the complexity of our method is significantly reduced, enhancing its potential for real-time processing applications, as evidenced by its efficient processing time.
ISBN (digital): 9798331530983
ISBN (print): 9798331530990
Flooding is one of the major natural disasters that pose serious threats to human life, property, and ecological systems, especially in inundation-prone areas such as Calamba, Laguna, Philippines. The rising occurrence of flooding due to climate change and urbanization emphasizes the need for effective early warning systems to mitigate its impacts. The BAHAGAP flood detection and monitoring system developed here runs on solar energy and uses image processing techniques to improve early warning functionality and strengthen community resilience. BAHAGAP monitors flood levels in real time through image processing and integrates hardware components such as rain gauges, Raspberry Pi 4B boards, and a solar power system to ensure uninterrupted operation during the power outages common in extreme weather. The system provides a sustainable and efficient solution tailored to high-risk zones and addresses weaknesses in traditional flood management strategies. Its energy-efficient design, scalability, and ease of deployment make it adaptable to other flood-prone regions. Image processing detects the actual flood level precisely, and the system sends timely warnings to local governments and residents so that they can make informed decisions. A flood detection model is also trained to improve system performance. By combining hardware reliability with software capabilities, BAHAGAP offers a holistic approach to flood surveillance. The system is a step forward in flood identification and early alerting, meeting the growing need for sustainable ways to reduce the impact of flooding. The study highlights the fundamental role of technology in enhancing disaster resilience and positions BAHAGAP as a model for future flood management schemes in vulnerable regions.
ISBN (digital): 9798331536626
ISBN (print): 9798331536633
Segmented light field images can serve as a powerful representation in many computer vision tasks that exploit the geometry and appearance of objects, such as object pose tracking. For those images, segmentation presents the additional objective of recognizing the same segment across all the views. Segment Anything Model 2 (SAM 2) can produce semantically meaningful segments for monocular images and videos. However, using the video mode of SAM 2 on less general 4D functions such as light fields is ineffective. In this work, we present a novel segmentation method that adapts SAM 2 to the light field domain without retraining or modifying the model. By utilizing epipolar constraints, our method produces high-quality and view-consistent masks, outperforming the SAM 2 video tracking baseline and working 7 times faster, moving towards real-time segmentation speed. We achieve this by exploiting epipolar geometry cues to propagate the masks between the views, probing the SAM 2 latent space to estimate their occlusion, and further prompting SAM 2 for their refinement. The code and additional materials are available at https://***/Projects/LFSAM/.
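The geometric part of the mask propagation can be sketched as follows. This is an illustration only: it assumes a regularly sampled light field and a per-pixel disparity map for the reference view, neither of which the abstract specifies, and it covers only the epipolar shift, not the occlusion estimation or SAM 2 refinement steps. The function name, the sign convention, and the toy values are hypothetical.

```python
Pixel = tuple[int, int]


def propagate_mask(
    mask: set[Pixel],
    disparity: dict[Pixel, float],
    du: int,
    dv: int,
    width: int,
    height: int,
) -> set[Pixel]:
    """Shift reference-view mask pixels into a neighbouring view.

    For a view offset (du, dv) on the camera grid, a pixel (x, y) with
    disparity d (pixels per view step, assumed sign convention) lands at
    (x + d * du, y + d * dv). Pixels that leave the image are dropped.
    """
    propagated: set[Pixel] = set()
    for (x, y) in mask:
        d = disparity[(x, y)]
        nx = round(x + d * du)
        ny = round(y + d * dv)
        if 0 <= nx < width and 0 <= ny < height:
            propagated.add((nx, ny))
    return propagated


if __name__ == "__main__":
    # Toy 6x4 view: a two-pixel segment with a disparity of 1 px per view step.
    reference_mask = {(2, 2), (3, 2)}
    disparity_map = {(2, 2): 1.0, (3, 2): 1.0}
    # Target view lies two steps to the right of the reference view.
    print(sorted(propagate_mask(reference_mask, disparity_map, 2, 0, 6, 4)))
```

A shifted mask of this kind can only be a starting estimate: pixels hidden in the target view still land inside it, which is why a separate occlusion check and refinement step is needed.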
In recent years, deep learning has been gradually applied to the industry with great success. As the demand for the lightweight intelligent devices increases, the deployment of deep learning models on embedded platfor...
ISBN (digital): 9798331523923
ISBN (print): 9798331523930
This paper presents an automated pothole detection system using deep learning and Explainable AI (XAI), which improves model transparency and decision-making. High-quality images and videos undergo image processing, including edge detection, feature extraction, and deep learning, enabling precise identification and classification of potholes, even in challenging environments. Evaluating potholes is crucial for road maintenance, safety, and reducing costs related to vehicle damage and road repairs. Traditional methods, such as manual inspections and sensors, are costly and time-consuming. The proposed real-time system can spot problem areas before they become severe potholes.