Intelligent transportation and smart city applications are currently on the rise. In many applications, diverse and accurate sensor perception of vehicles is crucial. Relevant information could be conveniently acquired with traffic cameras, as cities already have an abundance of cameras. However, cameras have to be calibrated in order to acquire position data of vehicles. This paper proposes a novel automated calibration approach for partially connected vehicle environments. The approach utilises Global Navigation Satellite System (GNSS) positioning information shared by connected vehicles. Corresponding vehicle GNSS locations and image coordinates are used to fit a direct transformation between image and ground-plane coordinates. The proposed approach was validated with a research vehicle equipped with a Real-Time Kinematic (RTK)-corrected GNSS receiver driving past three different cameras. On average, the camera estimates contained errors ranging from 1.5 to 2.0 m when compared to the GNSS positions of the vehicle. Considering the vast lengths of the observed road sections, up to 140 m, the accuracy of the camera-based localisation should be adequate for a number of intelligent transportation applications. In the future, the calibration approach should be evaluated with a fusion of stand-alone GNSS positioning and inertial measurements, to validate the calibration methodology with more common vehicle sensor equipment.
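The direct image-to-ground transformation described above can be sketched as a planar homography fitted to GNSS–image point correspondences. The snippet below is a minimal illustration using a plain direct linear transform (DLT) on synthetic points; the paper's actual fitting procedure, normalisation, and robustness handling are not specified, so every name and step here is an assumption.

```python
import numpy as np

def fit_homography(img_pts, ground_pts):
    """Fit a 3x3 homography mapping image pixels to ground-plane
    coordinates with a plain DLT (no normalisation, no RANSAC)."""
    A = []
    for (x, y), (X, Y) in zip(img_pts, ground_pts):
        A.append([x, y, 1, 0, 0, 0, -x * X, -y * X, -X])
        A.append([0, 0, 0, x, y, 1, -x * Y, -y * Y, -Y])
    # The homography is the right null vector of A (smallest singular value).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def image_to_ground(H, x, y):
    """Project one pixel to ground-plane coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Synthetic check: the ground plane is a scaled, shifted copy of the image.
img = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0), (40.0, 70.0)]
gnd = [(2 * x + 1, 2 * y + 3) for x, y in img]
H = fit_homography(img, gnd)
X, Y = image_to_ground(H, 50.0, 50.0)  # expected near (101.0, 103.0)
```

With at least four non-collinear correspondences the homography is determined; in practice, coordinate normalisation and a robust estimator would be needed to cope with GNSS noise.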
ISBN:
(Print) 9798350301298
Controllable image denoising aims to generate clean samples with human perceptual priors and balance sharpness and smoothness. In traditional filter-based denoising methods, this can be easily achieved by adjusting the filtering strength. However, for neural-network-based models, adjusting the final denoising strength requires performing network inference each time, making real-time user interaction almost impossible. In this paper, we introduce Real-time Controllable Denoising (RCD), the first deep image and video denoising pipeline that provides a fully controllable user interface to edit arbitrary denoising levels in real time with only one-time network inference. Unlike existing controllable denoising methods that require multiple denoisers and training stages, RCD replaces the last output layer (which usually outputs a single noise map) of an existing CNN-based model with a lightweight module that outputs multiple noise maps. We propose a novel Noise Decorrelation process to enforce the orthogonality of the noise feature maps, allowing arbitrary noise-level control through noise map interpolation. This process is network-free and does not require network inference. Our experiments show that RCD can enable real-time editable image and video denoising for various existing heavyweight models without sacrificing their original performance.
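The idea of orthogonal noise maps enabling interpolation-based control can be sketched as follows. Gram-Schmidt orthogonalisation is used here as a stand-in for the paper's Noise Decorrelation process, and all names and shapes are illustrative assumptions, not RCD's implementation.

```python
import numpy as np

def decorrelate(noise_maps):
    """Orthogonalise flattened noise maps (Gram-Schmidt as a stand-in
    for the paper's Noise Decorrelation; illustrative only)."""
    ortho = []
    for m in noise_maps:
        v = np.asarray(m, dtype=float).copy()
        for u in ortho:
            v = v - (v @ u) / (u @ u) * u
        ortho.append(v)
    return np.stack(ortho)

def denoise(noisy, maps, weights):
    """User-editable denoising: subtract a weighted blend of the noise
    maps. Re-weighting needs no further network inference."""
    return noisy - np.tensordot(weights, maps, axes=1)

rng = np.random.default_rng(0)
raw = rng.standard_normal((3, 64))       # hypothetical predicted noise maps
maps = decorrelate(raw)
noisy = rng.standard_normal(64)          # hypothetical noisy signal
mild = denoise(noisy, maps, np.array([0.2, 0.2, 0.2]))    # weak setting
strong = denoise(noisy, maps, np.array([1.0, 1.0, 1.0]))  # strong setting
```

Because the maps are orthogonal, each weight adjusts an independent noise component, which is what makes smooth slider-style control well behaved.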
ISBN:
(Print) 9798350332261
Real-time video streaming through the underwater acoustic (UA) channel is challenging due to the limited bandwidth. In this paper, we present a high-rate, reconfigurable software-defined UA communication system that we recently developed, which is capable of real-time through-water video streaming. The transmitter consists of a universal software radio peripheral interfaced with a high-frequency transducer through a broadband impedance matching network designed in-house. The transmitter and receiver signal processing algorithms are implemented in Python and run on external host computers. The system can reach a data rate of 445 kbps using a single transducer. The prototype system was tested in a UA communication experiment conducted in a hydroacoustic tank. Experimental results show that, together with our video processing algorithms, the system can transmit real-time video with high quality.
ISBN:
(Print) 9780998133171
Given the increasing prevalence of digital services across various aspects of life, it has become crucial to understand and recognize the mental states of individuals interacting with artificial systems. To address this concern, we aimed to develop PosEmo, an automated application that assesses individuals' affective states using a video web camera. While studying affective states, we focused on two kinds of emotional behavior: approach/avoidance behavior and behavioral freezing/activation. To measure these behaviors, we use computer vision techniques to track the movement of the participant's head in video recordings, as well as in real-time video streams. This method offers the seated research participant convenience, replicability, and non-intrusiveness. Drawing from established theoretical frameworks and supported by initial empirical findings, we developed the software and validated it in an online experiment. We found that PosEmo recognized whether people watched negative, neutral, or positive videos. Thus, our innovative approach enables us to accurately estimate people's affective states. In sum, by adopting a human-centered approach, we combined artificial intelligence methodologies to create an innovative system supporting human-computer interaction. Our system's potential research applications span various domains, such as psychology, cognitive science, usability studies, psychotherapy sessions, content quality assessment, and education.
ISBN:
(Print) 9798350318920; 9798350318937
Underwater imaging presents numerous challenges due to refraction, light absorption, and scattering, resulting in color degradation, low contrast, and blurriness. Enhancing underwater images is crucial for high-level computer vision tasks, but existing methods either neglect the physics-based image formation process or require expensive computations. In this paper, we propose an effective framework that combines a physics-based Underwater Image Formation Model (UIFM) with a deep image enhancement approach based on the retinex model. Firstly, we remove backscatter by estimating attenuation coefficients using depth information. Then, we employ a retinex-model-based deep image enhancement module to enhance the images. To ensure adherence to the UIFM, we introduce a novel wideband attenuation prior. The proposed PhISH-Net framework achieves real-time processing of high-resolution underwater images using a lightweight neural network and a bilateral-grid-based upsampler. Extensive experiments on two underwater image datasets demonstrate the superior performance of our method compared to state-of-the-art techniques. Additionally, qualitative evaluation in a cross-dataset scenario confirms its generalization capability. Our contributions lie in combining the physics-based UIFM with deep image enhancement methods, introducing the wideband attenuation prior, and achieving superior performance and efficiency.
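The depth-based backscatter-removal step can be illustrated with the commonly used underwater image formation model I = J·exp(−β_d·d) + B_∞·(1 − exp(−β_b·d)). The simple per-channel subtraction and all coefficient values below are illustrative assumptions, not PhISH-Net's exact formulation.

```python
import numpy as np

def remove_backscatter(image, depth, beta_b, b_inf):
    """Subtract depth-dependent backscatter per colour channel,
    following I = J*exp(-beta_d*d) + B_inf*(1 - exp(-beta_b*d)).
    Coefficient values are illustrative assumptions."""
    backscatter = b_inf * (1.0 - np.exp(-beta_b * depth[..., None]))
    return np.clip(image - backscatter, 0.0, 1.0)

# Toy 1x2 RGB image: one near pixel (1 m) and one far pixel (5 m).
image = np.full((1, 2, 3), 0.5)
depth = np.array([[1.0, 5.0]])
beta_b = np.array([0.8, 0.4, 0.2])   # per-channel backscatter coefficients
b_inf = np.array([0.3, 0.2, 0.1])    # veiling light at infinity
direct = remove_backscatter(image, depth, beta_b, b_inf)
```

The farther pixel accumulates more backscatter, so more is subtracted from it; the remaining signal is what a subsequent enhancement module would operate on.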
Deploying high-spec cameras in video systems often falls short of user expectations. Leveraging advancements in deep learning, we propose a mobile, lightweight, real-time video enhancement system. Our approach adopts ...
ISBN:
(Print) 9798350352368
This research explores an affordable, high-precision crowd-monitoring system that integrates 2D LiDAR scan data with camera images through data fusion. The novelty of the research lies in achieving 3D scanning with a 2D LiDAR mounted on a servo-controlled tilting mechanism: scans captured at multiple elevation angles are overlapped according to their elevation angle to simulate a 3D scanning operation, and the resulting 2D point cloud data are fused with images for human detection and distance measurement for crowd-monitoring purposes. The proposed techniques enhance 2D LiDAR detection, enabling detailed scanning at lower cost and complexity. The system combines LiDAR measurements with camera imagery through the proposed filtering and fusion algorithms, implemented on a novel servo-controlled swinging platform, which is essential for accurate real-time tracking in enclosed crowded areas. The outcomes of the research show that the proposed crowd-monitoring system can accurately localize an individual, in terms of distance and angle from the LiDAR scan data and with a bounding box in the image, classifying the detected object as a human with high accuracy using the proposed filtering and fusion techniques.
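The tilt-and-merge idea can be sketched as a spherical-to-Cartesian conversion that stacks 2D sweeps captured at different servo elevation angles. The angle conventions and data layout below are assumptions for illustration, not the system's actual interface.

```python
import numpy as np

def scans_to_cloud(scans):
    """Merge 2D LiDAR sweeps taken at different servo elevation angles
    into one 3D point cloud. Each scan is (elevation_deg, ranges_m,
    azimuths_deg); the angle conventions are assumptions."""
    points = []
    for elev_deg, ranges, azimuths in scans:
        el = np.radians(elev_deg)
        az = np.radians(np.asarray(azimuths, dtype=float))
        r = np.asarray(ranges, dtype=float)
        x = r * np.cos(el) * np.cos(az)
        y = r * np.cos(el) * np.sin(az)
        z = r * np.sin(el)
        points.append(np.stack([x, y, z], axis=1))
    return np.concatenate(points)

# Two sweeps of a single 2 m return, at 0 and 30 degrees of tilt.
cloud = scans_to_cloud([(0.0, [2.0], [0.0]), (30.0, [2.0], [0.0])])
```

Sweeping the servo through many elevation angles and concatenating the converted points yields the simulated 3D scan that is then fused with the camera detections.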
ISBN:
(Print) 9798350318920; 9798350318937
Object re-identification (ReID) from images plays a critical role in application domains of image retrieval (surveillance, retail analytics, etc.) and multi-object tracking (autonomous driving, robotics, etc.). However, systems that additionally or exclusively perceive the world from depth sensors are becoming more commonplace without any corresponding methods for object ReID. In this work, we fill the gap by providing the first large-scale study of object ReID from point clouds and establishing its performance relative to image ReID. To enable such a study, we create two large-scale ReID datasets with paired image and LiDAR observations and propose a lightweight matching head that can be concatenated to any set or sequence processing backbone (e.g., PointNet or ViT), creating a family of comparable object ReID networks for both modalities. Run in Siamese style, our proposed point cloud ReID networks can make thousands of pairwise comparisons in real-time (10 Hz). Our findings demonstrate that their performance increases with higher sensor resolution and approaches that of image ReID when observations are sufficiently dense. Our strongest network trained at the largest scale achieves ReID accuracy exceeding 90% for rigid objects and 85% for deformable objects (without any explicit skeleton normalization). To our knowledge, we are the first to study object re-identification from real point cloud observations. Our code is available at https://***/bentherien/point-cloud-reid.
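Run in Siamese style, such a matching head reduces at inference time to scoring pairs of embeddings. The sketch below uses plain cosine similarity over hypothetical 128-D embeddings as a generic stand-in; the paper's actual matching head is learned, so this is illustrative only.

```python
import numpy as np

def match_scores(query_emb, gallery_embs):
    """Cosine similarity between one query embedding and a gallery,
    a generic stand-in for a learned Siamese matching head."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    return g @ q

rng = np.random.default_rng(1)
gallery = rng.standard_normal((1000, 128))             # hypothetical embeddings
query = gallery[42] + 0.01 * rng.standard_normal(128)  # near-duplicate of #42
scores = match_scores(query, gallery)                  # 1000 pairwise scores
best = int(np.argmax(scores))                          # expected: 42
```

Because the comparison is a single matrix-vector product over precomputed embeddings, thousands of pairwise comparisons per query are cheap, consistent with the real-time (10 Hz) figure quoted above.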
Nowadays, securing people in public places is an emerging social issue in the research of real-time crime detection (RCD) by video surveillance, in which initial automatic recognition of suspicious objects is consider...
ISBN:
(Print) 9789464593617; 9798331519773
Overfitted image codecs offer compelling compression performance and low decoder complexity, through the overfitting of a lightweight decoder for each image. Such codecs include Cool-chic, which presents image coding performance on par with VVC while requiring around 2000 multiplications per decoded pixel. This paper proposes to decrease Cool-chic encoding and decoding complexity. The encoding complexity is reduced by shortening Cool-chic training, up to the point where no overfitting is performed at all. It is also shown that a tiny neural decoder with 300 multiplications per pixel still outperforms HEVC. A near real-time CPU implementation of this decoder is made available at https://***/Cool-Chic/.
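The multiplications-per-decoded-pixel figures quoted above can be estimated by summing k²·c_in·c_out over a decoder's convolution layers. The layer shapes below are hypothetical, chosen only to show the counting; they are not Cool-chic's actual architecture.

```python
def mults_per_pixel(layers):
    """Rough multiplications per decoded pixel for a stack of conv
    layers, counted as k*k*c_in*c_out per output pixel (stride 1,
    constant resolution). Layer shapes are hypothetical."""
    return sum(k * k * c_in * c_out for k, c_in, c_out in layers)

# A hypothetical tiny synthesis decoder: two 3x3 convolutions.
tiny = [(3, 8, 2), (3, 2, 3)]  # (kernel, in_channels, out_channels)
total = mults_per_pixel(tiny)  # 3*3*8*2 + 3*3*2*3 = 198
```

This kind of accounting is how a decoder budget in the low hundreds of multiplications per pixel, as in the tiny decoder above, can be compared against the roughly 2000 of the full Cool-chic decoder.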