ISBN: (print) 9780998133171
Given the increasing prevalence of digital services across various aspects of life, it has become crucial to understand and recognize the mental states of individuals interacting with artificial systems. To address this concern, we aimed to develop PosEmo, an automated application that can assess individuals' affective states using a video web camera. In studying affective states, we focused on two kinds of emotional behavior: approach/avoidance behavior and behavioral freezing/activation. To measure these behaviors, we used computer vision techniques to track the movement of the participant's head in video recordings as well as in real-time video streaming. This method offers the seated research participant convenience, replicability, and non-intrusiveness. Drawing on established theoretical frameworks and supported by initial empirical findings, we developed the software and validated it in an online experiment. We found that PosEmo recognized whether people watched negative, neutral, or positive videos. Thus, our approach enables us to accurately estimate people's affective states. In sum, by adopting a human-centered approach, we combined artificial intelligence methodologies to create an innovative system supporting human-computer interaction. Our system's potential research applications span various domains, such as psychology, cognitive science, usability studies, psychotherapy sessions, content quality assessment, and education.
ISBN: (print) 9798350318920; 9798350318937
Object re-identification (ReID) from images plays a critical role in application domains of image retrieval (surveillance, retail analytics, etc.) and multi-object tracking (autonomous driving, robotics, etc.). However, systems that additionally or exclusively perceive the world from depth sensors are becoming more commonplace without any corresponding methods for object ReID. In this work, we fill the gap by providing the first large-scale study of object ReID from point clouds and establishing its performance relative to image ReID. To enable such a study, we create two large-scale ReID datasets with paired image and LiDAR observations and propose a lightweight matching head that can be concatenated to any set or sequence processing backbone (e.g., PointNet or ViT), creating a family of comparable object ReID networks for both modalities. Run in Siamese style, our proposed point cloud ReID networks can make thousands of pairwise comparisons in real-time (10 Hz). Our findings demonstrate that their performance increases with higher sensor resolution and approaches that of image ReID when observations are sufficiently dense. Our strongest network trained at the largest scale achieves ReID accuracy exceeding 90% for rigid objects and 85% for deformable objects (without any explicit skeleton normalization). To our knowledge, we are the first to study object re-identification from real point cloud observations. Our code is available at https://***/bentherien/point-cloud-reid.
There is a growing demand for real-time image denoising in low-light shooting with ultra-high definition cameras. This paper presents a denoising method that incorporates Haar-wavelet shrinkage denoising and a minimum...
Video synthetic aperture radar (video SAR) has become a research hotspot in the SAR field due to its characteristic of continuous monitoring. Compared to traditional SAR, video SAR provides the ability to dynamically ...
Video target tracking has wide application value in fields such as autonomous driving, UAV target tracking, and security monitoring. How to maintain stable tracking of the target among video data frames is th...
Interactive information fault diagnosis is a new type of fault diagnosis technology that integrates information fusion, artificial intelligence, computer science, and other disciplines. It can extract...
Change detection is crucial for various industrial applications. Although image change detection datasets are abundant, the collection of labeled video data is time-consuming, expensive, and cumbersome. This scarcity ...
To make it possible to discover violations in operation tasks from historical data and to estimate their probability, this paper proposes an intelligent identification algorithm for safety risk of transmission li...
This research presents a novel computer vision-based attention monitoring system designed for both online and offline contexts. Leveraging advanced image processing and machine learning algorithms, the system analyzes...
ISBN: (digital) 9781665496209
ISBN: (print) 9781665496209
Video analytics systems designed for computer vision tasks use deep learning models that rely on high-quality input data to maximize performance. However, in a real-world system, these inputs are often compressed using video codecs such as HEVC. Video compression degrades the quality of the inputs, thereby degrading the performance of these models. Region-of-interest (ROI) coding enables bits to be allocated to improve performance; however, the method to select regions should be computationally simple since it must occur during or before the video is compressed and transmitted for further processing. In this paper, we propose a task-aware quad-tree (TA-QT) partitioning and quantization method to achieve ROI coding for HEVC and other video coding standards. TA-QT uses a lightweight edge-based model to guide task-aware video encoding to improve end-stage video analytics (ESVA) performance while reducing both bit-rate and encoding time. We demonstrate the effectiveness of our approach in terms of (a) the performance of the ESVA on compressed inputs, (b) transmission bit-rates, and (c) encoding time.