ISBN: (print) 9798350391893; 9798350391886
This paper presents the development of an immersive virtual conference room designed to replicate the experience of a physical meeting environment. Using Unity, the authors created a virtual space equipped with several simple screens instead of holograms, each paired with an individual camera. Live camera feeds integrate real-time participant interactions into the virtual environment, enhancing telepresence and fostering immersive collaboration akin to face-to-face meetings. While the current implementation operates locally, future work will focus on enabling remote connectivity, so that individuals in different geographic locations can collaborate, and later on a hologram-based virtual conference. This approach aims to enhance remote collaboration and bridge the gap between virtual interaction and physical presence.
This paper proposes an efficient real-time framework to generate detailed avatar animations solely from monocular camera videos, avoiding costly motion capture equipment. It extracts 3D facial and body landmarks using...
Video surveillance systems are now widely deployed in places such as schools, parks, airports, and roads. However, existing systems are far from fully utilized because of the high computation overhead of video processing. In this work, we present ViTrack, a framework for efficient multi-video tracking that uses computation resources on the edge for commodity video surveillance systems. At the heart of ViTrack lies a two-layer spatial/temporal compressed target detection method that significantly reduces computation overhead by combining videos from multiple cameras. Further, ViTrack derives the relationships among videos and camera information even when camera location, direction, and similar metadata are unavailable. To alleviate the impact of varying video quality and missing targets, ViTrack uses a Markov model based approach to efficiently recover missing information and derive the complete trajectory. We implement ViTrack on a real, deployed video surveillance system with 110 cameras. The experimental results demonstrate that ViTrack provides efficient trajectory tracking with a processing time 45x lower than that of the existing approach. For 110 video cameras, ViTrack can run on a Dell OptiPlex 390 computer and track given targets in almost real time. We believe ViTrack can enable practical video analysis for widely deployed commodity video surveillance systems.
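The abstract does not describe how the Markov model recovers missing information, so the following is only a minimal sketch of one plausible reading: a first-order Markov chain over cameras that fills a single missing observation between two known sightings. The function `fill_gap`, the `TRANSITION` table, and its probabilities are hypothetical and are not taken from ViTrack.

```python
def fill_gap(prev_cam, next_cam, transition):
    """Return the most likely single missing camera between two sightings.

    transition[a][b] is an assumed probability that a target seen at
    camera a is next seen at camera b (first-order Markov chain).
    """
    best_cam, best_prob = None, 0.0
    for cam, p_in in transition.get(prev_cam, {}).items():
        p_out = transition.get(cam, {}).get(next_cam, 0.0)
        prob = p_in * p_out  # probability of the two-step path prev -> cam -> next
        if prob > best_prob:
            best_cam, best_prob = cam, prob
    return best_cam, best_prob


# Hypothetical transition probabilities between four cameras.
TRANSITION = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"D": 0.5, "C": 0.5},
    "C": {"D": 0.9, "A": 0.1},
    "D": {},
}

# A target was seen at camera A and later at camera D; guess the camera in between.
camera, probability = fill_gap("A", "D", TRANSITION)
print(camera, probability)
```

In this toy table the path through B (0.7 × 0.5) beats the path through C (0.3 × 0.9), so B is chosen. A real system would presumably estimate the transition probabilities from observed trajectories and handle gaps longer than one camera.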
Heart rate is a crucial metric in health monitoring. Traditional computer vision solutions estimate cardiac signals by detecting physical manifestations of heartbeats, such as facial discoloration caused by blood oxygenation changes, from subject videos using regression methods. As continuous signals are more complex and expensive to de-noise, this study introduces an alternative approach, employing end-to-end classification models to remotely derive a discrete representation of cardiac signals from face videos. These visual cardiac signal classifiers are trained on discretized cardiac signals, a novel pre-processing method with limited precedent in health monitoring literature. Consequently, various methods to convert continuous cardiac signals into binary form are presented, and their impact on training is evaluated. An implementation of this approach, the temporal shift convolutional attention binary classifier, is presented using the regression-based convolutional attention network architecture. The classifier and a baseline regression model are trained and tested using publicly available and locally collected datasets designed for heart signal detection from face video. The model performance is then assessed based on the heart rate error from the extracted cardiac signals. Results show the proposed method outperforms the baseline on the UBFC-rPPG dataset, reducing cross-dataset root mean square error from 2.33 to 1.63 beats per minute. However, both models struggled to generalize to the PURE dataset, with root mean square errors of 12.40 and 16.29 beats per minute, respectively. Additionally, the proposed approach reduces the computational complexity of model output post-processing, enhancing its suitability for real-time applications and deployment on systems with restricted resources.
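To make the idea of a discretized cardiac signal concrete, here is a minimal sketch under stated assumptions: the continuous signal is binarized by thresholding at its mean (only one of many possible conversion schemes, and not necessarily one the paper evaluates), heart rate is read off by counting rising edges, and error is reported as root mean square error. The function names and the synthetic 72 bpm test signal are illustrative only; the paper's classifier itself is not reproduced.

```python
import math


def binarize(signal):
    """Map a continuous signal to 0/1 by thresholding at its mean (assumed scheme)."""
    mean = sum(signal) / len(signal)
    return [1 if x > mean else 0 for x in signal]


def heart_rate_bpm(binary, fps):
    """Estimate heart rate by counting rising edges (0 -> 1) in a binary signal."""
    rising = sum(1 for a, b in zip(binary, binary[1:]) if a == 0 and b == 1)
    duration_s = len(binary) / fps
    return 60.0 * rising / duration_s


def rmse(predicted, reference):
    """Root mean square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic 1.2 Hz (72 bpm) cosine sampled at 30 frames per second for 10 s.
FPS = 30
wave = [math.cos(2 * math.pi * 1.2 * n / FPS) for n in range(10 * FPS)]

estimate = heart_rate_bpm(binarize(wave), FPS)
print(estimate, rmse([estimate], [72.0]))
```

Counting edges in a binary signal needs no peak detection or band-pass filtering, which illustrates why a discrete output can simplify post-processing compared with a continuous regression output.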
This survey paper reviews the challenges and recent advancements in Artificial Intelligence (AI) video-based smoke and fire detection systems, covering both indoor and outdoor environments. The main problems addressed are high false alarm rates (ranging from 3.4% to 29.49% across various systems) and the challenges posed by environmental variability, dataset scarcity, and the complexity of real-time detection. The paper critically examines key methodologies, including traditional approaches, deep learning techniques (with accuracy rates reaching up to 98.72% and false alarm rates reduced to as low as 0.61%), hybrid methods, and domain transfer-based tools, highlighting their evolution and current trends. The survey also provides an in-depth analysis of publicly available datasets and evaluation metrics, such as detection accuracy (ranging from 79.66% to 98.72%), robustness to dynamic environments, and real-time processing capabilities (with some systems achieving up to 333 frames per second (FPS)). By synthesizing insights from 33 papers published between 2013 and 2024, the survey not only summarizes the current state of the art but also identifies emerging trends, such as the increasing use of automatic feature learning and multi-fusion systems, which have demonstrated significant improvements in detection accuracy. The paper concludes by advocating for future research focused on improving system robustness and reducing false alarms through the integration of visible range cameras and traditional sensors, with the goal of achieving more accurate and reliable fire detection in surveillance systems.
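The abstract quotes accuracy and false alarm rates without defining them, and surveyed papers may define them differently. As a minimal sketch, assuming the common confusion-matrix definitions (accuracy as the fraction of correct decisions, false alarm rate as false positives over all non-fire samples), the two metrics could be computed as follows; the function name and the counts are illustrative.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (accuracy, false_alarm_rate) from confusion-matrix counts.

    Assumed definitions: accuracy = (TP + TN) / total,
    false alarm rate = FP / (FP + TN).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    negatives = fp + tn
    false_alarm_rate = fp / negatives if negatives else 0.0
    return accuracy, false_alarm_rate


# Hypothetical evaluation: 100 fire clips and 100 non-fire clips.
accuracy, false_alarm_rate = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={accuracy:.3f}, false alarm rate={false_alarm_rate:.3f}")
```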
In the realm of video processing and analysis, accurate prediction of future frames is crucial in applications like video compression, anomaly detection and augmented reality. This paper introduces a novel approach th...
ISBN: (print) 9781510673878; 9781510673861
This paper tackles the problem of mixed Gaussian and impulsive noise suppression in color images. The proposed method comprises two essential steps. First, we detect impulsive noise using the concept of a digital path that explores the local pixel neighborhood. Each pixel is assigned the cost of a path connecting the boundary of a local processing window with its center. When even the lowest-cost path to the central pixel has a high cost, that pixel is identified as an impulse. To achieve this, we use a thresholding procedure for detecting corrupted pixels. Analyzing the distribution of minimum path costs, we employ the k-means technique to classify pixels into three distinct categories: those nearly undistorted, those corrupted by Gaussian noise, and those affected by impulsive noise. Subsequently, we restore the impulsive pixels with the Laplace interpolation technique, a fast and effective method that yields satisfactory denoising results. In the second step, we address the residual Gaussian noise using the Non-Local Means method, which selectively considers pixels from the local window that have not been flagged as impulsive. The experimental results confirm that our proposed hybrid method consistently yields superior outcomes compared to state-of-the-art denoising techniques. Moreover, its computational complexity remains low, rendering it suitable for real-time applications.
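The abstract does not define the path cost, so the following is only a sketch of the minimum-path-cost idea under assumptions: a grayscale window stands in for a color one, paths are 8-connected, and the cost of a path is the sum of absolute intensity differences between consecutive pixels. The function `min_path_cost` and the example windows are hypothetical.

```python
import heapq


def min_path_cost(window):
    """Lowest cost of an 8-connected path from the window center to its boundary.

    Assumed cost: sum of absolute intensity differences between consecutive
    pixels on the path. The window is a square list of lists with odd size.
    """
    size = len(window)
    center = size // 2
    best = {(center, center): 0}
    heap = [(0, center, center)]
    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > best[(r, c)]:
            continue  # stale queue entry
        if r in (0, size - 1) or c in (0, size - 1):
            return cost  # first boundary pixel popped is the cheapest to reach
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                new_cost = cost + abs(window[nr][nc] - window[r][c])
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(heap, (new_cost, nr, nc))
    return 0


# A flat 5x5 patch, and the same patch with an impulse at its center.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 255

print(min_path_cost(clean), min_path_cost(noisy))
```

Under this cost, a pixel lying on an edge still has a cheap path along its own side of the edge, whereas an isolated impulse is expensive to reach from every direction. This is the property a threshold or a k-means split over the minimum costs would rely on.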
ISBN: (print) 9781665405409
Robotic surgery requires endoscope 3D tracking to navigate the endoscope in the body. This paper proposes an accurate multiscale selective fusion framework that registers 2D endoscopic video images to 3D pre-operative CT data for endoscope 3D tracking. Current video-based 3D tracking depends on the performance of the 2D-3D fusion procedure, which suffers from inaccurate similarity measures and image uncertainties. To boost video-based 3D tracking, we develop a multiscale selective similarity characterization that enhances the 2D-3D fusion procedure. This fusion not only uses image pyramids at multiple scales to represent endoscopic images but also selects specific structure information from these multiscale images to compute the similarity. We validated our method on clinical data. It reduces the tracking error from 8.9 to 5.4 mm without using any external trackers, while providing surgeons with robust real-time surgical 3D tracking.
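As a rough illustration of computing similarity over an image pyramid, the sketch below builds each coarser level by 2x2 averaging and averages a normalized cross-correlation score across levels. It deliberately omits the paper's selection of structure information, which the abstract does not specify; the choice of normalized cross-correlation, the function names, and the toy images are all assumptions.

```python
import math


def downsample(img):
    """Halve resolution by averaging each 2x2 block (even dimensions assumed)."""
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
            for c in range(0, len(img[0]) - 1, 2)
        ]
        for r in range(0, len(img) - 1, 2)
    ]


def ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def multiscale_similarity(a, b, levels=2):
    """Average the per-level similarity over an image pyramid."""
    scores = []
    for _ in range(levels):
        scores.append(ncc(a, b))
        a, b = downsample(a), downsample(b)
    return sum(scores) / len(scores)


# Toy 4x4 "video frame" and a contrast-inverted "rendered CT view".
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
inverted = [[15 - v for v in row] for row in frame]

print(multiscale_similarity(frame, frame), multiscale_similarity(frame, inverted))
```

In a registration loop, a score like this would be evaluated between the live endoscopic frame and views rendered from CT at candidate camera poses, keeping the pose with the highest similarity.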
To address the slow localization and poor real-time performance of pointer instrument detection algorithms on edge devices, this paper proposes a pointer instrument video detection method based on impro...
With the rise of marine exploration, underwater imaging has gained significant attention as a research topic. Underwater video enhancement has become crucial for real-time computer vision tasks in marine exploration....