Equipment health monitoring (EHM) techniques are increasing in their ability to accurately diagnose defective equipment. This increase in capability comes with an increase in computational complexity. For these techni...
With the advancement of technology and due to the recent pandemic situation, the education sector has turned to the online teaching method. But the main problem here is the inconvenience and irregularities in the stud...
Mixed reality technologies provide real-time and immersive experiences, which bring tremendous opportunities in entertainment, education, and enriched experiences that are not directly accessible owing to safety or *** research in this field has been in the spotlight in the last few years as the metaverse went *** recently emerging omnidirectional video streams, i.e., 360° videos, provide an affordable way to capture and present dynamic real-world *** the last decade, fueled by the rapid development of artificial intelligence and computational photography technologies, research interest in mixed reality systems using 360° videos with richer and more realistic experiences has increased dramatically to unlock the true potential of the *** this survey, we cover recent research aimed at addressing the above issues in 360° image and video processing technologies and applications for mixed *** survey summarizes the contributions of the recent research and describes potential future research directions about 360° media in the field of mixed reality.
Real-time image processing involves the transformation of incoming signals, primarily from a camera, into a format that can be readily interpreted by a display device. This process is heavily reliant on precise timing...
Because traditional methods ignore tampering in the audio modality, this study explores a deep forgery video detection technique that improves detection precision and reliability by fusing lip images and audio signals. The main method is lip-audio matching detection based on a Siamese neural network, combined with MFCC (Mel Frequency Cepstrum Coefficient) feature extraction using band-pass filters, an improved dual-branch Siamese network structure, and a two-stream network design. First, the video stream is preprocessed to extract lip images, and the audio stream is preprocessed to extract MFCC features. Then, these features are processed separately by the two branches of the Siamese network. Finally, the model is trained and optimized through fully connected layers and loss functions. Experimental results show that on the LRW (Lip Reading in the Wild) dataset the model reaches a testing accuracy of 92.3%, a recall of 94.3%, and an F1 score of 93.3%, significantly better than CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) models. In the validation of multi-resolution image streams, the highest accuracy of dual-resolution image streams reaches 94%. Band-pass filters effectively improve the signal-to-noise ratio of deep forgery video detection when processing different types of audio signals. The real-time processing performance of the model is also excellent, and it achieves an average score of up to 5 in user research. These results demonstrate that the proposed method can effectively fuse visual and audio information in deep forgery video detection and accurately identify inconsistencies between video and audio, verifying that lip-audio modality fusion improves detection performance.
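The matching step of a two-branch Siamese design can be illustrated with a minimal sketch. The abstract does not name its loss function or decision rule, so the contrastive loss, the `margin` value, and the distance `threshold` below are assumptions chosen for illustration, and the embeddings stand in for the outputs of the lip-image and MFCC branches.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, is_match, margin=1.0):
    """Contrastive loss (an assumed choice, not confirmed by the abstract).

    Matched lip/audio pairs are pulled together; mismatched pairs are
    pushed apart until their distance exceeds `margin`.
    """
    if is_match:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def is_consistent(lip_embedding, audio_embedding, threshold=0.5):
    """Flag a clip as consistent when the two embeddings are close.

    `threshold` is a hypothetical value; in practice it would be tuned
    on validation data.
    """
    return euclidean_distance(lip_embedding, audio_embedding) < threshold


if __name__ == "__main__":
    lip = [0.10, 0.20, 0.30]
    audio_real = [0.12, 0.18, 0.31]   # close to the lip embedding
    audio_fake = [0.90, -0.40, 0.70]  # far from the lip embedding
    print(is_consistent(lip, audio_real))
    print(is_consistent(lip, audio_fake))
```

A small distance between the two branch outputs is read as audio-visual agreement; a large one suggests that the lips and the audio do not belong together.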
ISBN: (print) 9781510673854; 9781510673847
In the era of rapidly expanding image data, the demand for improved image compression algorithms has grown significantly, particularly with the integration of deep learning approaches into traditional image processing tasks. However, many existing solutions in this domain are burdened by computational complexity, rendering them unsuitable for real-time deployment on standard devices because they often require complex systems and substantial energy consumption. This work addresses the growing paradigm of edge computing for real-time applications by introducing a novel on-edge device solution. The approach aims to strike a balance between efficiency and accuracy while adhering to the practical constraints of real-world deployment. By demonstrating the proposed solution's performance on readily available devices, we provide tangible evidence of its applicability and viability in real-world scenarios. This advance contributes to the ongoing dialogue about the need for accessible and efficient image compression algorithms that can be deployed in real-time applications on edge devices, bridging the gap between the demanding computational requirements of deep learning and the practical limitations of everyday hardware. As data continues to surge, solutions like this become ever more critical in ensuring effective image compression, aligning with on-edge computing within AI. This research paves the way for improved image processing in real-time applications while conserving computational resources and energy.
Distortions like blur and smoke in real-time laparoscopic videos often result from lens contamination. Detecting these distortions automatically and "in real time" is a step preceding automatic lens cleaning ...
ISBN: (print) 9798331529543; 9798331529550
We previously implemented an inexpensive imaging system that combines a single real camera with a mirror array located along a paraboloid. It allows us to robustly acquire dynamic light fields composed of multi-view videos by providing a virtual camera array whose viewpoints lie in the mirrors. When the real camera is moved to the focus of the paraboloid, the virtual viewpoints in the mirrors become equally spaced, which yields multi-view imaging with structured disparity. In this paper, we discuss an efficient method for adjusting the pose of the single camera to acquire high-quality dynamic light fields as multi-view videos. Specifically, we introduce indicator values computed from the detected corners of the mirror array in the acquired images while the camera is being adjusted. These values tell us how to move the camera position, and the camera angle is corrected virtually through a homography transform. Simulation results demonstrate that the proposed method achieves structured light field video acquisition with equally spaced virtual viewpoints. Camera rotation, which would require complex devices, is not needed; only the camera position is controlled, by a simple 3D system such as XYZ stages.
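Correcting a camera angle "virtually" through a homography can be sketched as follows. The abstract does not give the homography it uses, so this example assumes the standard pure-rotation form H = K R K⁻¹ for a pinhole camera, with a hypothetical focal length `f`, principal point `(cx, cy)`, and a pan rotation about the vertical axis.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def pan_homography(f, cx, cy, theta):
    """Homography H = K * R_y(theta) * K^-1 induced by a pure pan rotation.

    f, cx, cy are assumed pinhole intrinsics in pixels; theta is in radians.
    """
    c, s = math.cos(theta), math.sin(theta)
    k = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    return matmul(matmul(k, r), k_inv)


def apply_homography(h, x, y):
    """Map pixel (x, y) through H and normalize the homogeneous coordinate."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


if __name__ == "__main__":
    h = pan_homography(f=800.0, cx=320.0, cy=240.0, theta=math.radians(2.0))
    print(apply_homography(h, 320.0, 240.0))
```

Because a rotation about the camera centre changes the image only by such a homography, the angle can be corrected by warping the acquired frames, which is consistent with the abstract's point that only the camera position needs physical control.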
Epilepsy, a prevalent neurological disorder, often leads to tonic-clonic seizures characterized by loss of consciousness and uncontrolled motor activity. Prompt detection of these seizures is crucial for effective nursing and diagnosis. This paper introduces a novel analysis that eliminates the need for body attachments or special equipment such as markers or specific clothing. Our approach is straightforward: each video frame is segmented into blocks, and the average value of each block is computed. We then analyze the temporal changes in these averages using spectrograms. Our findings indicate that during tonic-clonic seizures, dominant frequency components typically range from 1 to 6 Hz and decrease as the seizure progresses. Building on these clinical observations, we have formulated effective detection rules. Experimental evaluations show that our method not only accurately detects epileptic seizures but also runs approximately four times faster than real time on standard desktop computers. This efficiency and accuracy underscore the potential of our method as a practical tool in epilepsy monitoring and management.
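The block-averaging and frequency-analysis idea can be sketched in a few lines. This is a simplified illustration, not the paper's detection rules: it analyzes one window with a plain DFT instead of a full spectrogram, and the function names, block size, and frame rate are assumptions. Only the 1 to 6 Hz band comes from the abstract.

```python
import cmath
import math


def block_means(frame, block):
    """Average each block x block tile of a grayscale frame (list of rows)."""
    rows, cols = len(frame), len(frame[0])
    means = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            total = sum(frame[r + i][c + j]
                        for i in range(block) for j in range(block))
            means.append(total / (block * block))
    return means


def dominant_frequency(signal, fps):
    """Frequency in Hz of the strongest non-DC DFT bin of one window."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component
    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * fps / n


def in_seizure_band(freq_hz, low=1.0, high=6.0):
    """True when the frequency lies in the 1 to 6 Hz band from the abstract."""
    return low <= freq_hz <= high


if __name__ == "__main__":
    fps, seconds = 30, 2
    # Synthetic 2x2 frames whose brightness oscillates at 3 Hz.
    frames = [[[100 + 10 * math.sin(2 * math.pi * 3 * t / fps)] * 2] * 2
              for t in range(fps * seconds)]
    series = [block_means(f, 2)[0] for f in frames]
    freq = dominant_frequency(series, fps)
    print(freq, in_seizure_band(freq))
```

Sliding this window over time and tracking how the dominant frequency changes would approximate the spectrogram view described above, including the reported downward drift as a seizure progresses.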
Computer vision is a promising domain that focuses on emerging approaches, algorithms and technologies that give machines the computing capability to analyze visual data, such as image files, video files and real tim...