ISBN: (print) 9798350349405; 9798350349399
Transmission latency significantly affects users' quality of experience in real-time interaction and actuation. Because latency is in principle unavoidable, video prediction can be used to mitigate it and ultimately enable zero-latency transmission. However, most existing video prediction methods are computationally expensive and impractical for real-time applications. In this work, we therefore propose a real-time video prediction method for zero-latency interaction over networks, called IFRVP (Intermediate Feature Refinement Video Prediction). First, we propose three training methods for video prediction that extend frame interpolation models, using a simple convolution-only frame interpolation network based on IFRNet. Second, we introduce ELAN-based residual blocks into the prediction models to improve both inference speed and accuracy. Our evaluations show that the proposed models run efficiently and achieve the best trade-off between prediction accuracy and computational speed among existing video prediction methods. A demonstration movie is also provided at http://***/IFRVPDemo.
Multiple-valued logics (MVL) offer an abundance of operation functions that can be used for encryption. A reconfigurable MVL operator can perform all MVL functions with a universal circuit structure at high speed, on which a one-time-pad cryptosystem is expected to be built. However, we find that when the existing MVL encryption method is applied to video data, the color edges in the plaintext image remain visible in the ciphertext image, resulting in partial leakage of information. To solve this problem, we propose byte reorganization and random mask strategies, forming an improved MVL encryption method for video streaming. To verify the effectiveness of the method, we implement an FPGA-based experimental system that encrypts and decrypts real-time video streaming data. In this system, 16-quit reconfigurable quaternary logic operators are implemented to encrypt, decrypt, and derive keys. Either encryption or decryption takes only 34 clock cycles. The encryption and decryption modules can process streaming data at 6.21 Gbit/s, showing that the system has real-time processing capability. To demonstrate that our method is secure, we compare the improved MVL encryption method with existing image encryption methods on common security evaluation metrics. Experimental results show that our method eliminates the residual color edges and that the ciphertext exhibits good statistical properties.
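To make the idea of a one-time pad over quaternary digits concrete, the following is a minimal sketch. It does not reproduce the reconfigurable operators, byte reorganization, or random mask of the abstract above: mod-4 addition is swapped in here as one simple invertible quaternary operation, and all function names are hypothetical.

```python
import secrets


def to_base4(data: bytes) -> list[int]:
    """Split each byte into four base-4 digits, most significant digit first."""
    return [(b >> shift) & 0b11 for b in data for shift in (6, 4, 2, 0)]


def from_base4(digits: list[int]) -> bytes:
    """Pack groups of four base-4 digits back into bytes."""
    out = bytearray()
    for i in range(0, len(digits), 4):
        d0, d1, d2, d3 = digits[i:i + 4]
        out.append((d0 << 6) | (d1 << 4) | (d2 << 2) | d3)
    return bytes(out)


def random_key(n_bytes: int) -> list[int]:
    """Draw one uniformly random base-4 key digit per plaintext digit."""
    return [secrets.randbelow(4) for _ in range(4 * n_bytes)]


def encrypt(plain: bytes, key: list[int]) -> bytes:
    """Digit-wise mod-4 addition (an assumed stand-in for an MVL operator)."""
    digits = to_base4(plain)
    if len(key) != len(digits):
        raise ValueError("a one-time pad needs exactly one key digit per data digit")
    return from_base4([(p + k) % 4 for p, k in zip(digits, key)])


def decrypt(cipher: bytes, key: list[int]) -> bytes:
    """Digit-wise mod-4 subtraction, the inverse of encrypt()."""
    digits = to_base4(cipher)
    if len(key) != len(digits):
        raise ValueError("a one-time pad needs exactly one key digit per data digit")
    return from_base4([(c - k) % 4 for c, k in zip(digits, key)])
```

Any operation that is invertible for every key digit could replace the addition; the sketch only shows why decryption recovers the plaintext when the same key digits are reused in the inverse operation.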
Research and development on dehazing algorithms has come a long way, and current algorithms are very effective at generating clear dehazed images, restoring images whose contrast is impaired by aerosols in the atmosphere. However, these algorithms do not work well on video sequences of hazy scenes because of the time they take, which makes them unsuitable for real-time applications. In this paper, a real-time video dehazing technique is proposed with a novel haze parameter, 'SATVAL', which is the ratio of the maximum saturation to the maximum value of an RGB image, applied to the image scattering model while processing a few video frames per second. A frame whose 'SATVAL' ratio is below a threshold value is dehazed; otherwise it is passed through without dehazing. This allows the dehazed video sequence to be produced accurately in real time, comparable to other contemporary methods. A portable "Raspberry Pi model 4B" is used for validation, either video-on-board or through a remote server, displaying on an LCD screen. Extensive experimental studies have been carried out to test the effectiveness of the method at both the hardware and software levels, in qualitative and quantitative comparison with four existing methods. MSE, SSIM, correlation, PSNR, and FPS are the evaluation parameters, and they show promising output with high-quality video in real time. Finally, ten video datasets have been developed for the successful implementation of this method.
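The frame-gating step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: saturation and value are taken from the usual HSV definitions, the threshold of 0.5 is a hypothetical placeholder, and the function names are invented for the example.

```python
import numpy as np


def satval(rgb: np.ndarray) -> float:
    """Ratio of maximum HSV saturation to maximum HSV value for an H x W x 3 uint8 frame."""
    x = rgb.astype(np.float64) / 255.0
    value = x.max(axis=2)                     # HSV value: per-pixel channel maximum
    minimum = x.min(axis=2)
    safe_value = np.where(value > 0, value, 1.0)
    # HSV saturation, defined as 0 for black pixels to avoid dividing by zero
    saturation = np.where(value > 0, (value - minimum) / safe_value, 0.0)
    max_value = value.max()
    if max_value == 0:
        return 0.0                            # completely black frame
    return float(saturation.max() / max_value)


def needs_dehazing(rgb: np.ndarray, threshold: float = 0.5) -> bool:
    """Gate a frame: below the (assumed) threshold it is sent to the dehazer."""
    return satval(rgb) < threshold
```

A washed-out gray frame has near-zero saturation and a high value, so its ratio is low and it is routed to dehazing, while a frame containing vivid colors passes through unchanged. Skipping clear frames is what saves processing time per second of video.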
This study presents a new technique for translating subtitles in sports events, addressing the challenges of real-time translation with improved accuracy and efficiency. Unlike standard methods, which often produce delayed or inaccurate subtitles, the proposed method integrates advanced annotation techniques and machine learning algorithms to improve subtitle recognition and extraction. The annotation techniques in this study include systematically labeling spoken elements such as commentary and dialogue, enabling accurate subtitle recognition and real-time adjustments in live sports broadcasts to ensure both accuracy and contextual relevance. These ideas allow seamless adjustment to multiple language types, including the voices of commentators, off-site hosts, and athletes, while preserving critical information within strict word count limits. Key improvements include faster processing times and increased translation precision, which are crucial for the dynamic environment of live sports broadcasts. The study builds on past work in audiovisual translation, tailoring its strategy specifically to the unique demands of sports media. By emphasizing the importance of clear and contextually appropriate real-time subtitles, this research presents significant advances over existing methods and provides valuable insights for future translation projects in sports and similar contexts. The results contribute to a more effective subtitle translation framework, enhancing accessibility and the viewing experience for audiences during live sports events.
Conventional methods that merge multiple images with different exposure levels often suffer from blur and ghosting due to object movement. Existing ghosting removal algorithms are usually complex and slow, making them unsuitable for real-time video applications. To address this challenge, on an FPGA. An IMX662 image sensor is employed, which simultaneously captures both HCG and LCG images with the same exposure time, enabling efficient HDR image synthesis. The proposed method directly addresses the source of the problem, eliminating the need for post-processing steps and thereby preserving algorithmic simplicity. Experimental results reveal that the proposed method not only removes ghosting by 100% but also processes data on an FPGA 98.79% faster than traditional software-based HDR fusion techniques, enabling real-time video stream processing. This dual-gain, ghosting-free fusion algorithm demonstrates promising potential for use in high-speed photography and surveillance.
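As a rough illustration of why simultaneous dual-gain capture needs no deghosting step, here is a minimal fusion sketch. The abstract does not state the fusion rule, so the rule below (keep the HCG pixel unless it is saturated, otherwise use the LCG pixel rescaled by the gain ratio) is an assumption, as are the function name and the `gain_ratio` and `sat_level` parameters.

```python
import numpy as np


def fuse_dual_gain(hcg: np.ndarray, lcg: np.ndarray,
                   gain_ratio: float, sat_level: float) -> np.ndarray:
    """Merge same-exposure HCG and LCG frames into one linear HDR frame."""
    hcg = hcg.astype(np.float64)
    lcg = lcg.astype(np.float64)
    # Bring the low-gain frame onto the high-gain scale (assumed linear sensor response)
    lcg_scaled = lcg * gain_ratio
    # Per pixel: trust HCG in shadows and midtones, fall back to LCG where HCG clips
    return np.where(hcg < sat_level, hcg, lcg_scaled)
```

Because both inputs come from the same instant and the same exposure time, every pixel pair shows the same scene content, so a per-pixel selection like this one has no motion mismatch to correct.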
ISBN: (print) 9798331529543; 9798331529550
Reducing the huge computational complexity of intra mode decision is the key to real-time Versatile Video Coding (VVC). This paper proposes a fast intra mode decision scheme that uses lightweight machine learning (ML) models to classify intra modes into fifteen clusters. The selected cluster is then refined using one of three proposed strategies to choose the best mode. Our experimental results with the fastest configuration of the practical uvg266 encoder show that the proposed methods yield a competitive rate-distortion-complexity trade-off compared with conventional rough mode decision (RMD). To the best of our knowledge, this is the first work to successfully reduce the complexity of RMD in a practical VVC encoder using ML techniques.
[...]in recent years, there has been much research focus on reducing the complexity of Deep Learning models and essentially improving their speed while preserving their accuracy.
[...]a very fundamental and hot topic is the application of such models in image and video processing tasks such as remote sensing.
[...]by taking inertial measurement error and the motion model’s error with respect to the coordinate, the coordinate variation is corrected.
[...]the method is parallelized to achieve further reduction of processing time.
ISBN: (print) 9781450385886
The proceedings contain 22 papers. The topics discussed include: realtime hand gesture recognition in industry; epidemic prevention system based on voice recognition combined with intelligent recognition of mask and helmet; activity recognition in industrial environment using two layers learning; a new method of specific emitter feature extraction based on IQ imbalance; mixup augmentation for deep hashing; multi-resolution Gabor descriptor for corrosion detection in pipeline video sequences; image deep steganography detection based on knowledge distillation in teacher-student network; a multi-scale framework for visual grounding; a comparison of three swarm-based optimization algorithms in wind turbine radar clutter micro-motion parameters estimation; the influence of accounting information system quality and human resource competency on information quality; measurement and analysis of electrophysiological propagation on the cardiac slice-based biosensor; and improve the field-of-view of cameras: consideration on the micro lens array.
Underwater images often suffer from serious color bias and blurred features because of the effect of the water bodies on the light. To enhance underwater images, we present SU-DDPM, a method of real-time underwater image enhancement (UIE) based on a denoising diffusion probabilistic model (DDPM). SU-DDPM outperforms other baseline and generative adversarial network models in underwater image enhancement, thus establishing a new state-of-the-art baseline. SU-DDPM processes images more rapidly than the diffusion model, which makes it competitive with other deep learning-based methods. We demonstrate that if conditional DDPM is used directly for the UIE task, the processing speed is slow, and the enhanced images are of poor quality and show color bias. The quality of the enhanced image is improved by combining the degraded image with the reference image in the diffusion stage to create a fusion-DDPM model. The specificity of the UIE task allows us to accelerate the inference process by changing the initial sampling distribution and reducing the number of iterations in the denoising stage of the model. We evaluate SU-DDPM on the UIE task using challenging real underwater image datasets and a synthetic image dataset and compare it to state-of-the-art models. SU-DDPM ensures increased enhancement quality, and enhancement processing speed is comparable to the speed of real-time enhancement models.
Healthcare monitoring depends on the accuracy of physiological parameters measured in real time, given the ongoing increase in the number of patients relative to the limited number of physicians. Imaging photoplethysmography (IPPG) is one of the emerging non-invasive techniques for measuring vital signs, including oxygen saturation (SpO2), heart rate (HR), and respiratory rate (RR). This work presents a comprehensive sensitivity analysis to evaluate the impact of critical acquisition parameters: (1) image resizing, from 100% down to 2%, (2) the region of interest (ROI) within the images, and (3) acquisition duration, from 5 s to 30 s, using image sequences obtained at 30 frames per second. To evaluate and validate the performance of the system, the study includes several mouse examinations aimed at enhancing both precision and consistency in real-time monitoring. The analysis reveals how image resizing influences signal integrity, image resolution, and processing efficiency, which is crucial for resource-limited applications. The ROI selection analysis identifies the key regions that optimize the accuracy of the measured vital signs, while the evaluation of acquisition duration provides insight into the minimum duration needed for reliable vital sign measurement. This comprehensive analysis advances the current state of the art and addresses previously overlooked but important factors, offering a robust framework for effective real-time monitoring in research and medical applications.
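The roles of the ROI and the acquisition duration can be illustrated with a minimal IPPG-style sketch: average one color channel over the ROI in every frame, then take the strongest spectral peak within a physiological band as the rate. The abstract does not describe the processing chain, so the green-channel choice, the band limits, and both function names are assumptions made for illustration.

```python
import numpy as np


def roi_signal(frames: np.ndarray, roi: tuple[int, int, int, int]) -> np.ndarray:
    """Mean green-channel intensity inside the ROI for each frame.

    frames has shape (T, H, W, 3); roi is (top, bottom, left, right) in pixels.
    """
    top, bottom, left, right = roi
    return frames[:, top:bottom, left:right, 1].mean(axis=(1, 2))


def dominant_rate_bpm(signal: np.ndarray, fps: float,
                      band_hz: tuple[float, float]) -> float:
    """Return the strongest frequency inside band_hz, in cycles per minute."""
    centered = signal - signal.mean()          # remove the DC component
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(len(centered), d=1.0 / fps)
    mask = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    if not mask.any():
        raise ValueError("recording too short to resolve the requested band")
    peak_hz = freqs[mask][np.argmax(spectrum[mask])]
    return float(peak_hz * 60.0)
```

The frequency resolution of this estimate is `fps / T` in hertz, so at 30 frames per second a 5 s recording resolves the rate only to within 12 cycles per minute, while a 30 s recording resolves it to within 2. That is one reason acquisition duration matters for reliability.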