ISBN: 9781510674660 (print)
The proceedings contain 21 papers. The topics discussed include: calibration method for driving behavior parameters in simulation model of asphalt pavement with different gradations; data-driven gradient mapping adjustment method for color optimization in digital illustrations; study on the visual appeal of character elements in digital human guides in tourism; DL-assisted tradeoff design for multi-UAV real-time video streaming transmission; visual analysis of research hotspots and development trends in industrial software; control priorities of signal on-ramps metering based on chaos state identification; two key factors in handwriting analysis for personality prediction; and transformer-based vehicle re-identification with multiple details.
ISBN: 9781450398381 (print)
The proceedings contain 28 papers. The topics discussed include: human activity role identification using feature vector and encoding techniques on natural language sentences; access control, biometrics, and the future; real-time privacy preserving human activity recognition on mobile using 1DCNN-BiLSTM deep learning; empirical study of emotional state recognition using multimodal fusion framework; monitoring application-driven continuous affect recognition from video frames; a human body part semantic segmentation enabled parsing for human pose estimation; automatic segmentation of pneumothorax in chest radiographs based on dual-task interactive learning method; transformer for hyperspectral image classification based on multi-feature learning; optimal parameter values for estimating rotational eye movement by using vascular images in the white part of the eyeball; and a computational investigation on precision autism and metabolic disorders: predictive machine learning for hepatic ailment classification.
ISBN: 9781450387415 (print)
The proceedings contain 34 papers. The topics discussed include: two-stage attention-based fusion neural network for image-text sentiment classification; a two-stage refinement network for nuclei segmentation in histopathology images; a novel method for extracting frequency line on Lofargram based on feature function; low-light image enhancement via transformer-based network; a fragile watermarking algorithm based on multiple watermarking; performance of domain adaptation schemes in video action recognition using synthetic data; two-stage self-supervised learning for facial action unit recognition; deep learning-based real-time activity recognition with multiple inertial sensors; fast image super-resolution based on multi-feature extraction network; and a study on the effect of video resolution on the quality of sound recovered using the visual microphone.
ISBN: 9781450385886 (print)
The proceedings contain 22 papers. The topics discussed include: realtime hand gesture recognition in industry; epidemic prevention system based on voice recognition combined with intelligent recognition of mask and helmet; activity recognition in industrial environment using two layers learning; a new method of specific emitter feature extraction based on IQ imbalance; mixup augmentation for deep hashing; multi-resolution Gabor descriptor for corrosion detection in pipeline video sequences; image deep steganography detection based on knowledge distillation in teacher-student network; a multi-scale framework for visual grounding; a comparison of three swarm-based optimization algorithms in wind turbine radar clutter micro-motion parameters estimation; the influence of accounting information system quality and human resource competency on information quality; measurement and analysis of electrophysiological propagation on the cardiac slice-based biosensor; and improve the field-of-view of cameras: consideration on the micro lens array.
ISBN: 9798350349405; 9798350349399 (print)
This paper presents a real-time semantic video communication method for general scenes, combining lossy semantic map coding with motion compensation to achieve reduced bit rates while maintaining perceptual and semantic quality. Our findings show that semantic image synthesis effectively adapts to minute errors resulting from motion estimation, eliminating the need to transmit the residuals. We recommend the Group of Pictures approach as a more efficient alternative. Comparative assessments against HEVC and VVC confirm the method's effectiveness. This research paves the way for efficient real-time semantic video communication, addressing the demands of data-intensive visual applications.
ISBN: 9798331529543; 9798331529550 (print)
This paper presents a demonstration setup for our open-source intra encoder called uvgVPCCenc, which is optimized for real-time Video-based Point Cloud Compression (V-PCC). uvgVPCCenc achieves an average encoding speed of 26 frames per second (fps) on an Intel i7-12700 CPU when encoding volumetric video sequences with up to 185 000 points per frame. It is shown to be 700 times as fast as the TMC2 reference implementation for V-PCC. Our work is the first to demonstrate real-time intra V-PCC encoding on a consumer-grade desktop computer. It indicates that even the immense computational complexity of intra V-PCC encoding can be tackled for practical applications with effective design and optimization techniques.
ISBN: 9798350367331; 9798350367348 (print)
In this paper, we present a successful real-time implementation of a local maxima filter on Zybo Z7-20 and PYNQ Z1 FPGA boards using their two HDMI ports. The proposed system uses the HDMI ports to capture video frames with a resolution of 640x480 pixels. The local maxima filter is then applied to the captured frames in real time, allowing for the detection of peaks in the image data. The filter uses a sliding window approach to determine the local maxima, and a threshold value is set to identify and retain only the most significant peaks in the image. The system was implemented in the SystemVerilog hardware description language (HDL) and developed in the Xilinx Vivado design suite. The results show that the proposed system is able to process video frames at a rate of 60 frames per second with high accuracy and low latency. The proposed implementation using SystemVerilog presents a more efficient and flexible solution for image processing applications on FPGA, making it a promising approach for real-time image processing.
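As an illustration only, the sliding-window local maxima filter with a threshold described in this abstract can be sketched in software as follows. This is not the authors' SystemVerilog design: the function name `local_maxima`, the default 3x3 window, the strict-maximum rule (a pixel must exceed every neighbour in its window), and the clipping of the window at the frame border are all assumptions made for the example.

```python
def local_maxima(image, window=3, threshold=0):
    """Return (row, col) positions of thresholded local maxima.

    Hypothetical software sketch: a pixel is kept when its value is at
    least `threshold` and strictly greater than every other pixel inside
    the window centred on it. The window is clipped at the frame border.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    half = window // 2
    rows = len(image)
    cols = len(image[0]) if rows else 0
    peaks = []

    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            # Threshold step: discard insignificant pixels early.
            if value < threshold:
                continue
            # Sliding-window step: compare against all neighbours.
            is_peak = True
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    if (rr, cc) != (r, c) and image[rr][cc] >= value:
                        is_peak = False
                        break
                if not is_peak:
                    break
            if is_peak:
                peaks.append((r, c))
    return peaks


# Small made-up frame with two isolated peaks of different strength.
frame = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 5, 0],
    [0, 0, 0, 0, 0],
]
strong_peaks = local_maxima(frame, window=3, threshold=6)
all_peaks = local_maxima(frame, window=3, threshold=1)
```

Raising the threshold keeps only the stronger peak, and widening the window suppresses the weaker peak when a stronger one falls inside its neighbourhood. Because of the strict comparison, flat plateaus yield no peaks in this sketch; a hardware design may resolve ties differently.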
ISBN: 9798350349405; 9798350349399 (print)
Transmission latency significantly affects users' quality of experience in real-time interaction and actuation. As latency is inevitable in principle, video prediction can be utilized to mitigate the latency and ultimately enable zero-latency transmission. However, most of the existing video prediction methods are computationally expensive and impractical for real-time applications. In this work, we therefore propose real-time video prediction towards zero-latency interaction over networks, called IFRVP (Intermediate Feature Refinement Video Prediction). Firstly, we propose three training methods for video prediction that extend frame interpolation models, where we utilize a simple convolution-only frame interpolation network based on IFRNet. Secondly, we introduce ELAN-based residual blocks into the prediction models to improve both inference speed and accuracy. Our evaluations show that our proposed models perform efficiently and achieve the best trade-off between prediction accuracy and computational speed among the existing video prediction methods. A demonstration movie is also provided at http://***/IFRVPDemo.
ISBN: 9798331529543; 9798331529550 (print)
Reducing the huge computational complexity of intra mode decision is the key to real-time Versatile Video Coding (VVC). This paper proposes a fast intra mode decision scheme that takes advantage of lightweight machine learning (ML) models to classify intra modes into fifteen clusters. The cluster is further refined using one of the three proposed strategies to select the optimal mode. Our experimental results with the fastest configuration of the practical uvg266 encoder show that the proposed methods yield a competitive rate-distortion-complexity trade-off over a conventional rough mode decision (RMD). To the best of our knowledge, this is the first work to successfully reduce the complexity of RMD in a practical VVC encoder with the use of ML techniques.
ISBN: 9798331529543; 9798331529550 (print)
This study investigates the practical performance of neural-network post-filters standardized in ITU-T H.274. We implement neural-network models on a Field-Programmable Gate Array (FPGA), allowing real-time processing of 4K 60fps encoded videos transmitted via 12G-SDI. Experimental results suggest that a minor bitrate increase for the transmission of the neural-network model weights can enhance the quality of the videos encoded by Versatile Video Coding (VVC).