ISBN: 9798331529543 (print)
The proceedings contain 139 papers. The topics discussed include: cross-device image saliency detection: database and comparative analysis; performance evaluation of feature detectors and descriptors with close-range solar panel images; inter submesh border information coding with skip mode in V-DMC; advancements in lenslet video coding: insights from MPEG LVC; advanced learning-based inter prediction for future video coding; packed regions information SEI message; content-adaptive rate-quality curve prediction model in media processing system; deep reinforcement learning-based camera autofocus with Gaussian process regression; and frame similarity-based screen content video quality enhancement via adaptive long short-term fusion.
ISBN: 9798350359855 (print)
The proceedings contain 153 papers. The topics discussed include: towards efficient learned image coding for machines via saliency-driven rate allocation; transformer-based spatial-temporal feature lifting for 3D hand mesh reconstruction; accuracy improvement of depth map estimation from multi-view images using NeRF; learning end-to-end depth maps compression with conditional quality-controllable autoencoder; tangent space sampling of video sequence with locally structured unitary network; efficient lightweight attention based learned image compression; a method for multi-linear TV channels streaming based on non-uniform tiled structure; and subspace learning machine with soft partitioning (SLM/SP): methodology and performance benchmarking.
ISBN: 9798350388459 (print)
The proceedings contain 21 papers. The topics discussed include: from ray tracing to channel impulse responses: a review on the description of polarimetric time-invariant SISO channels; an efficient algorithm for scheduling aircraft landing problem; modeling and characterization of a compact in line filter with transmission zeros; from concept to implementation: lessons learned in designing and deploying a visible light positioning system; designing an augmented reality teaching module for power consumption in FPGAs; navigating the future: digital twin in maritime industry; measurement of a baby dummy in a car for child presence detection; advancing automotive connectivity: new technologies and security considerations; digital twins to monitor IoT devices for green transformation of university campus; and feasibility study of time synchronization solution for the bistatic synthetic aperture radar using mobile platforms.
ISBN: 9781665475921 (print)
The proceedings contain 113 papers. The topics discussed include: visual analysis motivated super-resolution model for image reconstruction; hierarchical reinforcement learning based video semantic coding for segmentation; distinguishing computer-generated images from photographic images: a texture-aware deep learning-based method; high-speed scene reconstruction from low-light spike streams; one shot object detection via hierarchical adaptive alignment; reduced reference quality assessment for point cloud compression; a fast and effective framework for camera calibration in sport videos; dynamic mesh commonality modeling using the cuboidal partitioning; CNN-based post-processing filter for video compression with multi-scale feature representation; history-parameter-based affine model inheritance; robust dynamic background modeling for foreground estimation; space and level cooperation framework for pathological cancer grading; and semantic attribute guided image aesthetics assessment.
ISBN: 9781728185514 (print)
The proceedings contain 134 papers. The topics discussed include: large-scale crowdsourcing subjective quality evaluation of learning-based image coding; alpha-trimmed mean filter and XOR based image enhancement for embedding data in image; faster and finer pose estimation for object pool in a single RGB image; MPEG immersive video tools for light field head mounted displays; urban planter: a web app for automatic classification of urban plants; SPCNet: a panoramic image depth estimation method based on spherical convolution; attention-guided convolutional neural network for lightweight JPEG compression artifacts removal; and enhanced cross component sample adaptive offset for AVS3.
ISBN: 9781728180670 (print)
The proceedings contain 138 papers. The topics discussed include: the enhancement of underexposed images with blurred reflectance; geodesic disparity compensation for inter-view prediction in VR180; two recent advances on normalization methods for deep neural network optimization; sparse representation-based intra prediction for lossless/near lossless video coding; recent advances in end-to-end learned image and video compression; FishUI: interactive fisheye distortion visualization; orthogonal features fusion network for anomaly detection; 4D-DCT hardware architecture for JPEG Pleno light field coding; and deep blind video quality assessment for user generated videos.
ISBN: 9798331529543; 9798331529550 (print)
Quanta image sensors are a novel paradigm in image sensor technology. Their direct application to quanta image sensors-based imaging systems is challenging because a bit-plane image is a set of binary images. In this paper, we introduce spatiotemporal priors based on the intensity invariance and smoothness characteristics of the motion vector. Specifically, we model the observation that when the image sequence is aligned with the correct motion vector, the spatiotemporal structure becomes more consistent. Moreover, the spatial smoothness prior is incorporated by applying a smoothing filter to the evaluation metrics of the motion vector candidates. The experimental results show that the proposed method is more effective than conventional methods.
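The candidate-evaluation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the consistency metric (squared temporal sum of the aligned binary frames), the box filter used as the smoothing step, the constant-velocity assumption, and all function names (`shift`, `box_filter`, `estimate_motion`) are assumptions made for illustration only.

```python
def shift(frame, dx, dy):
    """Return a copy of `frame` sampled at (x + dx, y + dy), zero-filled outside."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = frame[sy][sx]
    return out


def box_filter(grid, radius=1):
    """Mean filter over the in-bounds (2*radius+1)^2 neighbourhood of each pixel."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [grid[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def estimate_motion(frames, candidates, radius=1):
    """Pick, per pixel, the candidate (dx, dy) with the highest smoothed score.

    Assumed metric (hypothetical): frames are aligned to frame 0 under a
    constant velocity, summed over time, and squared. Along the correct
    trajectory the binary samples accumulate, so the score is large.
    """
    h, w = len(frames[0]), len(frames[0][0])
    best_score = [[-1.0] * w for _ in range(h)]
    best_mv = [[None] * w for _ in range(h)]
    for dx, dy in candidates:
        acc = [[0] * w for _ in range(h)]
        for t, frame in enumerate(frames):
            aligned = shift(frame, t * dx, t * dy)
            for y in range(h):
                for x in range(w):
                    acc[y][x] += aligned[y][x]
        score = [[v * v for v in row] for row in acc]
        # Spatial smoothness prior: smooth the per-candidate score map.
        smooth = box_filter(score, radius)
        for y in range(h):
            for x in range(w):
                if smooth[y][x] > best_score[y][x]:
                    best_score[y][x] = smooth[y][x]
                    best_mv[y][x] = (dx, dy)
    return best_mv
```

Calling `estimate_motion(frames, candidates)` with a list of equally sized binary frames and a list of integer `(dx, dy)` candidates returns a per-pixel map of the selected candidate. Pixels where every candidate scores zero keep the first candidate in the list.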
ISBN: 9798331529543; 9798331529550 (print)
Supported by powerful generative models, low-bitrate learned image compression (LIC) models utilizing perceptual metrics have become feasible. Some of the most advanced models achieve high compression rates and superior perceptual quality by using image captions as sub-information. This paper demonstrates that a large multi-modal model (LMM) can both generate captions and compress them within a single model. We also propose a novel semantic-perceptual-oriented fine-tuning method applicable to any LIC network, resulting in a 41.58% improvement in LPIPS BD-rate compared to existing methods. Our implementation and pre-trained weights are available at https://***/tokkiwa/imageTextCoding.
ISBN: 9798331529543; 9798331529550 (print)
This paper focuses on the Referring Image Segmentation (RIS) task, which aims to segment objects from an image based on a given language description and has significant potential in practical applications such as food safety detection. Recent work using the attention mechanism for cross-modal interaction has made excellent progress. However, current methods tend to lack explicit interaction design principles as guidelines, leading to inadequate cross-modal comprehension. Additionally, most previous works use a single-modal mask decoder for prediction, losing the advantage of full cross-modal alignment. To address these challenges, we present a Fully Aligned Network (FAN) that follows four cross-modal interaction principles. Guided by these principles, our FAN achieves state-of-the-art performance on the prevalent RIS benchmarks (RefCOCO, RefCOCO+, G-Ref) with a simple architecture.