ISBN (print): 9798350359855
The proceedings contain 153 papers. The topics discussed include: towards efficient learned image coding for machines via saliency-driven rate allocation; transformer-based spatial-temporal feature lifting for 3D hand mesh reconstruction; accuracy improvement of depth map estimation from multi-view images using NeRF; learning end-to-end depth maps compression with conditional quality-controllable autoencoder; tangent space sampling of video sequence with locally structured unitary network; efficient lightweight attention based learned image compression; a method for multi-linear TV channels streaming based on non-uniform tiled structure; and subspace learning machine with soft partitioning (SLM/SP): methodology and performance benchmarking.
ISBN (print): 9781728137230
The proceedings contain 134 papers. The topics discussed include: improving person re-identification performance using body mask via cross-learning strategy; privacy-preserving fall detection with deep learning on mmWave radar signal; stereoscopic image quality assessment weighted guidance by disparity map using convolutional neural network; depthwise separable convolutional neural network for image forensics; low resolution recognition of aerial images; fast QTMT partition decision algorithm in VVC intra coding based on variance and gradient; and adaptive CU Split decision with pooling-variable CNN for VVC intra encoding.
ISBN (print): 9781728180670
The proceedings contain 138 papers. The topics discussed include: the enhancement of underexposed images with blurred reflectance; geodesic disparity compensation for inter-view prediction in VR180; two recent advances on normalization methods for deep neural network optimization; sparse representation-based intra prediction for lossless/near lossless video coding; recent advances in end-to-end learned image and video compression; FishUI: interactive fisheye distortion visualization; orthogonal features fusion network for anomaly detection; 4D-DCT hardware architecture for JPEG Pleno light field coding; and deep blind video quality assessment for user generated videos.
ISBN (print): 9798331529543
The proceedings contain 139 papers. The topics discussed include: cross-device image saliency detection: database and comparative analysis; performance evaluation of feature detectors and descriptors with close-range solar panel images; inter Submesh border information coding with skip mode in V-DMC; advancements in Lenslet video coding: insights from MPEG LVC; advanced learning-based inter prediction for future video coding; packed regions information SEI message; content-adaptive rate-quality curve prediction model in media processing system; deep reinforcement learning-based camera autofocus with Gaussian process regression; and frame similarity-based screen content video quality enhancement via adaptive long short-term fusion.
ISBN (print): 9781479902903
The proceedings contain 132 papers. The topics discussed include: adaptive rounding operator for efficient Wyner-Ziv video coding; retina model inspired image quality assessment; color image guided locality regularized representation for Kinect depth holes filling; motion vector refinement for frame rate up conversion on 3D video; efficient active contour model based on Vese-Chan model and split Bregman method; HEVC interpolation filter architecture for quad full HD decoding; correlation estimation for distributed wireless video communication; enhancing coded video quality with perceptual foveation driven bit allocation strategy; soft mobile video broadcast based on side information refining; quality enhancement based on Retinex and pseudo-HDR synthesis algorithms for endoscopic images; and object co-segmentation based on directed graph clustering.
ISBN (print): 9781665475921
The proceedings contain 113 papers. The topics discussed include: visual analysis motivated super-resolution model for image reconstruction; hierarchical reinforcement learning based video semantic coding for segmentation; distinguishing computer-generated images from photographic images: a texture-aware deep learning-based method; high-speed scene reconstruction from low-light spike streams; one shot object detection via hierarchical adaptive alignment; reduced reference quality assessment for point cloud compression; a fast and effective framework for camera calibration in sport videos; dynamic mesh commonality modeling using the cuboidal partitioning; CNN-based post-processing filter for video compression with multi-scale feature representation; history-parameter-based affine model inheritance; robust dynamic background modeling for foreground estimation; space and level cooperation framework for pathological cancer grading; and semantic attribute guided image aesthetics assessment.
ISBN (print): 9781538644584
The proceedings contain 125 papers. The topics discussed include: two-stream federated learning: reduce the communication costs; a new update strategy for blocks with low correlation in 3-D recursive search; eye movement pattern modeling and visual comfort viewing S3D images; motion trajectory based spatial-temporal degradation measurement for video quality assessment; two-pass rate control for constant quality in high efficiency video coding; adaptive motion vector prediction for omnidirectional video; generative adversarial network-based frame extrapolation for video coding; a CNN-based in-loop filter with CU classification for HEVC; synthesizing 3D acoustic-articulatory mapping trajectories: predicting articulatory movements by long-term recurrent convolutional neural network; analysis of smoothed LHE methods for processing images with optical illusions; and deep network with spatial and channel attention for person re-identification.
ISBN (print): 9783031804373; 9783031804380
Text-to-image generation is a cutting-edge technology that enables computers to generate images from textual descriptions. While this technology has been extensively researched and applied to English-language text, its application to Arabic-language text is still in its early stages. Additionally, the Arabic language is challenging due to its right-to-left writing system and extensive vocabulary of 1.3 million words. In this paper, we explore text-to-image generation from Arabic text descriptions. First, we fine-tune a transformer-based model pre-trained on Arabic text to transform the text information into affine transformations within the DF-GAN generator. Second, we present a text transformer that incorporates LSTM layers to address the limitation of unrecognized words. Third, a mask predictor is trained within the generator using a weakly supervised method and incorporated into the affine transformation for a more effective integration of image and text features. In addition, we add the DAMSM loss as a regularization term to the loss function to achieve convergence and stability in the training phase. Experiments on two challenging datasets, CUB and Oxford-flower, show that our architectures can accurately generate high-quality images that faithfully represent the Arabic textual descriptions. We believe that scaling this task could have critical applications in fields such as Arabic visual learning, e-commerce, advertising, and entertainment.
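The mask-guided affine transformation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single linear layer per parameter, and the blending rule `m * (gamma * x + beta) + (1 - m) * x` are all assumptions chosen for illustration, and the abstract does not specify the exact formula. The idea shown is only that a text embedding predicts a per-channel scale and shift, and a spatial mask decides where that text-driven modulation is applied.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]
FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Plain linear layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def masked_affine(features: FeatureMap, text_emb: Vector,
                  w_gamma: Matrix, b_gamma: Vector,
                  w_beta: Matrix, b_beta: Vector,
                  mask: Matrix) -> FeatureMap:
    """Hypothetical text-conditioned affine modulation with a spatial mask.

    gamma and beta (one value per channel) are predicted from the text
    embedding. The mask (values in [0, 1], one per pixel) blends the
    modulated value with the original one. The blending rule is an
    assumption, not taken from the paper.
    """
    gamma = linear(w_gamma, b_gamma, text_emb)
    beta = linear(w_beta, b_beta, text_emb)
    out: FeatureMap = []
    for c, channel in enumerate(features):
        out.append([
            [m * (gamma[c] * v + beta[c]) + (1.0 - m) * v
             for v, m in zip(row, mask_row)]
            for row, mask_row in zip(channel, mask)
        ])
    return out


# Toy example: one channel, a 1x2 feature map, a 2-dimensional text embedding.
features = [[[1.0, 2.0]]]
text_emb = [1.0, 0.0]
mask = [[1.0, 0.0]]  # modulate the first pixel only
result = masked_affine(features, text_emb,
                       w_gamma=[[2.0, 0.0]], b_gamma=[0.0],
                       w_beta=[[0.0, 0.0]], b_beta=[1.0],
                       mask=mask)
print(result)  # [[[3.0, 2.0]]]
```

In the toy example the first pixel is fully modulated (2 * 1 + 1 = 3) while the second, where the mask is 0, passes through unchanged. In a real generator the mask and the linear layers would be learned jointly with the rest of the network.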
The process that produces written descriptions that effectively represent the meaning and context of an image is known as image captioning. To integrate visual and textual data, it needs to blend computer vision and n...
Image restoration is a classic foundational visual task, aimed at recovering damaged images, such as those affected by compression, blurring, or noise, to high-definition clarity. Although current image enhancement te...