ISBN: (print) 9781538644584
The proceedings contain 125 papers. The topics discussed include: two-stream federated learning: reduce the communication costs; a new update strategy for blocks with low correlation in 3-D recursive search; eye movement pattern modeling and visual comfort viewing S3D images; motion trajectory based spatial-temporal degradation measurement for video quality assessment; two-pass rate control for constant quality in high efficiency video coding; adaptive motion vector prediction for omnidirectional video; generative adversarial network-based frame extrapolation for video coding; a CNN-based in-loop filter with CU classification for HEVC; synthesizing 3D acoustic-articulatory mapping trajectories: predicting articulatory movements by long-term recurrent convolutional neural network; analysis of smoothed LHE methods for processing images with optical illusions; and deep network with spatial and channel attention for person re-identification.
ISBN: (print) 9781479902903
The proceedings contain 132 papers. The topics discussed include: adaptive rounding operator for efficient Wyner-Ziv video coding; retina model inspired image quality assessment; color image guided locality regularized representation for Kinect depth holes filling; motion vector refinement for frame rate up conversion on 3D video; efficient active contour model based on Vese-Chan model and split Bregman method; HEVC interpolation filter architecture for quad full HD decoding; correlation estimation for distributed wireless video communication; enhancing coded video quality with perceptual foveation driven bit allocation strategy; soft mobile video broadcast based on side information refining; quality enhancement based on retinex and pseudo-HDR synthesis algorithms for endoscopic images; and object co-segmentation based on directed graph clustering.
Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. In some cases this may also include the cover art, table of contents, copyright statements, title page or half-title pages, blank pages, venue maps, or other general information relating to the conference that was part of the original conference proceedings.
This conference proceedings deals with: H.264 video coding standard; MPEG-4; Croatian digital video broadcasting; multimedia streaming networks; digital signal processing; image data processing; ultrasonic imaging; color correlation; edge-preserving regularization; image reconstruction; computer vision; 3D-DCT compression; lossless fractal image coding; post-processing algorithm; video processing; mammogram; image registration; gray-scale images; Hotelling transform; photon-limited images; steerable pyramids; image retrieval; edge detection technique; querying; robust image matching algorithm; local expert networks; content-based mesh generation algorithm; texture recognition; audio classification; stereo image compression; mitochondrial genomic signals; fast Fourier transform; Hough transform; weighted vector median optimization; entropy vector filters; group signature scheme; impulsive noise reduction technique; multiprocessor systems; color image enhancement; nonparametric density estimation; video coding; programmable media processor; PDA; digital video software encoder; hybrid DWT-SVD image coding system; wavelet transform; error-resilient video codec; adaptive data mapping; MIMO systems; DSL communication; pose invariant face detection; 3D face recognition; MCM; facial recognition system; radar objection identification; image segmentation tool; maritime images; World Wide Web; video indexing; ridge polynomial network; pattern recognition; hypergraph representation; delay jitter analysis; network latency; broadband communication network; predictive speech coding; noisy medical audio signal; context dependent viseme model; voice driven animation; forward masking phenomenon; concatenative speech synthesis; data hiding; blind watermarking; T-codes; Internet; Bluetooth; WPAN; wireless ad hoc networks; direct spread sequence mobile communication system; virtual private networks; CATV broadband technologies.
Author:
Chen, Zhaoguo (College of Arts, Shandong Agricultural Engineering University, Jinan 250103, Shandong Province, China)
To fully harness the capabilities of computer graphics and image processing technologies and elevate the quality of visual communication design, this paper presents a comprehensive suite of innovative methodologies. F...
ISBN: (print) 9783031734762; 9783031734779
Zero-shot learning (ZSL) addresses the challenge of classifying unseen test images without explicit training on those samples. ZSL can identify and classify unlabeled images available in abundance by learning from visual and semantic embedding vectors (feature vectors). Information-enriched visual features extracted from images play a crucial role in ZSL. This paper proposes a hybrid feature approach that integrates low-level (LL) and high-level (HL) features extracted from images. Gray Level Co-occurrence Matrix (GLCM) and Gabor features are employed to obtain LL texture features, while HL features are derived from the ResNet-50 model, renowned for capturing complex hierarchical representations. These hybrid visual features are then mapped to semantic features using a linear mapping, where the semantic features are embedding vectors of labels generated by the fastText model. Experiments on the AWA2 and SUN datasets are conducted to evaluate the proposed approach's effectiveness. The hybrid feature approach demonstrates enhanced quality in zero-shot image classification, effectively classifying images that the model has not seen during training.
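The pipeline in this abstract (concatenate low-level and high-level features, then learn a linear map into the label-embedding space) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the feature arrays stand in for GLCM/Gabor and ResNet-50 outputs that are not computed here, and the ridge regularizer and cosine-similarity decision rule are assumptions, since the abstract only states that a linear mapping is used.

```python
import numpy as np


def hybrid_features(low_level, high_level):
    """Concatenate per-image low-level and high-level feature rows.

    Both inputs are (n_images, n_features) arrays; in the abstract these
    would come from GLCM/Gabor and ResNet-50, which are not computed here.
    """
    return np.concatenate([low_level, high_level], axis=1)


def fit_linear_map(X, S, reg=1e-6):
    """Fit W so that X @ W approximates S (ridge regression, an assumption).

    X: (n_images, d) visual features of seen-class images.
    S: (n_images, k) semantic embedding of each image's label.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ S)


def predict_unseen(X_test, W, unseen_embeddings):
    """Return, per test image, the index of the most similar unseen class.

    Similarity is cosine similarity between the projected visual feature
    and each unseen-class label embedding (an assumed decision rule).
    """
    projected = X_test @ W
    projected = projected / (
        np.linalg.norm(projected, axis=1, keepdims=True) + 1e-12
    )
    protos = unseen_embeddings / (
        np.linalg.norm(unseen_embeddings, axis=1, keepdims=True) + 1e-12
    )
    return np.argmax(projected @ protos.T, axis=1)
```

In use, `fit_linear_map` would be trained only on images from seen classes, and `predict_unseen` would be given label embeddings for classes that never appeared in training.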
ISBN: (print) 9783031804373; 9783031804380
Text-to-image generation is a cutting-edge technology that enables computers to generate images from textual descriptions. While this technology has been extensively researched and applied to English language text, applying it to Arabic language text is still in its early stages. Additionally, the Arabic language is challenging due to its right-to-left writing system and extensive vocabulary of 1.3 million words. In this paper, we explore text-to-image generation for generating images from Arabic language text descriptions. Firstly, we fine-tune a transformer-based model pre-trained on Arabic text to transform the text information into an affine transformation within the DF-GAN generator. Secondly, we present a text transformer that combines LSTM layers to address the limitation of unrecognized words. Thirdly, a mask predictor is trained into the generator using a weakly supervised method and incorporated into the affine transformation for a more effective integration of image and text features. In addition, we add the DAMSM loss as a regularization term in the loss function to achieve convergence and stability in the training phase. Experiments on two challenging datasets, CUB and Oxford-flower, show that our architectures can accurately generate high-quality images faithfully representing the Arabic textual descriptions. We believe the scaling of this task could have critical applications in fields such as Arabic visual learning, e-commerce, advertising, and entertainment.
The process that produces written descriptions that effectively represent the meaning and context of an image is known as image captioning. To integrate visual and textual data, it needs to blend computer vision and n...
This study tackles the difficult issues of image captioning while negotiating the complexity of visual data processing. The complexity of visual data and the associated processing requirements make image captioning a ...