Search results - Inner Mongolia University Library

8th International Conference on Wireless Communications and Signal Processing (WCSP)

Authors: Yang, Ya'nan; Wang, Xiaofan; Liu, Feng; Gan, Zongliang. Affiliations: Nanjing Univ Posts & Telecommun Jiangsu Prov Key Lab Image Proc & Image Commun Nanjing 210003 Jiangsu Peoples R China; Nanjing Univ Posts & Telecommun Minist Educ Key Lab Broadband Wireless Commun & Sensor Networ Nanjing 210003 Jiangsu Peoples R China

ISBN: (print) 9781509028603

Because the traditional center/surround Retinex algorithm cannot accurately estimate lighting conditions in regions of strong contrast between light and dark, over-enhancement and color distortion may occur. To address this, and taking human visual characteristics into account, a tone-preserving color image enhancement algorithm is proposed. A determination function is added to the bilateral filter to estimate the illumination image more accurately and to reduce over-enhancement. Based on the human visual masking effect, an improved gamma correction adaptively corrects the brightness of the illumination image, and the local contrast of the reflection image, obtained by division, is enhanced using local statistics. The final enhanced image is obtained by combining the illumination image with the reflection image, which makes the result appear more natural. Subjective and objective comparisons with similar algorithms show that, when applied to low-contrast color images, the method both improves image clarity and reduces color distortion.

Keywords: Retinex; tone-preserving; bilateral filter; masking effect; local contrast
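The decomposition described in this abstract (estimate illumination, obtain reflectance by division, gamma-correct the illumination, recombine) can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain box mean stands in for the paper's modified bilateral filter, a fixed `gamma` stands in for its adaptive gamma correction, the local-contrast step is omitted, and the function names `box_mean` and `enhance` are hypothetical.

```python
def box_mean(image, radius=1):
    """Estimate illumination with a box mean (a stand-in, NOT the paper's bilateral filter)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def enhance(image, gamma=0.5, eps=1e-6):
    """Brighten a grayscale image with values in [0, 1] via a Retinex-style split."""
    illumination = box_mean(image)
    result = []
    for img_row, ill_row in zip(image, illumination):
        row = []
        for pixel, ill in zip(img_row, ill_row):
            reflectance = pixel / (ill + eps)   # reflection image obtained by division
            corrected = ill ** gamma            # fixed gamma (assumption; the paper adapts it)
            row.append(min(1.0, reflectance * corrected))  # recombine and clip
        result.append(row)
    return result
```

With `gamma` below 1, dark regions are lifted while the reflectance ratio, and with it local detail, is left unchanged; `gamma=1.0` returns the input almost exactly.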

Fully Neural Network Mode Based Intra Prediction of Variable Block Size

IEEE International Conference on Visual Communications and Image Processing (VCIP)

Authors: Sun, Heming; Yu, Lu; Katto, Jiro. Affiliations: Waseda Univ Waseda Res Inst Sci & Engn Tokyo Japan; JST PRESTO 4-1-8 Honcho Kawaguchi Saitama Japan; Zhejiang Univ Inst Informat & Commun Engn Hangzhou Peoples R China; Waseda Univ Dept Comp Sci & Commun Engn Tokyo Japan

ISBN: (print) 9781728180687

Intra prediction is an essential component of image coding. This paper presents an intra prediction framework based entirely on neural network modes (NM). Each NM can be regarded as a regression from the neighboring reference blocks to the current coding block. (1) For variable block sizes, we use different network structures: fully connected networks for small 4x4 and 8x8 blocks, and convolutional neural networks for large 16x16 and 32x32 blocks. (2) For each prediction mode, we develop a specific pre-trained network to boost the regression accuracy. When integrated into the HEVC test model, the framework saves 3.55%, 3.03% and 3.27% BD-rate for the Y, U and V components compared with the anchor. As far as we know, this is the first work to explore a fully NM-based framework for intra prediction, and it reaches a better coding gain with lower complexity than previous work.

Keywords: Intra prediction; image compression; deep learning; fully connected layer; convolutional neural network

Content-Adaptive Rate-Quality Curve Prediction Model in Media Processing System

2024 Conference on Visual Communications and Image Processing

Authors: Yin, Shibo; Zhang, Zhiyu; Ning, Peirong; Chen, Qiubo; Chen, Jing; Zhou, Quan; Lu, Guo; Song, Li. Affiliations: Xiaohongshu Inc Shanghai Peoples R China; Shanghai Jiao Tong Univ Shanghai Peoples R China

ISBN: (print) 9798331529543; 9798331529550

In streaming media services, video transcoding is a common practice to alleviate bandwidth demands. Unfortunately, traditional methods that apply a uniform rate factor (RF) across all videos often result in significant inefficiencies. Content-adaptive encoding (CAE) techniques address this by dynamically adjusting encoding parameters based on video content characteristics. However, existing CAE methods are often tightly coupled with specific encoding strategies, which makes them inflexible. In this paper, we propose a model that predicts both RF-quality and RF-bitrate curves, from which a comprehensive bitrate-quality curve can be derived. This approach allows the encoding strategy to be adjusted flexibly without retraining the model. The model leverages codec features, content features, and anchor features to predict the bitrate-quality curve accurately. Additionally, we introduce an anchor suspension method to enhance prediction accuracy. Experiments confirm that the actual quality metric (VMAF) of the compressed video stays within +/- 1 of the target, achieving an accuracy of 99.14%. By combining our quality improvement strategy with the rate-quality curve prediction model, we conducted online A/B tests, obtaining improvements of +0.107% in both video views and video completions and +0.064% in app duration time. Our model has been deployed on the Xiaohongshu App.

Keywords: Prediction models
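As a rough illustration of how an RF-quality curve and an RF-bitrate curve can be combined into a bitrate-quality curve, the sketch below joins the two on their shared RF values and then interpolates linearly. It is not the paper's model: the curves are taken as given inputs rather than predicted, the sample numbers are hypothetical, and the function names are illustrative only.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def bitrate_quality_curve(rf_quality, rf_bitrate):
    """Eliminate RF: pair bitrate and quality at each shared RF, sorted by bitrate."""
    shared = sorted(set(rf_quality) & set(rf_bitrate))
    return sorted((rf_bitrate[rf], rf_quality[rf]) for rf in shared)


def bitrate_for_quality(target_quality, curve):
    """Estimate the bitrate needed for a target quality (quality must rise with bitrate)."""
    bitrates = [b for b, _ in curve]
    qualities = [q for _, q in curve]
    return interpolate(target_quality, qualities, bitrates)


# Hypothetical per-video curves: RF -> VMAF and RF -> bitrate in kbps.
rf_to_vmaf = {18: 97.0, 23: 93.0, 28: 86.0, 33: 75.0}
rf_to_kbps = {18: 4000.0, 23: 2200.0, 28: 1200.0, 33: 700.0}

curve = bitrate_quality_curve(rf_to_vmaf, rf_to_kbps)
needed_kbps = bitrate_for_quality(90.0, curve)
```

Because the strategy (for example, a target VMAF or a bitrate cap) is applied only when reading the curve, it can change without touching whatever produced the two input curves.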

A Practical Approach to Depth-Aware Augmentation for Neural Radiance Fields

2024 Conference on Visual Communications and Image Processing

Authors: Khosroshahi, Hamed Razavi; Sancho, Jaime; Bonatto, Daniele; Fachada, Sarah; Bang, Gun; Lafruit, Gauthier; Juarez, Eduardo; Teratani, Mehrdad. Affiliations: Univ Libre Bruxelles Lab Image Synth & Anal Brussels Belgium; Univ Politecn Madrid Res Ctr Software Technol & Multimedia Syst Madrid Spain; Telecommun Res Inst Elect Daejeon South Korea

ISBN: (print) 9798331529543; 9798331529550

Neural Radiance Fields (NeRF) have demonstrated exceptional performance in generating novel views of scenes by learning implicit volumetric representations from calibrated RGB images, without depth information. A major limitation of neural network-based view synthesis frameworks is their need for large training datasets, and the challenge of effective data augmentation for view synthesis remains unresolved. NeRF models require extensive scene coverage from multiple views to accurately estimate radiance and density. Insufficient coverage reduces the model's ability to interpolate or extrapolate unseen parts of the scene effectively. In this paper, we propose a novel pipeline that addresses this data augmentation issue using depth map information. We use depth image-based rendering (DIBR) to compensate for the shortage of training views for NeRF. Experimental results indicate that our approach enhances the quality of images rendered with the NeRF framework, achieving an average peak signal-to-noise ratio (PSNR) increase of 7.2 dB, with a maximum improvement of 12 dB.

Keywords: Neural Radiance Fields; NeRF; View synthesis; Data augmentation; Depth map

CNN-Based Post-Processing Filter for Video Compression with Multi-Scale Feature Representation

IEEE International Conference on Visual Communications and Image Processing (VCIP)

Authors: Qi, Zhanyuan; Jung, Cheolkon; Liu, Yang; Li, Ming. Affiliations: Xidian Univ Sch Elect Engn Xian Peoples R China; OPPO Xian Peoples R China

ISBN: (print) 9781665475921

In this paper, we propose a convolutional neural network (CNN)-based post-processing filter for video compression with multi-scale feature representation. The discrete wavelet transform (DWT) decomposes an image into multi-frequency and multi-directional sub-bands, and can capture artifacts caused by video compression through multi-scale feature representation. Thus, we combine DWT with CNN and construct two sub-networks: a step-like sub-band network (SLSB) and a mixed enhancement network (ME). SLSB takes the wavelet sub-bands as input and feeds them into the Res2Net group (R2NG) from high frequency to low frequency. R2NG consists of Res2Net modules and adopts spatial and channel attention to adaptively enhance features. We combine the high-frequency sub-band output with the low-frequency sub-band in R2NG to capture multi-scale features. ME uses mixed convolution, composed of dilated convolution and standard convolution, as its basic block to expand the receptive field without the blind spots of dilated convolution and to further improve reconstruction quality. Experimental results demonstrate that the proposed CNN filter achieves average BD-rate reductions of 2.13%, 2.63%, 2.99%, 4.8%, 3.72% and 4.5% over the VTM 11.0-NNVC anchor for the Y channel on classes A1, A2, B, C, D and E of the common test conditions (CTC) in AI, RA and LDP configurations, respectively.

Keywords: Convolutional neural network; attention mechanism; compressed video restoration; dilated convolution; wavelet
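To make the sub-band decomposition mentioned in this abstract concrete, here is a minimal one-level 2-D DWT sketch. The Haar wavelet is an assumption chosen for brevity (the abstract does not name the wavelet used), and `haar_dwt2` is a hypothetical helper, not part of the paper's network.

```python
def haar_dwt2(image):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns four half-size sub-bands: the low-frequency approximation and the
    horizontal, vertical and diagonal differences (orthonormal scaling).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            tl, tr = image[r][c], image[r][c + 1]          # top-left, top-right
            bl, br = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((tl + tr + bl + br) / 2)  # low-pass in both directions
            h_row.append((tl - tr + bl - br) / 2)  # difference across columns
            v_row.append((tl + tr - bl - br) / 2)  # difference across rows
            d_row.append((tl - tr - bl + br) / 2)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag
```

Blocking artifacts show up as energy in the three difference sub-bands, which is one reason a restoration network might process them separately from the approximation band.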

Estimating blockness distortion for performance evaluation of picture coding algorithms

1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing

Authors: Giusto, DD; Perra, M. Affiliation: Univ of Cagliari Cagliari Italy

ISBN: (print) 0780339061

In this paper, some of the most significant image quality indexes are reviewed and compared with a new method for blockness distortion evaluation. The paper begins with a brief survey of classical measures based on the numerical difference between original and reconstructed image data (e.g., MSE, SNR and PSNR) and of advanced methods that aim to account for the perceptual aspects of image degradation (e.g., Hosaka Plots and other methods based on Human Visual System properties, such as Information Content or Perceptual Image Distortion). Then, four innovative methods for blockness distortion measurement are proposed: two based on DCT analysis and two on the differential Sobel operator. Results on standard pictures confirm the efficiency of the proposed measures.

Keywords: image coding
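The classical difference-based measures named in this abstract are simple to state in code. The sketch below computes MSE and PSNR for two equally sized grayscale images stored as lists of rows; the 8-bit peak value of 255 is an assumed default, and the paper's own blockness measures are not reproduced here.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized 2-D images."""
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)
```

Both measures treat every pixel error alike regardless of where it falls, which is why they can miss structured distortions such as block edges.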

Adaptive multiscale image denoising using neural networks

International Conference on Signal Processing and Communications (SPCOM)

Authors: Srinivasan, M; Annadurai, S. Affiliation: Anna Univ Govt Coll Technol Coimbatore Tamil Nadu India

ISBN: (print) 0780386744

An image is often corrupted by additive Gaussian noise during its acquisition and transmission. Denoising has to be performed on such images to retain the signal and suppress the noise. Denoising can be performed by various methods, such as thresholding and filtering. However, these methods do not consider the local space-scale information of the image. Here, a new type of neural network that takes the space-scale information of the image into account is constructed for noise reduction. This method gives good numerical results and also better visual effects.

Keywords: denoising; discrete wavelet transform; continuous soft thresholding; least mean square rule
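For reference, the standard soft-thresholding rule that wavelet denoising methods build on can be sketched as below. This is the textbook operator applied to a list of coefficients, not the paper's neural network or its continuous variant, and the threshold value in any real use would have to be chosen from a noise estimate.

```python
def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; zero out the small ones."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)          # treated as noise and removed
        elif c > 0:
            shrunk.append(magnitude)    # positive coefficient, reduced
        else:
            shrunk.append(-magnitude)   # negative coefficient, reduced
    return shrunk
```

A single global threshold like this ignores where a coefficient sits in space and scale, which is the limitation the abstract points to.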

Image Resolution Enhancement by Using Interpolation Followed by Iterative Back Projection

21st Signal Processing and Communications Applications Conference (SIU)

Authors: Rasti, Pejman; Demirel, Hasan; Anbarjafari, Gholamreza. Affiliations: Dogu Akdeniz Univ Elekt & Elekt Muhendisligi Bolumu Gazimagusa Cyprus; Uluslararasl Kibras Univ Elekt & Elekt Muhendisligi Bolumu Nicosia Cyprus

ISBN: (print) 9781467355636; 9781467355629

In this paper, we propose a new super resolution technique based on interpolation followed by registration using iterative back projection (IBP). Low-resolution images are interpolated, and the interpolated images are then registered to generate a sharper high-resolution image. The proposed technique has been tested on Lena, Elaine, Pepper, and Baboon. The quantitative peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) results, as well as the visual results, show the superiority of the proposed technique over conventional and state-of-the-art image super resolution techniques. For the Lena image, the PSNR is 6.52 dB higher than that of bicubic interpolation.

Keywords: Super resolution; Iterative Back Projection; Image Registration
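The core loop of iterative back projection can be illustrated on a 1-D signal: start from an interpolated estimate, simulate the low-resolution observation, and push the residual back into the estimate. The sketch assumes block averaging as the imaging model and sample replication as the back-projection kernel, both simplifications chosen here for brevity; the paper's multi-image registration step is not modeled.

```python
def downsample(signal, factor=2):
    """Simulate the low-resolution observation by averaging blocks of `factor` samples."""
    return [sum(signal[i:i + factor]) / factor for i in range(0, len(signal), factor)]


def upsample(signal, factor=2):
    """Back-projection kernel: replicate each sample `factor` times."""
    return [v for v in signal for _ in range(factor)]


def linear_interpolate(signal, factor=2):
    """Initial high-resolution estimate by linear interpolation."""
    n = len(signal)
    out = []
    for i in range(n * factor):
        pos = i / factor
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        t = pos - lo
        out.append((1 - t) * signal[lo] + t * signal[hi])
    return out


def iterative_back_projection(low_res, factor=2, iterations=5, step=1.0):
    """Refine an interpolated estimate until it reproduces the observed signal."""
    estimate = linear_interpolate(low_res, factor)
    for _ in range(iterations):
        simulated = downsample(estimate, factor)
        error = [y - s for y, s in zip(low_res, simulated)]
        correction = upsample(error, factor)
        estimate = [e + step * c for e, c in zip(estimate, correction)]
    return estimate
```

After refinement, downsampling the estimate reproduces the observed low-resolution signal, which interpolation alone does not guarantee.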

Game Character Generation with Generative Adversarial Networks

30th IEEE Signal Processing and Communications Applications Conference (SIU)

Authors: Emekligil, Ferda Gul Aydin; Oksuz, Ilkay. Affiliations: Istanbul Tech Univ Oyun & Etkilesim Teknol Bolumu Istanbul Turkey; Istanbul Tech Univ Bilgisayar Muhendisligi Bolumu Istanbul Turkey

ISBN: (print) 9781665450928

Designing visual content and characters for games is a time-consuming task, even for experienced designers and illustrators. Most game companies and developers use procedural methods to automate the design process, but the visual content produced by these algorithms is limited in variation. In this paper, we propose to use Generative Adversarial Networks (GANs) for visual content production. Two different RPG and DnD image datasets were collected from the internet for training, and 6 different GAN models were trained on them. In 3 of 18 experiments, transfer learning methods were used because of the limited datasets. The Frechet Inception Distance metric was used to compare the model results. SNGAN was the most successful on both datasets. Moreover, the transfer learning method (WGAN-GP, BigGAN) was more successful than training from scratch.

Keywords: Generative Adversarial Network; Generative Learning; Image Generation; Game Character Generation

Introduction to the special issue on audio and video analysis for multimedia interactive services

IEEE Transactions on Circuits and Systems for Video Technology, 2004, vol. 14, no. 5, pp. 569-571

Authors: Izquierdo, E; Katsaggelos, AK; Strintzis, MG. Affiliations: Univ London Queen Mary Coll Dept Elect Engn London E1 4NS England; Northwestern Univ Dept Elect & Comp Engn Evanston IL 60208 USA; Aristotle Univ Thessaloniki Dept Elect & Comp Engn Thessaloniki 54124 Greece

The article focuses on audio and video analysis for multimedia interactive services. It describes a system that automates home video editing: the system automatically extracts a set of highlight segments from a set of raw home videos and aligns them with user-supplied incidental music based on the content of the video and the music. Finally, it introduces a method for interactive image retrieval using query feedback. The method learns the user query, as well as the correspondence between high-level user concepts and their low-level machine representation, by performing retrievals according to multiple queries supplied by the user during a retrieval session.

Keywords: Information storage & retrieval systems; Information retrieval; Multimedia systems; Image retrieval; Videotape editing -- Equipment & supplies; Incidental music
