Search Results - Inner Mongolia University Library

Local Geometric Indexing of High Resolution Data for Facial Reconstruction From Sparse Markers

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, Vol. 30, No. 8, pp. 5289-5298

Authors: Cong, Matthew; Lan, Lana; Fedkiw, Ronald. Affiliations: Ind Light & Mag, San Francisco, CA 94129, USA; Stanford Univ, Dept Comp Sci, Stanford, CA 94305, USA

When considering sparse motion capture marker data, one typically struggles to balance its overfitting via a high dimensional blendshape system versus underfitting caused by smoothness constraints. With the current trend towards using more and more data, our aim is not to fit the motion capture markers with a parameterized (blendshape) model or to smoothly interpolate a surface through the marker positions, but rather to find an instance in the high resolution dataset that contains local geometry to fit each marker. Just as is true for typical machine learning applications, this approach benefits from a plethora of data, and thus we also consider augmenting the dataset via specially designed physical simulations that target the high resolution dataset such that the simulation output lies on the same so-called manifold as the data targeted.

Keywords: Shape; Faces; Geometry; Surface reconstruction; Cameras; Point cloud compression; Deformation; Computer graphics; Image processing and computer vision; Interpolation

No-reference Image Quality Metric for NeRF (Neural Radiance Fields) Rendering in Automotive Applications

26th Irish Machine Vision and Image Processing Conference, IMVIP 2024

Authors: Raymond, Mary; Sistu, Ganesh; Gallagher, Louis. Affiliations: Valeo Vision Systems, Ireland; Maynooth University, Ireland

ISBN: (Print) 9781837242672

Neural Radiance Fields (NeRF) rendering is a promising Artificial Intelligence (AI) technology for generating photorealistic views, with significant potential for automotive applications. However, traditional metrics such as Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Learned Perceptual Image Patch Similarity (LPIPS) often fail to evaluate the model's quality for novel viewpoints outside the dataset range, which is crucial for real-life use. This study introduces the Fréchet Inception Distance (FID) as a no-reference image quality metric for novel viewpoints. Our experiments demonstrate that FID aligns well with human quality assessments and is effective in automotive scenarios with fisheye images. The need for further research on FID normalization, the sample sizes of generated viewpoints used to calculate FID, and measures of viewpoint difficulty is highlighted. Adopting FID advances NeRF evaluation, enhancing assessments in real-life scenarios within automotive and robotics, and improving autonomous system performance and safety. More results from our experiments are available here https://***/watch?v=Lb8azH79EI0. © This is an open access article published by the IET under the Creative Commons Attribution License (http://***/licenses/by/3.0/)

Keywords: Rendering (computer graphics)
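
For readers unfamiliar with the metric named in this abstract, the standard definition of the Fréchet Inception Distance is given below. This is the general textbook form, not a formula quoted from the paper. The symbols are assumptions introduced here: \mu_r, \Sigma_r are the mean and covariance of Inception-network features of a set of real images, and \mu_g, \Sigma_g are those of the rendered (generated) views. The abstract does not state exactly which image sets the authors compare.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer. Because the comparison is between distributions, no pixel-aligned ground-truth image is needed for each novel viewpoint.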

Imaging and Vision Development Platform with Algorithm Library for Intelligent Vision Systems

7th International Symposium on Intelligent Informatics, ISI 2022

Authors: Sreedhanya, L.R.; Daniel, J. Jerry; Nithin, P.V.; Saivam, Murugan. Affiliation: Kerala Thiruvananthapuram India

ISBN: (Print) 9789811980930

Machine vision applications for intelligent vision systems in manufacturing industries were reported based on image processing and artificial intelligence technology. We propose the imaging and vision development platform in this research for creating vision applications using image processing, machine learning, and a deep learning algorithm library. An algorithm library, vision configurator, execution logic, display manager and deploy manager modules are all included in the proposed platform. This platform is based on an open-source software stack for machine learning and deep learning computer vision technologies including OpenCV, TensorFlow, CUDA, Keras, YOLO and PyTorch. To assess the performance of the suggested platform, real-time applications like vehicle identification, person detection, code scanner, and OCR vision application were developed, validated, and deployed in an embedded system utilizing this platform. The results of the experiments show that the suggested platform can be utilized to evaluate high resolution real-time images and construct vision applications. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: Deep learning

CONTROLLABLE UNIVERSAL EDGE-PRESERVING IMAGE FILTERING

7th IEEE International Conference on Multimedia Information Processing and Retrieval (MIPR)

Authors: Liang, Shijun; Fu, Dongdong. Affiliations: Michigan State Univ, Dept Biomed Engn, E Lansing, MI 48824, USA; Dolby Labs Inc, Sunnyvale, CA, USA

ISBN: (Print) 9798350351439; 9798350351422

In this study, we investigate the Deep Image Prior (DIP) in enhancing image smoothing, a crucial component in numerous computer vision and graphics applications. Although deep learning has demonstrated remarkable achievements in these domains, it often falls short in flexibility and controllability, in contrast to traditional methods, which are more adaptable but typically exhibit subpar performance. Notably, some end-to-end deep learning models offer control over edge preservation, yet their performance remains marginally suboptimal. To address this shortcoming, we introduce an innovative network architecture that diverges from the traditional U-Net model, featuring a Laplacian pyramid as the encoder and a deep decoder as the decoding component, integrated with a bilateral filter loss to improve DIP. This design aids the network in rapidly assimilating essential low-frequency information. Our approach excels in retaining texture details, significantly improving image smoothing and related tasks beyond the capabilities of standard DIP methods. Moreover, our technique outperforms the leading unsupervised method, pyramid texture filtering, in texture filtering tasks and other applications.

Keywords: Image smoothing; Machine learning; Deep learning; Deep image prior
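
The abstract refers to a bilateral filter loss without defining it. As background only, here is a minimal sketch of a plain bilateral filter on a 1D signal, which is the classic edge-preserving smoother the term builds on. It is not the paper's loss or network. The function name `bilateral_filter_1d` and its parameters (`radius`, `sigma_space`, `sigma_range`) are hypothetical names chosen for this illustration.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Smooth a 1D signal while preserving large jumps (edges).

    Each output sample is a weighted mean of its neighbours. A neighbour's
    weight falls off with its distance in position (sigma_space) and with
    its difference in value (sigma_range), so samples across a strong edge
    contribute almost nothing.
    """
    n = len(signal)
    out = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            w_range = math.exp(
                -((signal[i] - signal[j]) ** 2) / (2.0 * sigma_range ** 2)
            )
            w = w_space * w_range
            weighted_sum += w * signal[j]
            weight_total += w
        # weight_total is at least 1.0 because j == i always contributes w = 1.
        out.append(weighted_sum / weight_total)
    return out


if __name__ == "__main__":
    step = [0, 0, 0, 100, 100, 100]
    print(bilateral_filter_1d(step))
```

A smaller `sigma_range` preserves weaker edges, while a larger one makes the filter behave more like an ordinary Gaussian blur. This is the kind of controllability the abstract attributes to traditional methods.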

A deep journey into image enhancement: A survey of current and emerging trends

INFORMATION FUSION, 2023, Vol. 93, No. 1, pp. 36-76

Authors: Lepcha, Dawa Chyophel; Goyal, Bhawna; Dogra, Ayush; Sharma, Kanta Prasad; Gupta, Deena Nath. Affiliations: Chandigarh Univ, Dept ECE, Mohali 140413, Punjab, India; Ronin Inst, Montclair, NJ 07043, USA; GLA Univ, Inst Engn & Technol, Mathura, India; C DAC Mumbai, Mumbai, India

Images captured under poor-illumination conditions often display poor contrast, low brightness, a narrow gray range, colour distortions and considerable interference, which seriously affect the qualitative visual effects on human eyes and severely restrict the efficiency of several machine vision systems. In addition, underwater images often suffer from colour shift and contrast degradation because of the absorption and scattering of light while travelling in water. These unpleasant effects limit visibility, reduce contrast and even generate colour casts that limit the use of underwater images and videos in marine archaeology and biology. In medical imaging applications, medical images are important tools for detecting and diagnosing several medical conditions and ailments. However, the quality of medical images can often be degraded during image acquisition due to factors such as noise interference, artefacts, and poor illumination. This may lead to the misdiagnosis of medical conditions, which can further aggravate life-threatening situations. Image enhancement is one of the most important technologies in the field of image processing, and its purpose is to improve the quality of images for specific applications. In general, the basic principle of image enhancement is to improve the quality and visual interpretability of an image so that it is more suitable for the specific applications and the observers. Over the last few decades, numerous image enhancement techniques have been proposed in the literature. This study presents a systematic survey of existing state-of-the-art image enhancement techniques, broadly classified by their algorithms. In addition, this paper summarises the datasets utilised in the literature for performing the experiments. Furthermore, attention is drawn to several evaluation parameters for quantitative evaluation, and different state-of-the-art algorithms are compared for performance analysis on benchmark

Keywords: Review; Image enhancement; Fuzzy theory; Retinex theory; Deep learning; Convolutional neural networks (CNNs); Generative adversarial networks (GANs); Applications; Quality assessment criteria; Survey

Automatic imagery Bank Cheque data extraction based on machine learning approaches: a comprehensive survey

MULTIMEDIA TOOLS AND APPLICATIONS, 2023, Vol. 82, No. 20, pp. 30543-30598

Authors: Thakur, Neha; Ghai, Deepika; Kumar, Sandeep. Affiliations: Lovely Profess Univ, Phagwara 144411, Punjab, India; Koneru Lakshmaiah Educ Fdn, Vaddeswaram, Andhra Pradesh, India

Bank Cheques are used mainly for financial transactions, due to which they are processed in enormous amounts on a daily basis around the globe. Often, Cheque execution time and expenses can be saved if the whole method of recognition and verification of the Cheque becomes automatic. Automatic bank Cheque processing is an emerging research field in the area of computer vision, image processing, pattern recognition, machine learning, and deep learning. The article emphasizes the stages of image acquisition, pre-processing, and extraction and recognition in the automatic bank Cheque processing system. This paper describes the various steps involved in the system of automatic data extraction. It further classifies and examines existing challenges in different stages of automated processing of bank Cheques. An attempt is made in this paper to present state-of-the-art techniques for the automatic processing of bank Cheque images. The categories and sub-categories of various fields related to bank Cheque images are illustrated, benchmark datasets are enumerated, and the performance of the most representative approaches is compared. Moreover, it also contains some information about the products available in the market for automatic Cheque processing. This review provides a fundamental comparison and analysis of the remaining problems in the field. It is found that a multilayer feed-forward neural network gave an accuracy of 97.31% for payee's name recognition systems; HMM-MLP gave an accuracy of 95.5% for the date recognition system. In the courtesy and legal amount system, DNN gave an accuracy of 98.5% for digit recognition, MLP gave an accuracy of 93.2% for courtesy amount, and MQDF gave an accuracy of 97.04% for the legal amount. Further, the SVM classifier gave an accuracy of 99.13% for signature recognition, and deep learning-based Convolutional Neural Networks (CNN) gave an accuracy of 99.14% for handwritten numeric character recognition. This survey paper

Keywords: Bank Cheque processing; Courtesy amount recognition; Date recognition; Legal amount recognition; MICR code; Signature verification

Restoration of motion-blurred numeral image using a complex-amplitude diffractive processor

OPTICS LETTERS, 2024, Vol. 49, No. 17, pp. 4914-4917

Authors: Zhu, Haodong; Yin, Ruiqi; Hu, Tie; Xia, Rui; Li, Minglong; Zhao, Ming; Yang, ZhenYu. Affiliation: Huazhong Univ Sci & Technol, Sch Opt & Elect Informat, Nanophoton Lab, Wuhan 430074, Peoples R China

We propose a complex-amplitude diffractive processor based on diffractive deep neural networks (D2NNs). By precisely controlling the propagation of an optical field, it can effectively remove the motion blur in numeral images and realize the restoration. Comparative analysis of phase-only, amplitude-only, and complex-amplitude diffractive processors reveals that the complex-amplitude network significantly enhances the performance of the processor and improves the peak signal-to-noise ratio (PSNR) of the images. Appropriate use of complex-amplitude networks helps reduce the number of network layers and alleviates alignment difficulties. Due to their fast processing speed and low power consumption, complex-amplitude diffractive processors hold potential applications in various fields including road monitoring, sports photography, satellite imaging, and medical diagnostics. (c) 2024 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.

Keywords: Image restoration; Machine vision; Neural networks; Optical computing; Optical fields; Photography
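
The abstract reports restoration quality as peak signal-to-noise ratio (PSNR). As a reminder of what that number measures, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), for grayscale images stored as nested lists. The function name `psnr` and the default peak value of 255 are assumptions made for this illustration. The abstract does not say how the authors scale their images.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are lists of rows of pixel intensities; max_value is the largest
    possible intensity (255 for 8-bit images, an assumption here).
    """
    if len(reference) != len(restored) or any(
        len(r) != len(s) for r, s in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")
    count = sum(len(row) for row in reference)
    if count == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for row_ref, row_res in zip(reference, restored)
        for a, b in zip(row_ref, row_res)
    ) / count
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sharp = [[100, 100], [100, 100]]
    deblurred = [[110, 90], [100, 100]]
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

Higher PSNR means the restored image is closer to the reference. Identical images give infinity, which is why the zero-error case is handled separately.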

LFRNet: Low-Light Face Super-Resolution with Light Frequency Representation

2nd International Conference on Algorithm, Image Processing and Machine Vision, AIPMV 2024

Authors: Zha, Bingxin; Zhu, Huijie; Zhang, Haolin; Yang, Shengying. Affiliations: School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China; Huzhou Zhongke Fanzai Electric Power Technology Development Co. Ltd, Huzhou 313000, China; Co. Ltd, Shanghai 200233, China

ISBN: (Print) 9798350390254

In the field of computer vision, the task of facial super-resolution (FSR) is crucial for applications such as surveillance and photo restoration. However, factors such as noise and artifacts in real-world scenarios severely degrade image quality. Although existing methods use geometric priors and facial heatmap localization to improve FSR performance, these priors are often inaccurate under low-light conditions, affecting the results. This paper proposes an innovative low-light facial enhancement network aimed at integrating FSR and low-light enhancement tasks. We designed a Light Frequency Inference Block (LFIB) to capture and refine brightness and texture features at different frequencies, combining it with Transformer modules in an Encoder-Decoder architecture. The LFIB module separates degraded images into low-frequency and high-frequency brightness features and employs spatial cross-attention to capture facial texture details, effectively addressing the degradation issues in facial images. Experimental results demonstrate that our method excels in the low-light facial super-resolution task, outperforming existing methods on various metrics. Additionally, it shows good generalization capabilities in real-world scenarios, confirming its potential for practical applications. © 2024 IEEE.

Keywords: Image reconstruction

A Heterogeneous Multi-RISCV Architecture for Image Processing and Computer Vision Applications

2024 IEEE Andescon Conference

Authors: Lima, Arthur M.; Silva, Douglas L.; Santos, Hercules I. A.; Arias-Garcia, Janier; Beserra, Gilmar S.; Yudi, Jones. Affiliations: Univ Brasilia, Automat & Control Grp, Brasilia, DF, Brazil; Univ Fed Minas Gerais, Dept Elect Engn, Belo Horizonte, MG, Brazil; Univ Brasilia, Fac Gama, Brasilia, DF, Brazil

ISBN: (Print) 9798350355291; 9798350355284

In the advanced field of Image Processing and Computer Vision (IP/CV), there is a trend toward utilising parallel processing in computer architectures for enhanced efficiency, striking a balance between general-purpose capabilities and hardware-specific processes. The RISC-V standard, now backed by a wide array of compilers, frameworks, and operating systems, is paving the way for innovative cores. Our introduction of a Multi-Processor System on Chip (MPSoC), MPRISC-V, is a testament to this evolution. This system incorporates a Network on Chip (NoC) for robust intra-chip communication. The Processing System (PS) seamlessly integrates and manages it through a user-friendly API crafted to simplify the development cycle. To ascertain its effectiveness, we tested it on a Zynq Ultrascale+ MPSoC device, deploying a Sobel-based application benchmark. By evaluating its efficiency in terms of cycles/pixels, our findings underscore its potential and spotlight areas ripe for further enhancement.

Keywords: MPSoC; IP/CV; RISC-V; Sobel-filter; FPGA
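
The benchmark named in this abstract is Sobel-based and is scored in cycles per pixel. The sketch below shows, in plain Python, what such a kernel computes and how the metric is formed. It is an illustration under assumptions, not the authors' MPRISC-V code: the function names `sobel_magnitude` and `cycles_per_pixel` are hypothetical, borders are simply left at zero, and the magnitude uses the |Gx| + |Gy| approximation often chosen for hardware instead of a square root.

```python
def sobel_magnitude(image):
    """Approximate Sobel gradient magnitude (|Gx| + |Gy|) of a grayscale image.

    `image` is a list of equal-length rows. Border pixels are left at 0
    because the 3x3 window does not fit there (an assumption of this sketch).
    """
    kernel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient
    kernel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += kernel_x[dy][dx] * pixel
                    gy += kernel_y[dy][dx] * pixel
            out[y][x] = abs(gx) + abs(gy)
    return out


def cycles_per_pixel(total_cycles, width, height):
    """Efficiency metric: clock cycles spent per processed pixel."""
    return total_cycles / (width * height)


if __name__ == "__main__":
    # A vertical edge between columns 1 and 2.
    edge = [[0, 0, 10, 10] for _ in range(4)]
    for row in sobel_magnitude(edge):
        print(row)
    # Hypothetical cycle count, for illustration only.
    print(cycles_per_pixel(1_000_000, 100, 100))
```

Each output pixel depends only on its own 3x3 neighbourhood, so rows or tiles can be processed independently. That independence is what makes the filter a natural workload for a multi-core design like the one described.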

Motion-Oriented Diffusion Models for Facial Expression Synthesis

13th International Conference on Image Processing Theory, Tools and Applications

Authors: Bouzid, Hamza; Ballihi, Lahoucine. Affiliation: Mohammed V Univ Rabat, Fac Sci, LRIT, CNRST URAC 29, Rabat, Morocco

ISBN: (Print) 9798331541859; 9798331541842

Facial expression generation in computer vision is essential for improving human-computer interaction by enabling machines to interpret and respond to human emotions effectively. This area has attracted considerable research interest. In this context, we introduce a new approach for generating facial expressions from a single neutral image and a target expression label. Our method, referred to as Motion-Oriented Diffusion Model (MODM), leverages latent diffusion techniques, which are known for their ability to learn complex latent spaces and integrate controlled stochasticity to diversify generated content. The main idea of MODM is to separate the embedding space into identity and motion domains and to apply diffusion to the motion latent space only. This strategy enhances our model's capability to generate various facial expressions while ensuring that the identity details remain consistent across different expressions. To assess the effectiveness of MODM, we perform qualitative and quantitative evaluations using the MUG facial expression database. The preliminary results demonstrate that MODM can generate realistic videos of the six basic facial expressions, preserving the identity of the input subject while accurately representing different emotional states. Additionally, our study highlights promising directions for potential future research and improvements.

Keywords: Diffusion; Facial expression generation; Generative
