Search Results - Inner Mongolia University Library

26th Irish Machine Vision and Image Processing Conference, IMVIP 2024

Authors: Raymond, Mary; Sistu, Ganesh; Gallagher, Louis. Affiliations: Valeo Vision Systems, Ireland; Maynooth University, Ireland

ISBN: (Print) 9781837242672

Neural Radiance Fields (NeRF) rendering is a promising artificial intelligence (AI) technology for generating photorealistic views, with significant potential for automotive applications. However, traditional metrics such as Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Learned Perceptual Image Patch Similarity (LPIPS) often fail to evaluate the model's quality for novel viewpoints outside the dataset range, which is crucial for real-life use. This study introduces the Fréchet Inception Distance (FID) as a no-reference image quality metric for novel viewpoints. Our experiments demonstrate that FID aligns well with human quality assessments and is effective in automotive scenarios with fisheye images. The need for further research on FID normalization, the sample sizes of generated viewpoints used to calculate FID, and measures of viewpoint difficulty is highlighted. Adopting FID advances NeRF evaluation, enhancing assessments in real-life scenarios within automotive and robotics, and improving autonomous system performance and safety. More results from our experiments are available here: https://***/watch?v=Lb8azH79EI0. © This is an open access article published by the IET under the Creative Commons Attribution License (http://***/licenses/by/3.0/)

Keywords: Rendering (computer graphics)
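For readers unfamiliar with the metric named in this abstract, the standard definition of the Fréchet Inception Distance is sketched below. The abstract itself does not state the formula; the symbols are assumptions introduced here, with (μ_r, Σ_r) and (μ_g, Σ_g) denoting the mean and covariance of Inception-network features for the two image sets being compared.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer. Because the estimate depends on sample means and covariances, it is sensitive to the number of images used, which is consistent with the abstract's call for further work on sample sizes.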

LDNet: low-light image enhancement with joint lighting and denoising

Machine Vision and Applications, 2023, Vol. 34, No. 1, pp. 1-15

Authors: Li, Yuhang; Liu, Tianyanshi; Fan, Jiaxin; Ding, Youdong. Affiliations: Shanghai Univ, Shanghai Film Acad, Shanghai 200072, Peoples R China; Shanghai Engn Res Ctr Mot Picture Special Effects, Shanghai 200072, Peoples R China

Due to unavoidable environmental and/or technical constraints, many photographs are taken in low-light conditions, which results in underexposure and severe noise. Existing low-light enhancement and denoising methods can deal with both problems individually, but the forced cascading of such methods does not deal well with the combined degradation of light and noise, and is also time-consuming. To address this problem, we propose an efficient network, LDNet, to perform joint low-light enhancement and denoising tasks. LDNet contains an encoder for low-light enhancement, L-Encoder, and a decoder for denoising, D-Decoder. Specifically, we customize the lighten enhancement block (LEB) in L-Encoder to recover rich texture information and luminance information. In D-Decoder, we use image adaptive projection for denoising. Furthermore, since training an end-to-end network requires paired data support, we collect a large-scale real low-light image paired dataset (LN-data). Both the proposed network and dataset provide the basis for this challenging joint task. Extensive experimental results show that our approach achieves better results in both qualitative and quantitative evaluation, notably with a PSNR value of 27.69 and an SSIM value of 0.91 on the LN-data dataset, outperforming other optimal methods.

Keywords: Low-light enhancement; Image processing; Supervised learning; Denoising
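The PSNR figure quoted in this abstract follows the usual definition, 10·log10(MAX² / MSE), reported in decibels. A minimal sketch of that computation is given below, assuming images are supplied as flat sequences of pixel values; the function name `psnr` and the `max_value` parameter are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are flat sequences of numbers in the range [0, max_value].
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the enhanced image is closer to the reference; identical images give an infinite PSNR.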

FusionNet for Interactive Image Segmentation

7th Chinese Conference on Pattern Recognition and Computer Vision

Authors: Wu, Enyi; Shi, Qingxuan; Wang, Kanglin. Affiliations: Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China; Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China

ISBN: (Print) 9789819784899; 9789819784905

Despite the advancements in neural network technologies driving interactive image segmentation forward, challenges persist, especially concerning segmentation ambiguities caused by overlapping or visually similar objects against complex backgrounds, as well as intricate object boundaries. Addressing these challenges, we introduce FusionNet, focusing on effective feature fusion. Firstly, the Hierarchical Context Fusion Module aids in grasping holistic structures and multi-scale contextual information of target objects. Secondly, the Attention Feature Fusion Module captures more representative feature expressions. This design empowers FusionNet to capture details and contextual relationships better, thereby enhancing segmentation accuracy. For fine-grained boundary details, we propose the Local Correction Module, refining local mask details meticulously. This module initially focuses on information around newly clicked areas, employing discriminative correction feedback for enhanced detail processing accuracy. Rigorous experiments on datasets such as SBD, DAVIS, GrabCut, and Berkeley validate our model's effectiveness, with segmentation results strongly supporting the superiority of our approach.

Keywords: Interactive image segmentation; Feature fusion; Attention mechanism

Research on simulation of 3D human animation vision technology based on an enhanced machine learning algorithm

Neural Computing & Applications, 2023, Vol. 35, No. 6, pp. 4243-4254

Authors: Yuan, Henning; Lee, Jong Han; Zhang, Sai. Affiliations: Qingdao Agr Univ, Acad Affairs Off, Qingdao 266109, Peoples R China; Huxi Univ, Dept Format Convergence Arts, Seoul 31499, South Korea; Qingdao Agr Univ, Animat & Commun Coll, Qingdao 266109, Peoples R China

This paper provides an in-depth analysis and study of the simulation of 3D human animation visualization techniques by enhancing machine learning algorithms. Based on the statistical analysis of the data obtained from different measurement methods, the extraction of human body feature parameters based on millimeter-wave point cloud data is realized, and the 3D reconstruction and simulation of the human body are realized using parametric human modeling software. In video-based action recognition, most methods are data-driven and use deep networks to automatically learn features of the entire video image. In this process, specific research on human actions is not included or reflected. However, human action recognition is a processing of the semantic level of video content. Realizing universal human action recognition requires a semantic understanding of human behavior. Firstly, the geometric feature analysis of the 3D scanned human model is performed to extract the human body shape characteristic parameters, and the research on the analysis and estimation methods of body shape characteristic parameters is carried out to establish the human body shape parameter relationship model; then, the millimeter-wave point cloud is calculated and measured, and the Lie group features are extracted using the group skeletal representation model with high data dimensionality; to process the high-dimensional data while reducing the complexity of the recognition process and speeding up the computation, feature learning and classification are performed with convolutional neural networks. To verify the better library portability and robustness of the method in this paper, the method was tested on a self-built human action database in the laboratory, and an average recognition rate of 97.26% was achieved. Meanwhile, this paper investigates the natural interaction application of virtual characters in a virtual learning environment based on human action recognition. Four testers tested t

Keywords: Enhanced machine learning algorithms; Simulated 3D human body; Animated vision techniques

Continuous measurement of ferrous sinter size distributions using an optical sensor system

Ironmaking & Steelmaking, 2023, Vol. 50, No. 8, pp. 1104-1111

Authors: Holliday, Michael; Lai, Yufeng; Hobbs, Matthew; Boone, Nick; Al-Haji, Tariq; Scott, Iain; Willmott, Jon. Affiliations: Univ Sheffield, Dept Elect & Elect Engn, Sheffield, England; Tata Steel UK Ltd, London, England; PyrOpt Instruments Ltd, Innovat Ctr, Sheffield, England; Dept Elect & Elect Engn, Sir Frederick Mappin Bldg, Mappin St, Sheffield S1 3JD, England

The size distribution of iron ore sinter is critical to efficient blast furnace operation and is an optimised variable in sinter plants globally. Prompt process control response to discrepancies in sinter size is essential, and the standard sieve measurement test introduces significant delay in data acquisition. We introduce a networked optical sensor system that is shown to accurately measure size distribution within 5 s, collect data continuously at 0.5 Hz, and correlate well with sieving measurements. This system is deployed at the end of a sinter plant, providing real-time process control data with digital image analysis performed on an integrated microprocessor. The system's performance was assessed over a 12-week validation period, showing excellent correlation with sieve data. Systems such as ours can be widely implemented in sinter plants, and in similar steelmaking applications, due to their cost-effective implementation of continuous data acquisition and their versatility to be adapted.

Keywords: Sinter size; iron ore; process control; optical; blast furnace; machine vision; image processing

Serf: Towards better training of deep neural networks using log-Softplus ERror activation Function

23rd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Authors: Nag, Sayan; Bhattacharyya, Mayukh; Mukherjee, Anuraag; Kundu, Rohit. Affiliations: Univ Toronto, Toronto, ON, Canada; SUNY Stony Brook, Stony Brook, NY, USA; IISER Mohali, Ajitgarh, India; UCR Riverside, Riverside, CA, USA

ISBN: (Print) 9781665493468

Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU, despite being simple and effective, has a few disadvantages, including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf, which is self-regularized and non-monotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multi-modal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. Ablation studies further demonstrate that Serf-based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of the pre-conditioner function ingrained in the first derivative of Serf, which provides a regularization effect making gradients smoother and optimization faster.

Keywords: Training; Deep learning; Computer vision; Neural networks; Computer architecture; Object detection; Machine translation
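As the title's expansion (log-Softplus ERror activation Function) suggests, Serf is commonly written as f(x) = x · erf(ln(1 + eˣ)), that is, the input scaled by the error function of its softplus. The scalar sketch below illustrates this form using only the Python standard library; the helper name `softplus` and its overflow-safe formulation are choices made here, not details from the abstract.

```python
import math


def softplus(x):
    """ln(1 + e^x), rewritten so that exp() never overflows for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def serf(x):
    """Serf activation: x * erf(softplus(x))."""
    return x * math.erf(softplus(x))


if __name__ == "__main__":
    # Print a few sample points to show the shape of the curve.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"serf({value:+.1f}) = {serf(value):+.4f}")
```

For large positive inputs the function approaches the identity, and for negative inputs it dips below zero before returning towards zero, which is the non-monotonic behaviour the abstract mentions.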

Facial Deblurring and Recognition Using Image Processing and Machine Learning Techniques

8th IFIP TC 12 International Conference on Computer, Communication and Signal Processing, ICCCSP 2024

Authors: Subrahmanyam, B.R.; Janavi, T.S.; Keerthana, V.; Serajudeen, Ayesha Jumana; Arunnagiri, A.M. Affiliations: Department of Electronics and Communication Engineering, College of Engineering Guindy, Anna University, Chennai 600025, India

ISBN: (Print) 9783031736162

The project addresses the challenge of accurately identifying blurred faces in computer vision and facial recognition. It introduces a novel framework that integrates deblurring techniques, utilizing point spread function deconvolution to enhance facial image quality. Principal Component Analysis (PCA) is employed for feature extraction, and a K-Nearest Neighbors (KNN) classifier is applied for face identification. The combined deblurring and PCA-transformed features improve matching and identification accuracy, particularly in scenarios with initially blurred images. Experimental validation on a real-world dataset demonstrates the efficacy of the proposed methodology. This approach not only enhances facial recognition accuracy but also lays the groundwork for future research in challenging practical applications, such as security and law enforcement. © IFIP International Federation for Information Processing 2025.

Keywords: Optical transfer function

Genetic Algorithm Augmented Inception-Net based Image Classifier Accelerated on FPGA

Multimedia Tools and Applications, 2023, Vol. 82, No. 29, pp. 45097-45125

Authors: Kaziha, Omar; Bonny, Talal; Jarndal, Anwar. Affiliations: Department of Electrical Engineering, College of Engineering, University of Sharjah, Sharjah, United Arab Emirates; Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates

Deep learning models for computer vision applications specifically and for machine learning generally are now the state of the art. The growth of size and complexity of neural networks has made them more and more reliable, yet in greater need of computational power and memory as is evident from the heavy reliance on graphical processing units and cloud computing for training them. As the complexity of deep neural networks increases, the need for fast processing neural networks in real-time embedded applications at the edge also increases and accelerating them using reconfigurable hardware suggests a solution. In this work, a convolutional neural network based on the inception net architecture is first optimized in software and then accelerated by taking advantage of field programmable gate array (FPGA) parallelism. Genetic algorithm augmented training is proposed and used on the neural network to produce an optimum model from the first training run without re-training iterations. Quantization of the network parameters is performed according to the weights of the network. The resulting neural network is then transformed into hardware by writing the register transfer level (RTL) code for FPGAs with exploitation of layer parallelism and a simple trial-and-error allocation of resources with the help of the roofline model. The approach is simple and easy to use as compared to many complex existing methods in literature and relies on trial and error to customize the FPGA design to the model needed to work on any computer vision or multimedia application deep learning model. Simulation and synthesis are performed. The results prove that the genetic algorithm reduces the number of back-propagation epochs in software and brings the network closer to the global optimum in terms of performance. Quantization to 16 bits also shows a reduction in network size by almost half with no performance drop. The synthesis of our design also shows that the Inception-based classifier is cap

Keywords: Genetic algorithms
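The abstract reports that quantizing parameters to 16 bits roughly halves the network size, which is what one would expect when 32-bit values are stored in 16 bits. The abstract does not describe the exact scheme, so the sketch below shows one common option, symmetric linear quantization to signed 16-bit integers with a scale derived from the largest weight magnitude. The function names and the scaling rule are assumptions for illustration only and are not the authors' method.

```python
def quantize_int16(weights):
    """Map floats to signed 16-bit integers using a single shared scale.

    Assumption: symmetric linear quantization; the scale is chosen so that
    the largest-magnitude weight maps to +/-32767.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 32767 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int16 range.
    quantized = [max(-32768, min(32767, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]
```

A quick round trip such as `dequantize(*quantize_int16([0.5, -1.0, 0.25]))` returns values within one quantization step of the originals.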

An end-to-end deep convolutional neural network-based data-driven fusion framework for identification of human induced pluripotent stem cell-derived endothelial cells in photomicrographs

Engineering Applications of Artificial Intelligence, 2025, Vol. 139

Authors: Iqbal, Imran; Ullah, Imran; Peng, Tingying; Wang, Weiwei; Ma, Nan. Affiliations: Inst Act Polymers, Helmholtz Zentrum Hereon, Dept PLR, D-14513 Teltow, Germany; German Res Ctr Environm Hlth, Helmholtz Munich, D-85764 Munich, Germany

Deep learning is a very powerful analytic tool to recognize the patterns in data to make appropriate predictions. It has tremendous potential in data analyses, particularly for cell biology domain, caused by the growing scale and inherent complexity of biological data. The core purpose of this research work is to design, implement, and calibrate an efficient deep convolutional neural network (DCNN) architecture in the context of binary-class classification problem. This diversified network is developed to precisely identify human induced pluripotent stem cell-derived endothelial cells (hiPSC-derived EC) based on photomicrograph. The proposed architecture is cerebrally developed with numerous convolutional modules, multiple kernel sizes, various pooling layers, activation functions and strides, nevertheless fewer trainable parameters to strengthen the network and enhance its performance. The proposed feature fusion framework is compared with the classifier fusion approach in terms of Matthews's correlation coefficient (MCC), training time, inference time, number of layers, number of parameters, graphics processing unit (GPU) memory utilization, and floating-point operations (FLOPS). Specifically, it achieves 94.6% sensitivity, 94.5% specificity, and 94.7% precision. Induced pluripotent stem cell (iPS) dataset is also introduced in this research work that has 16278 images which are labelled by three independent and experienced human experts of cell biology domain to facilitate future research. Experimental results show that the proposed framework offers an innovative and attainable algorithm for accelerating and systematizing the classification task along with saving time and effort.

Keywords: Computer vision; Deep convolutional neural networks; Endothelial cells; Human induced pluripotent stem cells; Image processing; Information fusion; Machine learning; Photomicrograph
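The evaluation measures named in this abstract (sensitivity, specificity, precision and the Matthews correlation coefficient) can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts in the usage note are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, precision and MCC for binary counts."""

    def safe_div(numerator, denominator):
        # Report 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),  # true positive rate
        "specificity": safe_div(tn, tn + fp),  # true negative rate
        "precision": safe_div(tp, tp + fp),    # positive predictive value
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }
```

For example, `binary_metrics(tp=90, fp=20, tn=80, fn=10)` gives a sensitivity of 0.9 and a specificity of 0.8. MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the two classes are imbalanced.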

AI Based Automated Gym Trainer Using Machine Learning

10th International Conference on Advanced Computing and Communication Systems, ICACCS 2024

Authors: Cholaraja, K.; Gururaj, M.; Swasthika, V.; Gnanasambandam, S.R. Affiliations: Sri Eshwar College of Engineering, Department of Computer Science and Business Systems, Coimbatore, India

ISBN: (Digital) 9798350384369

ISBN: (Print) 9798350384369

The 'Smart Exercise Counter using Computer Vision' is a groundbreaking system that blends cutting-edge computer vision technology with exercise monitoring. In an age where fitness and health are paramount, this innovative solution addresses the limitations of conventional fitness tracking methods by offering real-time, accurate feedback and comprehensive data analysis. By utilizing computer vision algorithms, this system tracks and analyzes the movements of individuals during their exercise routines, enhancing exercise counting, form assessment, and overall workout efficacy. Key components of the system include strategically positioned cameras, real-time image processing, machine learning models for exercise recognition, and biomechanically-informed form analysis. The system not only counts repetitions but also evaluates posture and technique, providing immediate feedback and corrections. Users can access their exercise data and progress through an intuitive mobile app or web dashboard, making it accessible and user-friendly. The applications of this technology extend beyond personal fitness, encompassing healthcare, sports training, and gym facilities. Healthcare professionals can employ it for rehabilitation and therapy, while athletes and coaches can refine training regimens. Gyms can enhance member experiences by offering advanced exercise monitoring. The 'Smart Exercise Counter using Computer Vision' embodies the future of exercise monitoring, where technology aligns with health and fitness goals, offering a promising path towards more accurate, effective, and enjoyable workouts. This abstract encapsulates the transformative potential of computer vision in shaping the way we approach physical exercise and well-being. © 2024 IEEE.

Keywords: Training; Computer vision; Accuracy; Tracking; Medical treatment; Machine learning; Real-time systems; Mobile applications; Monitoring; Sports
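The abstract does not say how repetitions are counted. One simple approach often used with pose keypoints is to track a joint angle over time and count a repetition each time the angle passes a "down" threshold and then returns past an "up" threshold. The sketch below illustrates that idea; the function names, the 2D keypoint format and the threshold values are all hypothetical and are not taken from the paper.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by 2D points a-b-c (e.g. shoulder-elbow-wrist)."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    return 360.0 - angle if angle > 180.0 else angle


def count_reps(angles, down_threshold=60.0, up_threshold=150.0):
    """Count repetitions in a sequence of joint angles using two thresholds.

    Using separate thresholds (hysteresis) avoids double counting when the
    angle jitters around a single cut-off value.
    """
    reps = 0
    stage = "up"
    for angle in angles:
        if stage == "up" and angle < down_threshold:
            stage = "down"  # joint fully flexed
        elif stage == "down" and angle > up_threshold:
            stage = "up"    # joint extended again: one full repetition
            reps += 1
    return reps
```

In practice the angle sequence would come from per-frame keypoints produced by a pose estimator, and the thresholds would need tuning for each exercise.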
