Search Results - Inner Mongolia University Library

8th IFIP TC 12 International Conference on Computer, Communication and Signal Processing, ICCCSP 2024

Authors: Subrahmanyam, B.R.; Janavi, T.S.; Keerthana, V.; Serajudeen, Ayesha Jumana; Arunnagiri, A.M. (Department of Electronics and Communication Engineering, College of Engineering Guindy, Anna University, Chennai 600025, India)

ISBN: (print) 9783031736162

The project addresses the challenge of accurately identifying blurred faces in computer vision and facial recognition. It introduces a novel framework that integrates deblurring techniques, utilizing point spread function deconvolution to enhance facial image quality. Principal Component Analysis (PCA) is employed for feature extraction, and a K-Nearest Neighbors (KNN) classifier is applied for face identification. The combined deblurring and PCA-transformed features improve matching and identification accuracy, particularly in scenarios with initially blurred images. Experimental validation on a real-world dataset demonstrates the efficacy of the proposed methodology. This approach not only enhances facial recognition accuracy but also lays the groundwork for future research in challenging practical applications, such as security and law enforcement. © IFIP International Federation for Information Processing 2025.
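The feature-extraction and classification stages named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the deblurring stage is omitted, only the first principal component is kept (found by power iteration), and the toy 3-D "images", labels, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def fit_pca_1d(rows, iters=200):
    """Return (mean, first principal direction) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Scatter matrix; its scale does not affect the principal direction.
    cov = [[sum(r[i] * r[j] for r in centered) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector, not orthogonal to the answer here
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(row, mean, comp):
    """Project one sample onto the principal direction."""
    return sum((row[j] - mean[j]) * comp[j] for j in range(len(row)))

def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training features nearest to the query."""
    nearest = sorted(zip(train_feats, train_labels),
                     key=lambda p: abs(p[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical, already-deblurred feature vectors for two identities.
train = [[0.0, 0.1, 0.2], [0.2, 0.0, 0.1], [0.1, 0.2, 0.0],
         [5.0, 5.1, 5.2], [5.2, 5.0, 5.1], [5.1, 5.2, 5.0]]
labels = ["a", "a", "a", "b", "b", "b"]

mean, comp = fit_pca_1d(train)
feats = [project(r, mean, comp) for r in train]

def identify(sample):
    return knn_predict(feats, labels, project(sample, mean, comp))

print(identify([4.8, 5.3, 4.9]), identify([0.3, -0.1, 0.2]))
```

In practice more components would be retained and the inputs would be flattened, deblurred face images; the single-component version only shows how projection and neighbour voting fit together.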

Keywords: Optical transfer function

Genetic Algorithm Augmented Inception-Net based Image Classifier Accelerated on FPGA

Multimedia Tools and Applications, 2023, vol. 82, no. 29, pp. 45097-45125

Authors: Kaziha, Omar; Bonny, Talal; Jarndal, Anwar (Department of Electrical Engineering, College of Engineering, University of Sharjah, Sharjah, United Arab Emirates; Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates)

Deep learning models for computer vision applications specifically and for machine learning generally are now the state of the art. The growth of size and complexity of neural networks has made them more and more reliable, yet in greater need of computational power and memory as is evident from the heavy reliance on graphical processing units and cloud computing for training them. As the complexity of deep neural networks increases, the need for fast processing neural networks in real-time embedded applications at the edge also increases and accelerating them using reconfigurable hardware suggests a solution. In this work, a convolutional neural network based on the inception net architecture is first optimized in software and then accelerated by taking advantage of field programmable gate array (FPGA) parallelism. Genetic algorithm augmented training is proposed and used on the neural network to produce an optimum model from the first training run without re-training iterations. Quantization of the network parameters is performed according to the weights of the network. The resulting neural network is then transformed into hardware by writing the register transfer level (RTL) code for FPGAs with exploitation of layer parallelism and a simple trial-and-error allocation of resources with the help of the roofline model. The approach is simple and easy to use as compared to many complex existing methods in literature and relies on trial and error to customize the FPGA design to the model needed to work on any computer vision or multimedia application deep learning model. Simulation and synthesis are performed. The results prove that the genetic algorithm reduces the number of back-propagation epochs in software and brings the network closer to the global optimum in terms of performance. Quantization to 16 bits also shows a reduction in network size by almost half with no performance drop. The synthesis of our design also shows that the Inception-based classifier is cap
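The abstract reports that 16-bit quantization roughly halves the network size. The sketch below illustrates why, under assumptions: it uses symmetric linear quantization with a scale derived from the largest weight magnitude, which may differ from the scheme in the paper, and the weight values and function names are hypothetical.

```python
from array import array

def quantize_int16(weights):
    """Map floats to signed 16-bit integers with one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 32767 if max_abs > 0 else 1.0
    q = array("h", (int(round(w / scale)) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 16-bit codes."""
    return [v * scale for v in q]

# Hypothetical layer weights, stored first as 32-bit floats.
weights = [0.5, -1.25, 0.003, 0.9, -0.0007, 1.1]
fp32 = array("f", weights)

q, scale = quantize_int16(weights)
restored = dequantize(q, scale)

size_fp32 = len(fp32) * fp32.itemsize   # bytes used by 32-bit storage
size_int16 = len(q) * q.itemsize        # bytes used by 16-bit storage
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(size_fp32, size_int16, max_err)
```

The stored scale factor adds a small overhead per tensor, which is one reason a real model shrinks by "almost" rather than exactly half.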

Keywords: Genetic algorithms

An end-to-end deep convolutional neural network-based data-driven fusion framework for identification of human induced pluripotent stem cell-derived endothelial cells in photomicrographs

Engineering Applications of Artificial Intelligence, 2025, vol. 139

Authors: Iqbal, Imran; Ullah, Imran; Peng, Tingying; Wang, Weiwei; Ma, Nan (Inst Act Polymers, Helmholtz Zentrum Hereon, Dept PLR, D-14513 Teltow, Germany; German Res Ctr Environm Hlth, Helmholtz Munich, D-85764 Munich, Germany)

Deep learning is a very powerful analytic tool for recognizing patterns in data and making appropriate predictions. It has tremendous potential in data analysis, particularly in the cell biology domain, owing to the growing scale and inherent complexity of biological data. The core purpose of this research work is to design, implement, and calibrate an efficient deep convolutional neural network (DCNN) architecture for a binary classification problem. This diversified network is developed to precisely identify human induced pluripotent stem cell-derived endothelial cells (hiPSC-derived EC) from photomicrographs. The proposed architecture is developed with numerous convolutional modules, multiple kernel sizes, and various pooling layers, activation functions and strides, yet with fewer trainable parameters, to strengthen the network and enhance its performance. The proposed feature fusion framework is compared with the classifier fusion approach in terms of Matthews correlation coefficient (MCC), training time, inference time, number of layers, number of parameters, graphics processing unit (GPU) memory utilization, and floating-point operations (FLOPs). Specifically, it achieves 94.6% sensitivity, 94.5% specificity, and 94.7% precision. An induced pluripotent stem cell (iPS) dataset is also introduced in this research work; it has 16,278 images labelled by three independent and experienced human experts from the cell biology domain to facilitate future research. Experimental results show that the proposed framework offers an innovative and attainable algorithm for accelerating and systematizing the classification task while saving time and effort.
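For readers unfamiliar with the metrics quoted in this abstract, the following sketch shows how sensitivity, specificity, precision, and the Matthews correlation coefficient are derived from a binary confusion matrix. The counts are hypothetical and are not taken from the paper; the function name is an assumption.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "mcc": mcc}

# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=20, tn=80, fn=10)
perfect = binary_metrics(tp=50, fp=0, tn=50, fn=0)
print(m)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.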

Keywords: Computer vision; Deep convolutional neural networks; Endothelial cells; Human induced pluripotent stem cells; Image processing; Information fusion; Machine learning; Photomicrograph

AI Based Automated Gym Trainer Using Machine Learning

10th International Conference on Advanced Computing and Communication Systems, ICACCS 2024

Authors: Cholaraja, K.; Gururaj, M.; Swasthika, V.; Gnanasambandam, S.R. (Sri Eshwar College of Engineering, Department of Computer Science and Business Systems, Coimbatore, India)

ISBN: (digital) 9798350384369

ISBN: (print) 9798350384369

The 'Smart Exercise Counter using Computer vision' is a groundbreaking system that blends cutting-edge computer vision technology with exercise monitoring. In an age where fitness and health are paramount, this innovative solution addresses the limitations of conventional fitness tracking methods by offering real-time, accurate feedback and comprehensive data analysis. By utilizing computer vision algorithms, this system tracks and analyzes the movements of individuals during their exercise routines, enhancing exercise counting, form assessment, and overall workout efficacy. Key components of the system include strategically positioned cameras, real-time image processing, machine learning models for exercise recognition, and biomechanically-informed form analysis. The system not only counts repetitions but also evaluates posture and technique, providing immediate feedback and corrections. Users can access their exercise data and progress through an intuitive mobile app or web dashboard, making it accessible and user-friendly. The applications of this technology extend beyond personal fitness, encompassing healthcare, sports training, and gym facilities. Healthcare professionals can employ it for rehabilitation and therapy, while athletes and coaches can refine training regimens. Gyms can enhance member experiences by offering advanced exercise monitoring. The 'Smart Exercise Counter using Computer vision' embodies the future of exercise monitoring, where technology aligns with health and fitness goals, offering a promising path towards more accurate, effective, and enjoyable workouts. This abstract encapsulates the transformative potential of computer vision in shaping the way we approach physical exercise and well-being. © 2024 IEEE.

Keywords: Training; Computer vision; Accuracy; Tracking; Medical treatment; Machine learning; Real-time systems; Mobile applications; Monitoring; Sports

Generalizing Functional Error Correction for Language and Vision-Language Models

23rd IEEE International Conference on Machine Learning and Applications, ICMLA 2024

Authors: Peng, Wenyu; Zheng, Simeng; Baluja, Michael; Xie, Tao; Jiang, Anxiao; Siegel, Paul H. (University of California San Diego, Electrical and Computer Engineering Department, La Jolla, CA 92093, United States; San Diego State University, San Diego, CA 92182, United States; Texas A&M University, Department of Computer Science and Engineering, College Station, TX 77843, United States)

ISBN: (print) 9798350374889

The goal of functional error correction is to preserve neural network performance when stored network weights are corrupted by noise. To achieve this goal, a selective protection (SP) scheme was proposed to optimally protect the functionally important bits in binary weight representations in a layer-dependent manner. Although it showed its effectiveness in image classification tasks on some relatively simple networks such as ResNet-18 and VGG-16, it becomes inadequate for emerging complex machine learning tasks generated from natural language processing and vision-language association domains. To solve this problem, we extend the SP scheme in three directions: task complexity, model complexity, and storage complexity. Extensions to complex natural language and vision-language tasks include text categorization and 'zero-shot' textual classification of images. Extensions to more complex models with deeper block structures and attention mechanisms consist of Very Deep Convolutional Neural Network (VDCNN) and Contrastive Language-Image Pre-Training (CLIP) networks. Extensions to more complex storage configurations focus on distributed storage architectures to support model parallelism. Experimental results show that the optimized SP scheme preserves network performance in all of these settings. The results also provide insights into redundancy-performance tradeoffs, generalizability of SP across datasets and tasks, and robustness of partitioned network architectures. © 2024 IEEE.

Keywords: Visual languages

Comparison of image processing techniques for defect detection

1st International Workshop of Young Scientists on Artificial Intelligence for Sustainable Development, AISD 2024

Authors: Kovalskyi, Semen; Koval, Vasyl (West Ukrainian National University, 11 Lvivska Str., Ternopil 46009, Ukraine)

Defect detection is a crucial quality control process in the manufacturing industry, aimed at identifying and classifying imperfections or anomalies in products before they reach customers. Traditional manual inspection methods are time-consuming, labor-intensive, and prone to human error. This paper provides a comprehensive overview of image-based defect detection algorithms, including traditional image processing techniques, machine learning algorithms, and deep learning models. The study analyzes the strengths, limitations, and performance of each approach across various applications and datasets. The results demonstrate that while traditional methods and machine learning algorithms offer reliable defect detection, deep learning models, particularly convolutional neural networks (CNNs), achieve exceptional accuracy and robustness. However, deep learning models require significant computational resources and large amounts of labeled data for training. The paper highlights the importance of selecting the most appropriate approach based on specific application requirements, data characteristics, and computational constraints. Furthermore, it discusses future research opportunities, such as developing more robust and generalized algorithms, leveraging multi-modal data, improving model interpretability, and enabling real-time and edge computing solutions. © 2024 Copyright for this paper by its authors.

Keywords: Computer vision

E2Evideo: End to End Video and Image Pre-processing and Analysis Tool

30th International Conference on MultiMedia Modeling, MMM 2024

Authors: Alawad, Faiga; Halvorsen, Pål; Riegler, Michael A. (Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, Oslo, Norway)

ISBN: (print) 9783031533013

In this demonstration paper, we present "e2evideo", a versatile Python package composed of domain-independent modules. These modules can be seamlessly customised to suit specialised tasks by modifying specific attributes, allowing users to tailor functionality to meet the requirements of a targeted task. The package offers a variety of functionalities, such as interpolating missing video frames, background subtraction, image resizing, and extracting features utilising state-of-the-art machine learning techniques. With its comprehensive set of features, "e2evideo" stands as a facilitating tool for developers in the creation of image and video processing applications, serving diverse needs across various fields of computer vision. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords: Python

Video Denoising using Temporal Coherency of Video Frames and Sparse Representation

12th Iranian/2nd International Conference on Machine Vision and Image Processing, MVIP 2022

Authors: Torkashvand, Azadeh; Behrad, Alireza (Shahed University, Electrical Engineering Department, Tehran, Iran)

ISBN: (print) 9781665412162

Sparse representation based on dictionary learning has been widely used in many applications over the past decade. In this article, a new method is proposed for removing noise from video images using sparse representation and a trained dictionary. To enhance the noise removal capability, the proposed method is combined with a block matching algorithm to take advantage of the temporal dependency of video images and increase the quality of the output images. The simulations performed on different test data show the appropriate response of the proposed algorithm in terms of video image output quality. © 2022 IEEE.

Keywords: Image reconstruction

Planning the trajectory of an object in a confined space using stationary machine vision systems

Conference on Optical Metrology and Inspection for Industrial Applications X

Authors: Urunov, Salavat; Voronin, Viacheslav; Semenishchev, Evgenii (Moscow State Tech Univ STANKIN, 1a Vadkovsky, Moscow 127055, Russia)

ISBN: (print) 9781510667877; 9781510667884

The article proposes an approach to the formation of the trajectory of the spatial movement of a controlled object in a confined space using stationary vision systems. For its implementation, the following main steps are used in the work:
1. Preprocessing of data generated by the machine vision system. The task includes multicriteria image processing in order to minimize the noise component and determine the boundaries of objects.
2. An automated method for adaptive non-local separation of objects into borders, background and objects.
3. Execution of the task of adaptive nonlocal binarization.
4. Building a mask of stationary and current moving objects.
5. Formation of an equidistant displacement trajectory.
6. Checking the trajectory by moving in adjacent frames.
7. Prediction and remeasurement of the position of objects in the frame based on displacement vectors and correction of the object's movement trajectory.
8. Formation of a control team to move an object in a confined space using stationary vision systems.
To test the effectiveness, studies were conducted on a set of test sequences. The studies were carried out on a group of cameras in the visible spectrum (1920x1080, RGB, 8 bits) covering the entire field of view. The adaptability of the application of the proposed approach in solving complex problems is shown.

Keywords: machine vision; edge detection; preprocessing; multicriteria method; planning trajectory

Representation Extraction Using Hyperbolic Knowledge Distilled Framework - An Industrial Application on High Risk Environment

16th International Conference on Computer and Automation Engineering (ICCAE)

Authors: Kumar, Vijeth; Murugesan, Malathi; Veneri, Giacomo (Baker Hughes, Bangalore, India; Baker Hughes, Florence, Italy)

ISBN: (print) 9798350370058; 9798350370164

We propose a computer vision architecture based on hyperbolic networks, contrastive learning and knowledge distillation to detect unsafe behavior in energy production and oil & gas plants. Data scarcity poses a significant challenge to developing machine learning applications in industry. Indeed, the data may be incomplete, inconsistent, or biased, making it difficult to develop accurate and reliable models. Insufficient data during the training phase has a direct impact on the models' representation learning capabilities; with the aid of Vision Transformers (ViTs), we are able to address data-scarce situations by learning efficient representations of the existing data. We harnessed the power of ViTs, as they incorporate more global information, leading to quantitatively stronger intermediate feature representations. Further, we approached the task with contrastive learning and obtained pairs of similar samples to tackle the limited data availability in our industrial use case. By applying hyperbolic embeddings, the proposed approach helps in extracting complex relationships in the data. Furthermore, the size of the model makes it suitable for devices with low computational capabilities such as unmanned robots.

Keywords: Hyperbolic networks; Contrastive Learning and Knowledge Distillation; Image processing; Image Understanding; Vision Transformers (ViTs); Data Scarcity
