Moving Object Segmentation (MOS) is a fundamental task in computer vision. Due to undesirable variations in the background scene, MOS becomes very challenging for static and moving camera sequences. Several deep learning methods have been proposed for MOS with impressive performance. However, these methods show performance degradation in the presence of unseen videos, and deep learning models usually require large amounts of data to avoid overfitting. Recently, graph learning has attracted significant attention in many computer vision applications since it provides tools to exploit the geometrical structure of data. In this work, concepts of graph signal processing are introduced for MOS. First, we propose a new algorithm that is composed of segmentation, background initialization, graph construction, unseen sampling, and a semi-supervised learning method inspired by the theory of recovery of graph signals. Second, theoretical developments are introduced, showing one bound for the sample complexity in semi-supervised learning and two bounds for the condition number of the Sobolev norm. Our algorithm has the advantage of requiring less labeled data than deep learning methods while achieving competitive results on both static and moving camera videos. Our algorithm is also adapted for Video Object Segmentation (VOS) tasks and is evaluated on six publicly available datasets, outperforming several state-of-the-art methods in challenging conditions.
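The semi-supervised recovery step the abstract mentions can be illustrated with a minimal sketch: given a few labeled graph nodes, the unlabeled signal values are recovered by a least-squares problem regularized with a Sobolev-type norm x^T (L + eps*I)^beta x. The toy graph, sampled nodes, and parameter values below are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

# Toy 4-node path graph: adjacency matrix and combinatorial Laplacian
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

eps, beta, alpha = 0.1, 1, 0.05   # regularization parameters (illustrative)
S = [0, 3]                        # sampled (labeled) nodes
y = np.array([1.0, -1.0])         # labels: node 0 foreground, node 3 background

M = np.zeros((len(S), 4))
M[np.arange(len(S)), S] = 1.0     # sampling matrix selecting labeled nodes

# Closed-form minimizer of ||M x - y||^2 + alpha * x^T (L + eps*I)^beta x
Sob = np.linalg.matrix_power(L + eps * np.eye(4), beta)
x = np.linalg.solve(M.T @ M + alpha * Sob, M.T @ y)

labels = np.sign(x)               # thresholded per-node segmentation labels
print(labels)
```

The recovered signal interpolates the two seed labels smoothly over the graph, so the two nodes nearest the foreground seed receive positive labels.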
Computer vision and biometrics benefit from the recent advances in pattern recognition and artificial intelligence, which tend to make model-based face recognition more efficient. Also, deep learning combined with data augmentation tends to enrich the training sets used for learning tasks. Nevertheless, face recognition is still challenging, especially because of imaging issues that occur in practice, such as changes in lighting, appearance, head posture, and facial expression. In order to increase the reliability of face recognition, we propose a novel supervised appearance-based face recognition method that creates a low-dimensional orthogonal subspace enforcing face class separability. The proposed approach uses data augmentation to mitigate the problem of training sample scarcity. Unlike most face recognition approaches, the proposed approach is capable of efficiently handling grayscale and color face images, as well as low- and high-resolution face images. Moreover, the proposed supervised method presents better class structure preservation than typical unsupervised approaches, and also provides better data preservation than typical supervised approaches, as it obtains an orthogonal discriminating subspace that is not affected by the singularity problem common in such cases. Furthermore, a soft-margin Support Vector Machine classifier is learnt in the low-dimensional subspace and tends to be robust to noise and outliers commonly found in practical face recognition. To validate the proposed method, an extensive set of face identification experiments was conducted on three challenging public face databases, comparing the proposed method with methods representative of the state of the art. The proposed method tends to present higher recognition rates in all databases. In addition, the experiments suggest that data augmentation also plays an essential role in appearance-based face recognition, and that the CIELAB color space (L*a*b) is generally mor...
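The core idea of a supervised discriminant subspace can be sketched with a classical LDA-style projection: find a direction that maximizes between-class scatter relative to within-class scatter. This is a generic illustration of the concept, not the paper's actual algorithm; the synthetic two-class data and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.standard_normal((20, 3))                          # class 0 samples
X1 = rng.standard_normal((20, 3)) + np.array([4.0, 4.0, 0.0])  # class 1, shifted
X = np.vstack([X0, X1])
mu, mu0, mu1 = X.mean(0), X0.mean(0), X1.mean(0)

# Within-class and between-class scatter matrices
Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
Sb = 20 * np.outer(mu0 - mu, mu0 - mu) + 20 * np.outer(mu1 - mu, mu1 - mu)

# Leading eigenvector of Sw^-1 Sb gives the discriminant direction
vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
w = np.real(vecs[:, np.argmax(np.real(vals))])

# Projecting onto w separates the two class means
z0, z1 = X0 @ w, X1 @ w
print(abs(z0.mean() - z1.mean()))
```

A classifier (such as the soft-margin SVM the abstract mentions) would then be trained on such low-dimensional projections rather than on raw pixels.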
Graph Neural Networks (GNNs) are neural models that use message passing between graph nodes to represent the dependencies in graphs. Variants of GNNs, such as graph recurrent networks (GRNs), graph attention networks (GATs), and graph convolutional networks (GCNs), have shown remarkable results on a variety of deep learning tasks in recent years. In this study, we offer a generic design pipeline for GNN models, go over the variations of each part, classify the applications in an organized manner, and suggest four outstanding research issues. Dealing with graph data, which provides extensive connection information among elements, is necessary for many learning tasks. A model that learns from graph inputs is required for modelling physics systems, learning molecular fingerprints, predicting protein interfaces, and identifying illnesses. Reasoning on extracted structures (such as the dependency trees of sentences and the scene graphs of photos) is an important research issue that also requires graph reasoning models in other domains, such as learning from non-structural data like texts and images. GNNs are primarily designed for dealing with graph-structured data, where relationships between entities are modeled as edges in a graph. While GNNs are not traditionally applied to image classification problems, researchers have explored ways to leverage graph-based structures to enhance the performance of Convolutional Neural Networks (CNNs) in certain scenarios. GNNs have been increasingly applied to Natural Language Processing (NLP) tasks, leveraging their ability to model structured data and capture relationships between elements in a graph. GNNs are also applied to traffic-related problems, particularly in modeling and optimizing traffic flow, analyzing transportation networks, and addressing congestion issues. GNNs can be used for traffic flow prediction, dynamic routing and navigation, anomaly detection, public transport network...
The Internet of Things, artificial intelligence, machine learning, and big data are just a few of the cutting-edge technologies that are being integrated into manufacturing processes as part of the "Industry 4.0" revolution. Computer vision is an essential component of Industry 4.0 regarding sustainability, developed as a disruptive technology that extracts and interprets visual information from digital photos or videos using image processing techniques and advanced models. In the context of Industry 4.0, this article offers an overview of computer vision, including its associated prospects, difficulties, and applications. A particular emphasis is placed on sustainability. It explores computer vision applications in robotics and automation, safety and security, process optimization, augmented reality, robotics and inspection, object identification and tracking, predictive maintenance, and quality control and inspection. The study also identifies the critical approaches used to overcome the difficulties in implementing computer vision solutions. Incorporating computer vision into Industry 4.0 holds promise for unleashing unprecedented levels of efficiency, novelty, and competitiveness in the industrial sector. The manufacturing and industrial sectors may embrace Industry 4.0's prospects and adopt sustainable practices by utilizing computer vision and overcoming its inherent limits. This will help to create an eco-conscious and efficient future.
Depth estimation and 3D object detection are critical for autonomous systems to gain context of their surroundings. In recent times, compute capacity has improved tremendously, enabling computer vision and AI on the e...
Machine vision systems play vital roles in industrial applications in order to maintain quality and control processes. Machine vision technology has numerous applications in various industries, like the automotive ind...
ISBN:
(print) 9783031821523; 9783031821530
Identifying and locating objects in images and videos, including elements like traffic signs, vehicles, buildings, and people, constitutes a fundamental and demanding task in computer vision, known as object detection. Due to the high computational complexity of this technique and the large amount of data carried by the video signal, it is nearly impossible for ordinary general-purpose processors (GPPs) or CPUs to run these techniques in real time, especially in embedded systems applications. Therefore, special hardware that can acquire, control, or execute in parallel is required. These specialized hardware systems include Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Vision Processing Units (VPUs), Tensor Processing Units (TPUs), Neural Processing Units (NPUs), and Graphics Processing Units (GPUs). This work presents the benefits of accelerating traditional object detection methods on a high-end embedded system, the Jetson Nano Developer Kit. This single-board computer is equipped with the Tegra K1 System on Chip (SoC), which is composed of a quad-core ARM A15 and a 192-core Kepler embedded GPU. Computing acceleration was achieved using the CUDA OpenCV library for both the Histogram of Oriented Gradients (HOG) and the Haar Cascade Classifier. For VGA resolution, results reveal that the GPU implementation on this embedded system is 1.4x faster than the CPU for the HOG method and 2x faster for the Haar Cascade Classifier method.
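The HOG descriptor being accelerated above reduces, at its core, to a magnitude-weighted histogram of gradient orientations per image cell. The sketch below shows only that underlying computation in plain NumPy for a single 8x8 cell with random data; the work in the text uses OpenCV's CUDA implementation, not this code.

```python
import numpy as np

rng = np.random.default_rng(0)
cell = rng.random((8, 8))                    # one grayscale image cell

gx = np.gradient(cell, axis=1)               # horizontal gradient
gy = np.gradient(cell, axis=0)               # vertical gradient
mag = np.hypot(gx, gy)                       # gradient magnitude
ang = np.degrees(np.arctan2(gy, gx)) % 180   # unsigned orientation, [0, 180)

# 9-bin orientation histogram weighted by gradient magnitude
hist, _ = np.histogram(ang, bins=9, range=(0, 180), weights=mag)
hist /= np.linalg.norm(hist) + 1e-6          # block-style L2 normalization

print(hist.shape)                            # 9-dimensional cell descriptor
```

A full detector concatenates such normalized cell histograms over a sliding window and feeds them to a linear classifier; GPUs parallelize this per-cell work, which is where the reported 1.4x speedup comes from.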
Actions speak more than words. In the context of the above statement, the importance of gestures and using them to control a system has become popular. The hand gesture recognition system for opening applications in W...
The recognition of facial emotions has received growing focus in recent years due to its importance and the significant role it plays in shaping the way humans interact with computers. This can be achieved using deep ...
ISBN:
(print) 9783031667428; 9783031667435
Body movements are an essential part of non-verbal communication, as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges, and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial-expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain for solving video- and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information extraction. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function for the visual transformer, which enables the transformation of 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables more accurate recognition of emotions by establishing unique correlations between an individual's body motions over time. Our results demonstrate that the self-attention mechanism achieves high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.
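The self-attention mechanism described above can be sketched as scaled dot-product attention over a short sequence of joint-coordinate embeddings: each time step attends to every other, producing a context-weighted representation. The sequence length, embedding size, and random weights below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over rows of X (one row per time step)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over positions
    return weights @ V                             # context-weighted values

rng = np.random.default_rng(1)
T, d = 5, 4                      # 5 time steps, 4-dim joint embedding (toy sizes)
X = rng.standard_normal((T, d))  # stand-in for projected 2D joint coordinates
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                 # same sequence length, contextualized features
```

In a transformer for BER, many such heads and layers are stacked, and the attention weights learn which body motions at different times are correlated with an emotion.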