Search Results - Inner Mongolia University Library

18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP) / 18th International Conference on Computer Graphics Theory and Applications

Authors: Paiva, Pedro V. V.; Ramos, Josue J. G.; Gavrilova, Marina; Carvalho, Marco A. G. Affiliations: Univ Estadual Campinas Sch Technol Limeira Brazil; Renato Archer IT Ctr Cyber Phys Syst Div Campinas Brazil; Univ Calgary Dept Comp Sci Calgary AB Canada

ISBN: (print) 9783031667428; 9783031667435

Body movements are an essential part of non-verbal communication, as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges, and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain to solve video and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information extraction. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function of the visual transformer, which enables the transformation of 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables a more accurate recognition of emotions by establishing unique correlations between individuals' body motions over time. Our results demonstrated that the self-attention mechanism was able to achieve high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.

Keywords: Body emotion recognition; Affective computing; Video and image processing; Gait analysis; Attention-based design
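
The abstract above mentions a linear projection of 2D joint coordinates followed by self-attention over time. As a rough illustration only, the following sketch flattens each frame's joints, applies a linear projection, and runs standard scaled dot-product self-attention across frames. It is not the authors' model: the function names, the 17-joint skeleton, the sequence length, the embedding size, and the random weights are all assumptions made for this example.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def project_frames(poses, weight):
    """Flatten each frame's (x, y) joints into one row, then project linearly."""
    flat = [[c for joint in frame for c in joint] for frame in poses]  # (T, 2J)
    return matmul(flat, weight)                                        # (T, D)

def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention across the frame (time) axis."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)                                            # (T, T)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
T, J, D = 6, 17, 8  # frames, joints, embedding size (all hypothetical)
poses = [[(rng.random(), rng.random()) for _ in range(J)] for _ in range(T)]
tokens = project_frames(poses, random_matrix(2 * J, D, rng))
context, attention = self_attention(
    tokens, random_matrix(D, D, rng), random_matrix(D, D, rng), random_matrix(D, D, rng)
)
```

Each row of `attention` sums to 1 and describes how strongly one frame attends to every other frame in the sequence.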

A Machine Vision Algorithm Approach for Angle Detection in Industrial Applications

12th International Symposium on Intelligent Manufacturing and Service Systems, IMSS 2023

Authors: Kayğusuz, Mehmet; Öz, Barış; Çelik, Ayberk; Akgül, Yunus Emre; Şimşek, Gözde; Sarıgüzel, Ebru Gezgin. Affiliations: Ar-Ge Merkezi Mamur Teknoloji Sistemleri Istanbul Turkey; Graduate School of Science and Engineering Yildiz Technical University Istanbul Turkey; Graduate School of Science and Engineering Kocaeli University Kocaeli İzmir Turkey

ISBN: (print) 9789819960613

In automatic feeding systems, feeding of characteristic workpieces by mechanical tools causes accuracy and cost difficulties. For this reason, in systems where special workpieces are fed, image processing applications are necessary to obtain characteristic features of a product. In this study, a novel image processing algorithm is proposed for feeding a workpiece which has characteristic geometrical structures. The proposed algorithm is based on obtaining geometrical and rotational properties of the product and on gradient-based analysis, as follows. The first step is to extract features from the shape of the workpiece; this step includes noise reduction, filtering, and edge detection operations. In the second step, the gradient values of the edge information are used to create the angle-length vector pair. The workpiece rotation information is derived from length values indexed with angle information. The last step involves determination of the workpiece position in the 2D coordinate system. The coordinate information is used to determine the position of the gripper holder. The coordinates and angle are transmitted to the feed control. The proposed algorithm is applied to 800 images collected from manufactured products. The rotation angle of the workpiece is determined within a tolerance of 1.5°. The results show sufficient accuracy for industrial applications. © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: Computer vision
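
The abstract above describes pairing gradient angles with length values and reading the rotation from lengths indexed by angle. One plausible reading, sketched below under stated assumptions, is a histogram that accumulates gradient magnitude per gradient-angle bin and takes the strongest bin as the orientation. This is not the authors' algorithm: `gradient_angle_histogram`, `dominant_angle`, the 1-degree bins, and the synthetic ramp image are hypothetical, and the noise reduction and edge detection steps are omitted.

```python
import math

def gradient_angle_histogram(image, bin_deg=1):
    """Accumulate gradient magnitude ("length") per gradient-angle bin."""
    n_bins = 360 // bin_deg
    hist = [0.0] * n_bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Angle in image coordinates (y grows downward), mapped to [0, 360).
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(round(angle / bin_deg)) % n_bins] += magnitude
    return hist

def dominant_angle(hist, bin_deg=1):
    """Return the angle (in degrees) of the bin with the largest total length."""
    best = max(range(len(hist)), key=hist.__getitem__)
    return best * bin_deg

# Synthetic test image: an intensity ramp whose gradient points at 30 degrees.
theta = math.radians(30.0)
image = [[x * math.cos(theta) + y * math.sin(theta) for x in range(32)]
         for y in range(32)]
estimate = dominant_angle(gradient_angle_histogram(image))
```

On real workpiece images the histogram would be built from edge pixels only, and the bin width would be chosen to match the required angular tolerance.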

Machine learning applications for early detection of esophageal cancer: a systematic review

BMC MEDICAL INFORMATICS AND DECISION MAKING, 2023, Vol. 23, No. 1, pp. 1-17

Authors: Hosseini, Farhang; Asadi, Farkhondeh; Emami, Hassan; Ebnali, Mahdi. Affiliations: Shahid Beheshti Univ Med Sci Sch Allied Med Sci Dept Hlth Informat Technol & Management Tehran Iran; Harvard Med Sch Dept Emergency Med Boston MA USA

Introduction: Esophageal cancer (EC) is a significant global health problem, with an estimated 7th highest incidence and 6th highest mortality rate. Timely diagnosis and treatment are critical for improving patients' outcomes, as over 40% of patients with EC are diagnosed after metastasis. Recent advances in machine learning (ML) techniques, particularly in computer vision, have demonstrated promising applications in medical image processing, assisting clinicians in making more accurate and faster diagnostic decisions. Given the significance of early detection of EC, this systematic review aims to summarize and discuss the current state of research on ML-based methods for the early detection of *** conducted a comprehensive systematic search of five databases (PubMed, Scopus, Web of Science, Wiley, and IEEE) using search terms such as "ML", "Deep Learning (DL)", "Neural Networks (NN)", "Esophagus", "EC" and "Early Detection". After applying inclusion and exclusion criteria, 31 articles were retained for full *** results of this review highlight the potential of ML-based methods in the early detection of EC. The average accuracy of the reviewed methods in the analysis of endoscopic and computed tomography (CT) images of the esophagus was over 89%, indicating a high impact on early detection of EC. Additionally, the highest percentage of clinical images used in the early detection of EC with the use of ML was related to white light imaging (WLI) images. Among all ML techniques, methods based on convolutional neural networks (CNN) achieved higher accuracy and sensitivity in the early detection of EC compared to other *** findings suggest that ML methods may improve accuracy in the early detection of EC, potentially supporting radiologists, endoscopists, and pathologists in diagnosis and treatment planning. However, the current literature is limited, and more studies are needed to investigate the clinical applications of these met

Keywords: Machine learning; Deep learning; Esophagus; Esophageal Cancer; Early detection

Design and simulation of autonomous military vehicle control system based on machine vision and ensemble movement approach

JOURNAL OF SUPERCOMPUTING, 2022, Vol. 78, No. 15, pp. 17309-17347

Authors: Ahmadi, Kourosh Dadashtabar; Rashidi, Ali Jabar; Moghri, Ali Massomi. Affiliations: Malek Ashtar Univ Technol Dept Elect & Comp Engn Tehran Iran

On the battlefield, early detection of armored vehicles can have a positive effect, because it allows timely and appropriate reactions. The purpose of this study is to develop the algorithm required in the vehicle control system by considering car sensor vision, which is necessary to identify and determine the equipment needed to control the military drone based on car sensor vision. Today, the use of wireless networks, especially inter-vehicle wireless networks, in military applications is inevitable. Therefore, in the first step of this research, a new method is proposed to control and steer unmanned vehicles based on car vision. The proposed method uses images recorded by two 180-degree panoramic cameras with horizontal vision. The simulation results of the proposed method show increased accuracy and reduced implementation cost compared to using LIDAR and RADAR technologies. In the second step, a new approach is introduced to identify four common classes of armored vehicles (tanks, personnel carriers, firing tanks, and military vehicles) that are more likely to be present on battlefields. For this purpose, the latest image processing methods, namely deep learning, are used. The simulation results show that the proposed approach detects armored vehicles with high accuracy in a short time. In the third step of this research, a new method is proposed to increase the connectivity of wireless networks. The proposed method uses queue theory, and the simulation results show the high efficiency of the method. As a result, accurate and fast detection with unique features makes the users of the system superior.

Keywords: Inter-vehicular wireless network; Military applications; Machine vision; Detection; Connectivity

Insightguard: Machine Learning Empowered Self-Monitoring System for Vision Centric Applications

10th International Conference on Advanced Computing and Communication Systems, ICACCS 2024

Authors: Bharathi, R.; Ezra, P.; Gowtham, S.; Hemanth, D.; Nagendran, M. Affiliations: M.Kumarasamy College of Engineering Department of Information Technology Karur India

ISBN: (digital) 9798350384369

ISBN: (print) 9798350384369

The non-trivial task of designing an automatic picture content recognition system has been researched for several applications, including human identification, face detection, and face recognition. Digital image processing is presented in many forms, one of which is face recognition. The automatic identification of a person in a digital image is the subject of the difficult problem known as automatic face detection. There are numerous algorithms available for use in this procedure. However, there are currently no methods for automatically detecting faces in a variety of application scenarios at low resolutions. This project's computer vision technology can be used to forecast whether or not screens will be within the user's field of vision. Positioning monitors incorrectly can result in issues that could cause eyestrain. For example, the user might scoot the chair away from the screen or tilt their head back, which would force them to type with their arms outstretched. However, there isn't an automated system in place to gauge the distance between the display and the eye. Thus, in this project we construct an autonomous alarm system based on face recognition and distance. The maximum distance is 1.02 metres (3.3 feet), while the minimum is 0.38 metres (1.2 feet). Artificial intelligence can be used to accomplish that. Human head positions can be recorded using a web camera, allowing us to distinguish between foreground and background head positions. Faces are then identified and detected using image processing algorithms. Lastly, the web camera is used to determine the distance between the screen and the face. Without the need for any sensors, an alert is automatically generated and sent to users if the distance is less than the pre-defined threshold value. © 2024 IEEE.

Keywords: Face recognition
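
The abstract above gives distance limits of 0.38 m and 1.02 m but does not say how the distance is computed from the camera image. The following minimal sketch assumes a simple pinhole-camera relation between the detected face width in pixels and the distance. The focal length, the average face width, and both function names are hypothetical values chosen for illustration, not taken from the paper.

```python
MIN_DISTANCE_M = 0.38  # minimum viewing distance stated in the abstract
MAX_DISTANCE_M = 1.02  # maximum viewing distance stated in the abstract

def estimate_distance_m(face_width_px, focal_length_px=600.0, real_face_width_m=0.15):
    """Pinhole-camera estimate: distance = focal length * real width / pixel width.

    focal_length_px and real_face_width_m are assumed calibration values.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_m / face_width_px

def check_viewing_distance(distance_m):
    """Classify a distance against the limits given in the abstract."""
    if distance_m < MIN_DISTANCE_M:
        return "too close"
    if distance_m > MAX_DISTANCE_M:
        return "too far"
    return "ok"

# A wider detected face means the user is nearer to the camera.
near = estimate_distance_m(300)   # about 0.30 m
mid = estimate_distance_m(150)    # about 0.60 m
far = estimate_distance_m(80)     # about 1.125 m
statuses = [check_viewing_distance(d) for d in (near, mid, far)]
```

In practice the focal length would come from a one-time calibration with a face at a known distance, and the face width in pixels from whichever face detector the system uses.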

A novel hybrid architecture for video frame prediction: combining convolutional LSTM and 3D CNN

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2025, Vol. 22, No. 1, pp. 1-18

Authors: Aravinda, C. V.; Al-Shehari, Taher; Alsadhan, Nasser A.; Shetty, Shashank; Padmajadevi, G.; Reddy, K. R. Udaya Kumar. Affiliations: Nitte Deemed Be Univ NMAM Inst Technol Dept Comp Sci & Engn NITTE Karkala 574110 Karnataka India; King Saud Univ Dept Selfdev Skill Comp Skills Common Year Deanship 1 Riyadh 11362 Saudi Arabia; King Saud Univ Coll Comp & Informat Sci Comp Sci Dept Riyadh 12372 Saudi Arabia; Maland Coll Engn Dept Elect & Commun Engn Hassan 573202 Karnataka India; Dayananda Sagar Univ Bangalore 560078 Karnataka India

Video frame prediction represents a fundamental challenge in computer vision, necessitating precise modeling of both spatial and temporal dynamics within video sequences. This computational task holds substantial implications across diverse domains, including video compression optimization, robust object tracking systems, and advanced motion forecasting applications. In this investigation, we present a novel hybrid architecture that synthesizes the complementary strengths of Convolutional Long Short-Term Memory (ConvLSTM) networks and three-dimensional Convolutional Neural Networks (3D CNN) for enhanced frame prediction capabilities. Our methodological framework incorporates a ConvLSTM component that fundamentally augments the traditional LSTM architecture through the integration of convolutional operations, thereby facilitating sophisticated modeling of sequential dependencies. Concurrently, the 3D CNN component employs volumetric convolutional layers to extract rich spatio-temporal features from the input sequences. Rigorous empirical evaluation demonstrates the superior performance of the ConvLSTM architecture, which consistently yields reduced validation errors and elevated coefficients of determination. Specifically, the ConvLSTM model achieves a validation Mean Squared Error (MSE) of 0.0237 and an R² value of 0.6951, substantially outperforming the 3D CNN model, which exhibits a validation MSE of 0.0471 and an R² value of 0.3939. These empiri

Keywords: Video frame prediction; ConvLSTM network; Spatio-temporal model; Encoder-decoder network; Convolutional neural network (CNN); Long short-term memory (LSTM); Convolutional operations; Spatio-temporal dependencies; Future frame generation; Computer vision applications
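
The abstract above compares models by validation Mean Squared Error and by the coefficient of determination R². For readers unfamiliar with the two metrics, the sketch below computes both from their standard definitions on a small made-up set of values. The data and function names are illustrative assumptions and are unrelated to the paper's experiments.

```python
def mean_squared_error(targets, predictions):
    """MSE: average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def r_squared(targets, predictions):
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is taken around the target mean."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Made-up values standing in for flattened pixel intensities of a predicted frame.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
mse = mean_squared_error(targets, predictions)
r2 = r_squared(targets, predictions)
```

A lower MSE and an R² closer to 1 both indicate predictions that track the targets more closely, which is the sense in which the abstract ranks ConvLSTM above the 3D CNN.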

Image-based crack detection approaches: a comprehensive survey

MULTIMEDIA TOOLS AND APPLICATIONS, 2022, Vol. 81, No. 28, pp. 40181-40229

Authors: Gupta, Priyanka; Dixit, Manish. Affiliations: MITS Dept Comp Sci & Engn Gwalior Madhya Pradesh India

Automatic crack detection is a challenging task that has been researched for decades due to the complexity of civil structures. Cracks on any structure are early signs of the deterioration of the object's surface. Therefore, detection and regular maintenance of cracks are necessary tasks, as the propagation of cracks results in severe damage. Manual inspection is based on the expert's previous knowledge, and it can only be done in areas that humans can reach. On the other hand, autonomous detection of cracks using image-based techniques may reduce human errors and be less time-consuming and more economical than human-based inspection for real-time crack detection, since movable cameras can capture images of non-reachable areas. Several techniques are available for crack detection; however, only image-based crack detection techniques are analyzed in this survey. A detailed study is carried out to define the research problems and advancements in this area. This article analyses the pure image processing techniques and learning-based techniques based on the objectives, the methods, level of efficiency, level of errors, and type of crack image dataset. Besides the applications, limitations and other factors are explained for each technique. Moreover, the presented analysis shows the multiple problems related to cracks that could help researchers perform further research.

Keywords: Computer vision; Image processing; Structure health monitoring; Crack detection; Machine learning; Deep convolutional neural network

One Shot Face Image Style Transfer Method Based on GAN

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: He, Wenyin. Affiliations: Wuhan University Department of Computer Science Hubei Wuhan 430072 China

ISBN: (print) 9798350331417

Style transformation on face images has traditionally been a popular research area in the field of computer vision, and its applications are quite extensive. Currently, the more mainstream schemes include Generative Adversarial Network (GAN)-based image generation and style transformation, as well as Stable Diffusion methods. In 2019, the NVIDIA team proposed StyleGAN, which is a relatively mature scheme for generating real faces as well as face feature blending. The whole StyleGAN model is trained on the Flickr-Faces-HQ (FFHQ) dataset. This is a large dataset, so the model takes a long time to train. My aim is to form a one-shot stylized face image generator, which means that only one reference face and one stylized face need to be input, and a brand-new face with a mixture of features can be generated in a short training time. This is inspired by the existing research result JoJoGAN, which learns a style mapper from a single example of the style. JoJoGAN uses a GAN inversion procedure and StyleGAN's style-mixing property to produce a substantial paired dataset from a single example of the style. This paper will make improvements to JoJoGAN, including improving the encoder that utilizes the GAN inversion method to generate latent codes for image features, and the random mixing of latent codes to produce a more refined paired dataset. © 2023 IEEE.

Keywords: deep learning; GAN; style transformation

Realistic and Visually-Pleasing 3D Generation of Indoor Scenes from a Single Image

7th Chinese Conference on Pattern Recognition and Computer Vision

Authors: Li, Jie; Wang, Lei; Chen, Gongbin; Li, Ang; Qiu, Yuhao; Wu, Jiaji; Cheng, Jun. Affiliations: Chinese Acad Sci Shenzhen Inst Adv Technol Shenzhen 518055 Peoples R China; Univ Chinese Acad Sci CAS Beijing Peoples R China; Shenzhen MSU BIT Univ Shenzhen Peoples R China; Xidian Univ Sch Elect Engn Xian 710071 Peoples R China

ISBN: (print) 9789819785070; 9789819785087

Artificial Intelligence Generated Content (AIGC) has experienced significant advancements, particularly in the areas of natural language processing and 2D image generation. However, the generation of three-dimensional (3D) content from a single image still poses challenges, particularly when the input image contains complex backgrounds. This limitation hinders the potential applications of AIGC in areas such as human-machine interaction, virtual reality (VR), and architectural design. Despite the progress made so far, existing methods face difficulties when dealing with single images that have intricate backgrounds. Their reconstructed 3D shapes tend to be incomplete, noisy, or missing parts of the geometric structure. In this paper, we introduce a 3D generation framework for indoor scenes from a single image to generate realistic and visually-pleasing 3D geometry shapes, without requiring point clouds, multi-view images, depth, or masks as input. The main idea of our method is clustering-based 3D shape learning and prediction, followed by a shape deformation. Since indoor scenes tend to contain more than one object, our framework simultaneously generates multiple objects and predicts the layout with a camera pose, as well as 3D object bounding boxes, for holistic 3D scene understanding. We have evaluated the proposed framework on benchmark datasets including ShapeNet, SUN RGB-D and Pix3D, and state-of-the-art performance has been achieved. We have also given examples to illustrate immediate applications in virtual reality.

Keywords: 3D mesh; Reconstruction; Deep neural networks

Evolving Convolutional Neural Networks with Meta-Heuristics for Transfer Learning in Computer Vision

3rd International Conference on Evolutionary Computing and Mobile Sustainable Networks, ICECMSN 2023

Authors: Srilakshmi, V.; Kiran, G. Uday; Mounika, M.; Sravanthi, A.; Sravya, N.V.K.; Akhil, V.N.S.; Manasa, M. Affiliations: B V Raju Institute of Technology Telangana Narsapur India

In the rapidly evolving landscape of computer vision and artificial intelligence, transfer learning has emerged as a powerful tool for efficiently applying pre-trained models to new tasks. This article delves into the intriguing concept of evolving Convolutional Neural Networks (CNNs) with meta-heuristics for transfer learning in computer vision. The primary focus is on enhancing the adaptability and efficiency of CNNs, making them better suited for specialized tasks. The article covers the significance of transfer learning, the challenges faced in transfer learning with CNNs, the basics of CNN architecture, and the role of meta-heuristics in optimizing CNNs. Real-world applications and success stories demonstrate the transformative potential of these techniques in fields like medical image analysis and autonomous vehicles. It explores emerging trends and potential developments in the domain, emphasizing the impact on various sectors, including healthcare, natural language processing, and robotics. The promise of evolving CNNs with meta-heuristics lies in their capacity to tackle intricate problems with greater precision, ultimately reshaping the landscape of artificial intelligence and machine learning. Ongoing research ensures a promising future for this amalgamation of technologies, promising breakthroughs that will have a lasting impact on the world of computer vision and beyond. © 2023 Elsevier B.V. All rights reserved.

Keywords: Computer vision
