Search Results - Inner Mongolia University Library

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: He, Wenyin

Affiliations: Wuhan University, Department of Computer Science, Hubei, Wuhan 430072, China

ISBN: (print) 9798350331417

Style transformation on face images has traditionally been a popular research area in the field of computer vision, and its applications are quite extensive. Currently, the more mainstream schemes include Generative Adversarial Network (GAN)-based image generation and style transformation, as well as Stable Diffusion methods. In 2019, the NVIDIA team proposed StyleGAN, which is a relatively mature scheme for generating real faces as well as blending face features. The whole StyleGAN model is trained on the Flickr-Faces-HQ (FFHQ) dataset. This is a large dataset, so the model takes a long time to train. My aim is to build a one-shot stylized face image generator, which means that only one reference face and one stylized face need to be input, and a brand-new face with a mixture of features can be generated in a short training time. This is inspired by the existing research result JoJoGAN, which learns a style mapper from a single example of the style. JoJoGAN uses a GAN inversion procedure and StyleGAN's style-mixing property to produce a substantial paired dataset from a single example of the style. This paper makes improvements to JoJoGAN, including improving the encoder that uses the GAN inversion method to generate latent codes for image features, and improving the random mixing of latent codes to produce a more refined paired dataset. © 2023 IEEE.

Keywords: deep learning GAN style transformation


Realistic and Visually-Pleasing 3D Generation of Indoor Scenes from a Single Image

7th Chinese Conference on Pattern Recognition and Computer Vision

Authors: Li, Jie; Wang, Lei; Chen, Gongbin; Li, Ang; Qiu, Yuhao; Wu, Jiaji; Cheng, Jun

Affiliations: Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China; Univ Chinese Acad Sci, CAS, Beijing, Peoples R China; Shenzhen MSU BIT Univ, Shenzhen, Peoples R China; Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China

ISBN: (print) 9789819785070; 9789819785087

Artificial Intelligence Generated Content (AIGC) has experienced significant advancements, particularly in the areas of natural language processing and 2D image generation. However, the generation of three-dimensional (3D) content from a single image still poses challenges, particularly when the input image contains complex backgrounds. This limitation hinders the potential applications of AIGC in areas such as human-machine interaction, virtual reality (VR), and architectural design. Despite the progress made so far, existing methods face difficulties when dealing with single images that have intricate backgrounds. Their reconstructed 3D shapes tend to be incomplete, noisy, or missing parts of the geometric structure. In this paper, we introduce a 3D generation framework for indoor scenes from a single image to generate realistic and visually-pleasing 3D geometry shapes, without requiring point clouds, multi-view images, depth or masks as input. The main idea of our method is clustering-based 3D shape learning and prediction, followed by a shape deformation. Since indoor scenes tend to contain more than one object, our framework simultaneously generates multiple objects and predicts the layout with a camera pose, as well as 3D object bounding boxes for holistic 3D scene understanding. We have evaluated the proposed framework on benchmark datasets including ShapeNet, SUN RGB-D and Pix3D, and state-of-the-art performance has been achieved. We have also given examples to illustrate immediate applications in virtual reality.

Keywords: 3D mesh Reconstruction Deep neural networks


Low Power Processors and Image Sensors for Vision-Based IoT Devices: A Review

IEEE SENSORS JOURNAL, 2021, vol. 21, no. 2, pp. 1172-1186

Authors: Maheepala, Malith; Joordens, Matthew A.; Kouzani, Abbas Z.

Affiliations: Deakin Univ, Sch Engn, Geelong, Vic 3216, Australia

With the advancement of Internet of Things (IoT) technology, applications of battery-powered, machine-vision-based IoT devices are growing rapidly. While numerous research works are being conducted to develop low power hardware solutions for IoT devices, image capture and image processing remain power-hungry processes that lead to a short battery life. However, the power consumption of machine-vision-based IoT devices can be minimized by careful optimization of the hardware components that are used in these devices. In this article, we present a review of low power machine vision hardware components for IoT applications. A guide to selecting the optimum processors and image sensors for a given battery-powered, machine-vision-based IoT device is presented. Next, the factors that must be considered when selecting processors and image sensors for a given IoT application are discussed, and selection criteria for the processors and image sensors are established. Then, the current commercially available hardware components are reviewed in accordance with the established selection criteria. Finally, the research trends in the field of battery-powered, machine-vision-based IoT devices are discussed, and potential future research directions in the field are presented.

Keywords: Program processors machine vision Cloud computing image sensors Batteries Transceivers Internet of Things Internet of Things machine vision low power processor image sensor


Exploiting Confidence-Based Model Fusion for Boosting Image Classification Accuracy

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Jiang, Xinjian

Affiliations: Nanjing University, Department of Computer Science and Technology, Jiangsu, Nanjing 210008, China

ISBN: (print) 9798350331417

In the realm of deep learning, the traditional approach has been to train specialized models for individual tasks, which, although effective, is resource-intensive. The advent of large, universal models has mitigated this issue by offering multitask capabilities, reduced training time, and lower computational costs. However, these generalized models often underperform on specific tasks compared to specialized models. This paper introduces an innovative ensemble approach that integrates specialized and generalized models, specifically focusing on Contrastive Language-Image Pretraining (CLIP) and EfficientNet. This work proposes three fusion strategies (Weighted Voting, Confidence Comparison, and Fully Connected Network Fusion) and evaluates them using the CIFAR-100 dataset. The ensemble model significantly outperforms individual models, achieving an adjusted accuracy of up to 0.848. The paper also introduces a novel evaluation metric, Confidence-Accuracy Correlation, to assess the reliability of model confidence. The findings could revolutionize ensemble learning by making it more adaptive and suited for real-world applications, thereby pushing the boundaries of possibility in artificial intelligence. © 2023 IEEE.

Keywords: CLIP deep learning image classification
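
Note: the abstract above names Weighted Voting and Confidence Comparison as fusion strategies but does not say how they are computed. The following is only a minimal sketch of one common reading of those two ideas, not the paper's implementation. The function names, the `weight_a` parameter, and the example probability vectors are all hypothetical.

```python
from typing import Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def weighted_voting(p_a: Sequence[float], p_b: Sequence[float],
                    weight_a: float = 0.5) -> int:
    """Blend two class-probability vectors, then pick the top class.

    Assumption: both vectors cover the same classes in the same order.
    """
    blended = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]
    return argmax(blended)


def confidence_comparison(p_a: Sequence[float], p_b: Sequence[float]) -> int:
    """Use the prediction of whichever model is more confident in its top class."""
    chosen = p_a if max(p_a) >= max(p_b) else p_b
    return argmax(chosen)


# Hypothetical outputs of a generalized and a specialized model on one image.
p_general = [0.5, 0.3, 0.2]
p_special = [0.1, 0.8, 0.1]

vote_equal = weighted_voting(p_general, p_special)        # equal weights
vote_general = weighted_voting(p_general, p_special, 0.9)  # favour model A
by_confidence = confidence_comparison(p_general, p_special)
```

With equal weights the specialized model's strong vote wins. Raising `weight_a` toward 1 lets the generalized model dominate, which shows why the weighting would need tuning on held-out data.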


Patch-Attention GAN: Image Translation using BiFormer and Attention Framework

5th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2024

Authors: Xiao, Dingwen; Wu, Huiqing

Affiliations: The Hong Kong University of Science and Technology, School of Science, Hong Kong; Faculty of Science and Technology, BNU-HKBU United International College, Zhuhai, China

ISBN: (print) 9798350373820

Image-to-image translation has long been recognized as a crucial undertaking within the field of computer vision, owing to its broad applications in domain adaptation. Existing methods differ in Generative Adversarial Network framework and model structure. The Attention-Guided training framework absorbs the input image's semantic information based on the translation task. However, it ignores the local context for content extraction and lacks mask comparison in the loss function. The attention mechanism is widely regarded as an effective approach for addressing contextual challenges in machine learning models. By augmenting the model's capacity to capture long-range dependencies and concentrate on the most salient aspects of the input sequence, the attention mechanism significantly enhances its contextual understanding. Bi-level routing attention stands out among other mechanisms for its computation speed and memory cost. In this paper, we propose an image translation algorithm called Patch-Attention Generative Adversarial Network that combines foreground-background separation processing, bi-level routing attention, and content loss. Our proposed method has been extensively evaluated on diverse datasets, yielding compelling experimental results. These findings demonstrate the substantial performance boost achieved by our approach in the domain of image translation. The generated images are clearer and more understandable than those of other existing approaches. © 2024 IEEE.

Keywords: Generative adversarial networks


Display and View—NSDN—Image Caption Generator and Headcount

8th International Conference on Information System Design and Intelligent Applications, ISDIA 2024

Authors: Nahar, Shreya; Jain, Tanishka R.; Shah, Mihir; Dilip, Golda

Affiliations: Department of Computer Science and Engineering, SRMIST, Vadapalani, Chennai, India

ISBN: (print) 9789819748914

The automatic description of the content of an image is a challenge that combines computer vision with natural language processing in the field of artificial intelligence. Most existing models have limitations in accurately identifying subtle elements of an image and counting the number of people present. To address this challenge, we introduce the natural-language scene description network (NSDN), which incorporates recent developments in computer vision and machine translation to generate descriptive phrases that effectively depict images. This technology has the potential to improve the efficiency, capacity, reliability, and safety of crowd management tasks, particularly in diverse and adaptive crowd situations. Despite challenges such as clutter, occlusion, non-uniform object scale, and irregular object distribution, YOLO shows promise for intelligent crowd counting and analysis in images. This article reviews, categorizes, analyzes distinguishing features, and extensively assesses the effectiveness of crowd-counting methods that rely on convolutional neural networks and provides a detailed analysis of their performance. © The Author(s).

Keywords: Convolutional neural networks


Transmission Line Inspection Image Intelligent Diagnosis System

2nd IEEE International Conference on Image Processing and Computer Applications, ICIPCA 2024

Authors: Zheng, Bangzhu; Liu, Haitao; Ma, Haiteng; Liu, Yin; Shen, Xingyi; Liang, Hua; Wen, Jie

Affiliations: Guangdong Power Grid Co. Ltd., Maoming Power Supply Bureau, Guangdong Province, Maoming City 525000, China

ISBN: (digital) 9798350360240

ISBN: (print) 9798350360240

This paper introduces an image diagnosis technology for power inspection based on computer vision. Image import, database access, text output and other functional modules are designed using Visual Studio 2010. ADO technology is used to access and change the database. A framework and method for fusing diagnosis results based on multidimensional fusion are proposed. The fault diagnosis criterion of the system is established by using a genetic algorithm. While ensuring the optimal solution, the key information is retained and the minimum expression of knowledge is obtained, so as to achieve rapid and accurate fault diagnosis. The experimental results show that the method is effective, fast and suitable for large-scale power grids, especially for transmission lines. It has a low error rate and a friendly man-machine interface, and it can meet the needs of power line inspection. Automatic inspection of transmission line images is realized, which standardizes the working process and greatly improves working efficiency. © 2024 IEEE.

Keywords: Power distribution lines


Performance Evaluation of Computer Vision Algorithms in a Programmable Logic Controller: An Industrial Case Study

SENSORS, 2024, vol. 24, no. 3, p. 843

Authors: Vieira, Rodrigo; Silva, Dino; Ribeiro, Eliseu; Perdigoto, Luis; Coelho, Paulo Jorge

Affiliations: Polytech Univ Leiria, Sch Technol & Management, P-2411901 Leiria, Portugal; Inst Syst Engn & Comp Coimbra, INESC Coimbra, P-3030290 Coimbra, Portugal; Univ Coimbra, Inst Syst & Robot, P-3030290 Coimbra, Portugal

This work evaluates the use of a programmable logic controller (PLC) from Phoenix Contact's PLCnext ecosystem as an image processing platform. PLCnext controllers provide the functions of "classical" industrial controllers, but they are based on the Linux operating system, also allowing for the use of software tools usually associated with computers. Visual processing applications in the Python programming language using the OpenCV library are implemented in the PLC using this feature. This research is focused on evaluating the use of this PLC as an image processing platform, particularly for industrial machine vision applications. The methodology is based on comparing the PLC's performance against a computer using standard image processing algorithms. In addition, a demonstration application based on a real-world scenario for quality control by visual inspection is presented. It is concluded that despite significant limitations in processing power, the simultaneous use of the PLC as an industrial controller and image processing platform is feasible for applications of low complexity and undemanding cycle times, providing valuable insights and benchmarks for the scientific community interested in the convergence of industrial automation and computer vision technologies.

Keywords: programmable logic controllers computer vision OpenCV performance benchmark


5th International Conference on Data Science, Machine Learning and Applications, ICDSMLA 2023

5th International Conference on Data Science, Machine Learning and Applications, ICDSMLA 2023

ISBN: (print) 9789819780426

The proceedings contain 128 papers. The special focus in this conference is on Data Science, Machine Learning and Applications. The topics include: Digitization of Monuments – An Impact on the Tourist Experience with Special Reference to Hampi; Resume Parser Using Machine Learning; IOT Based Smart Hydroponics System; Comparative Study of Machine Learning and Deep Learning Techniques for Cancer Disease Detection; High Thruput Modulation Approaches Used in Next Generation WiF’s Under Multi-impairments Environments with MATLAB Codes; Skin Disease Detection; Root Vegetable Crop Recommendation System Based on Soil Properties and Environmental Factors; Deep Learning Model Development for an Automatic Healthcare Edge Computing Application; Empathetic Conversations in Mental Health: Fine-Tuning LLMs for Supportive AI Interactions; Exploring Block Chain Technology with Applications, and Future Prospects; A Comprehensive Review of Soft Computing Enabled Techniques for IoT Security: State-of-the-Art and Challenges Ahead; Performance Analysis of Machine Learning Algorithms on Imbalanced Datasets Using SMOTE Technique; An AI Based Nutrient Tracking and Analysis System; Power Saving Mechanism for Street Lights System Using IoT; Automatic Login System Using ATTINY85 IC; Forecasting Stock Prices: A Comparative Analysis of Machine Learning, Deep Learning, and Statistical Approaches; Smart Vision Bot; Robots in Logistics: Apprehension of Current Status and Future Trends in Indian Warehouses; Smart Healthcare: Enhancing Patient Well-Being with IoT; Detection of B-ALL Using CNN Model and Deep Learning; A Comprehensive Analysis for Advancements and Challenges in Deep Learning Models for Image Processing; A Comprehensive Survey on Enhancing Patient Care Through Deep Learning and IoT-Enabled Healthcare Innovations; Attention-Based Image Caption Generation.


Infrared Computer Vision for Utility-Scale Photovoltaic Array Inspection

15th International Conference on Information, Intelligence, Systems and Applications, IISA 2024

Authors: Ramirez, David F.; Pujara, Deep; Tepedelenlioglu, Cihan; Srinivasan, Devarajan; Spanias, Andreas

Affiliations: SenSIP Center, School of ECEE, Arizona State University, Tempe, AZ 85281, United States; Poundra LLC, Tempe, AZ 85281, United States

ISBN: (print) 9798350368833

Utility-scale solar arrays require specialized inspection methods for detecting faulty panels. Photovoltaic (PV) panel faults caused by weather, ground leakage, circuit issues, temperature, environment, age, and other damage can take many forms but often symptomatically exhibit temperature differences. Included is a mini survey to review these common faults and PV array fault detection approaches. Among these, infrared thermography cameras are a powerful tool for improving solar panel inspection in the field. These can be combined with other technologies, including image processing and machine learning. This position paper examines several computer vision algorithms that automate thermal anomaly detection in infrared imagery. We demonstrate our infrared thermography data collection approach, the PV thermal imagery benchmark dataset, and the measured performance of image processing transformations, including the Hough Transform for PV segmentation. The results of this implementation are presented with a discussion of future work. © 2024 IEEE.

Keywords: Hough transforms
