Simulator sickness induced by 360° stereoscopic video content is a long-standing challenge in Virtual Reality (VR) systems. Current machine learning models for simulator sickness prediction ignore the underlying interdependencies and correlations across the multiple visual features that may lead to simulator sickness. We propose a model for sickness prediction that automatically learns and adaptively integrates multi-level mappings from stereoscopic video features to simulator sickness scores. First, saliency, optical flow and disparity features are extracted from the videos to reflect the factors causing simulator sickness: human attention area, motion velocity and depth information. These features are then embedded and fed into a 3-dimensional convolutional neural network (3D CNN) to extract the underlying multi-level knowledge, which includes low-level and higher-order visual concepts as well as a global image descriptor. Finally, an attentional mechanism adaptively fuses the multi-level information with attentional weights for sickness score estimation. The proposed model is trained end-to-end and validated on a public dataset. Comparisons with state-of-the-art models and ablation studies demonstrate improved performance in terms of Root Mean Square Error (RMSE) and Pearson Linear Correlation Coefficient.
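As a rough illustration of this kind of pipeline, the following is a minimal PyTorch sketch: saliency, optical-flow and disparity maps are stacked as input channels to a 3D CNN, and two feature levels are attention-weighted into a regressed score. The layer sizes, the two-level split and all module names are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SicknessPredictor(nn.Module):
    def __init__(self, in_channels=4):  # e.g. saliency + 2-ch optical flow + disparity
        super().__init__()
        # 3D convolutions extract low-level and higher-order spatiotemporal features.
        self.block1 = nn.Sequential(nn.Conv3d(in_channels, 16, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool3d(2))
        self.block2 = nn.Sequential(nn.Conv3d(16, 32, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool3d(2))
        self.pool = nn.AdaptiveAvgPool3d(1)   # global descriptor per level
        # Attention produces one weight per feature level for adaptive fusion.
        self.attn = nn.Linear(16 + 32, 2)
        self.head = nn.Linear(16 + 32, 1)     # regress the sickness score

    def forward(self, x):                     # x: (B, C, T, H, W) feature volume
        f1 = self.block1(x)
        f2 = self.block2(f1)
        g1 = self.pool(f1).flatten(1)         # low-level summary
        g2 = self.pool(f2).flatten(1)         # higher-order summary
        w = torch.softmax(self.attn(torch.cat([g1, g2], 1)), dim=1)
        fused = torch.cat([w[:, :1] * g1, w[:, 1:] * g2], 1)
        return self.head(fused).squeeze(1)    # predicted simulator sickness score
```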
Image restoration, a critical task in computer vision and image processing, focuses on recovering degraded or damaged images to their original, high-quality state. This paper introduces an innovative approach to image...
ISBN:
(Print) 9781665464680
Computer vision is one of the important areas and directions of deep learning research; due to the complexity and diversity of vision tasks, different approaches must be chosen for different fields. In the field of aviation, existing image resources still fall far short of real needs because of the constraints of realistic scenes and the difficulty of image acquisition. More detailed and comprehensive images can better provide reliable technical support and a basis for applications, and in turn enable more accurate decisions, which requires generating more effective images to expand the data. Generative Adversarial Networks (GANs) are the fastest-growing and most effective generation method of recent years, so this experiment investigates the application of GANs to aviation data, taking images of airplanes, cars and ships as examples for a quantitative study. The performance of the GAN is studied from the perspectives of image size, number of images, number of iterations, and image category, in order to obtain better parameter settings for generating effective images. This provides a theoretical and experimental basis for subsequently applying GANs in the aviation field to generate more images with similar characteristics and alleviate the problem of insufficient data.
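For reference, here is a minimal DCGAN-style generator/discriminator pair in PyTorch of the general kind whose knobs (image size, dataset size, iteration count) such an experiment varies; the 64x64 resolution, channel counts and optimizer settings are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

LATENT = 100  # latent dimension is an assumption

G = nn.Sequential(  # latent vector (B, 100, 1, 1) -> 64x64 RGB image
    nn.ConvTranspose2d(LATENT, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())

D = nn.Sequential(  # 64x64 RGB image -> real/fake logit
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, 8), nn.Flatten())  # 8x8 kernel collapses to one logit

loss = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

def train_step(real):                    # real: (B, 3, 64, 64) aviation images
    z = torch.randn(real.size(0), LATENT, 1, 1)
    fake = G(z)
    # Discriminator: real images labelled 1, generated images labelled 0.
    d_loss = (loss(D(real), torch.ones(real.size(0), 1)) +
              loss(D(fake.detach()), torch.zeros(real.size(0), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: try to make the discriminator output "real" for fakes.
    g_loss = loss(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```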
As a very important branch of computer science and engineering, graphics and image processing is a research topic of capturing, storing, and manipulating information from reflected electromagnetic waves from objects ...
This study addresses the pressing need for computer systems to interpret digital media images with a level of sophistication comparable to human visual perception. By leveraging Convolutional Neural Networks (CNNs), w...
1. Animal phenotypic traits are utilised in a variety of studies. Often the traits are measured from images. The processing of a large number of images can be challenging; nevertheless, image analytical applications based on neural networks can be an effective tool for automatic trait collection.
2. Our aim was to develop a stand-alone application to effectively segment an arthropod from an image and to recognise individual body parts: namely, head, thorax (or prosoma), abdomen and four pairs of appendages. It is based on a convolutional neural network with U-Net architecture, trained on more than a thousand images showing dorsal views of arthropods (mainly of wingless insects and spiders). The segmentation model gave very good results, with the automatically generated segmentation masks usually requiring only slight manual adjustments.
3. The application, named MAPHIS, can further (1) organise and preprocess the images; (2) adjust segmentation masks using a simple graphical editor; and (3) calculate various size, shape, colouration and pattern measures for each body part, organised in a hierarchical manner. In addition, a special plug-in function can align body profiles of selected individuals to match a median profile and enable comparison among groups. The usability of the application is shown in three practical examples.
4. The application can be used in a variety of fields where measures of phenotypic diversity are required, such as taxonomy, ecology and evolution (e.g. mimetic similarity). Currently, the application is limited to arthropods, but it can be easily extended to other animal taxa.
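As a sketch of the underlying technique, here is a minimal two-level U-Net-style segmentation network in PyTorch. It is not the MAPHIS model itself; the channel counts, depth and the eight-class output (head, thorax, abdomen, four appendage pairs, background) are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, n_classes=8):  # assumed: 7 body-part classes + background
        super().__init__()
        self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
        self.down = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(64, 32)            # skip connection doubles channels
        self.out = nn.Conv2d(32, n_classes, 1)   # per-pixel body-part logits

    def forward(self, x):
        e1 = self.enc1(x)                        # full-resolution features
        e2 = self.enc2(self.down(e1))            # coarser, more abstract features
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d)                       # (B, n_classes, H, W) mask logits
```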
ISBN:
(Digital) 9789819738106
ISBN:
(Print) 9789819738090
Lane detection is a challenging problem that has drawn the attention of the computer vision community for many years. Computer vision and machine learning algorithms struggle with this multi-feature identification task. Although several machine learning approaches can be used for lane identification, they are typically employed for classification rather than feature development. Contemporary machine learning techniques, by contrast, can discover features with high recognition value and have shown success in feature identification tests, yet these strategies have not been applied properly, which compromises their efficiency and accuracy for lane recognition. In this study, we provide a fresh approach to the problem: a new preprocessing and Region of Interest (ROI) selection method. The main objective is to extract white features using the HSV color transformation, add preliminary edge feature detection during preprocessing, and then select the ROI based on the proposed preprocessing. With this preprocessing strategy, the lane can be found. We envision an integrated autonomous vehicle controlled by a Robot Operating System that is capable of making intelligent driving choices. The digital image-processing algorithm responsible for the vehicle's best performance was based on novel filtering and noise-reduction techniques applied to the visual feedback by the processing unit. Within the control system, we used two separate control units, a master and a slave: the master control unit is in charge of visual processing and filtering, while the slave control unit is in charge of the vehicle's propulsion.
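A hedged OpenCV sketch of the described preprocessing chain: HSV conversion to isolate white features, preliminary edge detection, then a trapezoidal ROI mask. The thresholds and the ROI geometry are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def preprocess(frame):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # White pixels: low saturation, high value (bounds are illustrative).
    white = cv2.inRange(hsv, (0, 0, 200), (180, 40, 255))
    edges = cv2.Canny(white, 50, 150)             # preliminary edge features
    h, w = edges.shape
    roi = np.zeros_like(edges)
    poly = np.array([[(0, h), (int(0.45 * w), int(0.6 * h)),
                      (int(0.55 * w), int(0.6 * h)), (w, h)]], dtype=np.int32)
    cv2.fillPoly(roi, poly, 255)                   # keep only the road region
    return cv2.bitwise_and(edges, roi)             # masked edge map for lane fitting
```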
ISBN:
(Print) 9783031723582; 9783031723599
The development of deep learning (DL) models has dramatically improved marker-free human pose estimation, including the important task of hand tracking. However, for applications in real-time-critical and embedded systems, e.g. in robotics or augmented reality, hand tracking based on standard frame-based cameras is too slow and/or power hungry. Latency is already limited by the frame rate of the image sensor, and any subsequent DL processing widens the latency gap further while requiring substantial power. Dynamic vision sensors, on the other hand, offer sub-millisecond time resolution and output sparse signals that can be processed with an efficient Sigma Delta Neural Network (SDNN) model, which preserves the sparsity advantage within the neural network. This paper presents the training and evaluation of a small SDNN for hand detection, based on event data from the DHP19 dataset and deployed on Intel's Loihi 2 neuromorphic development board. We found it possible to deploy the hand detection model on a neuromorphic hardware backend without a notable performance difference from the original GPU implementation, at an estimated mean dynamic power consumption of approximately 7 mW for the network running on the chip.
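To illustrate why an SDNN preserves sparsity, here is a minimal NumPy sketch of sigma-delta message passing between two units: only activation changes above a threshold are transmitted, and the receiver integrates them back. The threshold is an assumption, and this is plain NumPy, not the Lava/Loihi 2 API.

```python
import numpy as np

def sigma_delta_encode(activations, threshold=0.1):
    """Emit a sparse message only when the activation moved by >= threshold."""
    messages, reference = [], 0.0
    for t, a in enumerate(activations):
        delta = a - reference
        if abs(delta) >= threshold:
            messages.append((t, delta))   # sparse event: (time step, change)
            reference += delta            # sender and receiver stay in sync
    return messages

def sigma_delta_decode(messages, length):
    """Receiver integrates (sigma) the received deltas back into a signal."""
    out, value, i = np.zeros(length), 0.0, 0
    for t in range(length):
        while i < len(messages) and messages[i][0] == t:
            value += messages[i][1]
            i += 1
        out[t] = value
    return out

x = np.sin(np.linspace(0, 2 * np.pi, 50))       # slowly varying activation
msgs = sigma_delta_encode(x)
print(f"{len(msgs)} events for 50 time steps")  # far fewer messages than steps
```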
ISBN:
(Print) 9798350354249; 9798350354232
In recent years, significant progress has been achieved in medical image analysis, mainly due to substantial advances in deep learning methods. In the past decade, the Convolutional Neural Network (CNN) was the best model for image classification, demonstrating remarkable success in various medical applications. However, the advent of Vision Transformers (ViTs) has challenged the dominance of CNN approaches. This study aims to explore the potential of ViTs in healthcare by comparing their performance with that of CNN models. The latter have traditionally excelled at image feature extraction through convolutional operations; ViTs, on the other hand, relying on self-attention mechanisms, exhibit a unique ability to capture long-range dependencies, enabling them to model complex patterns within images effectively. In this study, after analysing the two architectures, we assessed the behaviour of from-scratch and pre-trained models, highlighting their differences in performance and shedding light on the applicability of the Transfer Learning (TL) approach in the healthcare scenario.
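A hedged sketch of how such a comparison can be set up with torchvision, building from-scratch and ImageNet-pre-trained variants of a CNN and a ViT with task-specific heads. ResNet-50, ViT-B/16 and the binary class count stand in for the paper's unspecified choices.

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # e.g. a binary medical classification task (assumption)

def build(name, pretrained=True):
    if name == "cnn":
        m = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2
                            if pretrained else None)
        m.fc = nn.Linear(m.fc.in_features, NUM_CLASSES)      # replace classifier head
    else:
        m = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1
                            if pretrained else None)
        m.heads.head = nn.Linear(m.heads.head.in_features, NUM_CLASSES)
    return m

# Four conditions: {CNN, ViT} x {from scratch, transfer learning}.
variants = {(n, p): build(n, p) for n in ("cnn", "vit") for p in (False, True)}
```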
ISBN:
(Print) 9781713899921
Humans outperform object recognizers despite the fact that models perform well on current datasets, including those explicitly designed to challenge machines with debiased images or distribution shift. This problem persists, in part, because we have no guidance on the absolute difficulty of an image or dataset, making it hard to objectively assess progress toward human-level performance, to cover the range of human abilities, and to increase the challenge posed by a dataset. We develop a dataset difficulty metric, MVT (Minimum Viewing Time), that addresses these three problems. Subjects view an image that flashes on screen and then classify the object in the image; images that require only brief flashes to recognize are easy, while those that require seconds of viewing are hard. We compute the ImageNet and ObjectNet image difficulty distributions, which we find significantly undersample hard images: nearly 90% of current benchmark performance is derived from images that are easy for humans. Rather than hoping that harder datasets will emerge, we can for the first time objectively guide dataset difficulty during development. We can also break down recognition performance as a function of difficulty: model performance drops precipitously while human performance remains stable. Difficulty provides a new lens through which to view model performance, one which uncovers new scaling laws: vision-language models stand out as the most robust and human-like, while all other techniques scale poorly. We release tools to automatically compute MVT, along with image sets tagged by difficulty. Objective image difficulty has practical applications (one can measure how hard a test set is before deploying a real-world system) and scientific applications, such as discovering the neural correlates of image difficulty and enabling new object recognition techniques that eliminate the benchmark-vs-real-world performance gap.
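A minimal sketch of how an MVT-style score could be computed for one image from flash-duration trials: the shortest presentation at which subjects classify the object above an accuracy threshold. The durations, the 50% threshold and the data layout are assumptions, since the released tools' exact interface is not shown here.

```python
def minimum_viewing_time(trials, threshold=0.5):
    """trials: {duration_ms: [True/False correct responses]} for one image."""
    for duration in sorted(trials):
        responses = trials[duration]
        if sum(responses) / len(responses) >= threshold:
            return duration          # easy images clear the bar at brief flashes
    return float("inf")              # never recognized reliably -> hardest bucket

# Example: recognized reliably only once the flash lasts 170 ms.
trials = {17: [False, False, True], 50: [False, True, False],
          170: [True, True, True], 1000: [True, True, True]}
print(minimum_viewing_time(trials))  # -> 170
```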