Search Results - Inner Mongolia University Library

Fusing Depths: Investigating the Synergy of Convolutional Neural Networks and Long Short-Term Memory Networks for Enhanced Image Caption Generation

1st IEEE International Conference on Cognitive Robotics and Intelligent Systems, ICC-ROBINS 2024

Authors: Sharma, Vikas Alekh Chaudhary, Kirti Vashishth, Tarun Kumar Chaudhary, Sachin Kumar, Bhupendra IIMT University School of Computer Science and Applications U.P Meerut India Delhi NCR Ghaziabad India GL Bajaj Institute of Management Greater Noida India

ISBN: (print) 9798350372748

[...] Generation, poses a challenge due to the intricate visual content and nuanced semantic details. This research introduces a novel approach for image captioning by seamlessly integrating Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. The proposed methodology involves leveraging CNNs to extract visual features and employing LSTM for the generation of descriptive captions. To further enhance the model's performance, attention mechanisms are incorporated, allowing the model to focus on relevant visual features during the caption generation process. The evaluation of our modified model utilizes standard benchmark datasets, such as Flickr8k, and employs metrics including CIDEr, METEOR, and BLEU scores. Through rigorous assessment, our model showcases improved performance, demonstrating its efficacy in comparison to existing methods. The versatility of the proposed system extends its potential applications to image retrieval, image description, and other multimedia scenarios requiring robust image analysis and natural language processing capabilities. This research contributes to the advancement of image captioning techniques, offering a promising solution for real-world applications in multimedia and artificial intelligence domains. The goal of this study is to explore and leverage the combined capabilities of CNNs and LSTMs to enhance the process of generating descriptive captions for images. By merging the strengths of CNNs in image feature extraction with the sequential understanding and context modeling abilities of LSTMs, the aim is to develop a more sophisticated and effective approach for generating accurate and contextually relevant captions that better capture the nuances and details of the images. This research seeks to push the boundaries of image captioning technology, ultimately improving the quality and richness of generated captions, and advancing the state-of-the-art in artificial intelligence and computer vision applications.

Keywords: Long short-term memory

Enhanced Feature Extraction for Image Dehazing: A Comparative Study between Deep Learning Architectures and FFA-NET

2nd International Conference on Inventive Computing and Informatics (ICICI)

Authors: Chaudhary, Sarthak Gupta, Samridh Iniyan, S. SRM Inst Sci & Technol Dept Comp Sci & Engn Chennai 603203 Tamil Nadu India SRM Inst Sci & Technol Dept Comp Technol Sch Comp Chennai 603203 Tamil Nadu India

ISBN: (print) 9798350373301; 9798350373295

The problem of poor visibility in foggy images has spurred various image de-hazing strategies. As the need for high-quality images grows, especially for autonomous systems, this research aims to leverage different Deep Learning (DL) architectures to draw out key details from images, localizing this retrieved data to mitigate the impact of haze. The work explores using DL methods, particularly contrasting the regression and classification models of Convolutional Neural Networks (CNN), to remove haze from foggy images. This work sets the stage for further developments in image processing, particularly in conditions with poor visibility. It opens opportunities for improving image quality in various applications, such as autonomous driving and outdoor robotics, where clarity of vision is crucial. The final stage of the proposed model involves three specific pre-processing methods: contextual regularization, air light estimation and boundary constraint for optimal results. The next stage sets out to determine the best DL model for producing clear images from de-hazed ones.

Keywords: Machine Learning (ML); Convolutional Neural Networks (CNN); Feature Fusion Attention Network (FFA-NET); Deep Learning; Dark Channel Prior (DCP); Artificial Intelligence (AI)

A Review of Research Progress and Application of Wavelet Neural Networks

International Conference on New Technologies, Development and Application

Authors: Wang, Tonghao Guercio, Vincenzo Cattani, Piercarlo Villecco, Francesco China Agr Univ Coll Informat & Elect Engn 17 Tsinghua East Rd Beijing 100083 Peoples R China Deim Univ Tuscia Largo Univ Engn Sch I-01100 Viterbo Italy Univ Roma La Sapienza Dept Comp Control & Management Engn Via Ariosto 25 I-00185 Rome Italy Univ Salerno Dept Ind Engn Via Giovanni Paolo II 132 I-84084 Fisciano Italy

ISBN: (print) 9783031310652; 9783031310669

Artificial Neural Networks (ANNs) have been used extensively and are constantly being developed. The combination of wavelet transform theory and neural networks has become an important branch for exploring the optimization of neural network structure, giving rise to the Wavelet Neural Network (WNN), a special network structure. This paper reviews the development of WNNs, summarizes their system structure and algorithm implementation, and presents derivative models and cutting-edge applications with distinctive characteristics. The analysis shows that combining wavelet theory with neural network algorithms gives the network model fast convergence and high accuracy, and that the approach is developing rapidly in many fields such as audio signal and image processing. This paper is intended to provide a reference for potential WNN-based applications and for new network model design ideas.

Keywords: Wavelet Transform; Wavelet Neural Network

Compression of Deep Neural Networks based on quantized tensor decomposition to implement on reconfigurable hardware platforms

Neural Networks, 2022, Vol. 150, No. 0, pp. 350-363

Authors: Nekooei, Amirreza Safari, Saeed Univ Tehran Tehran Iran

Deep neural networks (DNNs) have been widely and successfully employed in various artificial intelligence and machine learning applications (e.g., image processing and natural language processing). As DNNs become deeper and enclose more filters per layer, they incur high computational costs and large memory consumption to store their large number of parameters. Moreover, present processing platforms (e.g., CPU, GPU, and FPGA) do not have enough internal memory, and hence external memory storage is needed. Deploying DNNs in mobile applications is therefore difficult, considering the limited storage space, computation power, energy supply, and real-time processing requirements. In this work, using a method based on tensor decomposition, network parameters were compressed, thereby reducing access to external memory. This compression method decomposes the network layers' weight tensor into a limited number of principal vectors such that (i) almost all the initial parameters can be retrieved, (ii) the network structure does not change, and (iii) the network quality after reproducing the parameters is almost the same as that of the original network in terms of detection accuracy. To optimize the realization of this method on FPGA, the tensor decomposition algorithm was modified while its convergence was not affected, and the reproduction of network parameters on FPGA was straightforward. The proposed algorithm reduced the parameters of ResNet50, VGG16, and VGG19 networks trained with CIFAR-10 and CIFAR-100 by almost 10 times. © 2022 Elsevier Ltd. All rights reserved.

Keywords: Deep Neural Network; Tensor decomposition; Principal vectors; Network weights; External memory

COMPREHENSIVE ANALYSIS OF CONVOLUTIONAL NEURAL NETWORKS APPLIED TO CIFAR-10 DATASET

2nd International Conference on Mechatronic Automation and Electrical Engineering, ICMAEE 2024

Authors: Su, Biao College of Physics and Optoelectronic Engineering Shenzhen University Shenzhen 518060 China

ISBN: (print) 9781837242672

This article explores the advancements and applications of Convolutional Neural Networks (CNNs) in image classification, focusing on the CIFAR-10 dataset. Since their inception in 2006, CNNs have revolutionized computer vision by efficiently extracting high-level features from raw images. This study discusses the structure of CNNs, including convolutional layers, pooling, and fully connected layers, which enhance feature extraction and classification accuracy. Experimentally, CNNs have demonstrated superior performance on tasks requiring the identification of complex patterns such as in autonomous driving, security, and medical imaging. The research further investigates the optimization of CNN models through architectural enhancements, attention mechanisms, data augmentation, model compression, and cross-domain knowledge transfer. This study concludes that multicore configurations with batch processing significantly outperform single-core setups, achieving lower latency and higher throughput, thus underscoring the potent applications and continuous evolution of deep learning in modern artificial intelligence challenges. © The Institution of Engineering & Technology 2024.

Keywords: Convolutional neural networks

TRAINING AN ARTIFICIAL INTELLIGENCE MODEL FOR THE DETECTION OF GESTURES RELATED TO TRICHOTILLOMANIA

22nd International Conference on e-Society 2024, ES 2024 and 20th International Conference on Mobile Learning 2024, ML 2024

Authors: de Gois Paulino, Daniel Victor Costa De Sousa Alves, Robinson Luis Instituto Federal de Educação Ciência e Tecnologia do Rio Grande do Norte Brazil

This article presents an artificial intelligence model capable of identifying actions strongly related to trichotillomania, a psychiatric disorder that causes people to have a desire to pull their hair. The model was trained with images and videos collected and variations generated through artificial intelligence to improve the image database. The work focused on the user's frontal perspective to optimize the construction of the dataset and neural network training. As a result, we obtained 89% precision in the model, which requires further testing and optimization to be used in real applications – still limited – that can provide users with statistics and results related to the disorder for possible treatments or alert the user to decrease their involuntary actions. © 2024 International Conferences e-Society 2024 and Mobile Learning 2024. All rights reserved.

Keywords: Artificial Intelligence; Behavioral Pattern Recognition; Computational Psychiatry; Image Processing; Machine Learning; Neural Networks; Trichotillomania

Research on Expression Recognition Algorithm Based on University Students' Mental State

2nd IEEE International Conference on Image Processing and Computer Applications, ICIPCA 2024

Authors: Luo, Guangli Guangzhou Institute of Science and Technology Guangzhou City 510540 China

ISBN: (print) 9798350360240

With the continuous development of higher education in our country, the number of college students is increasing year by year. There are more and more campus accidents caused by college students' psychological problems, which has aroused people's great attention to college students' psychological problems. Many scholars have conducted in-depth research on college students' psychological problems. To solve students' psychological problems, we must first find out the problems in time. With the improvement of the level of social information, face recognition technology integrates multiple key technologies such as computer vision, artificial intelligence, and deep learning, which has been widely used in different fields of society. However, there are few researches and applications on the mental health status of college students. Therefore, this paper combines the expression recognition technology with the psychological problems of college students, fully excavates the static and dynamic features of dynamic expression behavior through convolutional neural networks, establishes a facial expression recognition model, and conducts comparative experiments through open data sets to verify the accuracy of different algorithms, so as to timely discover the psychological problems of college students. The method of expression recognition is used to control the psychology of college students. © 2024 IEEE.

Keywords: Convolutional neural networks

PETNet- Coincident Particle Event Detection using Spiking Neural Networks

Neuro Inspired Computational Elements Conference (NICE)

Authors: Debus, Jan Debus, Charlotte Dissertori, Guenther Goetz, Markus Swiss Fed Inst Technol Inst Particle Phys & Astrophys Zurich Switzerland Karlsruhe Inst Technol Sci Comp Ctr Karlsruhe Germany Helmholtz AI Munich Germany

ISBN: (print) 9798350390599; 9798350390582

Spiking neural networks (SNN) hold the promise of being a more biologically plausible, low-energy alternative to conventional artificial neural networks. Their time-variant nature makes them particularly suitable for processing time-resolved, sparse binary data. In this paper, we investigate the potential of leveraging SNNs for the detection of photon coincidences in positron emission tomography (PET) data. PET is a medical imaging technique based on injecting a patient with a radioactive tracer and detecting the emitted photons. One central post-processing task for inferring an image of the tracer distribution is the filtering of invalid hits occurring due to e.g. absorption or scattering processes. Our approach, coined PETNet, interprets the detector hits as a binary-valued spike train and learns to identify photon coincidence pairs in a supervised manner. We introduce a dedicated multi-objective loss function and demonstrate the effects of explicitly modeling the detector geometry on simulation data for two use-cases. Our results show that PETNet can outperform the state-of-the-art classical algorithm with a maximal coincidence detection F1 of 95.2%. At the same time, PETNet is able to predict photon coincidences up to 36 times faster than the classical approach, highlighting the great potential of SNNs in particle physics applications.

Keywords: Spiking neural networks; positron emission tomography; particle coincidence detection; supervised denoising

An Efficient Knowledge Transfer Strategy for Spiking Neural Networks from Static to Event Domain

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence

Authors: He, Xiang Zhao, Dongcheng Li, Yang Shen, Guobin Kong, Qingqun Zeng, Yi Chinese Acad Sci Inst Automat Brain Inspired Cognit Intelligence Lab Beijing Peoples R China Univ Chinese Acad Sci Sch Artificial Intelligence Beijing Peoples R China Univ Chinese Acad Sci Sch Future Technol Beijing Peoples R China Chinese Acad Sci Ctr Excellence Brain Sci & Intelligence Technol Shanghai Peoples R China

ISBN: (print) 9781577358879

Spiking neural networks (SNNs) are rich in spatio-temporal dynamics and are suitable for processing event-based neuromorphic data. However, event-based datasets are usually less annotated than static datasets. This small data scale makes SNNs prone to overfitting and limits their performance. In order to improve the generalization ability of SNNs on event-based datasets, we use static images to assist SNN training on event data. In this paper, we first discuss the domain mismatch problem encountered when directly transferring networks trained on static datasets to event data. We argue that the inconsistency of feature distributions becomes a major factor hindering the effective transfer of knowledge from static images to event data. To address this problem, we propose solutions in terms of two aspects: feature distribution and training strategy. Firstly, we propose a knowledge transfer loss, which consists of domain alignment loss and spatio-temporal regularization. The domain alignment loss learns domain-invariant spatial features by reducing the marginal distribution distance between the static image and the event data. Spatio-temporal regularization provides dynamically learnable coefficients for domain alignment loss by using the output features of the event data at each time step as a regularization term. In addition, we propose a sliding training strategy, which gradually replaces static image inputs probabilistically with event data, resulting in a smoother and more stable training for the network. We validate our method on neuromorphic datasets, including N-Caltech101, CEP-DVS, and N-Omniglot. The experimental results show that our proposed method achieves better performance on all datasets compared to the current state-of-the-art methods. Code is available at https://***/Brain-Cog-Lab/Transfer-for-DVS.

Keywords: Alignment

Fake vs. Real Face Discrimination Using Convolutional Neural Networks

2nd Pan-African Conference on Artificial Intelligence (PanAfriCon AI)

Authors: Eissa, Khaled Schwenker, Friedhelm German Univ Cairo New Cairo Egypt Ulm Univ Inst Neural Informat Proc Ulm Germany

ISBN: (print) 9783031576386; 9783031576393

The rapid advancements in image manipulation technology and the proliferation of Generative Adversarial Network (GAN) generated content have created a pressing need for effective methods to distinguish between fake and real imagery. Convolutional neural networks (CNNs) have exhibited exceptional performance in image recognition tasks, making them an ideal choice for addressing the challenge of fake vs. real face discrimination. This paper proposes three new CNN models specifically designed for discriminating between fake and real facial images. The proposed models are compared with several state-of-the-art pre-trained models, such as Inception-ResNet-V2. An extensive dataset made up of both real and fake face images is chosen to enable thorough inspection. Extensive experiments are conducted to evaluate the performance of the proposed models and compare them with the selected pre-trained models. The evaluation metrics employed include Accuracy, Precision, Recall, F1-score, Area Under the Receiver Operating Characteristic Curve (AUC-ROC), Receiver Operating Characteristic (ROC), and Confusion Matrix. These metrics provide a comprehensive analysis of the model's capability to accurately classify fake and real faces. The results demonstrate that Model 3 achieves highly promising performance in discriminating between fake and real faces. The comparison with pre-trained models reveals the superiority of Model 3 in terms of testing results, discrimination ability on new data, training, and validation performances. This study contributes to the field of fake vs. real face discrimination. The findings hold significant implications for applications in areas such as forensic analysis and social media content moderation.

Keywords: Classification Threshold; CNN; Evaluation Metrics; Fake Face Detection; GAN
