ISBN:
(Print) 1577358872
This research investigates the generalization capabilities of neural networks in deep learning when applied to real-world scenarios where data often contains imperfections, focusing on their adaptability to both noisy and noise-free data in image retrieval tasks. Our study explores approaches that preserve all available data, regardless of quality, across diverse tasks. Evaluation criteria vary per task, since the ultimate goal is to develop a technique that extracts relevant information while disregarding noise in the final network design for each specific task. The aim is to enhance the accessibility and efficiency of AI across diverse tasks, particularly for individuals or countries with limited resources and no access to high-quality data. This work is dedicated to fostering inclusivity and unlocking the potential of AI for widespread societal benefit.
ISBN:
(Print) 9798350359329; 9798350359312
With the proliferation of social media data, Multimodal Named Entity Recognition (MNER) has received much attention; using different data modalities is crucial for the development of natural language processing and neural networks. However, existing methods suffer from two drawbacks: 1) text-image pairs in the data do not always correspond to each other, and the short texts typical of social media provide little contextual information to fall back on; 2) despite the introduction of visual information, heterogeneity gaps can arise in previous complex fusion methods, leading to misidentification. This paper proposes a new synthetic image with selected graphic alignment network (SAMNER) to address these challenges and construct a matching relationship between external images and text. To solve the text-image mismatch problem, we use a stable diffusion model to generate images and perform entity labeling. Specifically, the stable diffusion model generates candidate images, the candidates are filtered against an internal image set to select the image that best matches the text, and multimodal fusion is then performed to predict the entity labels. We design a simple and effective multimodal attention alignment mechanism to obtain a better visual representation, and we conduct extensive experiments. The experiments show that our model produces competitive results on two publicly available datasets.
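The abstract does not specify the attention alignment mechanism in detail; one plausible reading is scaled dot-product attention from text tokens to image-region features. The sketch below is an illustrative assumption (shapes, feature dimensions, and function names are made up, not taken from SAMNER):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def align_text_to_image(text_feats, image_feats):
    """Scaled dot-product attention: each text token attends to
    image-region features and receives a visual context vector.
    text_feats: (T, d), image_feats: (R, d) -- hypothetical shapes."""
    d = text_feats.shape[-1]
    scores = text_feats @ image_feats.T / np.sqrt(d)   # (T, R)
    weights = softmax(scores, axis=-1)                 # attention per token
    attended = weights @ image_feats                   # (T, d) visual context
    return attended, weights

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 64))     # 5 text tokens
regions = rng.normal(size=(9, 64))  # 9 image regions
ctx, w = align_text_to_image(text, regions)
```

Each row of `w` sums to 1, so every token's visual context is a convex combination of region features, which is what lets the fused representation ignore mismatched regions.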
Neural networks (NNs) have made significant progress in recent years and have been applied in a broad range of applications, including speech recognition, image classification, automated driving, and natural language processing. The hardware implementation of NNs presents challenges, and research communities have explored various analog and digital neuronal and synaptic devices for resource-efficient implementation. However, these hardware NNs face several challenges, such as overheads imposed by peripheral circuitry, speed-area tradeoffs, non-idealities associated with memory devices, low on-off resistance ratios, sneak-path issues, low weight precision, and power-inefficient converters. This article reviews different synaptic devices and discusses the challenges associated with implementing these devices in hardware, along with corresponding solutions, applications, and prospective future research directions. Several categories of emerging synaptic devices are explored and compared, including resistive random-access memory (RRAM), phase-change memory (PCM), analog-to-digital hybrid volatile memory-based, ferroelectric field-effect transistor (FeFET)-based, and spintronic devices based on spin transfer, spin-orbit effects, magnetic domain walls (DWs), and skyrmions. This study provides insights for researchers engaged in the field of hardware neural networks.
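The low on-off resistance ratio mentioned above can be illustrated with a toy crossbar model: even a "zero" weight leaks current through the off-state conductance, distorting the analog matrix-vector multiply. This is an illustrative sketch under made-up conductance values, not a model of any specific device from the review:

```python
import numpy as np

def map_to_conductance(w, g_on=1e-4, on_off_ratio=10.0):
    """Map normalized weights in [0, 1] to device conductances.
    A low on/off ratio compresses the usable range: G_off is not zero."""
    g_off = g_on / on_off_ratio
    return g_off + w * (g_on - g_off)

def crossbar_mvm(w, v_in, on_off_ratio):
    """Analog MVM: column currents are Kirchhoff sums of G * V."""
    g = map_to_conductance(w, on_off_ratio=on_off_ratio)
    return g @ v_in

rng = np.random.default_rng(1)
w = rng.random((4, 8))   # 4 outputs, 8 inputs, normalized weights
v = rng.random(8)        # input voltages
ideal = crossbar_mvm(w, v, on_off_ratio=1e6)   # near-ideal device
lossy = crossbar_mvm(w, v, on_off_ratio=5.0)   # realistic low ratio
err = np.linalg.norm(lossy - ideal) / np.linalg.norm(ideal)
```

Shrinking the on/off ratio from 1e6 to 5 adds a systematic offset current to every column, which is one reason the review flags it as a key non-ideality.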
ISBN:
(Print) 9783031723582; 9783031723599
The development of deep learning (DL) models has dramatically improved marker-free human pose estimation, including the important task of hand tracking. However, for applications in real-time-critical and embedded systems, e.g. in robotics or augmented reality, hand tracking based on standard frame-based cameras is too slow and/or power-hungry. The latency is already limited by the frame rate of the image sensor, and any subsequent DL processing widens the latency gap further while requiring substantial power. Dynamic vision sensors, on the other hand, enable sub-millisecond time resolution and output sparse signals that can be processed with an efficient Sigma-Delta Neural Network (SDNN) model that preserves the sparsity advantage within the neural network. This paper presents the training and evaluation of a small SDNN for hand detection, based on event data from the DHP19 dataset, deployed on Intel's Loihi 2 neuromorphic development board. We found it possible to deploy a hand detection model on a neuromorphic hardware backend without a notable performance difference from the original GPU implementation, at an estimated mean dynamic power consumption of approximately 7 mW for the network running on the chip.
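The Sigma-Delta idea behind the SDNN's sparsity advantage is to transmit only significant changes (deltas) and integrate them on the receiving side, so a slowly varying signal costs almost nothing. A minimal scalar illustration (not the Loihi 2 implementation; the threshold is arbitrary):

```python
import numpy as np

def sigma_delta_encode(signal, threshold=0.1):
    """Delta side: emit an event only when the input has moved more
    than `threshold` since the last transmitted value."""
    events, last = [], 0.0
    for t, x in enumerate(signal):
        delta = x - last
        if abs(delta) >= threshold:
            events.append((t, delta))
            last = x
    return events

def sigma_delta_decode(events, length):
    """Sigma side: accumulate received deltas to reconstruct the signal."""
    out, acc, i = np.zeros(length), 0.0, 0
    for t in range(length):
        while i < len(events) and events[i][0] == t:
            acc += events[i][1]
            i += 1
        out[t] = acc
    return out

sig = np.sin(np.linspace(0, 2 * np.pi, 100))
ev = sigma_delta_encode(sig, threshold=0.1)
rec = sigma_delta_decode(ev, len(sig))
```

Far fewer events than samples are transmitted, with reconstruction error bounded by the threshold; on event-driven hardware, fewer events translate directly into lower dynamic power.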
This paper investigates advanced techniques in image recognition and classification by integrating deep learning and machine learning approaches to achieve higher accuracy. Through the implementation of sophisticated ...
Stereo camera self-calibration is a complex challenge in computer vision applications such as robotics, object tracking, surveillance, and 3D reconstruction. To address this, we propose an efficient, fully automated end-to-end AI-based system for stereo camera self-calibration with varying intrinsic parameters, using only two images of any 3D scene. Our system combines deep convolutional neural networks (CNNs) with transfer learning and fine-tuning. First, our optimized end-to-end convolutional neural network model extracts matching points between a pair of stereo images. These matching points, together with their 3D scene correspondences, are used to formulate a non-linear cost function, and direct optimization is then performed to estimate the intrinsic camera parameters by minimizing this cost function. Following this initial optimization, a fine-tuning layer refines the intrinsic parameters for increased accuracy. Our hybrid approach is characterized by a specially optimized architecture that leverages the strengths of end-to-end CNNs for image feature extraction and processing, together with the non-linear cost function formulation and fine-tuning, to offer a robust and accurate method for stereo camera self-calibration. Extensive experiments on synthetic and real data demonstrate the superior performance of the proposed technique compared to traditional camera self-calibration methods in terms of precision and convergence speed.
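The core optimization step, minimizing a non-linear reprojection cost over intrinsic parameters, can be sketched with SciPy on synthetic pinhole data. The cost function, the reduced intrinsics (f, cx, cy), and the synthetic points below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np
from scipy.optimize import least_squares

def project(points_3d, f, cx, cy):
    """Pinhole projection with hypothetical intrinsics (f, cx, cy)."""
    x, y, z = points_3d.T
    return np.stack([f * x / z + cx, f * y / z + cy], axis=1)

def residuals(params, points_3d, observed_2d):
    """Reprojection error, flattened for least_squares."""
    f, cx, cy = params
    return (project(points_3d, f, cx, cy) - observed_2d).ravel()

rng = np.random.default_rng(2)
pts = rng.uniform([-1, -1, 4], [1, 1, 8], size=(50, 3))  # 3D scene points
true = (800.0, 320.0, 240.0)                             # ground-truth intrinsics
obs = project(pts, *true) + rng.normal(scale=0.2, size=(50, 2))

fit = least_squares(residuals, x0=[500.0, 300.0, 200.0], args=(pts, obs))
```

With sub-pixel observation noise the solver recovers the intrinsics closely from a rough initial guess, which is the role the paper assigns to its direct-optimization stage before fine-tuning.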
Computer vision technology for detecting objects in a complex environment often draws on other key technologies, including pattern recognition, artificial intelligence, and digital image processing. It has been shown that fast Convolutional Neural Networks (CNNs) with You Only Look Once (YOLO) are well suited to differentiating similar objects, handling constant motion, and coping with low image quality. The proposed study addresses these issues by implementing three different object detection algorithms: You Only Look Once (YOLO), Single Shot Detector (SSD), and Faster Region-Based Convolutional Neural Networks (Faster R-CNN). This paper compares the three deep-learning object detection methods to find the best combination of features and accuracy. The Faster R-CNN technique performed better than single-stage detectors such as YOLO and SSD in terms of accuracy, recall, precision, and loss.
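The accuracy, precision, and recall comparison above rests on matching predicted boxes to ground truth by overlap. A minimal sketch of the standard IoU criterion and a greedy precision/recall computation (illustrative, not the paper's evaluation code):

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(detections, gts, thresh=0.5):
    """Greedy one-to-one matching at a fixed IoU threshold."""
    matched, tp = set(), 0
    for det in detections:
        for j, gt in enumerate(gts):
            if j not in matched and iou(det, gt) >= thresh:
                matched.add(j); tp += 1
                break
    fp = len(detections) - tp
    fn = len(gts) - tp
    return tp / (tp + fp), tp / (tp + fn)

dets = [(0, 0, 2, 2), (10, 10, 12, 12)]
gts = [(0, 0, 2, 2), (20, 20, 22, 22)]
p, r = precision_recall(dets, gts)
```

Single-stage and two-stage detectors are then scored with exactly these quantities (often aggregated over thresholds as mAP), which is the footing for the comparison the abstract reports.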
ISBN:
(Digital) 9798350372748
ISBN:
(Print) 9798350372748
Image Caption Generation (ICG), situated at the confluence of computer vision and natural language processing, empowers machines to comprehend visual content and express it in human-like language. This research offers a comprehensive overview of key concepts, methodologies, and challenges in ICG. The process involves developing algorithms for the automatic generation of contextually relevant captions, utilizing deep neural networks for feature extraction and natural language processing techniques for coherent composition. Recent advancements, particularly in convolutional neural networks for image processing and recurrent neural networks for language modelling, have significantly elevated the performance of image captioning systems. The study delves into the core components of an ICG system, including pre-processing techniques for image data, feature extraction mechanisms, and the integration of language models. Attention mechanisms, a key innovation in this field, enable the model to focus on relevant image regions while generating captions, closely mirroring human attention patterns. Despite notable progress, ICG faces several challenges, such as handling diverse and complex visual scenes, ensuring cross-modal coherence between images and captions, and addressing biases present in training data. Ethical considerations, particularly in applications like automated content generation, are also discussed. The study concludes by highlighting potential future directions in ICG research, including the incorporation of multimodal learning approaches, enhancing the interpretability of generated captions, and addressing societal concerns related to bias and fairness. As ICG continues to evolve, it holds promise for applications ranging from accessibility for the visually impaired to improved content indexing and retrieval in multimedia databases. The research also underscores the significance of the accuracy attainments, showcasing the success of the pr...
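The attention mechanism described above, focusing on relevant image regions at each decoding step, can be sketched as Bahdanau-style additive attention. The weight matrices below are random stand-ins, not a trained captioning model:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(hidden, regions, W_h, W_r, v):
    """Score each image region against the decoder hidden state,
    then build a weighted visual context for the next word.
    hidden: (d,), regions: (R, d) -- hypothetical shapes."""
    scores = np.tanh(hidden @ W_h + regions @ W_r) @ v   # (R,)
    alpha = softmax(scores)                              # focus per region
    context = alpha @ regions                            # (d,) context vector
    return context, alpha

rng = np.random.default_rng(3)
d, R, a = 32, 6, 16   # feature dim, region count, attention dim
hidden = rng.normal(size=d)
regions = rng.normal(size=(R, d))
ctx, alpha = additive_attention(hidden, regions,
                                rng.normal(size=(d, a)),
                                rng.normal(size=(d, a)),
                                rng.normal(size=a))
```

The weights `alpha` form a distribution over regions; inspecting them per generated word is also a common route to the interpretability the study calls for.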
This Human Action Recognition and Medical Image Segmentation study presents a novel framework that leverages advanced neural network architectures to improve Medical Image Segmentation and Human Action Recognition (HAR). Gated Recurrent Units (GRUs) are used in the HAR domain to efficiently capture complex temporal correlations in video sequences, yielding better accuracy, precision, recall, and F1 score than current models. In computer vision and medical imaging, the current research environment highlights the significance of advanced techniques, especially when addressing problems such as computational complexity, resilience, and noise in real-world applications. Improved medical image segmentation and HAR are of growing interest. While methods such as the V-Net architecture for medical image segmentation and Spatial-Temporal Graph Convolutional Networks (ST-GCNs) for HAR have shown promise, they are constrained by factors such as processing requirements and noise sensitivity. The proposed methods highlight the necessity of sophisticated neural network topologies and optimisation techniques for medical image segmentation and HAR, with further study focusing on transfer learning and attention mechanisms. A Python tool has been implemented to perform min-max normalization, utilize a GRU for human action recognition, employ V-Net for medical image segmentation, and optimize with the Adam optimizer, with performance evaluation metrics integrated for comprehensive analysis. This study provides an optimised GRU network strategy for Human Action Recognition with 92% accuracy, and a V-Net-based method for Medical Image Segmentation with 88% Intersection over Union and 92% Dice Coefficient.
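Two components named above, min-max normalization and the IoU/Dice segmentation metrics, are simple enough to sketch directly. This is a minimal illustration on binary masks, not the paper's actual Python tool:

```python
import numpy as np

def min_max_normalize(x, eps=1e-8):
    """Scale features to [0, 1], as in the preprocessing step."""
    return (x - x.min()) / (x.max() - x.min() + eps)

def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    inter = np.logical_and(pred, target).sum()
    return 2 * inter / (pred.sum() + target.sum())

def iou_score(pred, target):
    """IoU = |A ∩ B| / |A ∪ B| for binary masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
# intersection = 2, |pred| = 3, |gt| = 3, union = 4
d = dice_coefficient(pred, gt)   # 4/6
j = iou_score(pred, gt)          # 2/4
```

Dice always exceeds IoU for partial overlaps (Dice = 2·IoU / (1 + IoU)), which is why the reported 92% Dice coexists with an 88% IoU on the same segmentations.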
Deep neural networks have been crucial in several recent developments in artificial intelligence and big data technology, including natural language processing, speech recognition, and computer vision. Given the numer...