With the growth of multi-source heterogeneous data, flexible retrieval across different modalities is an urgent demand in industrial applications. To allow users to control the retrieval results, a novel fabric image retrieval method based on multi-modal feature fusion is proposed in this paper. First, the image feature is extracted using a modified pre-trained convolutional neural network to separate macroscopic and fine-grained features, which are then selected and aggregated by a multi-layer perceptron. The feature of the modification text is extracted by a long short-term memory network. Subsequently, the two features are fused in a visual-semantic joint embedding space by gated and residual structures to control the selective expression of the separable image features. To validate the proposed scheme, a fabric image database for multi-modal retrieval is created as the benchmark. Qualitative and quantitative experiments indicate that the proposed method is practicable and effective and can be extended to other similar industrial fields, such as wood and wallpaper.
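A minimal sketch of the kind of gated, residual fusion of an image feature and a modification-text feature described above, assuming a CNN image embedding and an LSTM text encoder; the dimensions, layer choices, and class name are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class GatedResidualFusion(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=300, joint_dim=512):
        super().__init__()
        self.img_proj = nn.Sequential(nn.Linear(img_dim, joint_dim), nn.ReLU())
        self.txt_encoder = nn.LSTM(txt_dim, joint_dim, batch_first=True)
        # Gate decides how strongly the text is allowed to modify each image feature.
        self.gate = nn.Sequential(nn.Linear(2 * joint_dim, joint_dim), nn.Sigmoid())
        self.residual = nn.Linear(2 * joint_dim, joint_dim)

    def forward(self, img_feat, txt_tokens):
        v = self.img_proj(img_feat)                   # (B, joint_dim)
        _, (h, _) = self.txt_encoder(txt_tokens)      # final LSTM hidden state
        t = h[-1]                                     # (B, joint_dim)
        joint = torch.cat([v, t], dim=-1)
        g = self.gate(joint)
        # Gated selection of image features plus a residual correction from the text.
        fused = g * v + (1 - g) * self.residual(joint)
        return nn.functional.normalize(fused, dim=-1)  # joint embedding for retrieval
```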
This paper presents a distributed acoustic sensing (DAS) system integrated with artificial intelligence to provide real-time monitoring for fence perimeter and buried-system applications. The DAS system is a Rayleigh backscatter based fibre optic sensing system that has been deployed in two real-world commercial applications to detect acoustic wave propagation and scattering along perimeter lines and to classify intrusions accurately. Three signal processing methods, which we believe to be novel, are proposed to train filters that automatically select frequency bands from the power spectrum and generate hyper-spectral images from the data gathered by the DAS system without expert knowledge. The hyper-spectral images are analyzed by a neural network based object detection model. The system achieves 81.8% accuracy on a fence perimeter installation and 60.4% accuracy on a buried system installation in detecting and classifying various intrusion events. The evaluation interval of the integrated DAS system framework between event sensing and detection does not exceed 5 s. (c) 2025 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
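A hedged illustration of the general idea of turning a DAS time series into a band-selected pseudo "hyper-spectral" image by ranking power-spectrum bands and keeping the most energetic ones; the window size, sampling rate, band count, and selection rule are assumptions and not the paper's trained filters.

```python
import numpy as np
from scipy import signal

def das_to_hyperspectral(trace, fs=1000, n_bands=16, nperseg=256):
    # Power spectrogram of one DAS channel: shape (frequency bins, time frames).
    f, t, sxx = signal.spectrogram(trace, fs=fs, nperseg=nperseg)
    band_energy = sxx.sum(axis=1)                 # total power per frequency bin
    keep = np.argsort(band_energy)[-n_bands:]     # automatically selected bands
    image = np.log1p(sxx[np.sort(keep), :])       # (n_bands, T) pseudo-image
    return image                                  # fed to an object detector downstream

trace = np.random.randn(10_000)                   # stand-in for a recorded DAS channel
print(das_to_hyperspectral(trace).shape)
```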
ISBN (print): 1577358872
This research investigates the generalization capabilities of neural networks in deep learning when applied to real-world scenarios where data often contains imperfections, focusing on their adaptability to both noisy and non-noisy settings for image retrieval tasks. Our study explores approaches that preserve all available data, regardless of quality, for diverse tasks. The evaluation of results varies per task, because the ultimate goal is to develop, for each specific task, a technique that extracts relevant information while disregarding noise in the final network design. The aim is to enhance the accessibility and efficiency of AI across diverse tasks, particularly for individuals or countries with limited resources and without access to high-quality data. This work is dedicated to fostering inclusivity and unlocking the potential of AI for widespread societal benefit.
ISBN (print): 9798350359329; 9798350359312
With the proliferation of social media data, Multimodal Named Entity Recognition (MNER) has received much attention; using different data modalities is crucial for the development of natural language processing and neural networks. However, existing methods suffer from two drawbacks: 1) text-image pairs in the data do not always correspond to each other, and the short-text nature of social media makes it impossible to rely on contextual information; 2) despite the introduction of visual information, heterogeneity gaps may arise in previous complex fusion methods, leading to misidentification. This paper proposes a new synthetic-image with selected graphic alignment network (SAMNER) to address these challenges and construct a matching relationship between external images and text. To solve the graphic mismatch problem, we use a stable diffusion model to generate images and perform entity labeling. Specifically, we use the stable diffusion model to generate the image with the highest match to the text, filter the generated images against an internal image set to select the best image, and then perform multimodal fusion to predict the entity labels. We design a simple and effective multimodal attentional alignment mechanism to obtain a better visual representation, and we conduct a large number of experiments. The experiments show that our model produces competitive results on two publicly available datasets.
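A minimal sketch of one plausible form of the "multimodal attentional alignment" step, assuming text tokens attend over visual region features from the selected image; the feature size, head count, and single-layer design are assumptions rather than SAMNER's actual architecture.

```python
import torch
import torch.nn as nn

class TextToImageAlignment(nn.Module):
    def __init__(self, dim=768, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_feats, image_feats):
        # text_feats: (B, L, dim) token embeddings; image_feats: (B, R, dim) region features.
        aligned, _ = self.attn(query=text_feats, key=image_feats, value=image_feats)
        # Residual connection keeps the textual signal dominant when the image is uninformative.
        return self.norm(text_feats + aligned)

aligner = TextToImageAlignment()
out = aligner(torch.randn(2, 24, 768), torch.randn(2, 49, 768))
print(out.shape)  # (2, 24, 768): text tokens enriched with aligned visual evidence
```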
ISBN (print): 9798350350661; 9798350350654
In the realm of information management, the digitization of handwritten documents is pivotal. This research introduces an advanced Handwritten Optical Character Recognition (HOCR) model, leveraging Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory networks (BiLSTM), and the Connectionist Temporal Classification (CTC) loss function. Together, the Convolutional RNN-based Bi-LSTM CTC (CRBC) model demonstrates a robust 94% accuracy and adapts seamlessly across various domains, presenting a scalable solution for enhanced handwritten document processing. This fusion of machine learning and natural language processing techniques contributes to improved efficiency in information management, with potential applications in diverse industries and fields.
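A short sketch of a CNN + BiLSTM + CTC recognizer in the style the abstract describes; the layer sizes, alphabet size, and image height are assumptions, and this is not the CRBC model itself.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_classes=80, img_h=32):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_h = img_h // 4
        self.rnn = nn.LSTM(128 * feat_h, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, n_classes + 1)        # +1 for the CTC blank symbol

    def forward(self, x):                              # x: (B, 1, H, W)
        f = self.cnn(x)                                # (B, 128, H/4, W/4)
        b, c, h, w = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # one step per width slice
        out, _ = self.rnn(seq)
        return self.fc(out).log_softmax(-1)            # (B, T, n_classes + 1)

# CTC training step on dummy data: targets are label-index sequences with their lengths.
model = CRNN()
ctc = nn.CTCLoss(blank=80)                             # blank index matches the extra class
images = torch.randn(4, 1, 32, 128)
log_probs = model(images).permute(1, 0, 2)             # CTC expects (T, B, C)
targets = torch.randint(0, 80, (4, 10))
input_lens = torch.full((4,), 32, dtype=torch.long)
target_lens = torch.full((4,), 10, dtype=torch.long)
loss = ctc(log_probs, targets, input_lens, target_lens)
```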
Neural networks (NNs) have made significant progress in recent years and have been applied in a broad range of applications, including speech recognition, image classification, automatic driving, and natural language processing. The hardware implementation of NNs presents challenges, and research communities have explored various analog and digital neuronal and synaptic devices for resource-efficient implementation. However, these hardware NNs face several challenges, such as overheads imposed by peripheral circuitry, speed-area tradeoffs, non-idealities associated with memory devices, low on-off resistance ratios, sneak path issues, low weight precision, and power-inefficient converters. This article reviews different synaptic devices, discusses the challenges associated with implementing them in hardware along with corresponding solutions and applications, and prospects future research directions. Several categories of emerging synaptic devices, such as resistive random-access memory (RRAM), phase change memory (PCM), analog-to-digital hybrid volatile memory-based, ferroelectric field effect transistor (FeFET)-based, and spintronic spin-transfer, spin-orbit, magnetic domain wall (DW), and skyrmion synaptic devices, have been explored, and a comparison between them is presented. This study provides insights for researchers engaged in the field of hardware neural networks.
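A hedged illustration of the basic principle behind the crossbar-style synaptic devices the review surveys: weights are mapped to conductances and a vector-matrix multiply is carried out via Ohm's and Kirchhoff's laws, with the finite on-off resistance ratio mentioned above limiting the usable dynamic range. All numbers and the simple mapping rule are assumptions for illustration only, not a model of any specific device.

```python
import numpy as np

def weights_to_conductance(w, g_on=1e-4, on_off_ratio=100):
    # Finite on-off ratio: the lowest programmable conductance is g_on / ratio, not zero.
    g_off = g_on / on_off_ratio
    # Map normalized weights onto the device's conductance window [g_off, g_on].
    w_norm = (w - w.min()) / (w.max() - w.min() + 1e-12)
    return g_off + w_norm * (g_on - g_off)

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8))            # one layer's weight matrix
g = weights_to_conductance(weights)          # device conductances (siemens)
v = rng.uniform(0, 0.2, size=8)              # read voltages applied to the columns
currents = g @ v                             # summed row currents = analog multiply-accumulate
print(currents)
```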
This paper investigates advanced techniques in image recognition and classification by integrating deep learning and machine learning approaches to achieve higher accuracy. Through the implementation of sophisticated ...
ISBN (print): 9783031723582; 9783031723599
The development of deep learning (DL) models has dramatically improved marker-free human pose estimation, including the important task of hand tracking. However, for applications in real-time-critical and embedded systems, e.g. in robotics or augmented reality, hand tracking based on standard frame-based cameras is too slow and/or power hungry. The latency is already limited by the frame rate of the image sensor, and any subsequent DL processing further increases the latency gap while requiring substantial power for processing. Dynamic vision sensors, on the other hand, enable sub-millisecond time resolution and output sparse signals that can be processed with an efficient Sigma Delta Neural Network (SDNN) model that preserves the sparsity advantage in the neural network. This paper presents the training and evaluation of a small SDNN for hand detection, based on event data from the DHP19 dataset and deployed on Intel's Loihi 2 neuromorphic development board. We found it possible to deploy a hand detection model on the neuromorphic hardware backend without a notable performance difference from the original GPU implementation, at an estimated mean dynamic power consumption of approximately 7 mW for the network running on the chip.
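A minimal sketch of the sigma-delta idea the paper builds on: a layer transmits only the change in its activation (delta) above a threshold, and the receiving layer accumulates it (sigma), so slowly changing event-driven inputs produce very sparse traffic. The threshold and the plain-NumPy formulation are assumptions; this is not Intel's Lava/Loihi implementation.

```python
import numpy as np

class SigmaDeltaUnit:
    def __init__(self, size, threshold=0.05):
        self.prev = np.zeros(size)           # last transmitted activation
        self.threshold = threshold

    def encode(self, activation):
        delta = activation - self.prev
        # Only transmit deltas that exceed the threshold: the message is mostly zeros.
        mask = np.abs(delta) >= self.threshold
        message = np.where(mask, delta, 0.0)
        self.prev = self.prev + message
        return message

# Receiver side: state += message reconstructs the sender's activation up to threshold error.
unit = SigmaDeltaUnit(4)
print(unit.encode(np.array([0.2, 0.0, 0.01, -0.3])))   # sparse first message
print(unit.encode(np.array([0.21, 0.0, 0.01, -0.3])))  # almost nothing changes: near-zero traffic
```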
Recently, deep convolutional neural networks (CNNs) have achieved remarkable success in single-image super-resolution (SISR) tasks. However, these methods often suffer from high computational and memory requirements, limiting their practicality for real-world applications. To address this challenge, we propose a lightweight and efficient dual-branch information interaction network (DIIN) for SISR. DIIN adopts a dual-branch structure that differs from the typical serial network architectures. Specifically, we design the CNN branch and Transformer branch as parallel structures. In the CNN branch, we employ a symmetric dual-branch feature interaction module (DFIM) to extract valuable local feature information. Concurrently, the Transformer branch utilizes a recursive Transformer to capture long-term global information and enhance reconstructed image details. By simultaneously considering these two branches, our model effectively combines the strengths of the CNN in extracting local information and the Transformer in capturing global information. Recognizing the complementarity of these two branches in SISR, we further incorporate a coefficient learning scheme to enhance their information interaction and obtain more comprehensive feature information, thereby improving overall model performance. Extensive experiments demonstrate that our DIIN outperforms competitive methods while consuming fewer computational resources and memory.
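A hedged sketch of the parallel dual-branch idea with learned fusion coefficients: a convolutional branch and an attention branch process the same features and are mixed by trainable scalars. The block contents, sizes, and softmax weighting are placeholders, not the actual DIIN/DFIM modules.

```python
import torch
import torch.nn as nn

class DualBranchBlock(nn.Module):
    def __init__(self, channels=64, n_heads=4):
        super().__init__()
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.MultiheadAttention(channels, n_heads, batch_first=True)
        # Coefficient learning: one trainable mixing weight per branch.
        self.alpha = nn.Parameter(torch.ones(2))

    def forward(self, x):                              # x: (B, C, H, W)
        local = self.cnn_branch(x)                     # local features from the CNN branch
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C) for self-attention
        glob, _ = self.attn(tokens, tokens, tokens)    # global interactions
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        mix = torch.softmax(self.alpha, dim=0)
        return x + mix[0] * local + mix[1] * glob      # residual interaction of both branches

print(DualBranchBlock()(torch.randn(1, 64, 32, 32)).shape)
```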
Stereo camera self-calibration is a complex challenge in computer vision applications such as robotics, object tracking, surveillance, and 3D reconstruction. To address this, we propose an efficient, fully automated end-to-end AI-based system for automatic stereo camera self-calibration with varying intrinsic parameters, using only two images of any 3D scene. Our system combines deep convolutional neural networks (CNNs) with transfer learning and fine-tuning. First, our end-to-end optimized convolutional neural network model extracts matching points between a pair of stereo images. These matching points are then used, along with their 3D scene correspondences, to formulate a non-linear cost function. Direct optimization is subsequently performed to estimate the intrinsic camera parameters by minimizing this non-linear cost function. Following this initial optimization, a fine-tuning layer refines the intrinsic parameters for increased accuracy. Our hybrid approach is characterized by a specially optimized architecture that leverages the strengths of end-to-end CNNs for image feature extraction and processing, together with the non-linear cost function formulation and fine-tuning, to offer a robust and accurate method for stereo camera self-calibration. Extensive experiments on synthetic and real data demonstrate the superior performance of the proposed technique compared to traditional camera self-calibration methods in terms of precision and faster convergence.
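A hedged sketch of the optimization stage only: given matched points and their 3D correspondences (which the CNN stage would provide), the focal length and principal point are refined by minimizing reprojection error. The pinhole model, initial guess, and synthetic data are assumptions; the feature-extraction network and fine-tuning layer are not reproduced here.

```python
import numpy as np
from scipy.optimize import least_squares

def reprojection_residuals(params, pts_3d, pts_2d):
    # params = [f, cx, cy]: shared focal length and principal point of a pinhole camera.
    f, cx, cy = params
    K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
    proj = (K @ pts_3d.T).T
    proj = proj[:, :2] / proj[:, 2:3]          # perspective division to pixel coordinates
    return (proj - pts_2d).ravel()             # residuals to be minimized

# Synthetic stand-in data: 3D points in the camera frame and their observed pixels.
pts_3d = np.random.rand(50, 3) + np.array([0.0, 0.0, 2.0])
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
pix = (K_true @ pts_3d.T).T
pts_2d = pix[:, :2] / pix[:, 2:3]

result = least_squares(reprojection_residuals, x0=[500.0, 300.0, 200.0],
                       args=(pts_3d, pts_2d))
print(result.x)   # recovered [f, cx, cy], close to the true 800, 320, 240
```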