Search Results - Inner Mongolia University Library

3DVSD: An end-to-end 3D convolutional object detection network for video smoke detection

FIRE SAFETY JOURNAL, 2022, Vol. 134, p. 1

Authors: Huo, Yinuo; Zhang, Qixing; Zhang, Yongming; Zhu, Jiping; Wang, Jinjun. Affiliation: Univ Sci & Technol China, State Key Lab Fire Sci, 96 Jinzhai Rd, Hefei, Anhui, Peoples R China

In addition to static features, dynamic features are also important for smoke recognition. 3D convolution can extract temporal and spatial information from video sequences. Currently, for video smoke detection, 3D convolution is usually used as a tool for secondary judgment of the detection results of single-frame approaches. In this work, an end-to-end object detection neural network based on 3D convolution for video smoke detection, named 3DVSD, is proposed for the first time. The network first captures moving objects from the input video sequences with the dynamic feature extraction part and then passes the feature tensor to the static feature extraction part for recognition and localization, which makes full use of the spatiotemporal features of smoke and improves the reliability of the algorithm. In addition, a time-series smoke video dataset for network training is proposed. The proposed algorithm is compared with other related studies. The experimental results demonstrated that 3DVSD is promising, with an accuracy rate of 99.54%, a false alarm rate of 1.11%, and a missed detection rate of 0.14%, and that it meets the requirements of real-time detection.

Keywords: 3D convolutional; Video frame sequence; Object detection; End-to-end; Video smoke detection
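
As a rough illustration of how a 3D convolution picks up temporal as well as spatial information from a frame sequence, here is a minimal pure-Python sketch. It is not the 3DVSD network from the paper: the function `conv3d_valid`, the kernel `TEMPORAL_DIFF`, and the toy clips are hypothetical names and data chosen only for this example.

```python
def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation (stride 1, no padding).

    volume is indexed as volume[t][y][x], kernel as kernel[dt][dy][dx].
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal and both spatial kernel axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Assumed kernel for illustration: (frame t+1) minus (frame t), 1x1 in space.
TEMPORAL_DIFF = [[[-1.0]], [[1.0]]]

# Two toy clips of 3 frames, 2x2 pixels each (made-up data).
static_clip = [[[5, 5], [5, 5]] for _ in range(3)]
moving_clip = [[[0, 0], [0, 0]], [[1, 0], [0, 0]], [[1, 1], [0, 0]]]

static_response = conv3d_valid(static_clip, TEMPORAL_DIFF)
moving_response = conv3d_valid(moving_clip, TEMPORAL_DIFF)
```

With this kernel the static clip yields an all-zero response, while the clip whose pixels change between frames yields nonzero values at the changed positions. A learned 3D kernel would combine such temporal differences with spatial patterns.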

MSAC-Net: 3D Multi-Scale Attention Convolutional Network for Multi-Spectral Imagery Pansharpening

REMOTE SENSING, 2022, Vol. 14, No. 12, p. 2761

Authors: Zhang, Erlei; Fu, Yihao; Wang, Jun; Liu, Lu; Yu, Kai; Peng, Jinye. Affiliations: Northwest A&F Univ, Sch Informat Engn, Xian 712100, Peoples R China; Northwest Univ, Sch Informat Sci & Technol, Xian 710127, Peoples R China; Shaanxi Prov Silk Rd Digital Protect & Inheritanc, Xian 710127, Peoples R China

Pansharpening fuses spectral information from the multi-spectral image and spatial information from the panchromatic image, generating super-resolution multi-spectral images with high spatial resolution. In this paper, we propose a novel 3D multi-scale attention convolutional network (MSAC-Net) based on the typical U-Net framework for multi-spectral imagery pansharpening. MSAC-Net is designed via 3D convolution, and the attention mechanism replaces the skip connection between the contraction and expansion pathways. Multiple pansharpening layers at the expansion pathway are designed to calculate the reconstruction results for preserving multi-scale spatial information. The MSAC-Net performance is verified on the IKONOS and QuickBird satellites' datasets, proving that MSAC-Net achieves comparable or superior performance to the state-of-the-art methods. Additionally, 2D and 3D convolution are compared, and the influences of the number of convolutions in the convolution block, the weight of multi-scale information, and the network's depth on the network performance are analyzed.

Keywords: deep learning; multi-spectral image; 3D convolutional; multi-scale cost

SMART-vision: survey of modern action recognition techniques in vision

Multimedia Tools and Applications, 2024, pp. 1-72

Authors: AlShami, Ali K.; Rabinowitz, Ryan; Lam, Khang; Shleibik, Yousra; Mersha, Melkamu; Boult, Terrance; Kalita, Jugal. Affiliations: Computer Science Department, University of Colorado Colorado Springs, 1420 Austin Bluffs Pkwy, Colorado Springs, CO 80918, United States; Information Technology Department, Can Tho University, Campus II, 3/2 Street, Ninh Kieu District, Can Tho, Viet Nam

Human Action Recognition (HAR) is a challenging domain in computer vision, involving recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals’ movements in videos. These patterns arise in sequential data, such as video frames, which are often essential to accurately distinguish actions that would be ambiguous in a single image. HAR has garnered considerable interest due to its broad applicability, ranging from robotics and surveillance systems to sports motion analysis, healthcare, and the burgeoning field of autonomous vehicles. While several taxonomies have been proposed to categorize HAR approaches in surveys, they often overlook hybrid methodologies and fail to demonstrate how different models incorporate various architectures and modalities. In this comprehensive survey, we present the novel SMART-Vision taxonomy, which illustrates how innovations in deep learning for HAR complement one another, leading to hybrid approaches beyond traditional categories. Our survey provides a clear roadmap from foundational HAR works to current state-of-the-art systems, highlighting emerging research directions and addressing unresolved challenges in discussion sections for architectures within the HAR domain. We provide details of the research datasets that various approaches used to measure and compare HAR approaches. We also explore the rapidly emerging field of Open-HAR systems, which challenges HAR systems by presenting samples from unknown, novel classes during test-time. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: 3D convolutional; Computer vision; deep learning; Graph convolutional network; Human action recognition; Machine learning; Motion models; Open-set recognition; Open-world learning; Transformer; Two-streams network; Vision-based

Dim and Small Target Detection in Multi-Frame Sequence Using Bi-Conv-LSTM and 3D-Conv Structure

IEEE ACCESS, 2021, Vol. 9, pp. 135845-135855

Authors: Liu, Xin; Li, Xiaoyan; Li, Liyuan; Su, Xiaofeng; Chen, Fansheng. Affiliations: Univ Chinese Acad Sci, Beijing, Peoples R China; Univ Chinese Acad Sci, Hangzhou Inst Adv Study, Hangzhou 310024, Peoples R China; Chinese Acad Sci, Key Lab Intelligent Infrared Percept, Shanghai 200083, Peoples R China

Infrared dim and small target detection is widely used in military and civil fields. Traditional methods in that application rely on the local contrast between the target and background for single-frame detection, and on a motion model with fixed parameters for multi-frame association. Because of the great similarity of gray values and the dynamic changes of motion model parameters under low SNR and strong clutter, those methods have weak robustness, low detection probability, and a high false alarm rate. In this paper, an infrared video sequence encoding and decoding model based on a Bidirectional Convolutional Long Short-Term Memory structure (Bi-Conv-LSTM) and a 3D convolutional structure (3D-Conv) is proposed, addressing the problems of high similarity and dynamic changes of parameters. To handle the dynamic change in parameters, the Bi-Conv-LSTM structure is used to learn the motion model of targets. To handle low local contrast, the 3D-Conv structure is adopted to extend the receptive field in the time dimension. To improve the precision of detection, the decoding part is divided into two different fully connected branches with distinct activation functions. Simulation results show that the trajectory detection accuracy of the proposed model is more than 90% under the condition of low SNR and maneuvering motion, which is better than the traditional methods (80% for DB-TBD and 20% for the others). Real-data experiments illustrate that the proposed method can detect small infrared targets with a low false alarm rate and a high detection probability.

Keywords: Video sequences; Object detection; Detection algorithms; Signal to noise ratio; Solid modeling; Feature extraction; Decoding; deep learning (DL); neural network (NN); dim and small target detection; long short-term memory (LSTM); 3D convolutional

Two-Stream Convolution Neural Network with Video-stream for Action Recognition

International Joint Conference on Neural Networks (IJCNN)

Authors: Dai, Wei; Chen, Yimin; Huang, Chen; Gao, Ming-Ke; Zhang, Xinyu. Affiliations: Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Sch Comp Engn & Sci, Shanghai, Peoples R China; China Elect Technol Grp Corp, Res Inst 32, Shanghai, Peoples R China

ISBN: (print) 9781728119854

Recently, as the application of the convolutional neural network in artificial intelligence is becoming increasingly diversified, a growing number of neural network methods are put forward. For example, 3D convolution and the two-stream convolution method based on RGB and optical stream are applied to the neural network. A convolutional neural network with a 3D convolutional core is able to extract spatio-temporal features directly from a set of video sequences, which are used for action recognition. Although the 3D convolutional neural network can obtain partial spatio-temporal information, a new ConvNet architecture called CVDN (Combined Video-stream Deep Network) is proposed to extract more spatio-temporal features from video fragments so as to effectively utilize the temporal information in the dataset. We evaluate our method on the UCF-101 dataset and obtain a good result. The following are some details about our method. First, we use ResNet models pre-trained on the Kinetics dataset to initialize our training models, then train and extract the video stream features from the UCF-101 dataset. Then, optical flow graphs obtained from the UCF-101 dataset, which are the input of the optical stream, are used to extract the optical features. Finally, the two-stream features are combined and the results are obtained after a Softmax layer. When the linear fusion ratio of video stream features and optical stream features is 5:4, CVDN obtains good results. The accuracy of our method with ResNet-101 reaches 92.2%.

Keywords: 3D convolutional; two-stream method; video stream; optical stream; fusion
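
The linear fusion step mentioned in the abstract (video stream to optical stream at 5:4, followed by Softmax) can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the function names `softmax` and `fuse_streams`, the normalization of the weights by their sum, and the per-class scores are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(video_scores, flow_scores, w_video=5.0, w_flow=4.0):
    """Weighted linear fusion of two per-class score vectors, then softmax.

    The 5:4 default follows the ratio quoted in the abstract; dividing by
    the weight sum is an assumption made here to keep scores on one scale.
    """
    weight_sum = w_video + w_flow
    fused = [
        (w_video * v + w_flow * f) / weight_sum
        for v, f in zip(video_scores, flow_scores)
    ]
    return softmax(fused)


# Made-up scores for three action classes from each stream.
video_scores = [2.0, 1.0, 0.1]
flow_scores = [0.5, 2.5, 0.2]

probs = fuse_streams(video_scores, flow_scores)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In this toy case the video stream alone favors class 0 and the optical stream favors class 1; after 5:4 fusion the combined prediction is class 1, which shows how the weaker-weighted stream can still decide the outcome when it is confident enough.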

A unified model of video-based human action categorization using Chaotic Quantum Swarm Intelligence on Intuitionistic Fuzzy 3D Convolution Neural Network

INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2019, Vol. 13, No. 4, pp. 507-521

Authors: Kumaravel, S.; Veni, S. Affiliation: Karpagam Acad Higher Educ, Dept Comp Sci, Coimbatore 21, Tamil Nadu, India

In the contemporary surveillance schemes of Computer Vision, videos concerning human action categorization have become a predominant zone, involving Pattern Recognition tasks. Factually, most of the human actions comprise complex temporal information, and it is quite difficult to discover the diverse activities of humans precisely, in an unpredictable variety of environmental circumstances. A Deep Learning paradigm can tackle this issue, by providing additional capabilities to vision-based human action recognition. However, there are more complex challenges in extracting the spatio-temporal features, for instance, the presence of noise in videos and the highly vague feature points. This paper proposes a hybrid intelligent Intuitionistic Fuzzy 3D Convolution Neural Network that uses Chaotic Quantum Swarm Intelligence (CQSI-IFCNN), to optimize video-based human action categorization. Vagueness and ambiguity of input video frames are inherited by Intuitionistic Fuzzy networks in terms of membership, hesitation and non-membership components. By applying Chaotic Quantum Swarm Intelligence (CQSI), the learning parameters and error rates that occur in standard convolutional neural networks are considerably reduced. The chaotic searching scheme is applied to overcome premature local optima in Quantum Swarm Intelligence. Therefore, this model produces optimized outcomes in Intuitionistic Fuzzy 3D Convolutional Neural Networks, thus improving the categorization of human actions in videos. The performance of CQSI-IFCNN is assessed by using the KTH and UCF Sports Action datasets. From the simulation outcomes, it is observed that CQSI-IFCNN has attained a higher rate of action categorization accuracy than standard CNN and PSO-CNN.

Keywords: Human action categorization; deep learning; intuitionistic fuzzy; Chaotic Quantum Swarm intelligence; 3D convolutional Neural Network
