Search Results - Inner Mongolia University Library

Conference on Geospatial Informatics XIII

Authors: Guo, Hongji; Aved, Alexander; Roller, Collen; Ardiles-Cruz, Erika; Ji, Qiang. Affiliations: Rensselaer Polytech Inst Troy NY 12180 USA; Air Force Res Lab Rome NY 13441 USA

ISBN: (digital) 9781510661653

ISBN: (print) 9781510661646; 9781510661653

Human action recognition is important for many applications such as surveillance monitoring, safety, and health care. As 3D body skeletons can accurately characterize body actions and are robust to camera views, we propose a 3D skeleton-based human action recognition method. Unlike existing skeleton-based methods that use only geometric features for action recognition, we propose a physics-augmented encoder-decoder model that produces physically plausible geometric features for human action recognition. Specifically, given the input skeleton sequence, the encoder performs a spatiotemporal graph convolution to produce spatiotemporal features for both predicting human actions and estimating the generalized positions and forces of body joints. The decoder, implemented as an ODE solver, takes the joint forces and solves the Euler-Lagrange equation to reconstruct the skeletons in the next frame. By training the model to simultaneously minimize the action classification and the 3D skeleton reconstruction errors, the encoder learns to produce features that are discriminative and consistent with both the body skeletons and the underlying body dynamics. The physics-augmented spatiotemporal features are used for human action classification. We evaluate the proposed method on NTU-RGB+D, a large-scale dataset for skeleton-based action recognition. Compared with existing methods, our method achieves higher accuracy and better generalization ability.

Keywords: Skeleton-based action recognition; physics; encoder-decoder
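For orientation, the Euler-Lagrange equation that the decoder in the abstract above is said to solve has the following textbook form. This is a generic statement for a system with generalized coordinates q and generalized forces \tau, not the paper's own formulation: the record gives no equations, so the symbols L, T, V, M, C, and g are standard notation assumed here.

```latex
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) - \frac{\partial L}{\partial q} = \tau,
\qquad L(q, \dot{q}) = T(q, \dot{q}) - V(q)

% For an articulated rigid-body system this is commonly written as
M(q)\,\ddot{q} + C(q, \dot{q})\,\dot{q} + g(q) = \tau
```

Given the forces \tau, an ODE solver can integrate \ddot{q} = M(q)^{-1}\left(\tau - C(q, \dot{q})\,\dot{q} - g(q)\right) forward in time to obtain the joint configuration at the next frame.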


Encoder-Decoder Structure With the Feature Pyramid for Depth Estimation From a Single Image


IEEE ACCESS, 2021, vol. 9, pp. 22640-22650

Authors: Tang, Mengxia; Chen, Songnan; Dong, Ruifang; Kan, Jiangming. Affiliations: Beijing Forestry Univ Sch Technol Beijing 100083 Peoples R China

We address the problem of depth estimation from a single monocular image. Depth estimation from a single image is an ill-posed and inherently ambiguous problem. We propose an encoder-decoder structure with a feature pyramid to predict the depth map from a single RGB image. More specifically, the feature pyramid is used to detect objects of different scales in the image. The encoder extracts the most representative information from the original image through a series of convolution operations and reduces the resolution of the input image. We adopt Res2-50 as the encoder to extract important features. The decoder uses a novel upsampling structure to improve the output resolution. We also propose a novel loss function that adds gradient loss and surface normal loss to the depth loss, so that the network can predict not only the global depth but also the depth of fuzzy edges and small objects. Additionally, we use Adam as the optimizer to train our network and speed up convergence. Our extensive experimental evaluation demonstrates the efficiency and effectiveness of the method, which is competitive with previous methods on the Make3D dataset and outperforms state-of-the-art methods on the NYU Depth v2 dataset.

Keywords: Estimation; Feature extraction; Three-dimensional displays; Periodic structures; Optimization; Decoding; Task analysis; Depth prediction; encoder-decoder; feature pyramid; single image
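The abstract above describes a loss that adds a gradient term and a surface normal term to the depth loss. The record does not give the exact formulas, so the following is only a minimal NumPy sketch of one common way to build such a loss. The L1 depth term, the finite-difference gradient term, the cosine-based normal term, and the weights `w_grad` and `w_normal` are all assumptions made for illustration.

```python
import numpy as np

def depth_loss(pred, target):
    """Mean absolute depth error (assumed L1 form)."""
    return float(np.mean(np.abs(pred - target)))

def gradient_loss(pred, target):
    """Mean absolute horizontal and vertical finite differences of the depth error."""
    diff = pred - target
    dx = np.abs(np.diff(diff, axis=1))  # differences between neighboring columns
    dy = np.abs(np.diff(diff, axis=0))  # differences between neighboring rows
    return float(dx.mean() + dy.mean())

def surface_normals(depth):
    """Unit normals proportional to (-dz/dx, -dz/dy, 1) at every pixel."""
    dzdy, dzdx = np.gradient(depth)  # gradients along rows, then columns
    n = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def normal_loss(pred, target):
    """One minus the mean cosine similarity between predicted and target normals."""
    cos = np.sum(surface_normals(pred) * surface_normals(target), axis=-1)
    return float(np.mean(1.0 - cos))

def combined_loss(pred, target, w_grad=1.0, w_normal=1.0):
    """Depth loss plus weighted gradient and normal terms (weights are hypothetical)."""
    return (depth_loss(pred, target)
            + w_grad * gradient_loss(pred, target)
            + w_normal * normal_loss(pred, target))

# Usage on a small synthetic depth map.
target = np.arange(20, dtype=float).reshape(4, 5)
pred = target + 1.0  # constant offset: only the depth term is affected
print(combined_loss(pred, target))
```

A constant offset changes the depth term but leaves gradients and normals untouched, which is why the extra terms are useful for penalizing errors at edges and on small structures rather than global bias.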


An Improved and Robust Encoder-Decoder for Skin Lesion Segmentation


ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2022, vol. 47, no. 8, pp. 9861-9875

Authors: Hafhouf, Bellal; Zitouni, Athmane; Megherbi, Ahmed Chaouki; Sbaa, Salim. Affiliations: Univ Mohamed Khider Dept Elect Engn LESIA Lab Biskra Algeria; Univ Mohamed Khider Dept Elect Engn LICCC Lab Biskra Algeria

Automatic segmentation of skin lesions is an important step in computer-aided diagnosis systems for melanoma detection. Although numerous methods have been proposed in the literature, the task remains challenging because of the similarity between different lesions and the complex visual characteristics that may be present in the images. In this paper, we propose major modifications to the state-of-the-art U-Net structure to further improve its capability in skin lesion segmentation. These modifications are made in both the encoding and the decoding paths. Instead of using only standard convolutional layers as U-Net does, the proposed encoding path consists of 10 standard convolutional layers, inspired by the Visual Geometry Group (VGG16) network, followed by a pyramid pooling module and a dilated convolutional block. This combination makes it possible to learn more representative feature maps and preserve more spatial resolution. Furthermore, dilated residual blocks are introduced in the decoding path to further refine the segmentation maps. The experimental results on three datasets, the IEEE International Symposium on Biomedical Imaging (ISBI) 2017, ISBI 2016, and PH2, showed that our proposed method performs better than the basic U-Net, FCN, SegNet, and U-Net++, and matches the performance of state-of-the-art segmentation techniques with minimal pre- and post-processing operations.

Keywords: Dilated convolution; Pyramid pooling; Skin lesion segmentation; Dermoscopy; encoder-decoder; U-Net


ED-DRAP: Encoder-Decoder Deep Residual Attention Prediction Network for Radar Echoes


IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, vol. 19, p. 1

Authors: Che, Hongshu; Niu, Dan; Zang, Zengliang; Cao, Yichao; Chen, Xisong. Affiliations: Southeast Univ Key Lab Measurement & Control CSE Minist Educ Nanjing 210096 Peoples R China; Southeast Univ Sch Automat Nanjing 210096 Peoples R China; PLA Univ Sci & Technol Inst Meteorol & Oceanog Nanjing 211101 Peoples R China

Precipitation nowcasting is important and fundamental: it underlies various public services ranging from rainstorm warnings to flight safety. To further improve the prediction accuracy for the spatiotemporal sequence forecasting problem, we propose an encoder-decoder deep residual attention prediction network, which adaptively rescales the multiscale sequence- and spatial-wise features and achieves very deep trainable residual prediction by integrating global residual learning and local deep residual sequence and spatial attention blocks (RSSABs). Experiments on a real-world radar echo map dataset of South China show that, compared with the PredRNN++ and TrajGRU methods and newly proposed Unet-based methods, our ED-DRAP network performs better on the precipitation nowcasting metrics while occupying little GPU memory.

Keywords: Feature extraction; Spatiotemporal phenomena; Radar; Decoding; Forecasting; Three-dimensional displays; Radar remote sensing; Deep residual prediction; encoder-decoder; precipitation nowcasting; sequence and spatial attention


A Multi-Head Self-Attention-based on GRU Encoder-Decoder Framework for Predicting Molten Iron Silicon Content


IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Cai, Yu; Yang, Chunjie; Lou, Siwei; Zeng, Zhenyu; Liao, Huanyu; Zhang, Bing. Affiliations: Zhejiang Univ Coll Control Sci & Engn Hangzhou 310013 Peoples R China; Alibaba Grp Hangzhou 311121 Peoples R China

ISBN: (print) 9798350321050

Silicon content is a significant index in the process of blast furnace ironmaking. It is used to measure the quality of molten iron *** only meets the requirements if it is too high or too low. In the production process, the silicon content in molten iron needs to be controlled within a stable *** the same time, due to the time lag, nonlinear and dynamic characteristics of the blast furnace itself, it is difficult to predict the silicon content accurately. This paper proposes a multi-head self-attention-based gated recurrent unit encoder-decoder framework that better extracts global dynamic features and local features and improves prediction accuracy, as verified experimentally.

Keywords: Blast Furnace; Si Content Prediction; encoder-decoder; Multi-Head Attention Mechanism; Gate Recurrent Unit


Hourly solar irradiance forecasting based on encoder-decoder model using series decomposition and dynamic error compensation


ENERGY CONVERSION AND MANAGEMENT, 2022, vol. 270

Authors: Tong, Junlong; Xie, Liping; Fang, Shixiong; Yang, Wankou; Zhang, Kanjian. Affiliations: Southeast Univ Sch Automat Nanjing 210096 Peoples R China; Minist Educ Key Lab Measurement & Control Complex Syst Engn Nanjing 210096 Peoples R China

Accurate solar irradiance prediction is crucial for harnessing solar energy resources. However, the pattern of the irradiance sequence is intricate because of its nonlinear and non-stationary characteristics. In this paper, a deep hybrid model based on an encoder-decoder is proposed to cope with this complex pattern in hourly irradiance forecasting. The hybrid deep model integrates complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), an encoder-decoder module, and a dynamic error compensation (DEC) architecture. CEEMDAN is used to reduce the nonlinearity and non-stationarity of the irradiance sequence. The encoder-decoder integrates temporal convolutional networks (TCN), long short-term memory networks (LSTM), and a multi-layer perceptron (MLP) for temporal feature extraction and multi-step prediction. The DEC architecture dynamically updates the model based on adjacent error information to mine the predictable components of that error. Furthermore, a new loss function is proposed for multi-objective optimization to balance the performance of multi-step forecasting. In the hourly irradiance forecasting experiments on three public datasets, the root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (R) of the proposed model are in the ranges 30.693-34.433 W/m², 19.398-22.900 W/m², and 0.9872-0.9902, respectively. Compared with the benchmark models (including MLP, LSTM, and TCN), the RMSE and MAE are reduced by 10.76%-22.00% and 5.47%-20.40%, respectively. The experimental results indicate that the proposed model delivers accurate and robust forecasting performance and is a reliable alternative for hourly irradiance forecasting.

Keywords: Solar irradiance forecasting; Deep learning; Temporal convolutional network; encoder-decoder; Long short term memory; Error compensation
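The abstract above reports RMSE, MAE, and the correlation coefficient R, plus percentage reductions relative to benchmark models. As a reminder of how these figures are usually computed, here is a minimal NumPy sketch. The function names and the sample values are hypothetical and are not taken from the paper, and R is assumed to be the Pearson correlation coefficient.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def relative_reduction(baseline, value):
    """Percentage by which `value` improves on `baseline` (lower is better)."""
    return (baseline - value) / baseline * 100.0

# Hypothetical hourly irradiance values in W/m^2, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(rmse(observed, predicted))       # 10.0, since every error is +/-10
print(mae(observed, predicted))        # 10.0
print(pearson_r(observed, predicted))
print(relative_reduction(40.0, 30.0))  # 25.0 percent lower than the baseline
```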


EDChannel: channel prediction of backscatter communication network based on encoder-decoder


TELECOMMUNICATION SYSTEMS, 2022, vol. 81, no. 1, pp. 99-114

Authors: Li, Dengao; Wen, Yongxin; Xu, Shuang; Wang, Qiang; Bai, Ruiqin; Zhao, Jumin. Affiliations: Taiyuan Univ Technol Coll Data Sci Jinzhong 030600 Shanxi Peoples R China; Taiyuan Univ Technol Coll Informat & Comp Jinzhong 030600 Shanxi Peoples R China; Technol Res Ctr Spatial Informat Network Engn Sha Jinzhong 030600 Peoples R China; Intelligent Percept Engn Technol Ctr Shanxi Jinzhong 030600 Peoples R China

Backscatter communication networks have attracted much attention due to their small size and low power consumption, but their spectrum resources are very limited and are often affected by link bursts. Channel prediction is a way to use spectrum resources effectively and improve communication quality. Most channel prediction methods fail to consider both spatial and frequency diversity. Meanwhile, existing channel detection methods still have deficiencies in terms of overhead and hardware dependency. For these reasons, we design a sequence-to-sequence channel prediction scheme with three modules. The channel prediction module uses an encoder-decoder based deep learning model (EDChannel) to predict the sequence of channel indicator measurements. The channel detection module decides whether to perform channel detection using a trigger that reflects the prediction quality. The channel selection module selects a channel based on the channel coefficients of the prediction results. We use a commercial reader to collect data in a real environment and build the EDChannel model with the deep learning modules of TensorFlow and Keras. As a result, we have implemented the channel prediction module and completed the overall channel selection process. The experimental results show that the EDChannel algorithm has higher prediction accuracy than the previous state-of-the-art methods. The overall throughput of our scheme is improved by approximately 2.9% and 14.1% over Zhao's scheme in stable and unstable environments, respectively.

Keywords: Backscatter communication; Channel prediction; Deep learning; encoder-decoder


DHA-Net: An encoder-decoder network fusing multi-scale features for optic disc segmentation


IEEE International Instrumentation and Measurement Technology Conference (I2MTC) - Rising Above Covid-19

Authors: Zheng, Xuan; He, Yi; Yuan, Huaqing; Jiang, Yanglin; Xu, Yanbin. Affiliations: Tianjin Univ Sch Elect & Informat Engn Tianjin Peoples R China; TianJin Eye Hosp Tianjin Peoples R China

ISBN: (print) 9781665453837

Automatic and accurate segmentation of the optic disc (OD) region has practical applications in the medical field. In this study, a novel encoder-decoder network is proposed to segment ODs automatically and accurately. The encoder consists of three parts: (1) a low-level feature extraction module composed of a dense connectivity block (Dense Block), which outputs rich low-level features; (2) a High-resolution Block (HR Block), which extracts sufficient semantic information while reducing parameters; (3) an Atrous Spatial Pyramid Pooling (ASPP) module, which is used to obtain high-level features. The network is therefore named DHA-Net. The proposed decoder takes advantage of the multi-scale features from the encoder to predict OD regions. Comparisons with existing methods on three datasets show that the proposed method outperforms current leading methods in segmenting both normal and abnormal fundus images. The ablation studies show the influence of each module on segmentation performance and justify the network structure. With fewer network parameters, DHA-Net achieves better prediction performance on intersection over union (IoU), dice similarity coefficient (DSC), and other evaluation metrics. DHA-Net is lightweight and can use multi-scale features to predict OD regions.

Keywords: Medical image segmentation; Convolutional neural network; Optic disc; encoder-decoder; Multi-scale features
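The abstract above evaluates segmentation with intersection over union (IoU) and the dice similarity coefficient (DSC). The sketch below shows the standard definitions for binary masks using NumPy. The tiny masks are made up for illustration, and returning 1.0 when both masks are empty is a convention assumed here.

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as a perfect match (assumed convention)
    return float(np.logical_and(pred, target).sum() / union)

def dice(pred, target):
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|)."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as a perfect match (assumed convention)
    return float(2 * np.logical_and(pred, target).sum() / total)

# Hypothetical 2x3 masks: 2 pixels overlap, 4 pixels in the union, 3 pixels in each mask.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]

print(iou(pred_mask, true_mask))   # 0.5
print(dice(pred_mask, true_mask))  # 4/6, about 0.667
```

The two metrics are monotonically related (DSC = 2·IoU / (1 + IoU)), so they rank methods the same way on a single image even though their values differ.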


Comparison of Encoder-Decoder Networks for Soccer Field Segmentation


20th Latin American Robotics Symposium (LARS) / 15th Brazilian Symposium on Robotics (SBR) / 14th Workshop on Robotics in Education (WRE)

Authors: Guimaraes, Otavio H. R.; Maximo, Marcos R. O. A.; Parente de Oliveira, Jose Maria. Affiliations: Ecole Polytech Inst Polytech Paris Route Saclay F-91128 Palaiseau Ile De France France; Aeronaut Inst Technol Comp Sci Div Autonomous Computat Syst Lab Praca Marechal Eduardo Gomes 50 BR-12228900 Sao Jose Dos Campos SP Brazil; Aeronaut Inst Technol Comp Sci Div Praca Marechal Eduardo Gomes 50 BR-12228900 Sao Jose Dos Campos SP Brazil

ISBN: (print) 9798350315387

Convolutional neural networks are state-of-the-art models for solving computer vision problems. This paper evaluates the efficiency of several encoder-decoder neural networks trained to segment the soccer field in Humanoid KidSize Robot Soccer competitions. To compare the efficiency of several encoders, a total of fourteen neural network models, based on the U-Net and SegNet architectures, were tested and compared in terms of accuracy, cost function value, IoU, and average inference time. The U-Net-based networks that used MobileNetv3Small or ResNet18 as the encoder were found to be the best of the considered alternatives for segmenting the soccer field.

Keywords: neural networks; CNN; encoder-decoder; semantic segmentation; robot soccer


An Encoder-Decoder Method with Position-Aware for Printed Mathematical Expression Recognition


17th International Conference on Document Analysis and Recognition (ICDAR)

Authors: Long, Jun; Hong, Quan; Yang, Liu. Affiliations: Cent South Univ Sch Comp Sci & Engn Changsha 410075 Hunan Peoples R China; Cent South Univ Big Data Inst Changsha 410083 Hunan Peoples R China

ISBN: (print) 9783031416750; 9783031416767

Printed mathematical expression recognition transforms a printed mathematical formula image into a LaTeX sequence. Recently, many deep learning methods have been proposed for this task. However, the positional relationship between mathematical symbols is often ignored or insufficiently represented, leading to the loss of structural features of mathematical formulas. To overcome this challenge, we propose a position-aware encoder-decoder model for printed mathematical expression recognition. We design a two-dimensional position encoding algorithm based on sin/cos functions to capture the positional relationship between mathematical symbols. Meanwhile, we adopt a more advanced image feature extraction network. In the decoder, we use a Bi-GRU as the translator and add an attention mechanism so that the decoder focuses on important local information. We conduct experiments on the public dataset IM2LaTeX-100K, and the results show that our approach outperforms the majority of advanced methods.

Keywords: Deep learning; Mathematical expression recognition; encoder-decoder; Position encoding; Attention mechanism
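The abstract above mentions a two-dimensional position encoding built from sin/cos functions but does not spell it out. The sketch below shows one widely used construction, assumed here for illustration rather than taken from the paper: a Transformer-style 1D sinusoidal encoding is computed separately for row and column indices, and the two are concatenated along the channel axis. The function names and the base of 10000 are assumptions.

```python
import numpy as np

def sincos_1d(positions, dim):
    """Transformer-style encoding: even channels hold sin, odd channels hold cos."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    freqs = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))  # shape (dim/2,)
    angles = positions[:, None] * freqs[None, :]           # shape (n, dim/2)
    enc = np.zeros((len(positions), dim))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc

def sincos_2d(height, width, dim):
    """First half of the channels encodes the row, second half the column."""
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    row = sincos_1d(np.arange(height), half)  # shape (H, half)
    col = sincos_1d(np.arange(width), half)   # shape (W, half)
    enc = np.zeros((height, width, dim))
    enc[:, :, :half] = row[:, None, :]        # broadcast row code across columns
    enc[:, :, half:] = col[None, :, :]        # broadcast column code across rows
    return enc

# Usage: an encoding for a 3x5 feature map with 8 channels.
encoding = sincos_2d(3, 5, 8)
print(encoding.shape)  # (3, 5, 8)
```

Such an encoding is typically added to the feature map produced by the image encoder, so that each location carries both its row and its column position before decoding.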
