Search results - Inner Mongolia University Library

Pedestrian Trajectory Prediction in Heterogeneous Traffic using Facial Keypoints-based Convolutional encoder-decoder Network

ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2022, Vol. 22, No. 4, p. 83

Authors: Xiao, Song; Chen, Kai; Ren, Xiaoxiang; Yuan, Haitao. Affiliations: Beihang Univ, Beijing, Peoples R China; Nanan Jr High Sch, Quanzhou, Peoples R China; New Jersey Inst Technol, Newark, NJ 07102, USA

Future pedestrian trajectory prediction offers great prospects for many practical applications such as unmanned vehicles, building evacuation design and robotic path planning. Most existing methods focus on social interaction among pedestrians but ignore the fact that heterogeneous traffic objects (cars, dogs, bicycles, motorcycles, etc.) have a significant influence on the future trajectory of a subject pedestrian. Also, a pedestrian's intended walking direction may be inferred from his/her facial keypoints. Considering this, this work proposes to predict a pedestrian's future trajectory by jointly using neighboring heterogeneous traffic information and his/her facial keypoints. To this end, an end-to-end facial keypoints-based convolutional encoder-decoder network (FK-CEN) is designed, which takes the heterogeneous traffic information and facial keypoints as input. After training, FK-CEN is evaluated on 5 crowded video sequences collected from the public datasets MOT-16 and MOT-17. Experimental results demonstrate that it outperforms state-of-the-art approaches in terms of prediction error.

Keywords: Social-interaction; pedestrian intention; convolutional long-short-term memory; encoder-decoder; attention

Efficient encoder-decoder Network With Estimated Direction for SAR Ship Detection

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, Vol. 19

Authors: Niu, Yuzhen; Li, Yuezhou; Huang, Jiangyi; Chen, Yuzhong. Affiliations: Fuzhou Univ, Coll Comp & Data Sci, Fujian Prov Key Lab Networking Comp & Intelligent, Fuzhou 350116, Peoples R China; Minist Educ, Key Lab Spatial Data Min & Informat Sharing, Fuzhou 350108, Fujian, Peoples R China

Synthetic aperture radar (SAR) image ship detection has important applications in marine surveillance. There are two limitations when advanced detection methods are applied naively to SAR ship detection. First, most detectors construct the model as an encoder and rely on the feature pyramid network (FPN) head for accurate prediction, which may lead to high computational costs. Second, the background noise in the ground truth (annotated as rectangular bounding boxes) of angled ships makes model training difficult. To meet these challenges, we propose an efficient encoder-decoder network with estimated direction for ship detection in SAR images. First, we present an anchor-free encoder-decoder model that can efficiently extract multiple-level features. Second, we formulate ship detection as a multitask learning problem, including a bounding box prediction and a ship direction regression. The estimated ship direction can weakly supervise and benefit ship detection. Furthermore, we develop a center-weighted labeling method for overlapped annotations. Comprehensive experiments on the SAR-Ship-Detection and SSDD datasets show that our method achieves state-of-the-art performance with a high running speed.

Keywords: Marine vehicles; Radar polarimetry; Synthetic aperture radar; Decoding; Feature extraction; Task analysis; Background noise; encoder-decoder; multitask learning; ship detection in SAR image; synthetic aperture radar (SAR) image

TED-Net: Convolution-Free T2T Vision Transformer-Based Encoder-Decoder Dilation Network for Low-Dose CT Denoising

12th International Workshop on Machine Learning in Medical Imaging (MLMI 2021)

Authors: Wang, Dayang; Wu, Zhan; Yu, Hengyong. Affiliations: Univ Massachusetts Lowell, Dept Elect & Comp Engn, Lowell, MA 01854, USA

ISBN: (print) 9783030875886; 9783030875893

Low-dose computed tomography (CT) is mainstream in clinical applications. However, compared to normal-dose CT, low-dose CT (LDCT) images contain stronger noise and more artifacts, which are obstacles for practical applications. In the last few years, convolution-based end-to-end deep learning methods have been widely used for LDCT image denoising. Recently, transformers have shown superior performance over convolution, with more feature interactions. Yet their application to LDCT denoising has not been fully explored. Here, we propose a convolution-free T2T vision transformer-based Encoder-decoder Dilation Network (TED-Net) to enrich the family of LDCT denoising algorithms. The model is free of convolution blocks and consists of a symmetric encoder-decoder block built solely from transformers. Our model (code is available at https://***/wdayang/TED-Net) is evaluated on the AAPM-Mayo Clinic LDCT Grand Challenge dataset, and results show that it outperforms state-of-the-art denoising methods.

Keywords: Low-dose CT; Transformer; Token-to-Token; encoder-decoder; Dilation

Micro-climate Prediction - Multi Scale Encoder-decoder based Deep Learning Framework

27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)

Authors: Kumar, Peeyush; Chandra, Ranveer; Bansal, Chetan; Kalyanaraman, Shivkumar; Ganu, Tanuja; Grant, Michael. Affiliations: Microsoft Res, Redmond, WA 98052, USA; Microsoft, Bengaluru, India; Microsoft Res, Bengaluru, India; Univ Washington, Seattle, WA 98195, USA; Climate Corp, San Francisco, CA, USA

ISBN: (print) 9781450383325

This paper presents a deep learning approach for a versatile micro-climate prediction framework (DeepMC). Micro-climate predictions are of critical importance across various applications, such as agriculture, forestry, energy, search and rescue, etc. To the best of our knowledge, there is no other single framework that can accurately predict various micro-climate entities using Internet of Things (IoT) data. We present a generic framework (DeepMC) that predicts various climatic parameters such as soil moisture, humidity, wind speed, radiation and temperature, based on the requirement, over a period of 12 hours - 120 hours with a varying resolution of 1 hour - 6 hours, respectively. This framework proposes the following new ideas: 1) localization of weather forecasts to IoT sensors by fusing weather station forecasts with the decomposition of IoT data at multiple scales and 2) a multi-scale encoder and two levels of attention mechanisms that learn a latent representation of the interaction between various resolutions of the IoT sensor data and weather station forecasts. We present multiple real-world agricultural and energy scenarios, and report results with uncertainty estimates from the live deployment of DeepMC, which demonstrate that DeepMC outperforms various baseline methods and reports 90%+ accuracy with tight error bounds.

Keywords: Micro-climate prediction; Sustainability; Deep learning; encoder-decoder; Attention mechanism; Transfer learning; IoT sensors; Sequence to sequence; Time series; Agriculture; Energy; Wavelet transform
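
Illustrative note (not part of the catalog record): the abstract mentions decomposing IoT data at multiple scales, and the keywords list a wavelet transform, but the record does not say which wavelet or how many levels are used. The sketch below assumes the simplest choice, an orthonormal Haar wavelet, purely to show what a multi-scale decomposition of a sensor series looks like. The function names `haar_step` and `haar_decompose` are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar level: split an even-length series into (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise sums capture the coarse trend, pairwise differences the fine detail.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_levels, final_approximation]."""
    parts = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        parts.append(detail)
    parts.append(current)
    return parts


# Made-up sensor readings, for illustration only.
readings = [4.0, 4.0, 2.0, 2.0]
scales = haar_decompose(readings, levels=2)
```

Each entry of `scales` describes the series at a different resolution; a model could consume these components separately, which is one plausible reading of "multiple scales" in the abstract.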

3D Multi-Branch Encoder-Decoder Networks with Attentional Feature Fusion for Pulmonary Nodule Detection in CT Scans

International Joint Conference on Neural Networks (IJCNN)

Authors: Zhang, Chenjiao; Wang, Lulu; Wu, Xing; He, Zhongshi. Affiliations: Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China

ISBN: (print) 9780738133669

Pulmonary nodule detection in low-dose computed tomography (CT) images is essential for early screening and treatment of lung cancer. Previous related research based on deep convolutional neural networks generally relies on 2D or 2.5D components and focuses only on the output feature information under a single receptive field. Considering the 3D nature of lung CT images and the performance limitation of state-of-the-art nodule detection methods, we develop a novel 3D multi-branch region proposal network with an encoder-decoder structure. Specifically, each parallel branch is designed with 3D residual blocks and a U-Net-like structure to effectively extract multi-scale fusion features based on 3D spatial information of CT scans, and the strategies of varying receptive fields and sharing weight parameters are used to improve the sensitivity of the detection network to nodules with scale variation and maintain the original parameters. Besides, we propose a multi-scale attentional feature fusion module to better fuse high-resolution and semantically strong features and adaptively learn the inter-dependency information of different feature maps. Finally, we compare a dynamically scaled cross entropy loss and online hard example mining (OHEM) for combating the imbalance of positive and negative samples during training, with the aim of assisting network optimization. Our extensive experiments on publicly available CT scans obtained from the LUNA16 and TianChi competition datasets demonstrate that our method outperforms state-of-the-art pulmonary nodule detection models.

Keywords: pulmonary nodule detection; 3d convolutional neural network; encoder-decoder; multi-scale attentional feature fusion
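
Illustrative note (not part of the catalog record): the abstract refers to a "dynamically scaled cross entropy loss" for handling the imbalance between positive and negative samples, without giving a formula. One well-known loss matching that description is the focal loss, which multiplies the cross-entropy term by a factor that shrinks for well-classified samples. The sketch below assumes that form; the default values of `gamma` and `alpha` are common choices and are not taken from the paper.

```python
import math


def focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample.

    prob  -- predicted probability of the positive class, in [0, 1]
    label -- 1 for a positive sample, 0 for a negative sample
    """
    # p_t is the probability the model assigns to the true class.
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0) on extreme predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    # (1 - p_t) ** gamma down-weights easy samples; gamma = 0 gives weighted cross entropy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_positive = focal_loss(0.9, 1)  # confidently correct: small loss
hard_positive = focal_loss(0.1, 1)  # confidently wrong: much larger loss
```

With `gamma=0` the scaling factor disappears and the expression reduces to class-weighted cross entropy, which makes the effect of the dynamic scaling easy to compare.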

Constrained Image Splicing Detection and Localization With Attention-Aware encoder-decoder and Atrous Convolution

IEEE ACCESS, 2020, Vol. 8, pp. 6729-6741

Authors: Liu, Yaqi; Zhao, Xianfeng. Affiliations: Chinese Acad Sci, Inst Informat Engn, State Key Lab Informat Secur, Beijing 100093, Peoples R China; Univ Chinese Acad Sci, Sch Cyber Secur, Beijing 100093, Peoples R China

Constrained image splicing detection and localization (CISDL) is a newly formulated image forensics task and plays an important role in verifying the generating process of a forged image. CISDL conducts dense matching between two investigated images and detects whether one image has forged regions pasted from the other. In this work, we introduce a novel attention-aware encoder-decoder deep matching network named AttentionDM for CISDL. An encoder-decoder with atrous convolution is newly designed for dense matching of hierarchical features and generation of fine-grained masks. A novel attention-aware correlation computation module is built on normalization operations and recalibration of informative features with channel attention blocks. Finally, VGG and ResNets are each used as feature extractors for comprehensive comparisons in CISDL. Extensive experiments demonstrate the superior performance of AttentionDM over state-of-the-art methods.

Keywords: encoder-decoder; atrous convolution; normalization; channel attention
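
Illustrative note (not part of the catalog record): atrous (dilated) convolution, named in the abstract and keywords, spaces the kernel taps `rate` samples apart so that the receptive field grows without adding weights. The one-dimensional sketch below shows only that general operation; it is not the AttentionDM architecture, and the function name `atrous_conv1d` is hypothetical.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1D atrous convolution (cross-correlation form used in deep learning)."""
    # Number of input samples covered by one output sample.
    span = (len(kernel) - 1) * rate + 1
    if len(signal) < span:
        return []
    return [
        sum(w * signal[i + k * rate] for k, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 1, 1]
dense = atrous_conv1d(signal, kernel, rate=1)   # taps at offsets 0, 1, 2
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps at offsets 0, 2, 4
```

With the same three weights, `rate=2` makes each output depend on a window of five input samples instead of three, which is the property that makes atrous convolution attractive for dense prediction.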

Pedestrian behavior prediction model with a convolutional LSTM encoder-decoder

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2020, Vol. 560, p. 125132

Authors: Chen, Kai; Song, Xiao; Han, Daolin; Sun, Jinghan; Cui, Yong; Ren, Xiaoxiang. Affiliations: Beihang Univ BUAA, Beijing, Peoples R China; Univ Illinois, Urbana, IL, USA; Nanan Primary Sch, Xian, Shanxi, Peoples R China

Pedestrian behavior modeling is a challenging problem, especially in crowded transportation scenarios. Some recent studies have addressed this problem using deep neural networks, but the accuracy of trajectory prediction is still not high because the internal state of a typical deep neural network with long short-term memory (LSTM) is a one-dimensional vector, which destroys the spatial information around a pedestrian. Therefore, these models cannot fully learn the spatial sensing behavior of pedestrians. To solve this, we propose using multi-channel tensors to represent the environmental information of pedestrians. Meanwhile, the spatiotemporal interactions among the pedestrians are represented by convolution operations on these tensors. Then, an end-to-end fully convolutional LSTM encoder-decoder is designed, trained and tested. Finally, our approach is compared with existing LSTM-based methods using five crowded video sequences from public datasets. The results show that our method reduces the displacement offset error and provides more realistic trajectory predictions in a variety of cases. (c) 2020 Published by Elsevier B.V.

Keywords: Pedestrian behavior model; Trajectory prediction; Long short-term memory; Convolution; encoder-decoder
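
Illustrative note (not part of the catalog record): the abstract reports a reduced "displacement offset error" but the record does not define it. Trajectory-prediction work commonly reports the average displacement error (ADE, mean Euclidean distance over all predicted time steps) and the final displacement error (FDE, distance at the last step). The sketch below computes those two quantities on the assumption that the paper's metric is of this kind; the function name and the sample trajectories are made up.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length trajectories of (x, y) points."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between predicted and true position at each time step.
    dists = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(dists) / len(dists), dists[-1]


# Hypothetical trajectories in metres, for illustration only.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, ground_truth)  # ade == 1.0, fde == 2.0
```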

An encoder-decoder based thermo-visible image translation for disguised and undisguised faces

IMAGE AND VISION COMPUTING, 2022, Vol. 119, p. 104376

Authors: Kumar, Sumit; Singh, Satish Kumar; Mishra, Nayaneesh Kumar; Dutta, Mainak. Affiliations: Indian Inst Informat Technol, Jhalwa 211012, Prayagraj, India; Qualcomm, Hyderabad, Telangana, India

Thermal cameras can capture images even in low-light conditions. However, humans cannot recognize human faces in thermal images. Translating thermal images to the visible domain is one solution to the problem of face recognition in thermal images. Most prior work has proposed Generative Adversarial Network (GAN)-based solutions for thermal-to-visible image translation. However, a GAN is a heavy network that consumes a large amount of resources for thermal-to-visible image translation. In this paper, we propose an encoder-decoder architecture for thermal-to-visible image translation of human faces. Since our proposed architecture is not based on GANs, it is lightweight. The proposed method works well for both disguised and non-disguised thermal facial images. Standard comparison metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Multiscale Structural Similarity Index (MS-SSIM) are used to evaluate the quality of the generated visible images with respect to the ground truth. It has been found that our proposed architecture outperforms the current state-of-the-art image translator architectures, namely pix2pix, Cycle-GAN, modified thermal-to-visible GAN and Dual GAN, by a considerable margin on both disguised and non-disguised datasets. (c) 2022 Elsevier B.V. All rights reserved.

Keywords: Image translation; Generative Adversarial Networks (GAN); Pix2Pix; Cycle-GAN; encoder-decoder
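
Illustrative note (not part of the catalog record): of the metrics named in the abstract, PSNR has the simplest closed form, 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the generated image and the ground truth and MAX is the largest possible pixel value. The sketch below assumes 8-bit images (`max_value=255`) given as flat lists of pixel values; the function name and sample pixels are hypothetical.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 2x2 grayscale patches, for illustration only.
truth = [10, 20, 30, 40]
output = [11, 21, 31, 41]
score = psnr(truth, output)
```

Higher values mean the generated image is closer to the ground truth; identical images give an infinite PSNR.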

LSTM enhanced by dual-attention-based encoder-decoder for daily peak load forecasting

ELECTRIC POWER SYSTEMS RESEARCH, 2022, Vol. 208, No. 0, p. 1

Authors: Zhu, Kedong; Li, Yaping; Mao, Wenbo; Li, Feng; Yan, Jiahao. Affiliations: China Elect Power Res Inst, Power Automat Dept, Nanrui 8, Nanjing 210003, Jiangsu, Peoples R China

Daily peak load forecasting is a challenging problem in the field of electric power load forecasting. Since the nonlinearity and dynamics of influence factors and their sequential dependencies are significant for modeling daily peak load, a prediction model based on long short-term memory (LSTM) enhanced by a dual-attention-based encoder-decoder is presented. Serving as both the encoder and the decoder, LSTM handles the nonlinear dynamic temporal modeling. The encoder-decoder is used to exploit information from both the influence factors and the daily peak load. Moreover, a dual-attention mechanism, inserted into the encoder-decoder, is designed to take into account the effects of different influence factors and time nodes on the daily peak load simultaneously. This mechanism design helps to analyze the characteristics of daily peak load precisely and to achieve more accurate prediction results. Comprehensive experiments are performed on a real data set from a provincial capital city in eastern China. The case study shows that the proposed methodology provides the most accurate results, with an average MAPE of 2.07%, an average RMSE of 133 MW and an average MAE of 326.6 MW.

Keywords: Dual-attention mechanism; Daily peak load forecasting; encoder-decoder; Long short-term memory (LSTM)
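
Illustrative note (not part of the catalog record): the abstract reports MAPE, RMSE and MAE. The sketch below shows the standard definitions of these three error metrics so the reported figures can be read in context. The load values are made up for illustration and have no connection to the paper's data set.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical daily peak loads in MW, for illustration only.
actual = [1000.0, 2000.0, 4000.0]
predicted = [1100.0, 1900.0, 4000.0]
errors = (mape(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing forecasts across cities of different size, while RMSE and MAE stay in megawatts and RMSE penalizes large individual misses more heavily.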

A fully-convolutional residual encoder-decoder neural network to localize breast cancer on histopathology images

COMPUTERS IN BIOLOGY AND MEDICINE, 2022, Vol. 147, p. 105698

Authors: Farajzadeh, Nacer; Sadeghzadeh, Nima; Hashemzadeh, Mahdi. Affiliations: Azarbaijan Shahid Madani Univ, Fac Informat Technol & Comp Engn, Tabriz, Iran; Azarbaijan Shahid Madani Univ, Artificial Intelligence & Machine Learning Res Lab, Tabriz, Iran; Azarbaijan Shahid Madani Univ, Azarshahr Rd, Tabriz ***, Iran

Detecting cancer in its early stages may allow patients to receive proper treatment, saving lives and letting them return to their routine lifestyles. Breast cancer is one of the leading causes of mortality among women around the globe. One way to find cancerous nuclei is to analyze histopathology images. These images, however, are very complex and large, so locating the cancerous nuclei in them is very challenging. If an expert fails to diagnose a patient from these images, the situation may be exacerbated. Therefore, this study aims to introduce a method that masks as many cancer nuclei on histopathology images as possible, with high visual aesthetic quality, so that experts can distinguish them easily. A tailored residual fully convolutional encoder-decoder neural network based on end-to-end learning is proposed to address the matter. The proposed method is evaluated quantitatively and qualitatively on the ER+ BCa H&E-stained dataset. The average detection accuracy achieved by the method is 98.61%, which is much better than that of competitors.

Keywords: Breast cancer nuclei; End-to-End learning; Fully convolutional neural networks; Image masking; encoder-decoder; Residual networks
