In structural health monitoring applications, measured data may be temporarily or permanently lost due to sensor faults or transmission failures. Data with a high loss ratio are of limited use for modal identification and structural condition evaluation. To reconstruct lost data in the field of structural health monitoring, this study proposes a deep convolutional generative adversarial network consisting of a generator with an encoder-decoder structure and an adversarial discriminator. The proposed model must both understand the content of the complete signals and produce realistic hypotheses for the lost signals. Given data stably measured before the occurrence of data loss, the generator is trained to extract the features contained in the data set and reconstruct the lost signals using the responses of the remaining functional sensors alone. The discriminator feeds its classification results back to the generator to improve reconstruction accuracy. During training, a reconstruction loss and an adversarial loss are combined to better handle the low-frequency and high-frequency features of the signals, respectively. The effectiveness and efficiency of the proposed method are validated in two case studies. As the number of training epochs increases, the reconstructed signals capture features from low frequency to high frequency, and their amplitude gradually increases. The final reconstructed signals match the real signals well in both the time and frequency domains. To further demonstrate the applicability of the reconstructed signals in data analysis, the reconstructed acceleration data are used to accurately identify the modal parameters in the numerical case, and vehicle-induced responses are precisely decomposed from the reconstructed strain data in the field case. Finally, the reconstruction capacity of the proposed method is also investigated.
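The training objective described above can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch version of an encoder-decoder generator over multi-channel sensor signals and the combined loss; the layer sizes, kernel widths, and the weighting `lam` are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch of the generator and its training loss: a
# reconstruction term handles low-frequency content, while an adversarial
# term from the discriminator sharpens high-frequency detail.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, channels=8):
        super().__init__()
        # Encoder-decoder over multi-channel sensor signals (batch, channels, time)
        self.encode = nn.Sequential(
            nn.Conv1d(channels, 32, 15, stride=2, padding=7), nn.ReLU(),
            nn.Conv1d(32, 64, 15, stride=2, padding=7), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose1d(64, 32, 15, stride=2, padding=7, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, channels, 15, stride=2, padding=7, output_padding=1),
        )

    def forward(self, observed):
        return self.decode(self.encode(observed))

def generator_loss(reconstructed, target, disc_logits, lam=0.999):
    rec = nn.functional.mse_loss(reconstructed, target)     # low-frequency fit
    adv = nn.functional.binary_cross_entropy_with_logits(   # fool the discriminator
        disc_logits, torch.ones_like(disc_logits))
    return lam * rec + (1.0 - lam) * adv
```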
Currently, non-decision-level image fusion algorithms require extremely high registration precision of the images to be fused. When images are taken from different perspectives, traditional feature-based registration algorithms and learning-based methods show poor robustness and are unsuitable for large image differences because registration and fusion are performed separately. In addition, the lack of relevant datasets hinders the development of different-perspective image fusion methods. To address these problems, we collect a dataset of 5000 RGB-MONO image sets captured from different perspectives across multiple scenes to provide raw data support. We present an end-to-end learned system for fusing two photographs taken from different perspectives into a chosen target view. Cascaded feature extraction based on an encoder-decoder structure enables optical flow to be learned systematically at different feature levels. The optical flow module then allows the images to be continuously registered and optimized during the fusion process, avoiding the deviations introduced by non-end-to-end algorithms. Extensive quantitative and qualitative experiments demonstrate that the proposed system effectively fuses images from different perspectives on our self-built dataset. Compared with non-end-to-end fusion, our method achieves superior performance on several fusion evaluation indicators.
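The in-network registration rests on warping one view toward the other with a learned flow field. The sketch below shows this standard warping operation in PyTorch under assumed shapes; the function name `warp_to_target` and the pixel-unit flow convention are illustrative, not taken from the paper.

```python
# Warping a source image or feature map with a dense optical flow field,
# the operation that lets registration be refined inside the fusion
# network rather than as a separate pre-processing step.
import torch
import torch.nn.functional as F

def warp_to_target(src, flow):
    """Warp `src` (B, C, H, W) toward the target view using a dense
    optical flow field `flow` (B, 2, H, W) in pixel units (x, y)."""
    b, _, h, w = src.shape
    # Base sampling grid of pixel coordinates
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(src.device)  # (2, H, W)
    coords = grid.unsqueeze(0) + flow                           # shifted coordinates
    # Normalize to [-1, 1] as required by grid_sample
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)     # (B, H, W, 2)
    return F.grid_sample(src, sample_grid, align_corners=True)
```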
Numerous people die from lung cancer every year, making it a serious public health issue. Oftentimes, the symptoms of lung cancer manifest only at a later stage, when it is difficult to treat. Pulmonary nodules are commonly found when screening the lungs with a Computed Tomography (CT) scan, and some of these nodules may be cancerous. An efficient automated pulmonary nodule segmentation system is therefore needed to isolate the nodules in the scan images. Doctors can then track the nodules that are likely to be malignant and provide early treatment if they become cancerous, improving the patient's chance of survival. The attention mechanism is a technique often used in computer vision to enhance a neural network's performance. LA-ResUNet, a pulmonary nodule segmentation model built on ResUNet with a linear attention mechanism and the Leaky ReLU activation function, is proposed. LA-ResUNet segments pulmonary nodules efficiently while its attention achieves linear time and space complexity. Residual blocks make it possible to construct a deep network without facing the vanishing gradient problem, and they also simplify training. Skip connections allow better gradient flow during training and better information flow between layers. Leaky ReLU addresses the dying ReLU scenario, in which some neurons cease to learn during training. Applied to the LIDC-IDRI (Lung Image Database Consortium and Image Database Resource Initiative) dataset, LA-ResUNet achieved a Dice similarity coefficient (DSC) of 73.11% and an Intersection over Union (IoU) score of 60.62%.
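The linear-complexity claim comes from reordering the attention computation. The sketch below shows the general linear attention trick in PyTorch: applying a kernel feature map to queries and keys separately lets the keys-values product be computed first, giving O(N) rather than O(N^2) cost in the number of positions N. The elu+1 feature map is a common choice and an assumption here, not necessarily the paper's exact formulation.

```python
# Linear attention: compute (keys^T @ values) before multiplying by the
# queries, so cost grows linearly with the number of spatial positions.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """q, k, v: (batch, heads, positions, dim)."""
    q = F.elu(q) + 1.0          # positive kernel feature map
    k = F.elu(k) + 1.0
    kv = torch.einsum("bhnd,bhne->bhde", k, v)                  # (B, H, D, D): O(N)
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)        # normalised output
```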
Fine-grained image captioning is a focal point of the vision-to-language task and has attracted considerable attention for generating accurate and contextually relevant image captions. Effective attribute prediction and utilization play a crucial role in enhancing image captioning performance. Despite progress, prior attribute-related methods either focus on predicting attributes related to the input image or concentrate on predicting linguistic context-related attributes at each time step of the language model. These approaches often overlook the importance of balancing visual and linguistic contexts, leading to ineffective exploitation of semantic information and a subsequent decline in performance. To address these issues, an Independent Attribute Predictor (IAP) is introduced to precisely predict attributes related to the input image by leveraging relationships between visual objects and attribute embeddings. Following this, an Enhanced Attribute Predictor (EAP) is proposed, which first predicts linguistic context-related attributes and then uses prior probabilities from the IAP module to rebalance image- and linguistic-context-related attributes, generating more robust and enhanced attribute probabilities. These refined attributes are then integrated into the language LSTM layer to ensure accurate word prediction at each time step. The integration of the IAP and EAP modules in our proposed image captioning with the enhanced attribute predictor (ICEAP) model effectively incorporates high-level semantic details, enhancing overall performance. ICEAP outperforms contemporary models, yielding significant average improvements in CIDEr-D scores of 10.62% on MS-COCO, 9.63% on Flickr30K, and 7.74% on Flickr8K under cross-entropy optimization, with qualitative analysis confirming its ability to generate fine-grained captions.
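One plausible reading of the rebalancing step, sketched below for concreteness: the IAP's image-level attribute probabilities act as a prior that is mixed with the EAP's step-wise linguistic attribute probabilities. The gating form and the learned per-attribute weight `alpha` are assumptions, since the abstract does not give the exact formula.

```python
# Hypothetical rebalancing of visual-prior and linguistic attribute
# probabilities with a learned per-attribute mixing weight.
import torch
import torch.nn as nn

class AttributeRebalancer(nn.Module):
    def __init__(self, num_attributes):
        super().__init__()
        # Learned mixing weight per attribute (a modelling assumption)
        self.alpha = nn.Parameter(torch.zeros(num_attributes))

    def forward(self, p_visual, p_linguistic):
        """p_visual: (B, A) IAP prior; p_linguistic: (B, A) per-step EAP output."""
        a = torch.sigmoid(self.alpha)
        return a * p_visual + (1.0 - a) * p_linguistic
```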
Accurate building segmentation plays a crucial role in a wide range of applications such as urban planning, monitoring, and mapping. Various deep learning models have been employed for building segmentation; however, they analyze images from a single view. To overcome this limitation, we propose a novel multi-view U-Net model that enhances segmentation accuracy by incorporating multiple views of the images. We employ two pre-trained convolutional neural network architectures, MobileNetV2 and ResNet50, to extract features representing two different views of the images. By fusing these features, the proposed method effectively captures complementary information, leading to enhanced segmentation accuracy. To further improve performance, we incorporate skip connections and up-convolutional layers to ensure fine-grained feature propagation. Our experimental results on a large building dataset demonstrate a significant improvement in segmentation accuracy (91%) compared with state-of-the-art methods, highlighting the effectiveness of our multi-view fusion approach and confirming the benefit of the view-creation concept proposed in this paper. This research has the potential to redefine the landscape of building segmentation in applications such as urban planning and mapping. We also tested the method on a large study area (the city scale of Belval, Luxembourg), which demonstrates its efficiency in segmenting satellite images over a large extent and reinforces its potential for real-world applications.
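The two-backbone fusion can be outlined in a few lines of PyTorch. The sketch below concatenates the final feature maps of MobileNetV2 and ResNet50 and decodes a building mask; it collapses the paper's multi-scale skip connections into a single fusion point, so the channel sizes and the decoder are simplifying assumptions.

```python
# Two pre-trained encoders, one per view, fused by channel concatenation.
import torch
import torch.nn as nn
from torchvision import models

class TwoViewFusionSeg(nn.Module):
    def __init__(self):
        super().__init__()
        self.view_a = models.mobilenet_v2(weights="DEFAULT").features   # -> 1280 ch
        resnet = models.resnet50(weights="DEFAULT")
        self.view_b = nn.Sequential(*list(resnet.children())[:-2])      # -> 2048 ch
        self.decode = nn.Sequential(
            nn.Conv2d(1280 + 2048, 256, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
            nn.Conv2d(256, 1, 1),                                       # building mask logits
        )

    def forward(self, img_a, img_b):
        fa = self.view_a(img_a)     # (B, 1280, H/32, W/32)
        fb = self.view_b(img_b)     # (B, 2048, H/32, W/32)
        return self.decode(torch.cat([fa, fb], dim=1))
```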
An attributed network is a form of data that contains rich semantic information. Many real scenarios can be modeled as attributed networks, such as social media, citation, and traffic networks. Anomaly detection in attributed networks is an interesting research topic owing to its potential in various practical applications, including spam, network intrusion, and financial fraud detection. However, attributed networks exhibit many anomaly patterns, such as structural, attribute, local, and global anomalies, making anomaly detection in attributed networks a challenging task. To address these difficulties, we designed DeepGL, a novel unsupervised deep global-local view model for anomaly detection in attributed networks. Our model is an encoder-decoder framework with multiple views that captures node attributes and network structure information from both global and local views. Specifically, it contains two encoders and four decoders: the two encoders capture network features from the local and global views, and the four decoders reconstruct the local node attribute information, local structure information, global node attribute information, and global structure information. In the encoders and decoders, we apply Laplacian sharpening and smoothing techniques to maintain the integrity of normal node features while diminishing the conspicuousness of anomalous nodes in the reconstructed information, thereby facilitating the calculation of reconstruction errors. Extensive experiments on four real-world attributed network datasets demonstrate the excellent performance of the proposed method.
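The Laplacian smoothing and sharpening operations have a compact matrix form worth spelling out. In the sketch below, smoothing (I - lam*L) averages each node with its neighbours in the encoder, while sharpening (I + lam*L) re-amplifies node-level detail in the decoder; the dense-adjacency formulation and the value of `lam` are assumptions for illustration.

```python
# Laplacian smoothing (encoder) and sharpening (decoder) propagation rules.
import torch

def normalized_laplacian(adj):
    """adj: dense (N, N) adjacency. Returns L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return torch.eye(adj.size(0)) - a_norm

def propagate(h, lap, weight, lam=1.0, sharpen=False):
    """One smoothing (encoder) or sharpening (decoder) layer over node
    features h (N, d) with a learnable weight matrix (d, d')."""
    sign = 1.0 if sharpen else -1.0
    mix = torch.eye(lap.size(0)) + sign * lam * lap
    return torch.relu(mix @ h @ weight)
```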
The electrocardiogram (ECG) is an affordable, non-invasive, and quick method for gaining essential information about the electrical activity of the heart. Interpreting ECGs is a time-consuming process even for experienced cardiologists, which motivates the current use of rule-based methods in clinical practice to describe ECGs automatically. However, compared with descriptions created by experts, ECG descriptions generated by such rule-based methods show considerable limitations. Inspired by image captioning methods, we instead propose a data-driven approach to ECG description generation. We introduce a label-guided Transformer model and show that it is possible to automatically generate relevant and readable ECG descriptions with a data-driven captioning model. We incorporate prior ECG labels into the model design and show that this improves the overall quality of the generated descriptions. We find that training these models on free-text annotations of ECGs, instead of the clinically used computer-generated ECG descriptions, greatly improves performance. Moreover, a human expert evaluation of our best system shows that our data-driven approach improves upon existing rule-based methods.
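One way to realise the label guidance, sketched under assumptions: embeddings of the prior ECG labels are concatenated with the encoded signal features so the caption decoder can attend to both. The dimensions and the concatenation strategy below are illustrative, not the paper's exact design.

```python
# A hypothetical label-guided caption decoder: prior label embeddings are
# appended to the encoder memory that the Transformer decoder attends to.
import torch
import torch.nn as nn

class LabelGuidedCaptioner(nn.Module):
    def __init__(self, num_labels, vocab_size, d_model=256):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, d_model)
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=4)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, ecg_feats, label_ids, tgt_tokens):
        """ecg_feats: (B, T, d); label_ids: (B, K); tgt_tokens: (B, S)."""
        memory = torch.cat([self.label_emb(label_ids), ecg_feats], dim=1)
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_tokens.size(1))
        h = self.decoder(self.tok_emb(tgt_tokens), memory, tgt_mask=causal)
        return self.out(h)
```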
ISBN (digital): 9781665486743
ISBN (print): 9781665486743
As the world paces towards globalization, the demand for automatic language translators is increasing rapidly. Traditional translation systems consist of multiple steps, such as speech recognition, text-to-text machine translation, and speech generation. These systems suffer from latency due to the multiple steps and from errors that propagate from earlier steps to later ones. Another challenge is that many spoken languages have no text representation, so traditional pipelines involving speech-to-text and text-to-text translation do not work for them. In this paper, we present a recurrent neural network (RNN)-based translation system that generates the waveform of the target-language audio directly. We use the sparse coding technique for the extraction and inversion of audio features. An attention-based multi-layered sequence-to-sequence model is trained with a novel technique on a dataset of Spanish-to-English audio, with no intermediate text representation used during training or inference. We compare the performance of the proposed approaches using latency, bilingual evaluation understudy (BLEU) scores, and Perceptual Evaluation of Speech Quality (PESQ) scores. The resulting system provides very fast translation with good translation accuracy and audio quality.
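The sparse-coding feature round trip can be illustrated with scikit-learn. In the sketch below, audio frames are encoded as sparse coefficients over a learned dictionary (the features the seq2seq model would operate on) and inverted by a simple dictionary product; the lasso-based transform and all sizes are assumptions, not details from the paper.

```python
# Sparse-coding extraction and inversion of audio frame features.
import numpy as np
from sklearn.decomposition import DictionaryLearning

frames = np.random.randn(200, 320)              # stand-in for windowed audio frames
dl = DictionaryLearning(n_components=64, transform_algorithm="lasso_lars",
                        transform_alpha=0.1, max_iter=20)
codes = dl.fit_transform(frames)                # sparse features for the seq2seq model
reconstructed = codes @ dl.components_          # inversion back to waveform frames
```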
ISBN (print): 9783031226946; 9783031226953
Radiologists are required to write a descriptive report for each examination they perform, which is a time-consuming process. Deep-learning researchers are developing models to automate this process. Currently, the most researched architecture for this task is the encoder-decoder (E-D). An issue with this approach is that these models are optimised to produce output that is coherent and grammatically correct rather than clinically correct. The current study considers this and instead builds upon a more recent approach that generates reports using a multi-label classification model attached to a Template-based Report Generation (TRG) subsystem. Two TRG models that utilise either a Transformer or a CNN classifier are produced and directly compared to the most clinically accurate E-D in the literature at the time of writing. The models were trained on the MIMIC-CXR dataset, a public set of 473,057 chest X-rays and 206,563 corresponding reports. Precision, recall, and F1 scores were obtained by applying a rule-based labeller to the MIMIC-CXR reports, applying those labels to the corresponding images, and then using the labeller on the generated reports. The TRG models outperformed the E-D model in clinical accuracy, with the largest difference being the recall rate (T-TRG: Precision 0.38, Recall 0.58, F1 0.45; CNN-TRG: Precision 0.34, Recall 0.69, F1 0.42; E-D: Precision 0.38, Recall 0.14, F1 0.19). Examination of the quantitative metrics for each specific abnormality, combined with the qualitative assessment, concludes that significant progress still needs to be made before clinical integration is safe.
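The TRG idea is simple enough to sketch directly: a multi-label classifier scores each abnormality, and the report is assembled from fixed template sentences rather than decoded token by token, which is why the clinical content is easier to control. The labels, templates, and threshold below are hypothetical.

```python
# Template-based report generation from multi-label classifier outputs.
import torch

TEMPLATES = {
    "cardiomegaly": "The cardiac silhouette is enlarged.",
    "pleural_effusion": "There is a pleural effusion.",
    "no_finding": "No acute cardiopulmonary abnormality.",
}
LABELS = list(TEMPLATES)

def generate_report(logits, threshold=0.5):
    """logits: (num_labels,) raw classifier outputs for one image."""
    probs = torch.sigmoid(logits)
    positive = [LABELS[i] for i, p in enumerate(probs) if p >= threshold]
    if not positive:
        positive = ["no_finding"]
    return " ".join(TEMPLATES[name] for name in positive)
```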
ISBN (print): 9781450392983
Trigger-action programming allows end users to write event-driven rules to automate smart devices and internet services. Users create a trigger-action program (TAP) by specifying triggers and actions from a set of predefined functions, along with suitable data fields for those functions. Many trigger-action programming platforms have emerged as its popularity grows, e.g., IFTTT, Microsoft Power Automate, and Samsung SmartThings. Despite their simplicity, composing TAPs can still be challenging for end users because of the domain knowledge required and the enormous search space of trigger and action combinations. We propose RecipeGen, a new deep learning-based approach that leverages the Transformer sequence-to-sequence (seq2seq) architecture to generate TAPs at fine-grained, field-level granularity from natural language descriptions. Our approach adapts pre-trained autoencoding models to warm-start the encoder of the seq2seq model and boost generation performance. We evaluated RecipeGen on real-world datasets from the IFTTT platform against the prior state-of-the-art approach for the TAP generation task. Our empirical evaluation shows that the overall improvement over the prior best results ranges from 9.5% to 26.5%, and that adopting a pre-trained autoencoding model boosts MRR@3 by a further 2.8% to 10.8%. In the field-level generation setting, RecipeGen achieves 0.591 MRR@3 and a 0.575 BLEU score.
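The warm-start idea can be sketched as follows: a pre-trained autoencoding model initialises the encoder, and a randomly initialised Transformer decoder emits the TAP as a flat sequence of trigger and action fields. The BERT checkpoint, vocabulary, and sizes below are assumptions for illustration.

```python
# Seq2seq TAP generator with a warm-started autoencoding encoder.
import torch
import torch.nn as nn
from transformers import BertModel

class RecipeSeq2Seq(nn.Module):
    def __init__(self, tap_vocab_size, d_model=768):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")  # warm start
        self.tap_emb = nn.Embedding(tap_vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=4)
        self.out = nn.Linear(d_model, tap_vocab_size)

    def forward(self, input_ids, attention_mask, tap_tokens):
        """input_ids/attention_mask: tokenised NL description; tap_tokens:
        the TAP flattened as trigger/action function and field tokens."""
        memory = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        mask = nn.Transformer.generate_square_subsequent_mask(tap_tokens.size(1))
        h = self.decoder(self.tap_emb(tap_tokens), memory, tgt_mask=mask)
        return self.out(h)   # next-field logits at each decoding step
```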