Search Results - Inner Mongolia University Library

35th Australasian Joint Conference on Artificial Intelligence (AI)

Authors: Abela, Brandon; Abu-Khalaf, Jumana; Yang, Chi-Wei Robin; Masek, Martin; Gupta, Ashu

Affiliations: Edith Cowan Univ, Joondalup, WA 6027, Australia; Fiona Stanley Hosp, Murdoch, WA 6150, Australia

ISBN: (print) 9783031226946; 9783031226953

Radiologists are required to write a descriptive report for each examination they perform, which is a time-consuming process. Deep-learning researchers are developing models to automate this process. Currently, the most researched architecture for this task is the encoder-decoder (E-D). An issue with this approach is that these models are optimised to produce output that is coherent and grammatically correct rather than clinically correct. The current study considers this and instead builds upon a more recent approach that generates reports using a multi-label classification model attached to a Template-based Report Generation (TRG) subsystem. In the current study, two TRG models that utilise either a Transformer or a CNN classifier are produced and directly compared to the most clinically accurate E-D in the literature at the time of writing. The models were trained using the MIMIC-CXR dataset, a public set of 473,057 chest X-rays and 206,563 corresponding reports. Precision, recall and F1 scores were obtained by applying a rule-based labeller to the MIMIC-CXR reports, applying those labels to the corresponding images, and then using the labeller on the generated reports. The TRG models outperformed the E-D model for clinical accuracy, with the largest difference being the recall rate (T-TRG: Precision 0.38, Recall 0.58, F1 0.45; CNN-TRG: Precision 0.34, Recall 0.69, F1 0.42; E-D: Precision 0.38, Recall 0.14, F1 0.19). Examination of the quantitative metrics for each specific abnormality, combined with the qualitative assessment, concludes that significant progress still needs to be made before clinical integration is safe.
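The evaluation described above compares labels extracted from reference reports with labels extracted from generated reports. A minimal sketch of that comparison follows, assuming the labeller yields one binary vector per report and that scores are macro-averaged over labels. The abstract does not state the averaging scheme, so macro-averaging is an assumption, and the function name `macro_prf` and the toy label vectors are hypothetical.

```python
def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall and F1 for multi-label data.

    y_true, y_pred: lists of equal-length 0/1 label vectors, one per report.
    (Hypothetical helper; not taken from the paper.)
    """
    n_labels = len(y_true[0])
    precisions, recalls, f1s = [], [], []
    for k in range(n_labels):
        pairs = [(t[k], p[k]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    # Average each metric over labels (macro-averaging).
    return (sum(precisions) / n_labels,
            sum(recalls) / n_labels,
            sum(f1s) / n_labels)


# Toy data: 4 reports, 2 abnormality labels (made up for illustration).
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]  # labels from reference reports
y_pred = [[1, 1], [0, 1], [0, 1], [1, 0]]  # labels from generated reports

p, r, f = macro_prf(y_true, y_pred)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Under macro-averaging, the averaged F1 is generally not the harmonic mean of the averaged precision and recall, which is one possible reason a reported F1 can differ slightly from the value computed from the reported precision and recall.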

Keywords: Medical text; Medical imaging; Deep learning; Templates; encoder-decoder; CNN; Transformer

Accurate Generation of Trigger-Action Programs with Domain-Adapted Sequence-to-Sequence Learning

30th IEEE/ACM International Conference on Program Comprehension (ICPC)

Authors: Yusuf, Imam Nur Bani; Jiang, Lingxiao; Lo, David

Affiliations: Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore

ISBN: (print) 9781450392983

Trigger-action programming allows end users to write event-driven rules to automate smart devices and internet services. Users can create a trigger-action program (TAP) by specifying triggers and actions from a set of predefined functions along with suitable data fields for the functions. Many trigger-action programming platforms have emerged as its popularity grows, e.g., IFTTT, Microsoft Power Automate, and Samsung SmartThings. Despite their simplicity, composing TAPs can still be challenging for end users due to the domain knowledge needed and the enormous search space of possible combinations of triggers and actions. We propose RecipeGen, a new deep learning-based approach that leverages a Transformer sequence-to-sequence (seq2seq) architecture to generate TAPs at field-level granularity from natural language descriptions. Our approach adapts autoencoding pre-trained models to warm-start the encoder in the seq2seq model to boost the generation performance. We have evaluated RecipeGen on real-world datasets from the IFTTT platform against the prior state-of-the-art approach on the TAP generation task. Our empirical evaluation shows that the overall improvement against the prior best results ranges from 9.5% to 26.5%. Our results also show that adopting a pre-trained autoencoding model boosts the MRR@3 further by 2.8% to 10.8%. Further, in the field-level generation setting, RecipeGen achieves 0.591 and 0.575 in terms of MRR@3 and BLEU scores respectively.
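For readers unfamiliar with the MRR@3 metric quoted above, the following is a minimal sketch of mean reciprocal rank truncated at k, under the usual definition: each query contributes 1/rank of the first correct candidate within the top k, or 0 if none is correct. The function name `mrr_at_k` and the toy candidate strings are hypothetical and are not taken from the paper's evaluation code.

```python
def mrr_at_k(ranked_candidates, gold, k=3):
    """Mean reciprocal rank over queries, counting only the top-k candidates.

    ranked_candidates: list of candidate lists, best candidate first.
    gold: list of correct answers, one per query.
    """
    total = 0.0
    for candidates, answer in zip(ranked_candidates, gold):
        for rank, cand in enumerate(candidates[:k], start=1):
            if cand == answer:
                total += 1.0 / rank  # reciprocal rank of the first hit
                break
    return total / len(gold)


# Toy example with made-up trigger-action strings.
ranked = [
    ["weather.rain -> sms.send", "weather.snow -> sms.send", "time.noon -> sms.send"],
    ["photo.new -> drive.upload", "photo.new -> mail.send", "photo.new -> dropbox.save"],
    ["door.open -> light.on", "door.open -> alarm.on", "door.close -> light.off"],
]
gold = [
    "weather.rain -> sms.send",   # found at rank 1
    "photo.new -> dropbox.save",  # found at rank 3
    "motion.seen -> light.on",    # not in the top 3
]

print(round(mrr_at_k(ranked, gold, k=3), 4))  # (1 + 1/3 + 0) / 3
```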

Keywords: Trigger-Action Programming; IFTTT; Program Generation; Deep Learning; encoder-decoder

AN EFFICIENT END-TO-END IMAGE COMPRESSION TRANSFORMER

IEEE International Conference on Image Processing (ICIP)

Authors: Jeny, Afsana Ahsan; Junayed, Masum Shah; Islam, Md Baharul

Affiliations: Bahcesehir Univ, Dept Comp Engn, Istanbul, Turkey; Amer Univ Malta, Coll Data Sci & Engn, Bormla, Malta

ISBN: (digital) 9781665496209

ISBN: (print) 9781665496209

Image and video compression have received significant research attention and their applications have expanded. Existing entropy estimation-based methods combine a hyperprior with local context, which limits their efficacy. This paper introduces an efficient end-to-end transformer-based image compression model, which generates a global receptive field to tackle long-range correlation issues. A hyper encoder-decoder-based transformer block employs a multi-head spatial reduction self-attention (MHSRSA) layer to minimize the computational cost of the self-attention layer and enable rapid learning of multi-scale and high-resolution features. A Casual Global Anticipation Module (CGAM) is designed to construct highly informative adjacent contexts utilizing channel-wise linkages and identify global reference points in the latent space for end-to-end rate-distortion optimization (RDO). Experimental results demonstrate the effectiveness and competitive performance of the model on the KODAK dataset.

Keywords: Image compression; transformer; encoder-decoder; entropy model

DEEP SPATIO-TEMPORAL WIND POWER FORECASTING

47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Li, Jiangyuan; Armandpour, Mohammadreza

Affiliations: Texas A&M Univ, Dept Stat, College Stn, TX 77843, USA

ISBN: (print) 9781665405409

Wind power forecasting has drawn increasing attention among researchers as the consumption of renewable energy grows. In this paper, we develop a deep learning approach based on an encoder-decoder structure. Our model forecasts wind power generated by a wind turbine using its spatial location relative to other turbines and historical wind speed data. In this way, we effectively integrate spatial dependency and temporal trends to make turbine-specific predictions. The advantages of our method over existing work can be summarized as follows: 1) it directly predicts wind power from historical wind speed, without first predicting wind speed and then applying a transformation; 2) it can effectively capture long-term dependency; 3) it is more scalable and efficient than other deep learning-based methods. We demonstrate the efficacy of our model on benchmark real-world datasets.

Keywords: Spatio-temporal model; encoder-decoder; wind power forecasting; temporal relation

Label-aware Attention Network with Multi-scale Boosting for Medical Image Segmentation

EXPERT SYSTEMS WITH APPLICATIONS, 2024, Vol. 255, Part D

Authors: Wang, Linbo; Xu, Peng; Cao, Xianfeng; Nappi, Michele; Wan, Shaohua

Affiliations: Anhui Univ, Sch Comp Sci & Technol, Key Lab Intelligent Comp & Signal Proc MOE, Hefei 230601, Peoples R China; Univ Salerno, Dept Comp Sci, Fisciano, SA, Italy; Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen, Peoples R China

Deep medical image segmentation calls for features with strong discrimination and rich scales due to ambiguous background distraction and large variations in object sizes and shapes. In this paper, we propose two modules to obtain these features. First, existing encoders tend to extract similar foreground/background features at blurry boundaries due to mixed-label feature aggregation. To enhance the discrimination of these features, a Label-Aware Attention (LAA) module is presented to reconstruct them by fusing same-label local features. The fusion is guided by local attention maps based on label-aware affinity learning. Second, instead of relying on a single encoder for scale context mining, we propose a Multi-scale Feature Boosting (MFB) module that applies parallel convolution with different receptive fields for scale embedding and integrates an additional backbone for cross-encoder scale reference. Combining LAA and MFB, a new encoder-decoder based framework is presented, where MFBs act as encoder blocks to recursively extract features with rich scale context, while LAA operates in the decoder layer to enhance the label-aware discriminativeness of features. Extensive experiments on three standard medical segmentation datasets demonstrate the effectiveness of the proposed framework.

Keywords: Medical image segmentation; Attention map; encoder-decoder

A Fractal Curve-Inspired Framework for Enhanced Semantic Segmentation of Remote Sensing Images

SENSORS, 2024, Vol. 24, No. 22, p. 7159

Authors: Wang, Xinhua; Yuan, Botao; Li, Zhuang; Wang, Heqi

Affiliations: Northeast Elect Power Univ, Sch Comp Sci, Jilin 132012, Peoples R China; Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Changchun 130033, Peoples R China

The classification and recognition of features play a vital role in production and daily life; however, the current semantic segmentation of remote sensing images is hampered by background interference and other factors, leading to issues such as fuzzy boundary segmentation. To address these challenges, we propose a novel module for encoding and reconstructing multi-dimensional feature layers. Our approach first utilizes a bilinear interpolation method to downsample the multi-dimensional feature layer in the coding stage of the U-shaped framework. Subsequently, we incorporate a fractal curve module into the encoder, which aggregates points on feature maps from different layers, effectively grouping points from diverse regions. Finally, we introduce an aggregation layer that combines the upsampling method from the UNet series, employing the multi-scale censoring of multi-dimensional feature map outputs from various layers to efficiently capture both spatial and feature information. The experimental results across diverse scenarios demonstrate that our model achieves excellent performance in aggregating point information from feature maps, significantly enhancing semantic segmentation tasks.
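The abstract mentions downsampling a feature layer with bilinear interpolation. A minimal sketch of that operation on a single 2-D feature map is given below, written in plain Python for clarity. The half-pixel sampling convention, the function name `bilinear_resize` and the toy 4x4 map are assumptions for illustration; the paper's actual implementation is not described in this record.

```python
def bilinear_resize(fmap, out_h, out_w):
    """Resize a 2-D feature map (list of rows) with bilinear interpolation.

    Assumes the half-pixel convention: output pixel centres are mapped back
    to input coordinates and clamped to the valid range.
    """
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for i in range(out_h):
        y = (i + 0.5) * in_h / out_h - 0.5
        y = min(max(y, 0.0), in_h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0  # vertical weight of the lower neighbour
        row = []
        for j in range(out_w):
            x = (j + 0.5) * in_w / out_w - 0.5
            x = min(max(x, 0.0), in_w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0  # horizontal weight of the right neighbour
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Toy 4x4 feature map, downsampled by a factor of 2 in each dimension.
fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
print(bilinear_resize(fmap, 2, 2))  # [[2.5, 4.5], [10.5, 12.5]]
```

With an exact 2x reduction under this convention, each output value is the mean of a 2x2 input block, which is why the result above matches average pooling.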

Keywords: remote sensing images; bilinear interpolation; fractal curve; gather layers; encoder-decoder; semantic segmentation

Concrete spalling detection system based on semantic segmentation using deep architectures

COMPUTERS & STRUCTURES, 2024, Vol. 300

Authors: Yasmin, Tamanna; La, Duc; La, Kien; Nguyen, Minh Tuan; La, Hung Manh

Affiliations: Univ Nevada, Dept Comp Sci & Engn, Adv Robot & Automat Lab, 1664 North Virginia St, MS0171, Reno, NV 89557, USA; Univ Utah, Kahlert Sch Comp, 50 Cent Campus Dr, Room 3190, Salt Lake City, UT 84112, USA; Thai Nguyen Univ, Thai Nguyen Univ Technol, Dept Elect Engn, Adv Wireless Commun Network Lab, 6663-2 Natl Rd, Thai Nguyen 240000, Vietnam

This paper presents a method for detecting the location of spalling and assessing the severity level of the spalling in concrete surfaces. The proposed method is based on deep learning architectures and multi-class semantic segmentation. It classifies each pixel as non-spalling, deep-spalling, or shallow-spalling. The proposed method consists of three different deep learning architectures with several encoders as backbone networks. Both qualitative and quantitative analyses show that the deep learning architecture with a certain encoder network can detect spalling with different severity levels very well. Additionally, the paper proposes a method to analyze the deep spalling areas of concrete to show their severity levels. The performance analysis shows that this approach provides very convincing results with respect to the actual affected spalling areas. The results convey that this paper achieved a higher level of performance for detecting spalling and assessing the severity of the spalling.

Keywords: Spalling detection; Spalling severity; Deep architecture; encoder-decoder; Deep neural networks

SAA: SCALE-AWARE ATTENTION BLOCK FOR MULTI-LESION SEGMENTATION OF FUNDUS IMAGES

19th IEEE International Symposium on Biomedical Imaging (IEEE ISBI)

Authors: Bo, Wang; Li, Tao; Liu, Xinhui; Wang, Kai

Affiliations: Nankai Univ, Coll Comp Sci, Tianjin, Peoples R China

ISBN: (print) 9781665429238

Multiple lesion segmentation, namely the segmentation of microaneurysms, soft exudate, hard exudate, and haemorrhage, is very important to diabetic retinopathy diagnosis. However, the scales of different kinds of lesions are inconsistent. This inconsistent scale problem is unavoidable in a unified architecture design in which the same number of downsampling operations is used for different kinds of lesions. To achieve better performance at different scales, multiscale features need to be captured and adjusted. In this paper, we simply consider features from different stages of an encoder-decoder network as multiscale features. To re-weight the importance of multiscale features dynamically, a scale-aware attention (SAA) block, which consists of a spatial path and a channel path, is introduced. In the SAA block, adjusting operations are performed scale-wise instead of channel-wise or uniformly for all scales. Extensive experiments were conducted on two publicly available datasets to verify the effect of SAA. SAA surpasses popular attention blocks and state-of-the-art results in the overall evaluation, while comparable performance can be achieved in the individual evaluation at the same time.

Keywords: Lesion segmentation; inconsistent scale; encoder-decoder; attention block

A COMBINED FEATURE ENCODING NETWORK WITH SEMANTIC ENHANCEMENT FOR IMAGE TAMPERING FORENSICS

18th IFIP WG 11.9 International Conference on Digital Forensics

Authors: Luo, Yuling; Liang, Ce; Zhang, Shunsheng; Qin, Sheng

Affiliations: Guangxi Normal Univ, Elect Engn, Guilin, Peoples R China; Guangxi Normal Univ, Sch Elect Engn, Guilin, Peoples R China

ISBN: (print) 9783031100789; 9783031100772

Image tampering forensics is performed by analyzing images to locate the tampered regions. However, most image tampering detection methods lack locational accuracy and are effective only for specific types of tampering. To address these problems, this chapter proposes a method that employs an encoder-decoder network structure with combined multiple feature encoding to segment tampered regions of an image from untampered regions. Three features, obtained using constrained convolution, steganalysis rich model filtering and common convolution, are combined. During the encoding stage, ring residual units are used to extract features. The combination of multiple features and the ring residual units makes the proposed method most suitable for image tampering detection. Channel attention with a soft threshold function is used to reinforce semantic information in the decoding stage. Experiments with three image forensic datasets, NIST16, COVERAGE and CASIA, demonstrate that the proposed method exhibits strong performance in terms of the F1 score and localization of tampered regions.
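The abstract refers to a soft threshold function used with channel attention. As a point of reference, the standard soft-thresholding operator shrinks values toward zero by a threshold tau and zeroes anything whose magnitude is at most tau. The sketch below shows only that operator. How the paper chooses tau or combines it with channel attention is not stated in this record, so the per-channel values and the threshold here are made up for illustration.

```python
def soft_threshold(x, tau):
    """Standard soft-thresholding: sign(x) * max(|x| - tau, 0)."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Hypothetical channel responses and a hypothetical threshold.
channel_values = [1.5, -2.0, 0.3, -0.1]
tau = 0.5
print([soft_threshold(v, tau) for v in channel_values])  # [1.0, -1.5, 0.0, 0.0]
```

Small-magnitude responses are suppressed entirely while larger ones are kept with reduced magnitude, which is the usual motivation for using this operator as a learned denoising step.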

Keywords: Image tampering; encoder-decoder; combined features; semantics

Study of facial generation methods after orthodontic treatment

46th Annual IEEE-Computer-Society International Computers, Software, and Applications Conference (COMPSAC) - Computers, Software, and Applications in an Uncertain World

Authors: Tian, Jia-Liang; Zhang, Qin-Yan; Li, Hai-Zhen; Wang, Qing; Lei, Yi; Zang, Lin; Gao, Xue-Mei; Yang, Ji-Jiang

Affiliations: Beijing Univ Posts & Telecommun, Coll Artificial Intelligence, Beijing, Peoples R China; Peking Univ, Dept Orthodont, Sch & Hosp Stomatol, Beijing, Peoples R China; Tsinghua Univ, Dept Automat, Beijing, Peoples R China; Beijing Univ Technol, Fac Informat Technol, Sch Software Engn, Beijing, Peoples R China; Pharmacovigilance Res Ctr Informat Technol & Data, Xiamen, Fujian, Peoples R China

ISBN: (print) 9781665488105

As the medical aesthetic market is growing rapidly in China, orthodontic treatment is becoming very common among the adolescent population. However, there are countless doctor-patient disputes due to treatment results that do not meet patients' expectations, so there is an urgent need for a method to predict treatment results. With the development of artificial intelligence technology, generative adversarial networks have provided us with a new way of thinking. The purpose of this paper is to accurately predict the face of patients after orthodontic treatment by using a generative adversarial network. To this end, we designed an evaluation index to reflect the difference between the algorithm-predicted image and the patient's real image. After that, we designed a network based on an encoder-decoder architecture to transform the vectors in the StyleGAN latent space. Finally, we carried out experiments to verify the effectiveness of the evaluation index design and the advantages of the algorithm.

Keywords: Face Generation; Orthodontic Treatment; StyleGAN; encoder-decoder
