ISBN (print): 9798400704901
Standard modern machine-learning-based imaging methods have faced challenges in medical applications due to the high cost of dataset construction and, consequently, the limited labeled training data available. Additionally, upon deployment, these methods are usually used to process a large volume of data on a daily basis, imposing a high maintenance cost on medical facilities. In this paper, we introduce a new neural network architecture, termed LoGoNet, with a tailored self-supervised learning (SSL) method to mitigate such challenges. LoGoNet integrates a novel feature extractor within a U-shaped architecture, leveraging Large Kernel Attention (LKA) and a dual encoding strategy to adeptly capture both long-range and short-range feature dependencies. This is in contrast to existing methods that rely on increasing network capacity to enhance feature extraction. This combination of novel techniques in our model is especially beneficial in medical image segmentation, given the difficulty of learning intricate and often irregular body organ shapes, such as the spleen. As a complement, we propose a novel SSL method tailored for 3D images to compensate for the lack of large labeled datasets. Our method combines masking and contrastive learning techniques within a multi-task learning framework and is compatible with both Vision Transformer (ViT) and CNN-based models. We demonstrate the efficacy of our methods in numerous tasks across two standard datasets (i.e., BTCV and MSD). Benchmark comparisons with eight state-of-the-art models highlight LoGoNet's superior performance in both inference time and accuracy. Code available at: https://***/aminK8/Masked-LoGoNet.
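The abstract's key building block is Large Kernel Attention combined with a dual local/global encoding path. Below is a minimal, hedged sketch of a 3D LKA block in PyTorch: the decomposition into a depthwise, a dilated depthwise, and a pointwise convolution follows the general LKA idea, while the kernel sizes and channel count are illustrative assumptions rather than LoGoNet's actual configuration.

```python
# Sketch of a 3D Large Kernel Attention (LKA) block; kernel sizes and channel
# count are assumptions for illustration, not the LoGoNet configuration.
import torch
import torch.nn as nn

class LKA3D(nn.Module):
    """Large-kernel attention decomposed into three cheap convolutions."""
    def __init__(self, channels: int):
        super().__init__()
        # 5x5x5 depthwise conv captures short-range context.
        self.dw = nn.Conv3d(channels, channels, 5, padding=2, groups=channels)
        # 7x7x7 dilated depthwise conv approximates a large receptive field.
        self.dw_dilated = nn.Conv3d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        # 1x1x1 conv mixes channels.
        self.pw = nn.Conv3d(channels, channels, 1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn  # the attention map gates the input features

if __name__ == "__main__":
    block = LKA3D(channels=32)
    vol = torch.randn(1, 32, 16, 64, 64)   # (batch, channels, D, H, W)
    print(block(vol).shape)                 # torch.Size([1, 32, 16, 64, 64])
```

The elementwise gating keeps the cost of convolutions while emulating an attention map over a large receptive field, which is the general motivation for LKA-style blocks.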
ISBN (print): 9781728198354
Video frame interpolation is an increasingly important research task with several key industrial applications in the video coding, broadcast and production sectors. Recently, transformers have been introduced to the field, resulting in substantial performance gains. However, this comes at the cost of greatly increased memory usage, training time and inference time. In this paper, a novel method integrating a transformer encoder and convolutional features is proposed. This network reduces the memory burden by close to 50% and runs up to four times faster at inference compared to existing transformer-based interpolation methods. A dual-encoder architecture is introduced which combines the strengths of convolutions in modelling local correlations with those of the transformer in modelling long-range dependencies. Quantitative evaluations are conducted on various benchmarks with complex motion to showcase the robustness of the proposed method, achieving competitive performance compared to state-of-the-art interpolation networks.
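The dual-encoder idea described above combines a convolutional branch for local correlations with a transformer encoder for long-range dependencies. The sketch below illustrates that combination in PyTorch; the layer sizes, token layout and fusion by concatenation are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch of a CNN + transformer dual encoder over a stacked frame pair.
# All dimensions and the concatenation-based fusion are assumptions.
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    def __init__(self, in_ch=6, dim=64):
        super().__init__()
        # CNN branch: strided convs extract local features from two stacked frames.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer branch operates on the downsampled feature tokens.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, frame_pair):
        feats = self.cnn(frame_pair)                  # (B, dim, H/4, W/4)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)     # (B, H*W/16, dim)
        global_feats = self.transformer(tokens)       # long-range mixing
        global_feats = global_feats.transpose(1, 2).view(b, c, h, w)
        return self.fuse(torch.cat([feats, global_feats], dim=1))

if __name__ == "__main__":
    enc = DualEncoder()
    two_frames = torch.randn(1, 6, 64, 64)  # two RGB frames stacked on channels
    print(enc(two_frames).shape)            # torch.Size([1, 64, 16, 16])
```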
ISBN (print): 9781450387323
Dense retrieval is becoming one of the standard approaches for document and passage ranking. The dual-encoder architecture is widely adopted for scoring question-passage pairs due to its efficiency and high performance. Typically, dense retrieval models are evaluated on clean and curated datasets. However, when deployed in real-life applications, these models encounter noisy user-generated text, and the performance of state-of-the-art dense retrievers can substantially deteriorate when exposed to such text. In this work, we study the robustness of dense retrievers against typos in the user question. We observe a significant drop in the performance of the dual-encoder model when encountering typos and explore ways to improve its robustness by combining data augmentation with contrastive learning. Our experiments on two large-scale passage ranking and open-domain question answering datasets show that our proposed approach outperforms competing approaches. Additionally, we perform a thorough robustness analysis. Finally, we provide insights on how different typos affect the robustness of embeddings differently and how our method alleviates the effect of some typos but not of others.
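As a rough illustration of the recipe described here (not the authors' implementation), the sketch below injects character-level typos into queries as data augmentation and applies an InfoNCE-style contrastive loss that pulls each typoed query embedding toward its clean counterpart. The augmentation rules and the temperature value are assumptions.

```python
# Typo augmentation plus a contrastive alignment loss for a dual encoder.
# The specific typo operations and temperature are illustrative assumptions.
import random
import torch
import torch.nn.functional as F

def add_typo(query: str, p: float = 0.1) -> str:
    """Randomly delete or transpose characters to simulate user typos."""
    chars = list(query)
    i = 0
    while i < len(chars) - 1:
        if random.random() < p:
            if random.random() < 0.5:
                del chars[i]                                     # deletion typo
            else:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]  # transposition
        i += 1
    return "".join(chars)

def typo_contrastive_loss(clean_emb, typo_emb, temperature=0.05):
    """InfoNCE-style loss: each typoed query should match its own clean query
    against all other clean queries in the batch."""
    clean = F.normalize(clean_emb, dim=-1)
    typo = F.normalize(typo_emb, dim=-1)
    logits = typo @ clean.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    print(add_typo("what is dense retrieval"))
    loss = typo_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
    print(loss.item())
```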
ISBN (digital): 9783031109836
ISBN (print): 9783031109836; 9783031109829
Retrieval question answering (ReQA) is an essential mechanism for automatically satisfying users' information needs and overcoming the problem of information overload. As a promising solution for fast retrieval from large-scale candidate answers, the dual-encoder framework has been widely studied in recent years to improve the quality of its text representations. Inspired by the fact that humans usually answer questions using their background knowledge, in this work we explore how to incorporate knowledge entities into the retrieval model to build high-quality text representations, and we propose novel knowledge-aware text encoding and knowledge-aware text matching modules to facilitate the fusion between text and knowledge. Promising experimental results on various benchmarks demonstrate the potential of the proposed approach.
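A hedged sketch of what knowledge-aware encoding could look like in a dual encoder: entity embeddings linked to the text are fused with the pooled token representation before question and answer vectors are matched by dot product. The gated-sum fusion, mean pooling and dimensions are illustrative assumptions, not the paper's actual modules.

```python
# Sketch of fusing linked-entity embeddings into a dual-encoder text vector.
# The gated fusion and all sizes are assumptions for illustration only.
import torch
import torch.nn as nn

class KnowledgeAwareEncoder(nn.Module):
    def __init__(self, vocab=30000, n_entities=5000, dim=128):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.ent = nn.Embedding(n_entities, dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, token_ids, entity_ids):
        t = self.tok(token_ids).mean(dim=1)   # pooled text representation
        e = self.ent(entity_ids).mean(dim=1)  # pooled linked-entity representation
        g = torch.sigmoid(self.gate(torch.cat([t, e], dim=-1)))
        return g * t + (1 - g) * e            # gated fusion of text and knowledge

def score(q_vec, a_vec):
    # Dual-encoder matching: dot product between question and answer vectors.
    return (q_vec * a_vec).sum(dim=-1)

if __name__ == "__main__":
    enc = KnowledgeAwareEncoder()
    q = enc(torch.randint(0, 30000, (2, 12)), torch.randint(0, 5000, (2, 3)))
    a = enc(torch.randint(0, 30000, (2, 40)), torch.randint(0, 5000, (2, 5)))
    print(score(q, a))
```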
ISBN (digital): 9781665457279
ISBN (print): 9781665457279
Traditionally, the training phase of abstractive text summarization involves inputting two sets of integer sequences: the first representing the source text and the second representing the words in the reference summary, fed into the encoder and decoder parts of the model, respectively. However, with this method the model tends to perform poorly if the source text includes words that are irrelevant or insignificant to the key ideas. To address this issue, we propose a new keywords-based method for abstractive summarization that combines the information provided by the source text and its keywords to generate the summary. We utilize a bi-directional long short-term memory model for keyword labelling, using the overlapping words between the source text and the reference summary as ground truth. The results of our experiments on the ThaiSum dataset show that our proposed method outperforms the traditional encoder-decoder model by 0.0425 on ROUGE-1 F1, 0.0301 on ROUGE-2 F1 and 0.0140 on BERTScore F1.
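The keyword-labelling step lends itself to a short sketch: tokens that occur in both the source text and the reference summary become positive labels, and a BiLSTM tagger is trained to predict them. Whitespace tokenisation and the layer sizes below are simplifying assumptions, not the paper's exact setup.

```python
# Keyword labels from source/summary overlap, plus a BiLSTM token tagger.
# Tokenisation and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

def keyword_labels(source_tokens, summary_tokens):
    """1 if a source token also occurs in the reference summary, else 0."""
    summary_set = set(summary_tokens)
    return [1 if tok in summary_set else 0 for tok in source_tokens]

class BiLSTMKeywordTagger(nn.Module):
    def __init__(self, vocab=20000, emb=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)   # per-token keyword logit

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h).squeeze(-1)

if __name__ == "__main__":
    print(keyword_labels("the cat sat on the mat".split(),
                         "a cat on a mat".split()))   # [0, 1, 0, 1, 0, 1]
    tagger = BiLSTMKeywordTagger()
    logits = tagger(torch.randint(0, 20000, (2, 6)))
    print(logits.shape)                               # torch.Size([2, 6])
```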
Automatic surface water body mapping using remote sensing technology is of great value for studying inland water dynamics at regional to global scales. Convolutional neural networks (CNN) have become an efficient semantic segmentation technique for the interpretation of remote sensing images. However, the receptive field of a CNN is restricted by the convolutional kernel size because the network focuses only on local features. The Swin Transformer has recently demonstrated outstanding performance in computer vision tasks, and it could be useful for processing multispectral remote sensing images. In this article, a Water Index and Swin Transformer Ensemble (WISTE) method for automatic water body extraction is proposed. First, a dual-branch encoder architecture is designed for the Swin Transformer, aggregating the global semantic information captured by multihead self-attention and the pixel neighborhood relationships captured by fully convolutional networks (FCN). Second, to prevent the Swin Transformer from ignoring multispectral information, we construct a prediction map ensemble module in which the predictions of the Swin Transformer and the Normalized Difference Water Index (NDWI) are combined by a Bayesian averaging strategy. Finally, experimental results obtained on two distinct datasets demonstrate that WISTE has advantages over other segmentation methods and achieves the best results. The method proposed in this study can be used to improve regional to continental surface water mapping and related hydrological studies.
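The ensemble step has a simple core: the Normalized Difference Water Index, NDWI = (Green - NIR) / (Green + NIR), is turned into a water-probability map and combined with the network's prediction. The sketch below uses a sigmoid mapping and equal weights as stand-ins for the paper's Bayesian averaging scheme; both choices are assumptions.

```python
# NDWI computation and a simple prediction-map ensemble; the sigmoid squashing
# and equal weights are assumptions, not the paper's Bayesian averaging.
import numpy as np

def ndwi(green: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Standard NDWI from green and near-infrared bands, in [-1, 1]."""
    return (green - nir) / (green + nir + 1e-6)

def ensemble_water_map(network_prob: np.ndarray,
                       green: np.ndarray,
                       nir: np.ndarray,
                       w_net: float = 0.5) -> np.ndarray:
    """Average the segmentation probability with an NDWI-based probability."""
    ndwi_prob = 1.0 / (1.0 + np.exp(-10.0 * ndwi(green, nir)))  # squash to (0, 1)
    return w_net * network_prob + (1.0 - w_net) * ndwi_prob

if __name__ == "__main__":
    h, w = 4, 4
    net = np.random.rand(h, w)              # e.g. Swin/FCN prediction map
    green = np.random.rand(h, w) + 0.1
    nir = np.random.rand(h, w) + 0.1
    water_mask = ensemble_water_map(net, green, nir) > 0.5
    print(water_mask.astype(int))
```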