Search Results - Inner Mongolia University Library

Viterbi Decoding of Directed Acyclic Transformer for Non-Autoregressive Machine Translation

arXiv, 2022

Authors: Shao, Chenze; Ma, Zhengrui; Feng, Yang. Affiliations: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China

Non-autoregressive models achieve significant decoding speedup in neural machine translation but lack the ability to capture sequential dependency. Directed Acyclic Transformer (DA-Transformer) was recently proposed to model sequential dependency with a directed acyclic graph. Consequently, it has to apply a sequential decision process at inference time, which harms the global translation accuracy. In this paper, we present a Viterbi decoding framework for DA-Transformer, which is guaranteed to find the joint optimal solution for the translation and decoding path under any length constraint. Experimental results demonstrate that our approach consistently improves the performance of DA-Transformer while maintaining a similar decoding speedup. Copyright © 2022, The Authors. All rights reserved.

Keywords: Decoding
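The length-constrained Viterbi search mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a graph whose vertices are topologically ordered, where each vertex carries the log-probability of its best token (`token_logp`, a hypothetical name) and each edge carries a transition log-probability (`trans_logp`, also hypothetical). The function `viterbi_dag` returns the highest-scoring path that starts at the first vertex, ends at the last one, and visits exactly `length` vertices.

```python
import math

NEG = -math.inf  # log-probability of an impossible event


def viterbi_dag(token_logp, trans_logp, length):
    """Best path of exactly `length` vertices from vertex 0 to the last vertex.

    token_logp[j]    : log-probability of the best token at vertex j (assumed input)
    trans_logp[i][j] : log-probability of the edge i -> j, NEG unless i < j
    Returns (score, path); path is empty if no path of that length exists.
    """
    n = len(token_logp)
    if length < 1 or n == 0:
        return NEG, []
    # best[m][j]: best score of a path that visits m + 1 vertices and ends at j
    best = [[NEG] * n for _ in range(length)]
    back = [[-1] * n for _ in range(length)]
    best[0][0] = token_logp[0]
    for m in range(1, length):
        for j in range(1, n):
            for i in range(j):  # edges only go forward in a DAG
                if best[m - 1][i] == NEG or trans_logp[i][j] == NEG:
                    continue
                score = best[m - 1][i] + trans_logp[i][j] + token_logp[j]
                if score > best[m][j]:
                    best[m][j] = score
                    back[m][j] = i
    if best[length - 1][n - 1] == NEG:
        return NEG, []
    # Follow the back-pointers from the last vertex to recover the path.
    path = [n - 1]
    for m in range(length - 1, 0, -1):
        path.append(back[m][path[-1]])
    path.reverse()
    return best[length - 1][n - 1], path


# Toy graph with four vertices; all numbers are made up for illustration.
token_logp = [0.0, -0.5, -2.0, 0.0]
trans_logp = [
    [NEG, -0.1, -1.0, -3.0],
    [NEG, NEG, -0.2, -0.3],
    [NEG, NEG, NEG, -0.1],
    [NEG, NEG, NEG, NEG],
]
score, path = viterbi_dag(token_logp, trans_logp, 3)
print(path)  # [0, 1, 3]
```

Running the function once per candidate length and comparing the results is one simple way to choose among lengths; how the paper scores different lengths against each other is not stated in the abstract.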

Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification

arXiv, 2022

Authors: Luo, Yang; Chen, Zhineng; Zhou, Shengtian; Gao, Xieping. Affiliations: Fudan University, Shanghai, China; Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, China

Self-supervised learning (SSL) has drawn increasing attention in histopathological image analysis in recent years. Compared to contrastive learning, which is troubled by the false negative problem (i.e., semantically similar images are selected as negative samples), masked autoencoders (MAE), which build SSL from a generative paradigm, are probably a more appropriate pre-training approach. In this paper, we introduce MAE and verify the effect of visible patches for histopathological image understanding. Moreover, a novel SD-MAE model is proposed to enable a self-distillation augmented MAE. Besides the reconstruction loss on masked image patches, SD-MAE further imposes a self-distillation loss on visible patches to enhance the representational capacity of the shallow layers of the encoder. We apply SD-MAE to histopathological image classification, cell segmentation and object detection. Experiments demonstrate that SD-MAE shows highly competitive performance when compared with other SSL methods in these tasks. © 2022, CC BY-NC-SA.

Keywords: Distillation

Design of Filtering Power Divider with Wide Stopband Based on QMSIW

2020 2nd International Conference on Artificial Intelligence Technologies and Application, ICAITA 2020

Authors: Cheng, Yi; Huang, Zhixiang; Wang, Zhongmiao; Wang, Chao; Li, Xuemei. Affiliations: Key Lab of Ministry of Education of Intelligent Computing and Signal Processing, Anhui University, Hefei, Anhui 230601, China

A miniaturized filtering power divider with wide stopband based on quarter-mode substrate integrated waveguide (QMSIW) is proposed. The physical size of the QMSIW is reduced by 3/4 compared with the traditional SIW structure, which achieves miniaturization while leaving the electromagnetic field distribution largely unaffected. A metal interference column and an etched U-shaped groove are used to adjust the resonance frequency so that the main mode and the first higher mode form a passband while the second higher mode is suppressed. The simulation results show that the center frequency is 4.42 GHz with an insertion loss of 0.45 dB, the 3-dB bandwidth is 4.12-4.63 GHz with a fractional bandwidth of 9.7%, and there is a wide stopband up to 11 GHz with a 15 dB rejection level. In addition, the return loss in the passband is better than 17.5 dB. To verify the validity of this design, a prototype is fabricated and measured. The measured results agree well with the design simulations. © 2020 Published under licence by IOP Publishing Ltd.

Keywords: Substrate integrated waveguides

Belief-selective Propagation Detection for MIMO Systems

arXiv, 2022

Authors: Zhou, Wenyue; Shen, Yifei; Li, Liping; Huang, Yongming; Zhang, Chuan; You, Xiaohu. Affiliations: National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China; Purple Mountain Laboratories, Nanjing 211189, China; Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei 230039, China

Compared to linear MIMO detectors, the Belief Propagation (BP) detector has shown greater capability in achieving near-optimal performance and is better suited to cooperating iteratively with channel decoders. Aiming at real applications, recent works mainly reduce complexity through simplified calculations, at the expense of performance. However, the complexity is still unsatisfactory, either growing exponentially or requiring exponentiation operations. Furthermore, due to the inherent loopy structure, existing BP detectors persistently encounter an error floor in the high signal-to-noise ratio (SNR) region, which becomes even worse with calculation approximation. This work proposes a revised BP detector, named the Belief-selective Propagation (BsP) detector, which selectively utilizes the trusted incoming messages with sufficiently large a priori probabilities for updates. Two proposed strategies, symbol-based truncation (ST) and edge-based simplification (ES), reduce the complexity (orders of magnitude lower than the Original-BP) while greatly relieving the error floor issue over a wide range of antenna and modulation combinations. For the 16-QAM 8 × 4 MIMO system, the B(1, 1) BsP detector achieves more than 4 dB performance gain (at BER = 10^-4) with roughly 4 orders of magnitude lower complexity than the Original-BP detector. The trade-off between performance and complexity for different application requirements can be conveniently obtained by configuring the ST and ES parameters. © 2022, CC BY.

Keywords: Belief propagation

Robust 3D phase retrieval via compressed support detection from snapshot diffraction pattern

Computers in Biology and Medicine, 2024, vol. 177, p. 108644

Authors: Zhang, Cheng; Zhang, Liru; Zhang, Ru; Chen, Mingsheng; Wei, Sui. Affiliations: Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Anhui Province, Hefei 230601, China; Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; School of Integrated Circuits, Anhui University, Anhui Province, Hefei 230601, China; Anhui Provincial High-performance Integrated Circuit Engineering Research Center, Anhui University, Anhui Province, Hefei 230601, China

Traditional multislice iterative phase retrieval (MIPR) from snapshot two-dimensional measurements suffers from two limitations: a pre-defined support and iterative stagnation. To eliminate the requirement for a priori knowledge of support masks, this paper proposes a multislice iterative phase retrieval algorithm based on compressed support detection and the hybrid input-output algorithm (CSD-MIPR-HIO). The CSD-MIPR-HIO algorithm first uses compressed support detection to adaptively detect the support masks of each plane from a single 2D diffraction intensity, and then uses a hybrid input-output (HIO) iterative algorithm for MIPR. The proposed method removes the dependence of traditional MIPR algorithms on a priori knowledge of support masks and achieves high-quality reconstruction in noisy environments. Numerical and optical experiments confirm the feasibility, superiority, and robustness of our proposed CSD-MIPR-HIO method. © 2024 Elsevier Ltd

Keywords: Iterative methods

Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation

arXiv, 2022

Authors: Shao, Chenze; Feng, Yang. Affiliations: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China

Non-autoregressive translation (NAT) models are typically trained with the cross-entropy loss, which forces the model outputs to be aligned verbatim with the target sentence and heavily penalizes small shifts in word positions. Latent alignment models relax the explicit alignment by marginalizing out all monotonic latent alignments with the CTC loss. However, they cannot handle non-monotonic alignments, which are non-negligible as there is typically global word reordering in machine translation. In this work, we explore non-monotonic latent alignments for NAT. We extend the alignment space to non-monotonic alignments to allow for global word reordering and further consider all alignments that overlap with the target sentence. We non-monotonically match the alignments to the target sentence and train the latent alignment model to maximize the F1 score of non-monotonic matching. Extensive experiments on major WMT benchmarks show that our method substantially improves the translation performance of CTC-based models. Our best model achieves 30.06 BLEU on WMT14 En-De with only one-iteration decoding, closing the gap between non-autoregressive and autoregressive models. Copyright © 2022, The Authors. All rights reserved.

Keywords: Alignment
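To make "F1 score of non-monotonic matching" in the abstract above more concrete, here is a simplified sketch that scores a single hypothesis against a target by order-insensitive n-gram overlap. This is only an illustration. The abstract does not specify the matching unit, so the use of bigrams and the function name `ngram_f1` are assumptions, and the paper's training objective is defined over latent alignments rather than a single decoded sentence.

```python
from collections import Counter


def ngram_f1(hypothesis, target, n=2):
    """F1 of the clipped n-gram overlap between two token lists.

    N-grams are matched as bags, so a phrase that appears at a different
    position in the hypothesis still counts as a match.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    hyp, ref = ngrams(hypothesis), ngrams(target)
    overlap = sum((hyp & ref).values())  # clipped count of shared n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# A globally reordered hypothesis still gets credit for its matching bigrams.
target = "a b c d".split()
reordered = "c d a b".split()
print(ngram_f1(reordered, target))
```

A position-wise cross-entropy comparison of `reordered` and `target` would find no matching positions at all, which is the penalty for word shifts that the abstract describes.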

Real-Time Action Detection Method based on Multi-Scale Spatiotemporal Feature

Image processing, Computer Vision and Machine Learning (ICICML), International Conference on

Authors: Xin Miao; Xiao Ke. Affiliations: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Computer and Data Science, Fuzhou University, Fuzhou, Fujian, China

ISBN: (print) 9781665464697

Spatiotemporal action detection relies on learning the spatial and temporal information in video. Current state-of-the-art convolutional neural network-based action detectors have achieved remarkable results using 2D CNN or 3D CNN architectures. However, due to the complexity of the network structure and of spatiotemporal information perception, these methods are usually used in a non-real-time, offline manner. The main challenge of spatiotemporal action detection is to design an effective detection network architecture and to effectively perceive the fused spatiotemporal features. To address these problems, this paper proposes a real-time action detection method based on multi-scale spatiotemporal features. Because a 2D or 3D backbone network alone cannot effectively model spatiotemporal features, we extract spatiotemporal features with multi-branch feature extraction networks. To address the lack of descriptiveness of single-scale spatiotemporal features, a multi-scale spatiotemporal feature-aware attention network is proposed to learn long-term temporal dependencies and spatial context information. The fusion of temporal and spatial features is guided by fusion attention to highlight more discriminative spatiotemporal feature representations. The proposed method achieves 82.59% and 78.30% accuracy on two spatiotemporal action datasets, UCF101-24 and JHMDB-21, respectively, while running at 73 frames/s.

Keywords: Solid modeling; Computer vision; Three-dimensional displays; Machine learning; Detectors; Network architecture; Feature extraction

Coherence-Based Distributed Document Representation Learning for Scientific Documents

arXiv, 2022

Authors: Tan, Shicheng; Zhao, Shu; Zhang, Yanping. Affiliations: Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui Province 230601, China; School of Computer Science and Technology, Anhui University, Anhui Province, Hefei 230601, China; Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui Province 230601, China

Distributed document representation is one of the basic problems in natural language processing. Current distributed document representation methods mainly consider the context information of words or sentences. These methods do not take into account the coherence of the document as a whole, e.g., the relation between a paper's title and abstract, a headline and its description, or adjacent bodies in the document. Coherence shows whether a document is meaningful, both logically and syntactically, especially in scientific documents (papers, patents, etc.). In this paper, we propose a coupled text pair embedding (CTPE) model to learn the representation of scientific documents, which maintains the coherence of the document with coupled text pairs formed by segmenting the document. First, we divide the document into two parts (e.g., title and abstract), which form a coupled text pair. Then, we adopt negative sampling to construct uncoupled text pairs whose two parts are from different documents. Finally, we train the model to judge whether a text pair is coupled or uncoupled and use the obtained embedding of coupled text pairs as the embedding of documents. We perform experiments on three datasets for one information retrieval task and two recommendation tasks. The experimental results verify the effectiveness of the proposed CTPE model. Copyright © 2022, The Authors. All rights reserved.

Keywords: Information retrieval
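The first two steps described in the abstract above (forming coupled pairs and sampling uncoupled ones) can be sketched as follows. This is a minimal illustration under stated assumptions, not the CTPE code: the function name `build_text_pairs`, the `num_negatives` parameter, and the choice to keep the first part fixed while swapping the second part are all hypothetical.

```python
import random


def build_text_pairs(documents, num_negatives=1, seed=0):
    """Build labeled text pairs from documents given as (first_part, second_part).

    Label 1 marks a coupled pair (both parts from the same document).
    Label 0 marks an uncoupled pair (second part taken from another document).
    """
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    pairs = []
    for idx, (first, second) in enumerate(documents):
        pairs.append((first, second, 1))
        others = [j for j in range(len(documents)) if j != idx]
        # Negative sampling: pair this first part with parts of other documents.
        for j in rng.sample(others, min(num_negatives, len(others))):
            pairs.append((first, documents[j][1], 0))
    return pairs


# Toy corpus of (title, abstract) pairs, made up for illustration.
docs = [
    ("Title A", "Abstract A"),
    ("Title B", "Abstract B"),
    ("Title C", "Abstract C"),
]
for first, second, label in build_text_pairs(docs):
    print(label, first, "|", second)
```

A binary classifier trained on these labels would correspond to the third step, in which the model judges whether a pair is coupled.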

Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation

arXiv, 2023

Authors: Liu, Bingzhi; Cao, Yin; Liu, Haohe; Zhou, Yi. Affiliations: School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China; Chongqing Key Laboratory of Signal and Information Processing, CQUPT, Chongqing, China; CQUPT, Chongqing, China; Department of Intelligent Science, Xi’an Jiaotong-Liverpool University, China; Centre for Vision, Speech and Signal Processing, University of Surrey, United Kingdom

Diffusion models have demonstrated promising results in text-to-audio generation tasks. However, their practical usability is hindered by slow sampling speeds, limiting their applicability in high-throughput scenarios. To address this challenge, progressive distillation methods have been effective in producing more compact and efficient models. Nevertheless, these methods encounter issues with unbalanced weights at both high and low noise levels, potentially impacting the quality of generated samples. In this paper, we propose the adaptation of the progressive distillation method to text-to-audio generation tasks and introduce the Balanced SNR-Aware (BSA) method, an enhanced loss-weighting mechanism for diffusion distillation. The BSA method employs a balanced approach to weight the loss for both high and low noise levels. We evaluate our proposed method on the AudioCaps dataset and report experimental results showing superior performance during the reverse diffusion process compared to previous distillation methods with the same number of sampling steps. Furthermore, the BSA method allows for a significant reduction in sampling steps from 200 to 25, with minimal performance degradation when compared to the original teacher models. © 2023, CC BY.

Keywords: Distillation