Search Results - Inner Mongolia University Library

28th National Conference on Communications (NCC)

Author: Yajnanarayana, Vijaya (Ericsson Res, Lund, Sweden)

ISBN: (print) 9781665451369

Beyond-5G networks will operate at high frequencies with wide bandwidths. This brings both opportunities and challenges. Opportunities include high-throughput connectivity with low latency. However, one of the main challenges in these networks is the high path loss at these operating frequencies, which requires the network to be deployed densely to provide coverage. Since these cells have a small inter-site distance (ISD), the dwell-time of the UEs in these cells is small; supporting mobility in these dense networks is therefore a challenge and requires frequent beam or cell reassignments. A proactive mobility management scheme that exploits historical trajectories can provide better prediction of cells and beams as UEs move through the coverage area. We propose an AI-based method using sequence-to-sequence modeling for the estimation of handover cells/beams along with dwell-time using the trajectory information of the UE. Results indicate that, for a dense deployment, an accuracy of more than 90 percent can be achieved for handover cell estimation, with a very low mean absolute error (MAE) for dwell-time.

Keywords: Handover (HO); Mobility; Machine-Learning; sequence-to-sequence modeling; Recurrent Neural Network (RNN); Beamforming; Beam Prediction

Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, vol. 29, pp. 1717-1728

Authors: Liu, Songxiang; Cao, Yuewen; Wang, Disong; Wu, Xixin; Liu, Xunying; Meng, Helen (Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Human Comp Commun Lab HCCL, Hong Kong, Peoples R China; Univ Cambridge, Engn Dept, Cambridge CB2 1TN, England)

This paper proposes an any-to-many, location-relative, sequence-to-sequence (seq2seq), non-parallel voice conversion approach that uses text supervision during training. In this approach, we combine a bottle-neck feature extractor (BNE) with a seq2seq synthesis module. During the training stage, an encoder-decoder-based hybrid connectionist-temporal-classification-attention (CTC-attention) phoneme recognizer is trained, whose encoder has a bottle-neck layer. A BNE is obtained from the phoneme recognizer and is used to extract speaker-independent, dense, and rich spoken content representations from spectral features. Then a multi-speaker, location-relative attention-based seq2seq synthesis model is trained to reconstruct spectral features from the bottle-neck features, conditioned on speaker representations for speaker identity control in the generated speech. To mitigate the difficulties of using seq2seq models to align long sequences, we down-sample the input spectral feature along the temporal dimension and equip the synthesis model with a discretized mixture of logistic (MoL) attention mechanism. Since the phoneme recognizer is trained on a large speech recognition corpus, the proposed approach can conduct any-to-many voice conversion. Objective and subjective evaluations show that the proposed any-to-many approach has superior voice conversion performance in terms of both naturalness and speaker similarity. Ablation studies are conducted to confirm the effectiveness of the feature selection and model design strategies in the proposed approach. The proposed VC approach can readily be extended to support any-to-any VC (also known as one/few-shot VC) and achieves high performance according to objective and subjective evaluations.

Keywords: Hidden Markov models; Feature extraction; Training; Acoustics; Decoding; Pipelines; Computational modeling; Any-to-many voice conversion; location relative attention; sequence-to-sequence modeling

Seq2SeqPy: A Lightweight and Customizable Toolkit for Neural Sequence-to-Sequence Modeling

12th International Conference on Language Resources and Evaluation (LREC)

Authors: Qader, Raheel; Portet, Francois; Labbe, Cyril (Univ Grenoble Alpes, LIG, Grenoble, France)

ISBN: (print) 9791095546344

We present Seq2SeqPy, a lightweight toolkit for sequence-to-sequence modeling that prioritizes simplicity and the ability to customize standard architectures easily. The toolkit supports several well-known models, such as Recurrent Neural Networks, Pointer Generator Networks, and the Transformer model. We evaluate the toolkit on two datasets and show that it performs similarly to, or even better than, a very widely used sequence-to-sequence toolkit.

Keywords: sequence-to-sequence modeling; deep learning; toolkit

SoURA: a user-reliability-aware social recommendation system based on graph neural network

NEURAL COMPUTING & APPLICATIONS, 2023, vol. 35, no. 25, pp. 18533-18551

Authors: Dawn, Sucheta; Das, Monidipa; Bandyopadhyay, Sanghamitra (Indian Stat Inst, Machine Intelligence Unit, Kolkata, India)

Exploiting user trust information to develop recommendation systems has gained increasing research interest in recent years. Because opinions about items are exchanged over the social network, trust plays a crucial role in whether a user likes or dislikes an item. Graph Neural Networks (GNNs), which have the intrinsic power of integrating node information and topological structure, have a high potential to advance the field of trust-aware social recommendation. However, this area remains little explored, with most existing GNN-based models ignoring the trust propagation and trust composition properties. To address this issue, we propose a novel GNN-based framework that captures these trust propagation and trust composition aspects by incorporating the concept of 'user-reliability.' Our proposed user-reliability-aware social recommendation framework, termed SoURA, generates the user-embedding and item-embedding while taking the user-reliability values into account, which in turn helps in better evaluating user trust. Experimental evaluations on the benchmark Ciao and Epinion datasets demonstrate the effectiveness of incorporating user-reliability when computing the user-embedding and item-embedding in a social recommendation system. The proposed SoURA shows an improvement of at least 25% over the state-of-the-art GNN-based recommendation algorithms.

Keywords: Graph neural networks; Recommender system; User-reliability; sequence-to-sequence modeling; Trust-aware recommendation

Sequential Modeling by Leveraging Non-Uniform Distribution of Speech Emotion

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, vol. 31, pp. 1087-1099

Authors: Lin, Wei-Cheng; Busso, Carlos (Univ Texas Dallas, Erik Jonsson Sch Engn & Comp Sci, Richardson TX 75080 USA)

The expression and perception of human emotions are not uniformly distributed over time. Therefore, tracking local changes of emotion within a segment can lead to better models for speech emotion recognition (SER), even when the task is to provide a sentence-level prediction of the emotional content. A challenge to exploring local emotional changes within a sentence is that most existing emotional corpora only provide sentence-level annotations (i.e., one label per sentence). This labeling approach is not appropriate for leveraging the dynamic emotional trends within a sentence. We propose a framework that splits a sentence into a fixed number of chunks, generating chunk-level emotional patterns. The approach relies on emotion rankers to unveil the emotional pattern within a sentence, creating continuous emotional curves. Our approach trains the sentence-level SER model with a sequence-to-sequence formulation by leveraging the retrieved emotional curves. The proposed method achieves the best concordance correlation coefficient (CCC) prediction performance for arousal (0.7120), valence (0.3125), and dominance (0.6324) on the MSP-Podcast corpus. In addition, we validate the approach with experiments on the IEMOCAP and MSP-IMPROV databases. We further compare the retrieved curves with time-continuous emotional traces. The evaluation demonstrates that these retrieved chunk-label curves can effectively capture emotional trends within a sentence, displaying a time-consistency property that is similar to time-continuous traces annotated by human listeners. The proposed SER model learns meaningful, complementary, local information that contributes to the improvement of sentence-level predictions of emotional attributes.

Keywords: Hidden Markov models; Task analysis; Emotion recognition; Feature extraction; Annotations; Speech processing; Databases; Emotion rankers; speech emotion recognition; chunk-level segmentation; sequence-to-sequence modeling
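
The concordance correlation coefficient (CCC) reported in the abstract above can be illustrated with a minimal Python sketch. It uses the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population statistics. The function name and the sample scores are assumptions made for illustration; they are not code or data from the paper.

```python
from statistics import fmean


def concordance_cc(predictions, labels):
    """Concordance correlation coefficient of two equal-length sequences."""
    if len(predictions) != len(labels) or len(labels) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = fmean(predictions)
    mean_l = fmean(labels)
    # Population (biased) variances and covariance.
    var_p = fmean([(p - mean_p) ** 2 for p in predictions])
    var_l = fmean([(l - mean_l) ** 2 for l in labels])
    cov = fmean([(p - mean_p) * (l - mean_l)
                 for p, l in zip(predictions, labels)])
    denom = var_p + var_l + (mean_p - mean_l) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


# Hypothetical arousal scores, for illustration only.
predicted = [0.2, 0.5, 0.7, 0.4]
annotated = [0.1, 0.6, 0.8, 0.3]
print(concordance_cc(predicted, annotated))
```

Unlike Pearson correlation, CCC also penalizes a shift in mean or a mismatch in scale between predictions and labels, so a prediction that is perfectly correlated but offset scores below 1.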

ML-assisted Optimization of Securities Lending

4th ACM International Conference on AI in Finance (ICAIF)

Authors: Prasad, Abhinav; Arunachalam, Prakash; Motamedi, Ali; Bhattacharya, Ranjeeta; Liu, Beibei; McCormick, Hays Skip; Xu, Shengzhe; Muralidhar, Nikhil; Ramakrishnan, Naren (Bank New York Mellon, New York NY 10166 USA; Virginia Tech, Comp Sci, Arlington VA USA; Stevens Inst Technol, Comp Sci, Hoboken NJ USA)

ISBN: (print) 9798400702402

This paper presents an integrated methodology to forecast the direction and magnitude of movements of lending rates in security markets. We develop a sequence-to-sequence (seq2seq) modeling framework that integrates feature engineering, motif mining, and temporal prediction in a unified manner to perform forecasting at scale in real-time or near real-time. We have deployed this approach in a large custodial setting demonstrating scalability to a large number of equities as well as newly introduced IPO-based securities in highly volatile environments.

Keywords: Securities Lending; sequence-to-sequence modeling; Motif Mining; Deep Learning

Augmenting Scientific Creativity with an Analogical Search Engine

ACM TRANSACTIONS ON COMPUTER-HUMAN INTERACTION, 2022, vol. 29, no. 6, p. 57

Authors: Kang, Hyeonsu B.; Qian, Xin; Hope, Tom; Shahaf, Dafna; Chan, Joel; Kittur, Aniket (Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh PA 15213 USA; Univ Maryland, College Pk MD 20742 USA; Allen Inst AI, Seattle WA 98103 USA; Univ Washington, Seattle WA 98103 USA; Hebrew Univ Jerusalem, Jerusalem, Israel)

Analogies have been central to creative problem-solving throughout the history of science and technology. As the number of scientific articles continues to increase exponentially, there is a growing opportunity for finding diverse solutions to existing problems. However, realizing this potential requires a means of searching through a large corpus that goes beyond surface matches and simple keywords. Here we contribute the first end-to-end system for analogical search on scientific articles and evaluate its effectiveness with scientists' own problems. Using a human-in-the-loop AI system as a probe, we find that our system facilitates creative ideation, and that ideation success is mediated by an intermediate level of matching on the problem abstraction (i.e., high versus low). We also demonstrate a fully automated AI search engine that achieves accuracy similar to that of the human-in-the-loop system. We conclude with design implications for enabling automated analogical inspiration engines to accelerate scientific innovation.

Keywords: Computational analogies; innovation; scientist users; interactive analogical search engine; sequence-to-sequence modeling; word embeddings; think-aloud studies

NetTraj: A Network-Based Vehicle Trajectory Prediction Model With Directional Representation and Spatiotemporal Attention Mechanisms

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, vol. 23, no. 9, pp. 14470-14481

Authors: Liang, Yuebing; Zhao, Zhan (Univ Hong Kong, Dept Urban Planning & Design, Hong Kong, Peoples R China)

Trajectory prediction of vehicles in city-scale road networks is of great importance to various location-based applications such as vehicle navigation, traffic management, and location-based recommendations. Existing methods typically represent a trajectory as a sequence of grid cells, road segments, or intention sets. None of them is ideal, as the cell-based representation ignores the road network structure and the other two are less efficient in analyzing city-scale road networks. Moreover, previous models barely leverage spatial dependencies or only consider them at the grid cell level, ignoring the non-Euclidean spatial structure shaped by irregular road networks. To address these problems, we propose a network-based vehicle trajectory prediction model named NetTraj, which represents each trajectory as a sequence of intersections and associated movement directions, and then feeds them into an LSTM encoder-decoder network for future trajectory generation. Furthermore, we introduce a local graph attention mechanism to capture network-level spatial dependencies of trajectories, and a temporal attention mechanism with a sliding context window to capture both short- and long-term temporal dependencies in trajectory data. Extensive experiments based on two real-world large-scale taxi trajectory datasets show that NetTraj outperforms the existing state-of-the-art methods for vehicle trajectory prediction, validating the effectiveness of the proposed trajectory representation method and spatiotemporal attention mechanisms.

Keywords: Trajectory; Roads; Predictive models; Hidden Markov models; Data models; Spatiotemporal phenomena; Public transportation; Trajectory prediction; trajectory representation; road networks; sequence-to-sequence modeling; spatiotemporal attention
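
The abstract above describes representing a trajectory as a sequence of intersections with associated movement directions. The Python sketch below shows one possible encoding of that idea. The abstract does not specify how directions are discretized, so the eight equal-width direction bins, the planar coordinates, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math


def direction_bin(origin, destination, num_bins=8):
    """Discretize the heading from origin to destination.

    Bin 0 is centered on east (+x); bins increase counterclockwise.
    """
    dx = destination[0] - origin[0]
    dy = destination[1] - origin[1]
    if dx == 0 and dy == 0:
        raise ValueError("origin and destination coincide")
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / num_bins
    # Shift by half a bin so each bin is centered on its heading.
    return int((angle + width / 2) // width) % num_bins


def encode_trajectory(intersections, coords, num_bins=8):
    """Pair each intersection with the direction bin toward the next one."""
    return [
        (a, direction_bin(coords[a], coords[b], num_bins))
        for a, b in zip(intersections, intersections[1:])
    ]


# Hypothetical intersection coordinates, for illustration only.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
print(encode_trajectory(["A", "B", "C", "D"], coords))
```

The resulting (intersection, direction) pairs are the kind of discrete tokens that could be embedded and fed to an encoder-decoder model.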

Output Only Damage Detection of a Steel Truss Bridge Based on a Semisupervised BiLSTM Modeling Scheme

STRUCTURAL CONTROL & HEALTH MONITORING, 2025, vol. 2025, no. 1

Authors: Zahid, Tazwar Bakhtiyar; Rana, Shohel; Haque, Md. Niamul (Bangladesh Univ Engn & Technol BUET, Dept Civil Engn, Dhaka 1000, Bangladesh)

The application of machine learning techniques in bridge health monitoring is gaining widespread popularity as it overcomes problems faced by conventional methods. However, the scarcity of labeled data from damaged bridges for training the model is a hindrance. The present study proposes a novel data-science-based approach for overcoming this hindrance, using a semisupervised, output-only method for multiple-level damage identification of a steel truss bridge. The method employs sequence-to-sequence modeling of the vehicle-induced vibration response from only a single sensor position. The authors use a bidirectional long short-term memory (BiLSTM) network for damage feature extraction. A statistical distance metric, the Kullback-Leibler divergence, is then used for feature discrimination. The method's efficiency is numerically investigated through a 3-D finite element model of a steel truss bridge based on real bridge specifications. A dynamic analysis using a moving vehicle is performed to obtain vehicle-induced accelerations. A total of 36 different damage scenarios are then incorporated into the bridge. The effects of sensor position and of variation in vehicle operation on performance are also investigated. The results show that the proposed approach successfully detects all the damage scenarios. The methodology's performance has also been validated in detecting damage on the Old ADA Bridge benchmark data. The methodology successfully detected multiple damage states using a single sensor response.

Keywords: damage detection; deep learning algorithm; sequence-to-sequence modeling; statistical metric; steel truss bridge; vehicle induced response
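
The abstract above names the Kullback-Leibler divergence as the metric used for feature discrimination. A minimal Python sketch of the discrete form, D_KL(P || Q) = Σ p_i · ln(p_i / q_i), follows. How the paper turns BiLSTM features into distributions is not stated in the abstract, so the histogram inputs, the `eps` floor, and the function name are assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in nats for two histograms over the same bins.

    Inputs may be raw counts; both are normalized to sum to 1.
    """
    if len(p) != len(q) or not p:
        raise ValueError("histograms must be non-empty and of equal length")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive total mass")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q
        if p_i > 0:
            # eps floors q_i so an empty reference bin gives a large
            # finite penalty instead of a division by zero.
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical feature histograms, for illustration only.
baseline = [10, 30, 40, 20]   # assumed undamaged reference state
observed = [5, 20, 45, 30]    # assumed state under test
print(kl_divergence(observed, baseline))
```

The divergence is zero when the two histograms match and grows as the observed distribution departs from the reference. It is not symmetric, so the choice of which state serves as the reference matters.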

ANY-TO-ONE SEQUENCE-TO-SEQUENCE VOICE CONVERSION USING SELF-SUPERVISED DISCRETE SPEECH REPRESENTATIONS

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Huang, Wen-Chin; Wu, Yi-Chiao; Hayashi, Tomoki (Nagoya Univ, Nagoya, Aichi, Japan)

ISBN: (print) 9781728176055

We present a novel approach to any-to-one (A2O) voice conversion (VC) in a sequence-to-sequence (seq2seq) framework. A2O VC aims to convert any speaker, including those unseen during training, to a fixed target speaker. We use vq-wav2vec (VQW2V), a discretized self-supervised speech representation learned from massive unlabeled data, which is assumed to be speaker-independent and to correspond well to the underlying linguistic content. Given a training dataset of the target speaker, we extract VQW2V and acoustic features to estimate a seq2seq mapping function from the former to the latter. With the help of a pretraining method and a newly designed postprocessing technique, our model generalizes to as little as 5 min of data, even outperforming the same model trained with parallel data.

Keywords: voice conversion; any-to-one voice conversion; self-supervised speech representation; vq-wav2vec; sequence-to-sequence modeling
