Search Results - Inner Mongolia University Library

Multimedia Tools and Applications, 2024, pp. 1-17

Authors: Gholami, Aminreza; Nasihatkon, Behrooz; Soryani, Mohsen. Affiliations: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran; School of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran

Most augmented reality (AR) pipelines involve the computation of the camera’s pose in each frame, followed by the 2D projection of virtual objects. The camera pose estimation is commonly implemented as a SLAM (Simultaneous Localisation and Mapping) algorithm. However, SLAM systems are often limited to scenarios where the camera intrinsics remain fixed or are known in all frames. This paper presents an initial effort to circumvent the pose estimation stage altogether and directly compute 2D projections using epipolar constraints. To achieve this, we first calculate the fundamental matrices between the keyframes and each new frame. The 2D locations of objects can then be triangulated by finding the intersection of epipolar lines in the new frame. We propose a robust algorithm that can handle situations where some of the fundamental matrices are entirely erroneous. Most notably, we introduce a depth-buffering algorithm that relies solely on the fundamental matrices, eliminating the need to compute 3D point locations in the target view. By utilizing fundamental matrices, our method remains effective even when all intrinsic camera parameters vary over time. Notably, our proposed approach achieved sufficient accuracy, even with more degrees of freedom in the solution space. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
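The epipolar-line intersection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' robust algorithm: it assumes each fundamental matrix `F` maps a keyframe pixel to its epipolar line in the new frame (`l = F x`), and it intersects the lines by plain least squares with no outlier handling. The function names `epipolar_line` and `intersect_lines` are hypothetical.

```python
def epipolar_line(F, x):
    """Epipolar line (a, b, c) in the new frame for keyframe pixel x = (u, v).

    F is a 3x3 fundamental matrix given as nested lists, with the assumed
    convention x_new^T F x_key = 0, so the line is l = F @ (u, v, 1).
    """
    u, v = x
    a, b, c = (row[0] * u + row[1] * v + row[2] for row in F)
    n = (a * a + b * b) ** 0.5  # scale so |a*u + b*v + c| is a pixel distance
    return (a / n, b / n, c / n)


def intersect_lines(lines):
    """Least-squares point (u, v) minimising the sum of squared line distances."""
    saa = sum(a * a for a, b, c in lines)
    sbb = sum(b * b for a, b, c in lines)
    sab = sum(a * b for a, b, c in lines)
    sac = sum(a * c for a, b, c in lines)
    sbc = sum(b * c for a, b, c in lines)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("epipolar lines are (nearly) parallel")
    # Solve the 2x2 normal equations [saa sab; sab sbb] [u v]^T = [-sac -sbc]^T
    u = (-sac * sbb + sbc * sab) / det
    v = (-saa * sbc + sab * sac) / det
    return (u, v)
```

With two keyframes the result is the exact intersection of the two lines; with more keyframes it is the point closest to all lines in the least-squares sense, which a robust variant would presumably replace with an outlier-tolerant estimate.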

Keywords: Augmented reality


Deep Reinforcement Learning Based Adaptive Environmental Selection for Evolutionary Multi-Objective Optimization


13th IEEE Congress on Evolutionary Computation, CEC 2024

Authors: Tian, Ye; Yao, Lianjie; Shao, Shuai; Zhang, Yajie; Zhang, Xingyi. Affiliation: School of Computer Science and Technology, Anhui University, Hefei, China

ISBN: (print) 9798350308365

Evolutionary algorithms have demonstrated superior performance in solving multi-objective optimization problems (MOPs), but no single algorithm is consistently effective across all MOPs. When using evolutionary algorithms to solve MOPs, environmental selection strategies determining which solutions should survive are crucial to population evolution. While different environmental selection strategies exhibit different search behaviors on various MOPs, existing multi-objective evolutionary algorithms rarely focus on the adaptation of environmental selection strategies. To fill this gap, this paper proposes a framework for assembling environmental selection strategies, which utilizes neural networks to assess the effects of different strategies on population evolution, and employs reinforcement learning to adaptively select the most effective strategies. The effectiveness and versatility of the proposed framework are verified on four test sets, where the proposed framework shows significant superiority over the state-of-the-art. © 2024 IEEE.

Keywords: Deep reinforcement learning


Towards Finer Human Reconstruction for Single RGB-D Images


41st Computer Graphics International Conference, CGI 2024

Authors: Zhu, Yan; Qian, Yu; Dai, Renlong; Wang, Linbo; Liu, Zhengyi; Fang, Xianyong. Affiliation: School of Computer Science and Technology, Anhui University, Hefei 230601, China

ISBN: (print) 9783031820205

Existing parametric-model-assisted methods for human surface reconstruction from single RGB-D images still struggle to obtain fine results. This article proposes an improved method which includes three tactics to overcome this limitation. First, a direct optimization scheme is adopted to refine the parametric model for a better back prior, considering that the estimated model can be inaccurate and thus affect the reconstruction performance. Second, a new encoder-decoder structured, residual-feature based back refinement network is proposed to further polish the initial back surface. It can preserve the global human shapes and poses without missing body parts while keeping local details. Here, a learnable weighted cross attention module (LCA) is embedded, which adaptively merges the high-level residual features from both the SMPL-X and initial back depths via cross-attention for rich details. Third, a new silhouette loss on both front and back surfaces is introduced, so that fine back surfaces with a smooth transition between the front and back can be reached. With these three tactics, a novel framework is proposed for robust surface reconstruction from single RGB-D images. Experimental results show that the proposed approach can obtain surfaces with significant details without missing parts. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: 3D reconstruction


Named Entity Recognition in Electronic Medical Records Incorporating Pre-trained and Multi-Head Attention


IAENG International Journal of Computer Science, 2024, vol. 51, no. 4, pp. 401-408

Authors: Yang, Haotian; Wang, Li; Yang, Yanpeng. Affiliations: School of Computer Science and Software Engineering, University of Science and Technology Liaoning CO, Anshan 114051, China; College of Computer Science and Software Engineering, University of Science and Technology Liaoning CO, Anshan 114051, China; Network Information Centre, University of Science and Technology Liaoning CO, Anshan 114051, China

Chinese Named Entity Recognition (NER) for Electronic Medical Records (EMR) is a fundamental task in building a digital hospital and is widely considered to be a sequence annotation problem in the Natural Language Processing (NLP) domain. However, existing deep learning sequence annotation models cannot fully use the large amount of unannotated data for Chinese EMRs that contain a vast number of professional unregistered words, named entities, and inter-entity relationships carrying rich professional knowledge. Moreover, the syntactic structure of EMR sentences is complex, and the text is long; the features of the EMR documents often cannot be captured deeply. Aiming at these two problems, this paper proposes a deep learning method that combines Multi-Head Attention with a pre-trained language model (BERT-BiGRU-Att-CRF). The method uses the BERT pre-trained model to obtain dynamic word vectors combined with contextual information, extracts global semantic features through a Bidirectional Gated Recurrent Unit (BiGRU), and obtains augmented semantic features by using Multi-Head Attention. Finally, Conditional Random Field (CRF) decoding outputs the globally optimal label sequence with the greatest probability. The model is trained using the CCKS2019 Chinese EMR dataset containing six types of entities: anatomical sites, surgeries, diseases and diagnoses, medicines, laboratory tests, and imaging tests, and good results are achieved with an F1 score of 86.97%. © (2024) International Association of Engineers.
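The final CRF decoding step described in this abstract is commonly done with the Viterbi algorithm. The following is a minimal sketch of that standard technique, not the paper's implementation: it assumes per-token emission scores (for example, from the attention layer) and a tag-to-tag transition score matrix are already available, and the function name `viterbi_decode` is hypothetical.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token t (T tokens, K tags)
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backptr = []                # backptr[t][j]: best previous tag for tag j
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backptr.append(ptrs)
    # Trace the best path backwards from the best final tag
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backptr):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path
```

In a BIO-style tagging scheme, a large negative transition score (such as from `O` to `I`) is one way to rule out invalid label sequences, which is the main benefit of decoding jointly rather than picking the best tag per token.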

Keywords: Random processes


A Survey of Edge Caching: Key Issues and Challenges

Tsinghua Science and Technology, 2024, vol. 29, no. 3, pp. 818-842

Authors: Hanwen Li; Mingtao Sun; Fan Xia; Xiaolong Xu; Muhammad Bilal. Affiliations: School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; Shandong Provincial University Laboratory for Protected Horticulture, Weifang University of Science and Technology, Weifang 262700, China; Reading Academy, Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si 17035, Republic of Korea

With the rapid development of mobile communication technology and intelligent applications, the quantity of mobile devices and data traffic in networks have been growing exponentially, which poses a great burden to networks and brings huge challenge to servicing user *** caching, which utilizes the storage and computation resources of the edge to bring resources closer to end users, is a promising way to relieve network burden and enhance user *** this paper, we aim to survey the edge caching techniques from a comprehensive and systematic *** first present an overview of edge caching, summarizing the three key issues regarding edge caching, i.e., where, what, and how to cache, and then introducing several significant caching *** then carry out a detailed and in-depth elaboration on these three issues, which correspond to caching locations, caching objects, and caching strategies, *** particular, we innovate on the issue “what to cache”, interpreting it as the classification of the “caching objects”, which can be further classified into content cache, data cache, and service ***, we discuss several open issues and challenges of edge caching to inspire future investigations in this research area.

Keywords: edge caching; edge computing; caching location; caching object; caching strategy; 5G network architecture; Internet of Things (IoT)


Augmentative Fusion Network for Robust RGBT Tracking


2024 16th International Conference on Graphics and Image Processing, ICGIP 2024

Authors: Hong, Fanghua; Wang, Jinhu; Lu, Andong; Wang, Qunjing. Affiliations: School of Electronic and Information Engineering, Anhui University, Hefei 230601, China; School of Computer Science and Technology, Anhui University, Hefei 230601, China

ISBN: (digital) 9781510688780

ISBN: (print) 9781510688773

RGBT tracking aims to take full advantage of the complementary advantages of RGB and thermal infrared (TIR) modalities to achieve robust tracking in complex scenes. However, current approaches face limitations when dealing with the quality-imbalanced problem. In this paper, we introduce a novel augmentative fusion learning framework that aims to maximize the potential of existing fusion modules in modality quality imbalanced scenarios. In particular, we design a modality stochastic degradation strategy to improve the robustness of the fusion module in modality quality imbalance scenarios. Meanwhile, to further enhance the fusion performance with the modality quality imbalance inputs, a self-supervised constraint is introduced to reconstruct the modality features before degradation by combining high-quality modality and degraded modality information. Finally, the effectiveness of the proposed method is verified by evaluating it on two standard RGBT datasets and two state-of-the-art algorithms. And the results indicate that our method achieved superior performance without adding any parameters or computational complexity. © 2025 SPIE.

Keywords: RGB color model


Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey

Machine Intelligence Research, 2023, vol. 20, no. 4, pp. 447-482

Authors: Xiao Wang; Guangyao Chen; Guangwu Qian; Pengcheng Gao; Xiao-Yong Wei; Yaowei Wang; Yonghong Tian; Wen Gao. Affiliations: Peng Cheng Laboratory, Shenzhen 518055, China; School of Computer Science and Technology, Anhui University, Hefei 230601, China; School of Computer Science, Peking University, Beijing 100871, China; College of Computer Science, Sichuan University, Chengdu 610065, China

With the urgent demand for generalized deep models, many pre-trained big models are proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), generative pre-trained transformers (GPT), *** by the success of these models in single domains (like computer vision and natural language processing), the multi-modal pre-trained big models have also drawn more and more attention in recent *** this work, we give a comprehensive survey of these models and hope this paper could provide new insights and helps fresh researchers to track the most cutting-edge ***, we firstly introduce the background of multi-modal pre-training by reviewing the conventional deep learning, pre-training works in natural language process, computer vision, and ***, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge enhanced *** that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression *** also give visualization and analysis of the model parameters and results on representative downstream ***, we point out possible research directions for this topic that may benefit future *** addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://***/wangxiao5791509/MultiModal_BigModels_Survey.

Keywords: Multi-modal (MM); pre-trained model (PTM); information fusion; representation learning; deep learning


A Credible and Fair Federated Learning Framework Based on Blockchain


IEEE Transactions on Artificial Intelligence, 2025, vol. 6, no. 2, pp. 301-316

Authors: Chen, Leiming; Zhao, Dehai; Tao, Liping; Wang, Kai; Qiao, Sibo; Zeng, Xingjie; Tan, Chee Wei. Affiliations: China University of Petroleum (East China), School of Computer Science and Technology, Qingdao 266580, China; CSIRO's Data61, Sydney 2015, Australia; Hefei University of Technology, Anhui 230009, China; Nanyang Technological University, School of Computer Science and Engineering, Singapore 639798, Singapore; Tiangong University, School of Software, Tianjin 300387, China; Southwest Petroleum University, School of Computer Science, Chengdu 610500, China

Federated learning (FL) enables cooperative computation between multiple participants while protecting user privacy. Currently, FL algorithms assume that all participants are trustworthy and their systems are secure. However, the following problems arise in real-world scenarios: 1) Malicious clients disrupt FL through model poisoning and data poisoning attacks. Although some research has proposed secure aggregation methods to solve this problem, most methods have limitations. 2) The current method cannot fairly evaluate client contribution in some scenarios. Some clients exhibit free-rider behavior, seeking to cheat the reward system and manipulate global models. Evaluating client contribution and distributing rewards also present challenges. To address these challenges, we design a trustworthy federated framework to ensure secure computing throughout the federated task process. First, we propose a method of detecting malicious models to guarantee secure model aggregation. Then, we propose a fair method of assessing contribution to identify client-side free-riding behavior. Finally, we implement a computation process based on blockchain and smart contracts to guarantee the trustworthiness and fairness of federated tasks. To validate the performance of our framework, we simulate different types of client attacks and contribution evaluation scenarios on several open-source datasets. The experiments show that our framework ensures the credibility of federated tasks and achieves a fair evaluation of client contributions. © 2020 IEEE.

Keywords: Blockchain


Web API Recommendation via Combining Adaptive Multichannel Graph Representation and xDeepFM Quality Prediction


IEEE Transactions on Artificial Intelligence, 2024, vol. 5, no. 6, pp. 3218-3232

Authors: Cao, Buqing; Qing, Yueying; Zhou, Dong; Xie, Xiang; Kang, Guosheng; Liu, Jianxun; Fletcher, Kenneth K. Affiliations: Hunan University of Science and Technology, School of Computer Science and Engineering, Xiangtan 411201, China; Guangdong University of Foreign Studies, School of Information Science and Technology, Guangzhou 510420, China; University of Massachusetts, Computer Science, Boston, MA 02125-3393, United States

With the increasing number of Web services, how to provide developers with Web APIs that meet their Mashup requirements accurately and efficiently has become a challenge. Even though the existing methods show improvements in service recommendation, their efficiency and accuracy (ACC) still need to be improved due to their limited representation in fusing the network topology and node features of Web services, and the neglected higher-order feature interactions of Web services. To address this problem, this article proposes a Web API recommendation method via combining adaptive multichannel (AMC) graph representation and eXtreme deep factorization machine (xDeepFM) quality prediction. In this method, first, specific embedding and shared embedding in the Web API node isomorphic network are extracted from the nodes' feature space, topology space, and the combination of the two spaces. Then, an attention mechanism is used to adaptively learn the importance weight of each embedding. Next, these embeddings are adaptively integrated to generate the multichannel graph representation of Web APIs for service classification. Finally, for the Web APIs in the service cluster, it utilizes xDeepFM to model and mine the complex feature interactions and to predict and rank the scores of Web APIs for Mashup creation. The experimental results on the real datasets of ProgrammableWeb show that compared with DeepFM, wide and deep learning (WDL), FM supported neural network (FNN), neural factorization machine (NFM), and mixed logistic regression (MLR), the method proposed in this article has an average improvement in AUC of 2.3%, 7.9%, 8.0%, 9.6%, and 13.3%, respectively. © 2020 IEEE.

Keywords: Network topology


FLAG: frequency-based local and global network for face forgery detection

Multimedia Tools and Applications, 2025, vol. 84, no. 2, pp. 647-663

Authors: Zhou, Kai; Sun, Guanglu; Wang, Jun; Wang, Jiahui; Yu, Linsen. Affiliations: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China; Department of Information Engineering and Mathematics, University of Siena, Siena 53100, Italy

Deepfake detection aims to mitigate the threat of manipulated content by identifying and exposing forgeries. However, previous methods tend to perform poorly when confronted with cross-dataset scenarios. To address this issue, we propose an innovative hybrid network called the Frequency-based Local and Global (FLAG) network to explore local and global information with the help of frequency-domain cues for better generalization capability. Considering that forged faces often exhibit flaws in the frequency domain, we design a Frequency-based Attention Enhancement Module (FAEM) to enhance the aggregation of CNN and Vision Transformer (ViT). In this design, local features from the CNN are attentively enhanced by selected frequency coefficients in FAEM, facilitating the learning of generalizable global features by the ViT module. The effectiveness of the proposed method is validated via numerous experiments, and the generalization performance is improved under cross-dataset scenarios. In particular, the proposed method obtains an AUC of 99.26% and an ACC of 96.56% in intra-dataset experiments on FaceForensics++ (C23). © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Face recognition

