Search Results - Inner Mongolia University Library

arXiv, 2023

Authors: An, Xiaoqi; Zhao, Lin; Gong, Chen; Wang, Nannan; Wang, Di; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; State Key Laboratory of Integrated Services Networks, Xidian University, China

High-resolution representation is essential for achieving good performance in human pose estimation models. To obtain such features, existing works utilize high-resolution input images or fine-grained image tokens. However, this dense high-resolution representation brings a significant computational burden. In this paper, we address the following question: "Only sparse human keypoint locations are detected for human pose estimation, is it really necessary to describe the whole image in a dense, high-resolution manner?" Based on dynamic transformer models, we propose a framework that only uses Sparse High-resolution Representations for human Pose estimation (SHaRPose). In detail, SHaRPose consists of two stages. At the coarse stage, the relations between image regions and keypoints are dynamically mined while a coarse estimation is generated. Then, a quality predictor is applied to decide whether the coarse estimation results should be refined. At the fine stage, SHaRPose builds sparse high-resolution representations only on the regions related to the keypoints and provides refined high-precision human pose estimations. Extensive experiments demonstrate the outstanding performance of the proposed method. Specifically, compared to the state-of-the-art method ViTPose, our model SHaRPose-Base achieves 77.4 AP (+0.5 AP) on the COCO validation set and 76.7 AP (+0.5 AP) on the COCO test-dev set, and infers at a speed of 1.4× faster than ViTPose-Base. Code is available at https://***/AnxQ/sharpose. Copyright © 2023, The Authors. All rights reserved.

Keywords: Machine learning
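
The coarse-then-refine control flow described in the abstract above can be sketched as follows. This is only an illustration of the gating idea, not the authors' implementation: the function names, the predictor's input, and the `threshold` parameter are all hypothetical, and the three stages are passed in as plain callables.

```python
from typing import Callable, List, Tuple

Keypoints = List[Tuple[float, float]]


def estimate_pose(
    image,
    coarse_stage: Callable,
    quality_predictor: Callable,
    fine_stage: Callable,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[Keypoints, bool]:
    """Run the coarse stage, then refine only if predicted quality is low.

    Returns the keypoints and a flag saying whether the fine stage ran.
    """
    # Coarse stage: a cheap estimate plus the regions judged related to keypoints.
    coarse_keypoints, related_regions = coarse_stage(image)

    # Gate: skip refinement when the coarse estimate is predicted to be good enough.
    if quality_predictor(coarse_keypoints) >= threshold:
        return coarse_keypoints, False

    # Fine stage: high-resolution processing restricted to the related regions.
    return fine_stage(image, related_regions), True
```

With stub stages, a high predicted quality returns the coarse result unchanged, while a low one triggers the fine stage; the saving comes from the inputs that never reach the second branch.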

Towards End-to-End Unsupervised Saliency Detection with Self-Supervised Top-Down Context

arXiv, 2023

Authors: Song, Yicheng; Gao, Shuyong; Xing, Haozhe; Cheng, Yiting; Wang, Yan; Zhang, Wenqiang. Affiliations: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; Keenon Robotics Co. Ltd., Shanghai, China; Academy for Engineering & Technology, Fudan University, Shanghai, China

Unsupervised salient object detection (USOD) aims to detect salient objects without using supervision signals, eliminating the tedious task of manually labeling salient objects. To improve training efficiency, end-to-end methods for USOD have been proposed as a promising alternative. However, current solutions rely heavily on noisy handcrafted labels and fail to mine rich semantic information from deep features. In this paper, we propose a self-supervised end-to-end salient object detection framework via top-down context. Specifically, motivated by contrastive learning, we exploit the self-localization from the deepest feature to construct the location maps, which are then leveraged to learn the most instructive segmentation guidance. Further considering the lack of detailed information in the deepest features, we exploit the detail-boosting refiner module to enrich the location labels with details. Moreover, we observe that due to the lack of supervision, current unsupervised saliency models tend to detect non-salient objects that are salient in some other samples of corresponding scenarios. To address this widespread issue, we design a novel Unsupervised Non-Salient Suppression (UNSS) method that develops the ability to ignore non-salient objects. Extensive experiments on benchmark datasets demonstrate that our method achieves leading performance among the recent end-to-end methods and most of the multi-stage solutions. The code is available. Copyright © 2023, The Authors. All rights reserved.

Keywords: Object detection

Inspire the Large Language Model by External Knowledge on BioMedical Named Entity Recognition

arXiv, 2023

Authors: Bian, Junyi; Zheng, Jiaxuan; Zhang, Yuyi; Zhu, Shanfeng. Affiliations: School of Computer Science, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, China; Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China

Large language models (LLMs) have demonstrated dominating performance in many NLP tasks, especially on generative tasks. However, they often fall short in some information extraction tasks, particularly those requiring domain-specific knowledge, such as Biomedical Named Entity Recognition (NER). In this paper, inspired by Chain-of-thought, we leverage the LLM to solve Biomedical NER step by step, breaking the NER task down into entity span extraction and entity type determination. Additionally, for entity type determination, we inject entity knowledge to address the LLM's lack of domain knowledge when predicting the entity category. Experimental results show a significant improvement in our two-step BioNER approach compared to the previous few-shot LLM baseline. Additionally, the incorporation of external knowledge significantly enhances entity category determination performance. Copyright © 2023, The Authors. All rights reserved.

Keywords: Domain Knowledge

Clothed Human Performance Capture with a Double-layer Neural Radiance Fields

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kangkan Wang; Guofeng Zhang; Suxu Cong; Jian Yang. Affiliations: Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; State Key Laboratory of CAD&CG, Zhejiang University, China

This paper addresses the challenge of capturing the performance of clothed humans from sparse-view or monocular videos. Previous methods capture the performance of full humans with a personalized template or recover the garments from a single frame with static human poses. However, it is inconvenient to extract cloth semantics and capture clothing motion with a one-piece template, while single-frame-based methods may suffer from unstable tracking across videos. To address these problems, we propose a novel method for human performance capture by tracking clothing and human body motion separately with double-layer neural radiance fields (NeRFs). Specifically, we propose double-layer NeRFs for the body and garments, and track the densely deforming template of the clothing and body by jointly optimizing the deformation fields and the canonical double-layer NeRFs. In the optimization, we introduce a physics-aware cloth simulation network which can help generate physically plausible cloth dynamics and body-cloth interactions. Compared with existing methods, our method is fully differentiable and can capture both the body and clothing motion robustly from dynamic videos. Also, our method represents the clothing with independent NeRFs, allowing us to model implicit fields of general clothes feasibly. The experimental evaluations validate its effectiveness on real multi-view or monocular videos.

Characterizing and Understanding Development of Social Computing Through DBLP: A Data-Driven Analysis

Journal of Social Computing, 2022, Vol. 3, No. 4, pp. 287-302

Authors: Jiaqi Wu; Bodian Ye; Qingyuan Gong; Atte Oksanen; Cong Li; Jingjing Qu; Felicia F. Tian; Xiang Li; Yang Chen. Affiliations: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200438, China; Faculty of Social Sciences, Tampere University, Tampere 33100, Finland; School of Information Science and Technology, Fudan University, Shanghai 200438, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China; School of Social Development and Public Policy, Fudan University, Shanghai 200433, China; Institute of Complex Networks and Intelligent Systems, Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai 200092, China

During the past decades, the term “social computing” has become a promising interdisciplinary area in the intersection of computer science and social *** this work, we conduct a data-driven study to understand the development of social computing using the data collected from Digital Bibliography and Library Project (DBLP), a representative computer science bibliography *** have observed a series of trends in the development of social computing, including the evolution of the number of publications, popular keywords, top venues, international collaborations, and research *** findings will be helpful for researchers and practitioners working in relevant fields.

Keywords: social computing; Digital Bibliography and Library Project (DBLP); bibliometric; evolution; visualization

VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition

27th European Conference on Artificial Intelligence, ECAI 2024

Authors: Bian, Junyi; Zhai, Weiqi; Huang, Xiaodi; Zheng, Jiaxuan; Zhu, Shanfeng. Affiliations: School of Computer Science, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, China; Ministry of Education, Shanghai 200433, China; MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Zhangjiang Fudan International Innovation Center, Shanghai 200433, China; Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China; School of Computing and Mathematics, Charles Sturt University, Albury NSW 2640, Australia

ISBN: 9781643685489 (print)

The prevalent solution for BioNER involves using representation learning techniques combined with sequence ***, such methods are inherently task-specific, demonstrate poor generalizability, and often require a dedicated model for each *** leverage the versatile capabilities of recent large language models (LLMs), several approaches have explored generative techniques for entity ***, these approaches often fall short compared to previous sequence labeling *** this paper, we utilize the open-sourced LLM LLaMA2 as the backbone model, and design specific instructions to distinguish between different types of entities and *** combining the LLM's understanding of instructions with sequence labeling techniques, we train a model using a mix of datasets capable of extracting various types of *** that the backbone LLMs lacks specialized medical knowledge, we also integrate external entity knowledge bases and employ instruction tuning to enable the model to densely recognize curated *** parameter-efficient training model, VANER, significantly outperforms previous LLMs-based *** the first time, as an LLM-based model, VANER surpasses the majority of conventional state-of-the-art BioNER systems, achieving the highest F1 scores across three datasets. © 2024 The Authors.

Keywords: labeled data

Video Action Recognition Method Based on Personalized Federated Learning and Spatiotemporal Features

Computers, Materials and Continua, 2025, Vol. 83, No. 3, pp. 4961-4978

Authors: Wu, Rongsen; Xu, Jie; Zhang, Yuhang; Zhao, Changming; Xie, Yiweng; Wu, Zelei; Li, Yunji; Guo, Jinhong; Tang, Shiyang. Affiliations: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China; Shanghai Key Lab of Intelligent Information Processing, School of CS, Fudan University, Shanghai 200433, China; School of Sensing Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; School of Mechanical and Manufacturing Engineering, University of New South Wales, Sydney 2052, Australia; School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, United Kingdom

With the rapid development of artificial intelligence and Internet of Things technologies, video action recognition technology is widely applied in various scenarios, such as personal life and industrial production. However, while enjoying the convenience brought by this technology, it is crucial to effectively protect the privacy of users’ video data. Therefore, this paper proposes a video action recognition method based on personalized federated learning and spatiotemporal features. Under the framework of federated learning, a video action recognition method leveraging spatiotemporal features is designed. For the local spatiotemporal features of the video, a new differential information extraction scheme is proposed to extract differential features with a single RGB frame as the center, and a spatial-temporal module based on local information is designed to improve the effectiveness of local feature extraction; for the global temporal features, a method of extracting action rhythm features using differential technology is proposed, and a time module based on global information is designed. Different translational strides are used in the module to obtain bidirectional differential features under different action rhythms. Additionally, to address user data privacy issues, the method divides model parameters into local private parameters and public parameters based on the structure of the video action recognition model. This approach enhances model training performance and ensures the security of video data. The experimental results show that under personalized federated learning conditions, an average accuracy of 97.792% was achieved on the UCF-101 dataset, which is non-independent and identically distributed (non-IID). This research provides technical support for privacy protection in video action recognition. Copyright © 2025 The Authors.

Keywords: Federated learning
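
The private/public parameter split mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the public parameters are combined or which layers are private, so everything here is an assumption: plain element-wise averaging stands in for the aggregation rule, and the names `backbone` and `head` in the usage note are hypothetical.

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]


def aggregate_public(clients: List[Params], public_keys: Set[str]) -> Params:
    """Element-wise average of the public parameters across clients.

    Plain averaging is an assumption; the source does not state the rule.
    """
    averaged: Params = {}
    for key in public_keys:
        # Group the i-th value of this parameter from every client together.
        columns = zip(*(client[key] for client in clients))
        averaged[key] = [sum(col) / len(clients) for col in columns]
    return averaged


def apply_update(client: Params, public_update: Params) -> Params:
    """Overwrite a client's public parameters; private ones stay untouched."""
    return {**client, **public_update}
```

For two clients holding `{"backbone": [1.0, 3.0], "head": [10.0]}` and `{"backbone": [3.0, 5.0], "head": [20.0]}`, sharing only `backbone` gives each client the averaged backbone while each `head` stays local and is never sent anywhere.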

VideoPure: Diffusion-based Adversarial Purification for Video Recognition

arXiv, 2025

Authors: Jiang, Kaixun; Chen, Zhaoyu; Fu, Jiyuan; Hong, Lingyi; Li, Jinglun; Zhang, Wenqiang. Affiliations: Shanghai Engineering Research Center of AI Robotics, Academy for Engineering & Technology, Fudan University, Shanghai, China; Engineering Research Center of AI & Robotics, Ministry of Education, Academy for Engineering & Technology, Fudan University, Shanghai, China; Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China

Recent work indicates that video recognition models are vulnerable to adversarial examples, posing a serious security risk to downstream applications. However, current research has primarily focused on adversarial attacks, with limited work exploring defense mechanisms. Furthermore, due to the spatial-temporal complexity of videos, existing video defense methods face issues of high cost, overfitting, and limited defense performance. Recently, diffusion-based adversarial purification methods have achieved robust defense performance in the image domain. However, due to the additional temporal dimension in videos, directly applying these diffusion-based adversarial purification methods to the video domain suffers performance and efficiency degradation. To achieve an efficient and effective video adversarial defense method, we propose the first diffusion-based video purification framework to improve video recognition models’ adversarial robustness: VideoPure. Given an adversarial example, we first employ temporal DDIM inversion to transform the input distribution into a temporally consistent and trajectory-defined distribution, covering adversarial noise while preserving more video structure. Then, during DDIM denoising, we leverage intermediate results at each denoising step and conduct guided spatial-temporal optimization, removing adversarial noise while maintaining temporal consistency. Finally, we input the list of optimized intermediate results into the video recognition model for multi-step voting to obtain the predicted class. We investigate the defense performance of our method against state-of-the-art black-box, gray-box, and adaptive attacks on benchmark datasets and models. Compared with other adversarial purification methods, our method overall demonstrates better defense performance against different attacks. Moreover, our method can be applied as a flexible defense plugin for video recognition models. Our code is available at https://***/deep-kaix

Keywords: HTTP
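
The final "multi-step voting" step in the abstract above can be pictured as a vote over the labels predicted for each intermediate result. The abstract does not specify the voting rule, so this sketch assumes a simple plurality vote; the function name and the tie-breaking behaviour are illustrative choices, not the authors' method.

```python
from collections import Counter
from typing import Hashable, Sequence


def vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label; ties go to the label seen first."""
    if not predictions:
        raise ValueError("need at least one prediction")
    # Counter keeps insertion order, so the earliest label wins a tie.
    return Counter(predictions).most_common(1)[0][0]
```

Given per-step predictions such as `["run", "walk", "run"]`, the function returns `"run"`. A weighted or confidence-based vote would be an equally plausible reading of the abstract.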

Community-Centric Graph Unlearning

arXiv, 2024

Authors: Li, Yi; Zhang, Shichao; Zhang, Guixian; Cheng, Debo. Affiliations: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin 541004, China; School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China; UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, Australia

Graph unlearning technology has become increasingly important since the advent of the ‘right to be forgotten’ and the growing concerns about the privacy and security of artificial intelligence. Graph unlearning aims to quickly eliminate the effects of specific data on graph neural networks (GNNs). However, most existing deterministic graph unlearning frameworks follow a balanced partition-submodel training-aggregation paradigm, resulting in a lack of structural information between subgraph neighborhoods and redundant unlearning parameter calculations. To address this issue, we propose a novel Graph Structure Mapping Unlearning paradigm (GSMU) and a novel method based on it named Community-centric Graph Eraser (CGE). CGE maps community subgraphs to nodes, thereby enabling the reconstruction of a node-level unlearning operation within a reduced mapped graph. CGE exponentially reduces both the amount of training data and the number of unlearning parameters. Extensive experiments conducted on five real-world datasets and three widely used GNN backbones have verified the high performance and efficiency of our CGE method, highlighting its potential in the field of graph unlearning. https://***/liiiyi/CCGU © 2024, CC BY-NC-ND.

Keywords: Graph neural networks
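
The idea of mapping community subgraphs to nodes, as stated in the abstract above, resembles generic graph coarsening. The sketch below shows that generic operation only, assuming the community assignment is already known and communities are identified by integers; it is not the CGE algorithm, whose details are not given in the abstract, and the function name is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def map_communities_to_nodes(
    edges: Iterable[Tuple[Hashable, Hashable]],
    community_of: Dict[Hashable, int],
) -> Set[Tuple[int, int]]:
    """Collapse each community into one node and keep inter-community edges."""
    mapped: Set[Tuple[int, int]] = set()
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu != cv:
            # Store undirected edges in sorted order so duplicates merge.
            mapped.add((min(cu, cv), max(cu, cv)))
    return mapped
```

For a four-node cycle split into two communities of two nodes each, the mapped graph has two nodes and a single edge, which is the sense in which the mapped graph is smaller than the original.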

Debiased Contrastive Representation Learning for Mitigating Dual Biases in Recommender Systems

arXiv, 2024

Authors: Huang, Zhirong; Zhang, Shichao; Cheng, Debo; Li, Jiuyong; Liu, Lin; Zhang, Guixian. Affiliations: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China; Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin 541004, China; UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, Australia; School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China

In recommender systems, popularity and conformity biases undermine recommender effectiveness by disproportionately favouring popular items, leading to their over-representation in recommendation lists and causing an unbalanced distribution of user-item historical data. We construct a causal graph to address both biases and describe the abstract data generation mechanism. Then, we use it as a guide to develop a novel Debiased Contrastive Learning framework for Mitigating Dual Biases, called DCLMDB. In DCLMDB, both popularity bias and conformity bias are handled in the model training process by contrastive learning to ensure that user choices and recommended items are not unduly influenced by conformity and popularity. Extensive experiments on two real-world datasets, Movielens-10M and Netflix, show that DCLMDB can effectively reduce the dual biases, as well as significantly enhance the accuracy and diversity of recommendations. © 2024, CC BY-NC-ND.

Keywords: Contrastive Learning
