Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Meng, Benyuan; Xu, Qianqian; Wang, Zitai; Cao, Xiaochun; Huang, Qingming

Affiliations: Institute of Information Engineering, CAS, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS, China; Peng Cheng Laboratory, China; School of Cyber Science and Tech., Shenzhen Campus of Sun Yat-sen University, China; School of Computer Science and Tech., University of Chinese Academy of Sciences, China; Key Laboratory of Big Data Mining and Knowledge Management, CAS, China

Diffusion models are initially designed for image generation. Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks such as semantic segmentation. Given numerous activations, selecting a small yet effective subset poses a fundamental problem. To this end, the early study of this field performs a large-scale quantitative comparison of the discriminative ability of the activations. However, we find that many potential activations have not been evaluated, such as the queries and keys used to compute attention scores. Moreover, recent advancements in diffusion architectures bring many new activations, such as those within embedded ViT modules. Both combined, activation selection remains unresolved but overlooked. To tackle this issue, this paper takes a further step with a much broader range of activations evaluated. Considering the significant increase in activations, a full-scale quantitative comparison is no longer operational. Instead, we seek to understand the properties of these activations, such that the activations that are clearly inferior can be filtered out in advance via simple qualitative evaluation. After careful analysis, we discover three properties universal among diffusion models, enabling this study to go beyond specific models. On top of this, we present effective feature selection solutions for several popular diffusion models. Finally, the experiments across multiple discriminative tasks validate the superiority of our method over the SOTA competitors. Our code is available at this url. © 2024, CC BY.

Keywords: Semantics

Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques

arXiv, 2024

Authors: Meng, Benyuan; Xu, Qianqian; Wang, Zitai; Yang, Zhiyong; Cao, Xiaochun; Huang, Qingming

Affiliations: Institute of Information Engineering, CAS, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS, China; Peng Cheng Laboratory, China; School of Computer Science and Tech., University of Chinese Academy of Sciences, China; Key Laboratory of Big Data Mining and Knowledge Management, CAS, China; School of Cyber Science and Tech., Sun Yat-sen University, Shenzhen Campus, China

Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely, diffusion feature. We discover that diffusion feature has been hindered by a hidden yet universal phenomenon that we call content shift. To be specific, there are content differences between features and the input image, such as the exact shape of a certain object. We locate the cause of content shift as one inherent characteristic of diffusion models, which suggests the broad existence of this phenomenon in diffusion feature. Further empirical study also indicates that its negative impact is not negligible even when content shift is not visually perceivable. Hence, we propose to suppress content shift to enhance the overall quality of diffusion features. Specifically, content shift is related to the information drift during the process of recovering an image from the noisy input, pointing out the possibility of turning off-the-shelf generation techniques into tools for content shift suppression. We further propose a practical guideline named GATE to efficiently evaluate the potential benefit of a technique and provide an implementation of our methodology. Despite the simplicity, the proposed approach has achieved superior results on various tasks and datasets, validating its potential as a generic booster for diffusion features. Our code is available at this url. © 2024, CC BY.

New entanglement-assisted quantum codes from negacyclic codes

arXiv, 2023

Authors: Chen, Xiaojing; Lu, Xingbo; Zhu, Shixin; Jiang, Wan; Wang, Xindi

Affiliations: School of Internet, Anhui University, Anhui, Hefei 230039, China; School of Mathematics, Hefei University of Technology, Anhui, Hefei 230601, China; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei University of Technology, Anhui, Hefei 230601, China; School of Computer and Information, Hefei University of Technology, Anhui, Hefei 230601, China

The theory of entanglement-assisted quantum error-correcting codes (EAQECCs) is a generalization of the standard stabilizer quantum error-correcting codes, which can be possibly constructed from any classical codes by relaxing the duality condition and utilizing preshared entanglement between the sender and receiver. In this paper, a new family of EAQECCs is constructed from negacyclic codes of length n = (q^2 + 1)/a, where q is an odd prime power, a = (m^2 + 1)/2 and m is an odd integer. Some new entanglement-assisted quantum maximum distance separable (EAQMDS) codes are obtained in the sense that their parameters are not covered by the previously known ones. Copyright © 2023, The Authors. All rights reserved.

Keywords: Quantum entanglement

A Tutorial on Movable Antennas for Wireless Networks

arXiv, 2025

Authors: Zhu, Lipeng; Ma, Wenyan; Mei, Weidong; Zeng, Yong; Wu, Qingqing; Ning, Boyu; Xiao, Zhenyu; Shao, Xiaodan; Zhang, Jun; Zhang, Rui

Affiliations: Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore; Chengdu 611731, China; National Mobile Communications Research Laboratory, Frontiers Science Center for Mobile Information Communication and Security, Southeast University, Nanjing 210096, China; Purple Mountain Laboratories, Nanjing 211111, China; Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; School of Electronic and Information Engineering, Beihang University, Beijing 100191, China; Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada; State Key Laboratory of CNS/ATM, MIIT Key Laboratory of Complex-field Intelligent Sensing, Beijing Institute of Technology, Beijing 100081, China; School of Science and Engineering, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Guangdong, Shenzhen 518172, China

Movable antenna (MA) has been recognized as a promising technology to enhance the performance of wireless communication and sensing by enabling antenna movement. Such a significant paradigm shift from conventional fixed antennas (FAs) to MAs offers tremendous new opportunities towards realizing more versatile, adaptive and efficient next-generation wireless networks such as 6G. In this paper, we provide a comprehensive tutorial on the fundamentals and advancements in the area of MA-empowered wireless networks. First, we overview the historical development and contemporary applications of MA technologies. Next, to characterize the continuous variation in wireless channels with respect to antenna position and/or orientation, we present new field-response channel models tailored for MAs, which are applicable to narrowband and wideband systems as well as far-field and near-field propagation conditions. Subsequently, we review the state-of-the-art architectures for implementing MAs and discuss their practical constraints. A general optimization framework is then formulated to fully exploit the spatial degrees of freedom (DoFs) in antenna movement for performance enhancement in wireless systems. In particular, we delve into two major design issues for MA systems. First, we address the intricate antenna movement optimization problem for various communication and/or sensing systems to maximize the performance gains achievable by MAs. Second, we deal with the challenging channel acquisition issue in MA systems for reconstructing the channel mapping between arbitrary antenna positions inside the transmitter and receiver regions. Moreover, we show existing prototypes developed for MA-aided communication/sensing and the experimental results based on them. Finally, the extension of MA design to other wireless systems and its synergy with other emerging wireless technologies are discussed. We also highlight promising research directions in this area to inspire future investigations.

Keywords: Antenna arrays

Personalized Recommendation Based On Entity Attributes and Graph Features

IEEE International Conference on Big knowledge (ICBK)

Authors: Yi Zhu; Bingbing Dong; Zhiqing Sha

Affiliations: School of Information Engineering, Yangzhou University, Yangzhou, China; Ministry of Education Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Institute of Big Knowledge Science, Hefei University of Technology, Hefei, China

ISBN: 9781665438599 (print)

With the rapid increase in the amount of website data, it has been a more difficult task for users to get the information they are interested in. Personalized recommendation is an important bridge to find the information which users really need on the website. Many recent studies have introduced additional attribute information about users and/or items to the rating matrix for alleviating the problem of data sparsity. In order to make full use of the attribute information and scoring matrix, deep learning based recommendation methods are proposed, especially the autoencoder model has attracted much attention because of its strong ability to learn hidden features. However, most of the existing autoencoder-based models require that the dimension of the input layer is equal to the dimension of the output layer, which may increase model complexity and certain information loss when using attribute information. In addition, as users' awareness of privacy protection increases, user attribute information is difficult to obtain. To address the above problems, in this paper, we propose a hybrid personalized recommendation model, which uses a semi-autoencoder to jointly embed the item's score vector and internal graph features (short for Co-Agpre). Specifically, we regard the user-item historical interaction matrix as a bipartite graph, and the Laplacian of the user-item co-occurrence graph is utilized to obtain the graph features of the item for solving the problem of sparse attributes. Then a semi-autoencoder is introduced to learn the hidden features of the item and perform rating prediction. The proposed model can flexibly use information from different sources to reduce the complexity of the model. Experiments on two real-world datasets demonstrate the effectiveness of the proposed Co-Agpre compared with state-of-the-art methods.

Keywords: Deep learning; Privacy; Laplace equations; Conferences; Predictive models; Feature extraction; Complexity theory
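
As a rough illustration of the graph step mentioned in the abstract above, the sketch below derives an item co-occurrence graph from user-item interactions and forms its unnormalized Laplacian L = D - A. The function name, the input layout, and the choice of the unnormalized Laplacian are assumptions made for illustration only; the abstract does not specify these details, and this is not the authors' Co-Agpre implementation.

```python
def item_cooccurrence_laplacian(interactions, num_items):
    """Build the unnormalized Laplacian of an item co-occurrence graph.

    interactions: dict mapping a user id to the set of item indices
                  (0 .. num_items - 1) that the user interacted with.
    Hypothetical helper; the weighting scheme is an assumption.
    """
    # A[i][j] counts the users who interacted with both item i and item j.
    adjacency = [[0] * num_items for _ in range(num_items)]
    for items in interactions.values():
        ordered = sorted(items)
        for pos, i in enumerate(ordered):
            for j in ordered[pos + 1:]:
                adjacency[i][j] += 1
                adjacency[j][i] += 1

    # L = D - A, where D is the diagonal matrix of weighted degrees.
    laplacian = [[-adjacency[i][j] for j in range(num_items)]
                 for i in range(num_items)]
    for i in range(num_items):
        laplacian[i][i] = sum(adjacency[i])
    return laplacian


# Toy interaction data (made up for the example).
toy_interactions = {"u1": {0, 1}, "u2": {1, 2}, "u3": {0, 1}}
toy_laplacian = item_cooccurrence_laplacian(toy_interactions, num_items=3)
```

Item-level graph features would typically be read from the eigenvectors of such a matrix; that step is left out here because the abstract does not say how the features are extracted.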

Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval

arXiv, 2022

Authors: Dong, Jianfeng; Wang, Yabing; Chen, Xianke; Qu, Xiaoye; Li, Xirong; He, Yuan; Wang, Xun

Affiliations: The College of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou 310035, China; The State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; The School of Electronic Information and Communications, Huazhong University of Science and Technology, Hubei 430074, China; The Key Lab of Data Engineering and Knowledge Engineering, Renmin University of China; The AIMC Lab., School of Information, Renmin University of China, Beijing 100872, China; The Alibaba Group, Beijing 100102, China

This paper aims for the task of text-to-video retrieval, where given a query in the form of a natural-language sentence, it is asked to retrieve videos which are semantically relevant to the given query, from a great number of unlabeled videos. The success of this task depends on cross-modal representation learning that projects both videos and sentences into common spaces for semantic similarity computation. In this work, we concentrate on video representation learning, an essential component for text-to-video retrieval. Inspired by the reading strategy of humans, we propose a Reading-strategy Inspired Visual Representation Learning (RIVRL) to represent videos, which consists of two branches: a previewing branch and an intensive-reading branch. The previewing branch is designed to briefly capture the overview information of videos, while the intensive-reading branch is designed to obtain more in-depth information. Moreover, the intensive-reading branch is aware of the video overview captured by the previewing branch. Such holistic information is found to be useful for the intensive-reading branch to extract more fine-grained features. Extensive experiments on three datasets are conducted, where our model RIVRL achieves a new state-of-the-art on TGIF and VATEX. Moreover, on MSR-VTT, our model using two video features shows comparable performance to the state-of-the-art using seven video features and even outperforms models pre-trained on the large-scale HowTo100M dataset. Code is available at https://***/LiJiaBei-7/rivrl. © 2022, CC BY.

Keywords: Video recording

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition

32nd ACM International Conference on Multimedia, MM 2024

Authors: Jia, Chengyou; Luo, Minnan; Chang, Xiaojun; Dang, Zhuohang; Han, Mingfei; Wang, Mengmeng; Dai, Guang; Dang, Sizhe; Wang, Jingdong

Affiliations: School of Computer Science and Technology, MOEKLINNS Lab, Xi'an Jiaotong University, Shaanxi, Xi'an, China; University of Science and Technology of China, Anhui, Hefei, China; School of Computer Science and Technology, Xi'an Jiaotong University, Shaanxi, Xi'an, China; ReLER Lab, AAII, University of Technology Sydney, Sydney, NSW, Australia; Zhejiang University of Technology, College of Computer Science and Technology, China; SGIT AI Lab, State Grid Corporation of China, Beijing, China; Baidu Inc, Beijing, China; United Arab Emirates; Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, Xi'an Jiaotong University, Xi'an 710049, China; SGIT AI Lab, State Grid Corporation of China, China; School of Computer Science and Technology, Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049, China

ISBN: 9798400706868 (print)

Exploring open-vocabulary video action recognition is a promising venture, which aims to recognize previously unseen actions within any arbitrary set of categories. Existing methods typically adapt pretrained image-text models to the video domain, capitalizing on their inherent strengths in generalization. A common thread among such methods is the augmentation of visual embeddings with temporal information to improve the recognition of seen actions. Yet, they compromise with standard less-informative action descriptions, thus faltering when confronted with novel actions. Drawing inspiration from human cognitive processes, we argue that augmenting text embeddings with human prior knowledge is pivotal for open-vocabulary video action recognition. To realize this, we innovatively blend video models with Large Language Models (LLMs) to devise Action-conditioned Prompts. Specifically, we harness the knowledge in LLMs to produce a set of descriptive sentences that contain distinctive features for identifying given actions. Building upon this foundation, we further introduce a multi-modal action knowledge alignment mechanism to align concepts in video and textual knowledge encapsulated within the prompts. Extensive experiments on various video benchmarks, including zero-shot, few-shot, and base-to-novel generalization settings, demonstrate that our method not only sets new SOTA performance but also possesses excellent interpretability. © 2024 ACM.

Keywords: Embeddings

Mandari: Multi-Modal Temporal knowledge Graph-aware Sub-graph Embedding for Next-POI Recommendation

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Xiaoqian Liu; Xiuyun Li; Yuan Cao; Fan Zhang; Xiongnan Jin; Jinpeng Chen

Affiliations: School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, China; Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing, China; The Technology Innovation Center of Cultural Tourism Big Data of Hebei Province, Chengde, China; Hebei Normal University for Nationalities, Chengde, China; Knowledge Discovery and Data Mining Research Center, Zhejiang Lab, Hangzhou, China

Next-POI recommendation aims to explore from user check-in sequence to predict the next possible location to be visited. Existing methods are often difficult to model the implicit association of multi-modal data with user choices. Moreover, traditional methods struggle to fully explore the variation of user preferences at variable time intervals. To tackle these limitations, we propose a Multi-Modal Temporal knowledge Graph-aware Sub-graph Embedding approach (Mandari). We first construct a novel Multi-Modal Temporal knowledge Graph. Based on the proposed knowledge graph, we integrate multi-modal information and leverage the graph attention network to calculate sub-graph prediction probability. Next, we implement a temporal knowledge mining method to model the segmentation and periodicity of user check-in and obtain temporal prediction probability. Finally, we fuse temporal prediction probability with the previous sub-graph prediction probability to obtain the final result. Extensive experiments demonstrate that our approach outperforms existing state-of-the-art methods.
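
The final fusion step described in the abstract above can be pictured with a small sketch. The abstract does not say how the two probabilities are combined, so the convex combination below, the weight `alpha`, and the function names are all assumptions chosen for illustration rather than the Mandari method itself.

```python
def fuse_probabilities(subgraph_probs, temporal_probs, alpha=0.5):
    """Blend two per-POI probability tables with a convex combination.

    alpha weights the sub-graph prediction; (1 - alpha) weights the
    temporal prediction. Both the rule and alpha are assumed, not sourced.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pois = set(subgraph_probs) | set(temporal_probs)
    fused = {
        poi: alpha * subgraph_probs.get(poi, 0.0)
        + (1.0 - alpha) * temporal_probs.get(poi, 0.0)
        for poi in pois
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("no probability mass to normalize")
    # Renormalize so the fused scores sum to one.
    return {poi: score / total for poi, score in fused.items()}


def top_k(fused, k):
    """Return the k POIs with the highest fused score (ties broken by name)."""
    return sorted(fused, key=lambda poi: (-fused[poi], poi))[:k]


# Made-up scores for three candidate POIs.
subgraph = {"cafe": 0.6, "museum": 0.3, "park": 0.1}
temporal = {"cafe": 0.2, "museum": 0.6, "park": 0.2}
ranking = top_k(fuse_probabilities(subgraph, temporal, alpha=0.5), k=2)
```

A learned or per-user weight could replace the fixed `alpha`; the fixed value is used only to keep the example short.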

Not all diffusion model activations have been evaluated as discriminative features

Proceedings of the 38th International Conference on Neural Information Processing Systems

Authors: Benyuan Meng; Qianqian Xu; Zitai Wang; Xiaochun Cao; Qingming Huang

Affiliations: Institute of Information Engineering, CAS and School of Cyber Security, University of Chinese Academy of Sciences; Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS and Peng Cheng Laboratory; Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS; School of Cyber Science and Tech., Shenzhen Campus of Sun Yat-sen University; School of Computer Science and Tech., University of Chinese Academy of Sciences and Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS and Key Laboratory of Big Data Mining and Knowledge Management, CAS

ISBN: 9798331314385 (print)

Listening to the user's voice: A temporal analysis of autism-related questions on Quora

Proceedings of the Association for Information Science and Technology, 2019, Vol. 56, No. 1, pp. 513-516

Authors: Zhao, Yuehua; Min, Chao; Han, Xu; Deng, Sanhong; Wang, Hao; Li, Jiang

Affiliations: Nanjing University, Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing, China; Nanjing University, Nanjing, China

Social question and answer (Q&A) platforms offer a new way for identifying information needs of people with certain diseases. Taking Quora as an example, we examine which health topics are of interest to autistic people and how these topics evolve over time. Experimental results reveal increasingly heavy and diverse attention to the condition, from diagnosis and treatment of autism itself to extended issues like social challenges, parenting, and education issues. We find that users tend to post clinical concerns about autism on Quora although traditionally such social Q&A platforms encourage more social and awareness-level questions. New concerns have appeared recently about autism's relations to other diseases like attention deficit hyperactivity disorder (ADHD) and obsessive–compulsive disorder (OCD). This study is beneficial for tracking and responding to autistic patients' and caregivers' information needs. Author(s) retain copyright, but ASIS&T receives an exclusive publication license

Keywords: Diseases
