Search Results - Inner Mongolia University Library

MindScore: quantifying human preference for text-to-image generation through multi-view lens

Science China (Information Sciences), 2025, Vol. 68, No. 6, pp. 72-85

Authors: Yiqi TONG, Jiarui ZHANG, Shaohang WEI, Wei GUO, Fuzhen ZHUANG, Deqing WANG, Xi YANG, Richeng XUAN. Affiliations: School of Artificial Intelligence, Beihang University; School of Computer Science and Engineering, Beihang University; Department of Computer Science and Engineering, Shanghai Jiao Tong University; School of Computer Science, Peking University; State Key Laboratory of Complex & Critical Software Environment, Beihang University; Beijing Academy of Artificial Intelligence

Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.

Keywords: text-to-image generation; foundation models; human preference evaluation; multi-view assessment; language and vision

A Robust Pilot Decontamination Scheme in Massive MIMO Systems: Integrating Rateless Orthogonal STBC, Weighted Graph Coloring, and Channel Estimation

IEEE Access, 2025, Vol. 13, pp. 94453-94463

Authors: Hussien, Habib M.; Kelem, Zelalem A. Affiliation: Addis Ababa Science and Technology University, Artificial Intelligence and Robotics Center of Excellence, Department of Electrical and Computer Engineering, Addis Ababa 16417, Ethiopia

Even though Multiple Input Multiple Output (MIMO) systems are one of the greatest innovations in wireless cellular network technology, they still face many challenges as we progress to 6G, hindering their ability to achieve the best throughput. In cellular networks, channel estimations are made using pilot sequences. Deploying these orthogonal time-frequency limited resources leads to pilot contamination, causing incorrect channel estimation due to mixed signals from neighboring cells and vice versa. Previous research has shown that this significantly decreases system performance. To reduce the impact and improve the output of Multiuser MIMO systems, this paper proposes a hybrid model combining Rateless Orthogonal Space-Time Block Codes (ROSTBC), Weighted Graph Coloring Pilot Allocation (WGCPA), and Maximum Likelihood Estimation (MLE). While ROSTBC has the ability to adapt to channel conditions, allowing the system to perform well despite pilot contamination, WGCPA and MLE address the problems of pilot contamination and the accuracy of channel estimation, respectively. The results show that this hybrid scheme performs better than previously proposed schemes, providing a valuable contribution to reliable Massive MIMO wireless communication. © 2013 IEEE.

Keywords: Maximum likelihood estimation

Reinforcement learning of non-additive joint steganographic embedding costs with attention mechanism

Science China (Information Sciences), 2023, Vol. 66, No. 3, pp. 273-286

Authors: Weixuan TANG, Bin LI, Weixiang LI, Yuangen WANG, Jiwu HUANG. Affiliations: Institute of Artificial Intelligence and Blockchain, Guangzhou University; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, Shenzhen University; Shenzhen Institute of Artificial Intelligence and Robotics for Society; School of Computer Science and Cyber Engineering, Guangzhou University

Image steganography is the art and science of secure communication by concealing information within digital images. In recent years, the techniques of steganographic cost learning have developed rapidly. Although the existing methods can learn satisfactory additive costs, the interplay of different pixels' embedding impacts has not been considered, so the potential of learning may not be fully exploited. To overcome this limitation, in this paper, a reinforcement learning paradigm called JoPoL (joint policy learning) is proposed to extend the idea of additive cost learning to a non-additive situation. JoPoL aims to capture the interactions within pixel blocks by defining embedding policies and evaluating contributions of embedding impacts on a block level rather than a pixel level. Then, a policy network is utilized to learn optimal joint embedding policies for pixel blocks through interactions with the environment. Afterwards, these policies can be converted into joint embedding costs for practical message embedding. The structure of the policy network is designed with an effective attention mechanism and incorporated with the domain knowledge derived from traditional non-additive steganographic methods. The environment is responsible for assigning rewards according to the impacts of the sampled joint embedding actions, which are evaluated by the gradient information of a neural network-based steganalyzer. Experimental results show that the proposed non-additive method JoPoL significantly outperforms the existing additive methods against both feature-based and CNN-based steganalyzers over different payloads.

Keywords: information hiding; non-additive steganography; steganalysis; cost learning; image processing

A survey on cross-user federated recommendation

Science China (Information Sciences), 2025, Vol. 68, No. 4, pp. 7-32

Authors: Enyue YANG, Yudi XIONG, Wei YUAN, Weike PAN, Qiang YANG, Zhong MING. Affiliations: College of Computer Science and Software Engineering, Shenzhen University; School of Electrical Engineering and Computer Science, The University of Queensland; WeBank AI Lab, WeBank; Department of Computer Science and Engineering, Hong Kong University of Science and Technology; College of Big Data and Internet, Shenzhen Technology University; Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ)

Recommender systems are effective in mitigating information overload, yet the centralized storage of user data raises significant privacy concerns. Cross-user federated recommendation (CUFR) provides a promising distributed paradigm to address these concerns by enabling privacy-preserving recommendations directly on user devices. In this survey, we review and categorize current progress in CUFR, focusing on four key aspects: privacy, security, accuracy, and efficiency. Firstly, we conduct an in-depth privacy analysis, discuss various cases of privacy leakage, and then review recent methods for privacy protection. Secondly, we analyze security concerns and review recent methods for untargeted and targeted *** untargeted attack methods, we categorize them into data poisoning attack methods and parameter poisoning attack methods. For targeted attack methods, we categorize them into user-based methods and item-based methods. Thirdly, we provide an overview of the federated variants of some representative methods, and then review the recent methods for improving accuracy from two categories: data heterogeneity and high-order information. Fourthly, we review recent methods for improving training efficiency from two categories: client sampling and model compression. Finally, we conclude this survey and explore some potential future research topics in CUFR.

Keywords: cross-user federated recommendation; federated recommendation; federated learning; recommender systems; user privacy

A Variational Inference-Based LSTM-Enhanced Deep Neural Model for Sequential Recommendations

IEEE Access, 2025, Vol. 13, pp. 95945-95962

Authors: Ngaffo, Armielle Noulapeu; Kamgang, Inès Raïssa Djouela; Maka, Ebenezer Maka; Malong, Yannick. Affiliations: University of Douala, Laboratory of Computer Science, Data Science and Artificial Intelligence, National Higher Polytechnic School of Douala, Douala, Cameroon; University of Buea, Faculty of Engineering and Technology, Department of Computer Engineering, Buea, Cameroon

Long Short-Term Memory (LSTM) networks are particularly useful in recommender systems since user preferences change over time. Unlike traditional recommender models which assume static user-item interactions, LSTM models capture sequential dependencies and temporal dynamics, making them ideal for personalized, session-based, and sequential recommendation tasks. However, traditional LSTM-based recommender systems do not model uncertainty since they perform deterministic predictions. Moreover, those recommender models exhibit poor generalization on small datasets, and handle cold-start and data sparsity problems poorly. To overcome those limitations, this paper presents an LSTM-based recommender model that leverages an improved deep matrix factorization to accurately address the data sparsity problem. In addition, an enhanced variational inference coupled to the Evidence Lower Bound Optimization is applied on LSTM unit layers to perform uncertainty-aware predictions. Thereafter, latent vectors obtained from LSTM networks are effectively involved in the sequential recommendation process. Experiments have been conducted on several real-world datasets to evaluate our proposal and the results show the proposed model presents significant performances compared to state-of-the-art models. © 2013 IEEE.

Keywords: Deep neural networks

Classification of Jackfruit Species Using Deep Learning Model 2

2nd IEEE International Conference on Recent Advances in Information Technology for Sustainable Development, ICRAIS 2024

Authors: Veeresha, R.K.; Karegoudra, Shilpa; Kumar, Bhavin; Pavan, Pavan. Affiliations: Department of Robotics and Artificial Intelligence, Udupi 574110, India; Department of Computer Science and Engineering, Udupi 574110, India

ISBN: (print) 9798350354461

Identifying Jackfruit spices using a deep learning model involves leveraging advanced neural network architectures to classify spices accurately. The model is trained on a dataset containing images of various spices commonly used with Jackfruit, such as deng surya, manmohan, prakashchandra, vietnamearly, and others. Through multiple layers of convolutional neural networks (CNNs) and possibly recurrent neural networks (RNNs), the model learns intricate patterns and features within the images to differentiate between different species. Transfer learning techniques might also be employed to adapt pre-trained models to the specific task of spice identification. The model's performance is evaluated on a separate test dataset to measure its accuracy, precision, recall, and F1 score. The paper provides a comprehensive overview of the deep learning approach to identifying Jackfruit spices, highlighting its potential applications in culinary research, food industry automation, and dietary analysis. © 2024 IEEE.

Keywords: Convolutional neural networks

ChemDFM-X: towards large multimodal model for chemistry

Science China (Information Sciences), 2024, Vol. 67, No. 12, pp. 99-100

Authors: Zihan ZHAO, Bo CHEN, Jingpiao LI, Lu CHEN, Liyang WEN, Pengyu WANG, Zichen ZHU, Danyang ZHANG, Yansi LI, Zhongyang DAI, Xin CHEN, Kai YU. Affiliations: X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; Suzhou Laboratory

Chemistry, as a naturally multimodal discipline, plays a crucial role in various vital fields such as pharmaceutical research and material manufacturing. Therefore, research on artificial intelligence (AI) for chemistry has garnered increasing attention. Despite the rapid development, most of the chemical AI models today mainly focus on single tasks with unimodal input [1].

Data Augmentation via Face Morphing for Recognizing Intensities of Facial Emotions

IEEE Transactions on Affective Computing, 2023, Vol. 14, No. 2, pp. 1228-1235

Authors: Huang, Tsung-Ren; Hsu, Shin-Min; Fu, Li-Chen. Affiliations: National Taiwan University, Department of Psychology, Center for Artificial Intelligence & Advanced Robotics, Taipei 106319, Taiwan; National Taiwan University, Center for Artificial Intelligence & Advanced Robotics, Department of Electrical Engineering, Department of Computer Science and Information Engineering, Taipei 106319, Taiwan

Being able to recognize emotional intensity is a desirable feature for a facial emotional recognition (FER) system. However, the development of such a feature is hindered by the paucity of intensity-labeled data for model training. To ameliorate the situation, the present study proposes using face morphing as a novel way of data augmentation to synthesize faces that express different degrees of a designated emotion. Such an approach has been successfully validated on humans and machines. Specifically, humans indeed perceived different levels of intensified emotions in these parametrically synthesized faces, and FER systems based on neural networks indeed showed improved sensitivities to intensities of different emotions when additionally trained on the synthesized faces. Overall, the proposed data augmentation method is not only simple and effective but also useful for building FER systems that recognize facial expressions of mixed emotions. © 2010-2012 IEEE.

Keywords: Face recognition

Support vector machine with discriminative low-rank embedding

CAAI Transactions on Intelligence Technology, 2024, Vol. 9, No. 5, pp. 1249-1262

Authors: Guangfei Liang, Zhihui Lai, Heng Kong. Affiliations: Computer Vision Institute, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; Department of Breast and Thyroid Surgery, BaoAn Central Hospital of Shenzhen, Shenzhen, China

Support vector machine (SVM) is a binary classifier widely used in machine ***, neglecting the latent data structure in previous SVM can limit the performance of SVM and its *** address this issue, the authors propose a novel SVM with discriminative low-rank embedding (LRSVM) that finds a discriminative latent low-rank subspace more suitable for SVM *** extension models of LRSVM are introduced by imposing different orthogonality constraints to prevent computational inaccuracies. A detailed derivation of the authors' iterative algorithms are given that is essentially for solving the SVM on the low-rank ***, some theorems and properties of the proposed models are presented by the *** is worth mentioning that the subproblems of the proposed algorithms are equivalent to the standard or the weighted linear discriminant analysis (LDA) *** indicates that the projection subspaces obtained by the authors' algorithms are more suitable for SVM classification compared to those from the LDA *** convergence analysis for the authors' proposed algorithms are also ***, the authors conduct experiments on various machine learning data sets to evaluate the *** experiment results show that the authors' algorithms perform significantly better than other algorithms, which indicates their superior abilities on classification tasks.

Keywords: iterative methods; machine learning; support vector machines

Outage Probability Analysis for D2D-Enabled Heterogeneous Cellular Networks with Exclusion Zone: A Stochastic Geometry Approach

Computer Modeling in Engineering & Sciences, 2024, Vol. 138, No. 1, pp. 639-661

Authors: Yulei Wang, Li Feng, Shumin Yao, Hong Liang, Haoxu Shi, Yuqiang Chen. Affiliations: School of Computer Science and Engineering, Macao University of Science and Technology, Taipa, Macao, China; Department of Broadband Communication, Peng Cheng Laboratory, Shenzhen, China; School of Artificial Intelligence, Dongguan Polytechnic, Dongguan, China

Interference management is one of the most important issues in the device-to-device (D2D)-enabled heterogeneous cellular networks (HetCNets) due to the coexistence of massive cellular and D2D devices in which D2D devices reuse the cellular *** alleviate the interference, an efficient interference management way is to set exclusion zones around the cellular *** this paper, we adopt a stochastic geometry approach to analyze the outage probabilities of cellular and D2D users in the D2D-enabled *** main difficulties contain three aspects: 1) how to model the location randomness of base stations, cellular and D2D users in practical networks; 2) how to capture the randomness and interrelation of cellular and D2D transmissions due to the existence of random exclusion zones; 3) how to characterize the different types of interference and their impacts on the outage probabilities of cellular and D2D *** then run extensive Monte-Carlo simulations which manifest that our theoretical model is very accurate.

Keywords: Device-to-device (D2D)-enabled heterogeneous cellular networks (HetCNets); exclusion zone; stochastic geometry (SG); Matérn hard-core process (MHCP)
