Search Results - Inner Mongolia University Library

A Global-Local Parallel Dual-Branch Deep Learning Model with Attention-Enhanced Feature Fusion for Brain Tumor MRI Classification

Computers, Materials & Continua, 2025, Vol. 83, No. 4, pp. 739-760

Authors: Zhiyong Li, Xinlian Zhou (School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411100, China)

Brain tumor classification is crucial for personalized treatment *** deep learning-based Artificial Intelligence (AI) models can automatically analyze tumor images, fine details of small tumor regions may be overlooked during global feature ***, we propose a brain tumor Magnetic Resonance Imaging (MRI) classification model based on a global-local parallel dual-branch *** global branch employs ResNet50 with Multi-Head Self-Attention (MHSA) to capture global contextual information from whole brain images, while the local branch utilizes VGG16 to extract fine-grained features from segmented brain tumor *** features from both branches are processed through a designed attention-enhanced feature fusion module to filter and integrate important ***, to address sample imbalance in the dataset, we introduce a category attention block to improve the recognition of minority *** results indicate that our method achieved a classification accuracy of 98.04% and a micro-average Area Under the Curve (AUC) of 0.989 in the classification of three types of brain tumors, surpassing several existing pre-trained Convolutional Neural Network (CNN) ***, feature interpretability analysis validated the effectiveness of the proposed *** suggests that the method holds significant potential for brain tumor image classification.

Keywords: Deep learning; attention mechanism; feature fusion; dual-branch structure; brain tumor MRI classification

Embedding prescribed-time adaptive control protocol unveiling distributed consensus in multirobot systems via directed topology

Science China (Information Sciences), 2025, Vol. 68, No. 2, pp. 383-384

Authors: Yonghao XIE, Xinru MA, Hengyu LI, Shaorong XIE (School of Computer Engineering and Science, Shanghai University; School of Mechatronic Engineering and Automation, Shanghai University)

Recently, multirobot systems (MRSs) have found extensive applications across various domains, including industrial manufacturing, collaborative formation of unmanned equipment, emergency disaster relief, and war scenarios [1]. These advancements are largely supported by the development of consistency control theory. However, traditional dynamics-free models may cause instability in complex robotic systems. Lagrangian dynamics offers a better approach for modeling these systems, as it facilitates controller design and optimization analysis. Despite this, challenges persist with unknown parameters and nonlinear friction within the systems.

Keywords: Microrobots

MindScore: quantifying human preference for text-to-image generation through multi-view lens

Science China (Information Sciences), 2025, Vol. 68, No. 6, pp. 72-85

Authors: Yiqi TONG, Jiarui ZHANG, Shaohang WEI, Wei GUO, Fuzhen ZHUANG, Deqing WANG, Xi YANG, Richeng XUAN (School of Artificial Intelligence, Beihang University; School of Computer Science and Engineering, Beihang University; Department of Computer Science and Engineering, Shanghai Jiao Tong University; School of Computer Science, Peking University; State Key Laboratory of Complex & Critical Software Environment, Beihang University; Beijing Academy of Artificial Intelligence)

Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.

Keywords: text-to-image generation; foundation models; human preference evaluation; multi-view assessment; language and vision

Video sketching using multi-domain guidance and implicit encoding

Visual Computer, 2025, pp. 1-12

Authors: Fang, Xiaonan; Chang, Muhan (School of Computer Science and Engineering, Macau University of Science and Technology, China)

Sketch data are a common element in visual communication. While synthesizing sketches from photographs has been extensively explored, creating sketches from video remains a complex challenge due to its inherent intricacy and the necessity for temporal consistency. This study delves into the generation of a sequence of vector sketches from a video clip. We have developed an optimization framework that utilizes the CLIP perceptual loss with guidance from multiple domains, including natural images and stylized line drawings. This approach aids in capturing the prominent visual content within a complex scene. We initialize the sketches by propagating control points from the keyframes through the video content deformation field. These initial points are implicitly encoded and serve as input to a transformer network that predicts the control point offsets for each frame. We also conduct an additional temporal refinement stage by using more precise initial points for optimization. Experimental results on the DAVIS video dataset demonstrate that our method successfully delivers high visual fidelity and temporal consistency. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.

Keywords: Drawing (graphics)

Exploring Potential Barrier Estimation Mechanism Based on Quantum Dynamics Framework

Chinese Journal of Electronics, 2025, Vol. 34, No. 1, pp. 350-364

Authors: Quan Tang, Peng Wang (Chengdu Institution of Computer Application, Chinese Academy of Science; University of Chinese Academy of Sciences; School of Computer Science and Engineering, Southwest Minzu University)

Due to the probabilistic characteristics of quantum mechanics, the combination of quantum mechanics and intelligent algorithms has received wide attention. Quantum dynamics theory uses the Schrödinger equation as a quantum dynamics equation. Through three approximations of the objective function, a quantum dynamics framework (QDF) is obtained, which describes the basic iterative operations of optimization algorithms. Based on QDF, this paper proposes a potential barrier estimation (PBE) method which originates from quantum mechanics. With the proposed method, the particle can accept inferior solutions during the sampling process according to a probability which is subject to the quantum tunneling effect, to improve the global search capacity of optimization *** effectiveness of the proposed method in escaping local minima was thoroughly investigated through the double well function (DWF), and experiments on two benchmark function sets show that this method significantly improves the optimization performance on high-dimensional complex functions. The PBE method is quantized and can easily be transplanted to other algorithms to achieve high performance in the future.

Keywords: Potential energy; Electric potential; Shape; Heuristic algorithms; Estimation; Tunneling; Linear programming; Approximation algorithms; Iterative algorithms; Optimization

Resonant tunneling diode cellular neural network with memristor coupling and its application in police forensic digital image protection

Chinese Physics B, 2025, Vol. 34, No. 5, pp. 289-301

Authors: Fei Yu, Dan Su, Shaoqi He, Yiya Wu, Shankou Zhang, Huige Yin (School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China)

Due to their biological interpretability, memristors are widely used to simulate synapses between artificial neural *** a type of neural network whose dynamic behavior can be explained, the coupling of resonant tunneling diode-based cellular neural networks (RTD-CNNs) with memristors has rarely been reported in the ***, this paper designs a coupled RTD-CNN model with memristors (RTD-MCNN), investigating and analyzing the dynamic behavior of the *** on this model, a simple encryption scheme for the protection of digital images in police forensic applications is *** results show that the RTD-MCNN can have two positive Lyapunov exponents, and its output is influenced by the initial values, exhibiting ***, a set of amplitudes in its output sequence is affected by the internal parameters of the memristor, leading to nonlinear ***, the rich dynamic behaviors described above make the RTD-MCNN highly suitable for the design of chaos-based encryption schemes in the field of privacy *** tests and security analyses validate the effectiveness of this scheme.

Keywords: memristor; hyperchaos; resonant tunneling diode-based cellular neural network (RTD-CNN); dynamic analysis; image encryption

BAD-FM: Backdoor Attacks Against Factorization-Machine Based Neural Network for Tabular Data Prediction

Chinese Journal of Electronics, 2024, Vol. 33, No. 4, pp. 1077-1092

Authors: Lingshuo MENG, Xueluan GONG, Yanjiao CHEN (College of Electrical Engineering, Zhejiang University; School of Computer Science, Wuhan University)

Backdoor attacks pose great threats to deep neural network models. All existing backdoor attacks are designed for unstructured data (image, voice, and text), but not structured tabular data, which has wide real-world applications, e.g., recommendation systems, fraud detection, and click-through rate prediction. To bridge this research gap, we make the first attempt to design a backdoor attack framework, named BAD-FM, for tabular data prediction models. Unlike images or voice samples composed of homogeneous pixels or signals with continuous values, tabular data samples contain well-defined heterogeneous fields that are usually sparse and discrete. Tabular data prediction models do not solely rely on deep networks but combine shallow components (e.g., factorization machine, FM) with deep components to capture sophisticated feature interactions among fields. To tailor the backdoor attack framework to tabular data models, we carefully design field selection and trigger formation algorithms to intensify the influence of the trigger on the backdoored model. We evaluate BAD-FM with extensive experiments on four datasets, i.e., HUAWEI, Criteo, Avazu, and KDD. The results show that BAD-FM can achieve an attack success rate as high as 100% at a poisoning ratio of 0.001%, outperforming baselines adapted from existing backdoor attacks against unstructured data models. As tabular data prediction models are widely adopted in finance and commerce, our work may raise alarms on the potential risks of these models and spur future research on defenses.

Keywords: Adaptation models; Systematics; Frequency modulation; Finance; Predictive models; Prediction algorithms; Data models

MLRT-UNet: An Efficient Multi-Level Relation Transformer Based U-Net for Thyroid Nodule Segmentation

Computer Modeling in Engineering & Sciences, 2025, Vol. 143, No. 4, pp. 413-448

Authors: Kaku Haribabu, Prasath R, Praveen Joe IR (Department of Computer Science and Engineering, RMK College of Engineering and Technology, Tiruvallur 601206, India; School of Computer Science and Engineering, Vellore Institute of Technology, Chennai 600127, India)

Thyroid nodules, a common disorder in the endocrine system, require accurate segmentation in ultrasound images for effective diagnosis and ***, achieving precise segmentation remains a challenge due to various factors, including scattering noise, low contrast, and limited resolution in ultrasound *** existing segmentation models have made progress, they still suffer from several limitations, such as high error rates, low generalizability, overfitting, limited feature learning capability, *** address these challenges, this paper proposes a Multi-level Relation Transformer-based U-Net (MLRT-UNet) to improve thyroid nodule *** MLRT-UNet leverages a novel Relation Transformer, which processes images at multiple scales, overcoming the limitations of traditional encoding *** transformer integrates both local and global features effectively through self-attention and cross-attention units, capturing intricate relationships within the *** approach also introduces a Co-operative Transformer Fusion (CTF) module to combine multi-scale features from different encoding layers, enhancing the model’s ability to capture complex patterns in the ***, the Relation Transformer block enhances long-distance dependencies during the decoding process, improving segmentation *** results show that the MLRT-UNet achieves high segmentation accuracy, reaching 98.2% on the Digital Database Thyroid Image (DDT) dataset, 97.8% on the Thyroid Nodule 3493 (TG3K) dataset, and 98.2% on the Thyroid Nodule 3K (TN3K) *** findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation, addressing the limitations of existing models.

Keywords: Thyroid nodules; endocrine system; multi-level relation transformer; U-Net; self-attention; external attention; co-operative transformer fusion; thyroid nodules segmentation

Dual-Task Contrastive Meta-Learning for Few-Shot Cross-Domain Emotion Recognition

Computers, Materials & Continua, 2025, Vol. 82, No. 2, pp. 2331-2352

Authors: Yujiao Tang, Yadong Wu, Yuanmei He, Jilin Liu, Weihan Zhang (School of Computer Science and Engineering, Sichuan University of Science and Engineering, Yibin 644002, China; School of Mechanical and Power Engineering, Chongqing University of Science and Technology, Chongqing 401331, China)

Emotion recognition plays a crucial role in various fields and is a key task in natural language processing (NLP). The objective is to identify and interpret emotional expressions in text. However, traditional emotion recognition approaches often struggle in few-shot cross-domain scenarios due to their limited capacity to generalize semantic features across different domains. Additionally, these methods face challenges in accurately capturing complex emotional states, particularly those that are subtle or implicit. To overcome these limitations, we introduce a novel approach called Dual-Task Contrastive Meta-Learning (DTCML). This method combines meta-learning and contrastive learning to improve emotion recognition. Meta-learning enhances the model’s ability to generalize to new emotional tasks, while instance contrastive learning further refines the model by distinguishing unique features within each category, enabling it to better differentiate complex emotional expressions. Prototype contrastive learning, in turn, helps the model address the semantic complexity of emotions across different domains, enabling it to learn fine-grained emotional expressions. By leveraging dual tasks, DTCML learns from two domains simultaneously, which encourages the model to learn more diverse and generalizable emotional features, thereby improving its cross-domain adaptability, robustness, and generalization ability. We evaluated the performance of DTCML across four cross-domain settings, and the results show that our method outperforms the best baseline by 5.88%, 12.04%, 8.49%, and 8.40% in terms of accuracy.

Keywords: Contrastive learning; emotion recognition; cross-domain learning; dual-task; meta-learning

Local saliency consistency-based label inference for weakly supervised salient object detection using scribble annotations

CAAI Transactions on Intelligence Technology, 2024, Vol. 9, No. 1, pp. 239-249

Authors: Shuo Zhao, Peng Cui, Jing Shen, Haibo Liu (School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China; School of Computer Science and Technology, Harbin Engineering University, Harbin, China)

Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of ***, there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background ***, an intuitive idea is to infer annotations that cover more complete object and background regions for *** this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent ***, the k-means clustering algorithm was first performed on both the colours and coordinates of the original annotations, and the same labels were then assigned to points whose colours are similar to the colour cluster centres and which lie near the coordinate cluster ***, the same annotations for pixels with similar colours within each kernel neighbourhood were set *** experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve state-of-the-art results.

Keywords: label inference; salient object detection; weak supervision
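
The label inference idea summarized in this abstract (cluster the colours and coordinates of the scribble annotations, then propagate a label to pixels close to both kinds of cluster centre) can be illustrated with a small sketch. This is not the authors' implementation: the function names `kmeans` and `infer_labels`, the tolerances `colour_tol` and `pos_tol`, the deterministic initialisation, and the rule of leaving ambiguous pixels unlabelled are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centres are initialised to the first k points."""
    centres = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        # Assign every point to its nearest centre.
        groups = [[] for _ in centres]
        for p in points:
            idx = min(range(len(centres)), key=lambda i: dist(p, centres[i]))
            groups[idx].append(p)
        # Recompute each centre as the mean of its group (keep it if empty).
        new_centres = []
        for group, centre in zip(groups, centres):
            if group:
                new_centres.append(tuple(sum(v) / len(group) for v in zip(*group)))
            else:
                new_centres.append(centre)
        if new_centres == centres:
            break
        centres = new_centres
    return centres


def infer_labels(scribbles, pixels, k=1, colour_tol=30.0, pos_tol=3.0):
    """Propagate scribble labels to unlabelled pixels.

    scribbles: {label: [(colour, (row, col)), ...]} from the original annotation.
    pixels:    [(colour, (row, col)), ...] to be labelled.
    Returns one label per pixel, or None when no single label matches.
    """
    centres = {}
    for label, pts in scribbles.items():
        colour_centres = kmeans([colour for colour, _ in pts], k)
        coord_centres = kmeans([xy for _, xy in pts], k)
        centres[label] = (colour_centres, coord_centres)

    inferred = []
    for colour, xy in pixels:
        # A label matches only if the pixel is close in BOTH colour and position.
        matches = [
            label
            for label, (cc, pc) in centres.items()
            if min(dist(colour, c) for c in cc) <= colour_tol
            and min(dist(xy, p) for p in pc) <= pos_tol
        ]
        inferred.append(matches[0] if len(matches) == 1 else None)
    return inferred


scribbles = {
    "fg": [((200, 10, 10), (5, 5)), ((210, 20, 15), (5, 6))],
    "bg": [((10, 10, 200), (0, 0)), ((12, 8, 190), (0, 1))],
}
pixels = [((205, 15, 12), (6, 5)), ((11, 9, 195), (1, 0)), ((120, 120, 120), (3, 3))]
result = infer_labels(scribbles, pixels)
print(result)
```

Requiring agreement in both colour and position mirrors the stated assumption that pixels with similar colours and close positions should share a label; pixels that match no cluster, or more than one, stay unlabelled in this sketch.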
