Brain tumor classification is crucial for personalized treatment *** deep learning-based Artificial Intelligence (AI) models can automatically analyze tumor images, fine details of small tumor regions may be overlooked during global feature ***, we propose a brain tumor Magnetic Resonance Imaging (MRI) classification model based on a global-local parallel dual-branch *** global branch employs ResNet50 with Multi-Head Self-Attention (MHSA) to capture global contextual information from whole brain images, while the local branch utilizes VGG16 to extract fine-grained features from segmented brain tumor *** features from both branches are processed through a designed attention-enhanced feature fusion module to filter and integrate important ***, to address sample imbalance in the dataset, we introduce a category attention block to improve the recognition of minority *** results indicate that our method achieved a classification accuracy of 98.04% and a micro-average Area Under the Curve (AUC) of 0.989 in the classification of three types of brain tumors, surpassing several existing pre-trained Convolutional Neural Network (CNN) ***, feature interpretability analysis validated the effectiveness of the proposed *** suggests that the method holds significant potential for brain tumor image classification.
Recently, multirobot systems (MRSs) have found extensive applications across various domains, including industrial manufacturing, collaborative formation of unmanned equipment, emergency disaster relief, and war scenarios [1]. These advancements are largely supported by the development of consistency control theory. However, traditional dynamics-free models may cause instability in complex robotic systems. Lagrangian dynamics offers a better approach for modeling these systems, as it facilitates controller design and optimization analysis. Despite this, challenges persist with unknown parameters and nonlinear friction within the systems.
Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.
Sketch data are a common element in visual communication. While synthesizing sketches from photographs has been extensively explored, creating sketches from video remains a complex challenge due to its inherent intric...
Due to the probabilistic characteristics of quantum mechanics, the combination of quantum mechanics and intelligent algorithms has received wide attention. Quantum dynamics theory uses the Schrödinger equation as a quantum dynamics equation. Through three approximations of the objective function, a quantum dynamics framework (QDF) is obtained, which describes the basic iterative operations of optimization algorithms. Based on QDF, this paper proposes a potential barrier estimation (PBE) method which originates from quantum mechanics. With the proposed method, the particle can accept inferior solutions during the sampling process according to a probability which is subject to the quantum tunneling effect, to improve the global search capacity of optimization *** effectiveness of the proposed method in escaping local minima was thoroughly investigated through a double well function (DWF), and experiments on two benchmark function sets show that this method significantly improves the optimization performance on high-dimensional complex functions. The PBE method is quantized and can be easily transplanted to other algorithms to achieve high performance in the future.
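The idea of accepting inferior solutions with a tunneling-like probability can be illustrated with a minimal sketch. This is not the PBE method itself, whose formulas are not given in the abstract. The acceptance rule below is an assumed WKB-style transmission probability for a rectangular barrier (with mass and reduced Planck constant set to 1), and the function names, the `barrier_width` parameter, and the particular double well are all hypothetical choices made for illustration.

```python
import math
import random


def double_well(x):
    # Hypothetical asymmetric double well: global minimum near x = -1.04,
    # shallower local minimum near x = 0.96.
    return x**4 - 2.0 * x**2 + 0.3 * x


def tunneling_probability(barrier_height, barrier_width=1.0):
    # Assumed WKB-style transmission through a rectangular barrier (m = hbar = 1).
    # Moves that do not increase the objective are always accepted.
    if barrier_height <= 0.0:
        return 1.0
    return math.exp(-2.0 * barrier_width * math.sqrt(2.0 * barrier_height))


def search(objective, x0, steps=5000, step_size=0.5, barrier_width=1.0, seed=0):
    # Random-walk sampling: an inferior candidate is accepted with a probability
    # that shrinks as the objective increase (the "barrier") grows.
    rng = random.Random(seed)
    x = best_x = x0
    fx = best_f = objective(x0)
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, step_size)
        fc = objective(candidate)
        if rng.random() < tunneling_probability(fc - fx, barrier_width):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f
```

Calling `search(double_well, x0=1.0)` starts the walker in the shallower well. A larger `barrier_width` makes uphill moves rarer, so the walker behaves more like a greedy local search.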
Due to their biological interpretability, memristors are widely used to simulate synapses between artificial neural *** a type of neural network whose dynamic behavior can be explained, the coupling of resonant tunneling diode-based cellular neural networks (RTD-CNNs) with memristors has rarely been reported in the ***, this paper designs a coupled RTD-CNN model with memristors (RTD-MCNN), investigating and analyzing the dynamic behavior of the *** on this model, a simple encryption scheme for the protection of digital images in police forensic applications is *** results show that the RTD-MCNN can have two positive Lyapunov exponents, and its output is influenced by the initial values, exhibiting ***, a set of amplitudes in its output sequence is affected by the internal parameters of the memristor, leading to nonlinear ***, the rich dynamic behaviors described above make the RTD-MCNN highly suitable for the design of chaos-based encryption schemes in the field of privacy *** tests and security analyses validate the effectiveness of this scheme.
Backdoor attacks pose great threats to deep neural network models. All existing backdoor attacks are designed for unstructured data (image, voice, and text), but not structured tabular data, which has wide real-world applications, e.g., recommendation systems, fraud detection, and click-through rate prediction. To bridge this research gap, we make the first attempt to design a backdoor attack framework, named BAD-FM, for tabular data prediction models. Unlike images or voice samples composed of homogeneous pixels or signals with continuous values, tabular data samples contain well-defined heterogeneous fields that are usually sparse and discrete. Tabular data prediction models do not solely rely on deep networks but combine shallow components (e.g., factorization machine, FM) with deep components to capture sophisticated feature interactions among fields. To tailor the backdoor attack framework to tabular data models, we carefully design field selection and trigger formation algorithms to intensify the influence of the trigger on the backdoored model. We evaluate BAD-FM with extensive experiments on four datasets, i.e., HUAWEI, Criteo, Avazu, and KDD. The results show that BAD-FM can achieve an attack success rate as high as 100% at a poisoning ratio of 0.001%, outperforming baselines adapted from existing backdoor attacks against unstructured data models. As tabular data prediction models are widely adopted in finance and commerce, our work may raise alarms on the potential risks of these models and spur future research on defenses.
Thyroid nodules, a common disorder in the endocrine system, require accurate segmentation in ultrasound images for effective diagnosis and ***, achieving precise segmentation remains a challenge due to various factors, including scattering noise, low contrast, and limited resolution in ultrasound *** existing segmentation models have made progress, they still suffer from several limitations, such as high error rates, low generalizability, overfitting, limited feature learning capability, *** address these challenges, this paper proposes a Multi-level Relation Transformer-based U-Net (MLRT-UNet) to improve thyroid nodule *** MLRT-UNet leverages a novel Relation Transformer, which processes images at multiple scales, overcoming the limitations of traditional encoding *** transformer integrates both local and global features effectively through self-attention and cross-attention units, capturing intricate relationships within the *** approach also introduces a Co-operative Transformer Fusion (CTF) module to combine multi-scale features from different encoding layers, enhancing the model's ability to capture complex patterns in the ***, the Relation Transformer block enhances long-distance dependencies during the decoding process, improving segmentation *** results show that the MLRT-UNet achieves high segmentation accuracy, reaching 98.2% on the Digital Database Thyroid Image (DDT) dataset, 97.8% on the Thyroid Nodule 3493 (TG3K) dataset, and 98.2% on the Thyroid Nodule 3K (TN3K) *** findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation, addressing the limitations of existing models.
Emotion recognition plays a crucial role in various fields and is a key task in natural language processing (NLP). The objective is to identify and interpret emotional expressions in text. However, traditional emotion recognition approaches often struggle in few-shot cross-domain scenarios due to their limited capacity to generalize semantic features across different domains. Additionally, these methods face challenges in accurately capturing complex emotional states, particularly those that are subtle or implicit. To overcome these limitations, we introduce a novel approach called Dual-Task Contrastive Meta-Learning (DTCML). This method combines meta-learning and contrastive learning to improve emotion recognition. Meta-learning enhances the model's ability to generalize to new emotional tasks, while instance contrastive learning further refines the model by distinguishing unique features within each category, enabling it to better differentiate complex emotional expressions. Prototype contrastive learning, in turn, helps the model address the semantic complexity of emotions across different domains, enabling it to learn fine-grained emotional expressions. By leveraging dual tasks, DTCML learns from two domains simultaneously, which encourages the model to learn more diverse and generalizable emotion features, thereby improving its cross-domain adaptability and robustness. We evaluated the performance of DTCML across four cross-domain settings, and the results show that our method outperforms the best baseline by 5.88%, 12.04%, 8.49%, and 8.40% in terms of accuracy.
Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of ***, there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background ***, an intuitive idea is to infer annotations that cover more complete object and background regions for *** this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent ***, the k-means clustering algorithm was first performed on both the colours and the coordinates of the original annotations, and the same labels were then assigned to points having colours similar to the colour cluster centres and lying near the coordinate cluster ***, the same annotations for pixels with similar colours within each kernel neighbourhood were set *** experiments on six benchmarks demonstrate that our method can significantly improve performance and achieve state-of-the-art results.
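The first inference step described above (clustering the colours and coordinates of the scribbles, then spreading each label to pixels that are close to both kinds of cluster centre) can be sketched roughly as follows. This is an illustrative reading of the abstract rather than the authors' implementation: the data layout, the helper names `kmeans` and `infer_labels`, and the thresholds `colour_tol` and `dist_tol` are assumptions, and the kernel-neighbourhood step is not covered.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of equal-length tuples; returns the cluster centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, min(k, len(points)))
    for _ in range(iters):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep the old centre if empty.
        centres = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    return centres


def infer_labels(image, scribbles, k=2, colour_tol=30.0, dist_tol=3.0):
    """image: {(row, col): (r, g, b)}; scribbles: {(row, col): label}.

    Returns the scribbles extended with inferred labels (thresholds are assumed).
    """
    inferred = dict(scribbles)
    for label in sorted(set(scribbles.values())):
        positions = [pos for pos, lab in scribbles.items() if lab == label]
        coord_centres = kmeans([tuple(map(float, p)) for p in positions], k)
        colour_centres = kmeans([tuple(map(float, image[p])) for p in positions], k)
        for pos, colour in image.items():
            if pos in inferred:
                continue  # never overwrite an original or already inferred label
            close_colour = any(math.dist(colour, c) <= colour_tol for c in colour_centres)
            close_coord = any(math.dist(pos, c) <= dist_tol for c in coord_centres)
            if close_colour and close_coord:
                inferred[pos] = label
    return inferred
```

Requiring both conditions reflects the stated assumption that pixels with similar colours and close positions should share a label. A pixel that matches on colour alone, far from any scribble, stays unlabelled.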