Search results - Inner Mongolia University Library

MLRT-UNet:An Efficient Multi-Level Relation Transformer Based U-Net for Thyroid Nodule Segmentation

Computer Modeling in Engineering & Sciences, 2025, vol. 143, no. 4, pp. 413-448

Authors: Kaku Haribabu, Prasath R, Praveen Joe IR. Affiliations: Department of Computer Science and Engineering, RMK College of Engineering and Technology, Tiruvallur, 601206, India; School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, 600127, India

Thyroid nodules, a common disorder in the endocrine system, require accurate segmentation in ultrasound images for effective diagnosis and ***, achieving precise segmentation remains a challenge due to various factors, including scattering noise, low contrast, and limited resolution in ultrasound *** existing segmentation models have made progress, they still suffer from several limitations, such as high error rates, low generalizability, overfitting, limited feature learning capability, *** address these challenges, this paper proposes a Multi-level Relation Transformer-based U-Net (MLRT-UNet) to improve thyroid nodule *** MLRT-UNet leverages a novel Relation Transformer, which processes images at multiple scales, overcoming the limitations of traditional encoding *** transformer integrates both local and global features effectively through self-attention and cross-attention units, capturing intricate relationships within the *** approach also introduces a Co-operative Transformer Fusion (CTF) module to combine multi-scale features from different encoding layers, enhancing the model’s ability to capture complex patterns in the ***, the Relation Transformer block enhances long-distance dependencies during the decoding process, improving segmentation *** results show that the MLRT-UNet achieves high segmentation accuracy, reaching 98.2% on the Digital Database Thyroid Image (DDT) dataset, 97.8% on the Thyroid Nodule 3493 (TG3K) dataset, and 98.2% on the Thyroid Nodule 3K (TN3K) *** findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation, addressing the limitations of existing models.

Keywords: Thyroid nodules; endocrine system; multi-level relation transformer; U-Net; self-attention; external attention; co-operative transformer fusion; thyroid nodules segmentation

Fake Face Detection Based on Fusion of Spatial Texture and High-Frequency Noise

Chinese Journal of Electronics, 2025, vol. 34, no. 1, pp. 212-221

Authors: Dengyong Zhang, Feifan Qi, Jiahao Chen, Jiaxin Chen, Rongrong Gong, Yuehong Tian, Lebing Zhang. Affiliations: Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, School of Computer and Communication Engineering, Changsha University of Science and Technology; School of Computer and Communication Engineering, Changsha University of Science and Technology; Changsha Social Work College; Changkuangao Beijing Technology Co. Ltd.; School of Computer and Artificial Intelligence, Huaihua University

The rapid development of the Internet has led to the widespread dissemination of manipulated facial images, significantly impacting people's daily lives. With the continuous advancement of Deepfake technology, the generated counterfeit facial images have become increasingly challenging to distinguish. There is an urgent need for a more robust and convincing detection method. Current detection methods mainly operate in the spatial domain and transform the spatial domain into other domains for analysis. With the emergence of transformers, some researchers have also combined traditional convolutional networks with transformers for detection. This paper explores the artifacts left by Deepfakes in various domains and, based on this exploration, proposes a detection method that utilizes the steganalysis rich model to extract high-frequency noise to complement spatial features. We have designed two main modules to fully leverage the interaction between these two aspects based on traditional convolutional neural networks. The first is the multi-scale mixed feature attention module, which introduces artifacts from high-frequency noise into spatial textures, thereby enhancing the model's learning of spatial texture features. The second is the multi-scale channel attention module, which reduces the impact of background noise by weighting the features. Our proposed method was experimentally evaluated on mainstream datasets, and a significant amount of experimental results demonstrate the effectiveness of our approach in detecting Deepfake forged faces, outperforming the majority of existing methods.

Keywords: Deepfakes; Adaptation models; Attention mechanisms; Noise; Transforms; Transformers; Feature extraction; Internet; Convolutional neural networks; Faces

AccEmo: Accelerometer Based Human Emotion Recognition for Eyewear Devices

IEEE Transactions on Mobile Computing, 2025, vol. 24, no. 6, pp. 4639-4650

Authors: Zhuang, Hui; Zhang, Yihang; Yang, Yanni; Zhang, Guoming; Chen, Zhe; Spolaor, Riccardo; Cheng, Xiuzhen; Mohapatra, Prasant; Hu, Pengfei. Affiliations: Shandong University, School of Computer Science and Technology, Qingdao, China; Fudan University, School of Computer Science, Shanghai, China; UC Davis, Department of Computer Science, CA, United States

With the increasing popularity of virtual reality applications, there is an increasing demand for more interactive entertainment, learning, social interactions, and other activities on eyewear devices. Recognizing users' emotion and providing reliable feedback can significantly improve the immersive experience for users. However, previous works in emotion recognition required modifications to existing eyewear devices and the integration of additional sensors, or relied on specialized sensors in expensive commercial-grade eyewear devices, making direct deployment on existing consumer-grade eyewear devices challenging. In this paper, we propose AccEmo, the first system that analyzes the data from the built-in accelerometer sensor on eyewear devices to accurately recognize human emotion. AccEmo first employs signal processing technologies to process raw accelerometer data, and then uses a binary classification network to determine whether the accelerometer data is influenced by emotional changes. Subsequently, AccEmo proposes a network architecture based on residual neural network and channel-wise attention mechanism as a universal feature extractor to extract complex features related to human emotions from the accelerometer data. Finally, AccEmo uses personalized classifiers to achieve emotion recognition for different users. Extensive performance evaluation of AccEmo across diverse users demonstrates an exceptional average accuracy of 94.3%. Additionally, the robustness of AccEmo is validated through evaluations in various scenarios, yielding promising results. © 2002-2012 IEEE.

Keywords: Emotion Recognition

CRT and PUF-based Self/Mutual-healing Key Distribution Protocol with Collusion Resistance and Revocation Capability

IEEE Transactions on Mobile Computing, 2025, vol. 24, no. 6, pp. 4607-4622

Authors: Othman, Wajdy; Hong, Zhong; Fuyou, Miao; Xue, Kaiping; Hawbani, Ammar; Amin, Ruhul; Zhao, Liang; Li, Tao. Affiliations: Nankai University, Haihe Lab of ITAI, College of Cyber Science, Tianjin, China; Anhui University, School of Computer Science and Technology, China; University of Science and Technology of China, School of Computer Science and Technology, School of Information Science and Technology, Anhui, Hefei, China; Shenyang Aerospace University, School of Computer Science, Shenyang 110136, China; Department of Computer Science & Engineering, IIIT-NR, China; Nankai University, Haihe Lab of ITAI, College of Cyber Science, China

Self-healing group key distribution (SGKD) protocols guarantee the security of group communications by allowing authorized users to independently recover missed previous session keys from the current broadcast without retransmission. However, existing SGKD protocols have flaws: (1) collusion resistance and revocable nodes are both upper-bounded by the degree of polynomials used, (2) the disclosure of personal secrets enables the recovery of group key, (3) temporary revocation of a group member is not possible, and (4) a revoked node may obtain the session key when initiating mutual healing; moreover, a malicious node may cause the recovery of false group keys. To address these limitations, we propose an SGKD protocol using the Chinese remainder theorem (CRT) and Physical Unclonable Function (PUF). Our proposed SGKD protocol generates a PUF-based dynamic secret by stimulating nodes' PUF using a polynomial-based encrypted challenge. This secret is then employed to retrieve a CRT-based encrypted group key. By combining PUF and CRT, we can generate dynamic secrets on the fly and reduce computation time significantly. Utilizing such a technique, our protocol achieves superior security goals, including resistance to any coalition of group nodes even if nodes' personal secrets were disclosed. Furthermore, the proposed protocol provides an unlimited number of revocable nodes. Additionally, a revoked node can rejoin its group in later sessions without affecting backward secrecy. Moreover, the protocol provides a backward secrecy guaranteed mutual-healing feature free from desynchronization. Our performance and security analyses (i.e., theorem-based formal analysis, NS3-based experiment, and formal verification using the AVISPA tool) show that our proposed protocol achieves stronger security goals and better efficiency in terms of computation, communication, and storage costs compared to existing SGKD schemes. © 2002-2012 IEEE.

Keywords: Authentication Protocol
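
The abstract above mentions retrieving a "CRT-based encrypted group key" with a per-node secret. The following is only a generic textbook sketch of how the Chinese remainder theorem can pack one key for several receivers into a single broadcast value; it is not the protocol from the paper. The function names, the additive masking, and the fixed integer `secret` (standing in for the PUF-derived dynamic secret) are all assumptions made for illustration.

```python
from math import prod


def crt_broadcast(group_key, members):
    """Build one integer X with X = (group_key + secret) mod modulus per member.

    members maps a node id to (modulus, secret). Moduli are assumed to be
    pairwise coprime and larger than group_key (hypothetical setup).
    """
    big_m = prod(modulus for modulus, _ in members.values())
    x = 0
    for modulus, secret in members.values():
        residue = (group_key + secret) % modulus
        partial = big_m // modulus
        # pow(partial, -1, modulus) is the modular inverse (Python 3.8+).
        x += residue * partial * pow(partial, -1, modulus)
    return x % big_m


def recover_key(broadcast, modulus, secret):
    """A member strips its own mask from its residue of the broadcast."""
    return (broadcast % modulus - secret) % modulus


# Hypothetical three-node group; a node left out of `members` gets no valid residue.
members = {"a": (101, 17), "b": (103, 58), "c": (107, 91)}
broadcast = crt_broadcast(42, members)
recovered = {node: recover_key(broadcast, m, s) for node, (m, s) in members.items()}
```

In this toy setting, revoking a node amounts to leaving its modulus out of the next broadcast, which hints at why a CRT construction does not tie the number of revocable nodes to a polynomial degree.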

Multi-Task Chinese Speech Recognition Method Based on the Squeezeformer Model

IAENG International Journal of Computer Science, 2025, vol. 52, no. 1, pp. 23-31

Authors: Guo, Ying; Wang, Li. Affiliations: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China; College of Computer Science and Technology, Liaoning, Anshan 114051, China

End-to-end training has emerged as a prominent trend in speech recognition, with Conformer models effectively integrating Transformer and CNN architectures. However, their complexity and high computational cost pose deployment challenges. To address these issues, we propose a multi-task Chinese speech recognition method based on the Squeezeformer model. We replace the FMCF structure in Conformer with an MF/CF structure, leveraging the convolutional module as a local Multi-Head Attention (MHA) module to enhance efficiency. Multilevel down-sampling and up-sampling using a time-series U-Net further reduce computational costs. By eliminating redundant LayerNorm layers and employing depthwise separable convolutions, we streamline the model, reduce parameters, and lower deployment costs. An Adaptor Layer is integrated into the MHSA module to mitigate the vanishing gradient problem, and a ScaleVar Layer is added to enhance flexibility. Additionally, the RealFormer module is introduced on the decoding side to improve context understanding. Combining Connectionist Temporal Classification (CTC) with attention-based encoding and decoding models for multi-task learning improves performance and accuracy. Experimental results show that the proposed method reduces the parameters on the AISHELL-1 dataset by 16% and reduces the character error rate to 5.50%. It also performs well on the AISHELL-2 dataset. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Speech recognition

Gloss-driven Conditional Diffusion Models for Sign Language Production

ACM Transactions on Multimedia Computing, Communications and Applications, 2025, vol. 21, no. 4, pp. 1-17

Authors: Tang, Shengeng; Xue, Feng; Wu, Jingjing; Wang, Shuo; Hong, Richang. Affiliations: School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; School of Software, Hefei University of Technology, Hefei, China; School of Data Science, School of Information Science and Technology, University of Science and Technology of China, Hefei, China

Sign Language Production (SLP) aims to convert text or audio sentences into sign language videos corresponding to their semantics, which is challenging due to the diversity and complexity of sign languages, and cross-modal semantic mapping issues. In this work, we propose a Gloss-driven Conditional Diffusion Model (GCDM) for SLP. The core of the GCDM is a diffusion model architecture, in which the sign gloss sequence is encoded by a Transformer-based encoder and input into the diffusion model as a semantic prior condition. In the process of sign pose generation, the textual semantic priors carried in the encoded gloss features are integrated into the embedded Gaussian noise via cross-attention. Subsequently, the model converts the fused features into sign language pose sequences through T-round denoising steps. During the training process, the model uses the ground-truth labels of sign poses as the starting point, generates Gaussian noise through T rounds of noise, and then performs T rounds of denoising to approximate the real sign language gestures. The entire process is constrained by the MAE loss function to ensure that the generated sign language gestures are as close as possible to the real labels. In the inference phase, the model directly randomly samples a set of Gaussian noise, generates multiple sign language gesture sequence hypotheses under the guidance of the gloss sequence, and outputs a high-confidence sign language gesture video by averaging multiple hypotheses. Experimental results on the Phoenix2014T dataset show that the proposed GCDM method achieves competitiveness in both quantitative performance and qualitative visualization. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Keywords: Forming
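
The training procedure described in the abstract above (noising ground-truth poses over T rounds, then denoising under an MAE constraint) resembles the standard forward process of a denoising diffusion model. The sketch below illustrates that standard process only; the linear beta schedule, its default values, and the function names are assumptions and are not taken from the GCDM paper.

```python
import math
import random


def alpha_bar_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / max(steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pose, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps for one pose vector."""
    eps = [rng.gauss(0.0, 1.0) for _ in pose]
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * x + noise * e for x, e in zip(pose, eps)], eps


def mae(pred, target):
    """Mean absolute error between a predicted and a ground-truth pose vector."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


# Hypothetical 2-D "pose" noised at an early and at the final step.
schedule = alpha_bar_schedule(1000)
rng = random.Random(0)
early, _ = add_noise([0.5, -0.2], schedule[10], rng)
late, _ = add_noise([0.5, -0.2], schedule[-1], rng)
```

At the final step the cumulative product is close to zero, so the sample is almost pure Gaussian noise, which matches the abstract's description of inference starting from randomly sampled noise.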

FACNet: Feature alignment fast point cloud completion network

Computational Visual Media, 2025, vol. 11, no. 1, pp. 141-157

Authors: Xinxing Yu, Jianyi Li, Chi-Chong Wong, Chi-Man Vong, Yanyan Liang. Affiliations: School of Computer Science and Engineering, Faculty of Innovation Engineering, Macao University of Science and Technology, Taipa, Macao, China; Faculty of Science and Technology, University of Macao, Taipa, Macao, China

Point cloud completion aims to infer complete point clouds based on partial 3D point cloud *** previous methods apply coarse-to-fine strategy networks for generating complete point ***, such methods are not only relatively time-consuming but also cannot provide representative complete shape features based on partial *** this paper, a novel feature alignment fast point cloud completion network (FACNet) is proposed to directly and efficiently generate the detailed shapes of *** aligns high-dimensional feature distributions of both partial and complete point clouds to maintain global information about the complete *** its decoding process, the local features from the partial point cloud are incorporated along with the maintained global information to ensure complete and time-saving generation of the complete point *** results show that FACNet outperforms the state-of-the-art on PCN, Completion3D, and MVP datasets, and achieves competitive performance on ShapeNet-55 and KITTI ***, FACNet and a simplified version, FACNet-slight, achieve a significant speedup of 3–10 times over other state-of-the-art methods.

Keywords: 3D point clouds; shape completion; geometry processing; deep learning

ConvGRU: A Lightweight Intrusion Detection System for Vehicle Networks Based on Shallow CNN and GRU

IEEE Access, 2025, vol. 13, pp. 73297-73318

Authors: Wang, Shaoqiang; Cheng, Jiahui; Wang, Yizhe; Li, Shutong; Kang, Lei; Dai, Yin-Fei. Affiliations: Changchun University, School of Computer Science and Technology, Changchun 130022, China; Jilin University, College of Computer Science and Technology, Changchun 130012, China

The rapid proliferation of connected vehicles has significantly expanded the attack surface of the Internet of Vehicles (IoV), introducing severe security risks. In such resource-constrained environments, developing lightweight solutions is crucial to ensuring real-time detection and efficient deployment. To address these challenges, this study proposes ConvGRU, a lightweight vehicular network intrusion detection model that integrates a shallow Convolutional Neural Network (CNN) with a Gated Recurrent Unit (GRU). By employing optimizations such as small convolutional kernels and depthwise separable convolutions, the model significantly reduces the number of parameters and computational overhead, making it well-suited for resource-limited IoV environments. The shallow CNN effectively captures spatial features, while the GRU extracts temporal dependencies, enhancing the model's generalization ability. ConvGRU achieves an accuracy, precision, recall, and F1-score exceeding 0.99 on the HCRL-Car-hacking, OTIDS, and CICIDS-2018 datasets, with only 112.55K parameters and a memory footprint of merely 0.43 MB. Experimental results demonstrate that this intrusion detection solution substantially improves malicious traffic detection accuracy while ensuring efficient operation in resource-constrained vehicular environments. © 2013 IEEE.

Keywords: Network intrusion
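
The abstract above credits depthwise separable convolutions with reducing the parameter count. The short calculation below shows why, using the usual weight-count formulas for a 2-D convolution (biases ignored). The layer sizes are hypothetical and are not taken from the ConvGRU paper.

```python
def conv_params(c_in, c_out, kernel):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(c_in, c_out, kernel):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 32 input channels, 64 output channels, 3x3 kernel.
standard = conv_params(32, 64, 3)                   # 18432 weights
separable = depthwise_separable_params(32, 64, 3)   # 2336 weights
reduction = standard / separable                    # roughly 7.9x fewer weights
```

The saving grows with the number of output channels and the kernel area, since the ratio is approximately 1 / (1/c_out + 1/kernel^2).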

FMCC-RT: a scalable and fine-grained all-reduce algorithm for large-scale SMP clusters

Science China (Information Sciences), 2025, vol. 68, no. 5, pp. 362-379

Authors: Jintao PENG, Jie LIU, Jianbin FANG, Min XIE, Yi DAI, Zhiquan LAI, Bo YANG, Chunye GONG, Xinjun MAO, Guo MAO, Jie REN. Affiliations: School of Computer Science and Technology, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Laboratory of Digitizing Software for Frontier Equipment, National University of Defense Technology; National Supercomputer Center in Tianjin; School of Computer Science, Shaanxi Normal University

All-reduce is a widely used communication technique for distributed and parallel applications typically implemented using either a tree-based or ring-based scheme. Each of these approaches has its own limitations: tree-based schemes struggle with efficiently exchanging large messages, while ring-based solutions assume constant communication throughput, an unrealistic expectation in modern network communication infrastructures. We present FMCC-RT, an all-reduce approach that combines the advantages of tree- and ring-based implementations while mitigating their drawbacks. FMCC-RT dynamically switches between tree and ring-based implementations depending on the size of the message being processed. It utilizes an analytical model to assess the impact of message sizes on the achieved throughput, enabling the derivation of optimal work partitioning parameters. Furthermore, FMCC-RT is designed with an Open MPI-compatible API, requiring no modification to user code. We evaluated FMCC-RT through micro-benchmarks and real-world application tests. Experimental results show that FMCC-RT outperforms state-of-the-art tree- and ring-based methods, achieving speedups of up to 5.6×.

Keywords: all-reduce; collective communication; MPI; scalability
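
The idea of switching between tree and ring all-reduce by message size, as described in the abstract above, can be illustrated with the textbook latency-bandwidth cost model. This is not FMCC-RT's analytical model: the sketch assumes a constant per-byte cost `beta`, which is exactly the simplification the abstract calls unrealistic, and the function names and default constants are hypothetical.

```python
import math


def ring_cost(p, n_bytes, alpha, beta):
    """Ring all-reduce: 2(p-1) steps, each moving n/p bytes (reduce-scatter + all-gather)."""
    return 2 * (p - 1) * alpha + 2 * (p - 1) / p * n_bytes * beta


def tree_cost(p, n_bytes, alpha, beta):
    """Binomial-tree reduce then broadcast: 2*ceil(log2 p) steps, full message each step."""
    steps = math.ceil(math.log2(p))
    return 2 * steps * (alpha + n_bytes * beta)


def choose_algorithm(p, n_bytes, alpha=1e-6, beta=1e-9):
    """Pick the cheaper scheme under the assumed per-message latency and per-byte cost."""
    if tree_cost(p, n_bytes, alpha, beta) <= ring_cost(p, n_bytes, alpha, beta):
        return "tree"
    return "ring"


small = choose_algorithm(64, 1024)              # latency-bound message
large = choose_algorithm(64, 64 * 1024 * 1024)  # bandwidth-bound message
```

Under these assumed constants the tree wins for the 1 KiB message because it needs far fewer latency-bound steps, while the ring wins for the 64 MiB message because each step carries only a 1/p slice of the data.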

A Meta-Computing Framework for Collaborative Federated Graph Learning in Industrial IoTs

IEEE Internet of Things Journal, 2025, vol. 12, no. 10, pp. 13828-13837

Authors: Zheng, Xu; Hu, Xinzhe; Wang, Tingqi; Huang, Qian; Zhang, Lizong. Affiliations: University of Electronic Science and Technology of China, School of Computer Science and Engineering, China; School of Aeronautics and Astronautics, University of Electronic Science and Technology of China, China

Owing to its strong capability to capture interactions among objects and concepts, graph data is treated as an important type of information collected by smart devices in the Industrial Internet of Things, and the distributed training of graph learning models over these devices provides fundamental support for intelligent services and operations. However, different IoT devices may collect Non-IID graph data due to their different roles in the system, and suffer poor performance when only one unified instance of the model is trained. Besides, IoT devices usually belong to different communities in the Industrial Internet of Things, such that each community pursues both optimized and rational performance when joining the training process. Considering both challenges, this paper proposes a novel meta-computing framework for federated graph learning in the Industrial Internet of Things. A collaborative resource allocation task is formulated where devices belonging to different communities adopt limited resources to participate in the training of multiple instances either within or across communities. Two algorithms are introduced for adaptive and rational resource allocation based on whether devices are owned by single or multiple communities. Both algorithms provide guaranteed performance on efficiency and effectiveness, and the fairness among IoT devices is proved. Finally, extensive numerical results demonstrate the performance of the proposed framework in handling collaborative graph model learning within the Industrial Internet of Things. © 2014 IEEE.

Keywords: Federated learning
