360° depth estimation has been extensively studied because 360° images provide a full field of view of the surrounding environment as well as a detailed description of the entire scene. However, most well-studied convolutional neural networks (CNNs) for 360° depth estimation extract local features well but fail to capture rich global features from the panorama due to their fixed receptive field. PCformer, a parallel convolutional transformer network that combines the benefits of CNNs and transformers, is proposed for 360° depth estimation. Transformers are inherently suited to modelling long-range dependencies and extracting global features, so with PCformer both global dependencies and local spatial features can be efficiently captured. To fully incorporate global and local features, a dual attention fusion module is designed. In addition, a distortion-weighted loss function is designed to reduce the effect of distortion in panoramas. Extensive experiments demonstrate that the proposed method achieves competitive results against state-of-the-art methods on three benchmark datasets. Additional experiments also demonstrate that the proposed model has benefits in terms of model complexity and generalisation capability.
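A common way to weight losses on equirectangular panoramas is by the cosine of each row's latitude, since rows near the poles are over-sampled by the projection. The sketch below illustrates that idea; the abstract does not give PCformer's exact weighting, so the function name `distortion_weighted_l1` and the cosine scheme are illustrative assumptions.

```python
import numpy as np

def distortion_weighted_l1(pred, gt):
    """Latitude-weighted L1 depth loss for an equirectangular panorama.

    Each pixel row is weighted by cos(latitude), a common (assumed)
    choice that down-weights the distorted polar regions.
    """
    h, w = pred.shape
    # latitude of each row, from +pi/2 (top) to -pi/2 (bottom)
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi
    weight = np.cos(lat)[:, None]  # (h, 1), broadcast over columns
    return (weight * np.abs(pred - gt)).sum() / (weight.sum() * w)
```

For a uniform depth error the weighted loss reduces to the error magnitude itself, which makes the scale easy to interpret.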
Precise segmentation of lesions from dermoscopy images is an essential task in computer-aided surgical planning. Unlike current methods that often concentrate on attention mechanisms, we build a pixel-to-pixel segmentation model called the Graph reasoning and Inception Attention Network (GIAN). First, we propose a graph reasoning module that is data-dependent. The node matrix of the graph reasoning derives from the original image and the feature map, so our graph reasoning module can accurately capture global information in feature maps. Second, to avoid the information redundancy caused by channels, we propose the Inception attention module based on the original Inception module, which extracts local spatial semantic information from the features. The Inception attention module can select representative node graphs as feature guidance graphs for image segmentation. The spatial information extracted by multiple parallel convolution kernels ensures the stability of subsequent pixel classification. In this way, GIAN accounts for both the extraction of global information and the guidance of local information. The organic combination of the two modules provides a sound conceptual basis for the segmentation task. In particular, we extensively evaluate the proposed method on two challenging datasets. The experimental results show that GIAN obtains performance comparable to state-of-the-art deep learning models under the same environmental conditions.
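The general pattern behind data-dependent graph reasoning is to project pixels to a small set of graph nodes, propagate information along node affinities, and project back. The sketch below shows that pattern only; GIAN's actual node matrix, adjacency, and learned projections are not specified in the abstract, so every detail here (the stand-in projection, the number of nodes) is an assumption.

```python
import numpy as np

def graph_reasoning(feat, num_nodes=4):
    """Minimal graph-reasoning sketch over a flattened feature map.

    feat: (C, N) features with N = H*W pixels. A data-dependent
    assignment maps pixels to nodes, reasoning happens in node space
    via a normalized affinity matrix, and the result is projected
    back to pixel space with a residual connection.
    """
    # data-dependent pixel-to-node assignment (softmax over nodes);
    # the first rows of feat stand in for a learned 1x1 projection
    logits = feat[:num_nodes]
    assign = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)  # (K, N)
    nodes = feat @ assign.T                              # (C, K) node features
    adj = nodes.T @ nodes                                # (K, K) node affinity
    adj = adj / np.abs(adj).sum(axis=1, keepdims=True)   # row-normalize
    reasoned = nodes @ adj.T                             # propagate along edges
    return feat + reasoned @ assign                      # back-project, residual
```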
ISBN: (print) 9798350343557
Medical imaging techniques are frequently used for tumor detection and diagnosis. Segmentation of tumors from medical images is a popular field of study. To this end, various deep neural network based methods have been introduced for segmenting tumor regions. Within the scope of this study, we first collected a dataset consisting of thorax CT (Computed Tomography) images with two class labels, benign and malignant, with the help of chest radiologists and chest disease clinicians. Then, we trained four different deep neural network based segmentation methods, Mask R-CNN, YOLACT, SOLOv2, and U-Net, and compared their accuracies. Finally, we conducted experiments to show which CT image channels are more useful for segmentation. Among the tested methods, the YOLACT algorithm returned the best results in classifying tumors, and U-Net yielded the best segmentation masks.
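Segmentation masks such as those produced by U-Net are typically compared with overlap metrics; the Dice coefficient below is one standard choice (the study's exact evaluation protocol is not stated in the abstract, so this metric is an assumption).

```python
import numpy as np

def dice(mask_a, mask_b, eps=1e-7):
    """Dice overlap between two binary segmentation masks.

    Returns 1.0 for identical masks and 0.0 for disjoint ones;
    eps guards against division by zero when both masks are empty.
    """
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps)
```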
The denoising task for low-dose CT (LDCT) images is a highly complex and uncertain inverse problem. Previous studies have primarily relied on convolutional neural networks (CNNs) to reduce noise by learning the mapping from LDCT images to normal-dose CT (NDCT) images. However, simply increasing the network depth is not an optimal choice because of the limited performance improvement and significant computational cost. In contrast, integrating prior knowledge of images with a model to assist image reconstruction is a more efficient approach. This study proposes a new framework for denoising LDCT images, named the Noise-Optimized Edge Feature Guided Network (NEFGN). The task of NEFGN is to integrate a noise optimization model based on adaptive weighted total variation, an edge detection model guided by Gaussian curvature, and image reconstruction into an end-to-end CNN framework. To achieve this goal, the noise optimization model is first constructed by learning the parameters of the adaptive weighted total variation regularization model to approximate the noise level of the NDCT image. The edge detection network is constructed using Gaussian curvature, predicting clear edges directly from the noisy image. Finally, under the guidance of the noise optimization model and the edge detail model, NEFGN is more capable of suppressing artifact noise, demonstrates good accuracy and robustness, and can restore finer details. Numerous experiments demonstrate that the NEFGN denoising framework effectively restores the structure of LDCT images with limited image detail and outperforms other methods.
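The core idea of weighted total variation is to smooth noise while letting a per-pixel weight map protect edges. The step below is an illustrative stand-in for NEFGN's learned adaptive-weight model: the discretization (forward differences, roll-based divergence) and the step size are assumptions, not the paper's formulation.

```python
import numpy as np

def weighted_tv_denoise_step(img, weight, step=0.1):
    """One gradient-descent step on a weighted total-variation energy.

    weight: per-pixel map; small weights near edges preserve detail,
    large weights in flat regions smooth noise more aggressively.
    A constant image is a fixed point (zero gradient, zero update).
    """
    # forward differences with replicated last row/column
    dx = np.diff(img, axis=1, append=img[:, -1:])
    dy = np.diff(img, axis=0, append=img[-1:, :])
    # normalized, weighted gradient field
    mag = np.sqrt(dx**2 + dy**2) + 1e-8
    px, py = weight * dx / mag, weight * dy / mag
    # divergence via backward differences (roll wraps at the border,
    # an acceptable approximation for this sketch)
    div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
    return img + step * div
```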
Biometric systems play a crucial role in securely recognizing an individual's identity based on physical and behavioral traits. Among these methods, finger vein recognition stands out due to the veins' unique position beneath the skin, providing heightened security and individual distinctiveness that cannot be easily manipulated. In our study, we propose a robust biometric recognition system that combines a lightweight architecture with depth-wise separable convolutions and residual blocks, along with a machine-learning algorithm. This system employs two distinct learning strategies: single-instance and multi-instance. Using these strategies demonstrates the benefits of combining largely independent information. Initially, we address the shading problem in finger vein images by applying histogram equalization to enhance their quality. After that, we extract the features using a MobileNetV2 model that has been fine-tuned for this task. Finally, our system uses a support vector machine (SVM) to classify the finger vein features into their classes. Our experiments are conducted on two widely recognized datasets, SDUMLA and FV-USM; the results are promising, with excellent rank-one identification rates of 99.57% and 99.90%, respectively.
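The preprocessing step described above, histogram equalization, spreads an image's intensity distribution to counteract shading. A minimal 8-bit implementation is sketched below (the MobileNetV2 feature extraction and SVM stages are omitted; this is a generic equalizer, not the paper's exact pipeline).

```python
import numpy as np

def hist_equalize(img):
    """Histogram equalization for an 8-bit grayscale image.

    Builds a lookup table from the cumulative histogram so that
    output intensities span the full 0-255 range approximately
    uniformly, which mitigates shading in vein images.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first non-empty bin
    scale = max(img.size - cdf_min, 1)  # guard for constant images
    lut = np.round(np.clip(cdf - cdf_min, 0, None) / scale * 255)
    return lut.astype(np.uint8)[img]
```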
Recently, Video Coding for Machines (VCM) has gained increasing attention due to its role in machine vision tasks. As a crucial track in VCM, feature compression preserves and transmits critical feature information for machine vision. Most existing studies apply dimensionality reduction to the raw multi-scale feature before compression. However, feature sparsity remains insufficiently considered when removing redundancy from compressed features. In this letter, we propose a novel framework for image feature compression for machines, where the multi-scale feature is hierarchically transformed into a sparse representation for compression. The multi-scale feature is first fused by convolutional neural networks and an attention mechanism. To introduce sparsity into the fused feature, informative channels are identified by a channel-wise binary mask whose activated elements are sampled from an importance distribution over channels learned from the feature content. Then, the fused feature is masked to generate a sparse representation for compression. Experiments conducted on two machine tasks show significant improvements of our model over state-of-the-art methods.
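The channel-masking idea can be illustrated with a deterministic top-k selection over importance scores; note that the letter samples the mask from a learned importance distribution, so the top-k rule, function name, and keep ratio below are simplifying assumptions for illustration only.

```python
import numpy as np

def sparse_channel_mask(feat, importance, keep_ratio=0.5):
    """Content-adaptive channel sparsification sketch.

    feat: (C, H, W) fused feature; importance: (C,) per-channel scores.
    The highest-scoring channels are kept and the rest zeroed, yielding
    a sparse representation for the downstream entropy coder.
    """
    c = feat.shape[0]
    k = max(1, int(round(keep_ratio * c)))
    keep = np.argsort(importance)[-k:]     # most informative channels
    mask = np.zeros(c, dtype=feat.dtype)
    mask[keep] = 1.0
    return feat * mask[:, None, None], mask
```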
Background: Spiking Neural Networks (SNNs) hold significant potential for brain simulation and temporal data processing. While recent research has focused on developing neuron models and leveraging temporal dynamics to enhance performance, explicit studies on neuromorphic datasets are lacking. This research aims to address this gap by exploring temporal information dynamics in SNNs. New Method: To quantify the dynamics of temporal information during training, this study measures the Fisher information in SNNs trained on neuromorphic datasets. The information centroid is calculated to analyze the influence of key factors, such as the parameter k, on temporal information dynamics. Results: Experimental results reveal that the information centroid exhibits two distinct behaviors: stability and fluctuation. This study terms this phenomenon the Stable Information Centroid (SIC), which is closely related to the parameter k. Based on these findings, we propose the Fast Temporal Efficient Training (FTET) algorithm. Comparison with Existing Methods: First, the proposed method does not require additional complex training techniques. Second, it can reduce the computational load by 30% in the final 50 epochs. A drawback, however, is slow convergence during the early stages of training. Conclusion: This study reveals that the learning processes of SNNs vary across datasets, providing new insights into the mechanisms of human brain learning. A limitation is the restricted sample size, covering only a few datasets and image classification tasks. The code is available at https://***/gtii123/fasttemporal-efficient-training.
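An information centroid over timesteps can be read as the information-weighted mean time index: if Fisher information concentrates at late timesteps, the centroid shifts late. The abstract does not give the paper's exact formula, so the weighted-mean definition below is a paraphrased assumption.

```python
import numpy as np

def information_centroid(info_per_step):
    """Information-weighted mean timestep index.

    info_per_step: per-timestep information measure (e.g. Fisher
    information summed over parameters). A uniform profile gives the
    middle index; a late-concentrated profile gives a late centroid.
    """
    info = np.asarray(info_per_step, dtype=float)
    t = np.arange(len(info))
    return (t * info).sum() / info.sum()
```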
ISBN: (print) 9798350349405; 9798350349399
Explainable methods for understanding deep neural networks are currently employed for many visual tasks and provide valuable insights into their decisions. While post-hoc visual explanations offer easily understandable human cues behind neural networks' decision-making processes, comparing their outcomes remains challenging. Furthermore, balancing the performance-explainability trade-off can be time-consuming and requires deep domain knowledge. In this regard, we propose a novel auxiliary module, built upon convolutional encoders, which acts on the final layers of convolutional neural networks (CNNs) to learn orthogonal feature maps with more discriminative and explainable power. This module is trained via a disentangle loss that specifically aims to decouple the object from the background in the input image. To quantitatively assess its impact on standard CNNs and compare the quality of the resulting visual explanations, we employ metrics specifically designed for semantic segmentation tasks. These metrics rely on bounding-box annotations that may accompany image classification (or recognition) datasets, allowing us to compare ground-truth and predicted regions. Finally, we explore the impact of various self-supervised pre-training strategies, given their positive influence on vision tasks, and assess their effectiveness on our considered metrics.
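One plausible ingredient of a loss that encourages orthogonal feature maps is a penalty on off-diagonal entries of the normalized Gram matrix. The sketch below shows that penalty only; the paper's disentangle loss also separates object from background, which is not modeled here, so treat this as an assumed simplification.

```python
import numpy as np

def orthogonality_loss(feats):
    """Mean squared off-diagonal cosine similarity between feature maps.

    feats: (K, D) flattened feature maps. Zero when the maps are
    mutually orthogonal; approaches 1 when they are all identical.
    """
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    gram = f @ f.T
    off_diag = gram - np.eye(len(f))
    return (off_diag**2).sum() / (len(f)**2 - len(f))
```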
Convolutional neural networks and Transformer methods have been widely applied in medical image segmentation and have shown tremendous potential. However, existing methods still face challenges in effectively integrating local semantic information and long-range dependencies, leading to suboptimal performance and reduced efficiency. Moreover, due to complex deformations and low-contrast blurry edges, the recognition of small organs is also unsatisfactory. To address these issues, we propose DGCA-Net. First, we design a Dual-axis Generalized Cross Attention (DGCA) module in the encoding phase to effectively integrate long- and short-range semantic relationships. DGCA consists of two consecutive attention mechanisms based on axial features, namely Generalized Channel Attention (GCA) and Generalized Efficient Attention (GEA), which enhance the recognition of large organs with long-range dependencies through axis-based generalized features and more efficient computation. Second, we design a boundary-constrained decoder comprising an Inter-scale Boundary Detector (IBD) and Boundary Attention Guidance (BAG) to better identify small organs with blurry boundaries. The IBD extracts boundary information of foreground objects from multi-scale features, while the BAG leverages enhanced boundary features to guide the fusion of encoder features and decoder contexts, complementing fine spatial and edge details. DGCA-Net achieves state-of-the-art performance on four public datasets covering different modalities and segmentation regions (Synapse, FLARE2023, ACDC, and MoNuSeg), demonstrating its superiority, transferability, and strong generalization capability. Our code: ***/zzm3zz/DGCA-Net.
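The kind of boundary signal an IBD-style detector targets can be illustrated by extracting the foreground boundary from a binary map with a morphological-style rule; this is a generic sketch, not DGCA-Net's learned module, and the 4-neighbour rule (with wrap-around at the array border via `np.roll`) is an assumption.

```python
import numpy as np

def boundary_map(mask):
    """Foreground boundary of a binary segmentation map.

    A pixel is boundary if it is foreground and at least one of its
    four neighbours is background (borders wrap, which is harmless
    when the mask is zero-padded at the edges).
    """
    m = mask.astype(bool)
    shifted = [np.roll(m, s, axis=a) for a in (0, 1) for s in (1, -1)]
    interior = m & np.logical_and.reduce(shifted)
    return m & ~interior
```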
Quadratic time-frequency (TF) methods are commonly used for the analysis, modeling, and classification of time-varying non-stationary electroencephalogram (EEG) signals. Commonly employed TF methods suffer from an inherent tradeoff between cross-term suppression and preservation of auto-terms. In this paper, we propose a new convolutional neural network (CNN) based approach to enhancing TF images. The proposed method trains a CNN using the Wigner-Ville distribution as the input image and the ideal time-frequency distribution, with the total concentration of signal energy along the instantaneous frequency (IF) curves, as the output image. The results show significant improvement over other state-of-the-art TF enhancement methods. The code for reproducing the results is available on GitHub at https://***/nabeelalikhan1/CNN-based-TF-image-enhancement.
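The CNN's input representation, the discrete Wigner-Ville distribution, is built from the instantaneous autocorrelation x(t+τ)·x*(t−τ) followed by an FFT over the lag τ. The basic implementation below is a textbook sketch (no windowing or analytic-signal computation), not the paper's exact preprocessing.

```python
import numpy as np

def wigner_ville(x):
    """Discrete Wigner-Ville distribution of a (complex) signal.

    Returns an (n_freq, n_time) real-valued TF image. Lag support
    shrinks near the signal edges, as in the standard discrete form.
    """
    n = len(x)
    wvd = np.zeros((n, n))
    for t in range(n):
        taumax = min(t, n - 1 - t)
        tau = np.arange(-taumax, taumax + 1)
        # instantaneous autocorrelation x(t+tau) * conj(x(t-tau)),
        # placed at indices tau mod n for the FFT over lag
        acf = np.zeros(n, dtype=complex)
        acf[tau % n] = x[t + tau] * np.conj(x[t - tau])
        wvd[:, t] = np.real(np.fft.fft(acf))
    return wvd
```

A useful sanity check is the marginal property: summing each column over frequency recovers n·|x(t)|².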