Search Results - Inner Mongolia University Library

44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Zhao, Bocheng; Yang, Minghao; Tao, Jianhua. Affiliations: Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, USA; Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, USA

ISBN: (print) 9781479981311

Recovering the drawing order from a Chinese handwriting image is a challenging problem. Most English drawing order recovery (DOR) methods perform unsatisfactorily on Chinese. This paper proposes a novel image-to-sequence algorithm for the Chinese DOR problem. The proposed method uses two regression convolutional neural network (CNN) models to generate two corresponding pen-tip movement heat-maps. To estimate pen-tip movement for most of the normal states in the writing process, the algorithm analyzes these two heat-maps with a specifically designed framework. The drawing order is then restored through a simple iteration process based on the proposed framework. Experiments on a public online handwriting database show that our method achieves remarkable results on Chinese DOR tasks. In addition, on English tasks, our method compares favorably with state-of-the-art methods.

Keywords: Drawing order recovery; Chinese handwriting; Convolution neural network; image-to-sequence model

A Fetal Brain magnetic resonance Acquisition Numerical phantom (FaBiAN)

SCIENTIFIC REPORTS, 2022, Vol. 12, No. 1, pp. 1-21

Authors: Lajous, Helene; Roy, Christopher W.; Hilbert, Tom; de Dumast, Priscille; Tourbier, Sebastien; Aleman-Gomez, Yasser; Yerly, Jerome; Yu, Thomas; Kebiri, Hamza; Payette, Kelly; Ledoux, Jean-Baptiste; Meuli, Reto; Hagmann, Patric; Jakab, Andras; Dunet, Vincent; Koob, Meriam; Kober, Tobias; Stuber, Matthias; Cuadra, Meritxell Bach. Affiliations: Lausanne Univ Hosp CHUV, Dept Radiol, Lausanne, Switzerland; Univ Lausanne UNIL, Lausanne, Switzerland; CIBM Ctr Biomed Imaging, Lausanne, Switzerland; Siemens Healthcare, Adv Clin Imaging Technol ACIT, Lausanne, Switzerland; Ecole Polytech Fed Lausanne EPFL, Signal Proc Lab 5 LTS5, Lausanne, Switzerland; Univ Zurich, Univ Childrens Hosp Zurich, Ctr MR Res, Zurich, Switzerland; Univ Zurich, Neurosci Ctr Zurich, Zurich, Switzerland

Accurate characterization of in utero human brain maturation is critical as it involves complex and interconnected structural and functional processes that may influence health later in life. Magnetic resonance imaging is a powerful tool to investigate equivocal neurological patterns during fetal development. However, the number of acquisitions of satisfactory quality available in this cohort of sensitive subjects remains scarce, thus hindering the validation of advanced image processing techniques. Numerical phantoms can mitigate these limitations by providing a controlled environment with a known ground truth. In this work, we present FaBiAN, an open-source Fetal Brain magnetic resonance Acquisition Numerical phantom that simulates clinical T2-weighted fast spin echo sequences of the fetal brain. This unique tool is based on a general, flexible and realistic setup that includes stochastic fetal movements, thus providing images of the fetal brain throughout maturation comparable to clinical acquisitions. We demonstrate its value to evaluate the robustness and optimize the accuracy of an algorithm for super-resolution fetal brain magnetic resonance imaging from simulated motion-corrupted 2D low-resolution series compared to a synthetic high-resolution reference volume. We also show that the images generated can complement clinical datasets to support data-intensive deep learning methods for fetal brain tissue segmentation.

Pulse coupled neural network based MRI image enhancement using classical visual receptive field for smarter mobile healthcare

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2019, Vol. 10, No. 10, pp. 4059-4070

Authors: Nie, Rencan; He, Min; Cao, Jinde; Zhou, Dongming; Liang, Zifei. Affiliations: Yunnan Univ, Sch Informat Sci & Technol, Kunming 650091, Yunnan, Peoples R China; Southeast Univ, Sch Automat, Nanjing 210096, Jiangsu, Peoples R China; NYU Langone Hlth, Dept Radiol, New York, NY 10016, USA

With the rapid growth of medical big data, medical signal processing and measurement techniques are facing severe challenges. Enormous numbers of medical images are constantly generated by various health monitoring and sensing devices, such as ultrasound and MRI machines. Hence, based on the pulse coupled neural network (PCNN) and the classical visual receptive field (CVRF) with the difference of two Gaussians (DOG), a contrast enhancement method for MRI images is proposed to improve the accuracy of clinical diagnosis for smarter mobile healthcare. As a premise, the parameters of the DOG are estimated from the fundamentals of the CVRF; the PCNN parameters for image enhancement are then estimated with the help of the DOG. As a result, MRI images can be enhanced adaptively. Owing to the exponential decay of the dynamic threshold and the pulse coupling among neurons, the PCNN effectively enhances the contrast of low grey levels in MRI images. Moreover, because of the inhibitory effects of the inhibitory region in the CVRF, the PCNN also effectively preserves structures such as edges in the enhanced results. Experiments on several MRI images show that the proposed method performs better than other methods, improving contrast while preserving structures well.

Keywords: MRI image enhancement; Medical big data; Pulse coupled neural network; Classical visual receptive field
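
The difference of two Gaussians mentioned in this abstract can be sketched as a center-surround kernel. This is a minimal illustration only: the kernel size, the two sigma values, and the function names are assumptions, and the abstract's procedure for estimating the DOG parameters from the CVRF is not reproduced here.

```python
import math
from typing import List

def gaussian_kernel(size: int, sigma: float) -> List[List[float]]:
    """Return a size x size Gaussian kernel normalised to sum to 1."""
    half = size // 2
    raw = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)]
           for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]

def dog_kernel(size: int, sigma_center: float, sigma_surround: float) -> List[List[float]]:
    """Difference of Gaussians: narrow excitatory center minus wide inhibitory surround."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [[c - s for c, s in zip(row_c, row_s)]
            for row_c, row_s in zip(center, surround)]

# Hypothetical parameters chosen only for the demonstration.
kernel = dog_kernel(size=7, sigma_center=1.0, sigma_surround=2.0)
```

With these example values the kernel is positive at its center and negative toward the border, which is the inhibitory-surround behaviour the abstract attributes to the CVRF.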

Image Classification Based on Light Convolutional Neural Network Using Pulse Couple Neural Network

Computational Intelligence and Neuroscience, 2023, Vol. 2023, No. 1, Article 7371907

Authors: Maminiaina Alphonse Rafidison Hajasoa Malalatiana Ramafiarisona Paul Auguste Randriamitantsoa Sabine Harisoa Jacques Rafanantenana Faniriharisoa Maxime Rajaonarison Toky Lovasoa Patrick Rakotondrazaka Andry Harivony Rakotomihamina. Affiliation: Telecommunication-Automatic-Signal-Image-Research Laboratory/Doctoral School in Science and Technology of Engineering and Innovation/University of Antananarivo, Antananarivo 101, Madagascar.

Recently, most image classification studies rely on convolutional neural networks because these DL-based classification methods generally outperform other methodologies with higher accuracy. However, this type of deep learning network requires many parameters and has a complex structure with multiple convolutional and pooling layers depending on the objective. These layers process a large volume of data, which may affect processing time and performance. Therefore, this paper proposes a new method of image classification based on a light convolutional neural network. It replaces the feature extraction layers of a standard convolutional neural network with a single pulse coupled neural network by introducing the notion of foveation. This module provides the feature map of the input image, followed by data compression using the Discrete Wavelet Transform, which is an optional step depending on the information quantity of this signature. A fully connected neural network with six hidden layers classifies the image. With this technique, the computation time is reduced, and the network architecture is identical and simple regardless of the type of dataset. The number of parameters is smaller than that in current research. The proposed method was validated on different datasets, namely Caltech-101, Caltech-256, CIFAR-10, CIFAR-100, and ImageNet, and the accuracy reaches 92%, 90%, 99%, 94%, and 91%, respectively, which is better than previous related works.
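
The optional compression step described above can be pictured with a one-level discrete wavelet transform. The abstract does not name the wavelet or the number of decomposition levels, so the Haar wavelet, the single level, and the function names below are assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List, Tuple

def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One-level orthonormal Haar DWT of an even-length signal.

    Returns (approximation, detail) coefficient lists, each half the input length.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail

def compress(signature: List[float]) -> List[float]:
    """Assumed compression rule: keep only the approximation coefficients."""
    approx, _ = haar_dwt(signature)
    return approx

# Hypothetical feature signature used only for the demonstration.
compressed = compress([4.0, 2.0, 6.0, 6.0])
```

Keeping only the approximation coefficients halves the length of the signature at each level, which is one plausible way such a step could shrink the input to the fully connected classifier.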

Encoder-Decoder Convolutional Neural Network based Iris-Sclera Segmentation

27th Signal Processing and Communications Applications Conference (SIU)

Authors: Sahin, Gurkan; Susuz, Orkun. Affiliations: Cybersoft Ar Ge Birimi, Istanbul, Turkey; Yildiz Tekn Univ, Bilgisayar Muhendisligi Bolumu, Istanbul, Turkey

ISBN: (print) 9781728119045

Iris-sclera biometry is one of the features that yields high accuracy in user recognition and liveness detection systems. In this study, segmentation processes, the first stage of an iris-sclera user verification system, are considered. Traditional methods and convolutional neural network based deep learning methods are used for iris-sclera segmentation. The performance of the investigated methods is tested on two distinct eye image datasets (UBIRIS and self-collected data). Our experimental results show that deep learning based segmentation methods outperform conventional methods in terms of Dice score on both datasets.

Keywords: pixel-wise segmentation; deep learning; convolutional neural network
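
The Dice score used as the evaluation metric in this record is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The sketch below assumes binary masks stored as nested lists of 0/1 values; the function name and the handling of two empty masks are illustrative choices, not details from the paper.

```python
from typing import Sequence

def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice coefficient between two binary masks of the same shape."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t   # pixel is foreground in both masks
            pred_sum += p
            target_sum += t
    denominator = pred_sum + target_sum
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator

# Toy masks: 3 predicted pixels, 3 reference pixels, 2 in common.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(predicted, reference)  # 2 * 2 / (3 + 3)
```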

Decoding audio related region with temporal and structural enhancement for audio visual segmentation

Digital Signal Processing: A Review Journal, 2025, Vol. 165

Authors: Geng, Qingwei; Gu, Xiaodong. Affiliation: Department of Electronic Engineering, Fudan University, Shanghai 200438, China

Audio-Visual Segmentation (AVS) is a task that aims to predict pixel-level masks for sound-producing objects in videos. Recent advanced AVS methods primarily focus on cross-modal interaction while often neglecting the significance of temporal modeling and precise structural prediction. To address these challenges, we propose a novel AVS framework incorporating several innovations. Firstly, we propose a Temporal Enhancement Module (TEM) that effectively captures temporal relationships across frames. Secondly, we devise an Audio-Visual Decoder that utilizes audio information to selectively emphasize relevant visual regions during decoding. Besides, Structural Similarity (SSIM) is introduced into the loss function to preserve the structural integrity of predicted masks, thereby enhancing the coherence and precision of object boundaries. The extensive experimental results on multiple AVS datasets show that our proposed method outperforms current advanced AVS models and approaches from other tasks in terms of the mean Intersection over Union (mIoU) and F-score metrics. © 2025 Elsevier Inc.

Keywords: image segmentation
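
The mean Intersection over Union (mIoU) metric named in this abstract can be sketched as the IoU of each predicted mask against its reference, averaged over all frames. The data layout (nested lists of 0/1 values), the function names, and the treatment of empty masks are assumptions for illustration; the exact averaging protocol of the AVS benchmarks is not stated in the abstract.

```python
from typing import List

Mask = List[List[int]]

def iou(pred: Mask, target: Mask) -> float:
    """Intersection over Union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p & t
            union += p | t
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union

def mean_iou(preds: List[Mask], targets: List[Mask]) -> float:
    """Average the per-frame IoU over a non-empty list of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)

# Two toy frames: the first overlaps on 1 of 2 pixels, the second matches exactly.
frames_pred = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
frames_ref = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
result = mean_iou(frames_pred, frames_ref)  # (0.5 + 1.0) / 2
```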

Building segmentation of remote sensing images using deep neural networks and domain transform CRF

Conference on Image and Signal Processing for Remote Sensing XXV

Authors: Sun, Jingxi; Li, Weihong; Zhang, Yan; Gong, Weiguo. Affiliation: Chongqing Univ, Key Lab Optoelect Technol & Syst, Educ Minist, Chongqing, Peoples R China

ISBN: (digital) 9781510630147

ISBN: (print) 9781510630147

Automatic building segmentation from remote sensing images is critical in remote sensing image semantic segmentation. The success of deep neural networks has led to advances in using fully convolutional neural networks (FCN) to extract buildings from high-resolution images. However, the downsampling process inevitably leads to a loss of detail in the segmentation results. To solve this problem, some methods try to refine the results of the FCN by using probabilistic graphical models such as the fully connected CRF (Conditional Random Field). Nevertheless, many fully connected CRF based methods are too time-consuming and are not suitable for building segmentation tasks in some situations. In this paper, we propose a novel time-efficient end-to-end CRF model with the domain transform algorithm, called DT-CRF. In the proposed model, in order to accelerate message passing in the mean-field approximate inference algorithm, we take the edge maps as the joint image for DT-CRF and use the domain transform algorithm to calculate the pair-wise potential instead of the Gaussian kernel function. Meanwhile, we design a multi-task network that generates masks and edges simultaneously, which enables the DT-CRF to easily optimize the segmentation results using model information. The evaluation on remote sensing image datasets verifies the time and space efficiency of the proposed DT-CRF and demonstrates a distinct improvement.

Keywords: Remote Sensing Image; Convolutional Neural Networks; Building Segmentation; Conditional Random Field; Domain Transform

IMAGE DEMOSAICKING VIA CHROMINANCE IMAGES WITH PARALLEL CONVOLUTIONAL NEURAL NETWORKS

44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Yamaguchi, Takuro; Ikehara, Masaaki. Affiliation: Keio Univ, EEE Dept, Yokohama, Kanagawa 2238522, Japan

ISBN: (print) 9781479981311

Many conventional demosaicking methods are based on hand-crafted filters. However, these filters yield false colors in salient regions such as edges and textures. To acquire high-quality images, we focus on neural networks, which achieve high accuracy in many fields. However, few such methods exist in the demosaicking field. To adapt neural networks to demosaicking, we consider not only the network architecture but also the input. In this research, we use a Bayer image as the input to our networks. However, a different filter is needed for estimation at different color pixels, for example, the missing red value at a green pixel and that at a blue pixel. Therefore, we prepare four networks with downsampling operators classified by the color patterns in Bayer images. This downsampling operator not only identifies the color pattern but also reduces the calculation cost in each network by reducing the size of the feature maps. Besides, preparing multiple networks instead of a single deep network is suitable for today's parallel computing. Moreover, we use chrominance images rather than missing color images as the output. Compared to the results with missing color images as output, the results with chrominance images obtain higher accuracy. Experimental results show that our CNN-based approach produces high-quality restored images.

Keywords: Demosaicking; Convolutional Neural Network; Multi-network; Parallel Computing
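
One way to read the downsampling by color pattern described in this abstract is as a split of the Bayer mosaic into four half-resolution planes, one per position in the repeating 2 x 2 cell. The sketch below shows only that split. The function name is hypothetical, the color labels in the comments assume an RGGB layout, and how the paper's four networks actually consume these planes is not specified in the abstract.

```python
from typing import Dict, List, Tuple

def split_bayer(mosaic: List[List[int]]) -> Dict[Tuple[int, int], List[List[int]]]:
    """Split an H x W mosaic (H and W even) into four H/2 x W/2 planes.

    Planes are keyed by the (row offset, column offset) inside the 2 x 2 Bayer cell.
    Under an assumed RGGB layout: (0, 0) is red, (0, 1) and (1, 0) are green,
    and (1, 1) is blue.
    """
    if len(mosaic) % 2 or len(mosaic[0]) % 2:
        raise ValueError("mosaic height and width must be even")
    planes = {}
    for dy in (0, 1):
        for dx in (0, 1):
            # Take every second row starting at dy and every second column starting at dx.
            planes[(dy, dx)] = [row[dx::2] for row in mosaic[dy::2]]
    return planes

# Toy 4 x 4 mosaic with distinct values so each plane is easy to trace.
toy = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
planes = split_bayer(toy)
```

Each plane has a quarter of the pixels of the input, which matches the abstract's point that the downsampling reduces the size of the feature maps each network must process.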

DECOUPLING CATEGORY-WISE INDEPENDENCE AND RELEVANCE WITH SELF-ATTENTION FOR MULTI-LABEL IMAGE CLASSIFICATION

44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Liu, Luchen; Guo, Sheng; Huang, Weilin; Scott, Matthew R. Affiliations: Malong Technol, Shenzhen, Peoples R China; Shenzhen Malong Artificial Intelligence Res Ctr, Shenzhen, Peoples R China

ISBN: (print) 9781479981311

Multi-label image classification has achieved remarkable progress thanks to deep convolutional neural networks (CNNs). In this paper, we propose a Decouple Network (DecoupleNet), an end-to-end CNN-based framework able to trade off class-level feature independence and relevance during training. The proposed DecoupleNet is able to decouple category-wise independence and relevance with image-level supervision. We design a category-wise space-to-depth module with a spatial pooling strategy to exploit more meaningful convolutional features. These features are integrated with class-wise correlated information that is automatically learned via a new self-attention mechanism. We conduct extensive experiments on two large-scale benchmarks, MS-COCO and NUS-WIDE, where the proposed DecoupleNet obtains impressive performance that compares favorably against state-of-the-art methods on multi-label image classification.

Keywords: Multi-label image classification; self-attention; convolutional neural network
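
For readers unfamiliar with the term, the generic space-to-depth rearrangement moves each block x block spatial patch of a feature map into the channel dimension. The sketch below shows only this generic operation on a single-channel map stored as nested lists. It is not the paper's category-wise module, whose details are not given in the abstract, and the function name and channel ordering are assumptions.

```python
from typing import List

def space_to_depth(image: List[List[int]], block: int) -> List[List[List[int]]]:
    """Rearrange an H x W map into an (H/block) x (W/block) x (block*block) tensor.

    Each output position collects the values of one block x block patch,
    read row by row inside the patch.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    return [[[image[y * block + dy][x * block + dx]
              for dy in range(block)
              for dx in range(block)]
             for x in range(width // block)]
            for y in range(height // block)]

# Toy 4 x 4 map with distinct values so the rearrangement is easy to follow.
toy = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
rearranged = space_to_depth(toy, block=2)
```

The spatial resolution shrinks by the block factor while no values are discarded, which distinguishes the operation from pooling.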

ResCBAR-FusionNet: A Hybrid CNN-BiGRU-Attention Model for Human Activity Recognition

Signal, Image and Video Processing, 2025, Vol. 19, No. 8

Authors: Hassan, Ameer Ali R.; Feizi-Derakhshi, Mohammad-Reza. Affiliations: Department of Computer Engineering, University of Tabriz, Tabriz, Iran; College of Engineering, Uruk University, Baghdad, Iraq

Human activity recognition (HAR) is recognized as one of the most critical academic fields, focusing on the classification of human behaviors based on sensor data. It has gained increasing attention due to its broad applications in rehabilitation, medical treatment, health monitoring, and other domains. Despite the notable progress made in recent years, HAR still faces several challenges, particularly related to model complexity and the effective extraction of spatiotemporal features. To address these challenges, we propose a novel hybrid deep learning architecture called the ResCBAR-FusionNet model, which integrates 1D Convolutional Neural Networks (1DCNN), Bidirectional Gated Recurrent Units (BiGRU), the Convolutional Block Attention Module (CBAM), and Residual Connections. The proposed model leverages 1DCNN and BiGRU to extract comprehensive spatiotemporal features while enhancing the model's focus on the most critical information through attention mechanisms and maintaining training stability via residual connections. We evaluated our model on three datasets, KU-HAR, UCI-HAR, and WISDM, achieving classification accuracies of 99.03%, 97.40%, and 99.83%, while the F1-scores were 99.01%, 97.39%, and 99.68%, respectively. The proposed model achieved high accuracy across a variety of datasets and activity categories, outperforming recent state-of-the-art methods and demonstrating strong robustness and generalization capability. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.

Keywords: Convolutional neural networks
