Search Results - Inner Mongolia University Library

Hardware acceleration of YOLOv7-tiny using high-level synthesis tools

Journal of Real-Time Image Processing, 2023, vol. 20, no. 4, p. 75

Authors: Hosseiny, Adib; Jahanirad, Hadi. Affiliation: Univ Kurdistan Dept Elect & Commun Engn Sanandaj 90210 Iran

FPGAs have emerged as a promising platform for implementing neural networks due to their reconfigurability, parallelism, and low power consumption. Nonetheless, designing and optimizing FPGA-based neural network accelerators with register transfer level (RTL) languages is a complex and time-consuming task. High-level synthesis (HLS) tools provide a higher level of abstraction for FPGA design, enabling designers to concentrate on top-level design aspects, such as algorithms, rather than low-level hardware implementation details. One of the state-of-the-art object detection network families is the You Only Look Once (YOLO) series, which combines several neural network techniques, including cross-stage connections and feature extraction techniques such as pyramid networks. In this paper, we propose a method for implementing the YOLOv7-tiny network on FPGAs using HLS tools. We present a comprehensive analysis of the performance and resource utilization of FPGA-based neural network accelerators. Our methods show excellent results for real-time application requirements such as latency. Specifically, our work reduces the usage of digital signal processing (DSP) units by 90% and saves up to 60% of flip-flops compared to state-of-the-art designs, while achieving competitive usage of block RAM (BRAM) and look-up tables. Additionally, the achieved design latency of 15 ms is well suited to real-time applications. We also propose a method for BRAM utilization and off-chip memory access.

Keywords: High level synthesis; Convolutional neural network; Object detection; FPGA; YOLO

Extreme Low Bitrate Image Compression System for Mobile Deployment

26th International Workshop on Multimedia Signal Processing

Authors: Wu, Junqi; Duan, Wenhong; Ma, Xianping; Chang, Jianhui; Wang, Shanshe; Ma, Siwei; Jia, Chuanmin. Affiliations: Peking Univ Beijing Peoples R China; Shanghai Jiao Tong Univ Shanghai Peoples R China; Chinese Univ Hong Kong Shenzhen Peoples R China

ISBN: (print) 9798350387261; 9798350387254

End-to-end image compression has achieved satisfactory results in recent studies. However, existing methods suffer from the high complexity of neural network computation and cannot be deployed directly on mobile devices because of limited computing ability and storage. Therefore, considering the resource and computing constraints of mobile devices, we make a trade-off in this paper between rate-distortion (R-D) performance, inference time, and model complexity. We then design a novel lightweight perceptual image compression framework to alleviate the storage and complexity burden on mobile devices. Moreover, we design a hardware-friendly deployment scheme to apply the proposed compression framework on high-end mobile devices, which achieves efficient image compression. Based on the above structures, we propose the first mobile system that achieves image compression on mobile devices. The supplementary material of our system demo is at https://***/documents/extreme-lowbitrate-image-compression-system-mobile-deployment.

Keywords: image compression; low bitrate; mobile devices; hardware-friendly

Self-Supervised Spontaneous Latent-Based Facial Expression Sequence Generation

IEEE Open Journal of Signal Processing, 2023, vol. 4, pp. 304-312

Authors: Yap, Chuin Hong; Yap, Moi Hoon; Davison, Adrian K.; Cunningham, Ryan. Affiliation: Manchester Metropolitan Univ Dept Comp & Math Manchester M15 6BH England

In this article, we investigate the spontaneity issue in facial expression sequence generation. Current leading methods in the field commonly rely on manually adjusted conditional variables to direct the model to generate a specific class of expression. We propose a neural network-based method which uses Gaussian noise to model spontaneity in the generation process, removing the need for manual control of conditional generation variables. Our model takes two sequential images as input, with additive noise, and produces the next image in the sequence. We trained two types of models: single-expression and mixed-expression. With single-expression models, unique facial movements of a certain emotion class can be generated; with mixed-expression models, fully spontaneous expression sequence generation can be achieved. We compared our method to current leading generation methods on a variety of publicly available datasets. Initial qualitative results show our method produces visually more realistic expressions and facial action unit (AU) trajectories; initial quantitative results using image quality metrics (SSIM and NIQE) show the quality of our generated images is higher. Our approach and results are novel in the field of facial expression generation, with potential wider applications to other sequence generation tasks.

Keywords: Faces; Mathematical models; Markov processes; Gold; Training; Task analysis; Image sequences; Affective computing; Artificial neural networks; Self-supervised learning

Noise Identification for Data-augmented Physics-based State-Space Models

2024 International Workshop on Signal Processing Systems

Authors: Dunik, J.; Straka, O.; Kost, O.; Tang, S.; Imbiriba, T.; Closas, P. Affiliations: Univ West Bohemia Dept Cybernet Univ 8 Plzen 30614 Czech Republic; Northeastern Univ Dept Elect & Comp Engn 360 Huntington Ave Boston MA 02115 USA

ISBN: (print) 9798350373769; 9798350373752

This paper deals with the state-space modelling of nonlinear stochastic dynamic systems. The emphasis is laid on the emerging area of data-augmented physics-based modelling of the state dynamics, which combines the benefits of the physics-driven and data-based identified models. As the augmented state-space models depend on the measured data, modelling the state noise properties becomes challenging. This paper proposes and validates a concept for the state noise identification of nonlinear data-augmented state equation using the maximum likelihood and correlation-based methods. The numerical simulation of a tracking scenario shows significant improvement of the state estimation accuracy and consistency when using the identified noise model.

Keywords: State estimation; Neural networks; Correlation method; Maximum likelihood method

ITERATIVELY PRECONDITIONED GUIDANCE OF DENOISING (DIFFUSION) MODELS FOR IMAGE RESTORATION

49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Author: Tirer, Tom. Affiliation: Bar Ilan Univ Fac Engn Ramat Gan Israel

ISBN: (print) 9798350344868; 9798350344851

Training deep neural networks has become a common approach for addressing image restoration problems. An alternative to training a "task-specific" network for each observation model is to use pretrained deep denoisers for imposing only the signal's prior within iterative algorithms, without additional training. Recently, this approach has become increasingly popular with the rise of diffusion/score-based generative models, whose core is iterative denoising. Using denoisers for general-purpose restoration requires guiding the iterations to ensure agreement of the signal with the observations. In low-noise settings, guidance that is based on back-projection (BP) has been shown to be a promising strategy (used recently in the context of diffusion models also under the names "pseudoinverse" or "range/null-space" guidance). However, the presence of noise in the observations hinders the gains from this approach. In this paper, we propose a novel guidance technique, based on preconditioning, that allows traversing from BP-based guidance to least-squares-based guidance along the restoration scheme. The proposed approach is robust to noise while still having a much simpler implementation than alternative methods (e.g., no SVD is required). We demonstrate its advantages for image deblurring and super-resolution.

Keywords: image restoration; iterative denoising; plug-and-play denoisers; diffusion models; back-projection
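
For orientation, the two data-fidelity guidance steps that the abstract contrasts are commonly written as below. This is a sketch in generic notation, not the paper's preconditioned update: the observation model y = Ax + e, the current denoised estimate \tilde{x}, and the step size \mu are assumed symbols that do not appear in this record.

```latex
% Assumed observation model: y = A x + e  (e is observation noise)

% Least-squares (LS) guidance step:
\hat{x}_{\mathrm{LS}} = \tilde{x} - \mu \, A^{\top} \left( A\tilde{x} - y \right)

% Back-projection (BP) guidance step, with A^{\dagger} the pseudoinverse of A:
\hat{x}_{\mathrm{BP}} = \tilde{x} - \mu \, A^{\dagger} \left( A\tilde{x} - y \right)
```

With \mu = 1 and A of full row rank, the BP step gives A\hat{x}_{\mathrm{BP}} = y exactly, which is one way to see why it works well in low-noise settings and degrades once y itself is noisy.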

A one-shot face detection and recognition using deep learning method for access control system

Signal, Image and Video Processing, 2023, vol. 17, no. 4, pp. 1571-1579

Authors: Tsai, Tsung-Han; Tsai, Chi-En; Chi, Po-Ting. Affiliation: Natl Cent Univ Dept Elect Engn 300 Jung Da Rd Zhongli 320 Taiwan

In this paper, we propose a face detection and recognition system using a deep learning method. It can be used as an access control system that performs face detection and recognition in real time. Our goal is to achieve one-shot recognition instead of the traditional two-step methods. We use SSD as the main model for face detection and VGG-Face as the main model for face recognition. We train the deep learning models on collected datasets. Moreover, we use techniques such as data augmentation, image preprocessing, and image post-processing to train robust face detection and recognition subsystems. We use continuous frames as input to avoid false-positive cases and to keep wrong results out of the system output. A real demonstration system is constructed to identify the laboratory members. We use 1280 x 960 resolution video for experimental testing and achieve about 30 fps under GPU acceleration.

Keywords: Face detection; Face recognition; Deep neural network; Machine learning; Artificial intelligence
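
The abstract's use of continuous frames to suppress false positives can be illustrated with a sliding-window vote over per-frame predictions. This is only a minimal sketch of one such scheme: the class name `TemporalVoter`, the window size, and the vote threshold are hypothetical and are not taken from the paper.

```python
from collections import Counter, deque


class TemporalVoter:
    """Confirm an identity only when it dominates the most recent frames."""

    def __init__(self, window=5, min_votes=4):
        # Assumed parameters: look at the last `window` frames and require
        # at least `min_votes` of them to agree before reporting a match.
        self.min_votes = min_votes
        self.history = deque(maxlen=window)

    def update(self, label):
        """Add one per-frame prediction (None means no face recognized).

        Returns the confirmed identity, or None if no label has enough votes.
        """
        self.history.append(label)
        votes = Counter(item for item in self.history if item is not None)
        if not votes:
            return None
        best, count = votes.most_common(1)[0]
        return best if count >= self.min_votes else None


voter = TemporalVoter(window=5, min_votes=4)
# A single spurious "bob" frame in the middle of a run of "alice" frames.
frames = ["alice", "alice", "bob", "alice", "alice", "alice"]
decisions = [voter.update(frame) for frame in frames]
print(decisions)
```

The isolated "bob" prediction never reaches the vote threshold, so it is never reported, while "alice" is confirmed once four of the last five frames agree.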

Unified Dictionary Training Approach for High-Resolution Image Enhancement

4th International Conference on Artificial Intelligence and Signal Processing

Authors: Srinadh, Kannuru; David, Valaparla; Lakshmi, Goruputi; Ramakrishna, Karlapudi. Affiliations: NIT Rourkela Dept ECE Rourkela Odisha India; Anurag Engn Coll A Dept ECE Kodada Telangana India; NIT Rourkela Dept Math Rourkela Odisha India

ISBN: (print) 9798350350661; 9798350350654

This research presents a new unified dictionary training super-resolution (UDTSR) approach to single- and multi-image super-resolution that uses a bilevel optimization framework and patchwise sparse recovery. By employing interconnected dictionaries to bridge the two spaces of image patches, we ensure that a sparse representation of a low-resolution (LR) image patch can accurately reconstruct its corresponding high-resolution (HR) patch. To enable efficient stochastic gradient descent, the gradient is computed by implicit differentiation. Furthermore, by using a neural network model for rapid sparse inference and selective processing of visually essential areas, we can improve performance in real-world applications almost tenfold. Also, by discovering the fundamental relationships between different data modalities, our approach overcomes the difficulty of dealing with panchromatic and multispectral images. For example, using shared and individual sparse representations, we describe a data model that can detect similarities and differences in multimodal signals. Single image super-resolution (SISR) and multi-frame super-resolution (MFSR) have advanced separately, with minimal research on their ideal combination. We propose a novel UDTSR analysis using an iterative shrinkage and thresholding algorithm. Our simulations of many combinations of SISR and MFSR, such as x2, x3, and x4, confirm our theory quantitatively and qualitatively.

Keywords: Dictionary learning; single image; multi-frame image; super-resolution; sparse representations
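
The iterative shrinkage and thresholding algorithm mentioned in the abstract can be sketched in its textbook form (ISTA) for the sparse coding problem min_x 0.5*||Dx - y||^2 + lam*||x||_1. This is a generic illustration in plain Python, not the UDTSR method itself; the names `ista`, `D`, `lam`, and `step` are assumptions, and the step size is assumed to be at most 1/L, where L is the largest eigenvalue of D^T D.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(D, y, lam, step, n_iter=100):
    """Textbook ISTA for 0.5 * ||D x - y||^2 + lam * ||x||_1.

    D is a dictionary given as a list of rows, y is the observed patch.
    """
    m, n = len(D), len(D[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = D x - y
        r = [sum(D[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by element-wise shrinkage
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# Toy check with an identity dictionary, where the minimizer is simply
# the soft-thresholded observation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
code = ista(identity, [3.0, -0.5, 1.5], lam=1.0, step=1.0, n_iter=10)
print(code)
```

Entries of the observation whose magnitude is below `lam` are set to zero, which is the mechanism that produces a sparse code.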

Thermal image super-resolution via multi-path residual attention network

Signal, Image and Video Processing, 2023, vol. 17, no. 5, pp. 2073-2081

Authors: Zhang, Haikun; Hu, Yueli; Yan, Ming; Ma, Bin. Affiliation: Shanghai Univ Sch Mechatron Engn & Automat Shanghai 200444 Peoples R China

Convolutional Neural Network (CNN)-based Single-Image Super-Resolution (SISR) methods for RGB images have flourished rapidly. However, CNN-based SR methods for thermal images are rarely studied. The performance of existing deep SR methods is limited by the narrow receptive field of a single small convolution kernel (e.g., 3 x 3). In this paper, we propose MPRANet, a deep network for thermal image SISR that combines multi-path residual and attention blocks. Specifically, a newly designed multi-path residual block, built from parallel depth-wise separable convolution paths with convolution kernels of different sizes, is used to extract fine local features and large global features, effectively enhancing the capacity of MPRANet. Meanwhile, the attention block is formed by cascading channel attention and spatial attention modules to re-scale features in the channel and spatial dimensions sequentially. A Mixture of Data Augmentation (MoDA) strategy is proposed to improve MPRANet performance without increasing the computational burden. MoDA makes full use of multiple pixel-domain data augmentation methods to improve the generalization of MPRANet. Qualitative and quantitative experiments on three test datasets show that the proposed MPRANet has clear advantages over state-of-the-art thermal and RGB image SR methods in preserving details such as edges and textures.

Keywords: Super-resolution; Thermal image; Data augmentation; Attention mechanism; Multi-path learning

Automated magnetocardiography classification using a deformable convolutional block attention module

Biomedical Signal Processing and Control, 2025, vol. 105

Authors: Wang, Ruizhe; Pang, Jiaojiao; Han, Xiaole; Xiang, Min; Ning, Xiaolin. Affiliations: Beihang Univ Sch Instrumentat & Optoelect Engn Key Lab Ultraweak Magnet Field Measurement Technol Minist Educ Beijing 100191 Peoples R China; Beihang Univ Hangzhou Innovat Inst Zhejiang Prov Key Lab Ultraweak Magnet Field Space Hangzhou 310051 Zhejiang Peoples R China; Beihang Univ Hangzhou Inst Natl Extremely Weak Magnet Field Inf Hangzhou 310028 Zhejiang Peoples R China; Shandong Univ Inst Magnet Field Free Med & Funct Imaging Shandong Key Lab Magnet Field Free Med & Funct Ima Jinan Peoples R China; Shandong Univ Shandong Prov Clin Res Ctr Emergency & Crit Care M Dept Emergency Med Qilu Hosp Jinan Peoples R China; Shandong Univ Natl Innovat Platform Ind Educ Intearat Med Engn I Jinan Peoples R China; Hefei Natl Lab Hefei 230088 Anhui Peoples R China

Objective: This study developed a fast and accurate automated method for magnetocardiography (MCG) classification. Approach: We propose a deformable convolutional block attention module (DCBAM)-based method for classifying coronary artery disease (CAD) using MCG. After preprocessing, the raw MCG data were segmented into individual heartbeat segments and encoded into image representations using the Hilbert curve to convert the temporal features into spatial image features. We combined DCBAM with convolutional neural networks (CNNs) for MCG classification. DCBAM incorporated a deformable convolutional architecture along with temporal and spatial attention mechanisms to capture representative and correlative features of the image representation MCG along the temporal and spatial multichannel dimensions. We performed ablation experiments to evaluate the rationality and validity of the proposed model structure. Additionally, we performed an interpretability analysis to investigate the model's region of interest for CAD diagnosis. Results: The proposed method achieved an average accuracy of 93.57%, precision of 94.71%, sensitivity of 92.56%, specificity of 94.68%, and average F1-score of 93.60%. In contrast to existing methods, our proposed model achieved superior diagnostic classification results in MCG with fewer parameters. Significance: Integrating DCBAM with image-representation MCG establishes a novel feature extraction method that enhances the clinical utility of MCG and effectively addresses long-range dependencies and spatiotemporal inconsistencies in time-series signal analysis.

Keywords: Magnetocardiography; Coronary artery disease; Convolutional neural network; Attention mechanism; Deformable convolutional block attention module
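
The abstract's step of encoding a heartbeat segment into an image with the Hilbert curve can be sketched as follows. This is a minimal illustration of the standard index-to-coordinate Hilbert mapping, not the authors' preprocessing pipeline: the function names are hypothetical, and the segment is assumed to have already been resampled to exactly 4**order samples.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def segment_to_image(segment, order):
    """Place a 1-D segment of length 4**order onto a 2-D grid along the curve."""
    n = 1 << order
    if len(segment) != n * n:
        raise ValueError("segment length must equal 4**order")
    image = [[0.0] * n for _ in range(n)]
    for d, value in enumerate(segment):
        x, y = hilbert_d2xy(order, d)
        image[y][x] = value
    return image


# Toy usage: 16 consecutive samples laid out on a 4 x 4 grid.
image = segment_to_image(list(range(16)), order=2)
for row in image:
    print(row)
```

Samples that are adjacent in time always land in neighboring pixels, which is the property that lets a 2-D convolution see local temporal structure.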

Optimized interpretable generalized additive neural networks based malicious activity detection with video surveillance

Signal, Image and Video Processing, 2025, vol. 19, no. 7, pp. 1-13

Authors: Lokesh, K.; Baskar, M. Affiliations: SRM Inst Sci & Technol Coll Engn & Technol Sch Comp Dept Comp Sci & Engn Chennai 606203 Tamilnadu India; SRM Inst Sci & Technol Coll Engn & Technol Sch Comp Dept Comp Technol Chennai 606203 Tamilnadu India

Video surveillance still has difficulty identifying anomalies such as illegal activities and crimes, despite the development of interactive multimedia anomaly detection systems. To address this issue, an Optimized Interpretable Generalized Additive Neural Networks based Malicious Activity Detection with Video Surveillance (IGANN-MAD-VS-EOSSOA) is proposed in this paper. Initially, the input videos are collected from the UCF-Crime and ShanghaiTech datasets. The collected video is pre-processed using Multiple Local Particle Filtering (MLPF) to improve video quality, remove noise, and enhance image clarity. The pre-processed video is then fed to the segmentation process, where the input videos are segmented into images using Maximum Entropy Scaled Super-pixels Segmentation (MESPS). Feature extraction is then performed with the Synchro-Transient-Extracting Transform (STET) to extract features such as color, texture, size, shape, and orientation. The extracted features are provided to the Interpretable Generalized Additive Neural Networks (IGANN) for classifying activity into categories such as Normal, Assault, Fighting, Shooting, Vandalism, Abuse and Accident. In general, IGANN does not adopt any optimization technique for determining the optimal parameters to assure appropriate categorization. Hence, the Elite opposite Sparrow Search Optimization Algorithm (EOSSOA) is proposed to optimize the weight parameters of IGANN for the detection of malicious activity with video surveillance. The proposed IGANN-MAD-VS-EOSSOA method is implemented in Python. The proposed technique attains 26.36%, 20.69% and 30.29% higher accuracy, and 19.12%, 28.32%, and 27.84% higher precision, when compared with the existing methods: Video anomaly detection scheme with deep convolutional and recurrent techniques (AD-CNN-VS), Toward trustworthy human suspicious activity detection from surveillance videos with deep learning (HSAD-SV-RNN), Deep learning-based real-world object dete

Keywords: Multiple local particle filter; Simple contrastive graph clustering; Synchro-transient-extracting transform; UCF-crime dataset
