This paper presents a throughput-efficient FPGA implementation of the Set Partitioning in Hierarchical Trees (SPIHT) algorithm for image compression. SPIHT exploits the inherent redundancy among wavelet coefficients and is suited to both grey-scale and color images. The basic SPIHT algorithm relies on dynamic data structures, which hinders hardware realization. In this FPGA implementation, the basic SPIHT is modified in two ways: static (fixed) mappings are used to represent the significance information, and the sorting and refinement passes are interchanged. The hardware realization targets a Xilinx XC3S200 device. SPIHT offers a number of desirable properties, including good image quality, fast coding and decoding, a fully progressive bit stream, applicability to lossless compression, error protection, and the ability to code to an exact bit rate.
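For orientation, the significance test that drives SPIHT's sorting pass can be sketched in a few lines of NumPy. This is only an illustration of the bitplane threshold idea (a set is significant at bitplane n if any coefficient magnitude reaches 2**n); the helper names are hypothetical and the paper's fixed mappings and pass-interchange hardware details are not reproduced here.

import numpy as np

def is_significant(coeffs, n):
    """True if any coefficient in the set reaches the bitplane threshold 2**n."""
    return np.max(np.abs(coeffs)) >= (1 << n)

def initial_bitplane(coeffs):
    """Highest bitplane n with 2**n <= max |coefficient| (SPIHT's starting threshold)."""
    return int(np.floor(np.log2(np.max(np.abs(coeffs)))))

# Example: a toy set of wavelet coefficients
c = np.array([3.0, -18.5, 7.2, 1.1])
n = initial_bitplane(c)          # 4, since 16 <= 18.5 < 32
print(n, is_significant(c, n))   # 4 True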
In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework. Camera shake, motion or defocus during exposure leads to image blur. Single-image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution due to inaccurate image priors and estimators. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture (GSM) model, which is estimated from non-blurry images used as training data. Our experimental results on a total of twelve natural images show that more details are restored than with previous deblurring algorithms. In augmented reality, it is a challenging problem to insert virtual content into video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on building facades from street-view video streams. Without knowing the camera positions, the geometry model of a building facade is established using a combined detection and tracking strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmentation performance of this automatic VCI system. Coding efficiency is an important objective in video coding. In recent years, video coding standards have evolved by adding new tools, but this requires numerous modifications to already complex coding systems. It is therefore desirable to consider alternative standard-compliant approaches that do not modify the codec structure. In the third part, an exemplar-based data-pruning video compression scheme for intra frames is introduced. Data pruning is used as a pre-processing tool to remove part of the video data before encoding. At the decoder, the missing data are reconstructed by a sparse linear combination of similar patches.
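For reference, the MAP formulation mentioned for the non-blind deconvolution part can be written as below. This is a sketch in standard notation (y the blurry image, k the known blur kernel, f_i derivative filters, sigma the noise level), assuming the usual Gaussian likelihood and a Gaussian Scale Mixture prior on filter responses; it is not claimed to match the dissertation's exact parameterization.

\hat{x} \;=\; \arg\min_{x} \; \frac{1}{2\sigma^{2}}\,\lVert y - k \ast x \rVert_{2}^{2} \;-\; \sum_{i} \log p\!\left(f_{i} \ast x\right),
\qquad
p(z) \;=\; \sum_{j=1}^{J} \pi_{j}\, \mathcal{N}\!\left(z;\, 0,\, s_{j}\right).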
With the widespread use of powerful image editing tools, image tampering has become easy and realistic. Existing image forensic methods still face challenges of low generalization performance and robustness. In this letter, we propose an effective image tampering localization scheme based on a ConvNeXt encoder and multi-scale feature fusion (ConvNeXtFF). Stacked ConvNeXt blocks are used as the encoder to capture hierarchical multi-scale features, which are then fused in the decoder to locate tampered pixels accurately. A combined loss function and effective data augmentation strategies are adopted to further improve model performance. Extensive experimental results show that the ConvNeXtFF scheme outperforms other state-of-the-art methods in both localization accuracy and robustness. The source code is available at https://***/multimediaFor/ConvNeXtFF.
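A minimal PyTorch-style sketch of the multi-scale fusion idea follows: hierarchical encoder features are upsampled to a common resolution, concatenated, and reduced to a per-pixel tampering map. Module names and channel counts are assumptions for illustration, not the released ConvNeXtFF code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusionHead(nn.Module):
    """Fuse encoder features from several stages into a single-channel localization map."""
    def __init__(self, in_channels=(96, 192, 384, 768), mid=64):
        super().__init__()
        # 1x1 convs project each stage to a common channel width
        self.proj = nn.ModuleList(nn.Conv2d(c, mid, kernel_size=1) for c in in_channels)
        self.fuse = nn.Sequential(
            nn.Conv2d(mid * len(in_channels), mid, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, 1, kernel_size=1),   # per-pixel tampering logit
        )

    def forward(self, feats):
        target = feats[0].shape[-2:]            # finest resolution
        ups = [F.interpolate(p(f), size=target, mode='bilinear', align_corners=False)
               for p, f in zip(self.proj, feats)]
        return self.fuse(torch.cat(ups, dim=1))

# feats stand in for encoder stage outputs, coarser as the list goes on
feats = [torch.randn(1, c, 64 // 2**i, 64 // 2**i) for i, c in enumerate((96, 192, 384, 768))]
print(MultiScaleFusionHead()(feats).shape)      # torch.Size([1, 1, 64, 64])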
With the rapid development of deep learning, many breakthroughs have been made in the field of facial expression recognition (FER). However, facial images contain not only expression-related features but also identity-related features, and the identity-related features vary from person to person, which often has a negative influence on the FER process. This is one of the most important challenges in FER. In this paper, a novel feature separation model, exchange-GAN, is proposed for the FER task, which can separate expression-related features from expression-independent features with high purity. An FER method based on the exchange-GAN can therefore overcome the interference of identity-related features to a large extent. First, feature separation is achieved by the exchange-GAN through partial feature exchange and various constraints. Then the expression-independent features are discarded and FER is conducted only on the expression-related features, to alleviate the adverse effect of identity-related features. Finally, experiments are conducted on three well-known databases with the proposed FER method. The experimental results show that the proposed method can alleviate the interference of identity-related information through feature separation by the exchange-GAN and achieves excellent performance for subjects that do not appear in the training set. Moreover, our method obtains very competitive FER accuracy on the three experimental databases. (C) 2020 Elsevier B.V. All rights reserved.
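The partial feature exchange can be illustrated with a short tensor-level sketch: latent codes of two faces are split into an expression-related slice and an identity-related slice, and only the expression slices are swapped before decoding. The channel split and dimensions here are hypothetical, not the paper's architecture.

import torch

def exchange_expression(z_a, z_b, expr_dims):
    """Swap the expression-related slice of two latent codes; keep identity slices in place."""
    za_expr, za_id = z_a[:, :expr_dims], z_a[:, expr_dims:]
    zb_expr, zb_id = z_b[:, :expr_dims], z_b[:, expr_dims:]
    z_a_swapped = torch.cat([zb_expr, za_id], dim=1)   # identity A wearing expression B
    z_b_swapped = torch.cat([za_expr, zb_id], dim=1)   # identity B wearing expression A
    return z_a_swapped, z_b_swapped

z_a, z_b = torch.randn(4, 128), torch.randn(4, 128)
sa, sb = exchange_expression(z_a, z_b, expr_dims=32)
print(sa.shape, sb.shape)   # torch.Size([4, 128]) torch.Size([4, 128])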
Delivery of continuous cardiopulmonary resuscitation (CPR) plays an important role in the out-of-hospital cardiac arrest (OHCA) survival rate. However, to prevent CPR artifacts from being superimposed on ECG morphology data, currently available automated external defibrillators (AEDs) require pauses in CPR for accurate analysis of heart rhythms. In this study, we propose a novel convolutional neural network-based encoder-decoder (CNNED) structure with a shock advisory algorithm to improve the accuracy and reliability of shock versus non-shock decision-making without CPR pauses in OHCA scenarios. Our approach employs a cascade of CNNEDs in conjunction with an AED shock advisory algorithm to process the ECG data for shock decisions. Initially, a CNNED trained on an equal number of shockable and non-shockable rhythms is used to filter the CPR-contaminated data. The resulting filtered signal is then fed into a second CNNED, which is trained on imbalanced data tilted toward the specific rhythm being analyzed. A reliable shock versus non-shock decision is made when both classifiers in the cascade agree, while segments with conflicting classifications are labeled as indeterminate, indicating the need for additional segments to analyze. To evaluate our approach, we generated CPR-contaminated ECG data by combining clean ECG data with 52 CPR samples. We used clean ECG data from the CUDB, AFDB, SDDB, and VFDB databases, to which the 52 CPR artifact cases were added, while a separate test set provided by the AED manufacturer Defibtech LLC was used for performance evaluation. The test set comprised 20,384 non-shockable CPR-contaminated segments from 392 subjects, as well as 3744 shockable CPR-contaminated samples from 41 subjects with coarse ventricular fibrillation (VF) and 31 subjects with rapid ventricular tachycardia (rapid VT). We observed improvements in rhythm analysis using the proposed cascading CNNED structure compared to using a single CNNED structure.
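The cascade's agreement rule reduces to a few lines of plain Python (the stage labels are placeholders for the two CNNED-filtered rhythm analyses described above): a shock or non-shock advisory is issued only when both stages agree, and anything else is marked indeterminate so another segment can be analyzed.

def cascade_decision(first_stage_label, second_stage_label):
    """Agreement-based shock advisory: 'shock', 'nonshock', or 'indeterminate'."""
    if first_stage_label == second_stage_label:
        return first_stage_label           # both classifiers in the cascade agree
    return "indeterminate"                 # disagreement: request an additional segment

print(cascade_decision("shock", "shock"))        # shock
print(cascade_decision("shock", "nonshock"))     # indeterminate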
Classic high-accuracy semantic segmentation models typically come with a large number of parameters, making them unsuitable for deployment on driverless platforms with limited computational power. To strike a balance between accuracy and a limited computational budget, and to enable the use of the classic segmentation model UNet in unmanned driving scenarios, this paper proposes a multi-unit stacked architecture (MSA), namely MSA-Net, based on UNet and ShuffleNetv2. First, MSA-Net replaces the convolution blocks in the UNet encoder and decoder with stacked basic ShuffleNetv2 units, which greatly reduces computational cost while maintaining high segmentation accuracy. Second, MSA-Net introduces enhanced skip connections using pointwise convolution and the convolutional block attention module (CBAM) to help the decoder select more relevant and valuable information. Third, MSA-Net adds multi-scale internal connections that extend the receptive fields of the encoder and decoder with little increase in model parameters. Comprehensive experiments show that MSA-Net achieves an optimal balance between accuracy and model complexity on the Cityscapes dataset, with strong generalization on the enhanced PASCAL VOC 2012 dataset. MSA-Net achieves a mean intersection over union (mIoU) of 73.6% and an inference speed of 31.0 frames per second (FPS) on the Cityscapes test set. We also propose two other MSA-Net models of different sizes, providing more options for resource-constrained inference.
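A rough PyTorch sketch of an enhanced skip connection of the kind described follows: a pointwise convolution followed by a channel-attention gate in the spirit of CBAM. The full CBAM also includes a spatial attention branch, omitted here, and the channel counts are illustrative rather than MSA-Net's actual configuration.

import torch
import torch.nn as nn

class EnhancedSkip(nn.Module):
    """Pointwise conv + channel attention on an encoder feature before decoder concatenation."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.pw = nn.Conv2d(channels, channels, kernel_size=1)
        self.attn = nn.Sequential(                       # squeeze-and-gate channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.pw(x)
        return x * self.attn(x)                          # reweight channels, keep spatial size

skip = EnhancedSkip(64)
print(skip(torch.randn(1, 64, 128, 128)).shape)          # torch.Size([1, 64, 128, 128])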
An effective and accurate building energy consumption prediction model is an important means of making good use of building management systems and improving energy efficiency. To cope with the development and changes of digital data, data-driven models, especially deep learning models, have been applied to energy consumption prediction and have achieved good accuracy. However, deep learning models that process high-dimensional data often lack interpretability, which limits their further application and promotion. This paper proposes three interpretable encoder-decoder models based on long short-term memory (LSTM) and self-attention. Attention based on hidden-layer states and feature-based attention improve the interpretability of the deep learning models. A case study of one office building is discussed to demonstrate the proposed method and models. Adding future real weather information yields only a 0.54% improvement in the MAPE. Visualizing the model attention weights improves the interpretability of the model at both the hidden-state level and the feature level. Across the hidden states of different time steps, the LSTM network focuses on the hidden state of the last time step because it contains more information, whereas the Transformer model gives almost equal attention weight to each day in the coding sequence. For the interpretable results at the feature level, daily maximum temperature, mean temperature, minimum temperature, and dew point temperature are the four most important features, while pressure, the wind speed-related features, and holidays have the lowest average weights. (c) 2021 Elsevier B.V. All rights reserved.
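A compact sketch of hidden-state attention of the sort whose weights are visualized for interpretability: a decoder query is scored against the encoder LSTM's hidden states and the softmax weights show how much each past day contributes. The dimensions are illustrative, not the paper's configuration.

import torch
import torch.nn.functional as F

def hidden_state_attention(encoder_states, query):
    """encoder_states: (T, d); query: (d,). Returns context vector and per-timestep weights."""
    scores = encoder_states @ query                  # (T,) similarity of each day to the query
    weights = F.softmax(scores, dim=0)               # interpretable attention weights over time steps
    context = weights @ encoder_states               # (d,) weighted summary of the sequence
    return context, weights

states = torch.randn(7, 32)                          # e.g. 7 past days of LSTM hidden states
ctx, w = hidden_state_attention(states, torch.randn(32))
print(w.sum().item())                                # ~1.0: the weights form a distribution over days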
Sports videos are widely used by athletes and coaches for training and match analysis purposes outside the mainstream audience. Sports videos should be effectively classified into different genres to easily retrieve and index them from large video datasets. Manual labelling classification methods may cause errors and have low accuracy. Classification based on video content analysis is challenging for computer vision-based techniques. This work introduces an improved focus-net deep learning (DL) model called the Convolutional squeeze U-Net based encoder-decoder for sports video retrieval and classification. First, the keyframes are extracted from the input sports video using a clustering and optical flow analysis method. In the next stage, the frames are preprocessed using a smoothed shock filtering technique to remove the noise. The process of image segmentation is carried out using a Convolutional squeeze U-Net based encoder-decoder model. Finally, the sports video can be classified using the softmax classifier. A CNN (convolutional neural network) is utilized at the encoder section for extracting the features and fed to the decoder for video classification. The experiments are performed in the UCF101 dataset, and the proposed model achieved an overall accuracy of 99.68%. Hence, it is proven that the proposed focus-net model can be efficiently utilized in sports video classification.
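One ingredient of the clustering-and-optical-flow keyframe step can be sketched with OpenCV: the mean optical-flow magnitude between consecutive frames gives a per-transition motion score, and local maxima suggest candidate keyframes. This is only an assumed illustration; the paper's actual selection also involves clustering, which is not shown.

import cv2
import numpy as np

def motion_scores(gray_frames):
    """Mean optical-flow magnitude between consecutive 8-bit grayscale frames."""
    scores = []
    for prev, nxt in zip(gray_frames[:-1], gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)           # per-pixel displacement magnitude
        scores.append(float(mag.mean()))
    return scores

# frames would come from the decoded sports video, converted to grayscale uint8 arrays
frames = [np.random.randint(0, 255, (120, 160), dtype=np.uint8) for _ in range(5)]
print(motion_scores(frames))                          # 4 scores; peaks mark candidate keyframes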
ISBN (digital): 9789887581581
ISBN (print): 9798350366907
Segmenting coronary arteries from X-ray coronary angiography (XCA) images allows observation of coronary artery morphology and stenosis, which is of great significance for computer-aided diagnosis and treatment of coronary artery disease. However, XCA images have low contrast and irregular lighting due to the limits of existing imaging techniques, making vascular segmentation challenging. Segmenting coronary arteries with conventional U-shaped segmentation networks is difficult because of large differences in vascular morphology and size and because of vessel overlap. This paper proposes a network based on the U-shaped structure for multi-scale context information fusion, with the lightweight GhostNetV2 as the backbone feature extraction network to improve the model's feature extraction capability. A multi-scale context fusion module (MCF) is then proposed to effectively capture the contextual information of blood vessels. Finally, we propose a feature re-extraction module (FRM) to achieve effective re-extraction of fine features in complex backgrounds. Experimental results show that the proposed model achieves more accurate coronary artery segmentation in complex backgrounds with fewer parameters, improves the segmentation of fine vessels at the ends of coronary arteries, and performs well compared with other artery segmentation models. The model reaches an F1 score of 82.23%, an intersection over union (IoU) of 69.83%, an accuracy (ACC) of 98.79%, and a sensitivity (Sen) of 81.93%.
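A rough PyTorch sketch of a multi-scale context fusion block of the general kind the MCF module describes: parallel dilated 3x3 convolutions capture vessel context at several receptive-field sizes, and their outputs are concatenated and projected back. The dilation rates and channel counts are assumptions, not the paper's exact design.

import torch
import torch.nn as nn

class MultiScaleContextFusion(nn.Module):
    """Parallel dilated convs fused by a 1x1 projection; spatial size is preserved."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

mcf = MultiScaleContextFusion(32)
print(mcf(torch.randn(1, 32, 64, 64)).shape)          # torch.Size([1, 32, 64, 64])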
ISBN (digital): 9781728157184
ISBN (print): 9781728157191
In most telemedicine applications, image compression techniques play an important role in handling medical images, which must be stored and transferred over low-bandwidth channels such as the Internet, for example from a pathologist to a doctor for diagnosing patient problems. When a medical image is compressed with a lossy method, the doctor may not perceive any deterioration in quality with respect to the original input image. The main disadvantage of the lossy compression algorithms commonly used for multimedia applications is that, while the overall quality of the image can be controlled to some extent, some information is irretrievably discarded, which makes them unsuitable for medical images. In these cases it is necessary to use lossless compression algorithms, in which the information is fully preserved during decompression and the reconstructed image is an exact replica of the original, so no information is lost in the coding process. The aim of this paper is to provide a review of the various image compression techniques used for medical images, together with a performance analysis and a comparison of existing research on medical image compression.
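The lossless-versus-lossy distinction the review revolves around is easy to verify programmatically. The short sketch below uses Pillow and NumPy (file names and the random test image are placeholders) to check that a PNG round trip reproduces the pixels exactly while a JPEG round trip generally does not.

import numpy as np
from PIL import Image

original = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # stand-in for a medical image
Image.fromarray(original).save("slice.png")                       # lossless codec
Image.fromarray(original).save("slice.jpg", quality=75)           # lossy codec

png_back = np.array(Image.open("slice.png"))
jpg_back = np.array(Image.open("slice.jpg"))

print(np.array_equal(original, png_back))   # True: exact replica, no information lost
print(np.array_equal(original, jpg_back))   # almost always False: lossy reconstruction differs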