Search Results - Inner Mongolia University Library

2024 International Conference on Image Processing

Authors: Sokolova, Anna; Vorontsova, Anna; Gabdullin, Bulat; Limonov, Alexander (Samsung Res, Seoul, South Korea)

ISBN: (print) 9798350349405; 9798350349399

Leveraging 3D semantics for direct 3D reconstruction has great potential that is yet to be unleashed. For instance, by assuming that walls are vertical and that a floor is planar and horizontal, we can correct distorted room shapes and eliminate local artifacts such as holes, pits, and hills. In this paper, we propose FAWN, a modification of truncated signed distance function (TSDF) reconstruction methods that considers scene structure by detecting walls and floor in a scene and penalizing the corresponding surface normals for deviating from the horizontal and vertical directions. Implemented as a 3D sparse convolutional module, FAWN can be incorporated into any trainable pipeline that predicts TSDF. Since FAWN requires 3D semantics only for training, it imposes no additional limitations on further use. We demonstrate that FAWN-modified methods use semantics more effectively than existing semantic-based approaches. In addition, we apply our modification to state-of-the-art TSDF reconstruction methods and demonstrate a quality gain on the SCANNET, ICL-NUIM, TUM RGBD, and 7SCENES benchmarks.

Keywords: 3D reconstruction; TSDF
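The abstract above describes penalizing wall and floor surface normals for deviating from the horizontal and vertical directions. A minimal sketch of such a penalty follows. It is not the FAWN module itself: the function names, the label strings, the gravity-aligned `UP` axis, and the exact penalty form are all assumptions made for illustration.

```python
import math

UP = (0.0, 0.0, 1.0)  # assumption: scene is gravity-aligned with z pointing up


def _normalize(v):
    """Scale a non-zero 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def normal_penalty(normal, label):
    """Hypothetical penalty in [0, 1] for one surface normal.

    floor: normal should be parallel to UP      -> 1 - |n . up|
    wall:  normal should be perpendicular to UP -> |n . up|
    Any other label is left unpenalized.
    """
    n = _normalize(normal)
    cos_up = abs(sum(a * b for a, b in zip(n, UP)))
    if label == "floor":
        return 1.0 - cos_up
    if label == "wall":
        return cos_up
    return 0.0


def mean_penalty(normals, labels):
    """Average the per-normal penalties over a set of labelled normals."""
    vals = [normal_penalty(n, l) for n, l in zip(normals, labels)]
    return sum(vals) / len(vals)
```

A perfectly horizontal floor or a perfectly vertical wall yields zero penalty, so a term like this only pushes on surfaces that the semantic labels say should be axis-aligned.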


SEMANTIC-AWARE IMAGE COMPRESSION ARCHITECTURE FOR SEMANTIC COMMUNICATION


34th International Workshop on Machine Learning for Signal Processing

Authors: Wang, Haotian; Cao, Zijian; Zhang, Hua (Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 210096, Peoples R China)

ISBN: (print) 9798350372267; 9798350372250

The transmission of visual data, such as videos and images, is a typical task in wireless communications. The semantic communication paradigm compresses a large amount of visual data at the semantic level to enable its transmission with limited channel capacity. To improve semantic-aware image compression, we develop a Deep Neural Network (DNN)-based architecture for semantic communication. An image is segmented into Regions of Interest (ROI) and Regions of Non-Interest (RONI) using semantic segmentation to obtain an ROI mask. The ROI and RONI segments are then independently compressed with adjustable compression ratios for wireless transmission. The reconstructed image is a fusion of the decompressed ROI and RONI segments. The proposed architecture improves the perceptual quality of transmitted images by allocating more bandwidth to ROI parts, which contain more semantic information than RONI parts. The compression ratio of the architecture is adjustable to adapt to time-varying channel capacity. Critical semantic information within the ROI is compressed and transmitted independently to ensure accurate transmission even if the transmission of the RONI fails. Additionally, an ROI mask compressor is included to minimize the extra bandwidth caused by the irregular contours of ROI masks. Simulation results on fading channels show that the proposed system outperforms existing methods in terms of peak signal-to-noise ratio (PSNR) scores.

Keywords: Semantic Communication; ROI; image compression; deep learning
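The abstract above reconstructs the image by fusing the decompressed ROI and RONI segments under an ROI mask and evaluates the result with PSNR. The sketch below illustrates only those two steps on small grayscale images stored as nested lists. The function names `fuse` and `psnr`, the binary mask convention, and the 8-bit peak value are assumptions, and no compression or channel model is included.

```python
import math


def fuse(roi_img, roni_img, mask):
    """Pixelwise fusion: ROI pixel where mask is 1, RONI pixel elsewhere."""
    return [
        [r if m else n for r, n, m in zip(r_row, n_row, m_row)]
        for r_row, n_row, m_row in zip(roi_img, roni_img, mask)
    ]


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2 for ra, rb in zip(ref, test) for a, b in zip(ra, rb)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

For example, `fuse([[10, 20], [30, 40]], [[1, 2], [3, 4]], [[1, 0], [0, 1]])` returns `[[10, 2], [3, 40]]`.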


IMAGE MIXING AND GRADIENT SMOOTHING TO ENHANCE THE SAR IMAGE ATTACK TRANSFERABILITY


49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Xu, Yue; Liu, Xin; He, Kun; Huang, Shao; Zhao, Yaodong; Gu, Jie (Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China; Natl Key Lab Elect Space Secur, Chengdu, Peoples R China)

ISBN: (print) 9798350344868; 9798350344851

Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to clean examples. With the wide application of DNNs to Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR), the vulnerability of SAR deep recognition models has attracted increasing attention. Existing works show that input transformation can effectively improve the black-box attack performance of adversarial examples, but there is little work in the field of SAR-ATR. In this paper, we propose a novel input transformation attack called Image Mixing and Gradient Smoothing (IMGS), which is dedicated to attacking SAR images. IMGS mixes a small portion of another image into the input samples in amplitude and phase at different rates and uses the Local Mean Square Error (LMSE) filter to smooth the gradient. Extensive experiments conducted on the MSTAR dataset demonstrate that IMGS significantly outperforms other input transformation methods, originally designed for attacking visual images, in both white-box and black-box settings. The code is available at https://***/JHL-HUST/IMGS.

Keywords: SAR image classification; black-box attack; input transformation; adversarial transferability


IEEE TRANSACTIONS ON COMPUTERS, 2024, vol. 73, no. 9, pp. 2192-2205

Authors: Khataei, Alireza; Singh, Gaurav; Bazargan, Kia (Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455, USA)

Unary computing is a relatively new method for implementing arbitrary nonlinear functions that uses unpacked thermometer number encoding, enabling much lower hardware costs. In its original form, unary computing provides no trade-off between accuracy and hardware cost. In this work, we propose a novel self-similarity-based method to optimize the previous hybrid binary-unary work and provide it with a trade-off between accuracy and hardware cost by introducing controlled levels of approximation. Looking for self-similarity between different parts of a function allows us to implement a very small subset of core unique subfunctions and derive the rest of the subfunctions from this core using simple linear transformations. We compare our method to previous works such as FloPoCo-LUT (lookup table), HBU (hybrid binary-unary), and FloPoCo-PPA (piecewise polynomial approximation) on several 8-12-bit nonlinear functions including Log, Exp, Sigmoid, GELU, Sin, and Sqr, which are frequently used in neural networks and image processing applications. The area × delay hardware cost of our method is on average 32%-60% better than previous methods in both exact and approximate implementations. We also extend our method to multivariate nonlinear functions and show on average 78%-92% improvement over previous work.

Keywords: Hardware; Costs; Logic gates; Wires; Polynomials; Delays; Image coding; Hardware acceleration; approximate computing; unary computing; stochastic computing; table-based method; piecewise polynomial approximation; nonlinear function; activation function
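The abstract above relies on thermometer number encoding, in which a value k is represented by k ones followed by zeros. The sketch below shows that encoding and, as an illustration of why it can be cheap in hardware, how a nondecreasing function can be evaluated by routing input bits to output bits. It does not implement the self-similarity method or the hybrid binary-unary design; all function names and the wiring representation are assumptions.

```python
def thermometer_encode(value, width):
    """Return `width` bits: `value` ones followed by zeros (0 <= value <= width)."""
    if not 0 <= value <= width:
        raise ValueError("value out of range")
    return [1] * value + [0] * (width - value)


def thermometer_decode(bits):
    """The encoded value is simply the number of ones."""
    return sum(bits)


def wire_monotone(f, in_width, out_width):
    """For a nondecreasing f, decide where each output bit comes from.

    Entry j is "1" (constant one), "0" (constant zero), or an int i
    meaning that output bit j is a copy of input bit i.
    """
    wiring = []
    for j in range(out_width):
        # input values whose output exceeds j; the smallest one sets the wire
        xs = [x for x in range(in_width + 1) if f(x) > j]
        if not xs:
            wiring.append("0")
        elif xs[0] == 0:
            wiring.append("1")
        else:
            wiring.append(xs[0] - 1)
    return wiring


def apply_wiring(wiring, bits):
    """Evaluate the function by copying bits according to the wiring."""
    return [1 if w == "1" else 0 if w == "0" else bits[w] for w in wiring]
```

For instance, `thermometer_encode(3, 5)` returns `[1, 1, 1, 0, 0]`. With `f = lambda x: (x * x) // 8` on 8-bit thermometer inputs, decoding `apply_wiring(wire_monotone(f, 8, 8), thermometer_encode(x, 8))` reproduces `f(x)` for every x from 0 to 8 using bit copies only.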


A Design Framework for Hardware-Efficient Logarithmic Floating-Point Multipliers

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2024, vol. 12, no. 4, pp. 991-1001

Authors: Zhang, Tingting; Niu, Zijing; Han, Jie (Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 1H9, Canada)

The symbiotic use of logarithmic approximation in floating-point (FP) multiplication can significantly reduce the hardware complexity of a multiplier. However, it is difficult for a limited number of logarithmic FP multipliers (LFPMs) to fit a specific error-tolerant application, such as neural networks (NNs) or digital signal processing, due to their unique error characteristics. This article proposes a design framework for generating LFPMs. We consider two FP representation formats with different ranges of mantissas: the IEEE 754 Standard FP Format and the Nearest Power of Two FP Format. For both logarithm and anti-logarithm computation, the applicable regions of inputs are first evenly divided into several intervals, and then approximation methods with negative or positive errors are developed for each sub-region. By using piece-wise functions, different configurations of approximation methods throughout the applicable regions are created, leading to LFPMs with various trade-offs between accuracy and hardware cost. The variety of error characteristics of LFPMs is discussed and the generic hardware implementation is illustrated. As case studies, two LFPM designs are presented and evaluated in applications of JPEG compression and NNs. They not only increase the classification accuracy but also achieve smaller PDPs than the exact FP multiplier, while being more accurate than a recent logarithmic FP design.

Keywords: Hardware; Standards; Transform coding; Training; Costs; Artificial neural networks; Image coding; Floating-point multiplier; logarithmic multiplier; neural networks; JPEG compression; error tolerance; approximate computing; approximate multiplier
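The abstract above builds on logarithmic approximation of floating-point multiplication. As background, the sketch below uses the classic Mitchell approximation log2(1 + m) ≈ m, which turns a multiplication into an addition of exponents and mantissa fractions. This is a swapped-in textbook technique, not one of the LFPM designs from the article, and the function name `mitchell_multiply` is hypothetical.

```python
import math


def mitchell_multiply(a, b):
    """Approximate a * b for positive floats using log2(1 + m) ~= m."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch handles positive inputs only")
    # frexp returns a = fa * 2**ea with fa in [0.5, 1); rescale to 1.m form
    fa, ea = math.frexp(a)
    fb, eb = math.frexp(b)
    ma, ka = 2.0 * fa - 1.0, ea - 1  # a = (1 + ma) * 2**ka, ma in [0, 1)
    mb, kb = 2.0 * fb - 1.0, eb - 1
    s = ma + mb                      # approximate fractional part of the log sum
    k = ka + kb
    if s >= 1.0:
        # the fraction carries into the exponent: 2**(k + 1) * (1 + (s - 1))
        return math.ldexp(s, k + 1)
    return math.ldexp(1.0 + s, k)    # 2**k * (1 + s)
```

The approximation never exceeds the exact product, and its relative error peaks at 1/9 (about 11.1%) when both mantissa fractions equal 0.5; for example, `mitchell_multiply(3.0, 3.0)` returns `8.0` instead of 9. Errors of a single sign like this are the kind of characteristic that the framework in the abstract balances by mixing approximations with negative and positive errors.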


HYBRID CONVOLUTION-TRANSFORMER FOR LIGHTWEIGHT SINGLE IMAGE SUPER-RESOLUTION


49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Li, Jiuqiang; Ke, Yutong (Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu, Peoples R China; Southwest Jiaotong Univ, SWJTU Leeds Joint Sch, Chengdu, Peoples R China)

ISBN: (print) 9798350344868; 9798350344851

The rapid development of deep learning has driven a breakthrough in the performance of single image super-resolution (SISR). However, many existing works deepen the network to pursue performance improvement without considering that models with a large number of parameters are not conducive to production and deployment. Meanwhile, the Transformer, relying on its ability to model long-term dependencies, has entered the field of SISR, but its large memory consumption and inference time do not avoid the above-mentioned problem. In this paper, we propose a Hybrid Convolution-Transformer (HCFormer) for lightweight single image super-resolution. HCFormer effectively combines convolution and Transformer, and its core modules are the super-resolution feature extraction module (SRFEM) and the long-term dependency feature representation module (LDFRM), composed respectively of a series of light-weight and efficient convolution blocks (LECB) and light-weight and efficient Transformer blocks (LETB). LECB excavates the potential super-resolution features in the input image through multi-scale residual convolutional operations, while LETB performs long-term dependency feature representation on the excavated features through a streamlined and improved Transformer. Extensive experimental results on five benchmark datasets, compared with state-of-the-art light-weight SISR methods, demonstrate the effectiveness and competitiveness of our proposed method.

Keywords: Single image super-resolution; Transformer; convolutional neural network; deep learning


Conditional Normalizing Flow with Multiscale Local-global Features Learning for Low-light Image Enhancement


9th International Conference on Signal and Image Processing (ICSIP)

Authors: Hu, Yin; Hu, Changhui; Xu, Lintao (Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing, Peoples R China; Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Nanjing, Peoples R China)

ISBN: (print) 9798350350920

Conditional normalizing flow (CNF) performs a series of reversible transformations to learn the distribution of the normal-light image guided by conditional features from the low-light image, providing a novel solution for low-light image enhancement. However, most existing CNF-based methods rely entirely on convolutional neural networks (CNN) to extract conditional features, which concentrate only on the representation of local information. In addition, the invertible network in CNF executes reversible transformations that act on only part of the features, which limits the expressive power of the CNF. To tackle these issues, this paper proposes a novel and powerful CNF-based method named multiscale local-global features guided normalizing flow (MLGFlow) for low-light image enhancement. Specifically, MLGFlow consists of a conditional encoder and an invertible network. In the conditional encoder, we design the multiscale local-global learning block (MLGB), which includes a dual-branch extraction module (DEM) and an attention-based fusion module (AFM) for extracting informative conditional features. DEM concentrates on capturing multiscale local and global features, and AFM further promotes feature fusion based on the channel-spatial attention mechanism. In the invertible network, we construct the conditional multi-affine coupling (CMAC) layer to perform sufficient reversible transformation, enhancing the expressive power of the model. Extensive experiments demonstrate that our proposed MLGFlow performs better than current state-of-the-art (SOTA) methods in terms of quantitative evaluation and visual quality.

Keywords: low-light image enhancement; conditional normalizing flow; multiscale local-global features
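The abstract above notes that reversible transformations in a normalizing flow often act on only part of the features. The sketch below shows a plain affine coupling layer, the standard building block with exactly that property: one half of the features passes through unchanged and parameterizes an invertible scale and shift of the other half. It is not the CMAC layer proposed in the paper, it omits the conditional features, and the function names and the `scale_fn` and `shift_fn` callables are assumptions.

```python
import math


def coupling_forward(x1, x2, scale_fn, shift_fn):
    """Affine coupling: x1 is passed through, x2 is scaled and shifted."""
    s, t = scale_fn(x1), shift_fn(x1)
    y2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
    return x1, y2


def coupling_inverse(y1, y2, scale_fn, shift_fn):
    """Exact inverse: recompute s and t from the untouched half, then undo."""
    s, t = scale_fn(y1), shift_fn(y1)
    x2 = [(v - ti) * math.exp(-si) for v, si, ti in zip(y2, s, t)]
    return y1, x2
```

Because `x1` is never modified, the inverse can always recompute the same scale and shift, whatever `scale_fn` and `shift_fn` are. The cost is that half of the features stay untransformed in each layer, which is the limitation the abstract says the CMAC layer is designed to address.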


Modeling bio-inspired visual neural for detecting visual features of small- and wide-field moving targets synchronously from complex dynamic environments

SIGNAL IMAGE AND VIDEO PROCESSING, 2024, vol. 18, no. 12, pp. 8881-8898

Authors: Zhang, Sheng; Li, Ke; Zhou, Dan; Tang, Jingjing (Hohai Univ, Coll Informat Sci & Engn, Changzhou 213200, Peoples R China; Nanchang Inst Technol, Sch Mech & Elect Engn, Nanchang 330044, Peoples R China)

The synchronous detection of visual features of small- and wide-field moving targets in complex dynamic environments has been a challenge in the field of moving target detection. The visual system of Drosophila flies can detect both kinds of features synchronously in such environments and thus provides a good paradigm for this task. However, there is little literature that comprehensively analyses and verifies this. In this paper, we present a bio-inspired computing model for detecting visual features of small- and wide-field moving targets synchronously. The model consists of three stages. First, visual stimuli are perceived and divided into parallel ON and OFF pathways. Then, the feedback mechanism and the full Hassenstein-Reichardt correlator are applied to the Medulla neurons. Finally, the Lobula Columnar 11 is used to detect visual features of small-field moving targets, i.e., the position, while the Lobula Plate Tangential Cell is used to detect visual features of wide-field moving targets, i.e., the translational directional selectivity. Extensive experiments show that the proposed model can detect visual features of small- and wide-field moving targets synchronously. In addition, the proposed model improves the detection rate in small-field moving target detection by 17.18% compared with the traditional bio-inspired computing model, and its effectiveness is further verified by comparison with conventional moving target detection methods. Moreover, the proposed model can also effectively detect visual features of wide-field moving targets. The source code can be found at https://***/szhanghh/A-bio-inspired-visual-neural-computing-model.

Keywords: Bio-inspired visual neural model; Visual features detection; LC11; LPTC; Complex dynamic environments
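The second stage of the model in the abstract above applies the full Hassenstein-Reichardt correlator. A minimal discrete-time sketch of that elementary motion detector follows: each of two neighbouring receptor signals is delayed and multiplied with the undelayed signal of the other, and the two mirror-symmetric products are subtracted. The function name, the fixed sample delay, and the zero padding before the first samples are assumptions; the ON/OFF pathways, feedback mechanism, and LC11/LPTC stages are not modelled.

```python
def hr_correlator(a, b, delay=1):
    """Full Hassenstein-Reichardt correlator for two equally long signals.

    Output at time t is a[t - delay] * b[t] - a[t] * b[t - delay].
    Positive values indicate motion from receptor A toward receptor B.
    """
    out = []
    for t in range(len(a)):
        # samples before the start of the recording are treated as zero
        a_d = a[t - delay] if t >= delay else 0.0
        b_d = b[t - delay] if t >= delay else 0.0
        out.append(a_d * b[t] - a[t] * b_d)
    return out
```

With a pulse that reaches receptor A at t = 1 and receptor B at t = 2, `hr_correlator([0, 1, 0, 0], [0, 0, 1, 0])` responds with a positive value at t = 2, and swapping the two inputs flips the sign, which is what makes the detector direction-selective.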


Estimating the visibility in foggy weather based on meteorological and video data: A Recurrent Neural Network approach

IET Signal Processing, 2023, vol. 17, no. 1

Authors: Chen, Jian; Yan, Ming; Qureshi, Muhammad Rabea Hanzla; Geng, Keke (Yangzhou Univ, Sch Mech Engn, Yangzhou 225127, Jiangsu, Peoples R China; Southeast Univ, Sch Mech Engn, Nanjing, Peoples R China)

Research on visibility detection on foggy days is of great significance to both road traffic and air transport safety. Based on meteorological and video data collected from an airport, a deep Recurrent Neural Network (RNN) model was established in this study to predict visibility. First, the Fourier Transform was used to extract feature variables from the video data. Then, Principal Component Analysis was used to reduce the dimension of the features. After that, 462 sets of sample data, including image features, air pressure, temperature, and wind speed, were used as inputs to train the RNN model. Comparing the predicted results with the actual visibility data and with some other state-of-the-art methods shows that the proposed model makes up for the deficiency of models based only on meteorological or image data and has higher accuracy across different grades of visibility. When the meteorological data are taken into account, the accuracy of the RNN model is improved by 18.78%. In addition, the influence of the meteorological factors on the predicted visibility was analysed with the aid of correlation analysis: for fog at night, temperature is the dominant factor affecting visibility.

Keywords: correlation analysis; data dimension reduction; Fourier transform; principal component analysis (PCA); Recurrent Neural Network (RNN) model


Efficient CNN Prediction With Smoothness Factor for Reversible Data Hiding

IEEE SIGNAL PROCESSING LETTERS, 2025, vol. 32, pp. 1341-1345

Authors: Lin, Minchun; Xiang, Shijun (Jinan Univ, Coll Informat Sci & Technol, Guangzhou 510632, Peoples R China)

In the reversible data hiding (RDH) community, researchers often train CNN-based predictors with the Mean Square Error (MSE) loss function to evaluate the differences between original and predicted images. This optimizes the prediction network parameters for all pixels without distinction. Considering that the prediction errors in smooth areas are prioritized from the prediction error set for reversible data hiding, in this letter we propose to apply a smoothness factor to the MSE loss function. The smoothness factor used to evaluate the pixel smoothness of an image in steganography is adopted as the loss weight in the new loss function, taking large values in the smooth areas and small values in the texture areas. Experimental results show that CNN-based predictors trained with the proposed loss function predict pixels in the smooth areas more accurately than those trained with the original loss function. As a bonus, better embedding performance is achieved compared with recent typical CNN-based RDH methods.

Keywords: Distortion; Filter banks; Convolutional neural networks; Training; Linear programming; Kernel; Data mining; Wavelet coefficients; Steganography; Optimization; Reversible data hiding; CNN prediction; loss function; smoothness factor
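The abstract above weights the MSE loss with a smoothness factor that is large in smooth areas and small in textured areas. The sketch below illustrates the idea on small grayscale images stored as nested lists. The letter's actual smoothness factor is not given in the abstract, so the weight used here (one over one plus the mean absolute difference to the 4-neighbours) is a hypothetical stand-in, as are the function names and the choice to normalize by the sum of the weights.

```python
def smoothness_weights(image):
    """Hypothetical weight: 1 / (1 + mean absolute difference to 4-neighbours)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
            ]
            # a single-pixel image has no neighbours; treat it as smooth
            mad = sum(abs(image[y][x] - v) for v in nbrs) / len(nbrs) if nbrs else 0.0
            out[y][x] = 1.0 / (1.0 + mad)
    return out


def weighted_mse(original, predicted, weights):
    """Weighted mean squared error, normalized by the sum of the weights."""
    num = 0.0
    den = 0.0
    for o_row, p_row, w_row in zip(original, predicted, weights):
        for o, p, w in zip(o_row, p_row, w_row):
            num += w * (o - p) ** 2
            den += w
    return num / den
```

With uniform weights this reduces to the ordinary MSE. With weights from `smoothness_weights`, an error at a pixel in a flat region costs more than the same error on an edge, which steers training toward accurate prediction where reversible data hiding embeds first.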

