Near-infrared (NIR) image colorization is an image enhancement method that improves the readability of a near-infrared image and enhances its semantic information. To address the problems of color distortion, semantic ambiguity, and unclear texture and shape in current near-infrared image colorization techniques, we propose a novel (to the best of our knowledge) near-infrared image colorization approach based on the generative adversarial network (GAN). The method optimizes both the generator and the discriminator of the GAN and designs a loss function suited to the new network architecture. For the generator, we design and integrate a Res-WTConv-U-Net network. We also design a deep bottleneck block, composed of a residual block and an efficient channel attention (ECA) module, to replace the bottleneck layer in the U-Net, and we replace traditional convolutions with wavelet convolutions (WTConv) to achieve more effective feature extraction and better performance. For the discriminator, we design a dual-scale discriminator that combines two discriminators with different receptive fields, so that both the global structure and local details are taken into account. After training the model on the same dataset, we conduct a comparative experiment against typical image colorization methods, using structural similarity (SSIM), peak signal-to-noise ratio (PSNR), and color histogram similarity (CHS) as image evaluation indices. Comparing the colorization results on two different datasets, PSNR improves by 12.6% on average, SSIM by 7.4% on average, and CHS by 9.5% on average. Experimental results show that the colorization effect of this method is significantly better than that of other methods.
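As a rough illustration of the deep bottleneck block described above, the following PyTorch sketch pairs a residual body with an efficient channel attention (ECA) module. The layer widths, kernel sizes, and the placement of ECA inside the residual branch are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: a 1-D conv over pooled channel descriptors."""
    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        w = self.pool(x)                      # (B, C, 1, 1)
        w = w.squeeze(-1).transpose(1, 2)     # (B, 1, C)
        w = torch.sigmoid(self.conv(w))       # local cross-channel interaction
        w = w.transpose(1, 2).unsqueeze(-1)   # back to (B, C, 1, 1)
        return x * w

class DeepBottleneck(nn.Module):
    """Hypothetical bottleneck: residual block + ECA, as the abstract describes."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.eca = ECA(channels)

    def forward(self, x):
        # Skip connection around the attention-refined residual body.
        return torch.relu(x + self.eca(self.body(x)))
```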
The Transformer model has received extensive attention in recent years. Its powerful ability to handle contextual relationships makes it outstanding at the accurate segmentation of medical structures such as organs and lesions. However, as Transformer models become more complex, their computational overhead increases significantly, becoming one of the key factors limiting further performance improvement. In addition, some existing methods use channel dimensionality reduction to model cross-channel relationships. Although this strategy effectively reduces computation, it may lead to information loss or poor segmentation performance on medical images with rich details. To address these problems, we propose an innovative medical image segmentation model, PCMA Former. The model combines convolution with focused weight reparameterization and a channel multi-branch attention mechanism, aiming to improve model performance while maintaining low computational overhead. Through experimental verification on multiple medical image datasets (Synapse, ISIC2017, and ISIC2018), PCMA Former achieves better results than traditional convolutional neural networks and existing Transformer models.
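The abstract does not detail the channel multi-branch attention mechanism, but a minimal PyTorch sketch of the general idea it motivates — parallel cross-channel attention branches that avoid the dimensionality-reduction bottleneck — might look as follows. The branch count, kernel sizes, and averaging rule are hypothetical.

```python
import torch
import torch.nn as nn

class MultiBranchChannelAttention(nn.Module):
    """Hypothetical sketch: parallel channel-attention branches, each modeling
    cross-channel interaction at a different range, with no channel reduction."""
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(1, 1, k, padding=k // 2, bias=False) for k in kernel_sizes
        )
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        b, c, _, _ = x.shape
        desc = self.pool(x).view(b, 1, c)    # per-channel descriptor, full width
        weights = sum(br(desc) for br in self.branches) / len(self.branches)
        return x * torch.sigmoid(weights).view(b, c, 1, 1)

attn = MultiBranchChannelAttention(64)
out = attn(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```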
In today's digital world, the vast volume of data generated, often referred to as big data, presents both challenges and opportunities. One significant challenge is the risk of fraud in electronic cash transactions. This study examines and compares 20 common online fraud detection methods within the context of big data, evaluating them based on 11 criteria: type of learning, speed, accuracy, cost (time), complexity, interpretability, scalability, robustness, flexibility, and temporal and spatial complexity. The evaluation highlights the performance of each method against various types of online cash fraud, including identity theft, card skimming, phishing, malware, money laundering, account takeover, refund fraud, and friendly fraud. Performance scores, derived from real-world data and simulations, indicate the effectiveness of each method in identifying and countering fraud in a big data environment. Our findings show that deep learning methods and artificial neural networks outperform other methods in most fraud scenarios, while general rule-based and inferential methods are less effective. This research provides valuable insights for financial institutions, e-commerce platforms, and other online services to enhance their fraud detection capabilities and protect sensitive customer data in the era of big data.
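As an illustration of the kind of multi-criteria comparison the study performs, the Python sketch below ranks detection methods by a weighted score over evaluation criteria. All weights and scores here are placeholders for structure only, not the paper's measured values.

```python
# Placeholder weights over a subset of the 11 criteria named in the abstract.
criteria_weights = {"accuracy": 0.3, "speed": 0.2, "interpretability": 0.15,
                    "scalability": 0.2, "robustness": 0.15}

# Placeholder 0-10 scores per method (illustrative only).
method_scores = {
    "deep_learning": {"accuracy": 9, "speed": 6, "interpretability": 3,
                      "scalability": 8, "robustness": 8},
    "rule_based":    {"accuracy": 5, "speed": 9, "interpretability": 9,
                      "scalability": 6, "robustness": 4},
}

def weighted_score(scores: dict) -> float:
    """Aggregate a method's per-criterion scores into one ranking value."""
    return sum(criteria_weights[c] * s for c, s in scores.items())

for name, scores in sorted(method_scores.items(),
                           key=lambda kv: weighted_score(kv[1]), reverse=True):
    print(f"{name}: {weighted_score(scores):.2f}")
```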
ISBN (Print): 9798350350920
Accurate classification and identification of vessels in remote sensing satellite imagery is critical for ocean monitoring and resource management, and the ability to extract information from remote-sensing data is of paramount importance. To exploit the non-stationary characteristics of synthetic aperture radar (SAR) targets, this paper designs a comprehensive SAR ship recognition framework by combining the second-order synchrosqueezing transform (SST), an effective non-stationary signal-processing tool, with the histogram of oriented gradients (HOG) feature. First, the second-order SST is performed on SAR images to describe the non-stationary characteristics of ships at different times and frequencies. Second, HOG features are used to extract the non-stationary information of SAR ships effectively and provide more discriminative input for the deep learning network. Then, the optimal ResNet model is selected as the convolutional neural network (CNN) classifier to automatically fuse the non-stationary and abstract features of SAR ships. Experiments on two open SAR ship datasets (OpenSARShip and FUSAR-Ship) show that the proposed method achieves accurate classification and outperforms state-of-the-art (SOTA) CNN-based methods in terms of robustness and generalization ability. The positive effect of non-stationary characteristics on SAR ship classification is verified.
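The HOG stage of this pipeline can be reproduced with standard tooling. The sketch below extracts HOG descriptors from a time-frequency image using scikit-image; a random array stands in for the second-order SST magnitude of a SAR chip, and the HOG parameters are common defaults rather than the paper's settings.

```python
import numpy as np
from skimage.feature import hog

def hog_features(tf_image: np.ndarray) -> np.ndarray:
    """Extract a HOG descriptor vector from a 2-D time-frequency image."""
    return hog(tf_image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

# Placeholder input standing in for an SST-transformed SAR chip.
chip = np.random.rand(64, 64)
feat = hog_features(chip)
print(feat.shape)  # flattened descriptor, ready for a CNN/classifier stage
```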
Vector quantization (VQ) methods have been used in a wide range of applications for speech, image, and video data. While classic VQ methods often use expectation maximization, in this paper, we investigate the use of ...
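The abstract is truncated here, but the classic expectation-maximization approach it alludes to is the Lloyd/k-means codebook training procedure. A minimal NumPy sketch, with illustrative codebook size and iteration count:

```python
import numpy as np

def train_vq_codebook(data: np.ndarray, k: int, iters: int = 20,
                      seed: int = 0) -> np.ndarray:
    """Classic Lloyd-style (EM-like) codebook training for vector quantization."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # E-step: assign each vector to its nearest codeword.
        d = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        # M-step: move each codeword to the centroid of its cell.
        for j in range(k):
            members = data[assign == j]
            if len(members):
                codebook[j] = members.mean(0)
    return codebook

vectors = np.random.default_rng(1).normal(size=(500, 8))
cb = train_vq_codebook(vectors, k=16)
print(cb.shape)  # (16, 8)
```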
Emotion recognition plays a crucial role in cognitive science and human-computer interaction. Existing techniques tend to ignore the significant differences between subjects, resulting in limited accuracy and generalization ability, and they have difficulty capturing the complex relationships among the channels of electroencephalography (EEG) signals. A hybrid network is proposed to overcome these limitations. The proposed network comprises a deep adaptive multi-head attention (DAM) branch and a dynamic graph convolution (DGC) branch. The DAM branch uses residual convolution and an adaptive multi-head attention mechanism, allowing it to focus on multi-dimensional information from different representational subspaces at different locations. The DGC branch uses a dynamic graph convolutional neural network that learns topological features among the channels. The synergy of the two branches enhances the model's adaptability to subject differences, and both local feature extraction and global pattern understanding are optimized. Subject-independent experiments were conducted on the SEED and SEED-IV datasets: the average accuracy on SEED was 92.63% with an average F1-score of 92.43%, and the average accuracy on SEED-IV was 85.03% with an average F1-score of 85.01%. The results show that the proposed network has significant advantages in cross-subject emotion recognition and can improve accuracy and generalization ability in emotion recognition tasks.
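A minimal PyTorch sketch in the spirit of the DGC branch: a graph convolution over EEG channels whose adjacency matrix is learned rather than fixed by the electrode montage. The normalization, activation, and dimensions (62 channels, as in SEED) are assumptions.

```python
import torch
import torch.nn as nn

class DynamicGraphConv(nn.Module):
    """Graph convolution over EEG channels with a learned adjacency matrix."""
    def __init__(self, n_channels: int, in_dim: int, out_dim: int):
        super().__init__()
        self.adj = nn.Parameter(torch.randn(n_channels, n_channels) * 0.01)
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):  # x: (batch, channels, features)
        # Normalize the learned topology, then propagate features along edges.
        a = torch.softmax(torch.relu(self.adj), dim=-1)
        return torch.relu(self.proj(a @ x))

dgc = DynamicGraphConv(n_channels=62, in_dim=5, out_dim=32)
out = dgc(torch.randn(8, 62, 5))
print(out.shape)  # torch.Size([8, 62, 32])
```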
Facial analysis evaluates the physical appearance of a person, which is crucial for several clinical settings. Real-world faces captured in uncontrolled environments make it harder for gender prediction algorithms to identify gender correctly, and these factors reduce the accuracy of the most advanced algorithms currently in use for real-time facial gender prediction. Importantly, facial gender prediction can pave the way for visually challenged persons to identify gender and age. To overcome these challenges and defects, a dual shot face detector with a task-restricted fine-tuned deep neural network (DTFN) is created to recognize facial landmarks for accurate gender and age prediction. Once facial photographs are gathered, bidirectional filtering and sigmoid stretching are applied as the main preprocessing methods to improve contrast and remove noise from the input image. Next, a modified dual shot face detector (DSFD), built around a capsule network (CapsNet), separates the face from the remaining background. A task-constrained deep convolutional neural network (TCDCN) is then used to extract and identify features from the facial landmarks. The extracted features are fed into a fine-tuned deep neural network (DNN) classifier, which classifies the data according to age and gender; fine-tuning is achieved by adjusting the hidden-layer parameters with the stochastic gradient descent technique. Experimental results show that the proposed technique achieves 96% accuracy; thus, the proposed approach is the best option for automatic facial landmark detection.
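Sigmoid stretching, one of the named preprocessing steps, can be sketched directly as a pointwise contrast transform; the gain and cutoff values below are illustrative, not the paper's.

```python
import numpy as np

def sigmoid_stretch(img: np.ndarray, gain: float = 10.0,
                    cutoff: float = 0.5) -> np.ndarray:
    """Sigmoid contrast stretching of an 8-bit grayscale image.

    Maps intensities through 1 / (1 + exp(gain * (cutoff - x))) on [0, 1],
    compressing the extremes and expanding mid-range contrast.
    """
    x = img.astype(np.float64) / 255.0
    y = 1.0 / (1.0 + np.exp(gain * (cutoff - x)))
    return (255.0 * y).astype(np.uint8)

face = (np.random.rand(128, 128) * 255).astype(np.uint8)
print(sigmoid_stretch(face).dtype, sigmoid_stretch(face).shape)
```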
ISBN (Print): 9798350349405; 9798350349399
Sample-efficient neural architecture search (NAS) techniques have advanced rapidly. Two lines of methods, namely neural predictors and sequential search, have shown promising performance in improving the sample efficiency of NAS. However, as far as we know, little attention has been paid to the middle ground between these two lines. Inspired by the analogy between NAS and evolutionary optimization, we propose Sample-Efficient Training for NAS (SET-NAS), based on strategies that improve fitness scores and sampling mechanisms. We develop a strong neural predictor, the Fully Bidirectional Graph Convolutional Network (Fully-BiGCN), which significantly enhances the predictive capability of the features in each layer. The predictor is embedded into an iterative stratified sampling process that retains only a subset of best-fit architectures under the same training budget. SET-NAS achieves remarkable results compared with the state of the art in predictor-based NAS: using NAS-Bench-201 as the benchmark, SET-NAS needs only 27.1% (CIFAR-10), 49.0% (CIFAR-100), and 51.75% (ImageNet-16) of the training cost of other state-of-the-art predictor-based methods to find a promising network architecture.
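A toy sketch of the predictor-guided iterative sampling idea: score a large candidate pool with a cheap predictor and spend the expensive training budget only on the best-fit stratum. The function names, pool size, and keep ratio are hypothetical, and the predictor-refitting step is elided.

```python
import random

def set_nas_loop(sample_space, predictor_score, true_eval,
                 budget: int = 100, pool: int = 1000, keep: float = 0.1):
    """Iteratively sample a pool, keep the predictor's top stratum, and
    train only those architectures until the budget is spent."""
    trained = {}
    while len(trained) < budget:
        candidates = [sample_space() for _ in range(pool)]
        candidates.sort(key=predictor_score, reverse=True)
        for arch in candidates[: int(pool * keep)]:
            if len(trained) >= budget:
                break
            trained[arch] = true_eval(arch)  # expensive ground-truth training
        # A real implementation would refit the predictor on `trained` here.
    return max(trained, key=trained.get)

# Toy demo: architectures are integers, "accuracy" is a noisy function of them.
best = set_nas_loop(
    sample_space=lambda: random.randrange(10_000),
    predictor_score=lambda a: -abs(a - 5000),            # proxy fitness
    true_eval=lambda a: -abs(a - 5000) + random.random(),
    budget=50,
)
print(best)
```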
Video captioning aims to identify multiple objects and their behaviours in a video event and generate captions for the current scene. The task is to generate a detailed natural-language description of the current video in real time, which requires deep learning to analyze and determine the relationships between objects of interest in the frame sequence. In practice, existing methods typically detect objects in the frame sequence and then generate captions from features extracted at the object coverage locations, so the quality of caption generation depends heavily on the performance of object detection and identification. This work proposes an advanced video captioning approach that adaptively and effectively addresses the interdependence between event proposals and captions. An attention-based multimodal framework is introduced to capture the main context from the frames and sound in the video scene, and an intermediate model collects the hidden states captured from the input sequence, extracting the main features and implicitly producing multiple event proposals. For caption prediction, the proposed method employs the CARU layer with attention as the primary RNN layer for decoding. Experimental results show that the proposed work improves on the baseline method and outperforms other state-of-the-art models on the ActivityNet dataset, presenting competitive results on video captioning tasks.
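A sketch of one attention-based decoding step in PyTorch. Since CARU is not available in torch, a GRUCell stands in for it here, and the additive attention form and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class AttnDecoderStep(nn.Module):
    """One decoding step: additive attention over encoder hidden states,
    then a recurrent update (GRUCell substituting for the paper's CARU)."""
    def __init__(self, enc_dim: int, hid_dim: int, emb_dim: int):
        super().__init__()
        self.attn = nn.Linear(enc_dim + hid_dim, 1)
        self.cell = nn.GRUCell(emb_dim + enc_dim, hid_dim)

    def forward(self, word_emb, enc_states, h):
        # enc_states: (B, T, enc_dim); h: (B, hid_dim)
        T = enc_states.size(1)
        q = h.unsqueeze(1).expand(-1, T, -1)
        scores = self.attn(torch.cat([enc_states, q], dim=-1)).squeeze(-1)
        ctx = (torch.softmax(scores, dim=1).unsqueeze(-1) * enc_states).sum(1)
        return self.cell(torch.cat([word_emb, ctx], dim=-1), h)

step = AttnDecoderStep(enc_dim=512, hid_dim=256, emb_dim=128)
h = step(torch.randn(4, 128), torch.randn(4, 20, 512), torch.zeros(4, 256))
print(h.shape)  # torch.Size([4, 256])
```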
ISBN (Print): 9798350349405; 9798350349399
Noisy images are a challenge to image compression algorithms due to the inherent difficulty of compressing noise. As noise cannot easily be discerned from image details, such as high-frequency signals, its presence leads to extra bits being needed for compression. Since the emerging learned image compression paradigm enables end-to-end optimization of codecs, recent efforts have integrated denoising into the compression model, relying on clean image features to guide denoising. However, these methods perform suboptimally under high noise levels and lack the capability to generalize across diverse noise types. In this paper, we propose a novel method for joint image compression and denoising that integrates a multi-scale denoiser composed of Self-Organizing Operational Neural Networks. We employ contrastive learning to boost the network's ability to differentiate noise from high-frequency signal components by emphasizing the correlation between noisy and clean counterparts. Experimental results demonstrate the effectiveness of the proposed method in both rate-distortion performance and codec speed, outperforming the current state of the art.
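The contrastive component can be illustrated with an InfoNCE-style objective that pulls each noisy embedding toward its clean counterpart and pushes it away from other images in the batch; this is a generic sketch, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def info_nce(noisy_feat, clean_feat, temperature: float = 0.1):
    """Contrastive loss over a batch of (noisy, clean) embedding pairs.

    Positives sit on the diagonal of the similarity matrix; every other
    clean embedding in the batch acts as a negative.
    """
    n = F.normalize(noisy_feat, dim=1)
    c = F.normalize(clean_feat, dim=1)
    logits = n @ c.t() / temperature       # (B, B) cosine-similarity matrix
    targets = torch.arange(n.size(0))      # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(16, 256), torch.randn(16, 256))
print(loss.item())
```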