Search Results - Inner Mongolia University Library

30th IEEE International Conference on Image Processing (ICIP)

Authors: Liu, Wei; Zhang, Huigang; Xia, Xiaojie; Wang, Liuan; Sun, Jun (Fujitsu R&D Ctr Co LTD, Beijing, Peoples R China)

ISBN: (print) 9781728198354

Image segmentation is a challenging task because of complex object appearance and diverse object categories. Traditional methods directly use visual features for segmentation but ignore the correlation between objects. We introduce a knowledge reasoning module (KRM) for external knowledge aggregation and leverage a graph neural network to aggregate the knowledge feature, which is concatenated with a visual feature for semantic segmentation. To this end, we use word embeddings of category names as semantic features and establish the relationships between categories. Through iteration, the aggregated features are enriched. In experiments, three well-known semantic segmentation methods are used as baselines. Our results outperform the baseline methods on the food dataset Food-Seg103 and on Cityscapes, demonstrating the effectiveness of the proposed method.

Keywords: knowledge reasoning; semantic segmentation
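
As a rough illustration of the aggregation idea in the abstract above, the following sketch propagates per-category embeddings over a category graph and concatenates the result with a visual feature. The adjacency matrix, the mean-over-neighbors update, the toy vectors, and the function names `aggregate` and `fuse` are all assumptions made for illustration; this listing does not describe the paper's actual KRM.

```python
def aggregate(embeddings, adjacency, steps=2):
    """Iteratively average each category's embedding with its neighbors'."""
    feats = [list(e) for e in embeddings]
    n, dim = len(feats), len(feats[0])
    for _ in range(steps):
        new_feats = []
        for i in range(n):
            # Neighbors of category i, plus a self-loop so its own feature is kept.
            nbrs = [j for j in range(n) if adjacency[i][j] or j == i]
            new_feats.append(
                [sum(feats[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
            )
        feats = new_feats
    return feats


def fuse(visual_feature, knowledge_feature):
    """Concatenate a visual feature vector with a knowledge feature vector."""
    return list(visual_feature) + list(knowledge_feature)


# Toy "word embeddings" for three hypothetical categories.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical relationship graph: categories 0 and 1 are related, 2 is isolated.
adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

knowledge = aggregate(embeddings, adjacency, steps=1)
fused = fuse([0.3, 0.7, 0.1], knowledge[0])
```

Running more `steps` lets information travel further along the graph, which loosely mirrors the abstract's remark that iteration enriches the aggregated features.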

Research and realization of mural painting disease recognition methods

4th International Symposium on Computer Technology and Information Science, ISCTIS 2024

Authors: Shi, Nuo; Yang, Lei; He, Pengju; Li, Yixian; Wang, Miaoqi; Zhao, Haochen (Xi'an Eurasian College, School of Information Engineering, Shaanxi, Xi'an, China)

ISBN: (print) 9798350354560

Mural paintings are treasures of Chinese culture and are of high value. Acoustic emission technology, combined with digital signal processing and convolutional neural network methods, can identify the type of mural painting disease in Tang tombs non-destructively and in real time. The occurrence of two kinds of disease, fracture and shedding, is simulated by lead breakage and tape sticking; the acoustic emission signal generated by the disease is collected and preprocessed; after a wavelet transform, the signal processing task is converted into an image recognition problem, and a convolutional neural network model classifies the image and outputs the type of disease. The disease recognition accuracy exceeds 95%, which is a significant advantage in mural disease recognition and can provide strong support for the protection of cultural relics. © 2024 IEEE.

Keywords: Acoustic emission testing
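
To make the "wavelet transform to image" step of the abstract above more concrete, here is a minimal sketch that turns a 1D signal into a 2D time-scale magnitude map (a scalogram) of the kind that could then be fed to an image classifier. The real-valued Morlet-style wavelet, the scale values, the toy impulse signal, and the function names are assumptions; the listing does not say which wavelet or preprocessing the authors used.

```python
import math


def morlet(t, w0=5.0):
    """Real-valued Morlet-style wavelet: a cosine under a Gaussian envelope."""
    return math.exp(-0.5 * t * t) * math.cos(w0 * t)


def scalogram(signal, scales):
    """Return |wavelet coefficients|: one row per scale, one column per sample."""
    n = len(signal)
    image = []
    for s in scales:
        row = []
        for shift in range(n):
            # Correlate the signal with the wavelet stretched by s and centered at shift.
            acc = sum(signal[k] * morlet((k - shift) / s) for k in range(n))
            row.append(abs(acc) / math.sqrt(s))
        image.append(row)
    return image


# Toy stand-in for an acoustic emission burst: a single impulse at sample 10.
signal = [0.0] * 32
signal[10] = 1.0

image = scalogram(signal, scales=[1.0, 2.0, 4.0])
```

Each row of `image` peaks at the sample where the burst occurs, so the 2D array localizes the event in time across scales.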

Region of interest enabled learned image coding for machines

25th IEEE International Workshop on Multimedia Signal Processing (MMSP)

Authors: Ahonen, Jukka I; Le, Nam; Zhang, Honglei; Cricri, Francesco; Rahtu, Esa (Nokia Technol, Tampere, Finland; Tampere Univ, Tampere, Finland)

ISBN: (print) 9798350338935

Image and video coding for machines has been recently gaining more and more interest from both the industry and the research community. One successful approach is based on end-to-end (E2E) learned compression and has shown significant gains over the state-of-the-art conventional image coding methods. However, one of the remaining challenges for such E2E-learned image codecs for machines is to adaptively allocate the bits over different regions of the image, while retaining the machine vision performance. In this paper, we propose a method that leverages Regions-Of-Interest (ROIs) for bitrate allocation within a Learned Image Codec (LIC) for machines. In particular, the proposed method reduces the bits allocated for the background regions of the image by reducing the variance of the elements corresponding to the background regions in the latent representation. This results in more heavily quantized background areas, while keeping the quality of the ROI areas suitable for machine tasks. The proposed method achieves significant gains, -15.80% and -22.43% Pareto BD-rate reduction, over the baseline LIC on object detection and instance segmentation tasks, respectively. To the best of our knowledge, this is the first research paper proposing an ROI-based inference-time technology for Learned Image Coding for machines.

Keywords: region of interest; learned image coding; video coding for machines; machine vision; neural networks
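
The variance-reduction idea in the abstract above can be sketched as follows. This is only one plausible reading: it shrinks background latent elements toward the background mean by a fixed factor. The `scale` parameter, the choice of the mean as the shrink target, the toy latent, and the function names are assumptions, not the authors' published procedure.

```python
def shrink_background(latent, roi_mask, scale=0.5):
    """Scale background latent elements toward the background mean.

    latent and roi_mask are 2D lists of the same shape; roi_mask holds 1 for
    ROI positions and 0 for background. A scale below 1 reduces the variance
    of the background elements and leaves ROI elements untouched.
    """
    background = [
        v for row, mrow in zip(latent, roi_mask) for v, m in zip(row, mrow) if not m
    ]
    if not background:
        return [list(row) for row in latent]
    mean = sum(background) / len(background)
    return [
        [v if m else mean + scale * (v - mean) for v, m in zip(row, mrow)]
        for row, mrow in zip(latent, roi_mask)
    ]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


latent = [[4.0, -2.0, 3.0], [0.0, 6.0, -4.0]]
roi_mask = [[1, 0, 0], [0, 0, 1]]  # hypothetical ROI: two corner elements
shrunk = shrink_background(latent, roi_mask, scale=0.5)

bg_before = [latent[0][1], latent[0][2], latent[1][0], latent[1][1]]
bg_after = [shrunk[0][1], shrunk[0][2], shrunk[1][0], shrunk[1][1]]
```

With `scale=0.5` the background variance falls to a quarter of its original value, so a subsequent rounding quantizer would spend fewer distinct symbols on those positions.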

Semantic-oriented learning-based image compression by Only-Train-Once quantized autoencoders

Signal, Image and Video Processing, 2023, Vol. 17, No. 1, pp. 285-293

Authors: Sebai, D.; Shah, A. Ulah (Univ Manouba, Natl Sch Comp Sci, Cristal Lab, Manouba, Tunisia; Univ Tun Hussein Onn, Fac Comp Sci & Informat Technol, Parit Raja, Johor, Malaysia)

Access to large training datasets, together with current advances in computing power, has spurred interest in leveraging deep learning for image compression. This requires training and deploying separate networks for rate adaptation, which is impractical and expensive in terms of memory cost and power consumption, especially for broad bitrate ranges. To deal with this limitation, variable-rate compression methods use the Lagrange multiplier to control the rate/distortion trade-off so that the neural network need not be retrained for each rate. However, they do not optimize bit allocation for eye-catching foreground details, and do not consider the different degrees of attention that the human eye pays to each area of the image. Thus, other deep learning-based image compression approaches, which could outperform the above ones, rely on the use of additional information. In this paper, we present a loss-conditional autoencoder tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy variable-rate compression. Our framework is a neural network-based scheme able to automatically optimize coding parameters with a multi-term perceptual loss function based on a semantic-important Structural SIMilarity index. To ensure rate adaptation, we suggest modulating the compression network on the bitwidth of its activations by quantizing them according to several bitwidth values. Experiments are presented on the JPEG AI dataset, in which our method achieves competitive and higher visual quality for the same compressed size when compared to conventional codecs and related work.

Keywords: Learning-based image compression; Variable-rate compression; Loss-conditional autoencoder; Only-Train-Once; Quantized autoencoders; Multi-term loss function
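
The rate-adaptation idea in the abstract above, quantizing activations at several bitwidths, can be illustrated with a plain uniform quantizer. This is a generic sketch under assumed conditions (activations clipped to a fixed range `[lo, hi]`, round-to-nearest levels); the function name and parameters are hypothetical and the paper's quantizer may differ.

```python
def quantize(activations, bitwidth, lo=0.0, hi=1.0):
    """Uniformly quantize activations, clipped to [lo, hi], to 2**bitwidth levels."""
    levels = 2 ** bitwidth - 1          # number of steps between lo and hi
    step = (hi - lo) / levels
    out = []
    for a in activations:
        clipped = min(max(a, lo), hi)   # keep the value inside the range
        out.append(lo + round((clipped - lo) / step) * step)
    return out


acts = [0.0, 0.1, 0.26, 0.5, 0.74, 1.2]
coarse = quantize(acts, bitwidth=1)     # at most 2 distinct values
fine = quantize(acts, bitwidth=4)       # at most 16 distinct values
```

A smaller bitwidth leaves fewer distinct activation values to encode, which lowers the rate at the cost of a larger quantization error; sweeping the bitwidth gives several operating points from the same network.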

A New Non-Convex Framework to Improve Asymptotical Knowledge on Generic Stochastic Gradient Descent

33rd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2023

Authors: Fest, Jean-Baptiste; Repetti, Audrey; Chouzenoux, Emilie (CVN, CentraleSupélec, Inria, Université Paris-Saclay, 9 rue Joliot Curie, Gif-sur-Yvette, France; Heriot-Watt University, School of Engineering & Physical Sciences, School of Mathematical & Computer Sciences, Edinburgh EH14 4AS, United Kingdom)

ISBN: (print) 9798350324112

Stochastic gradient optimization methods are broadly used to minimize non-convex smooth objective functions, for instance when training deep neural networks. However, theoretical guarantees on the asymptotic behaviour of these methods remain scarce. In particular, ensuring almost-sure convergence of the iterates to a stationary point is quite challenging. In this work, we introduce a new Kurdyka-Lojasiewicz theoretical framework to analyze the asymptotic behaviour of stochastic gradient descent (SGD) schemes when minimizing non-convex smooth objectives. In particular, our framework provides new almost-sure convergence results on iterates generated by any SGD method satisfying mild conditional descent conditions. We illustrate the proposed framework by means of several toy simulation examples. We illustrate the role of the considered theoretical assumptions and investigate how SGD iterates are affected depending on whether these assumptions are fully or partially satisfied. © 2023 IEEE.

Keywords: Deep neural networks
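
A toy simulation in the spirit of the abstract above might look like the sketch below: SGD with additive gradient noise and a decaying step size on a simple non-convex objective. The objective f(x) = x^4/4 - x^2/2, the noise model, the step-size schedule, and the function names are all assumptions chosen for illustration; they are not the examples used in the paper.

```python
import random


def grad(x):
    """Gradient of the non-convex toy objective f(x) = x**4 / 4 - x**2 / 2."""
    return x ** 3 - x


def sgd(x0, steps=5000, noise_std=0.1, seed=0):
    """Run SGD with Gaussian gradient noise and a decaying step size."""
    rng = random.Random(seed)
    x = x0
    for k in range(steps):
        step = 0.1 / (1 + 0.01 * k)                      # decaying step size
        noisy_grad = grad(x) + rng.gauss(0.0, noise_std)  # stochastic gradient
        x -= step * noisy_grad
    return x


x_final = sgd(x0=2.0)
```

The objective has stationary points at -1, 0, and 1. The step sizes here sum to infinity while their squares sum to a finite value, a classical condition under which the noise is averaged out and the iterate is expected to settle near a stationary point.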

Multilevel thresholding satellite image segmentation using chaotic coronavirus optimization algorithm with hybrid fitness function

Neural Computing & Applications, 2023, Vol. 35, No. 1, pp. 855-886

Authors: Hosny, Khalid M.; Khalid, Asmaa M.; Hamza, Hanaa M.; Mirjalili, Seyedali (Zagazig Univ, Fac Comp & Informat, Dept Informat Technol, Zagazig 44519, Egypt; Torrens Univ Australia, Ctr Artificial Intelligence Res & Optimisat, Brisbane, Qld 4006, Australia)

Image segmentation is a critical step in digital image processing applications. One of the most preferred methods for image segmentation is multilevel thresholding, in which a set of threshold values is determined to divide an image into different classes. However, the computational complexity increases when the number of required thresholds is high. Therefore, this paper introduces a modified Coronavirus Optimization algorithm for image segmentation. In the proposed algorithm, the chaotic map concept is added to the initialization step of the naive algorithm to increase the diversity of solutions. A hybrid of two commonly used methods, Otsu's method and Kapur's entropy, is applied to form a new fitness function to determine the optimum threshold values. The proposed algorithm is evaluated using two different datasets, including six benchmarks and six satellite images. Various evaluation metrics are used to measure the quality of the images segmented by the proposed algorithm, such as mean square error, peak signal-to-noise ratio, Structural Similarity Index, Feature Similarity Index, and Normalized Correlation Coefficient. Additionally, the best fitness values are calculated to demonstrate the proposed method's ability to find the optimum solution. The obtained results are compared to eleven powerful and recent metaheuristics and prove the superiority of the proposed algorithm in the image segmentation problem.

Keywords: image segmentation; Optimization; Thresholding; Metaheuristic; Satellite
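
The two ingredients named in the abstract above, a hybrid Otsu/Kapur fitness and chaotic initialization, can be sketched as follows. The weighted sum with `alpha`, the logistic map as the chaotic map, the toy 8-bin histogram, and all function names are assumptions; the listing does not state how the paper combines the two criteria or which chaotic map it uses.

```python
import math


def otsu_variance(hist, thresholds):
    """Between-class variance (Otsu's criterion) for a normalized histogram."""
    total_mean = sum(i * p for i, p in enumerate(hist))
    bounds = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for a, b in zip(bounds, bounds[1:]):   # class covers gray levels a..b-1
        w = sum(hist[a:b])
        if w > 0:
            mu = sum(i * hist[i] for i in range(a, b)) / w
            score += w * (mu - total_mean) ** 2
    return score


def kapur_entropy(hist, thresholds):
    """Sum of per-class entropies (Kapur's criterion)."""
    bounds = [0] + sorted(thresholds) + [len(hist)]
    total = 0.0
    for a, b in zip(bounds, bounds[1:]):
        w = sum(hist[a:b])
        if w > 0:
            total -= sum((p / w) * math.log(p / w) for p in hist[a:b] if p > 0)
    return total


def hybrid_fitness(hist, thresholds, alpha=0.5):
    """Weighted sum of both criteria; the weighting scheme is an assumption."""
    return alpha * otsu_variance(hist, thresholds) + (1 - alpha) * kapur_entropy(
        hist, thresholds
    )


def logistic_map(x0, n, r=4.0):
    """Generate n values of the logistic map x -> r * x * (1 - x)."""
    values, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        values.append(x)
    return values


# Toy bimodal histogram over 8 gray levels, normalized to sum to 1.
hist = [c / 16 for c in [4, 4, 0, 0, 0, 0, 4, 4]]

# Chaotic initialization: map chaotic values to candidate single thresholds.
candidates = [1 + int(v * (len(hist) - 1)) for v in logistic_map(0.3, 5)]
best = max(candidates, key=lambda t: hybrid_fitness(hist, [t]))
```

In a full metaheuristic the candidates would be refined iteratively; here the best chaotic candidate is simply picked by fitness. A threshold between the two histogram modes scores higher than one that cuts through a mode.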

Improving T2I-Adapter via Integration of Visual and Textual Conditions with Attention Mechanism

31st IEEE International Conference on Image Processing Challenges and Workshops, ICIPCW 2024

Authors: Zhu, Zheng-An; Fan, Xin-Yun; Chiang, Chen-Kuo (National Chung Cheng University, Department of Computer Science and Information Engineering, Taiwan)

ISBN: (print) 9798331515942

Recently, relying solely on text-to-image (T2I) generation has gradually proven insufficient to meet the demands of image generation. As a result, people have started exploring more controllable image-generation methods based on diffusion technology. In addition to using textual descriptions alone to generate images, extra control conditions such as Sketch, Segmentation, and Canny are being considered. However, existing T2I control adapters mostly lack a comprehensive integration of textual and image conditions. Additionally, in multi-layer neural networks, certain feature information may be lost. Therefore, this paper aims to address the aforementioned issues and systematically propose a new architecture. We utilize Cross Attention to merge textual descriptions and image features and introduce Coordinate Attention at each feature output to enhance the overall feature representation. Experimental results demonstrate that, compared to state-of-the-art methods, this approach achieves superior evaluation metrics and exhibits visual effects more in line with human assessment. © 2024 IEEE.

Keywords: Multilayer neural networks
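
The cross-attention step mentioned in the abstract above can be sketched with plain scaled dot-product attention. In this sketch image features act as queries and text features as keys and values, with no learned projection matrices; that direction, the toy vectors, and the function names are assumptions, since the listing does not describe the paper's exact layer.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output vector per query.
        out.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]
        )
    return out


image_feats = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical image tokens (queries)
text_feats = [[10.0, 0.0], [0.0, 10.0]]   # hypothetical text tokens (keys, values)
fused = cross_attention(image_feats, text_feats, text_feats)
```

Each image token ends up dominated by the text token it aligns with most strongly, which is the basic mechanism by which cross attention mixes the two conditions.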

Automated design of Convolutional Neural Network architecture using Gray Wolf Optimization for plant seedlings classification

8th IEEE International Conference on Image and Signal Processing and their Applications, ISPA 2024

Authors: Badis, Lamis; Aliouat, Wahiba; Bouchiba, Kenza (Faculty of Sciences and Applied Sciences, Department of Computer Sciences, University of Bouira, Algeria; Faculty of Sciences and Applied Sciences, LIM Laboratory, Department of Computer Sciences, University of Bouira, Algeria)

ISBN: (print) 9798350309249

Convolutional Neural Networks (CNNs) have gained significant popularity in image classification tasks, yet achieving their optimal design remains a challenge due to the vast array of possible layer configurations and associated hyperparameters. Selecting the best CNN model for a given task often demands considerable time investment in training numerous models. To address this issue, we propose an automated method for CNN architecture design, utilizing pretrained models as the backbone and employing Gray Wolf Optimization. This approach automatically generates and evaluates candidate CNN architectures for classifying plant seedlings. Our objective is to distinguish between weed and crop seedlings. Additionally, we introduce a gray wolf representation to encode CNN architectures and their hyperparameters. Our method combines the strengths of transfer learning from pre-trained models to extract meaningful image features with the optimization capabilities of the Gray Wolf Optimization (GWO) algorithm. By leveraging these techniques, our method achieves exceptional accuracy, surpassing state-of-the-art methods with a validation accuracy of up to 97.83%. This innovative approach offers a transformative tool for enhancing the accuracy of CNN models, tailored specifically to the dataset at hand. © 2024 IEEE.

Keywords: Convolutional neural networks
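
For readers unfamiliar with the optimizer named in the abstract above, here is a compact sketch of the standard Gray Wolf Optimization update on a toy continuous function. It does not encode CNN architectures as the paper does; the sphere objective, the population size, the bounds, and the function names are assumptions chosen to keep the example small.

```python
import random


def sphere(x):
    """Toy objective to minimize; stands in for a model's validation error."""
    return sum(v * v for v in x)


def gwo(objective, dim, bounds, wolves=10, iters=50, seed=0):
    """Minimize objective with a basic Gray Wolf Optimization loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    pack = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(wolves)]
    best, best_score = None, float("inf")
    for t in range(iters):
        pack.sort(key=objective)
        # The three best wolves lead the hunt; copy them so updates do not alias.
        alpha, beta, delta = (list(w) for w in pack[:3])
        if objective(alpha) < best_score:
            best, best_score = alpha, objective(alpha)
        a = 2.0 * (1 - t / iters)  # decreases linearly from 2 toward 0
        for wolf in pack:
            for d in range(dim):
                guided = []
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    dist = abs(C * leader[d] - wolf[d])
                    guided.append(leader[d] - A * dist)
                # Move to the average of the three leader-guided positions.
                wolf[d] = min(max(sum(guided) / 3, lo), hi)
    final = min(pack, key=objective)
    if objective(final) < best_score:
        best, best_score = list(final), objective(final)
    return best, best_score


best, score = gwo(sphere, dim=2, bounds=(-5.0, 5.0))
```

To search over architectures instead, each wolf position would have to be decoded into layer choices and hyperparameters, and `objective` would train and validate the decoded model; that decoding is the part the paper contributes and is not reproduced here.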

MSV-RGNN: MULTISCALE VOXEL GRAPH NEURAL NETWORK FOR 3D OBJECT DETECTION

30th IEEE International Conference on Image Processing (ICIP)

Authors: Lee, Wonjoon; Woo, Sungmin; Kim, Donghyeong; Lee, Sangyoun (Yonsei Univ, Sch Elect & Elect Engn, Seoul, South Korea)

ISBN: (print) 9781728198354

This paper proposes a two-stage 3D object detection framework, multiscale voxel graph neural network (MSV-RGNN), which aims to fully exploit multiple-scale graph features by establishing global and local relationships between voxel features at different 3D convolutional neural network (CNN) layers. In contrast to conventional graph-based methods, our proposed multiscale-voxel-graph region-of-interest (RoI) pooling module constructs graphs across diverse voxel resolutions to obtain geometric structure information on voxel features. Initially, our multiscale-voxel-graph RoI pooling module samples voxel center points with voxel-wise feature vectors and 3D region proposals from the backbone network. Subsequently, graphs are constructed at different scales and graph features are aggregated for second-stage refinement. The experimental results demonstrate the potential of using multiscale graphs across different voxel resolutions for 3D object detection, achieving decent results compared with state-of-the-art methods.

Keywords: multiscale graph; 3D object detection; voxel
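
The idea of building graphs at several voxel resolutions, as described in the abstract above, can be sketched on raw points as follows. The voxel sizes, the k-nearest-neighbor rule, the toy point cloud, and the function names are assumptions; the paper builds its graphs on voxel features from 3D CNN layers inside RoI proposals, which this sketch does not model.

```python
import math


def voxel_centers(points, voxel_size):
    """Return the sorted centers of the voxels occupied by the given 3D points."""
    cells = {tuple(math.floor(c / voxel_size) for c in p) for p in points}
    return sorted(tuple((i + 0.5) * voxel_size for i in cell) for cell in cells)


def knn_graph(nodes, k):
    """Connect each node to its k nearest neighbors with directed edges."""
    edges = []
    for i, a in enumerate(nodes):
        others = sorted((math.dist(a, b), j) for j, b in enumerate(nodes) if j != i)
        edges.extend((i, j) for _, j in others[:k])
    return edges


points = [(0.1, 0.1, 0.1), (0.9, 0.2, 0.1), (1.6, 0.1, 0.3), (3.2, 0.4, 0.2)]

graphs = {}
for size in (1.0, 2.0):  # two hypothetical voxel resolutions
    nodes = voxel_centers(points, size)
    graphs[size] = (nodes, knn_graph(nodes, k=1))
```

The finer resolution keeps more nodes and captures local structure, while the coarser one merges nearby points into fewer nodes and links regions that are farther apart. `math.dist` requires Python 3.8 or later.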

Detectify: Image Tampering Detection using Error Level Analysis (ELA) and Convolutional Neural Network (CNN)

2024 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication, IConSCEPT 2024

Authors: Geethanjali, T.M.; Darshan, T.S.; Surya, K.; Rahul, H.U.; Sheety, Ipshika N (PES College of Engineering, Vishvesvaraya Technological University, Dept. Information Science and Engineering, Mandya, India)

ISBN: (print) 9798331540685

In the evolving digital landscape, the proliferation of manipulated images poses a significant challenge to the authenticity and integrity of visual content. This project investigates cutting-edge image manipulation detection techniques, employing a combination of Error Level Analysis (ELA) and Convolutional Neural Networks (CNN) for robust prediction. Focusing on the widely used CASIA V2.0 dataset, the study provides a comprehensive evaluation of image manipulation methods. Error Level Analysis is utilized as a forensic tool to identify alterations in the compression levels of manipulated images. By scrutinizing variations in error levels, the project aims to enhance the detection accuracy of manipulated regions within visual content. The CNN model is meticulously crafted and trained using preprocessed ELA images to acquire nuanced features essential for discerning tampering-induced alterations. The proposed hybrid approach, integrating ELA and CNN, establishes a robust framework for detecting image manipulation that is adaptable and efficient. Through the meticulous examination of the CASIA V2.0 dataset, this project contributes to ongoing efforts in combating digital image manipulation. This study serves as a valuable resource for forensic analysts, researchers, and practitioners working towards ensuring the veracity of digital images, offering a nuanced understanding of image manipulation techniques in the contemporary digital era. © 2024 IEEE.

Keywords: Convolutional neural networks
