ISBN (digital): 9798350318609
ISBN (print): 9798350318616
Brand logo image research examines how a critical dimension of logo design, namely the naturalness of the logo color, influences brand design induced by the logo. However, the task shows high complexity, requires higher-quality data, and is susceptible to overfitting when training data are insufficient. The proposed system uses a Deep Convolutional Generative Adversarial Network (DCGAN) to reduce overfitting and to handle noise and distortion in the logo image efficiently, which is essential in brand logo image processing. The brand logo image dataset forwarded to the preprocessing stage included pixel values, which may lead to the loss of contrast, shadow, sharpness, noise removal, and structure in some relevant images. The classification stage uses DCGAN, aiming to achieve high-quality logo image prediction, noise reduction, and handling of overfitting by incorporating synthetic logo images. The proposed DCGAN method achieves promising performance, including a mean Average Precision (mAP) of 52.25, accuracy of 97.09%, precision of 97.01%, recall of 98.10%, and F1-score of 97.11%. This performance is compared with existing methods such as YOLOv3 and Convolutional Neural Network (CNN) - Long Short-Term Memory (LSTM).
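For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are commonly derived from confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and are not taken from the paper, which does not state its counts or averaging scheme.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```

For a multi-class logo dataset these quantities would typically be computed per class and then averaged; which averaging the paper uses is not stated in the abstract.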
Image processing algorithms have important application value in 3D real-time rendering of power grids. With reasonable selection and application of image processing algorithms, the effect and performance of 3D visualization of power grids can be improved. This paper aims to develop and evaluate a 3D real-time rendering model of the power grid to meet the needs of power systems in visualization, interaction, and analysis. To achieve this goal, the paper adopts a comprehensive set of methods and constructs a 3D real-time rendering model with four modules: data input, rendering, interaction, and analysis. The rendering module combines a variety of image processing algorithms to ensure high-quality rendering results; the interaction module provides users with an intuitive and easy-to-operate interface. To verify the performance of the model, several groups of simulation experiments were designed, and the whole process and data of the experiments were recorded in detail. The results show that the model achieves a stable 60 FPS (frames per second) on a high-performance computer, and the image quality was highly praised by users. In addition, the average user experience score reached 9.2 points. The model can not only provide powerful visual support for power systems but also serve as an analysis tool to support decision-making and planning.
Computer vision has significantly impacted information technology over the last few years. Image processing lays the groundwork for computer vision with important components, including image filtering and image inpainting. Digital image processing is one of the broadest areas of study, with applications in the military, medicine, space, automobiles, security systems, and many other fields. Metrics used to gauge the effectiveness of a system's noise suppression include the structural similarity index, universal image quality index, texture detection, mean square error, peak signal-to-noise ratio, and many others. This paper includes an evolutionary examination of several image-filtering methods. Among these, the most significant are median filtering, pixel similarity weighted frame averaging, bilateral filtering, anisotropic diffusion, and Gaussian filtering. The study also presents an evolutionary analysis of image inpainting methods based on parallel pipelines, convolutional neural networks, and generative adversarial networks. The analysis of numerous works on improving image quality reveals that a hybrid model of image filtering and image inpainting is a promising solution for achieving the image quality required by computer vision applications.
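As an illustration of two items named in this abstract, the sketch below applies a median filter to a small grayscale image and scores the result with mean square error and peak signal-to-noise ratio. It is a plain-Python sketch under simplifying assumptions (2D lists of integers, clamped borders, a single hypothetical noise pixel); the function names are illustrative and do not come from the paper.

```python
import math
from statistics import median


def median_filter(image, radius=1):
    """Median-filter a 2D list of grayscale values; borders clamp indices."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)
    return out


def mse(a, b):
    """Mean square error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb)) / n


def psnr(a, b, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(max_value ** 2 / err)


# Hypothetical test image: flat gray with one salt-noise pixel.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 255
restored = median_filter(noisy)
```

Comparing `psnr(clean, noisy)` with `psnr(clean, restored)` shows the filter removing the isolated outlier; real images would need a library implementation for speed.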
ISBN (digital): 9798350378917
ISBN (print): 9798350378924
The construction safety of transformer stations is an important issue in the power industry, and traditional safety inspection methods suffer from high labor costs and low efficiency. Construction safety detection systems for transformer stations based on Convolutional Neural Networks (CNNs) have attracted widespread attention in recent years, but there are still relatively few comprehensive summaries of their research methods and results. This study elaborates on the construction of a CNN-based transformer station construction safety detection system and compares it in depth with traditional SVM models. Analysis of the research literature shows that CNNs can effectively distinguish between safe and dangerous behaviors by learning image features and contextual relationships, improving detection accuracy. The parallel computing and GPU acceleration capabilities of CNNs significantly shorten detection time and improve real-time performance.
The purpose of the medical image segmentation task is to delineate different organs or lesion regions in the image, which is an important aid for intelligent clinical diagnosis. Recent approaches fail to obtain reliable attention, are computationally intensive, and do not exploit the relationships between different samples. We combine convolution and Transformer effectively to establish MCTE for medical image segmentation. The proposed MCTE is an end-to-end network based on U-Net that learns three types of attention in parallel: local attention learned with convolution over the channel and spatial dimensions, global attention learned with the lower computational cost of the Swin Transformer, and external attention learned with two shared memories that store all medical image information. Extensive experimental results on the ACDC and Synapse datasets, which are widely used for evaluating medical image segmentation methods, demonstrate that the proposed method exceeds the compared baseline.
ISBN (digital): 9798331510916
ISBN (print): 9798331510923
This research addresses the balance of style expression and retention of content in arbitrary image style transfer by proposing an improved CAST framework integrated with a Multi-Scale Convolutional Attention module. Existing style transfer methods often suffer from detail loss or structural distortion when processing complex style features. To solve these issues, we designed a parallel multi-scale convolutional structure that simultaneously captures local and global style features through different receptive fields, enhancing the system's perception of complex textures. We also introduced a Structural Similarity loss function to improve content structure preservation. Compared to existing methods, our framework achieves significant improvements in both style expressiveness and content preservation, generating more natural and artistic stylized images.
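To make the idea of a Structural Similarity loss concrete, here is a minimal sketch that computes a single-window (global) SSIM over flattened grayscale pixels and uses 1 - SSIM as the loss. This is a simplification assumed for illustration: practical SSIM losses usually average over local windows, and the abstract does not specify the paper's exact formulation. The function names and sample values are hypothetical; the constants follow the commonly used defaults of 0.01 and 0.03 times the dynamic range.

```python
def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM between two equally long lists of pixel values."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Stabilising constants, scaled by the dynamic range.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def ssim_loss(content, stylized):
    """Loss that is 0 for identical structure and grows as structure diverges."""
    return 1.0 - global_ssim(content, stylized)


# Hypothetical flattened images for illustration.
content = [10, 50, 90, 130, 170, 210]
same_loss = ssim_loss(content, content)
flipped_loss = ssim_loss(content, content[::-1])
```

In a style transfer setting this term would be added to the style and content losses so that the optimizer is penalized when the stylized output loses the structure of the content image.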
ISBN (digital): 9798350349399
ISBN (print): 9798350349405
Human motion prediction (HMP) refers to predicting the future body pose from the historical pose sequence. Many existing methods use Graph Convolutional Networks (GCNs) to model the human body and convert the human pose from the pose space to the trajectory space or 3D coordinates. Furthermore, GCNs treat the human pose as a generic graph formed by links between each pair of body joints, encoding the spatial dependencies of the pose as well as temporal information by working in trajectory space. We design a multi-stage distributed processing network that includes Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). The multi-stage strategy enables us to gradually acquire smoother inputs. Additionally, we incorporate an attention mechanism within the processing framework, which helps T-DGCN better capture temporal dependencies. As a result, the proposed network not only facilitates more effective feature extraction but also achieves state-of-the-art performance on the CMU-Mocap and 3DPW datasets. Our code is available at https://***/ihavenotgoodname/MSGAT.
ISBN (print): 9783030953881; 9783030953874
Anti-forensics (AF) technology has become a new field of cybercrime. The problems of existing forensic technologies should be considered from the criminal's perspective in order to improve existing AF technologies. There are two types of AF methods, namely data hiding and data destruction, and most AF tools are primarily based on data hiding. If the data can be intercepted by investigators during the AF process, the remaining data may be destroyed by the criminal, leaving investigators with no information about the data. To address this issue, this paper proposes an AF scheme with multi-device storage based on Reed-Solomon codes that combines data hiding and data destruction. The data is divided into multiple out-of-order data blocks and parity blocks, which are stored separately on different devices. This method can reduce the storage cost and protect the privacy of the data. Even if the data is destroyed, it allows AF investigators to recover the data. Security analysis showed that this AF method can prevent malicious, erroneous, or invalid files from being acquired and can ensure data security when data is stolen. Theoretical analysis indicated that the method makes file recovery difficult for investigators but easy for the AF user. Experimental results demonstrated that the proposed method is effective and has practical efficiency.
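The recovery property described above, where lost blocks can be rebuilt from the survivors, can be sketched with a small Reed-Solomon-style erasure code. The sketch below is an assumption-laden illustration, not the paper's scheme: it works over the prime field GF(257) rather than the GF(2^8) field used by most production codes, it does not shuffle block order, and the names `encode`, `decode`, and `_lagrange_eval` are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
P = 257  # prime modulus; every byte value 0..255 is a field element


def _lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi)] mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, k, n):
    """Split data into k data blocks plus n - k parity blocks (n <= 257)."""
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    blocks = {x: [] for x in range(n)}     # one block per (hypothetical) device
    for start in range(0, len(padded), k):
        stripe = padded[start:start + k]
        points = list(enumerate(stripe))   # data symbol i is the value at x = i
        for x in range(n):
            blocks[x].append(stripe[x] if x < k else _lagrange_eval(points, x))
    return blocks


def decode(surviving, k, length):
    """Rebuild the original bytes from any k surviving blocks."""
    if len(surviving) < k:
        raise ValueError("need at least k blocks to recover the data")
    chosen = sorted(surviving)[:k]
    out = bytearray()
    for s in range(len(surviving[chosen[0]])):
        points = [(x, surviving[x][s]) for x in chosen]
        out.extend(_lagrange_eval(points, i) for i in range(k))
    return bytes(out[:length])


# Hypothetical usage: 4 data blocks, 2 parity blocks, two devices destroyed.
secret = b"evidence log"
blocks = encode(secret, k=4, n=6)
del blocks[0], blocks[3]
recovered = decode(blocks, k=4, length=len(secret))
```

With k = 4 and n = 6, any two of the six blocks can be lost without losing the data, at a storage overhead of 1.5 times the original size; holding fewer than k blocks is not enough to run the recovery step.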
Deep learning requires training on massive data to get the ability to deal with unfamiliar data in the future, but it is not as easy to get a good model from training on massive data. Because of the requirements of de...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Temporal Language Grounding (TLG) aims to localize moments in untrimmed videos that are most relevant to natural language queries. While existing weakly-supervised methods have achieved significant success in exploring cross-modal relationships, they still face a critical bottleneck: the interference of task-irrelevant information in query embeddings. To address this issue, we propose TLG Frequency Spiking (TFS), a dimensional mask derived from the frequency domain that models the varying importance specific to different queries. By enhancing the understanding of queries, TFS effectively optimizes the cross-modal alignment of visual and textual modalities. Experimental results show that TFS significantly outperforms state-of-the-art baselines on both the Charades-STA and ActivityNet-Captions datasets.