Search Results - Inner Mongolia University Library

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 1, pp. 233-243

Authors: Zhang, Wanpeng; Zhou, Ziwei. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

Point cloud completion is crucial in point cloud processing, as it can repair and refine incomplete 3D data, ensuring more accurate models. However, current point cloud completion methods commonly face a challenge: they fail to fully utilize multi-scale information from local features, leading to limitations in accuracy and detail preservation. To address this issue, this paper proposes a multi-scale feature optimization algorithm for point cloud completion that integrates SoftPool. Based on DGCNN, the method combines dilated convolution and bottleneck attention mechanisms to extract features at different scales, enhancing the ability to capture detailed information in point clouds. The bottleneck attention mechanism is used to optimize important detail features. The extracted local features are concatenated with their corresponding positional information to form point proxies, enhancing the effective extraction of local geometric features, resulting in more refined completed point cloud shapes. A Transformer architecture is employed to model these features. Finally, SoftPool is introduced for fine-grained feature downsampling, improving the network's ability to recover point cloud details. FoldingNet is used to reconstruct missing structures and output the completed point cloud. To validate the model's completion performance, training and testing are conducted on the PCN and ShapeNet55 datasets. Experimental results demonstrate that the model has better feature detail retention and more accurate completion results. On the PCN dataset, the average CD value is reduced by 6.5% compared to the best-performing model among the comparison methods. On the ShapeNet55 dataset, the average CD value across three difficulty levels is reduced by 6.9% compared to the best-performing model among the comparison methods. Additionally, the model also achieved a 2.1% improvement in F-score. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Ability testing
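The abstract above reports its results as average CD (Chamfer distance) values. As a rough illustration of that metric only, and not of the paper's evaluation code, the sketch below computes a symmetric Chamfer distance between two small point sets. It assumes the unsquared Euclidean nearest-neighbour distance, averaged per direction and then over both directions; published benchmarks differ on squaring and scaling, and the function name `chamfer_distance` is hypothetical.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumption: mean unsquared Euclidean nearest-neighbour distance in each
    direction, averaged over the two directions. Benchmark conventions vary.
    """
    def one_way(src, dst):
        # For every source point, find the distance to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(points_a, points_b) + one_way(points_b, points_a))


# Hypothetical toy data: a "completed" cloud that misses one ground-truth point.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(predicted, ground_truth))
```

A lower value means the two clouds are closer. The brute-force search is quadratic in the number of points, so it is only suitable for small examples.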


Improving UAV Image Target Detection: A Novel Approach Using OptiDETR with Swin Transformer

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 3, pp. 771-780

Authors: Ma, Wenlong; Liu, Weisheng. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

In the analysis of drone aerial images, object detection tasks are particularly challenging, especially in the presence of complex terrain structures, extreme differences in target sizes, suboptimal shooting angles, and varying lighting conditions, all of which exacerbate the difficulty of recognition. In recent years, the DETR model based on the Transformer architecture has eliminated traditional post-processing steps such as NMS (Non-Maximum Suppression), thereby simplifying the object detection process and improving detection accuracy, which has garnered widespread attention in the academic community. However, DETR has limitations such as slow training convergence, difficulty in query optimization, and high computational costs, which hinder its application in practical fields. To address these issues, this paper proposes a new object detection model called OptiDETR. First, the model employs a more efficient hybrid encoder to replace the traditional Transformer encoder. The new encoder significantly enhances feature processing capabilities through internal and cross-scale feature interaction and fusion logic. Second, an IoU (Intersection over Union) aware query selection mechanism is introduced. This mechanism adds IoU constraints during the training phase to provide higher-quality initial object queries for the decoder, significantly improving the decoding performance. Additionally, the OptiDETR model integrates SW-Block into the DETR decoder, leveraging the advantages of Swin Transformer in global context modeling and feature representation to further enhance the performance and efficiency of object detection. To tackle the problem of small object detection, this study employs the SAHI algorithm for data augmentation. In a series of experiments, the model achieved a performance improvement of more than two percentage points in the mAP (mean Average Precision) metric compared to current mainstream object detection models. Furthermore, ther

Keywords: Decoding
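The abstract above uses IoU (Intersection over Union) as the constraint in its query selection step. The following is a minimal sketch of that quantity alone, not of OptiDETR's selection mechanism. The `(x1, y1, x2, y2)` corner format and the function name `box_iou` are assumptions made for illustration.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners.

    Assumption: x1 <= x2 and y1 <= y2 for both boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Degenerate boxes with zero total area are treated as non-overlapping.
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A value of 1.0 means the boxes coincide and 0.0 means they do not overlap.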


A Generative Model-Based Network Framework for Ecological Data Reconstruction

Computers, Materials & Continua, 2025, Vol. 82, No. 1, pp. 929-948

Authors: Shuqiao Liu; Zhao Zhang; Hongyan Zhou; Xuebo Chen. Affiliations: School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114051, China; School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

This study examines the effectiveness of artificial intelligence techniques in generating high-quality environmental data for species introductory site selection *** Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis data with Variational Autoencoder (VAE) and Generative Adversarial Network (GAN) the network framework model (SAE-GAN), is proposed for environmental data *** model combines two popular generative models, GAN and VAE, to generate features conditional on categorical data embedding after SWOT *** model is capable of generating features that resemble real feature distributions and adding sample factors to more accurately track individual sample *** data is used to retain more semantic information to generate *** model was applied to species in Southern California, USA, citing SWOT analysis data to train the *** show that the model is capable of integrating data from more comprehensive analyses than traditional methods and generating high-quality reconstructed data from them, effectively solving the problem of insufficient data collection in development *** model is further validated by the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) classification assessment commonly used in the environmental data *** study provides a reliable and rich source of training data for species introduction site selection systems and makes a significant contribution to ecological and sustainable development.

Keywords: Convolutional Neural Network (CNN); VAE; GAN; TOPSIS; data reconstruction


Steel Surface Defect Detection Algorithm Based on S-YOLOv8

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 3, pp. 644-652

Authors: Zhang, Xu; Cui, Wenhua; Tao, Ye; Shi, Tianwei. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China

Steel is widely used in industrial production and plays a pivotal role in ensuring product safety and longevity, so research on steel surface defect detection technology is of significant importance. This paper introduces a steel surface defect detection algorithm based on S-YOLOv8. The algorithm, built on YOLOv8n as the baseline model, first incorporates a shift-wise shift operator in the backbone network. This notably enhances accuracy compared to conventional CNN models while markedly reducing computational demands. Furthermore, the SF-Neck framework, which integrates the scale sequence feature fusion module (SSFF) and the triple feature encoder module (TFE) in the head network, enriches the network’s multi-scale information extraction capabilities. Subsequently, the adoption of the WIoU loss function enhances the overall detector performance. Lastly, the integration of the SEAM occlusion attention module refines the detection head of the YOLOv8 algorithm, effectively addressing defect occlusion challenges. Experiments conducted on the NEU-DET dataset show that the mAP value of the S-YOLOv8 model reaches 84.2%. Comparative analysis with other mainstream algorithms demonstrates a substantial enhancement in detection accuracy, alongside a reduction in missed detections and false detections. Consequently, this study offers a new technical approach to quality control within the steel manufacturing industry. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Benchmarking


Multi-label, Classification-based Prediction of Breast Cancer Metastasis Directions

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 1, pp. 1-10

Authors: Wang, Tingting; Fan, Qi; Tan, Liang; Zhang, Beier. Affiliations: School of Computer and Software Engineering, Anhui Institute of Information Technology, China; School of Computer Science and Technology, Huaibei Normal University, China

This study aims to predict the metastatic direction of primary breast cancer (BC), thereby assisting physicians in precise treatment and strict follow-up and effectively improving the prognosis. The clinical data of 293,946 patients with primary BC diagnosed between 2010 and 2015 were collected from the Surveillance, Epidemiology, and End Results database. Multiple interpolation and the Multi-label Synthetic Minority Over-sampling Technique were used for data analysis, and machine learning models were established for multi-label classification. Surgical information, lymph node status, distant metastasis, tumor size, chemotherapy, histological type, and radiotherapy had a significant influence as inputs. Compared with the k-nearest neighbor model, average accuracies of the decision tree and random forest (RF) models increased from 88.84% to 93.59% and 94.14%, respectively. Their average precision, recall rate, F1 score, area under the receiver operating characteristic curve and weighted-F1 increased from 87.24% to 95.85% and 94.74%, 87.73% to 90.40% and 91.76%, 87.07% to 92.16% and 93.45%, 97.11% to 99.53% and 99.95%, 82.13% to 89.44% and 90.48%, respectively. In conclusion, the RF model, which showed the best performance, can be used in multi-label prediction of BC metastasis directions, and can assist physicians in diagnosing and treating patients with primary BC. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Lung cancer
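The abstract above compares models by precision, recall, and F1 score in a multi-label setting. The sketch below shows one common way to compute these per label and then macro-average the F1 score. It is illustrative only: the label columns and toy predictions are hypothetical, the function name `per_label_scores` is an assumption, and the paper's exact averaging scheme is not stated in the abstract.

```python
def per_label_scores(y_true, y_pred):
    """Return a (precision, recall, f1) tuple for each label column.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per patient.
    Undefined ratios (zero denominators) are reported as 0.0 by assumption.
    """
    scores = []
    for j in range(len(y_true[0])):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores


# Hypothetical data: three label columns (for example, three metastasis sites).
y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]

scores = per_label_scores(y_true, y_pred)
macro_f1 = sum(f1 for _, _, f1 in scores) / len(scores)
print(scores, macro_f1)
```

Macro averaging weights every label equally. A weighted F1, as also mentioned in the abstract, would instead weight each label by its number of true instances.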


Semi-Supervised Skin Lesion Segmentation Based on Pseudo-Labels

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 2, pp. 325-332

Authors: Mu, Bo; Wei, JingXin; Zhang, Yujun. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

In recent years, deep learning has significantly advanced skin lesion segmentation. However, annotating medical image data is specialized and costly, while obtaining unlabeled medical data is easier. To address this challenge, we propose a semi-supervised learning strategy to improve segmentation accuracy by combining a small amount of annotated data with a larger volume of unlabeled data. Our approach employs a teacher-student model framework. In this framework, the teacher model generates pseudo-labels for the unlabeled data, and the student model is trained using both these pseudo-labels and the limited true labels. To improve the student model’s learning capacity, we introduce auxiliary segmentation heads that provide joint guidance during training. We use the cross-entropy (CE) loss function to quantify the discrepancies between the segmentation outputs of the main head and auxiliary heads. Since pseudo-labels generated by the teacher model may contain noise, we developed a mechanism to identify and exclude uncertain regions in each unlabeled image. This reduces pseudo-label noise and mitigates its negative impact on the student model. Our method demonstrates significant improvements in skin lesion segmentation on the publicly available ISIC2018 dataset, achieving Dice coefficients of 87.84% and 88.73% with only 5% and 10% of the total annotated data, respectively, outperforming existing methods. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Self-supervised learning
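The abstract above mentions two ideas that are easy to show in miniature: the Dice coefficient used for evaluation, and the exclusion of uncertain regions from teacher pseudo-labels. The sketch below is illustrative only. The confidence-threshold rule, the `threshold` value, and both function names are assumptions, since the abstract does not describe how the authors identify uncertain regions.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary (0/1) masks."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Two empty masks are treated as a perfect match by assumption.
    return 2 * intersection / total if total else 1.0


def confident_pseudo_labels(teacher_probs, threshold=0.8):
    """Turn teacher foreground probabilities into (label, keep) pairs.

    Assumed rule: a pixel is kept for student training only when the teacher's
    confidence in its chosen class reaches the threshold.
    """
    result = []
    for prob in teacher_probs:
        label = 1 if prob >= 0.5 else 0
        confidence = max(prob, 1.0 - prob)
        result.append((label, confidence >= threshold))
    return result


# Hypothetical flat masks and teacher outputs.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
print(confident_pseudo_labels([0.95, 0.55, 0.10]))
```

In a training loop, the `keep` flags would typically be used to zero out the loss on excluded pixels so that only confident pseudo-labels influence the student.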


Gloss-driven Conditional Diffusion Models for Sign Language Production

ACM Transactions on Multimedia Computing, Communications and Applications, 2025, Vol. 21, No. 4, pp. 1-17

Authors: Tang, Shengeng; Xue, Feng; Wu, Jingjing; Wang, Shuo; Hong, Richang. Affiliations: School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; School of Software, Hefei University of Technology, Hefei, China; School of Data Science; School of Information Science and Technology, University of Science and Technology of China, Hefei, China

Sign Language Production (SLP) aims to convert text or audio sentences into sign language videos corresponding to their semantics, which is challenging due to the diversity and complexity of sign languages, and cross-modal semantic mapping issues. In this work, we propose a Gloss-driven Conditional Diffusion Model (GCDM) for SLP. The core of the GCDM is a diffusion model architecture, in which the sign gloss sequence is encoded by a Transformer-based encoder and input into the diffusion model as a semantic prior condition. In the process of sign pose generation, the textual semantic priors carried in the encoded gloss features are integrated into the embedded Gaussian noise via cross-attention. Subsequently, the model converts the fused features into sign language pose sequences through T rounds of denoising. During the training process, the model uses the ground-truth labels of sign poses as the starting point, generates Gaussian noise through T rounds of noise addition, and then performs T rounds of denoising to approximate the real sign language gestures. The entire process is constrained by the MAE loss function to ensure that the generated sign language gestures are as close as possible to the real labels. In the inference phase, the model randomly samples a set of Gaussian noise, generates multiple sign language gesture sequence hypotheses under the guidance of the gloss sequence, and outputs a high-confidence sign language gesture video by averaging multiple hypotheses. Experimental results on the Phoenix2014T dataset show that the proposed GCDM method is competitive in both quantitative performance and qualitative visualization. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Keywords: Forming
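The abstract above describes training that starts from ground-truth sign poses and adds noise over T rounds. As a hedged sketch of that forward noising step only, the code below uses the standard DDPM closed form, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The linear beta schedule, its endpoints, and the function names are assumptions; the abstract does not state which schedule or parameterization GCDM uses.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, noise):
    """Noisy sample at step t from clean values x0 and standard normal noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]


# Hypothetical flattened pose coordinates for one frame.
pose = [0.2, -0.1, 0.4]
alpha_bars = linear_alpha_bars(1000)
noise = [random.gauss(0.0, 1.0) for _ in pose]
print(q_sample(pose, 999, alpha_bars, noise))
```

At small t the output stays close to the clean pose, and at the final step it is dominated by the noise term, which is why inference can start from pure Gaussian noise.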


The Improved Unet Semantic Segmentation Network for Remote Sensing Images

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 4, pp. 1187-1195

Authors: Zhu, Hang; Zhao, Ji. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

With the development of artificial intelligence, deep learning has been increasingly used to achieve automatic detection of geographic information, replacing manual interpretation and improving efficiency. However, remote sensing images exhibit small inter-class variance and large intra-class variance, making it challenging to extract valuable information. Additionally, the increasing resolution and size of remote sensing images in recent years have introduced more complexity in the types of information, further increasing the difficulty of extracting valuable data. This paper proposes an improved Unet semantic segmentation network (referred to as RAUnet). First, in the encoder, continuous convolutional blocks are enhanced to extract features. At the same time, the EMAM multi-scale attention module is employed for cross-channel learning, capturing information from different feature channels of the target and using the surrounding feature information to assist in distinguishing target information. To capture multi-directional long-range dependencies, the Lo2 module is used for long-range modeling, which captures not only local contextual information but also long-range dependencies. In the decoder, a Dysample upsampling module is used to restore feature details, and in the skip connection layer, features are added for feature fusion. Experimental results show that compared to mainstream models, the proposed method achieves superior segmentation results on the Potsdam and Vaihingen datasets. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Semantic Segmentation


A Dangerous Driving Behavior Detection Method Based on Improved YOLOv8s

Engineering Letters, 2025, Vol. 33, No. 3, pp. 721-731

Authors: Zhou, Tong; Zhang, Xiaoxia; Chen, Huilong. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

Detecting dangerous driving behavior is a critical research area focused on identifying and preventing actions that could lead to traffic accidents, such as smoking, drinking, yawning, and drowsiness, through technical methods. Advanced computer vision and machine learning technologies enable efficient detection models to monitor and analyze driver behavior, improving road safety. Due to challenges posed by complex environments, this paper introduces an enhanced detection algorithm, YOLOv8s-CDS, to improve the identification of dangerous driving behaviors. First, the ConvNeXt V2 module is integrated with the C2f module to form C2fNeb2, optimizing feature extraction for behaviors like smoking or phone use. Second, the DASI (Dimension-Aware Selective Integration) module enhances detection accuracy through multi-scale fusion and dimension perception. Additionally, the SCConv module replaces the Conv module in the Bottleneck, forming C2fSCConv, which reduces spatial redundancy and improves detection efficiency. A comprehensive experimental analysis of dangerous driving image datasets demonstrates that the mean average precision (mAP) of the YOLOv8s-CDS algorithm is 91.20%, which is 2.4% higher than that of the YOLOv8s algorithm. Compared to other object detection algorithms, such as Faster R-CNN, YOLOv5s, YOLOv7s, YOLOX and YOLOv8, YOLOv8s-CDS demonstrates greater practicality in detecting dangerous driving behaviors, contributing to a reduction in traffic accidents and enhancing the safety of life and property. © 2025, International Association of Engineers. All rights reserved.

Keywords: Highway accidents


Vision-Text Bidirectional Collaborative Image Captioning Algorithm

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 2, pp. 515-523

Authors: Li, Mei-Qi; Zhou, Zi-Wei. Affiliation: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

Image captioning is an interdisciplinary research hotspot at the intersection of computer vision and natural language processing, representing a multimodal task that integrates core technologies from both fields. This task requires the use of computer vision techniques to analyze and extract key visual features from images, followed by the application of natural language processing techniques to generate descriptive text that is syntactically and semantically aligned with human cognition. This process poses a significant challenge for computers. Existing models mostly ignore the relative positional information of visual objects and struggle to efficiently capture the complex relationships between visual and textual data. To address these challenges, we propose a vision-to-text bidirectional collaborative image captioning method. This approach extracts both visual features and positional information of objects, allowing the model to better understand the spatial relationships between objects. The CEW word embedding approach encodes textual information more profoundly, enhancing semantic expression and contextual understanding. In the decoding phase, a bidirectional cross-attention mechanism strengthens the interaction between vision and text, leading to improved accuracy in image understanding. The model is trained and tested on the MSCOCO 2014 dataset and compared with several popular models. Experimental results demonstrate that the proposed method achieves significant improvements on the CIDEr and BLEU-1 evaluation metrics with an increase of 1.5 and 1.1, respectively. In addition, we conduct ablation experiments, quantitative analysis, and qualitative analysis to comprehensively validate the effectiveness and stability of the proposed algorithm. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Embeddings

