Search Results - Inner Mongolia University Library

Hybrid Optimization Algorithm for Handwritten Document Enhancement

Computers, Materials & Continua, 2024, Vol. 78, No. 3, pp. 3763-3786

Authors: Shu-Chuan Chu, Xiaomeng Yang, Li Zhang, Václav Snášel, Jeng-Shyang Pan. Affiliations: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China; School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China; Faculty of Electrical Engineering and Computer Science, VŠB-TU Ostrava, Ostrava 70080, Czech Republic; Department of Information Management, Chaoyang University of Technology, Taichung 41349, Taiwan

The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance; however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, named the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, whereas the adaptive parameter strategy mitigates the need for manual parameter setting. By amalgamating the hybrid strategy and parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA. Through a performance analysis of the CEC2013 benchmark, the AIWGOA demonstrates notable advantages across various metrics. Subsequently, an evaluation index was employed to assess the enhanced handwritten documents and images, affirming the superior practical application of the AIWGOA compared with other algorithms.

Keywords: Metaheuristic algorithm; gannet optimization algorithm; hybrid algorithm; handwritten document enhancement


Domain generalization with semi-supervised learning for people-centric activity recognition

Science China (Information Sciences), 2025, Vol. 68, No. 1, pp. 171-188

Authors: Jing LIU, Wei ZHU, Di LI, Xing HU, Liang SONG. Affiliations: Academy for Engineering & Technology, Fudan University; Shanghai East-bund Research Institute on Networking Systems of AI; School of Optoelectronic Information and Computer Engineering, University of Shanghai for Science & Technology

People-centric activity recognition is one of the most critical technologies in a wide range of real-world applications, including intelligent transportation systems, healthcare services, and brain-computer interfaces. Large-scale data collection and annotation make the application of machine learning algorithms prohibitively expensive when adapting to new tasks. One way of circumventing this limitation is to train the model in a semi-supervised learning manner that utilizes a percentage of unlabeled data to reduce the labeling burden in prediction tasks. Despite their appeal, these models often assume that labeled and unlabeled data come from similar distributions, which leads to the domain shift problem caused by the presence of distribution gaps. To address these limitations, we propose herein a novel method for people-centric activity recognition, called domain generalization with semi-supervised learning (DGSSL), that effectively enhances the representation learning and domain alignment capabilities of a model. We first design a new autoregressive discriminator for adversarial training between unlabeled and labeled source domains, extracting domain-specific features to reduce the distribution gaps. Second, we introduce two reconstruction tasks to capture the task-specific features to avoid losing information related to representation learning while maintaining task-specific consistency. Finally, benefiting from the collaborative optimization of these two tasks, the model can accurately predict both the domain and category labels of the source domains for the classification task. We conduct extensive experiments on three real-world sensing datasets. The experimental results show that DGSSL surpasses three state-of-the-art methods in both performance and generalization.

Keywords: activity recognition; deep learning; domain generalization; semi-supervised learning; adversarial training


CrowdCL: Unsupervised Crowd Counting Network via Contrastive Learning

IEEE Internet of Things Journal, 2025, Vol. 12, No. 12, pp. 21704-21719

Authors: Hu, Yingxiang; Liu, Yanbo; Cao, Guo; Wang, Jin. Affiliations: Nanjing University of Science and Technology, School of Computer Science and Engineering, Nanjing 210096, China; Nantong University, School of Information Science and Technology, Nantong 226000, China

With the continuous growth of the population, crowd counting plays a crucial role in intelligent monitoring systems for the Internet of Things (IoT) and smart city development. Accurate monitoring of crowd density not only helps maintain public safety but also effectively promotes the development of smart cities. Currently, supervised crowd counting techniques have made significant progress in improving accuracy, but these methods rely on expensive manual annotations and have limited generalization performance. To address these challenges, this article proposes an unsupervised crowd counting network based on contrastive learning, named CrowdCL. CrowdCL primarily leverages image-image contrastive learning and text-image contrastive learning to achieve unsupervised crowd counting. Specifically, in image-image contrastive learning, we strengthen the network’s ability to distinguish crowd features by designing progressive occlusion strategies and patch matching strategies, effectively differentiating crowd information from background information. In text-image contrastive learning, we construct ordered textual prompts to match ordered feature maps and use modality matching loss (Lm) to guide the image encoder. Additionally, to reduce the loss of fine details and alleviate the interference of complex backgrounds, we design a coarse-grained filtering strategy during the testing phase, assigning higher weights to crowd patches with greater potential. Experiments on multiple public datasets show that CrowdCL not only achieves outstanding performance but also outperforms some fully supervised methods in cross-dataset testing. © 2014 IEEE.

Keywords: Contrastive Learning


Enhanced Multi-Scale Object Detection Algorithm for Foggy Traffic Scenarios

Computers, Materials & Continua, 2025, Vol. 82, No. 2, pp. 2451-2474

Authors: Honglin Wang, Zitong Shi, Cheng Zhu. Affiliations: School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Electrical & Computer Engineering, University of Illinois at Urbana Champaign, Urbana, IL 61801, USA

In foggy traffic scenarios, existing object detection algorithms face challenges such as low detection accuracy, poor robustness, occlusion, missed detections, and false detections. To address these issues, a multi-scale object detection algorithm based on an improved YOLOv8 is proposed. Firstly, a lightweight attention mechanism, Triplet Attention, is introduced to enhance the algorithm’s ability to extract multi-dimensional and multi-scale features, thereby improving the receptive capability of the feature maps. Secondly, the Diverse Branch Block (DBB) is integrated into the CSP Bottleneck with two Convolutions (C2F) module to strengthen the fusion of semantic information across different layers. Thirdly, a new decoupled detection head is proposed by redesigning the original network head based on the Diverse Branch Block module to improve detection accuracy and reduce missed and false detections. Finally, the Minimum Point Distance based Intersection-over-Union (MPDIoU) is used to replace the original YOLOv8 Complete Intersection-over-Union (CIoU) to accelerate the network’s training convergence. Comparative experiments and dehazing pre-processing tests were conducted on the RTTS and VOC-Fog datasets. Compared to the baseline YOLOv8 model, the improved algorithm achieved mean Average Precision (mAP) improvements of 4.6% and 3.8% on the two datasets, respectively. After defogging pre-processing, the mAP increased by 5.3% and 4.4%, respectively. The experimental results demonstrate that the improved algorithm exhibits high practicality and effectiveness in foggy traffic scenarios.
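
As a rough illustration of the MPDIoU term named in the abstract above, the sketch below follows the commonly cited definition: the IoU of two boxes minus the squared distances between their top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. The function names `mpdiou` and `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the sample values are assumptions made for illustration; this is not the paper's implementation.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper: follows the commonly cited MPDIoU definition,
    not code taken from the paper summarized above.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners of the two boxes.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Regression loss built from MPDIoU: zero for a perfect match."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)
```

For identical boxes the loss is zero. Unlike plain IoU, the value keeps changing with corner distance even when the boxes do not overlap, which is the property usually credited with giving a more informative training signal.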

Keywords: Deep learning; object detection; foggy scenes; traffic detection; YOLOv8


A hybrid model for Arabic character recognition using CNN and Kolmogorov Arnold Networks (KANs)

Multimedia Tools and Applications, 2025, pp. 1-24

Authors: Alsayed, Alhag; Li, Chunlin; Abdalsalam, Mohammed; Fat’hAlalim, Ahmed. Affiliations: School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Hubei, Wuhan, China; School of Computer Science and Information Technology, University of Gadarif, Gadarif, Sudan; Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland; School of Computer Science and Information Technology, Holy Quran University, Omdurman, Sudan

Optical Character Recognition (OCR) is a significant technological advancement that turns scanned documents and pictures with text into machine-readable formats. While OCR has reached high accuracy rates for Latin-based languages, such as English, there are still significant challenges for right-to-left languages, including Arabic. Improving the capabilities of Arabic Handwriting Recognition (AHR) is critical to enhancing access to digital content in the Arabic language, given that there are well over 400 million speakers. Arabic OCR faces unique challenges arising from the nature of the language: letter forms change according to their position within a word, letters appear with or without dots, and intricate characters are connected to one another. Moreover, diacritical marks and curved shapes present additional challenges. This paper presents a hybrid model for Arabic character recognition. It is based on Convolutional Neural Networks (CNNs) and various machine learning classifiers such as Support Vector Machines (SVM), k-Nearest Neighbors (KNN), Random Forest (RF), Long Short-Term Memory (LSTM), and Kolmogorov Arnold Networks (KANs). The proposed CNN+KANs model exploits the ability of CNNs to extract features and the ability of KANs to approximate complex functions to achieve high recognition accuracy. The model is tested on two datasets that include diverse handwriting styles: the Arabic Handwritten Characters Dataset (AHCD) and a dataset of Arabic letters written by children, called Hijja. Experimental results demonstrate that the CNN+KANs model achieves state-of-the-art performance, with testing accuracies of 97.71% on AHCD and 91.32% on Hijja, outperforming traditional models like CNN+SVM and CNN+KNN. The study shows that the CNN+KANs hybrid model is robust and adapts well, which makes it a good choice for recognizing Arabic handwritten characters. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

Keywords: Optical character recognition


Research on facial expression recognition based on multimodal data fusion and neural network

International Journal of Wireless and Mobile Computing, 2024, Vol. 27, No. 1, pp. 47-55

Authors: Han, Yi; Wang, Xubin; Lu, Zhengyu. Affiliations: School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Hubei, Wuhan, China; Anyang Institute of Technology, Department of Computer Science and Information Engineering, Henan Province, Anyang, China; School of Computer Science and Information Engineering, Shanghai Institute of Technology, FengXian District, Shanghai, China

Facial expression recognition is a challenging task when neural networks are applied to pattern recognition. Most current recognition research is based on single-source facial data, which generally suffers from low accuracy and low robustness. In this paper, a neural network algorithm for facial expression recognition based on multimodal data fusion is proposed. The algorithm takes the facial image, the histogram of oriented gradients of the image, and the facial landmarks as input, and builds three sub-networks to extract data features: a Convolutional Neural Network (CNN) that extracts features from the facial image, a Landmarks Neural Network (LNN) that extracts features from the facial landmarks, and a Histogram-of-gradient Neural Network (HNN) that extracts features from the histogram of oriented gradients. A multimodal feature fusion mechanism then combines these features to improve the accuracy of facial expression recognition. Experimental results show that the algorithm greatly improves accuracy, robustness, and detection speed. Copyright © 2024 Inderscience Enterprises Ltd.

Keywords: Data fusion


SA-Model: Multi-Feature Fusion Poetic Sentiment Analysis Based on a Hybrid Word Vector Model

Computer Modeling in Engineering & Sciences, 2023, Vol. 137, No. 10, pp. 631-645

Authors: Lingli Zhang, Yadong Wu, Qikai Chu, Pan Li, Guijuan Wang, Weihan Zhang, Yu Qiu, Yi Li. Affiliations: School of Computer Science and Engineering, Sichuan University of Science and Engineering, Zigong 643000, China; School of Automation and Information Engineering, Sichuan University of Science and Engineering, Zigong 643000, China; School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621000, China; School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China

Sentiment analysis in Chinese classical poetry has become a prominent topic in historical and cultural tracing, ancient literature research, ***, the existing research on sentiment analysis is relatively *** does not effectively solve problems such as the weak feature extraction ability of poetry text, which leads to the low performance of the model on sentiment analysis for Chinese classical *** this research, we offer the SA-Model, a poetic sentiment analysis ***-Model firstly extracts text vector information and fuses it through Bidirectional encoder representation from transformers-Whole word masking-extension (BERT-wwm-ext) and Enhanced representation through knowledge integration (ERNIE) to enrich text vector information; secondly, it incorporates numerous encoders to extract text features at multiple levels, thereby increasing text feature information, improving text semantics accuracy, and enhancing the model’s learning and generalization capabilities; finally, a multi-feature fusion poetry sentiment analysis model is *** feasibility and accuracy of the model are validated through the ancient poetry sentiment *** with other baseline models, the experimental findings indicate that SA-Model may increase the accuracy of text semantics and hence improve the capability of poetry sentiment analysis.

Keywords: Sentiment analysis; Chinese classical poetry; natural language processing; BERT-wwm-ext; ERNIE; multi-feature fusion


Finite-Time Sliding Mode Control of Vehicle Formation with Specified Performance

Engineering Letters, 2025, Vol. 33, No. 3, pp. 620-629

Authors: Guo, Dongyang; Zhang, Zhao; Zhou, Hongyan; Chen, Xue-Bo. Affiliations: School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114051, China; School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

Urban intelligent transportation is developing rapidly, and coherence control of vehicle formations has important implications for research in this field. This article addresses urban intelligent transportation planning and proposes a new control method for vehicle formations that guarantees a specified tracking performance under bounded disturbances and model uncertainty. The method combines the advantages of sliding mode control and finite-time control and uses the universal approximation property of neural networks. It ensures that the vehicle formation achieves the specified tracking performance and is stable in finite time rather than only asymptotically stable. First, the third-order dynamics model and spacing strategy of the vehicle are given, and the tracking performance of the vehicle formation is specified. Then, the control objective is transformed by an error transformation to obtain a new system, and an improved sliding mode surface is designed for the transformed system. Furthermore, the universal approximation property of the neural network is used to overcome the parameter uncertainty in the system, and the reconstruction error of the neural network is handled by a robust term. In addition, the chattering phenomenon of sliding mode control is overcome. Finally, the finite-time stability of the system is analysed by constructing a Lyapunov function. Simulations comparing the proposed method with conventional control methods show that it is significantly faster, which validates its effectiveness. © 2025, International Association of Engineers. All rights reserved.

Keywords: Lyapunov functions


Clustering and Artificial Intelligence-based Prediction of Ecologically Sustainable Species Introductions

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 4, pp. 1159-1168

Authors: Liu, Shuqiao; Zhang, Zhao; Zhou, Hongyan; Chen, Xue-Bo. Affiliations: School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114051, China; School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China

There is a growing interest in sustainable ecosystem development, which includes methods such as scientific modeling, environmental assessment, and development forecasting and planning. However, due to insufficient survey data in many current development areas, development progress is delayed and stagnant. To address this situation, this paper proposes a SWOT-TOPSIS-K-Means (STK) data analysis and evaluation model to analyze ecological factors, which can realize a comprehensive and complete data analysis with fewer samples. Decision tree (DT), random forest (RF), and multilayer perceptron (MLP) neural network models were constructed from the results of this analysis, and statistical tests such as r-squared, mean absolute error, and cross-validation are used to further confirm the performance efficiency of the computational prediction models to provide real-time prediction research solutions. For this purpose, data from research scholars on species introduction in ecosystem development were selected for testing. The results show that the proposed assessment model and modeling results satisfy all accuracy-related acceptance requirements. Among them, MLP is better than DT and RF. In summary, the STK assessment model and the MLP prediction model can provide a basis for the selection and development of ecological factors. © (2025), (International Association of Engineers). All rights reserved.
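
The abstract above names R-squared and mean absolute error as checks on the prediction models. As a minimal sketch of how those two metrics are usually computed, the following uses plain Python lists; the function names and the sample values are assumptions for illustration and do not come from the paper or its data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not taken from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

A lower mean absolute error and an R-squared closer to 1 both indicate a better fit, which is the sense in which models such as DT, RF, and MLP can be ranked against one another.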

Keywords: Prediction models


Event-Driven Attention Network: A Cross-Modal Framework for Efficient Image-Text Retrieval in Mass Gathering Events

Computers, Materials & Continua, 2025, Vol. 83, No. 5, pp. 3277-3301

Authors: Kamil Yasen, Heyan Jin, Sijie Yang, Li Zhan, Xuyang Zhang, Ke Qin, Ye Li. Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Kashi Institute of Electronics and Information Industry, Kashi 844508, China

Research on mass gathering events is critical for ensuring public security and maintaining social ***, most of the existing works focus on crowd behavior analysis areas such as anomaly detection and crowd counting, and there is a relative lack of research on mass gathering *** believe real-time detection and monitoring of mass gathering behaviors are essential for mitigating potential security risks and ***, it is imperative to develop a method capable of accurately identifying and localizing mass gatherings before disasters occur, enabling prompt and effective *** address this problem, we propose an innovative Event-Driven Attention Network (EDAN), which achieves image-text matching in the scenario of mass gathering events with good results for the first *** image-text retrieval methods based on global alignment have difficulty capturing the local details within complex scenes, limiting retrieval *** local alignment-based methods are more effective at extracting detailed features, they frequently process raw textual features directly, which often contain ambiguities and redundant information that can diminish retrieval efficiency and degrade model *** overcome these challenges, EDAN introduces an Event-Driven Attention Module that adaptively focuses attention on image regions or textual words relevant to the event *** calculating the semantic distance between event labels and textual content, this module significantly reduces computational complexity and enhances retrieval *** validate the effectiveness of EDAN, we construct a dedicated multimodal dataset tailored for the analysis of mass gathering events, providing a reliable foundation for subsequent *** conduct comparative experiments with other methods on our dataset, the experimental results demonstrate the effectiveness of *** the image-to-text retrieval task, EDAN achieved the best performance on the R@5 metric, w

Keywords: Mass gathering events; image-text retrieval; attention mechanism
