Search Results - Inner Mongolia University Library

A Lightweight Network with Dual Encoder and Cross Feature Fusion for Cement Pavement Crack Detection

Computer Modeling in Engineering & Sciences, 2024, Vol. 140, No. 7, pp. 255-273

Authors: Zhong Qu, Guoqing Mu, Bin Yuan. School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Automatic crack detection of cement pavement chiefly benefits from the rapid development of deep learning, with convolutional neural networks (CNN) playing an important role in this ***, as the performance of crack detection in cement pavement improves, the depth and width of the network structure are significantly increased, which necessitates more computing power and storage *** limitation hampers the practical implementation of crack detection models on various platforms, particularly portable devices like small mobile *** solve these problems, we propose a dual-encoder-based network architecture that focuses on extracting more comprehensive fracture feature information and combines cross-fusion modules and coordinated attention mechanisms for more efficient feature ***, we use small channel convolution to construct a shallow feature extraction module (SFEM) to extract low-level feature information of cracks in cement pavement images, in order to obtain more information about cracks in the shallow features of *** addition, we construct large kernel atrous convolution (LKAC) to enhance crack information, which incorporates a coordination attention mechanism for non-crack information filtering, and large kernel atrous convolution with different cores, using different receptive fields to extract more detailed edge and context ***, the three-stage feature map outputs from the shallow feature extraction module are cross-fused with the two-stage feature map outputs from the large kernel atrous convolution module, and the shallow feature and detailed edge feature are fully fused to obtain the final crack prediction *** evaluate our method on three public crack datasets: DeepCrack, CFD, and *** results on the DeepCrack dataset demonstrate the effectiveness of our proposed method compared to state-of-the-art crack detection methods, which achieves Precision (P) 87.2%, Recall (R) 87.7%, and F-score (F1) 87.4%. Thanks to our lightweight cr

Keywords: shallow feature extraction module; large kernel atrous convolution; dual encoder; lightweight network; crack detection
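
The F-score reported in the abstract above can be checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch in Python; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the DeepCrack dataset.
p, r = 0.872, 0.877
print(f"F1 = {f1_score(p, r):.3f}")  # prints "F1 = 0.874"
```

The result agrees with the 87.4% stated in the abstract.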

Automatic summarization of cooking videos using transfer learning and transformer-based models

Discover Artificial Intelligence, 2025, Vol. 5, No. 1, pp. 1-20

Authors: Sadique, P. M. Alen; Aswiga, R.V. School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu, Chennai 600127, India

The proliferation of cooking videos on the internet necessitates converting lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers find it challenging to extract comprehensive recipes from lengthy visual content. Effective summarization is necessary to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This will make the cooking process easier for individuals who are searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As there is a growing need for easy-to-follow recipes made from cooking videos, researchers are looking into automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted to achieve this objective. Initially, the focus is on frame summary generation, which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model called Inception-V3 is fine-tuned with a food image dataset for dish recognition, and another custom-made CNN is built with ingredient images for ingredient recognition. Then a GPT-based model is used to combine the results produced by the two CNN models, which gives the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to generate a summary of the resulting textual representation of the audio in the desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used

Keywords: Convolutional neural networks

Brain tumor segmentation and classification using transfer learning based CNN model with model agnostic concept interpretation

Multimedia Tools and Applications, 2025, Vol. 84, No. 5, pp. 2509-2538

Authors: Nancy, A. Maria; Maheswari, R. School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu, Chennai 632014, India

In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to an individual's health and ultimately results in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained more attention in the research community. BTSC is the process of finding brain tumor tissues and classifying them by tumor type. Manual tumor segmentation is error-prone and time-consuming. A precise and fast BTSC model is developed in this manuscript based on a transfer learning-based Convolutional Neural Network (CNN) model. A CNN variant is used because of its superiority in distinct tasks. In the initial phase, Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods, which effectively reduce overfitting in the classification model. The augmented images are segmented using the layers of the Visual Geometry Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images following the segmentation block in the VGG-19 model. The crucial features are then selected using the attribute category reciprocal attention phase. These features are input to the Model Agnostic Concept Extractor (MACE) to generate relevance scores between the features to assist the final classification process. The relevance scores obtained from the MACE are provided to the max-pooling layer of the VGG-19 model. Then, the final classified output is obtained from the modified VGG-19 architecture. The implemented relevance score with the AWA-based VGG-19 model is used to classify the tumor as whole tumor, enhanced tumor, or tumor core. In the classification section, the proposed

Keywords: Magnetic resonance imaging

Identifying malicious traffic under concept drift based on intraclass consistency enhanced variational autoencoder

Science China (Information Sciences), 2024, Vol. 67, No. 8, pp. 238-252

Authors: Xiang LUO, Chang LIU, Gaopeng GOU, Gang XIONG, Zhen LI, Binxing FANG. Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences; School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)

Accurate identification of malicious traffic is crucial for implementing effective defense countermeasures and has led to extensive research efforts. However, the continuously evolving techniques employed by adversaries have introduced the issue of concept drift, which significantly affects the performance of existing methods. To tackle this challenge, some researchers have focused on improving the separability of malicious traffic representation and designing drift detectors to reduce the number of false ***, these methods often overlook the importance of enhancing the generalization and intraclass consistency in the representation. Additionally, the detectors are not sufficiently sensitive to the variations among different malicious traffic classes, which results in poor performance and limited robustness. In this paper, we propose an intraclass consistency enhanced variational autoencoder with Class-Perception detector (ICE-CP) to identify malicious traffic under concept drift. It comprises two key modules during training: intraclass consistency enhanced (ICE) representation learning and Class-Perception (CP) detector construction. In the first module, we employ a variational autoencoder (VAE) in conjunction with Kullback-Leibler (KL) divergence and cross-entropy loss to model the distribution of each input malicious traffic flow. This approach simultaneously enhances the generalization, intraclass consistency, and interclass differences in the learned representation. Consequently, we obtain a compact representation and a trained classifier for non-drifting malicious traffic. In the second module, we design the CP detector, which generates a centroid and threshold for each malicious traffic class separately based on the learned representation, depicting the boundaries between drifting and non-drifting malicious traffic. During testing, we utilize the trained classifier to predict malicious traffic classes for the testing samples. Then, we use the CP det

Keywords: concept drift; malicious traffic identification; variational autoencoder; intrusion detection; cyberspace security
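
The per-class centroid and threshold described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the threshold rule (largest training distance to the centroid, times a hypothetical `scale` factor), the function names, and the class labels `botnet` and `ddos` are all assumptions made for illustration, and plain 2-D tuples stand in for the learned representation.

```python
import math
from collections import defaultdict


def fit_class_detectors(embeddings, labels, scale=1.0):
    """Return {label: (centroid, threshold)} from training embeddings.

    Assumption: the threshold is the largest training distance to the
    centroid times `scale`; the paper may define it differently.
    """
    by_class = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        by_class[label].append(vec)

    detectors = {}
    for label, vecs in by_class.items():
        dim = len(vecs[0])
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        threshold = scale * max(math.dist(v, centroid) for v in vecs)
        detectors[label] = (centroid, threshold)
    return detectors


def is_drifting(embedding, predicted_label, detectors):
    """Flag a sample whose embedding lies outside its predicted class boundary."""
    centroid, threshold = detectors[predicted_label]
    return math.dist(embedding, centroid) > threshold


# Toy embeddings with hypothetical class names.
train_x = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
train_y = ["botnet", "botnet", "ddos", "ddos"]
detectors = fit_class_detectors(train_x, train_y)

print(is_drifting((0.0, 1.5), "botnet", detectors))  # inside the class boundary
print(is_drifting((5.0, 5.0), "botnet", detectors))  # far from the class centroid
```

Keeping one threshold per class, rather than a single global one, lets tight classes and spread-out classes each have a boundary that fits their own training samples.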

Residual diverse ensemble for long-tailed multi-label text classification

Science China (Information Sciences), 2024, Vol. 67, No. 11, pp. 92-105

Authors: Jiangxin SHI, Tong WEI, Yufeng LI. National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University; School of Computer Science and Engineering, Southeast University; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education

Long-tailed multi-label text classification aims to identify a subset of relevant labels from a large candidate label set, where the training datasets usually follow long-tailed label distributions. Many previous studies have treated head and tail labels equally, resulting in unsatisfactory performance for identifying tail labels. To address this issue, this paper proposes a novel learning method that combines arbitrary models with two steps. The first step is the "diverse ensemble" that encourages diverse predictions among multiple shallow classifiers, particularly on tail labels, and can improve the generalization of tail *** second is the "error correction" that takes advantage of accurate predictions on head labels by the base model and approximates its residual errors for tail labels. Thus, it enables the "diverse ensemble" to focus on optimizing the tail label performance. This overall procedure is called residual diverse ensemble (RDE). RDE is implemented via a single-hidden-layer perceptron and can be used for scaling up to hundreds of thousands of labels. We empirically show that RDE consistently improves many existing models with considerable performance gains on benchmark datasets, especially with respect to the propensity-scored evaluation ***, RDE converges in less than 30 training epochs without increasing the computational overhead.

Keywords: multi-label learning; extreme multi-label learning; long-tailed distribution; multi-label text classification; ensemble learning

Build Yourself before Collaboration: Vertical Federated Learning with Limited Aligned Samples

IEEE Transactions on Mobile Computing, 2025, Vol. 24, No. 7, pp. 6503-6516

Authors: Shen, Wei; Ye, Mang; Yu, Wei; Yuen, Pong C. Wuhan University, National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan, China; Hong Kong Baptist University, Department of Computer Science, Hong Kong, Hong Kong

Vertical Federated Learning (VFL) has emerged as a crucial privacy-preserving learning paradigm that involves training models using distributed features from shared samples. However, the performance of VFL can be hindered when the number of shared or aligned samples is limited, a common issue in mobile environments where user data are diverse and unaligned across multiple devices. Existing approaches use feature generation and pseudo-label estimation for unaligned samples to address this issue, unavoidably introducing noise during the generation process. In this work, we propose Local Enhanced Effective Vertical Federated Learning (LEEF-VFL), which fully utilizes unaligned samples in the local learning before collaboration. Unlike previous methods that overlook private labels owned by each client, we leverage these private labels to learn from all local samples, constructing robust local models to serve as solid foundations for collaborative learning. Additionally, we reveal that the limited number of aligned samples introduces distribution bias from global data distribution. In this case, we propose to minimize the distribution discrepancies between the aligned samples and the global data distribution to enhance collaboration. Extensive experiments demonstrate the effectiveness of LEEF-VFL in addressing the challenges of limited aligned samples, making it suitable for VFL in mobile computing environments. Codes are available at https://***/shentt67/LEEF-VFL. © 2025 IEEE.

Keywords: Federated learning

Research on Network Traffic Classification Based on Graph Neural Network

IAENG International Journal of Computer Science, 2024, Vol. 51, No. 12, pp. 2043-2050

Authors: Han, Yue; Dai, Hong. University of Science and Technology Liaoning, Liaoning, Anshan 114051, China; School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Liaoning, Anshan, China

Network traffic classification is a critical concern in network security and management, essential for accurately differentiating among various network applications, optimizing service quality, and improving user experience. The exponential increase in worldwide Internet users and network traffic is continuously augmenting the diversity and complexity of network applications, rendering the Internet environment increasingly intricate and dynamic. Conventional machine learning techniques possess restricted processing abilities for network traffic attributes and struggle to address the progressively intricate traffic classification tasks in contemporary networks. In recent years, the swift advancement of deep learning technologies, particularly Graph Neural Networks (GNN), has yielded significant improvements in network traffic classification. GNN can capture the structured information among network nodes and extract the latent features of network traffic. Nonetheless, current network traffic classification models continue to exhibit deficiencies in the thoroughness of feature extraction. To tackle the problem, this research proposes a method for constructing traffic graphs utilizing numerical similarity and byte distance proximity by exploring the latent correlations among bytes, and it constructs a model, SDA-GNN, based on Graph Isomorphic Networks (GIN) for the categorization of network traffic. In particular, the Dynamic Time Warping (DTW) distance is employed to evaluate the disparity in byte distributions, a channel attention mechanism is utilized to extract additional features, and a Long Short-Term Memory Network (LSTM) enhances the stability of the training process by extracting sequence characteristics. Experimental findings on two actual datasets indicate that the SDA-GNN model surpasses other baseline techniques across multiple assessment parameters in the network traffic classification task, achieving classification accuracy enhancements of 2.19% and 1.49%

Keywords: Long short-term memory
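
The abstract above uses the Dynamic Time Warping (DTW) distance to compare byte distributions. As a hedged illustration, the sketch below implements the textbook DTW recurrence on two numeric sequences with absolute difference as the local cost. How the paper builds its sequences from traffic bytes is not stated in the abstract, so the inputs here are toy values and the function name `dtw_distance` is an assumption.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again (b stays)
                cost[i][j - 1],      # b[j-1] is matched again (a stays)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# A repeated element is absorbed by the warping, so the distance is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # prints 0.0
# A constant offset of 1 over three aligned elements accumulates to 3.
print(dtw_distance([0, 0, 0], [1, 1, 1]))     # prints 3.0
```

Unlike a position-by-position comparison, DTW tolerates sequences of different lengths and local stretching, which is why it suits comparing distributions that are shifted relative to each other.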

Imbalanced multilabel retinal disease classification using threshold moving and ensemble learning

Multimedia Tools and Applications, 2025, pp. 1-19

Authors: Pendharkar, Gaurav; Balaji, Sudharshanan; Kumar, B. Muhesh; Malathi, G. School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu, Chennai 600127, India

Billions of people worldwide are affected by vision impairment, caused mainly by age-related degradation and refractive errors. Diabetic Retinopathy (DR) and Macular Hole (MH) are among the most prevalent senescent retinal diseases. Machine intelligence can assist ophthalmologists and clinicians in fast and accurate disease diagnosis by identifying patterns in disease progression for a better healthcare system. In this paper, the Retinal Fundus Multi-disease Image Dataset (RFMiD) is used to design a machine intelligence system with two chief components, namely a disease risk classifier and a multi-label classifier. The disease risk classifier predicts whether the retinal fundus image is infected or not. Based on the prediction of the disease risk classifier, a multi-label classifier can be applied to obtain probabilities for the susceptibility to DR and MH. Finally, an ensemble is employed with the best of 3 models for each classifier. The disease risk predictor attained a peak F1-score of 88%, while the multi-label classifier achieved an Area Under the Curve (AUC) score of 86%. However, the individual binary classifiers for DR and MH reached maximum F-scores of 91% and 93%, respectively. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

Keywords: Risk perception
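
The title of this record names threshold moving, a common remedy for imbalanced classification in which the decision threshold is tuned instead of being fixed at 0.5. The sketch below picks the threshold that maximizes F1 on a validation set. The selection criterion, the candidate grid, and the toy scores are assumptions made for illustration; the abstract does not say how the authors choose their thresholds.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(scores, labels, candidates=None):
    """Return the candidate threshold with the highest F1 (ties keep the lowest)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


# Toy validation scores for one disease label (1 = disease present).
val_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.80]
val_labels = [0, 0, 1, 0, 1, 1]

t = best_threshold(val_scores, val_labels)
print("F1 at 0.5:", f1_at_threshold(val_scores, val_labels, 0.5))
print("F1 at tuned threshold:", round(f1_at_threshold(val_scores, val_labels, t), 3))
```

In a multi-label setting the same search would be run once per label, since each disease can have a different class balance.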

DDSS: Driver decision support system based on the driver behaviour prediction to avoid accidents in intelligent transport system

International Journal of Cognitive Computing in Engineering, 2024, Vol. 5, No. 1, pp. 1-13

Authors: S, Balasubramani; D, John Aravindhar; Renjith, P.N.; Ramesh, K. Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, Vaddeswaram, India; Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Chennai, India; School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India; Department of Computer Science and Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India

Accidents caused by drivers who exhibit unusual behavior are putting road safety at ever-greater risk. When one or more vehicle nodes behave in this way, it can put other nodes in danger and result in potentially catastrophic accidents. In order to anticipate and handle unusual driving behavior in Intelligent Transportation Systems (ITS), this research presents a unique Driver Decision Support System (DDSS). The proposed DDSS uses a reliable driving behavior prediction system to categorize drivers as displaying normal or abnormal behavior. In order to prevent accidents in ITS scenarios, the system reliably detects anomalous driving patterns and advises nearby vehicles to change lanes or alter speed. The driver behavior prediction algorithm efficiently groups drivers into behavior categories using the K-Means clustering method. To evaluate the algorithm's efficacy, its outcomes are compared against those of Support Vector Machines (SVMs), Decision Trees, K-Nearest Neighbours (KNN), Logistic Regression, and Naïve Bayes. Integrating the Driver Decision Support System into the Intelligent Transportation System infrastructure strengthens accident prevention efforts. Monitoring and analysis of driver behavior enable timely interventions, promoting safer driving practices and reducing accident risks. This research helps to create a more effective transportation system by reducing the number of accidents brought on by reckless driving. Because of its novel approach to anticipating and controlling driver behavior, the proposed DDSS shows promise for improving road safety and preventing accidents. The efficacy and dependability of the driver behavior prediction algorithm are confirmed by the experimental assessment. © 2023

Keywords: K-means clustering
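
The abstract above groups drivers into behavior categories with K-Means clustering. The sketch below shows the standard Lloyd iteration on toy data. The two features (mean speed in km/h and harsh-braking events per trip) and the deterministic initialization from the first `k` points are assumptions made for illustration; the abstract does not list the features or initialization the authors use.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment


# Hypothetical per-driver features: (mean speed, harsh-braking events per trip).
drivers = [(60, 1), (120, 9), (62, 0), (118, 8), (58, 2), (125, 10)]
centroids, assignment = kmeans(drivers, k=2)
print(centroids)   # one centroid per behavior group
print(assignment)  # cluster index for each driver
```

K-Means only produces groups; deciding which cluster counts as "abnormal" is a separate step, for example by inspecting which centroid has the higher speed and braking values.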

QARF: A Novel Malicious Traffic Detection Approach via Online Active Learning for Evolving Traffic Streams

Chinese Journal of Electronics, 2024, Vol. 33, No. 3, pp. 645-656

Authors: Zequn NIU, Jingfeng XUE, Yong WANG, Tianwei LEI, Weijie HAN, Xianwei GAO. School of Computer Science and Technology, Beijing Institute of Technology; School of Space Information, Space Engineering University

In practical abnormal traffic detection scenarios, traffic often arrives as drifting, imbalanced, and rarely labeled streams, and how to effectively identify malicious traffic in such complex situations has become a challenge for malicious traffic *** have extensive studies on malicious traffic detection with a single challenge, but the detection of complex traffic has not been widely *** adaptive random forests (QARF) is proposed to detect traffic streams with concept drift, imbalance and lack of labeled *** is an online active learning based approach which combines the adaptive random forests method and adaptive margin sampling *** achieves querying a small number of instances from unlabeled traffic streams to obtain effective *** conduct experiments using the NSL-KDD dataset to evaluate the performance of *** is compared with other state-of-the-art *** experimental results show that QARF obtains 98.20% accuracy on the NSL-KDD *** performs better than other state-of-the-art methods in comparisons.

Keywords: Training; Streams; Random forests
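
The abstract above relies on margin sampling to decide which unlabeled instances are worth a label query. A minimal sketch of the basic idea follows: the margin is the gap between the two highest predicted class probabilities, and a small margin signals an uncertain prediction. The fixed `threshold` of 0.2 is a simplifying assumption; the paper describes an adaptive strategy whose details are not given in the abstract, and the function names are illustrative.

```python
def margin(probabilities):
    """Difference between the two largest class probabilities."""
    top1, top2 = sorted(probabilities, reverse=True)[:2]
    return top1 - top2


def should_query(probabilities, threshold=0.2):
    """Request a label when the classifier is uncertain (small margin)."""
    return margin(probabilities) < threshold


# Predicted class probabilities for two incoming flows (toy values).
uncertain_flow = [0.55, 0.40, 0.05]
confident_flow = [0.90, 0.07, 0.03]

print(should_query(uncertain_flow))  # prints True: the top two classes are close
print(should_query(confident_flow))  # prints False: one class clearly dominates
```

Querying only low-margin instances keeps the labeling budget small, which matches the stated goal of asking for few labels from the unlabeled stream.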
