Automatic crack detection of cement pavement chiefly benefits from the rapid development of deep learning, with convolutional neural networks (CNN) playing an important role in this ***, as the performance of crack detection in cement pavement improves, the depth and width of the network structure are significantly increased, which necessitates more computing power and storage *** limitation hampers the practical implementation of crack detection models on various platforms, particularly portable devices like small mobile *** solve these problems, we propose a dual-encoder-based network architecture that focuses on extracting more comprehensive fracture feature information and combines cross-fusion modules and coordinated attention mechanisms for more efficient feature ***, we use small channel convolution to construct a shallow feature extraction module (SFEM) to extract low-level feature information of cracks in cement pavement images, in order to obtain more information about cracks in the shallow features of *** addition, we construct large kernel atrous convolution (LKAC) to enhance crack information, which incorporates a coordination attention mechanism for non-crack information filtering, and large kernel atrous convolution with different cores, using different receptive fields to extract more detailed edge and context ***, the three-stage feature map outputs from the shallow feature extraction module are cross-fused with the two-stage feature map outputs from the large kernel atrous convolution module, and the shallow feature and detailed edge feature are fully fused to obtain the final crack prediction *** evaluate our method on three public crack datasets: DeepCrack, CFD, and *** results on the DeepCrack dataset demonstrate the effectiveness of our proposed method compared to state-of-the-art crack detection methods, which achieves Precision (P) 87.2%, Recall (R) 87.7%, and F-score (F1) 87.4%. Thanks to our lightweight cr
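The reported F-score can be checked against the reported precision and recall, assuming F1 here is the usual harmonic mean of the two. The following is a minimal sketch; the function name `f_score` is illustrative and does not come from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported on the DeepCrack dataset, written as fractions.
f1 = f_score(0.872, 0.877)
print(f"F1 = {f1:.1%}")  # prints "F1 = 87.4%"
```

Under that assumption, the three reported figures are mutually consistent.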
The proliferation of cooking videos on the internet necessitates converting this lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers find it challenging to extract comprehensive recipes from lengthy visual content. Effective summarization is necessary to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This makes the cooking process easier for individuals who are searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As there is a growing need for easy-to-follow recipes derived from cooking videos, researchers are looking into automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted to achieve this objective. First, frame summary generation employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model called Inception-V3 is fine-tuned on a food image dataset for dish recognition, and a custom-made CNN is built with ingredient images for ingredient recognition. A GPT-based model then combines the results produced by the two CNN models to give the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to generate a summary of the resulting textual representation of the audio in our desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to the health of the individual and can ultimately result in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained more attention in the research community. BTSC is the process of finding brain tumor tissues and classifying them by tumor type. Manual tumor segmentation is error-prone and time-consuming. A precise and fast BTSC model is developed in this manuscript based on a transfer-learning-based Convolutional Neural Network (CNN) model. A CNN variant is used because of its superiority in distinct tasks. In the initial phase, the Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods, which effectively reduce overfitting in the classification model. The augmented images are segmented using the layers of the Visual-Geometry-Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images, following the segmentation block in the VGG-19 model. The crucial features are then selected using the attribute category reciprocal attention phase. These features are fed to the Model Agnostic Concept Extractor (MACE) to generate the relevance score between the features, which assists the final classification process. The relevance scores obtained from the MACE are provided to the max-pooling layer of the VGG-19 model. The final classified output is then obtained from the modified VGG-19 architecture. The implemented relevance score with the AWA-based VGG-19 model is used to classify the tumor as whole tumor, enhanced tumor, or tumor core. In the classification section, the proposed
Accurate identification of malicious traffic is crucial for implementing effective defense countermeasures and has led to extensive research efforts. However, the continuously evolving techniques employed by adversaries have introduced the issue of concept drift, which significantly affects the performance of existing methods. To tackle this challenge, some researchers have focused on improving the separability of malicious traffic representation and designing drift detectors to reduce the number of false ***, these methods often overlook the importance of enhancing the generalization and intraclass consistency in the representation. Additionally, the detectors are not sufficiently sensitive to the variations among different malicious traffic classes, which results in poor performance and limited robustness. In this paper, we propose an intraclass consistency enhanced variational autoencoder with Class-Perception detector (ICE-CP) to identify malicious traffic under concept drift. It comprises two key modules during training: intraclass consistency enhanced (ICE) representation learning and Class-Perception (CP) detector construction. In the first module, we employ a variational autoencoder (VAE) in conjunction with Kullback-Leibler (KL) divergence and cross-entropy loss to model the distribution of each input malicious traffic flow. This approach simultaneously enhances the generalization, intraclass consistency, and interclass differences in the learned representation. Consequently, we obtain a compact representation and a trained classifier for non-drifting malicious traffic. In the second module, we design the CP detector, which generates a centroid and threshold for each malicious traffic class separately based on the learned representation, depicting the boundaries between drifting and non-drifting malicious traffic. During testing, we utilize the trained classifier to predict malicious traffic classes for the testing samples. Then, we use the CP det
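The idea of a per-class centroid and threshold can be sketched as follows. This is only an illustration under stated assumptions, not the ICE-CP implementation: the abstract does not say how the threshold is derived, so the sketch assumes Euclidean distance and takes the threshold from a quantile of training distances. The names `fit_detector`, `is_drifting`, and `quantile` are hypothetical.

```python
import math
from collections import defaultdict


def fit_detector(embeddings, labels, quantile=0.95):
    """Return {class: (centroid, threshold)} from training representations."""
    by_class = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        by_class[label].append(vec)

    detector = {}
    for label, vecs in by_class.items():
        dim = len(vecs[0])
        # Centroid: coordinate-wise mean of the class's representations.
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        dists = sorted(math.dist(v, centroid) for v in vecs)
        # Assumed rule: threshold is the training distance at the given quantile.
        idx = min(len(dists) - 1, int(quantile * len(dists)))
        detector[label] = (centroid, dists[idx])
    return detector


def is_drifting(detector, vec, predicted_label):
    """Flag a sample that lies outside the boundary of its predicted class."""
    centroid, threshold = detector[predicted_label]
    return math.dist(vec, centroid) > threshold


# Toy two-class example with 2-D representations.
train = [[0, 0], [0, 2], [2, 0], [2, 2], [10, 10], [10, 12], [12, 10], [12, 12]]
classes = ["a"] * 4 + ["b"] * 4
det = fit_detector(train, classes)
print(is_drifting(det, [1, 1], "a"), is_drifting(det, [5, 5], "a"))
```

Keeping a separate threshold per class lets a tightly clustered class have a tighter boundary than a diffuse one, which is the sensitivity to class variation that the abstract emphasizes.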
Long-tailed multi-label text classification aims to identify a subset of relevant labels from a large candidate label set, where the training datasets usually follow long-tailed label distributions. Many previous studies have treated head and tail labels equally, resulting in unsatisfactory performance in identifying tail labels. To address this issue, this paper proposes a novel learning method that combines arbitrary models with two steps. The first step is the “diverse ensemble” that encourages diverse predictions among multiple shallow classifiers, particularly on tail labels, and can improve the generalization of tail *** second is the “error correction” that takes advantage of accurate predictions on head labels by the base model and approximates its residual errors for tail labels. Thus, it enables the “diverse ensemble” to focus on optimizing the tail label performance. This overall procedure is called residual diverse ensemble (RDE). RDE is implemented via a single-hidden-layer perceptron and can be used for scaling up to hundreds of thousands of labels. We empirically show that RDE consistently improves many existing models with considerable performance gains on benchmark datasets, especially with respect to the propensity-scored evaluation ***, RDE converges in fewer than 30 training epochs without increasing the computational overhead.
Vertical Federated Learning (VFL) has emerged as a crucial privacy-preserving learning paradigm that involves training models using distributed features from shared samples. However, the performance of VFL can be hind...
Network traffic classification is a critical concern in network security and management, essential for accurately differentiating among various network applications, optimizing service quality, and improving user experience. The exponential increase in worldwide Internet users and network traffic is continuously augmenting the diversity and complexity of network applications, rendering the Internet environment increasingly intricate and dynamic. Conventional machine learning techniques have limited ability to process network traffic attributes and struggle to address the increasingly intricate traffic classification tasks in contemporary networks. In recent years, the swift advancement of deep learning technologies, particularly Graph Neural Networks (GNN), has yielded significant improvements in network traffic classification. GNNs can capture the structured information among network nodes and extract the latent features of network traffic. Nonetheless, current network traffic classification models continue to exhibit deficiencies in the thoroughness of feature extraction. To tackle this problem, this research proposes a method for constructing traffic graphs utilizing numerical similarity and byte distance proximity by exploring the latent correlations among bytes, and it constructs a model, SDA-GNN, based on the Graph Isomorphism Network (GIN) for the categorization of network traffic. In particular, the Dynamic Time Warping (DTW) distance is employed to evaluate the disparity in byte distributions, a channel attention mechanism is utilized to extract additional features, and a Long Short-Term Memory (LSTM) network enhances the stability of the training process by extracting sequence characteristics. Experimental findings on two real-world datasets indicate that the SDA-GNN model surpasses other baseline techniques across multiple evaluation metrics in the network traffic classification task, achieving classification accuracy enhancements of 2.19% and 1.49%
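For readers unfamiliar with DTW, the distance can be computed with the classic dynamic-programming recurrence sketched below. This is the textbook formulation with an absolute-difference local cost, which is an assumption: the abstract does not state which cost or windowing SDA-GNN uses. The function name `dtw_distance` is illustrative.

```python
def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b element
                cost[i][j - 1],      # b[j-1] also matches the previous a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Byte sequences that differ only by a repeated byte align at zero cost.
print(dtw_distance(list(b"GET"), list(b"GETT")))  # prints 0.0
```

Because DTW allows one element to match several consecutive elements of the other sequence, it tolerates local stretching that a position-by-position comparison would penalize.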
Billions of people worldwide are affected by vision impairment, mainly caused by age-related degradation and refractive errors. Diabetic Retinopathy (DR) and Macular Hole (MH) are among the most prevalent senescent ...
Accidents caused by drivers who exhibit unusual behavior are putting road safety at ever-greater risk. When one or more vehicle nodes behave in this way, it can put other nodes in danger and result in potentially cata...
In practical abnormal traffic detection scenarios, traffic often appears as drifting, imbalanced, and rarely labeled streams, and how to effectively identify malicious traffic in such complex situations has become a challenge for malicious traffic *** have extensive studies on malicious traffic detection with a single challenge, but the detection of complex traffic has not been widely *** adaptive random forests (QARF) is proposed to detect traffic streams with concept drift, imbalance, and lack of labeled *** is an online active-learning-based approach which combines the adaptive random forests method and adaptive margin sampling *** achieves querying a small number of instances from unlabeled traffic streams to obtain effective *** conduct experiments using the NSL-KDD dataset to evaluate the performance of *** is compared with other state-of-the-art *** experimental results show that QARF obtains 98.20% accuracy on the NSL-KDD *** performs better than other state-of-the-art methods in comparisons.
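Margin sampling, the query strategy named above, can be illustrated with a short sketch: a label is requested only when the classifier's two most probable classes are close. The abstract does not describe how the margin is adapted, so this sketch assumes a fixed threshold. The names `margin`, `should_query`, and `threshold` are hypothetical.

```python
def margin(probabilities):
    """Difference between the two largest class probabilities."""
    top_two = sorted(probabilities, reverse=True)[:2]
    return top_two[0] - top_two[1]


def should_query(probabilities, threshold=0.2):
    """Request a label when the prediction is uncertain (small margin)."""
    return margin(probabilities) < threshold


# Predicted class probabilities for a small stream of unlabeled flows.
stream = [[0.90, 0.10], [0.55, 0.45], [0.40, 0.35, 0.25], [0.05, 0.95]]
queried = [i for i, p in enumerate(stream) if should_query(p)]
print(queried)  # prints [1, 2]
```

Only the uncertain instances are sent for labeling, which is how such an approach keeps the labeling budget small on a mostly unlabeled stream.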