WCE (Wireless Capsule Endoscopy) is a new technology that combines computer vision and medicine, allowing doctors to visualize conditions inside the intestines and achieving good diagnostic results. However, because of the complex intestinal environment and the limited pixel resolution of WCE videos, lesions are not easily detectable, and it takes an experienced doctor 1–2 h to analyze a complete WCE video. Computer-aided diagnostic methods that assist or even replace manual WCE diagnosis therefore have significant application value. To address intestinal lesion detection in WCE videos, this paper proposes TSD-YOLO, a multi-scale feature fusion network model based on the YOLO (You Only Look Once) architecture, with three components: (I) a Tiny Detection Layer to avoid the loss of shallow feature information for tiny-scale targets; (II) a simple, parameter-free attention module (SimAM) integrated at the neck to better extract local lesion features and fuse features; (III) a new loss function, DIoU (Distance Intersection over Union), to better achieve bounding box regression for target detection. The model was validated using the WCE dataset from Kyushu University Hospital. On this dataset of 18,000 images, the evaluation metrics of our model for 12 types of lesions outperformed existing reported results from advanced models, and the mAP (mean Average Precision) and precision improved by 3.7% and 0.9% over the benchmark model.
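The DIoU term named in (III) can be illustrated with a minimal sketch. It assumes the commonly used definition, DIoU = IoU − d²/c², where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The function names `diou` and `diou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not taken from the TSD-YOLO implementation.

```python
def diou(box_a, box_b):
    """Distance-IoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed definition: IoU minus the squared center distance divided by
    the squared diagonal of the smallest enclosing box.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2

    return iou - d2 / c2 if c2 > 0 else iou


def diou_loss(pred_box, target_box):
    """Regression loss: 0 for identical boxes, larger as boxes drift apart."""
    return 1.0 - diou(pred_box, target_box)
```

Unlike a plain IoU loss, the center-distance penalty still gives a useful signal when the predicted and target boxes do not overlap at all, which matters for very small targets.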
The growing prevalence of knowledge reasoning using knowledge graphs (KGs) has substantially improved the accuracy and efficiency of intelligent medical ***, current models primarily integrate electronic medical records (EMRs) and KGs into the knowledge reasoning process, ignoring the differing significance of various types of knowledge in EMRs and the diverse data types present in the *** better integrate EMR text information, we propose a novel intelligent diagnostic model named the Graph ATtention network incorporating Text representation in knowledge reasoning (GATiT), which comprises text representation, subgraph construction, knowledge reasoning, and diagnostic *** the text representation process, GATiT uses a pre-trained model to obtain text representations of the EMRs and additionally enhances embeddings by including chief complaint information and numerical information in the *** the subgraph construction process, GATiT constructs text subgraphs and disease subgraphs from the KG, utilizing EMR text and the disease to be *** differentiate the varying importance of nodes within the subgraphs, features such as node categories, relevance scores, and other relevant factors are introduced into the text ***-passing strategy and attention weight calculation of the graph attention network are adjusted to learn these features in the knowledge reasoning ***, in the diagnostic classification process, the interactive attention-based fusion method integrates the results of knowledge reasoning with text representations to produce the final diagnosis *** results on multi-label and single-label EMR datasets demonstrate the model’s superiority over several state-of-the-art methods.
Knowledge distillation is often used for model compression and has achieved a great breakthrough in image classification, but there still remains scope for improvement in object detection, especially for knowledge extraction of small *** main problem is that the features of small objects are often polluted by background noise and not prominent due to down-sampling in the convolutional neural network (CNN), resulting in the insufficient refinement of small object features during *** this paper, we propose Hierarchical Matching Knowledge Distillation Network (HMKD) that operates on pyramid levels P2 to P4 of the feature pyramid network (FPN), aiming to intervene on small object features before *** employ an encoder-decoder network to encapsulate low-resolution, highly semantic information, akin to eliciting insights from profound strata within a teacher network, and then match the encapsulated information with high-resolution feature values of small objects from shallow layers as the *** this period, we use an attention mechanism to measure the relevance of the inquiry to the feature *** in the process of decoding, knowledge is distilled to the *** addition, we introduce a supplementary distillation module to mitigate the effects of background *** show that our method achieves excellent improvements for both one-stage and two-stage object ***, applying the proposed method on Faster R-CNN achieves 41.7% mAP on COCO2017 (ResNet50 as the backbone), which is 3.8% higher than that of the baseline.
Convolutional neural networks struggle to accurately handle changes in angles and twists in the direction of images, which affects their ability to recognize patterns based on internal feature levels. In contrast, CapsNet overcomes these limitations by vectorizing information through increased directionality and magnitude, ensuring that spatial information is not overlooked. Therefore, this study proposes a novel expression recognition technique called CAPSULE-VGG, which combines the strengths of CapsNet and convolutional neural networks. By refining and integrating features extracted by a convolutional neural network before introducing them into CapsNet, our model enhances facial recognition capabilities. Compared to traditional neural network models, our approach offers faster training pace, improved convergence speed, and higher accuracy rates approaching stability. Experimental results demonstrate that our method achieves recognition rates of 74.14% for the FER2013 expression dataset and 99.85% for the CK+ expression dataset. By contrasting these findings with those obtained using conventional expression recognition techniques and incorporating CapsNet’s advantages, we effectively address issues associated with convolutional neural networks while increasing expression identification accuracy.
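The idea of encoding information in both the direction and the magnitude of a vector can be sketched with the "squash" nonlinearity from the original CapsNet formulation. The abstract does not say whether CAPSULE-VGG uses exactly this function, so the snippet below is an illustrative assumption; the name `squash` and the `eps` guard are hypothetical choices.

```python
import math


def squash(s, eps=1e-9):
    """CapsNet-style squash: keep the direction of s, map its length into [0, 1).

    v = (|s|^2 / (1 + |s|^2)) * (s / |s|); eps only guards against division
    by zero for an all-zero input vector.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]
```

Under this formulation, the output length can be read as the probability that the entity a capsule represents is present, while the direction carries pose-like properties such as orientation.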
The linear regression model is one of the important learning models for classification tasks. However, data from practical applications inevitably contain some noise or corruption, which may lead to the decline of t...
Background The annotation of fashion images is a significantly important task in the fashion industry as well as social media and ***, owing to the complexity and diversity of fashion images, this task entails multiple challenges, including the lack of fine-grained captions and confounders caused by dataset ***, confounders often cause models to learn spurious correlations, thereby reducing their generalization *** In this work, we propose the Deconfounded Fashion Image Captioning (DFIC) framework, which first uses multimodal retrieval to enrich the predicted captions of clothing, and then constructs a detailed causal graph using causal inference in the decoder to perform *** retrieval is used to obtain semantic words related to image features, which are input into the decoder as prompt words to enrich sentence *** the decoder, causal inference is applied to disentangle visual and semantic features while concurrently eliminating visual and language *** Overall, our method can not only effectively enrich the captions of target images, but also greatly reduce confounders caused by the *** verify the effectiveness of the proposed framework, the model was experimentally verified using the FACAD dataset.
Efficient text classification is crucial for information processing due to the generation of massive text data. However, the uneven distribution and redundancy of text data often result in poor classification performa...
The grey wolf optimization algorithm (GWO) is a new metaheuristic algorithm. The GWO has the advantages of simple structure, few parameters to adjust, and high efficiency, and has been applied in various optimization problems. However, the original GWO search process is guided entirely by the best three wolves, resulting in low population diversity, susceptibility to local optima, slow convergence rate, and imbalance in development and exploration. In order to address these shortcomings, this paper proposes an adaptive dynamic self-learning grey wolf optimization algorithm (ASGWO). First, the convergence factor was segmented and nonlinearized to balance the global search and local search of the algorithm and improve the convergence rate. Second, the wolves in the original GWO approach the leader in a straight line, which is too simple and ignores a lot of information on the path. Therefore, a dynamic logarithmic spiral that nonlinearly decreases with the number of iterations was introduced to expand the search range of the algorithm in the early stage and enhance local development in the later stage. Then, the fixed step size in the original GWO can lead to algorithm oscillations and an inability to escape local optima. A dynamic self-learning step size was designed to help the algorithm escape from local optima and prevent oscillations by reasonably learning the current evolution success rate and iteration count. Finally, the original GWO has low population diversity, which makes the algorithm highly susceptible to becoming trapped in local optima. A novel position update strategy was proposed, using the global optimum and randomly generated positions as learning samples, and dynamically controlling the influence of learning samples to increase population diversity and avoid premature convergence of the algorithm. Through comparison with traditional algorithms, such as GWO, PSO, WOA, and the new variant algorithms EOGWO and SOGWO on 23 classical test functions, ASGW
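For context, the baseline that ASGWO modifies can be sketched as follows. This is a minimal version of the standard GWO update, in which the three best wolves guide every move and the convergence factor `a` decreases linearly from 2 to 0. It is not the ASGWO variant described above; the function name `gwo_minimize`, its parameters, and the clamping to bounds are illustrative assumptions.

```python
import random


def gwo_minimize(objective, dim, bounds, n_wolves=20, n_iters=200, seed=0):
    """Baseline grey wolf optimizer (standard GWO, not ASGWO)."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best, best_val = None, float("inf")

    for t in range(n_iters):
        # The three best wolves (alpha, beta, delta) guide the whole pack.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        if objective(leaders[0]) < best_val:
            best, best_val = list(leaders[0]), objective(leaders[0])

        # Linear convergence factor: the schedule ASGWO replaces with a
        # segmented, nonlinear one.
        a = 2.0 * (1.0 - t / n_iters)

        new_wolves = []
        for w in wolves:
            pos = []
            for j in range(dim):
                candidates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a    # exploration vs exploitation
                    C = 2.0 * rng.random()            # random leader weighting
                    D = abs(C * leader[j] - w[j])     # distance to the leader
                    candidates.append(leader[j] - A * D)
                x = sum(candidates) / 3.0             # average of the three pulls
                pos.append(min(hi, max(lo, x)))       # keep inside the bounds
            new_wolves.append(pos)
        wolves = new_wolves

    # Also consider the final population.
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best, best_val = list(final), objective(final)
    return best, best_val
```

A call such as `gwo_minimize(lambda x: sum(v * v for v in x), dim=5, bounds=(-10.0, 10.0))` minimizes the sphere function. Because every wolf is pulled toward the same three leaders, the population tends to collapse quickly, which is the diversity problem the abstract describes.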
With the rapid development of information technology, how to ensure the secure transmission and storage of data has become an important issue in today's society. The experiment innovatively proposes an encryption method to improve the security of computer communication systems under various attack modes. This method is based on Chosen Cipher-text Attack (CCA) and an improved adversarial neural network. In the process, the adversarial neural network is first used to encrypt the data. A new symmetric encryption system structure, namely Adversarial Neural Cryptography (ANC), is introduced to merge with a Generative Adversarial Network (GAN). In addition, a Chosen Cipher-text Attack-Adversarial Neural Cryptography (CCA-ANC)-based encryption method is proposed to build a computer communication data encryption system. The GAN is adjusted and optimized based on the CCA test results to jointly realize the encryption of data transmission. The experiment uses two public datasets: CAIDA and UNIBS. After removing redundant data, 1520 data items in the CAIDA dataset are selected as the validation set and named dataset A. 380 data items in the UNIBS dataset are selected as the test set and named dataset B. The experiment selects the iteration count, AUC value, classification accuracy, and other performance indicators. The results showed that the research model reached a stable state with a fitness value of 0.612 after 38 iterations. Compared with existing technologies such as Blockchain technology, X-IDEA, and HS-IQRG algorithms, the AUC of the proposed method was 0.978. On dataset A, the research method had a maximum classification accuracy of 98.24% when the system iterated 75 times. The encryption time of the research method on dataset A was only 0.0424 s when the system iterated 44 times. The above results all show that the research method can encrypt data. Meanwhile, this method learns a safe password generation method in the automated system, which makes certain contributions to compute
Software-Defined Networking (SDN) updates network flexibility by decoupling the data plane from the control plane, employing a logically centralized yet physically distributed multi-controller architecture. The optimal p...