In drone aerial image analysis, object detection is particularly challenging: complex terrain, extreme variation in target size, suboptimal shooting angles, and varying lighting conditions all make recognition harder. In recent years, DETR, a detector based on the Transformer architecture, has eliminated traditional post-processing steps such as NMS (Non-Maximum Suppression), thereby simplifying the object detection pipeline and improving detection accuracy, which has attracted widespread attention in the academic community. However, DETR suffers from slow training convergence, difficult query optimization, and high computational cost, which hinder its practical application. To address these issues, this paper proposes a new object detection model called OptiDETR. First, the model replaces the traditional Transformer encoder with a more efficient hybrid encoder, which significantly enhances feature processing through intra-scale and cross-scale feature interaction and fusion. Second, an IoU (Intersection over Union)-aware query selection mechanism is introduced, which adds IoU constraints during training to provide higher-quality initial object queries for the decoder, significantly improving decoding performance. In addition, OptiDETR integrates the SW-Block into the DETR decoder, leveraging the strengths of Swin Transformer in global context modeling and feature representation to further improve detection performance and efficiency. To tackle small object detection, this study employs the SAHI algorithm for data augmentation. In a series of experiments, OptiDETR improved mAP (mean Average Precision) by more than two percentage points compared with current mainstream object detection models. Furthermore, ther
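The IoU quantity that the query selection mechanism above constrains can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates; the function name `box_iou` is hypothetical, and this is not OptiDETR's implementation, only the standard definition of Intersection over Union.

```python
def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is a tuple (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    This layout is an assumption made for the example.
    """
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate zero-area boxes.
    return inter / union if union > 0 else 0.0
```

A predicted box that overlaps its ground-truth box well scores close to 1, while a poorly localized one scores close to 0, which is why IoU is a natural signal for ranking candidate queries.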
MXene is a promising energy storage material for miniaturized microbatteries and microsupercapacitors (MSCs). Despite its superior electrochemical performance, only a few studies have reported MXene-based ultrahigh-rate (>1000 mV s^(−1)) on-paper MSCs, mainly due to the reduced electrical conductance of MXene films deposited on ***, ultrahigh-rate metal-free on-paper MSCs based on heterogeneous MXene/poly(3,4-ethylenedioxythiophene)-poly(styrenesulfonate) (PEDOT:PSS)-stack electrodes are fabricated through the combination of direct ink writing and femtosecond laser *** a footprint area of only 20 mm^(2), the on-paper MSCs exhibit excellent high-rate capacitive behavior with an areal capacitance of 5.7 mF cm^(−2) and long cycle life (>95% capacitance retention after 10,000 cycles) at a high scan rate of 1000 mV s^(−1), outperforming most of the present on-paper ***, the heterogeneous MXene/PEDOT:PSS electrodes can interconnect individual MSCs into metal-free on-paper MSC arrays, which can also be simultaneously charged/discharged at 1000 mV s^(−1), showing scalable capacitive *** heterogeneous MXene/PEDOT:PSS stacks are a promising electrode structure for on-paper MSCs to serve as ultrafast miniaturized energy storage components for emerging paper electronics.
This study examines the effectiveness of artificial intelligence techniques in generating high-quality environmental data for species introduction site selection *** Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis data with Variational Autoencoder (VAE) and Generative Adversarial Network (GAN), the network framework model (SAE-GAN) is proposed for environmental data *** model combines two popular generative models, GAN and VAE, to generate features conditional on categorical data embedding after SWOT *** model is capable of generating features that resemble real feature distributions and adding sample factors to more accurately track individual sample *** data is used to retain more semantic information to generate *** model was applied to species in Southern California, USA, citing SWOT analysis data to train the *** show that the model is capable of integrating data from more comprehensive analyses than traditional methods and generating high-quality reconstructed data from them, effectively solving the problem of insufficient data collection in development *** model is further validated by the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) classification assessment commonly used in the environmental data *** study provides a reliable and rich source of training data for species introduction site selection systems and makes a significant contribution to ecological and sustainable development.
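The abstract does not say how the TOPSIS assessment is configured, so the following is only a minimal sketch of the textbook TOPSIS procedure. It assumes a small decision matrix of candidate sites (rows) scored on criteria (columns); the function name `topsis`, the weights, and the benefit/cost flags are all hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : list of rows, one per alternative, one value per criterion
    weights : one weight per criterion (assumed to sum to 1)
    benefit : one bool per criterion; True if larger is better, False if smaller is better
    """
    n_crit = len(weights)

    # Step 1: vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_crit)]
        for row in matrix
    ]

    # Step 2: ideal (best) and anti-ideal (worst) value per criterion.
    columns = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    # Step 3: relative closeness = distance to worst / (distance to best + distance to worst).
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores
```

Alternatives are then ranked by descending score: a score near 1 means the alternative lies close to the ideal solution and far from the anti-ideal one.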
Graph neural networks have proven effective for collaborative filtering on user-item interaction graphs. However, most existing recommendation models depend heavily on abundant, high-quality datasets ...
Breast cancer in women is becoming a serious cause of morbidity and mortality worldwide. This paper aims to design a hybrid model using various machine learning classification algorithms like k-Near...
In today's era of rapid internet expansion, the need for efficient information retrieval tools is paramount. Recommendation systems serve as indispensable instruments, aiding users in navigating vast datasets swif...
Emotion recognition by facial expression is a challenging task that has gotten much attention in recent years. Deep neural networks are used to extract pertinent information from facial photographs and categorise them...
Most current semantic communication (SemCom) frameworks focus on image transmission and do not address the problem of how to deliver digital signals without any semantic features. This paper propos...
Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive evaluation of large multimodal models, such as GPT4V and Gemini, in various text-related visual tasks including text recognition, scene text-centric visual question answering (VQA), document-oriented VQA, key information extraction (KIE), and handwritten mathematical expression recognition (HMER). To facilitate the assessment of optical character recognition (OCR) capabilities in large multimodal models, we propose OCRBench, a comprehensive evaluation benchmark. OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available. Furthermore, our study reveals both the strengths and weaknesses of these models, particularly in handling multilingual text, handwritten text, non-semantic text, and mathematical expression *** importantly, the baseline results presented in this study could provide a foundational framework for the conception and assessment of innovative strategies targeted at enhancing zero-shot multimodal *** evaluation pipeline and benchmark are available at https://***/Yuliang-Liu/Multimodal OCR.
Solar flares are one of the strongest outbursts of solar activity, posing a serious threat to Earth’s critical infrastructure, such as communications, navigation, power, and ***, it is essential to accurately predict solar flares in order to ensure the safety of human ***, the research focuses on two directions: first, identifying predictors with more physical information and higher prediction accuracy, and second, building flare prediction models that can effectively handle complex observational *** terms of flare observability and predictability, this paper analyses multiple dimensions of solar flare observability and evaluates the potential of observational parameters in *** flare prediction models, the paper focuses on data-driven models and physical models, with an emphasis on the advantages of deep learning techniques in dealing with complex and high-dimensional *** reviewing existing traditional machine learning, deep learning, and fusion methods, the key roles of these techniques in improving prediction accuracy and efficiency are *** prevailing challenges, this study discusses the main challenges currently faced in solar flare prediction, such as the complexity of flare samples, the multimodality of observational data, and the interpretability of *** conclusion summarizes these findings and proposes future research directions and potential technology advancement.