Scene text detection is an important task in computer *** this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text *** primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices like drones, closed-circuit television cameras, and other embedded *** achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation, including replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating Bi-directional Feature Pyramid Network (BiFPN) into the *** results demonstrate a remarkable 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at the intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, *** work represents a significant advancement in scene text detection, striking a balance between speed and accuracy, making it well-suited for performance-constrained environments.
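To make the cost argument behind the depth-wise convolution swap concrete, the following minimal sketch compares weight counts for a standard convolution and a depth-wise convolution followed by a 1x1 point-wise convolution. The point-wise stage, the function names, and the layer sizes are assumptions for illustration; the abstract does not specify the exact layer configuration used in YOLOv5ST.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise k x k convolution plus a 1x1 point-wise one.

    Assumption: the depth-wise layer is paired with a point-wise layer to
    change the channel count, as in common lightweight backbones.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
    std = standard_conv_params(3, 64, 128)
    sep = depthwise_separable_params(3, 64, 128)
    print(f"standard: {std}, depth-wise separable: {sep}, ratio: {std / sep:.1f}x")
```

Fewer weights does not translate one-to-one into faster inference, which is why the abstract reports measured speed rather than parameter counts.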
Since different kinds of face forgeries leave similar forgery traces in videos, learning the common features from different kinds of forged faces would achieve promising generalization ability of forgery ***, to accurately detect known forgeries while ensuring high generalization ability of detecting unknown forgeries, we propose an intra-inter network (IIN) for face forgery detection (FFD) in videos with continual *** proposed IIN mainly consists of three modules, i.e., intra-module, inter-module, and forged trace masking module (FTMM). Specifically, the intra-module is trained for each kind of face forgery by supervised learning to extract special features, while the inter-module is trained by self-supervised learning to extract the common *** a result, the common and special features of the different forgeries are decoupled by the two feature learning modules, and then the decoupled common features can be utilized to achieve high generalization ability for ***, the FTMM is deployed for contrastive learning to further improve detection *** experimental results on the FaceForensics++ dataset demonstrate that the proposed IIN outperforms state-of-the-art methods in ***, the generalization ability of the IIN verified on the DFDC and Celeb-DF datasets demonstrates that the proposed IIN significantly improves the generalization ability for FFD.
This study examines the effectiveness of artificial intelligence techniques in generating high-quality environmental data for species introductory site selection *** Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis data with Variational Autoencoder (VAE) and Generative Adversarial Network (GAN), the network framework model (SAE-GAN) is proposed for environmental data *** model combines two popular generative models, GAN and VAE, to generate features conditional on categorical data embedding after SWOT *** model is capable of generating features that resemble real feature distributions and adding sample factors to more accurately track individual sample *** data is used to retain more semantic information to generate *** model was applied to species in Southern California, USA, citing SWOT analysis data to train the *** show that the model is capable of integrating data from more comprehensive analyses than traditional methods and generating high-quality reconstructed data from them, effectively solving the problem of insufficient data collection in development *** model is further validated by the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) classification assessment commonly used in the environmental data *** study provides a reliable and rich source of training data for species introduction site selection systems and makes a significant contribution to ecological and sustainable development.
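For readers unfamiliar with TOPSIS, the following is a minimal sketch of the textbook procedure: vector-normalize the decision matrix, apply weights, and score each alternative by its relative closeness to the ideal solution. The function name, the candidate sites, the criteria, and the weights are all hypothetical; the abstract does not describe how the authors configured their assessment.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each row (alternative).

    matrix  : list of rows, one row of criterion values per alternative
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller
    Assumes no criterion column is all zeros and the rows are not all equal.
    """
    n_cols = len(weights)
    # Vector normalization of each criterion column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal best and worst values per criterion.
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


if __name__ == "__main__":
    # Hypothetical sites scored on habitat quality, water access, and cost.
    sites = [[0.9, 0.8, 10.0], [0.5, 0.4, 30.0], [0.7, 0.9, 20.0]]
    print(topsis(sites, weights=[0.4, 0.4, 0.2], benefit=[True, True, False]))
```

A score closer to 1 means the alternative sits nearer the ideal solution and farther from the worst one.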
The distribution of the nuclear ground-state spin in a two-body random ensemble (TBRE) was studied using a general classification neural network (NN) model with two-body interaction matrix elements as input features and the corresponding ground-state spins as labels or output *** quantum many-body system problem exceeds the capability of our optimized NNs in terms of accurately predicting the ground-state spin of each sample within the ***, our NN model effectively captured the statistical properties of the ground-state spin because it learned the empirical regularity of the ground-state spin distribution in TBRE, as discovered by physicists.
Point cloud completion is crucial in point cloud processing, as it can repair and refine incomplete 3D data, ensuring more accurate models. However, current point cloud completion methods commonly face a challenge: th...
In the analysis of drone aerial images, object detection tasks are particularly challenging, especially in the presence of complex terrain structures, extreme differences in target sizes, suboptimal shooting angles, and varying lighting conditions, all of which exacerbate the difficulty of recognition. In recent years, the DETR model based on the Transformer architecture has eliminated traditional post-processing steps such as NMS (Non-Maximum Suppression), thereby simplifying the object detection process and improving detection accuracy, which has garnered widespread attention in the academic community. However, DETR has limitations such as slow training convergence, difficulty in query optimization, and high computational costs, which hinder its application in practical fields. To address these issues, this paper proposes a new object detection model called OptiDETR. This model first employs a more efficient hybrid encoder to replace the traditional Transformer encoder. The new encoder significantly enhances feature processing capabilities through internal and cross-scale feature interaction and fusion logic. Secondly, an IoU (Intersection over Union) aware query selection mechanism is introduced. This mechanism adds IoU constraints during the training phase to provide higher-quality initial object queries for the decoder, significantly improving the decoding performance. Additionally, the OptiDETR model integrates SW-Block into the DETR decoder, leveraging the advantages of Swin Transformer in global context modeling and feature representation to further enhance the performance and efficiency of object detection. To tackle the problem of small object detection, this study employs the SAHI algorithm for data augmentation. Across a series of experiments, the model improved mAP (mean Average Precision) by more than two percentage points compared to current mainstream object detection models. Furthermore, ther
Multiarmed bandit (MAB) models are widely used for sequential decision-making in uncertain environments, such as resource allocation in computer communication systems. A critical challenge in interactive multiagent systems with bandit feedback is to explore and understand the equilibrium state to ensure stable and tractable system performance.
As the penetration ratio of wind power in active distribution networks continues to increase, the system exhibits some characteristics such as randomness and *** and accurate short-term wind power prediction is essential for algorithms like scheduling and optimization *** on the spatio-temporal features of Numerical Weather Prediction (NWP) data, this paper proposes the WVMD_DSN (Whale Optimization Algorithm, Variational Mode Decomposition, Dual Stream Network) *** model first applies the Pearson correlation coefficient (PCC) to choose NWP features that correlate strongly with wind power to form the feature ***, it decomposes the feature set using Variational Mode Decomposition (VMD) to eliminate the nonstationarity and obtains Intrinsic Mode Functions (IMFs). Here the Whale Optimization Algorithm (WOA) is applied to optimise the key parameters of VMD, namely the number of mode components K and penalty factor ***, incorporating an attention mechanism (AM), Squeeze-Excitation Network (SENet), and Bidirectional Gated Recurrent Unit (BiGRU), it constructs the dual-stream network (DSN) for short-term wind power *** experiments demonstrate that the WVMD_DSN model outperforms existing baseline algorithms and exhibits good generalization *** relevant code is available at https://***/ruanyuyuan/*** (accessed on 20 August 2024).
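The PCC-based feature selection step can be illustrated with a short sketch. This is not the authors' code: the function names, the feature names, the toy values, and the 0.6 threshold are all assumptions, since the abstract does not state which features or cutoff were used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, power, threshold=0.6):
    """Keep features whose |PCC| with wind power reaches the threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, power)) >= threshold
    ]


if __name__ == "__main__":
    # Toy NWP-style data; names and numbers are made up.
    nwp = {
        "wind_speed": [1.0, 2.0, 3.0, 4.0, 5.0],
        "humidity": [1.0, -1.0, 0.0, -1.0, 1.0],
    }
    power = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(select_features(nwp, power))
```

Using the absolute value keeps strongly negatively correlated features as well, which is one reasonable reading of "strong correlation".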
Let n≥2 be an integer. We give necessary and sufficient conditions for an integral quadratic form over dyadic local fields to be n-universal by using invariants from Beli's theory of bases of norm ***, we provide a minimal set for testing n-universal quadratic forms over dyadic local fields, as an analogue of Bhargava and Hanke's 290-theorem (or Conway and Schneeberger's 15-theorem) on universal quadratic forms with integer coefficients.
This paper explores a double quantum images representation (DNEQR) model that allows for simultaneous storage of two digital images in a quantum superposition ***, a new type of two-dimensional hyperchaotic system based on sine and logistic maps is investigated, offering a wider parameter space and better chaotic behavior compared to the sine and logistic *** on the DNEQR model and the hyperchaotic system, a double quantum images encryption algorithm is ***, two classical plaintext images are transformed into quantum states using the DNEQR ***, the proposed hyperchaotic system is employed to iteratively generate pseudo-random *** chaotic sequences are utilized to perform pixel value and position operations on the quantum image, resulting in changes to both pixel values and ***, the ciphertext image can be obtained by qubit-level diffusion using two XOR operations between the position-permuted image and the pseudo-random *** corresponding quantum circuits are also *** results demonstrate that the proposed scheme ensures the security of the images during transmission, improves the encryption efficiency, and enhances anti-interference and anti-attack capabilities.
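The XOR diffusion idea can be sketched classically. The code below is only an analogue under loud assumptions: it uses the ordinary one-dimensional logistic map as a stand-in for the paper's two-dimensional hyperchaotic system (whose equations are not given in the abstract), it operates on plain byte values instead of quantum states, and it omits the position permutation step. All names and parameter values are hypothetical.

```python
def logistic_keystream(x0, r, length, burn_in=100):
    """Generate `length` pseudo-random bytes from the 1-D logistic map.

    Stand-in for the paper's 2-D hyperchaotic system, which is not
    specified in the abstract. x0 in (0, 1) and r near 4 act as the key.
    """
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)   # quantize to one byte
    return stream


def xor_diffuse(pixels, keystream):
    """XOR each pixel with a keystream byte; applying it twice undoes it."""
    return [p ^ k for p, k in zip(pixels, keystream)]


if __name__ == "__main__":
    plain = [0, 17, 128, 255, 42]     # toy 8-bit pixel values
    key = logistic_keystream(x0=0.3141, r=3.99, length=len(plain))
    cipher = xor_diffuse(plain, key)
    recovered = xor_diffuse(cipher, key)
    print(cipher, recovered == plain)
```

Because XOR is its own inverse, decryption regenerates the same keystream from the same initial value and parameter, then repeats the operation. This sketch makes no security claim.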