Multispectral pedestrian detection uses infrared images to supplement visible light images with reliable information, which gives it significant advantages in low-light conditions and background occlusion scenarios. However, while cross-modal feature extraction and fusion continue to improve, maintaining the model’s detection speed remains a challenge. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model’s detection efficiency. The model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing the model’s focus on different spatial positions and sharing the weighted feature data across modalities, thereby reducing the interference between multi-modal features. Subsequently, lightweight modules with depthwise separable convolution are incorporated to reduce the model’s parameter count and computational load through channel-wise and point-wise convolutions. The proposed network model was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the multispectral pedestrian detection technology proposed in this paper.
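To make the parameter saving from depthwise separable convolution concrete, the following minimal sketch counts the weights of a standard convolution and of its channel-wise plus point-wise replacement. The layer size (256 input channels, 256 output channels, 3 x 3 kernel) is a hypothetical example and is not taken from the paper; biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a channel-wise k x k convolution followed by a
    1 x 1 point-wise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, chosen for illustration only.
standard = conv_params(256, 256, 3)
separable = depthwise_separable_params(256, 256, 3)

# The ratio equals 1 / c_out + 1 / k**2, so the saving grows with kernel size.
ratio = separable / standard
print(standard, separable, round(ratio, 4))
```

For this assumed layer the separable version needs roughly one ninth of the weights, which is the kind of reduction the abstract attributes to the lightweight modules.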
More and more federated cloud platforms have been constructed to deal with non-trivial large-scale applications, which typically require a certain level of quality-of-service (QoS) guarantee. However, most of existing c...
Multimodal sentiment analysis aims to understand people’s emotions and opinions from diverse [...]. Concatenating or multiplying various modalities is a traditional multi-modal sentiment analysis fusion method. This fusion method does not utilize the correlation information between modalities. To solve this problem, this paper proposes a model based on a multi-head attention mechanism. First, after preprocessing the original [...], the feature representation is converted into a sequence of word vectors, and positional encoding is introduced to better capture the semantic and sequential information in the input [...]. Next, the input coding sequence is fed into the transformer model for further processing and [...]. In the transformer layer, a cross-modal attention consisting of a pair of multi-head attention modules is employed to reflect the correlation between modalities. Finally, the processed results are input into the feedforward neural network to obtain the emotional output through the classification layer. Through the above processing flow, the model can capture semantic information and contextual relationships and achieve good results in various natural language processing tasks. The model was tested on the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) and Multimodal EmotionLines Dataset (MELD), achieving an accuracy of 82.04% and an F1 score of 80.59% on the former dataset.
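The cross-modal attention described above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the paper's implementation: it uses a single head with no learned projections, and it reads the "pair" of attention modules as one module per direction (each modality querying the other). The `text` and `audio` feature values are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention in which the queries come from one
    modality and the keys/values come from another."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the other modality's values.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# Hypothetical 2-D features: 2 text tokens and 3 audio frames.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# The assumed pair of modules: one attention pass per direction.
text_from_audio = cross_attention(text, audio, audio)
audio_from_text = cross_attention(audio, text, text)
```

Each output sequence keeps the length of the querying modality, so the two results can be passed on separately or combined before the feedforward stage.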
This paper presents a design method to implement an antenna array characterized by ultra-wide beam coverage, low profile, and low Sidelobe Level (SLL) for the application of Unmanned Aerial Vehicle (UAV) air-to-ground communication. The array consists of ten broadside-radiating, ultrawide-beamwidth elements that are cascaded by a central-symmetry series-fed network with tapered currents following a Dolph-Chebyshev distribution to provide low SLL. First, an innovative design of an end-fire Huygens source antenna that is compatible with a metal ground is presented. A low-profile, half-mode Microstrip Patch Antenna (MPA) serves as the magnetic dipole and a monopole serves as the electric dipole, constructing the compact, end-fire, grounded Huygens source antenna. Then, two oppositely oriented end-fire Huygens source antennas are seamlessly integrated into a single antenna element in the form of a monopole-loaded MPA to accomplish the ultrawide, broadside-radiating [...]. [...] consideration has been applied to the design of the series-fed network as well as the antenna element to compensate for the adverse coupling effects between elements on the radiation [...]. [...] indicates an ultrawide Half-Power Beamwidth (HPBW) of 161° and a low SLL of -25 dB with a high gain of 12 dBi under a single-layer [...]. The concurrent ultrawide beamwidth and low SLL make it particularly attractive for applications of UAV air-to-ground communication.
Open-vocabulary object detection (OVD) models are considered to be Large Multi-modal Models (LMM), due to their extensive training data and a large number of parameters. Mainstream OVD models prioritize object coarse-...
As an effective method for data dimensionality reduction, feature selection could improve the classification accuracy and reduce the computational cost when dealing with high-dimensional data. Feature selection is ess...
Advanced Driver Assistance Systems (ADAS) are designed to prevent collisions, identify the condition of drivers while operating vehicles, and provide additional information to enhance drivers' awareness of potenti...
To solve the problem that the massive amount of information and real-time processing in the IoT system puts pressure on the computing resources of the whole system, the industry often adopts the computation offloading...
Wireless sensor network (WSN) positioning has a good effect on indoor positioning, so it has received extensive attention in the field of [...]. Non-line-of-sight (NLOS) is a primary challenge in indoor complex environments. In this paper, a robust localization algorithm based on a Gaussian mixture model and fitting polynomial is proposed to solve the problem of NLOS [...]. First, fitting polynomials are used to predict the measured values. The residuals of predicted and measured values are clustered by a Gaussian mixture model (GMM). The LOS probability and NLOS probability are calculated according to the clustering results. The measured values are filtered by a Kalman filter (KF), a variable parameter unscented Kalman filter (VPUKF) and a variable parameter particle filter (VPPF) in [...]. The distance value processed by KF and VPUKF and the distance value processed by KF, VPUKF and VPPF are combined according to [...]. Finally, the maximum likelihood method is used to calculate the position coordinate [...]. Through simulation comparison, the proposed algorithm has better positioning accuracy than several comparison algorithms in this paper. It also shows strong robustness in strong NLOS environments.
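The residual-clustering step can be sketched as a two-component, one-dimensional Gaussian mixture fitted by expectation-maximization. This is an illustrative sketch, not the paper's algorithm: the residual values are invented, the initialisation is a simple heuristic, and treating the lower-mean component as LOS (on the assumption that NLOS adds a positive range bias) is an assumption made here.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(xs, weights, means, variances):
    """E-step: posterior probability of each component for each sample."""
    resp = []
    for x in xs:
        p = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
        total = sum(p) or 1e-300  # guard against numerical underflow
        resp.append([pk / total for pk in p])
    return resp


def fit_two_component_gmm(xs, iterations=50):
    """EM for a two-component 1-D mixture, initialised at the data extremes."""
    lo, hi = min(xs), max(xs)
    means = [lo, hi]
    spread = max((hi - lo) ** 2 / 4, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        resp = responsibilities(xs, weights, means, variances)
        for k in range(2):  # M-step: re-estimate each component
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # variance floor for stability
    return weights, means, variances


# Invented residuals (measured minus predicted distance, in metres).
residuals = [0.1, -0.2, 0.05, 0.0, -0.1, 3.0, 3.2, 2.9, 3.1]
weights, means, variances = fit_two_component_gmm(residuals)
posteriors = responsibilities(residuals, weights, means, variances)

# Assumption: component 0 (lower mean) is LOS, component 1 is NLOS.
p_los = [p[0] for p in posteriors]
p_nlos = [p[1] for p in posteriors]
```

The per-sample posteriors `p_los` and `p_nlos` play the role of the LOS and NLOS probabilities mentioned in the abstract; how the paper then uses them to weight the filtered distances is not recoverable from the text above.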
The rapid development of the Internet has led to the widespread dissemination of manipulated facial images, significantly impacting people's daily lives. With the continuous advancement of Deepfake technology, generated counterfeit facial images have become increasingly difficult to distinguish from real ones. There is an urgent need for a more robust and convincing detection method. Current detection methods mainly operate in the spatial domain and transform the spatial domain into other domains for analysis. With the emergence of transformers, some researchers have also combined traditional convolutional networks with transformers for detection. This paper explores the artifacts left by Deepfakes in various domains and, based on this exploration, proposes a detection method that uses the steganalysis rich model to extract high-frequency noise that complements spatial features. We have designed two main modules, built on traditional convolutional neural networks, to fully leverage the interaction between these two aspects. The first is the multi-scale mixed feature attention module, which introduces artifacts from high-frequency noise into spatial textures, thereby enhancing the model's learning of spatial texture features. The second is the multi-scale channel attention module, which reduces the impact of background noise by weighting the features. The proposed method was experimentally evaluated on mainstream datasets, and extensive experimental results demonstrate the effectiveness of our approach in detecting Deepfake forged faces, outperforming the majority of existing methods.
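As a rough illustration of extracting high-frequency noise with a steganalysis-style filter, the sketch below applies a 3 x 3 second-order high-pass kernel that is commonly listed among the steganalysis rich model residual filters. The abstract does not say which filters the paper uses, so the kernel choice, the lack of padding, and the toy images are assumptions.

```python
# 3 x 3 second-order high-pass kernel; its entries sum to zero, so flat
# image regions produce a zero residual. Assumed for illustration.
KERNEL = [[-1, 2, -1],
          [2, -4, 2],
          [-1, 2, -1]]
SCALE = 4


def high_pass_residual(image):
    """Correlate a grayscale image (list of rows) with KERNEL / SCALE,
    without padding, and return the noise residual."""
    height, width = len(image), len(image[0])
    residual = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = sum(KERNEL[i][j] * image[r - 1 + i][c - 1 + j]
                      for i in range(3) for j in range(3))
            row.append(acc / SCALE)
        residual.append(row)
    return residual


# Toy inputs: a flat patch, and the same patch with one perturbed pixel.
flat = [[10] * 5 for _ in range(5)]
spike = [row[:] for row in flat]
spike[2][2] = 14

flat_residual = high_pass_residual(flat)
spike_residual = high_pass_residual(spike)
```

The flat patch yields an all-zero residual, while the single perturbed pixel produces a strong local response, which is why such residuals can expose subtle manipulation traces that are hard to see in the raw pixel values.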