Agriculture is an important component of every country's economy, supplying the necessary resources to farmers and their families. The livelihoods of farmers are greatly threatened by crop diseases, which highligh...
Driver behavior recognition has attracted extensive attention recently. Numerous methods have been developed on the basis of various deep neural networks. However, the existing models still suffer from various challen...
Accurate photovoltaic (PV) power forecasting ensures the stability and reliability of power *** address the complex characteristics of nonlinearity, volatility, and periodicity, a novel two-stage PV forecasting method based on an optimized transformer architecture is *** the first stage, an inverted transformer backbone was utilized to consider the multivariate correlation of the PV power series and capture its nonlinearity and *** attention was introduced to reduce high memory occupation and solve computational overload *** the second stage, a weighted series decomposition module was proposed to extract the periodicity of the PV power series, and the final forecasting results were obtained through additive *** on two public datasets showed that the proposed forecasting method has high accuracy, robustness, and computational *** RMSE improved by 31.23% compared with that of a traditional transformer, and its MSE improved by 12.57% compared with that of a baseline model.
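The second stage described above (series decomposition followed by additive recombination) can be illustrated with a minimal sketch. The abstract does not say how the weighting works, so this example assumes a fixed weighted mix of moving averages at several window sizes; the function names, window sizes, weights, and sample data are all hypothetical and are not taken from the paper.

```python
def moving_average(series, window):
    """Centered moving average; the ends are padded by repeating edge values."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def weighted_decompose(series, windows=(3, 5, 7), weights=(0.5, 0.3, 0.2)):
    """Split a series into a smooth component and a residual component.

    Assumption: the smooth component is a fixed weighted mix of moving
    averages. A learned weighting would replace `weights` in a real model.
    """
    trends = [moving_average(series, w) for w in windows]
    trend = [
        sum(wt * t[i] for wt, t in zip(weights, trends))
        for i in range(len(series))
    ]
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


def additive_reconstruct(trend, residual):
    """Recombine the two components by simple addition."""
    return [t + r for t, r in zip(trend, residual)]


# Hypothetical PV power readings covering two daily cycles.
pv_power = [0.0, 0.1, 0.8, 2.1, 3.0, 2.2, 0.9, 0.1,
            0.0, 0.0, 0.2, 0.9, 2.0, 3.1, 2.1, 0.8]
trend, residual = weighted_decompose(pv_power)
rebuilt = additive_reconstruct(trend, residual)
```

Here the two components are only recombined to show that the decomposition is lossless. In a forecasting setting each component would presumably be predicted separately before the predictions are added.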
The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in the complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. In particular, we present a simple but effective proposal-free framework, namely video grounding transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query ***, we employed a shared feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet-Captions, TACoS, and YouCookⅡ. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT.
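The regression-token idea can be sketched in a few lines. This is not the ViGT implementation: it assumes a single attention step without learned projections in place of the full vision-language transformer, and a hypothetical linear head with a sigmoid that outputs a normalized (start, end) pair. All names, dimensions, and values are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_vec, tokens):
    """Scaled dot-product attention of one query vector over a token list."""
    dim = len(query_vec)
    scores = [
        sum(q * k for q, k in zip(query_vec, tok)) / math.sqrt(dim)
        for tok in tokens
    ]
    weights = softmax(scores)
    return [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]


def predict_boundary(reg_token, video_feats, query_feats, head_w, head_b):
    """Predict a normalized (start, end) pair from the [REG] position only."""
    # Concatenate [REG] with the video and query features into one sequence.
    sequence = [reg_token] + video_feats + query_feats
    # [REG] aggregates context from every token in the sequence.
    reg_out = attend(reg_token, sequence)
    # Hypothetical regression head: linear layer followed by a sigmoid.
    raw = [sum(w * x for w, x in zip(row, reg_out)) + b
           for row, b in zip(head_w, head_b)]
    first, second = (1.0 / (1.0 + math.exp(-r)) for r in raw)
    return min(first, second), max(first, second)


# Toy 4-dimensional features: 3 video clips and 2 query words.
reg = [0.1, -0.2, 0.05, 0.0]
video = [[0.5, 0.1, 0.0, 0.2], [0.3, 0.4, 0.1, 0.0], [0.0, 0.2, 0.6, 0.1]]
query = [[0.2, 0.0, 0.1, 0.3], [0.1, 0.1, 0.1, 0.1]]
head_w = [[0.5, -0.3, 0.2, 0.1], [0.1, 0.4, -0.2, 0.3]]
head_b = [0.0, 0.5]
start, end = predict_boundary(reg, video, query, head_w, head_b)
```

The point of the design is that the boundary is read from a token that starts out independent of any particular video or query, so the prediction head never consumes the fused cross-modal features directly.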
We demonstrated a self-aligned top-gate (SATG) amorphous indium-gallium-zinc oxide (a-IGZO) thin-film transistor (TFT) in which the source/drain (S/D) regions were induced into a low-resistance state by first coating a thi...
This study introduces a new method of handling crimes in particular states, using ensemble machine learning and data visualization techniques as a way to enhance public safety. Crime de...
Network intrusion detection methods aim to classify network traffic as benign or malicious. With the continuous escalation in both the volume and diversity of malicious traffic, class incremental learnin...
Currently, early detection and intervention are still the key to preventing the aggravation of Alzheimer's disease (AD). Computer-assisted diagnosis can provide great help for the early detection of AD. However, comp...
In this paper we demonstrate that residual error estimators can be used to detect internal resonance problems arising from the electric and magnetic field integral equations for perfectly conducting targets. In additi...
This paper describes the design and implementation of a steerable robot control device making use of machine vision. The system consists of a robot car equipped with a 3-D digital camera, a display and a...