Digital twinning enables manufacturers to create digital representations of physical entities, thus implementing virtual simulations for product [...]. [...] efforts of digital twinning neglect the decisive consumer feedback in product development stages, failing to cover the gap between physical and digital [...]. This work mines real-world consumer feedback through social media topics, which is significant to product [...]. We specifically analyze the prevalent time of a product topic, giving an insight into both consumer attention and the widely-discussed time of a [...]. The primary body of current studies regards the prevalent time prediction as an accompanying task or assumes the existence of a preset [...]. [...], these proposed solutions are either biased in focused objectives and underlying patterns or weak in the capability of generalization towards diverse [...]. To this end, this work combines deep learning and survival analysis to predict the prevalent time of [...]. We propose a specialized deep survival model which consists of two modules. The first module enriches input covariates by incorporating latent features of the time-varying text, and the second module fully captures the temporal pattern of a rumor by a recurrent network [...]. [...], a specific loss function different from regular survival models is proposed to achieve a more reasonable [...]. [...] experiments on real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods.
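As background for the survival-analysis framing above, the following minimal sketch shows how a discrete-time survival curve and a median "prevalent time" can be derived from per-step hazard values. It is not the authors' model: the assumption that a network emits one hazard probability per time step, and the function names `survival_curve` and `median_survival_step`, are hypothetical and used only for illustration.

```python
def survival_curve(hazards):
    """Return S(t) for t = 1..T, where S(t) = prod_{k<=t} (1 - h_k).

    `hazards` is a list of per-step hazard probabilities in [0, 1]
    (assumed here to come from some model; values below are made up).
    """
    curve = []
    s = 1.0
    for h in hazards:
        s *= (1.0 - h)  # probability the topic is still prevalent after this step
        curve.append(s)
    return curve


def median_survival_step(hazards):
    """First step (1-indexed) at which S(t) drops to 0.5 or below, else None."""
    for t, s in enumerate(survival_curve(hazards), start=1):
        if s <= 0.5:
            return t
    return None


example_hazards = [0.1, 0.2, 0.5]  # illustrative values only
curve = survival_curve(example_hazards)
median_step = median_survival_step(example_hazards)
```

The product form is the standard discrete-time relation between hazard and survival; how the hazards are produced and which loss trains them is left to the model described in the abstract.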
Image-text retrieval aims to capture the semantic correspondence between images and texts, which serves as a foundation and crucial component in multi-modal recommendations, search systems, and online [...]. [...] mainstream methods primarily focus on modeling the association of image-text pairs while neglecting the advantageous impact of multi-task learning on image-text [...]. To this end, a multi-task visual semantic embedding network (MVSEN) is proposed for image-text [...]. [...], we design two auxiliary tasks, including text-text matching and multi-label classification, for semantic constraints to improve the generalization and robustness of visual semantic embedding from a training [...]. [...], we present an intra- and inter-modality interaction scheme to learn discriminative visual and textual feature representations by facilitating information flow within and between [...]. [...], we utilize multi-layer graph convolutional networks in a cascading manner to infer the correlation of image-text [...]. [...] results show that MVSEN outperforms state-of-the-art methods on two publicly available datasets, Flickr30K and MSCOCO, with rSum improvements of 8.2% and 3.0%, respectively.
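For readers unfamiliar with the rSum metric, it is commonly computed as the sum of Recall@1, Recall@5, and Recall@10 over both retrieval directions. The sketch below illustrates that computation on a toy similarity matrix. It assumes one matching text per image at the same index, which is a simplification (Flickr30K and MSCOCO provide several captions per image), and the function names and scores are illustrative rather than taken from MVSEN.

```python
def recall_at_k(sim, k):
    """Percentage of queries (rows) whose true match (same index) ranks in the top k."""
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true match = number of candidates scored strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


def rsum(sim, ks=(1, 5, 10)):
    """Sum of Recall@K over image-to-text (rows) and text-to-image (columns)."""
    sim_t = [list(col) for col in zip(*sim)]  # transpose for the reverse direction
    return sum(recall_at_k(sim, k) for k in ks) + sum(recall_at_k(sim_t, k) for k in ks)


# Toy image-by-text similarity scores (made-up values).
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.2, 0.1, 0.7],
]
toy_rsum = rsum(toy_sim)
```

With six recall terms each bounded by 100, rSum has a maximum of 600.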
Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that effectively models cross-modality fusion and forces the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is enforced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than on a few static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and easily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.
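The Kullback-Leibler regularization mentioned above can be pictured with a small sketch: it compares the answer distribution predicted from the original video with the one predicted from a perturbed video and penalizes the divergence. This is only an illustration under stated assumptions. The direction of the divergence (original relative to perturbed), the logit values, and the function names are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def perturbation_invariance_loss(logits_original, logits_perturbed):
    """Penalty that grows as the two predicted answer distributions drift apart."""
    p = softmax(logits_original)   # prediction from the original video
    q = softmax(logits_perturbed)  # prediction from the perturbed video
    return kl_divergence(p, q)


# Made-up logits over three candidate answers.
same = perturbation_invariance_loss([2.0, 0.5, 0.1], [2.0, 0.5, 0.1])
shifted = perturbation_invariance_loss([2.0, 0.5, 0.1], [0.1, 0.5, 2.0])
```

A loss of zero means the perturbation did not change the predicted answer distribution; larger values indicate that the prediction is sensitive to the perturbation.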
Various mainstream target tracking algorithms based on siamese networks are gradually becoming a trend in the field of deep learning tracking due to their concurrent advantages of accuracy and speed. Most siamese netw...
The smoothness of high-speed railway tracks is an important indicator for judging the quality of railway tracks. Accurately predicting the uneven trend of the track and dealing with it in advance is of great significa...
Existing research on the flexibility evaluation and optimal scheduling of flexible loads in residential buildings does not fully consider the association characteristics of different loads, resulting in a large deviation between the calculated results and experimental results of optimization scheduling. A flexibility evaluation methodology and an optimization model considering load association characteristics are proposed for flexible loads in residential buildings. [...] flexibility ratio, which is the ratio of temporal flexibility considering association characteristics to that without considering association characteristics, is defined in this [...]. The optimization model is solved using the CPLEX solver under three different scenarios, namely, a scenario only considering the temporal overlapping load associations, a scenario only considering the temporal non-overlapping load associations, and a scenario considering both types of load associations. It was shown that in the residential building case in this study, the cooking loads with association characteristics exhibit less temporal flexibility but a higher temporal flexibility ratio of up to 71.21%, while laundry loads exhibit higher temporal flexibility, but their temporal flexibility ratio is only around 36.84%. Additionally, when the users adopted the time-of-use (TOU) price, their electricity costs under the three considered scenarios increased by 0.00%, 7.57%, and 7.57% relative to the scenario without considering load associations, respectively. [...] installing a 3-kW household photovoltaic system, the electricity costs under the three scenarios increased by 0.00%, 1.28%, and 1.28%, respectively. As highlighted in the results, temporal non-overlapping association characteristics greatly affect the optimal scheduling of flexible energy loads, especially under TOU, while temporal overlapping association characteristics have little effect on it.
Deep learning-based visual SLAM (Simultaneous Localization and Mapping) has become one of the highly researched areas in recent years. Deep learning can be integrated with various modules of visual SLAM systems, inclu...
Deep learning technology has driven continuous advancements in the visual tracking field. In order to overcome various challenges, Siamese-based trackers and Attention-based trackers improve tracking performance by ad...
In order to solve the shortcomings of the existing safety helmet detection algorithms in construction sites, tunnels, coal mines and other construction scenarios, it is difficult to detect occluded targets and small t...
Flocculation flotation is the most efficient method for recovering fine-grained minerals, and its essence lies in flotation and recovery of [...]. The physical characteristics of flocs are mainly determined by their apparent particle size and structure (density and morphology). Substantial research has been conducted regarding the effect of floc characteristics on particle settling and water [...]. However, the influence of floc characteristics on flotation has not been widely [...]. [...] on the floc formation and flocculation flotation, this study reviews the fundamental physical characteristics of flocs from the perspectives of floc particle size and structure, summarizing the interaction between floc particle size and structure. [...], it thoroughly discusses the effect of floc particle size and structure on floc floatability, further revealing the influence of floc characteristics on bubble collision and adhesion and elucidating the mechanisms of interaction between flocs and bubbles. [...], it is observed that floc particle size is not the only factor influencing flocculation [...]. [...] the appropriate apparent particle size range, flocs with a compact structure exhibit higher efficiency in bubble collision and adhesion during flotation, thereby resulting in enhanced flotation [...]. This study aims to provide a reference for flocculation flotation, targeting the development of more efficient and refined flocculation flotation processes in the future.