Multimodal Aspect-Based Sentiment Analysis (MABSA) aims to extract aspect terms from text-image pairs and identify their sentiments. Previous methods are based on the premise that the image contains the objects referr...
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Video-based person re-identification (ReID) has become increasingly important due to its applications in video surveillance. By employing events in video-based person ReID, more motion information can be provided between continuous frames to improve recognition accuracy. Previous approaches introduce event data as an aid to the video person ReID task, but they still cannot avoid the privacy leakage caused by RGB images. To avoid privacy attacks while retaining the benefits of event data, we consider using only event data. To make full use of the information in the event stream, we propose a Cross-Modality and Temporal Collaboration (CMTC) network for event-based video person ReID. First, we design an event transform network to obtain corresponding auxiliary information from the raw event input. Additionally, we propose a differential modality collaboration module that balances the roles of events and auxiliaries so that they complement each other. Furthermore, we introduce a temporal collaboration module to exploit motion information and appearance cues. Experimental results demonstrate that our method outperforms other methods on event-based video person ReID.
Authors:
Liu, Yifan; Xu, Jianzhong; Zhu, Yiyang; Tian, Zhaoxuan; Zhao, Chengyong; Li, Gen
State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, China
Energy Technology and Computer Science Section, Department of Engineering Technology and Didactics, Ballerup 2750, Denmark
With the increasing integration of renewable energy into power systems, electromagnetic transient (EMT) simulation has become indispensable for accurate system analysis. However, the complexity of wind turbine (WT) mo...
With the rapid development of the aviation industry, flight delays have become a global issue, resulting in significant economic losses and passenger dissatisfaction. Accurately determining the causes of flight delays...
Current optimization methods for microgrid scheduling face issues such as insufficient precision in energy distribution, high operational costs, and inefficiency. In response to these challenges, an optimization sched...
Event data can asynchronously capture variations in light intensity, thereby implicitly providing valuable complementary cues for RGB-Event tracking. Existing methods typically employ a direct interaction mechanism to...
Authors:
Chen Jia; Fan Shi; Xu Cheng
School of Computer Science and Engineering
The Engineering Research Center of Learning-Based Intelligent System (Ministry of Education), The Key Laboratory of Computer Vision and System (Ministry of Education), Tianjin University of Technology, Tianjin, China
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
4D light field imaging captures rich spatial-angular information, providing essential geometric cues for semantic segmentation tasks. In this paper, we introduce a novel backbone network called the Light Field Extraction Interaction Network (LFEI-Net). LFEI-Net extracts global structures and multi-scale spatial-angular features, capturing feature dependencies through channel modeling and diverse feature interactions. Unlike traditional methods that depend on pyramid and dilated feature extraction, LFEI-Net integrates large-scale horizontal depth-wise convolution (HDWC) and vertical depth-wise convolution (VDWC) with interactive operations for efficient, comprehensive spatial multi-scale feature extraction. Furthermore, we present the Multi-Angular Modeling (MAM) module, which captures scene angle variations from multiple perspectives and precisely delineates object boundaries, thereby improving model adaptability. Our experimental evaluations on two datasets demonstrate that LFEI-Net significantly outperforms state-of-the-art (SOTA) 2D and 4D light field semantic segmentation methods, achieving mean Intersection over Union (mIoU) of 83.72% and 86.88%, respectively.
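For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how it is commonly computed from predicted and ground-truth label maps. The function name `mean_iou` and the choice to skip classes absent from both maps are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, averaged over classes present in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0  # assumption: ignore classes that never occur
    return float((intersection[valid] / union[valid]).mean())

# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this toy case the per-class IoU values are 1/2 and 2/3, so the mean is 7/12.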
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Cross-view object geo-localization (CVOGL) aims to locate an object of interest in a captured ground- or drone-view image within the satellite image. However, existing works treat ground-view and drone-view query images equivalently, overlooking their inherent viewpoint discrepancies and the spatial correlation between the query image and the satellite-view reference image. To this end, this paper proposes a novel View-specific Attention Geo-localization method (VAGeo) for accurate CVOGL. Specifically, VAGeo contains two key modules: a view-specific positional encoding (VSPE) module and a channel-spatial hybrid attention (CSHA) module. At the object level, the VSPE module uses viewpoint-specific positional encodings, designed around the differing characteristics of ground and drone query images, to identify the click-point object of the query image more accurately. At the feature level, the CSHA module introduces a hybrid attention that combines channel attention and spatial attention mechanisms to learn discriminative features. Extensive experimental results demonstrate that the proposed VAGeo gains a significant performance improvement, i.e., improving acc@0.25/acc@0.5 on the CVOGL dataset from 45.43%/42.24% to 48.21%/45.22% for ground-view, and from 61.97%/57.66% to 66.19%/61.87% for drone-view.
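As a rough illustration of the acc@0.25/acc@0.5 figures above, the sketch below assumes that acc@t is the fraction of queries whose predicted box overlaps the ground-truth box with IoU greater than t. That reading, the `(x1, y1, x2, y2)` box format, the strict inequality, and the helper names `box_iou` and `acc_at` are assumptions, not details taken from the abstract.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so that disjoint boxes contribute no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def acc_at(preds, gts, threshold):
    """Fraction of predictions whose IoU with the ground truth exceeds threshold."""
    hits = sum(box_iou(p, g) > threshold for p, g in zip(preds, gts))
    return hits / len(preds)

# Toy example: one exact match and one box shifted by half its width (IoU = 1/3).
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (5, 0, 15, 10)]
acc_025 = acc_at(preds, gts, 0.25)
acc_05 = acc_at(preds, gts, 0.5)
```

Here both predictions pass the 0.25 threshold, while only the exact match passes 0.5.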
This paper models a platooning system consisting of trucks and a third-party service provider (TPSP), which performs platoon coordination, distributes the platooning profit in platoons, and charges trucks in exchange ...
sEMG (surface electromyography) signal control of bionic prostheses has been widely studied over the past few years. In particular, sparse sEMG signals are rapidly developing in the field of gesture recognition for th...