ISBN: (print) 9798400701245
Neural multi-task learning (MTL) has improved significantly and has been successfully applied to recommender systems (RS). Recent deep MTL methods for RS (e.g., MMoE, PLE) focus on designing soft gating-based parameter-sharing networks that implicitly learn a generalized representation for each task. However, MTL methods may suffer performance degradation when dealing with conflicting tasks, because negative transfer can occur in the task-shared bottom representation. This reduces the capacity of MTL methods to capture task-specific characteristics and ultimately hinders their ability to generalize well on all tasks. In this paper, we focus on bottom representation learning for MTL in RS and propose the Deep task-specific Bottom representation Network (DTRN) to alleviate the negative transfer problem. DTRN obtains task-specific bottom representations explicitly by giving each task its own representation learning network in the bottom representation modeling stage. Specifically, it extracts the user's interests from multiple types of behavior sequences for each task through a parameter-efficient hypernetwork. To further obtain a dedicated representation for each task, DTRN refines the representation of each feature with a SENet-like network per task. Together, the two modules produce task-specific bottom representations that relieve mutual interference among tasks. Moreover, DTRN can be flexibly combined with existing MTL methods. Experiments on one public dataset and one industrial dataset demonstrate the effectiveness of the proposed DTRN.
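The per-task, SENet-like refinement described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the ReLU and sigmoid choices, the task names, and the helper names `init_task_senet` and `task_specific_refine` are all assumptions made for illustration. The only idea taken from the abstract is that every task owns its own squeeze-and-excitation gate over the feature fields.

```python
import numpy as np

def init_task_senet(num_fields, reduction, rng):
    """Create one task's gate parameters (hypothetical helper, assumed sizes)."""
    hidden = max(1, num_fields // reduction)
    return {
        "w1": rng.normal(0.0, 0.1, size=(num_fields, hidden)),
        "w2": rng.normal(0.0, 0.1, size=(hidden, num_fields)),
    }

def task_specific_refine(field_emb, params):
    """Rescale each feature field with a task-owned squeeze-and-excitation gate.

    field_emb: array of shape (batch, num_fields, dim).
    """
    # Squeeze: summarize every field embedding by its mean value.
    squeezed = field_emb.mean(axis=2)                        # (batch, num_fields)
    # Excitation: bottleneck MLP followed by a sigmoid gate per field.
    hidden = np.maximum(squeezed @ params["w1"], 0.0)        # ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ params["w2"])))   # values in (0, 1)
    # Re-weight: broadcast each field's gate over its embedding dimension.
    return field_emb * gates[:, :, None]

rng = np.random.default_rng(0)
field_emb = rng.normal(size=(4, 6, 8))   # 4 samples, 6 feature fields, dim 8

# Assumed task names; each task gets its own gate parameters.
task_params = {t: init_task_senet(6, 2, rng) for t in ("click", "purchase")}
refined = {t: task_specific_refine(field_emb, p) for t, p in task_params.items()}
```

Because the gate parameters are not shared, the same input embeddings yield a different bottom representation for each task, which is the property the abstract attributes to task-specific bottom representation learning.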
ISBN: (print) 9783031460043; 9783031460050
The morphology of pyramidal cells (PCs) varies significantly among species and brain layers. Therefore, it is particularly challenging to determine which species or layer a cell belongs to from its morphological features. Existing deep learning-based methods analyze species-related or layer-related morphological characteristics of PCs. However, these methods operate in a task-agnostic manner without considering task-specific features. This paper proposes a task-specific morphological representation learning framework for the morphology analysis of PCs that enforces task-specific feature extraction through dual-task learning, enabling performance gains on each task. Specifically, we first use species-wise and layer-wise feature extraction branches to obtain species-related and layer-related features. Applying the principle of mutual information minimization, we then explicitly force each branch to learn task-specific features, which are further enhanced by an adaptive representation enhancement module. In this way, the performance of both tasks improves simultaneously. Experimental results demonstrate that the proposed method effectively extracts species-specific and layer-specific representations when identifying rat and mouse PCs in multiple brain layers. Our method reaches accuracies of 87.44% and 72.46% on the species and layer analysis tasks, outperforming single-task learning by 2.22% and 3.86%, respectively.
One-shot multiple object tracking (MOT), which learns object detection and identity embedding in a unified network, has attracted increasing attention due to its low complexity and high tracking speed. However, most one-shot trackers ignore that detection and re-identification (ReID) require different feature representations. The inherent difference between these two subtasks leads to optimization contradictions during training, which results in suboptimal tracking performance. To alleviate this contradiction, we propose a novel dual-path transformation network (DTN) that decouples the shared features into detection-specific and ReID-specific representations. By learning task-specific features, this module satisfies the different requirements of both subtasks. Moreover, we observe that previous trackers generally use local information to distinguish targets and ignore global semantic relations, which are crucial for tracking. Therefore, we design a pyramid non-local network (PNN) that allows our network to explore pixel-to-pixel relations with a global receptive field. PNN also takes scale information into account to improve robustness to scale variations. Extensive experiments on three benchmarks, i.e., MOT16, MOT17, and MOT20, demonstrate the superiority of our tracker, DPTrack. The experimental results show that DPTrack achieves state-of-the-art performance, e.g., a MOTA of 77.1% on MOT17. Moreover, DPTrack runs at 14.9 FPS, and our lightweight version runs at 26.6 FPS with only a slight performance decay.
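To make "pixel-to-pixel relations with a global receptive field" concrete, here is a minimal sketch of a generic single-scale non-local operation in NumPy. It is an assumption-laden simplification, not the PNN from the abstract: it has no learned projections and no pyramid over scales, and the function name `non_local` is hypothetical.

```python
import numpy as np

def non_local(feature_map):
    """Single-scale non-local aggregation over a (C, H, W) feature map.

    Every output pixel is a weighted sum over all input pixels, so the
    receptive field is global. Learned projections are omitted (assumption).
    """
    c, h, w = feature_map.shape
    tokens = feature_map.reshape(c, h * w).T          # (N, C), one row per pixel
    # Pairwise pixel affinities, scaled as in dot-product attention.
    scores = tokens @ tokens.T / np.sqrt(c)           # (N, N)
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Aggregate features from all pixels and add a residual connection.
    out = tokens + weights @ tokens
    return out.T.reshape(c, h, w), weights

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 5, 7))              # C=8, H=5, W=7
enhanced, attention = non_local(feature_map)
```

Each row of `attention` assigns a strictly positive weight to every pixel in the map, which is what distinguishes this kind of operation from a convolution with a fixed local window.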
Multiple object tracking (MOT) in unmanned aerial vehicle (UAV) videos is a fundamental task with applications in many fields. MOT consists of two critical procedures, i.e., object detection and re-identification (ReID). One-shot MOT, which incorporates detection and ReID in a unified network, has gained attention due to its fast inference speed. It significantly reduces computational overhead by letting the two subtasks share features. However, most existing one-shot trackers struggle to achieve robust tracking in UAV videos. We observe that the essential difference between detection and ReID leads to an optimization contradiction within one-shot networks. To alleviate this contradiction, we propose a novel feature decoupling network (FDN) that converts shared features into detection-specific and ReID-specific representations. The FDN searches for characteristics and commonalities between the two tasks to synergize detection and ReID. In addition, existing one-shot trackers struggle to locate small targets in UAV videos. Therefore, we design a pyramid transformer encoder (PTE) to enrich the semantic information of the resulting detection-specific representations. By learning scale-aware fine-grained features, the PTE enables our tracker to locate targets in UAV videos accurately. Extensive experiments on the VisDrone2021 and UAVDT benchmarks demonstrate that our tracker achieves state-of-the-art tracking performance.
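The idea of converting shared features into detection-specific and ReID-specific representations can be sketched with two independent per-pixel projections. This is only an illustration under stated assumptions: the abstract does not describe the FDN's internals, so the 1x1-convolution branches, the ReLU, the channel sizes, and the names `conv1x1` and `decouple` are hypothetical.

```python
import numpy as np

def conv1x1(x, weight):
    """Apply a 1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def decouple(shared, w_det, w_reid):
    """Project shared backbone features into two task-specific feature maps."""
    det_feat = np.maximum(conv1x1(shared, w_det), 0.0)    # detection branch
    reid_feat = np.maximum(conv1x1(shared, w_reid), 0.0)  # ReID branch
    return det_feat, reid_feat

rng = np.random.default_rng(0)
shared = rng.normal(size=(16, 8, 8))               # assumed backbone output
w_det = rng.normal(0.0, 0.1, size=(16, 16))        # assumed channel sizes
w_reid = rng.normal(0.0, 0.1, size=(32, 16))
det_feat, reid_feat = decouple(shared, w_det, w_reid)
```

Since each branch has its own weights, gradients from the detection loss and the ReID loss update different projections, which is one simple way to reduce the optimization conflict on the shared features.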