Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is annotated with a set of candidate labels. Self-training PLL models achieve state-of-the-art performance but suffer from error accumulation caused by mistakenly disambiguated instances. Although co-training can alleviate this issue by training two networks simultaneously and allowing them to interact with each other, most existing co-training methods train two structurally identical networks with the same task, i.e., they are symmetric, rendering them insufficient for correcting each other due to their similar limitations. Therefore, in this paper, we propose an asymmetric dual-task co-training PLL model called AsyCo, which forces its two networks, i.e., a disambiguation network and an auxiliary network, to learn from different views explicitly by optimizing distinct tasks. Specifically, the disambiguation network is trained with a self-training PLL task to learn label confidence, while the auxiliary network is trained in a supervised learning paradigm to learn from the noisy pairwise similarity labels that are constructed according to the learned label confidence. Finally, the error accumulation problem is mitigated via information distillation and confidence refinement. Extensive experiments on both uniform and instance-dependent partially labeled datasets demonstrate the effectiveness of AsyCo.
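The abstract above describes constructing pairwise similarity labels from learned label confidence. A minimal sketch of one plausible construction is shown below; the function name and the rule (instances are "similar" when their most confident candidate labels agree) are illustrative assumptions, not AsyCo's exact procedure.

```python
import numpy as np

def pairwise_similarity_labels(confidence: np.ndarray) -> np.ndarray:
    """Derive noisy pairwise similarity labels from per-instance label
    confidence vectors (shape [n, num_classes], rows summing to 1).

    Hypothetical rule: two instances get similarity 1 when their most
    confident candidate labels agree, else 0. The paper's actual
    construction may differ."""
    pseudo = confidence.argmax(axis=1)  # current disambiguated pseudo-label
    return (pseudo[:, None] == pseudo[None, :]).astype(int)

conf = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.6, 0.3, 0.1]])
S = pairwise_similarity_labels(conf)
# instances 0 and 2 share pseudo-label 0, so S[0, 2] == 1
```

Because the pseudo-labels themselves may be wrong, the resulting similarity matrix is noisy, which is exactly why the auxiliary network is trained under a noise-tolerant supervised paradigm.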
Researchers have recently achieved significant advances in deep learning techniques, which in turn have substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many studies have also been presented in top conferences and journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on learning techniques, that is, what kind of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as challenges and opportunities in each subarea. To this end, in this study, we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. Such subareas spread out through the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, providing one survey covering as many subareas as possible can help future research push forward the frontier of deep learning-based software engineering more systematically. For each of the selected subareas, we highlight the major advances achieved by applying deep learning techniques, with pointers to the available datasets.
Existing 3D object detection suffers from expensive annotation costs and poor transferability to unknown data due to the domain gap. Unsupervised Domain Adaptation (UDA) aims to generalize detection models trained in labeled source domains to perform robustly on unexplored target domains, providing a promising solution for cross-domain 3D object detection. Although Self-Training (ST) based cross-domain 3D detection methods with the assistance of pseudo-labeling techniques have achieved remarkable progress, they still face the issue of low-quality pseudo-labels when there are significant domain disparities, due to the absence of a process for feature distribution alignment. While Adversarial Learning (AL) based methods can effectively align the feature distributions of the source and target domains, the inability to obtain labels in the target domain forces the adoption of asymmetric optimization losses, resulting in the challenging issue of source-domain bias. To overcome these limitations, we propose a novel unsupervised domain adaptation framework for 3D object detection that collaborates ST and AL, dubbed STAL3D, unleashing the complementary advantages of pseudo-labels and feature distribution alignment. Additionally, a Background Suppression Adversarial Learning (BS-AL) module and a Scale Filtering Module (SFM) are designed, tailored for 3D cross-domain scenes, effectively alleviating the issues of the large proportion of background interference and source-domain size bias. STAL3D achieves state-of-the-art performance on multiple cross-domain tasks and even surpasses the Oracle results on Waymo → KITTI and Waymo → KITTI-rain.
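The self-training half of such a pipeline hinges on filtering target-domain detections into trustworthy pseudo-labels. The sketch below illustrates the generic idea with a confidence threshold plus a box-size sanity check (echoing the scale-bias concern above); the field names and threshold values are assumptions, not STAL3D's actual criteria.

```python
def filter_pseudo_labels(detections, score_thresh=0.6, min_size=0.5):
    """Keep only confident, plausibly sized target-domain detections as
    pseudo-labels for the next self-training round. Field names and
    thresholds are illustrative, not those used by STAL3D."""
    kept = []
    for det in detections:
        w, l, h = det["size"]  # 3D box dimensions in meters
        if det["score"] >= score_thresh and min(w, l, h) >= min_size:
            kept.append(det)
    return kept

dets = [
    {"score": 0.9, "size": (1.8, 4.2, 1.5)},  # confident, car-sized box
    {"score": 0.3, "size": (1.7, 4.0, 1.4)},  # low confidence -> dropped
    {"score": 0.8, "size": (0.2, 0.3, 0.2)},  # implausibly small -> dropped
]
kept = filter_pseudo_labels(dets)
```

Fixed thresholds like these are exactly what degrade under large domain gaps, which motivates pairing ST with feature-alignment via adversarial learning.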
In this work, we introduce a class of black-box (BB) reductions called committed-programming reductions (CPReds) in the random oracle model (ROM) and obtain the following interesting results: (1) we demonstrate that some well-known schemes, including the full-domain hash (FDH) signature (Eurocrypt 1996) and the Boneh-Franklin identity-based encryption (IBE) scheme (Crypto 2001), are provably secure under CPReds; (2) we prove that a CPRed associated with an instance-extraction algorithm implies a reduction in the quantum ROM (QROM), which unifies several recent results, including the security of the Gentry-Peikert-Vaikuntanathan IBE scheme by Zhandry (Crypto 2012) and the key encapsulation mechanism (KEM) variants using the Fujisaki-Okamoto transform by Jiang et al. (Crypto 2018) in the QROM; (3) we show that CPReds are incomparable to non-programming reductions (NPReds) and randomly-programming reductions (RPReds) formalized by Fischlin et al. (Asiacrypt 2010).
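For readers unfamiliar with the FDH signature mentioned in result (1): it signs by hashing the message onto the full RSA domain and applying the trapdoor, Sign(m) = H(m)^d mod n. A toy sketch follows; the million-scale primes and the sha256-mod-n hash are purely illustrative (a real deployment needs 2048-bit-plus moduli and a hash whose range genuinely covers Z_n).

```python
import hashlib

# Toy RSA-FDH: Sign(m) = H(m)^d mod n; Verify(m, s): s^e mod n == H(m).
# The small primes below are for illustration only and offer no security.
p, q = 1_000_003, 1_000_033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))   # RSA trapdoor exponent (Python 3.8+)

def H(msg: bytes) -> int:
    """'Full-domain' hash: map the message onto Z_n (toy version)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(msg: bytes) -> int:
    return pow(H(msg), d, n)

def verify(msg: bytes, sig: int) -> bool:
    return pow(sig, e, n) == H(msg)
```

The classical ROM proof for FDH programs the random oracle H at chosen points, which is exactly the kind of programming behavior the CPRed framework above makes precise.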
As a pivotal enabler of intelligent transportation systems (ITS), the Internet of Vehicles (IoV) has aroused extensive attention from academia and industry. The exponential growth of computation-intensive, latency-sensitive, and privacy-aware vehicular applications in IoV results in the transformation from cloud computing to edge computing, which enables tasks to be offloaded to edge nodes (ENs) closer to vehicles for efficient execution. In ITS environments, however, due to dynamic and stochastic computation offloading requests, it is challenging to efficiently orchestrate offloading decisions for application requirements. How to accomplish complex computation offloading of vehicles while ensuring data privacy remains challenging. In this paper, we propose an intelligent computation offloading scheme with privacy protection, named COPP. In particular, an Advanced Encryption Standard (AES) based encryption method is utilized to implement privacy protection. Furthermore, an online offloading scheme is proposed to find optimal offloading policies. Finally, experimental results demonstrate that COPP significantly outperforms benchmark schemes in terms of both delay and energy consumption.
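To make the delay/energy trade-off concrete, here is a deliberately simplified one-shot offloading decision: pick the edge node minimizing a weighted sum of transmission-plus-compute delay and transmit energy. The cost model (no queuing, fixed transmit power) and all names are assumptions for illustration; COPP's actual online policy is learned over time, not this greedy rule.

```python
def choose_offload_target(task_cycles, data_bits, nodes,
                          w_delay=0.5, w_energy=0.5):
    """Pick the edge node minimizing a weighted delay/energy cost.
    Simplified model: delay = upload time + compute time,
    energy = transmit power * upload time. Illustrative only."""
    best, best_cost = None, float("inf")
    for node in nodes:
        upload = data_bits / node["rate_bps"]          # seconds
        delay = upload + task_cycles / node["cpu_hz"]  # seconds
        energy = node["tx_watts"] * upload             # joules
        cost = w_delay * delay + w_energy * energy
        if cost < best_cost:
            best, best_cost = node["name"], cost
    return best, best_cost

# hypothetical edge nodes: EN-A has twice the bandwidth and CPU of EN-B
nodes = [
    {"name": "EN-A", "rate_bps": 1e7, "cpu_hz": 2e9, "tx_watts": 0.5},
    {"name": "EN-B", "rate_bps": 5e6, "cpu_hz": 1e9, "tx_watts": 0.5},
]
best, cost = choose_offload_target(task_cycles=1e9, data_bits=8e6, nodes=nodes)
```

In the paper's setting the request stream is dynamic and stochastic, so an online algorithm must make such decisions without knowing future arrivals, which is what makes the orchestration problem hard.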
Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.
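Structurally, a multi-view evaluator like this reduces to aggregating the four module scores into a single preference score. The sketch below uses a plain weighted sum with equal weights purely as an assumption for illustration; the paper may learn or tune the combination rather than fix it.

```python
def mind_score(matching, faithfulness, quality, realness,
               weights=(0.25, 0.25, 0.25, 0.25)):
    """Aggregate the four module scores (each in [0, 1]) into one
    preference score. Equal weighting is an illustrative assumption,
    not MindScore's actual aggregation."""
    parts = (matching, faithfulness, quality, realness)
    return sum(w * s for w, s in zip(weights, parts))

# an image that matches the prompt well but has middling faithfulness
score = mind_score(matching=0.8, faithfulness=0.6, quality=0.9, realness=0.7)
```

Keeping the modules separate before aggregation is what lets the case studies attribute a low overall score to, say, poor realness rather than a prompt mismatch.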
Cloud storage is now widely used, but its reliability has always been a major concern. Cloud block storage (CBS) is a well-known type of cloud storage. It has the closest architecture to the underlying storage and can provide interfaces for other types. Data modifications in CBS carry potential risks such as null references or data loss. Formal verification of these operations can improve the reliability of CBS to some extent. Although separation logic is a mainstream approach to verifying program correctness, the complex architecture of CBS creates some challenges for verification. This paper develops a proof system based on separation logic for verifying CBS data modifications. The proof system can represent the CBS architecture, describe the properties of the CBS system state, and specify the behavior of CBS data modifications. Using the interactive verification approach of Coq, the proof system is implemented as a verification tool. With this tool, the paper builds machine-checked proofs of the functional correctness of CBS data modifications. This work can thus analyze the reliability of cloud storage from a formal perspective.
Although matrix multiplication plays an essential role in a wide range of applications, previous works only focus on optimizing dense or sparse matrix multiplication. The Sparse Approximate Matrix Multiply (SpAMM) is an algorithm to accelerate the multiplication of decay matrices, whose sparsity lies between that of dense and sparse matrices. In addition, large-scale decay matrix multiplication is performed in scientific applications to solve cutting-edge problems. To optimize large-scale decay matrix multiplication using SpAMM on supercomputers such as Sunway TaihuLight, we present swSpAMM, an optimized SpAMM algorithm, by adapting the computation characteristics to the architecture features of Sunway processors. Specifically, we propose both intra-node and inter-node optimizations to accelerate swSpAMM for large-scale matrices. For intra-node optimizations, we explore algorithm parallelization and a block-major data layout that are tailored to better utilize the architecture advantages of Sunway processors. For inter-node optimizations, we propose a matrix organization strategy for better distributing sub-matrices across nodes and a dynamic scheduling strategy for improving load balance across nodes. We compare swSpAMM with the existing GEMM library on a single node as well as with large-scale matrix multiplication methods on multiple nodes. The experiment results show that swSpAMM achieves a speedup of up to 14.5× and 2.2× compared to the xMath library on a single node and the 2D GEMM method on multiple nodes, respectively.
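The core SpAMM idea is independent of the Sunway-specific optimizations: recurse over blocks of the operands and skip any block product whose norm bound falls below a tolerance, which is where the savings on decay matrices come from. A minimal serial sketch follows (power-of-two sizes assumed for simplicity; none of swSpAMM's parallel layout or scheduling is reproduced).

```python
import numpy as np

def spamm(A, B, tau):
    """Sparse Approximate Matrix Multiply (serial, illustrative):
    recursively multiply 2x2 block partitions, skipping any block
    product whose bound ||A_ij||_F * ||B_jk||_F < tau. Assumes square
    matrices with power-of-two dimensions."""
    n = A.shape[0]
    if n <= 2:
        return A @ B
    h = n // 2
    C = np.zeros_like(A)
    for i in (0, 1):
        for k in (0, 1):
            for j in (0, 1):
                Aij = A[i*h:(i+1)*h, j*h:(j+1)*h]
                Bjk = B[j*h:(j+1)*h, k*h:(k+1)*h]
                # prune block products that cannot contribute more than tau
                if np.linalg.norm(Aij) * np.linalg.norm(Bjk) >= tau:
                    C[i*h:(i+1)*h, k*h:(k+1)*h] += spamm(Aij, Bjk, tau)
    return C

# exponentially decaying matrix: entries shrink away from the diagonal
idx = np.arange(8)
A = np.exp(-np.abs(idx[:, None] - idx[None, :]))
C = spamm(A, A, tau=1e-6)  # approximates A @ A within the tolerance
```

For true decay matrices, most far-off-diagonal block products fall under tau at some recursion depth, so the effective work scales much better than dense GEMM while the error stays controlled by tau.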
Foundation models (FMs) [1] have revolutionized software development and become the core components of large software systems. This paradigm shift, however, demands a fundamental re-imagining of software engineering theories and methodologies [2]. Instead of replacing existing software modules implemented with symbolic logic, incorporating FMs' capabilities to build software systems requires entirely new modules that leverage the unique capabilities of FMs. Specifically, while FMs excel at handling uncertainty, recognizing patterns, and processing unstructured data, we need new engineering theories that support the paradigm shift from explicitly programming and maintaining user-defined symbolic logic to creating rich, expressive requirements that FMs can accurately perceive and implement.
Aiming at the low accuracy of existing binocular stereo matching and depth estimation methods, this paper proposes a multi-scale binocular stereo matching network based on semantic association. A semantic association module is designed to construct contextual semantic association relationships among pixels through semantic categories and an attention mechanism. The disparity of regions where the disparity is easily estimated can be used to assist the disparity estimation of relatively difficult regions, so as to improve the accuracy of disparity estimation for the whole image. Simultaneously, a multi-scale cost volume computation module is proposed. Unlike existing methods, which use a single cost volume, the proposed module builds multiple cost volumes for features at different scales. The semantic association features and the multi-scale cost volumes are aggregated, fusing high-level semantic information with low-level local detail to enhance the feature representation for accurate stereo matching. We demonstrate the effectiveness of the proposed solutions on the KITTI 2015 binocular stereo matching dataset, where our model achieves comparable or higher matching performance than seven other classic binocular stereo matching algorithms.
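For readers new to the cost-volume idea this abstract builds on: for a rectified stereo pair, a cost volume records, for each pixel and each candidate disparity d, how well the left pixel matches the right pixel shifted by d. The sketch below uses a plain absolute-difference cost on raw intensities; it is the generic construction, not the paper's learned multi-scale feature version.

```python
import numpy as np

def cost_volume(left, right, max_disp):
    """Build a difference-based matching cost volume for a rectified
    stereo pair (H x W grayscale arrays). cost[d, y, x] compares left
    pixel (y, x) with right pixel (y, x - d); positions with no valid
    match at disparity d stay at +inf."""
    H, W = left.shape
    cost = np.full((max_disp, H, W), np.inf)
    for d in range(max_disp):
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :W - d])
    return cost

# synthetic rectified pair: the left view is the right view shifted by 2 px
rng = np.random.default_rng(0)
right = rng.random((4, 12))
left = np.empty_like(right)
left[:, 2:] = right[:, :-2]
left[:, :2] = right[:, :2]      # border pixels have no true match
disparity = cost_volume(left, right, max_disp=5).argmin(axis=0)
```

A winner-takes-all argmin over d recovers the disparity map; the paper instead aggregates several such volumes built from features at different scales, so that semantic context can disambiguate pixels where raw intensity costs are ambiguous.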