Multi-label classification is a challenging problem that has attracted significant attention from researchers, particularly in the domain of image and text attribute annotation. However, multi-label datasets are prone to serious intra-class and inter-class imbalance problems, which can significantly degrade classification performance. To address these issues, we propose the multi-label weighted broad learning system (MLW-BLS) from the perspective of label imbalance weighting and label correlation mining. Further, we propose the multi-label adaptive weighted broad learning system (MLAW-BLS), which adaptively adjusts the specific weights and values of labels of MLW-BLS and constructs an efficient imbalanced classifier set. Extensive experiments are conducted on various datasets to evaluate the effectiveness of the proposed model, and the results demonstrate its superiority over other advanced approaches.
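The abstract does not give the MLW-BLS weighting formula, so the following is only a minimal sketch of generic inverse-frequency label weighting, one common way to counter imbalance between positive and negative instances within each label. The function names `label_imbalance_weights` and `instance_weight_matrix` are hypothetical and are not taken from the paper.

```python
import numpy as np


def label_imbalance_weights(Y):
    """Return per-label weights (w_pos, w_neg) from a binary label matrix.

    Y has shape (n_samples, n_labels). Each weight is n / (2 * count), so a
    rare class within a label gets a larger weight. This is a generic
    inverse-frequency heuristic, NOT the MLW-BLS scheme.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    pos = Y.sum(axis=0)          # positives per label
    neg = n - pos                # negatives per label
    # Clamp counts to 1 to avoid division by zero for empty classes.
    w_pos = n / (2.0 * np.maximum(pos, 1.0))
    w_neg = n / (2.0 * np.maximum(neg, 1.0))
    return w_pos, w_neg


def instance_weight_matrix(Y):
    """Expand the per-label weights into one weight per (sample, label) entry."""
    Y = np.asarray(Y, dtype=float)
    w_pos, w_neg = label_imbalance_weights(Y)
    return Y * w_pos + (1.0 - Y) * w_neg


if __name__ == "__main__":
    Y = np.array([[1, 0], [1, 0], [1, 1], [0, 0]])
    print(instance_weight_matrix(Y))
```

Such a weight matrix could be used to scale a per-entry loss; how MLW-BLS combines weighting with label correlation mining is not specified in the abstract.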
In the fields of intelligent transportation and multi-task cooperation, many practical problems can be modeled by the colored traveling salesman problem (CTSP). When solving large-scale CTSP instances with more than 1000 dimensions, existing algorithms are limited in convergence speed and solution quality. This paper proposes a new hybrid ITÖ (HITÖ) algorithm, which integrates two new strategies, a crossover operator and a mutation strategy, into the standard ITÖ. In the iteration process of HITÖ, a feasible solution of CTSP is represented by double chromosome coding, and the random drift and wave operators are used to explore and develop new unknown regions. In this process, the drift operator is executed by the improved crossover operator, and the wave operator is performed by the optimized mutation strategy. Experiments show that HITÖ is superior to the known comparison algorithms in terms of solution quality.
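The double chromosome coding mentioned above can be illustrated with a small sketch. It assumes a common dual-chromosome layout (one chromosome holds a city permutation, the other holds the salesman assigned to each position) and a single shared depot. The abstract does not spell out the exact encoding, so the layout and the names `decode`, `is_feasible`, and `total_length` are assumptions.

```python
import math


def decode(city_chrom, salesman_chrom, n_salesmen):
    """Split the city permutation into one route per salesman.

    city_chrom[i] is a city and salesman_chrom[i] is the salesman visiting it.
    Cities keep their relative order within each route.
    """
    routes = [[] for _ in range(n_salesmen)]
    for city, salesman in zip(city_chrom, salesman_chrom):
        routes[salesman].append(city)
    return routes


def is_feasible(city_chrom, salesman_chrom, allowed):
    """Check the color constraint: each city is visited by an allowed salesman.

    allowed maps a city to the set of salesmen (colors) that may visit it.
    """
    return all(s in allowed[c] for c, s in zip(city_chrom, salesman_chrom))


def total_length(routes, coords, depot):
    """Sum of closed tour lengths; every salesman starts and ends at the depot."""
    total = 0.0
    for route in routes:
        if not route:
            continue  # an idle salesman contributes nothing
        path = [depot] + route + [depot]
        total += sum(math.dist(coords[a], coords[b])
                     for a, b in zip(path, path[1:]))
    return total


if __name__ == "__main__":
    coords = {0: (0, 0), 1: (3, 0), 2: (3, 4), 3: (0, 4)}
    allowed = {1: {0}, 2: {0, 1}, 3: {1}}   # city 2 is shared by both colors
    city_chrom, salesman_chrom = [1, 2, 3], [0, 0, 1]
    routes = decode(city_chrom, salesman_chrom, n_salesmen=2)
    print(routes, is_feasible(city_chrom, salesman_chrom, allowed),
          total_length(routes, coords, depot=0))
```

With this layout, a crossover can act on the city chromosome while a mutation reassigns entries of the salesman chromosome, provided the result still passes `is_feasible`. Whether HITÖ divides the work this way is not stated in the abstract.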
Partial multi-label learning (PML) allows learning from rich-semantic objects with inaccurate annotations, where a set of candidate labels is assigned to each training example but only some of them are valid. Existing approaches rely on disambiguation to tackle the PML problem, which aims to correct noisy candidate labels by recovering the ground-truth labeling information ahead of prediction model induction. However, this dominant strategy might be suboptimal, as it usually needs extra assumptions that cannot be fully satisfied in real-world scenarios. Instead of label correction, we investigate another strategy to tackle the PML problem, where the potential ambiguity in PML data is eliminated by correcting instance features in a label-specific manner. Accordingly, a simple yet effective approach named PASE, i.e., partial multi-label learning via label-specific feature corrections, is proposed. Under a meta-learning framework, PASE learns to exert label-specific feature corrections so that potential ambiguity specific to each class label can be eliminated and the desired prediction model can be induced on these corrected instance features with the provided candidate labels. Comprehensive experiments on a wide range of synthetic and real-world data sets validate the effectiveness of the proposed approach.
Recommender systems are effective in mitigating information overload, yet the centralized storage of user data raises significant privacy concerns. Cross-user federated recommendation (CUFR) provides a promising distributed paradigm to address these concerns by enabling privacy-preserving recommendations directly on user devices. In this survey, we review and categorize current progress in CUFR, focusing on four key aspects: privacy, security, accuracy, and efficiency. Firstly, we conduct an in-depth privacy analysis, discuss various cases of privacy leakage, and then review recent methods for privacy protection. Secondly, we analyze security concerns and review recent methods for untargeted and targeted attacks. For untargeted attack methods, we categorize them into data poisoning attack methods and parameter poisoning attack methods. For targeted attack methods, we categorize them into user-based methods and item-based methods. Thirdly, we provide an overview of the federated variants of some representative methods, and then review the recent methods for improving accuracy from two categories: data heterogeneity and high-order information. Fourthly, we review recent methods for improving training efficiency from two categories: client sampling and model compression. Finally, we conclude this survey and explore some potential future research topics in CUFR.
The naive Bayesian classifier (NBC) is a supervised machine learning algorithm with a simple model structure and good theoretical interpretability. However, the generalization performance of NBC is limited to a large extent by the assumption of attribute independence. To address this issue, this paper proposes a novel attribute grouping-based NBC (AG-NBC), which is a variant of the classical NBC trained with different attribute groups. AG-NBC first applies a novel effective objective function to automatically identify optimal dependent attribute groups (DAGs). Condition attributes in the same DAG are strongly dependent on the class attribute, whereas attributes in different DAGs are independent of one another. Then, for each DAG, a random vector functional link network with a SoftMax layer is trained to output posterior probabilities in the form of joint probability density estimation. The NBC is trained using the grouping attributes that correspond to the original condition attributes. Extensive experiments were conducted to validate the rationality, feasibility, and effectiveness of AG-NBC. Our findings showed that the attribute groups chosen for NBC can accurately represent attribute dependencies and reduce overlaps between different posterior probability densities. In addition, the comparative results with NBC, flexible NBC (FNBC), tree augmented Bayes network (TAN), gain ratio-based attribute weighted naive Bayes (GRAWNB), averaged one-dependence estimators (AODE), weighted AODE (WAODE), independent component analysis-based NBC (ICA-NBC), hidden naive Bayesian (HNB) classifier, and correlation-based feature weighting filter for naive Bayes (CFW) show that AG-NBC obtains statistically better testing accuracies, higher areas under the receiver operating characteristic curve (AUCs), and lower probability mean square errors (PMSEs) than other Bayesian classifiers. The experimental results demonstrate that AG-NBC is a valid and efficient approach for alleviating the attribute i
Foundation models (FMs) [1] have revolutionized software development and become the core components of large software systems. This paradigm shift, however, demands a fundamental re-imagining of software engineering theories and methodologies [2]. Instead of replacing existing software modules implemented by symbolic logic, incorporating FMs' capabilities to build software systems requires entirely new modules that leverage the unique capabilities of FMs. While FMs excel at handling uncertainty, recognizing patterns, and processing unstructured data, we need new engineering theories that support the paradigm shift from explicitly programming and maintaining user-defined symbolic logic to creating rich, expressive requirements that FMs can accurately perceive and implement.
Data hierarchy, as a hidden property of data structure, exists in a wide range of machine learning applications. A common practice to classify such hierarchical data is first to encode the data in the Euclidean space, and then train a Euclidean classifier. However, such a paradigm leads to a performance drop due to distortion of data embedding in the Euclidean space. To relieve this issue, hyperbolic geometry is investigated as an alternative space to encode the hierarchical data for its higher ability to capture the hierarchical structure. Existing methods cannot explore the full potential of the hyperbolic geometry, in the sense that such methods define the hyperbolic operations in the tangent plane, causing the distortion of data embedding. In this paper, we develop two novel kernel formulations in the hyperbolic space, with one being positive definite (PD) and another one being indefinite, to solve the classification tasks in hyperbolic space. The PD one is defined via mapping the hyperbolic data to the Drury-Arveson (DA) space, which is a special reproducing kernel Hilbert space (RKHS). To further increase the discrimination of the classifier, an indefinite kernel is further defined in the Krein space. Specifically, we design a 2-layer nested indefinite kernel which first maps hyperbolic data into the DA spaces, followed by a mapping from the DA spaces to the Krein space. Experiments on real-world datasets demonstrate the superiority of the proposed kernels.
Cloud storage is now widely used, but its reliability has always been a major concern. Cloud block storage (CBS) is a well-known type of cloud storage. It has the closest architecture to the underlying storage and can provide interfaces for other types. Data modifications in CBS have potential risks such as null reference or data *** verification of these operations can improve the reliability of CBS to some extent. Although separation logic is a mainstream approach to verifying program correctness, the complex architecture of CBS creates some challenges for verification. This paper develops a proof system based on separation logic for verifying CBS data modifications. The proof system can represent the CBS architecture, describe the properties of the CBS system state, and specify the behavior of CBS data modifications. Using the interactive verification approach of Coq, the proof system is implemented as a verification tool. With this tool, the paper builds machine-checked proofs for the functional correctness of CBS data modifications. This work can thus analyze the reliability of cloud storage from a formal perspective.
The emergence of the Internet-of-Things is anticipated to create a vast market for what are known as smart edge devices, opening numerous opportunities across countless domains, including personalized healthcare and advanced *** 3D integration, edge devices can achieve unprecedented miniaturization while simultaneously boosting processing power and minimizing energy ***, we demonstrate a back-end-of-line compatible optoelectronic synapse with a transfer learning method on health care applications, including electroencephalogram (EEG)-based seizure prediction, electromyography (EMG)-based gesture recognition, and electrocardiogram (ECG)-based arrhythmia *** experiments on three biomedical datasets, we observe the classification accuracy improvement for the pretrained model with 2.93% on EEG, 4.90% on ECG, and 7.92% on EMG, respectively. The optical programming property of the device enables an ultralow power (2.8×10^(-13) J) fine-tuning process and offers solutions for patient-specific issues in edge computing ***, the device exhibits impressive light-sensitive characteristics that enable a range of light-triggered synaptic functions, making it promising for neuromorphic vision *** display the benefits of these intricate synaptic properties, a 5×5 optoelectronic synapse array is developed, effectively simulating human visual perception and memory *** proposed flexible optoelectronic synapse holds immense potential for advancing the fields of neuromorphic physiological signal processing and artificial visual systems in wearable applications.
Partial-label learning (PLL) is a typical problem of weakly supervised learning, where each training instance is annotated with a set of candidate labels. Self-training PLL models achieve state-of-the-art performance but suffer from error accumulation caused by mistakenly disambiguated instances. Although co-training can alleviate this issue by training two networks simultaneously and allowing them to interact with each other, most existing co-training methods are symmetric, i.e., they train two structurally identical networks on the same task, so the networks share similar limitations and cannot sufficiently correct each other. Therefore, in this paper, we propose an asymmetric dual-task co-training PLL model called AsyCo, which forces its two networks, i.e., a disambiguation network and an auxiliary network, to learn from different views explicitly by optimizing distinct tasks. Specifically, the disambiguation network is trained with a self-training PLL task to learn label confidence, while the auxiliary network is trained in a supervised learning paradigm to learn from the noisy pairwise similarity labels that are constructed according to the learned label confidence. Finally, the error accumulation problem is mitigated via information distillation and confidence refinement. Extensive experiments on both uniform and instance-dependent partially labeled datasets demonstrate the effectiveness of AsyCo.
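The abstract does not state how AsyCo builds pairwise similarity labels from label confidence. As a rough illustration under stated assumptions, the sketch below restricts a softmax to each instance's candidate set, takes the most confident candidate as a pseudo-label, and marks two instances as similar when their pseudo-labels agree. The function names and this construction are hypothetical, not the paper's definition.

```python
import numpy as np


def candidate_confidence(logits, candidate_mask):
    """Softmax restricted to each instance's candidate labels.

    logits: (n, k) scores; candidate_mask: (n, k) with 1 for candidate labels.
    Assumes every instance has at least one candidate label.
    """
    logits = np.asarray(logits, dtype=float)
    mask = np.asarray(candidate_mask, dtype=bool)
    masked = np.where(mask, logits, -np.inf)       # exclude non-candidates
    masked = masked - masked.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(masked)                            # exp(-inf) == 0
    return exp / exp.sum(axis=1, keepdims=True)


def pairwise_similarity_labels(confidence):
    """Return an (n, n) 0/1 matrix: 1 if two instances share a pseudo-label.

    The labels are noisy because pseudo-labels may be wrong.
    """
    pseudo = np.asarray(confidence).argmax(axis=1)
    return (pseudo[:, None] == pseudo[None, :]).astype(int)


if __name__ == "__main__":
    logits = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 2.0]]
    mask = [[1, 1, 0], [1, 0, 1], [1, 0, 1]]
    conf = candidate_confidence(logits, mask)
    print(pairwise_similarity_labels(conf))
```

In the second row of the example, the highest raw score belongs to a non-candidate label, so the masking step changes which pseudo-label is chosen.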