Unbiased cross-validation (UCV) is a commonly used method for calculating the optimal bandwidth of the kernel density estimator (KDE), which estimates the underlying probability density function (PDF) of a given data set. Since the UCV method was proposed, a few studies have pointed out its instability when determining the KDE bandwidth. Following the principle of stability improvement, this paper presents a novel ensemble UCV-based KDE (EUCV-KDE), which determines the expectation of an estimated PDF using an ensemble of data-block-based UCVs rather than a single data-point-based UCV. To derive the optimal bandwidth, a novel objective function is designed for EUCV-KDE by considering the empirical and structural risk of KDE together. We validate the rationality and effectiveness of EUCV-KDE on 10 probability distributions. The experimental results show that EUCV-KDE converges as the number of data-block-based UCVs increases and achieves more stable and better prediction performance than the classical UCV-KDE and the revisited cross-validation (RCV) based KDE (RCV-KDE). In addition, a real-world application based on UK climate data is provided to further validate the effectiveness of EUCV-KDE by determining the optimal bandwidth for the Nadaraya-Watson kernel regression estimator. (c) 2021 Elsevier Inc. All rights reserved.
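For readers unfamiliar with the classical baseline, the following is a minimal sketch of single UCV bandwidth selection for a one-dimensional Gaussian-kernel KDE. It is not the EUCV-KDE objective proposed in the paper. It uses the standard least-squares cross-validation criterion, UCV(h) = ∫ f̂_h(x)² dx − (2/n) Σ_i f̂_{h,−i}(x_i), which has a closed form for Gaussian kernels. The function names `ucv_score` and `select_bandwidth` and the candidate grid are illustrative assumptions.

```python
import math


def gaussian_pdf(u, sigma):
    """Density of N(0, sigma^2) evaluated at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def ucv_score(data, h):
    """Classical UCV criterion for a Gaussian-kernel KDE with bandwidth h.

    First term: integral of the squared KDE, which for Gaussian kernels is
    (1/n^2) * sum_{i,j} N(x_i - x_j; 0, 2h^2).
    Second term: (2/n) * sum_i of the leave-one-out KDE evaluated at x_i.
    """
    n = len(data)
    squared_integral = 0.0
    leave_one_out = 0.0
    for i in range(n):
        for j in range(n):
            diff = data[i] - data[j]
            squared_integral += gaussian_pdf(diff, math.sqrt(2.0) * h)
            if i != j:
                leave_one_out += gaussian_pdf(diff, h)
    return squared_integral / n ** 2 - 2.0 * leave_one_out / (n * (n - 1))


def select_bandwidth(data, candidates):
    """Return the candidate bandwidth with the smallest UCV score (grid search)."""
    return min(candidates, key=lambda h: ucv_score(data, h))


if __name__ == "__main__":
    sample = [-1.3, -0.4, -0.1, 0.0, 0.2, 0.5, 0.9, 1.7]  # toy data, assumed
    grid = [0.1, 0.2, 0.4, 0.8, 1.6]                      # candidate bandwidths, assumed
    print(select_bandwidth(sample, grid))
```

Because the criterion is evaluated on a single sample, its minimizer can vary noticeably between samples, which is the instability the abstract refers to.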
Cross-modal anomaly detection is a relatively new and challenging research topic in the machine learning field, which aims to identify anomalies whose patterns are disparate across different modalities. As far as we know, this topic has yet to be well studied, and existing works often suffer from incomplete detection of anomalous data and low data utilization. To alleviate these limitations, this paper proposes an efficient deep cross-modal anomaly detection approach via Triple-adaptive Network and Bi-quintuple Contrastive Learning (TN-BCL), which is among the earliest attempts to detect various cross-modal anomalies within heterogeneous multi-modal data. Specifically, a triple-adaptive network is explicitly designed to identify various anomalies whose patterns are disparate in both the single-modal and the cross-modal scenario. On the one hand, the top branch network is used to adaptively detect the attribute anomalies and part of the mixed anomalies in multi-modal data samples. On the other hand, the bottom two-branch network, with shared residual blocks, is leveraged to learn discriminative cross-modal embeddings. At the same time, an efficient bi-quintuple contrastive learning method is designed to enhance the feature correlation between data of the same attribute, while maximally enlarging the feature difference between data of different attributes. In addition, a bidirectional learning scheme is employed to significantly improve data utilization. Through the joint exploitation of the above, different kinds of anomalous samples can be well detected across different modalities. Extensive experiments show that the proposed framework outperforms state-of-the-art competing methods by a large margin.
ISBN: (print) 9798350313345; 9798350313338
Accurate segmentation of gastric tumors is critical yet presents a formidable challenge in medical imaging, where conventional UNet-based frameworks, despite their prevalence, falter on intricate tumor samples due to their limited interactive capacities. SAM-based segmentation methods address this shortcoming, yet with insufficient accuracy. By blending images with mask inputs, our MSI-UNet leverages a U-shaped design to deliver pixel-level segmentation, while a novel multi-scale attention module harnesses interaction points for refined information extraction. When benchmarked on gastric tumor segmentation tasks, MSI-UNet surpasses existing state-of-the-art methods, raising the Dice Similarity Coefficient (DSC) from 74.82% to 79.3% and reducing the Average Surface Distance (ASD) from 6.46 to 1.98, achieving accuracy comparable to the inter-radiologist consistency of 79.7% DSC. Furthermore, our framework demonstrates superior predictive performance in survival analysis, improving the C-index from 61.7% to 68.68%. Ample experimental comparisons have substantiated that MSI-UNet holds the potential to offer considerable assistance to healthcare professionals in managing and decoding subsequent medical procedures.
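As a reminder of what the headline metric measures, here is a minimal sketch of the Dice Similarity Coefficient for two binary masks, DSC = 2|A ∩ B| / (|A| + |B|). The flat-list mask representation and the function name `dice_coefficient` are assumptions made for illustration; the abstract does not describe the paper's evaluation code.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice Similarity Coefficient between two binary masks of equal length.

    Masks are flat sequences of 0/1 (or False/True) values, one per pixel.
    Returns 1.0 when both masks are empty, a common convention (assumed here).
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))  # 2 * 1 / (2 + 1)
```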
Federated learning (FL) has emerged as an attractive collaborative machine learning framework that enables training of models across decentralized devices by exposing only model parameters. However, malicious attackers can still hijack communicated parameters to expose clients' raw samples, resulting in privacy leakage. To defend against such attacks, differentially private FL (DPFL) was devised, which protects privacy by adding noise and incurs negligible computation overhead. Nevertheless, low model utility and communication efficiency make DPFL hard to deploy in real environments. To overcome these deficiencies, we propose a novel DPFL algorithm called FedDP-SA (namely, federated learning with differential privacy by splitting local data sets and averaging parameters). Specifically, FedDP-SA splits a local data set into multiple subsets for parameter updating. Then, the parameters averaged over all subsets, plus differential privacy (DP) noise, are returned to the parameter server. FedDP-SA offers dual benefits: 1) enhancing model accuracy by efficiently lowering sensitivity, thereby reducing the noise needed to ensure DP, and 2) improving communication efficiency by communicating model parameters at a lower frequency. These advantages are validated through sensitivity analysis and convergence rate analysis. Finally, we conduct comprehensive experiments to verify the performance of FedDP-SA compared with other state-of-the-art baseline algorithms.
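The split-then-average step described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the FedDP-SA algorithm itself: the model is a single scalar, `local_update` is a hypothetical one-step gradient update on a squared-error loss, the noise is plain Gaussian with a caller-supplied standard deviation, and no privacy accounting or gradient clipping is performed.

```python
import random


def split_into_subsets(samples, num_subsets):
    """Partition the local samples into num_subsets interleaved subsets."""
    return [samples[i::num_subsets] for i in range(num_subsets)]


def local_update(param, subset, learning_rate=0.1):
    """Hypothetical local step: one gradient step of 0.5 * (param - x)^2, averaged over the subset."""
    gradient = sum(param - x for x in subset) / len(subset)
    return param - learning_rate * gradient


def private_client_round(param, samples, num_subsets, noise_std, rng):
    """Update on each subset, average the results, then add Gaussian noise.

    Assumes every subset is non-empty (len(samples) >= num_subsets).
    """
    subsets = split_into_subsets(samples, num_subsets)
    updated = [local_update(param, subset) for subset in subsets]
    averaged = sum(updated) / len(updated)
    return averaged + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    rng = random.Random(0)
    local_data = [1.0, 2.0, 3.0, 4.0]  # toy local data set, assumed
    print(private_client_round(0.0, local_data, num_subsets=2, noise_std=0.05, rng=rng))
```

The intuition, offered here as a reading of the abstract rather than the paper's proof, is that a single record influences only one subset's update, so averaging over several subsets dilutes that record's effect on the value sent to the server.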
The recently developed matrix-based Rényi's α-order entropy enables measurement of information in data simply using the eigenspectrum of symmetric positive semi-definite (PSD) matrices in reproducing kernel Hilbert space, without estimation of the underlying data distribution. This intriguing property has led to this new information measurement being widely adopted in multiple statistical inference and learning tasks. However, the computation of this quantity involves the trace operator on a PSD matrix G raised to the power α (i.e., tr(G^α)), with a normal complexity of nearly O(n^3), which severely hampers its practical usage when the number of samples (i.e., n) is large. In this work, we present computationally efficient approximations to this new entropy functional that can reduce its complexity to even significantly less than O(n^2). To this end, we leverage recent progress in Randomized Numerical Linear Algebra, developing Taylor, Chebyshev and Lanczos approximations to tr(G^α) for arbitrary values of α by converting it into a matrix-vector multiplication problem. We also establish the connection between the matrix-based Rényi's entropy and PSD matrix approximation, which enables exploiting both clustering and block low-rank structure of G to further reduce the computational cost. We theoretically provide approximation accuracy guarantees and illustrate the properties of the different approximations. Large-scale experimental evaluations on both synthetic and real-world data corroborate our theoretical findings, showing promising speedup with negligible loss in accuracy.
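The idea of turning tr(G^α) into matrix-vector products can be illustrated with a Hutchinson-style stochastic trace estimator. The sketch below handles integer α ≥ 2 only, by applying G repeatedly to random sign vectors; the paper's Taylor, Chebyshev and Lanczos approximations for arbitrary α are not reproduced here. It assumes the usual definition S_α(G) = log2(tr(G^α)) / (1 − α) for a trace-normalized PSD matrix G, which the abstract does not spell out, and all function names are illustrative.

```python
import math
import random


def matvec(matrix, vector):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def hutchinson_trace_power(matrix, alpha, num_probes=64, seed=0):
    """Estimate tr(G^alpha) for integer alpha >= 1 using Rademacher probe vectors.

    Each probe z contributes z^T G^alpha z, computed with alpha matrix-vector
    products, so G^alpha is never formed explicitly.
    """
    rng = random.Random(seed)
    n = len(matrix)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        w = z
        for _ in range(alpha):
            w = matvec(matrix, w)
        total += sum(zi * wi for zi, wi in zip(z, w))
    return total / num_probes


def renyi_entropy(matrix, alpha, num_probes=64, seed=0):
    """Matrix-based Renyi entropy (in bits) of a trace-normalized PSD matrix, integer alpha >= 2."""
    trace_power = hutchinson_trace_power(matrix, alpha, num_probes, seed)
    return math.log2(trace_power) / (1.0 - alpha)


if __name__ == "__main__":
    gram = [[0.5, 0.1], [0.1, 0.5]]  # toy trace-normalized PSD matrix, assumed
    print(renyi_entropy(gram, alpha=2))
```

Each probe costs α matrix-vector products, i.e. O(α n^2) for a dense matrix, compared with the roughly O(n^3) cost of a full eigendecomposition.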
Cervical cancer is one of the most commonly diagnosed cancers among women globally, with a high mortality rate. For early diagnosis, automated and accurate cervical cancer classification approaches can be developed through effective classification of Pap smear cell images. The current study introduces a novel Modified Firefly Optimization Algorithm with Deep Learning-enabled cervical cancer classification (MFFOA-DL3) model for the classification of Pap Smear Images (PSI). The proposed MFFOA-DL3 model examines the PSI for the existence of cervical cancer cells. To accomplish this, the model first applies a Bilateral Filtering (BF)-based noise removal approach. Then, Kapur's entropy-based image segmentation technique is applied to determine the affected regions. Next, the EfficientNet technique is applied to generate the feature vectors. Finally, MFFOA with a Stacked Sparse Denoising Autoencoder (SSDA) model is used to classify the PSI, with MFFOA tuning the parameters of the SSDA model. The proposed MFFOA-DL3 model was experimentally validated using a benchmark dataset. The results of an extensive comparative analysis highlight the better performance of the MFFOA-DL3 model over other recent approaches.
Spatial transcriptomics and messenger RNA splicing encode extensive spatiotemporal information for cell states and transitions. The current lineage-inference methods either lack spatial dynamics for state transition or cannot capture different dynamics associated with multiple cell states and transition paths. Here we present spatial transition tensor (STT), a method that uses messenger RNA splicing and spatial transcriptomes through a multiscale dynamical model to characterize multistability in space. By learning a four-dimensional transition tensor and spatial-constrained random walk, STT reconstructs cell-state-specific dynamics and spatial state transitions via both short-time local tensor streamlines between cells and long-time transition paths among attractors. Benchmarking and applications of STT on several transcriptome datasets via multiple technologies on epithelial-mesenchymal transitions, blood development, spatially resolved mouse brain and chicken heart development, indicate STT's capability in recovering cell-state-specific dynamics and their associated genes not seen using existing methods. Overall, STT provides a consistent multiscale description of single-cell transcriptome data across multiple spatiotemporal scales. STT is a method that connects mRNA splicing and cell-state transitions across spatiotemporal scales and at single-cell resolution.
Hierarchical federated learning (HFL) improves the scalability and efficiency of traditional federated learning (FL) by incorporating a hierarchical topology into the FL framework. In a typical HFL system, clients are divided into multiple tiers, and the training process involves both local and global model aggregation. However, existing HFL approaches have several significant drawbacks. First, the root parameter server (PS) is vulnerable to single-point failure and also acts as a bottleneck for global aggregation. Second, frequent global aggregation over the wide area network (WAN) incurs substantial communication costs, which negatively affect training efficiency. In this paper, we propose a novel HFL algorithm called CPFedAvg to address these challenges. CPFedAvg introduces a root-free hierarchical topology, where the top tier consists of multiple PSes, effectively resolving the issues associated with the root PS. Additionally, we substitute the expensive global aggregation with parameter mixing operations between the PSes in the top tier. We analyze the convergence rate of CPFedAvg under non-convex loss. Based on this analysis, we formulate a convex optimization problem to optimize the frequency of executing local aggregations between consecutive parameter mixing operations. To simulate real-world communication networks, we develop FedNetSimulator, which simulates a diverse range of FL communication processes. Finally, we conduct extensive experiments using real datasets (i.e., CIFAR-10 and CIFAR-100). The experimental results demonstrate that CPFedAvg can improve model accuracy by up to 18% and achieve a speedup of up to 6× compared with state-of-the-art baselines.
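To give a feel for what replacing global aggregation with parameter mixing between top-tier PSes might look like, here is a small sketch in which each PS averages its parameters with its two neighbors on a ring. The ring topology, the mixing weights, and the function name `mix_parameters` are all assumptions chosen for illustration; the abstract does not specify how CPFedAvg mixes parameters.

```python
def mix_parameters(server_params, self_weight=0.5):
    """One mixing round on a ring of parameter servers (assumes at least 3 servers).

    server_params: list of parameter vectors, one per top-tier PS.
    Each PS keeps self_weight of its own vector and splits the remainder
    equally between its left and right neighbors. The weights form a doubly
    stochastic mixing matrix, so the average over all PSes is preserved.
    """
    count = len(server_params)
    neighbor_weight = (1.0 - self_weight) / 2.0
    mixed = []
    for i in range(count):
        own = server_params[i]
        left = server_params[(i - 1) % count]
        right = server_params[(i + 1) % count]
        mixed.append([
            self_weight * c + neighbor_weight * (l + r)
            for c, l, r in zip(own, left, right)
        ])
    return mixed


if __name__ == "__main__":
    params = [[0.0], [3.0], [6.0]]  # toy one-dimensional models, assumed
    for _ in range(3):
        params = mix_parameters(params)
    print(params)  # values move toward the common average of 3.0
```

No single server ever collects all models in this scheme, which is the property that removes the root PS as a bottleneck and single point of failure.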
Due to the outbreak of the COVID-19 pandemic, more companies have adopted telecommuting, which also exposes a larger attack surface to advanced persistent threat (APT) attacks. After initially gaining access to the intranet, attackers use server message block (SMB), RDP, and other remote sharing or connection protocols to move laterally in order to escalate privileges. In this work, we design a multidimensional detection framework to detect lateral movement behavior based on the SMB protocol in the intranet environment. This framework combines active trapping and passive scanning, and uses neural networks to identify the attack samples used by the adversary when moving laterally. We test the effectiveness of the active trapping technology in a simulation environment, and verify with real malware samples that the accuracy of neural network detection can reach about 90%. The experimental results show that our work can effectively detect lateral movement behavior using the SMB protocol in the intranet environment.
Recently, Artificial Intelligence (AI) has received growing attention and is used in many applications. It is expected to play a key role in Vehicular Ad Hoc Networks (VANET), and AI-enabled VANET (AI-VANET) has become an emerging field. However, its cyber security faces enormous challenges. Although many trust schemes have been proposed to address these issues, relying only on trust updates could increase the risk of long-term attacks going undetected. In this article, we design a human cognition-based trust update scheme (HC-TUS) for AI-VANET. Significantly, the novel trust update scheme is designed by strategically incorporating the Ebbinghaus forgetting theory. The simulation results indicate that 1) HC-TUS better satisfies the "hard to get, easy to lose" principle for trust than BRSN and BTDS; 2) HC-TUS detects and resists the collusion attack more quickly than BRSN and BTDS. The open issues in terms of trust for AI-VANET are also investigated and highlighted.