Recent advances in single-cell RNA sequencing (scRNA-seq) technology provide unprecedented opportunities for reconstructing gene regulatory networks (GRNs). Many different models have been proposed to infer GRNs from large amounts of RNA-seq data, but most deep learning models rely on an a priori gene regulatory network to infer potential GRNs. Reconstructing GRNs from scRNA-seq data remains challenging due to the noise and sparsity introduced by the dropout effect. Here, we propose GAALink, a novel unsupervised deep learning method. It first constructs a gene similarity matrix and then refines it with a threshold value. It then learns feature representations of genes through a graph attention autoencoder that propagates information across genes with different weights. Finally, we use the learned gene feature representations for matrix completion so that GRNs are reconstructed. Compared with seven existing GRN reconstruction methods, GAALink achieves more accurate performance on seven scRNA-seq datasets with four ground-truth networks. GAALink provides a useful tool for inferring GRNs from scRNA-seq expression data.
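As a rough illustration of the pipeline described above (similarity matrix, threshold refinement, attention-based propagation, and matrix completion), here is a minimal NumPy sketch. The toy expression matrix, the 0.3 cutoff, the single untrained attention layer, and all names are illustrative assumptions, not the authors' GAALink implementation.

```python
# Structural sketch of a GAALink-style workflow on toy data (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)
expr = rng.poisson(1.0, size=(200, 50)).astype(float)    # toy data: 200 cells x 50 genes

# 1) Gene-gene similarity matrix (Pearson correlation across cells).
sim = np.corrcoef(expr.T)                                 # (genes, genes)
np.fill_diagonal(sim, 0.0)

# 2) Refine by a threshold to get a sparse adjacency used for message passing.
adj = (np.abs(sim) > 0.3).astype(float)                   # 0.3 is an arbitrary cutoff

# 3) One single-head graph-attention propagation step (untrained, structure only).
def graph_attention(x, adj, w, a):
    """GAT-style layer: neighbors are aggregated with attention weights."""
    h = x @ w                                              # (genes, d)
    d = h.shape[1]
    e = (h @ a[:d])[:, None] + (h @ a[d:])[None, :]        # pairwise attention logits
    e = np.where(e > 0, e, 0.2 * e)                        # LeakyReLU
    alpha = np.where(adj > 0, np.exp(e - e.max()), 0.0)    # mask non-edges
    alpha = alpha / (alpha.sum(axis=1, keepdims=True) + 1e-12)
    return np.tanh(alpha @ h)                              # attention-weighted aggregation

n_genes = expr.shape[1]
w = rng.normal(scale=0.1, size=(n_genes, 16))
a = rng.normal(scale=0.1, size=32)

# Each gene's feature vector here is simply its similarity profile (a row of `sim`).
z = graph_attention(sim, adj, w, a)                        # embeddings (learned in the real model)

# 4) Matrix-completion stand-in: score candidate regulatory links by inner products.
scores = z @ z.T
i, j = np.unravel_index(np.argmax(np.triu(scores, k=1)), scores.shape)
print(f"top-scoring candidate link: gene {i} -> gene {j}, score {scores[i, j]:.3f}")
```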
Tracking the flow of external inputs in a program with taint-analysis techniques can help developers better identify potential security vulnerabilities in the software. However, directly using the static taint analysi...
Software plays a fundamental role in research as a tool, an output, or even as an object of study. This special issue on software citation, indexing, and discoverability brings together five papers examining different aspects of how the use of software is recorded and made available to others. It describes new work on datasets that enable large-scale analysis of the evolution of software usage and citation, that presents evidence of increased citation rates when software artifacts are released, that provides guidance for registries and repositories to support software citation and findability, and that shows there are still barriers to improving and formalising software citation and publication practice. As the use of software increases further, driven by modern research methods, addressing the barriers to software citation and discoverability will encourage greater sharing and reuse of software, in turn enabling research progress.
In the contemporary landscape, autonomous vehicles (AVs) have emerged as a prominent technological advancement globally. Despite their widespread adoption, significant hurdles remain, with security standing out as a c...
Converting source code from one programming language to another is a problem that occurs regularly in real life, but has attracted limited attention and has not been investigated systematically. This paper presents th...
The current COVID-19 epidemic has caused a catastrophe on a global scale due to its risky spread. The community’s insecurity is growing as a result of a lack of appropriate remedial measures and immun...
Real-world objects exhibit intricate semantic properties that can be characterized from a multitude of perspectives, which necessitates the development of a model capable of discerning multiple patterns within data, while concurrently predicting several Labeling Dimensions (LDs) — a task known as Multi-dimensional Classification (MDC). While the class imbalance issue has been extensively investigated within the multi-class paradigm, its study in the MDC context has been limited due to the imbalance shift phenomenon. A sample’s classification as a minor or major class instance becomes ambiguous when it belongs to a minor class in one LD and a major class in another. Previous MDC methodologies predominantly emphasized instance-wise criteria, neglecting prediction capabilities from a dimension aspect, i.e., the average classification performance across LDs. We assert the significance of dimension-wise metrics in real-world MDC applications and introduce two such metrics. Furthermore, we observe imbalanced class distributions within each LD and propose a novel Imbalance-Aware fusion Model (IMAM) for addressing the MDC problem. Specifically, we first decompose the task into multiple multi-class classification problems, creating imbalance-aware deep models for each LD separately. This straightforward method performs well across LDs without sacrificing performance in instance-wise criteria. Subsequently, we employ LD-wise models as multiple teachers and transfer their knowledge across all LDs to a unified student model. Experimental results on several real-world datasets demonstrate that our IMAM approach excels in both instance-wise evaluations and the proposed dimension-wise metrics.
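A hedged PyTorch sketch of the two-stage recipe (one imbalance-aware teacher per LD, then distillation of all teachers into a single multi-head student) is given below on synthetic data. The inverse-frequency class weights, the temperature of 2, the tiny MLPs, and the training schedule are assumptions for illustration and are not the IMAM architecture or hyperparameters from the paper.

```python
# Sketch of an IMAM-style setup: per-LD imbalance-aware teachers, then one distilled student.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
n, d, lds = 512, 20, [3, 4]                 # samples, features, class counts per labeling dimension
x = torch.randn(n, d)
y = [torch.randint(0, c, (n,)) for c in lds]

def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, d_out))

# Stage 1: one imbalance-aware teacher per LD (inverse-frequency class weights).
teachers = []
for k, c in enumerate(lds):
    t = mlp(d, c)
    w = 1.0 / torch.bincount(y[k], minlength=c).float().clamp(min=1)
    opt = torch.optim.Adam(t.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        F.cross_entropy(t(x), y[k], weight=w).backward()
        opt.step()
    teachers.append(t.eval())

# Stage 2: a single multi-head student distilled from all LD-wise teachers (soft targets, T=2).
backbone, heads = mlp(d, 32), nn.ModuleList(nn.Linear(32, c) for c in lds)
opt = torch.optim.Adam(list(backbone.parameters()) + list(heads.parameters()), lr=1e-2)
T = 2.0
for _ in range(200):
    opt.zero_grad()
    h = backbone(x)
    loss = 0.0
    for k, head in enumerate(heads):
        with torch.no_grad():
            soft = F.softmax(teachers[k](x) / T, dim=1)        # teacher's soft predictions
        loss = loss + F.kl_div(F.log_softmax(head(h) / T, dim=1), soft, reduction="batchmean")
        loss = loss + F.cross_entropy(head(h), y[k])           # keep the hard labels as well
    loss.backward()
    opt.step()
```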
Scene-based recommendation has proven its usefulness in E-commerce, by recommending commodities based on a given scene. However, scenes are typically unknown in advance, which necessitates scene discovery for recommendation. In this article, we study scene discovery for E-commerce. We first formalize a scene as a set of commodity categories that occur simultaneously and frequently in real-world situations, and model an E-commerce platform as a heterogeneous information network (HIN), whose nodes and links represent different types of objects and different types of relationships between objects, respectively. We then formulate the scene mining problem for E-commerce as an unsupervised learning problem that finds the overlapping clusters of commodity categories in the HIN. To solve the problem, we propose a non-negative matrix factorization based method SMEC (Scene Mining for E-Commerce), and theoretically prove its convergence. On six real-world E-commerce datasets, we finally conduct an extensive experimental study to evaluate SMEC against 13 other methods, and show that SMEC consistently outperforms its competitors with regard to various evaluation measures.
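For intuition only, the following sketch mines overlapping clusters with a generic damped symmetric non-negative matrix factorization on a toy category co-occurrence matrix; soft memberships in the factor matrix then yield overlapping scenes. This is not the SMEC algorithm, which operates on an HIN: the co-occurrence input, the rank k, the update rule, and the 0.5 membership threshold are all illustrative assumptions.

```python
# Toy illustration of overlapping scene discovery via symmetric NMF (not SMEC itself).
import numpy as np

rng = np.random.default_rng(0)
n_cat, k = 12, 3                                     # 12 commodity categories, 3 latent scenes

# Toy co-occurrence matrix: categories sharing a scene co-occur in the same situations.
M = (rng.random((n_cat, k)) < 0.3).astype(float)     # ground-truth overlapping memberships
M[np.arange(n_cat), rng.integers(0, k, n_cat)] = 1.0 # every category joins at least one scene
A = M @ M.T + 0.05 * rng.random((n_cat, n_cat))
A = (A + A.T) / 2                                    # keep it symmetric

# Symmetric NMF, A ~= H @ H.T with H >= 0; rows of H are soft scene memberships.
H = rng.random((n_cat, k)) + 0.1
for _ in range(500):
    # damped multiplicative update, keeps H nonnegative
    H *= 0.5 + 0.5 * (A @ H) / (H @ (H.T @ H) + 1e-9)

# Categories whose membership exceeds an (arbitrary) threshold form overlapping scenes.
scenes = [np.flatnonzero(H[:, j] > 0.5).tolist() for j in range(k)]
print("discovered scenes (category ids):", scenes)
```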
Few-shot NER aims to identify entities of target types with only a limited number of illustrative instances. Unfortunately, few-shot NER is severely challenged by the intrinsic precise generalization problem, i.e., it i...
As various types of data grow explosively, large-scale data storage, backup, and transmission become challenging, which motivates many researchers to propose efficient universal compression algorithms for multi-source data. In recent years, due to the emergence of hardware acceleration devices such as GPUs, TPUs, DPUs, and FPGAs, the performance bottleneck of neural networks (NN) has been overcome, making NN-based compression algorithms increasingly practical and popular. However, no research survey of NN-based universal lossless compressors has been conducted yet, and there is also a lack of unified evaluation metrics. To address these problems, in this paper we present a holistic survey as well as benchmark evaluations. Specifically, i) we thoroughly investigate NN-based lossless universal compression algorithms for multi-source data and classify them into three types: static pre-training, adaptive, and semi-adaptive. ii) We unify 19 evaluation metrics to comprehensively assess the compression effect, resource consumption, and model performance of compressors. iii) We conduct more than 4,600 CPU/GPU hours of experiments to evaluate 17 state-of-the-art compressors on 28 real-world datasets across data types including text, images, videos, and audio. iv) We also summarize the strengths and drawbacks of NN-based lossless data compressors and discuss promising research directions. We summarize the results as the NN-based Lossless Compressors Benchmark (NNLCB, see the ***/NNLCB website), which will be updated and maintained continuously in the future.
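To give a flavor of unified compression metrics, here is a small Python sketch that reports compression ratio, bits per byte, and throughput for any lossless compressor; zlib stands in for an NN-based model, and the metric names are generic placeholders rather than NNLCB's 19 metrics.

```python
# Generic lossless-compression metrics; zlib is only a stand-in for an NN-based compressor.
import time
import zlib

def evaluate_compressor(data: bytes, compress, decompress) -> dict:
    t0 = time.perf_counter()
    comp = compress(data)
    t1 = time.perf_counter()
    restored = decompress(comp)
    t2 = time.perf_counter()
    assert restored == data, "compressor must be lossless"
    return {
        "original_bytes": len(data),
        "compressed_bytes": len(comp),
        "compression_ratio": len(data) / len(comp),
        "bits_per_byte": 8 * len(comp) / len(data),
        "compress_MB_per_s": len(data) / (t1 - t0) / 1e6,
        "decompress_MB_per_s": len(restored) / (t2 - t1) / 1e6,
    }

sample = b"the quick brown fox jumps over the lazy dog. " * 2000
report = evaluate_compressor(sample, lambda d: zlib.compress(d, 9), zlib.decompress)
for name, value in report.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```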