Capturing the root cause and propagation path of the fault is critical to ensuring the safety and efficiency of industrial processes, especially those that inadequately utilize process knowledge and data. To address t...
Learning with Noisy Labels (LNL) aims to improve the model generalization when facing data with noisy labels, and existing methods generally assume that noisy labels come from known classes, called closed-set noise. H...
Biselection (feature and sample selection) enhances the efficiency and accuracy of machine learning models when handling large-scale data. Fuzzy rough sets, a mathematical model of uncertainty known for its excellent in...
This paper addresses the challenge of Granularity Competition in fine-grained classification tasks, which arises due to the semantic gap between multi-granularity labels. Existing approaches typically develop independ...
Authors: Wang, Feiyu; Zhou, Jian-Tao; Guo, Xu
College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia, China; Inner Mongolia Key Laboratory of Social Computing and Data Processing; Inner Mongolia Engineering Laboratory for Big Data Analysis Technology; Engineering Research Center of Ecological Big Data, Ministry of Education; Natl. Loc. Jt. Eng. Research Center of Intelligent Information Processing Technology for Mongolian; Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, China
In a multi-cloud storage system, provenance data records all operations and ownership during its lifecycle, which is critical for data security and auditability. However, recording provenance data also poses some challe...
Authors: Wang, Feiyu; Zhou, Jian-Tao
College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia, China; Engineering Research Center of Ecological Big Data, Ministry of Education; Natl. Loc. Jt. Eng. Research Center of Intelligent Information Processing Technology for Mongolian; Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software; Inner Mongolia Key Laboratory of Social Computing and Data Processing; Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, China
Cloud storage services have been used by most businesses and individual users. However, data loss, service interruptions and cyber attacks often lead to cloud storage services not being provided properly, and these in...
Due to its open-source nature, the Android operating system has been the main target for attackers to exploit. Malware creators routinely apply different code obfuscations to their apps to hide malicious activities. Features extracted from these obfuscated samples through program analysis contain many useless and disguised features, which leads to many false negatives. To address this issue, we demonstrate that obfuscation-resilient malware family analysis can be achieved through contrastive learning. The key insight behind our analysis is that contrastive learning can be used to reduce the difference introduced by obfuscation while amplifying the difference between malware and other types of malware. Based on this analysis, we design a system, IFDroid, that achieves robust and interpretable classification of Android malware. To achieve robust classification, we perform contrastive learning on malware samples to learn an encoder that automatically extracts robust features from them. To achieve interpretable classification, we transform the function call graph of a sample into an image by centrality analysis; the corresponding heatmaps can then be obtained with visualization techniques. These heatmaps help users understand why a sample is classified into a given family. We implement IFDroid and perform extensive evaluations on two datasets. Experimental results show that IFDroid is superior to state-of-the-art Android malware familial classification systems. Moreover, IFDroid maintains a 98.4% F1 when classifying 69,421 obfuscated malware samples. IEEE
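The contrastive idea in the abstract above (pull an obfuscated variant toward its original, push other samples away) can be illustrated with a minimal sketch. This uses a generic margin-based pairwise contrastive loss, which is an assumption chosen for illustration and not necessarily the loss IFDroid uses; the function names, the `margin` parameter, and the two-dimensional embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(a, b, same_family, margin=1.0):
    """Generic margin-based contrastive loss for one pair of embeddings.

    Pairs from the same family are penalized by their squared distance;
    pairs from different families are penalized only while they are
    closer than `margin`. (Illustrative only, not the paper's loss.)
    """
    d = euclidean(a, b)
    if same_family:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical embeddings: an original sample, an obfuscated variant of it,
# and a sample from a different family.
original = [0.9, 0.1]
obfuscated = [0.8, 0.2]
other_family = [0.1, 0.9]

# Small positive loss: the obfuscated variant is close but not identical.
loss_pull = pairwise_contrastive_loss(original, obfuscated, same_family=True)
# Zero loss: the other-family sample is already farther away than the margin.
loss_push = pairwise_contrastive_loss(original, other_family, same_family=False)
```

Minimizing the first term shrinks the gap that obfuscation introduces, while the second term only acts on pairs from different families that sit too close together.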
In order to solve the problem that the similarity method used in software module clustering can produce arbitrary decisions, and the description matrix of dendrogram generated by base clustering in hierarchical cluster...
A common but critical task in biological ontology data analysis is to compare the differences between ontologies. Numerous ontology-based semantic-similarity measures have been proposed for specific ontology domains, but comparing cross-domain ontologies remains a challenge. An ontology contains the scientific natural language description of the corresponding biological aspect. Therefore, we develop a new method based on the natural language processing (NLP) representation model Bidirectional Encoder Representations from Transformers (BERT) for cross-domain semantic representation of biological ontologies. This article uses the BERT model to represent the ontologies at the word level as a set of vectors, facilitating semantic analysis or comparison of the biomedical entities named in an ontology or associated with ontology terms. We evaluated our method in two experiments: calculating similarities between pairs of disease ontology and human phenotype ontology terms, and predicting pairwise protein interactions. The experimental results demonstrated comparative performance. This gives promise to the development of NLP methods in biological data analysis.
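Once ontology terms are represented as vectors, comparing two terms reduces to a vector similarity. The sketch below assumes mean pooling of word vectors and cosine similarity, neither of which the abstract specifies; the toy two-dimensional vectors stand in for real BERT outputs, and all names are hypothetical.

```python
import math


def mean_pool(word_vectors):
    """Average a list of equal-length word vectors into one term vector."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy word vectors (placeholders for model outputs) for three terms.
term_a = mean_pool([[1.0, 0.0], [0.0, 1.0]])  # two-word term
term_b = mean_pool([[1.0, 1.0]])              # one-word term
term_c = mean_pool([[1.0, 0.0]])              # one-word term

sim_ab = cosine_similarity(term_a, term_b)
sim_ac = cosine_similarity(term_a, term_c)
```

Here `term_a` and `term_b` point in the same direction, so `sim_ab` is higher than `sim_ac`; with real embeddings the same comparison would rank candidate term pairs across ontologies.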
Anomaly detection refers to the identification of data objects that deviate from the general data distribution. One of the important challenges in anomaly detection is handling high-dimensional data, especially when i...