Search Results - Inner Mongolia University Library

SSRN, 2022

Authors: Lv, Kexin; He, Fan; Huang, Xiaolin; Yang, Jie. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, MOE Key Laboratory of System Control and Information Processing, Shanghai 200240, China

The generalized eigenvalue problem (GEP) plays a significant role in signal processing and machine learning. This paper proposes a consensus-based distributed algorithm for GEP in multi-agent systems, where data samples are stored in a distributed manner across agents. The distributed GEP is reformulated as a consensus optimization problem, but its inseparable quadratic constraint makes the problem more challenging. To deal with this, a sequential method is proposed in combination with the alternating direction method of multipliers, which requires communication between pairs of nodes. Theoretical analysis shows that the proposed algorithm converges to the set of stationary solutions. Numerical experiments on synthetic and real-world datasets validate that the approximate solution is competitive with the ground truth. © 2022, The Authors. All rights reserved.

Keywords: Multi-agent systems
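For orientation, the following is a minimal centralized sketch of the problem this abstract addresses, not the paper's distributed algorithm. It assumes the global matrices are sums of per-agent matrices (`local_A`, `local_B` are hypothetical names), that A is symmetric and B is symmetric positive definite, and it solves A w = λ B w by Cholesky whitening with NumPy. The normalization w^T B w = 1 involves the sum of all local B matrices, which is one way to read the "inseparable quadratic constraint" mentioned above.

```python
import numpy as np

def top_generalized_eigenpair(A, B):
    """Return (lam, w) with A w = lam * B w for the largest lam, and w^T B w = 1.

    Assumes A is symmetric and B is symmetric positive definite.
    """
    L = np.linalg.cholesky(B)              # B = L L^T
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.T                # standard symmetric problem C u = lam u
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    u = eigvecs[:, -1]                     # unit eigenvector of the largest eigenvalue
    w = L_inv.T @ u                        # map back: w = L^{-T} u
    return eigvals[-1], w

# Hypothetical setup: 3 agents, each forming local matrices from its own samples.
rng = np.random.default_rng(0)
local_A, local_B = [], []
for _ in range(3):
    X = rng.standard_normal((50, 4))
    Y = rng.standard_normal((50, 4))
    local_A.append(X.T @ X)
    local_B.append(Y.T @ Y + 1e-3 * np.eye(4))  # small ridge keeps B_i positive definite

A = sum(local_A)   # the global matrices are only available after aggregation
B = sum(local_B)
lam, w = top_generalized_eigenpair(A, B)
```

A centralized solve like this needs every agent's matrices in one place. The abstract's consensus formulation is aimed at avoiding that aggregation step.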

A Self-Attention Based Method for Facial Expression Recognition

7th International Conference on Computing and Artificial Intelligence, ICCAI 2021

Authors: Ling, Xufeng; Liang, Jingxin; Yang, Jie. Affiliations: Shanghai Normal University Tianhua College, AI School, No. 1661 North Sheng Xin Road, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, China

ISBN: (print) 9781450389501

We present a self-attention-based method, termed Vision Transformer (ViT), to efficiently classify human facial expressions. Our work makes two contributions. First, the facial expression image is divided into N×N patches, each of which corresponds to a word, and the whole image is treated as a paragraph composed of n words. Second, we design a learnable module, the ViT, with sequence length of L, latent dimension, and 12 attention layers, which are integrated together as a unified framework. We also train the proposed model on the normalized and augmented version of the FER2013plus dataset. We show empirically that ViT has superior performance compared to alternative approaches. © 2021 ACM.

Keywords: Face recognition
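The patch-as-word idea in this abstract can be illustrated with a short NumPy sketch. It is an assumption-laden illustration rather than the authors' preprocessing: the function name `image_to_patch_sequence`, the 48×48 grayscale input, and the grid size of 6 are all hypothetical choices, and the learnable projection and position embeddings that a ViT applies afterwards are omitted.

```python
import numpy as np

def image_to_patch_sequence(image, n):
    """Split an (H, W) image into an n x n grid of patches and flatten each patch.

    Returns an array of shape (n * n, (H // n) * (W // n)): one row per patch,
    ordered row-major over the grid, i.e. a "sentence" of n * n patch "words".
    """
    h, w = image.shape
    if h % n or w % n:
        raise ValueError("image height and width must be divisible by n")
    ph, pw = h // n, w // n
    # Index [i, a, j, b] addresses pixel (i * ph + a, j * pw + b).
    patches = image.reshape(n, ph, n, pw).transpose(0, 2, 1, 3)
    return patches.reshape(n * n, ph * pw)

# Stand-in for a 48x48 grayscale face crop (assumed size).
image = np.arange(48 * 48, dtype=np.float32).reshape(48, 48)
tokens = image_to_patch_sequence(image, n=6)
```

Each row of `tokens` would then be linearly embedded and fed to the attention layers as one sequence element.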

Variational Feature Disentanglement for Few-Shot Domain Adaptation

IEEE International Conference on Image Processing

Authors: Weiduo Wang; Yun Gu; Jie Yang. Affiliations: Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China; Institute of Medical Robotics, Shanghai Jiao Tong University, China; Shanghai Center for Brain Science and Brain-Inspired Technology

In this paper, we focus on the few-shot domain adaptation problem. With limited training data in the target domain, a new approach is emerging to acquire transferable knowledge from the source domain. Previous methods aligned the embedding space between domains by reducing the pair-wise distance. However, these methods reportedly suffer from misalignment and poor generalization. To solve this problem, we propose a variational feature disentanglement framework. The embedding features are explicitly disentangled into domain-invariant and domain-specific components. The distributions of domain-invariant variance are estimated and aligned by variational inference. For further disentanglement, the domain-invariant and domain-specific components are separated by orthogonal constraints on subspaces. Experiments on the Digits and VisDA-C datasets demonstrate that the proposed method can outperform state-of-the-art methods.

Detection of Venous Thromboembolism Using Recurrent Neural Networks with Time-Series Data

3rd Asia Conference on Algorithms, Computing and Machine Learning, CACML 2024

Authors: Xu, Can; Huang, Yaqin; Xiang, Xinni; Lei, Haike; Yang, Jie. Affiliations: Shanghai Jiao Tong University, Shanghai, China; Chongqing University Cancer Hospital, Chongqing, China; West China Hospital, Sichuan University, Chengdu, China; Shanghai Jiao Tong University, Institute of Image Processing and Pattern Recognition, China; Chongqing Key Laboratory of Translational Research for Cancer Metastasis and Individualized Treatment, China; West China School of Medicine, China; Chongqing Cancer Multi-omics Big Data Application Engineering Research Center, China

ISBN: (print) 9798400716416

Machine Learning (ML) has been widely applied to medical science for decades. It is common knowledge that the progress of many diseases is chronic and dynamic. Longitudinal data, or time-series data, describes certain diseases better, as it naturally captures dynamics over time. Venous thromboembolism (VTE), a regular complication of cancer tumour treatments, has been emphasized in many studies. Recent research has shown that ML algorithms contribute substantially to VTE risk estimation and prediction. However, existing methods rarely consider the dynamic changes during the development of VTE, and often make decisions based on time-point data, such as the latest blood test results of patients. In this paper, we propose a VTE detection model that captures dynamic information from time-series data. The model consists of a Recurrent Neural Network (RNN) based framework with time embedding and an attention mechanism to encode longitudinal features of patients. We test our proposed model on a real-world dataset that contains repeated blood test results of patients together with other clinical features. The network is compared with several ML models commonly used in previous VTE studies. The results show that our model outperforms all baseline models and achieves state-of-the-art performance with 80.0% accuracy and 0.881 AUC. © 2024 ACM.

Keywords: Risk assessment

Towards Robust Neural Networks Via Orthogonal Diversity

SSRN, 2023

Authors: Fang, Kun; Tao, Qinghua; Wu, Yingwen; Li, Tao; Cai, Jia; Cai, Feipeng; Huang, Xiaolin; Yang, Jie. Affiliations: Institute of Image Processing and Pattern Recognition, Department of Automation, Shanghai Jiao Tong University, Shanghai, China; ESAT-STADIUS, KU Leuven, Belgium; Central Media Technology Institute, Huawei Technologies Ltd., China

Deep Neural Networks (DNNs) are vulnerable to invisible perturbations on images generated by adversarial attacks, which has motivated research on the adversarial robustness of DNNs. A series of methods represented by adversarial training and its variants has proven to be among the most effective techniques for enhancing DNN robustness. Generally, adversarial training focuses on enriching the training data by involving perturbed data. This data augmentation effect of the perturbed data in adversarial training does not contribute to the robustness of the DNN itself and usually suffers from a drop in clean accuracy. Towards the robustness of the DNN itself, in this paper we propose a novel defense that aims at augmenting the model in order to learn features that are adaptive to diverse inputs, including adversarial examples. More specifically, to augment the model, multiple paths are embedded into the network, and an orthogonality constraint is imposed on these paths to guarantee the diversity among them. A margin-maximization loss is then designed to further boost such DIversity via Orthogonality (DIO). In this way, the proposed DIO augments the model and enhances the robustness of the DNN itself, as the learned features can be corrected by these mutually orthogonal paths. Extensive empirical results on various data sets, architectures, and attacks verify the adversarial robustness of the proposed DIO utilizing model augmentation. Besides, DIO can also be flexibly combined with different data augmentation techniques (e.g., TRADES and DDPM), further promoting robustness gains. © 2023, The Authors. All rights reserved.

Keywords: Deep neural networks
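One common way to express an orthogonality constraint between parallel paths is a pairwise penalty on their weight matrices. The sketch below assumes each path is represented by a single weight matrix and uses the squared Frobenius norm of W_i^T W_j. The abstract does not state the exact form used in DIO, so the function `orthogonality_penalty` and this choice of penalty are illustrative assumptions.

```python
import numpy as np

def orthogonality_penalty(paths):
    """Sum over pairs i < j of ||W_i^T W_j||_F^2.

    The penalty is zero exactly when the column spaces of all paths
    are mutually orthogonal.
    """
    total = 0.0
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            total += float(np.sum((paths[i].T @ paths[j]) ** 2))
    return total

# Build toy paths from disjoint columns of an orthogonal matrix Q.
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((8, 8)))
orthogonal_paths = [Q[:, :3], Q[:, 3:6]]                     # mutually orthogonal
correlated_paths = [Q[:, :3], Q[:, :3] + 0.1 * Q[:, 3:6]]    # second path overlaps the first

penalty_orthogonal = orthogonality_penalty(orthogonal_paths)
penalty_correlated = orthogonality_penalty(correlated_paths)
```

In training, a term like this would be added to the task loss with a weighting coefficient, so that gradient descent pushes the paths towards diversity.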

A Facial Expression Recognition System for Smart Learning Based on YOLO and Vision Transformer

7th International Conference on Computing and Artificial Intelligence, ICCAI 2021

Authors: Ling, Xufeng; Liang, Jingxin; Wang, Dong; Yang, Jie. Affiliations: Shanghai Normal University Tianhua College, AI School, No. 1661 North Sheng Xin Road, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, China

ISBN: (print) 9781450389501

This paper proposes a facial expression recognition system for smart learning in the classroom. First, YOLO is used to extract face images of multiple students from high-resolution video; second, the face images are preprocessed, and a self-attention based model named Vision Transformer (ViT) is used to recognize facial expressions; finally, the classified facial expressions are used to help the teacher analyze students' learning status, so as to provide suggestions for improving teaching effectiveness. © 2021 ACM.

Keywords: Students

Isoform Function Prediction Based on Heterogeneous Graph Attention Networks

2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023

Authors: Guo, Kuo; Li, Yifan; Chen, Hao; Shen, Hong-Bin; Yang, Yang. Affiliations: Shanghai Jiao Tong University, Key Lab. of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Department of Computer Science and Engineering, Shanghai 200240, China; Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Institute of Image Processing and Pattern Recognition, Shanghai 200240, China; Carnegie Mellon University, School of Computer Science, Computational Biology Department, Pittsburgh, PA 15213, United States

ISBN: (print) 9798350337488

Isoforms refer to different mRNA molecules transcribed from the same gene, which can be translated into proteins with varying structures and functions. Predicting the functions of isoforms is an essential topic in bioinformatics as it can provide valuable insights into the intricate mechanisms of gene regulation and biological processes. Conventionally, gene function labels are standardized in Gene Ontology (GO) terms. However, traditional methods for predicting isoform function are largely limited by the absence of isoform-specific labels, sparse annotations, and the vast number of GO terms. To address these issues, we propose HANIso, a deep learning-based method for isoform function prediction. HANIso leverages a pretrained protein language model to extract features from protein sequences. It also integrates heterogeneous information, such as isoform sequence features, GO annotations, and isoform interaction data, using a Heterogeneous Graph Attention Network (HAN). This allows the model to learn the importance of different sources of information and their semantic relationships through the attention mechanism. Our method can predict function labels at both the gene level and isoform level. We conduct experiments on two species datasets, and the results demonstrate that our method outperforms existing methods on both AUROC and AUPRC. HANIso has the potential to overcome the limitations of traditional methods and provide a more accurate and comprehensive understanding of isoform function. © 2023 IEEE.

Keywords: alternative splicing; gene ontology; heterogeneous graph attention network; isoform function prediction; protein language model

SliceProp: A Slice-Wise Bidirectional Propagation Model for Interactive 3D Medical Image Segmentation

1st IEEE International Conference on Medical Artificial Intelligence, MedAI 2023

Authors: Xu, Xin; Lu, Wenjing; Lei, Jiahao; Qiu, Peng; Shen, Hong-Bin; Yang, Yang. Affiliations: Shanghai Jiao Tong University, Key Lab. of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Department of Computer Science and Engineering, Shanghai 200240, China; Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Department of Vascular Surgery, China; Shanghai Jiao Tong University, Institute of Image Processing and Pattern Recognition, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China

ISBN: (print) 9798350358780

Interactive medical image segmentation methods have become increasingly popular in recent years. These methods combine manual labeling and automatic segmentation, reducing the workload of annotation while maintaining high accuracy. However, most current interactive segmentation frameworks are limited to 2D image data, and are not suitable for 3D image data due to the large size and high complexity of 3D data, as well as the challenges posed by information asymmetry and sparse annotation. In this paper, we propose SliceProp, an interactive segmentation framework that implements slice-wise Label Bidirectional Propagation (LBP) for 3D medical image segmentation. SliceProp extends the interactive 2D image segmentation algorithm to 3D image segmentation, and can handle 3D data with large size and high complexity. Moreover, equipped with a Backtracking Feedback Check (BFC) module, SliceProp effectively addresses the issues of information asymmetry and spatial sparse annotation in 3D medical image segmentation. Additionally, we adopt an uncertainty-based criterion to prioritize the slices to be refined interactively, which enhances the efficiency of the interaction process by enabling the model to focus on the regions with the most unreliable predictions. SliceProp is evaluated on two datasets and achieves promising results compared to state-of-the-art methods. © 2023 IEEE.

Keywords: Medical imaging
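An uncertainty-based slice ranking of the kind this abstract mentions could look like the sketch below. The abstract does not say which uncertainty measure SliceProp uses, so mean binary entropy per slice is an assumed stand-in, and `rank_slices_by_uncertainty` is a hypothetical helper name.

```python
import numpy as np

def rank_slices_by_uncertainty(prob_volume, eps=1e-12):
    """Rank slices of a (D, H, W) foreground-probability volume, most uncertain first.

    The score of a slice is the mean binary entropy of its pixel probabilities,
    which peaks at p = 0.5 and approaches zero for confident predictions.
    """
    p = np.clip(prob_volume, eps, 1.0 - eps)   # avoid log(0)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    slice_scores = entropy.mean(axis=(1, 2))
    order = np.argsort(-slice_scores)          # descending uncertainty
    return order, slice_scores

# Toy volume with three slices: confident, ambiguous, moderately confident.
volume = np.stack([
    np.full((4, 4), 0.99),
    np.full((4, 4), 0.50),
    np.full((4, 4), 0.80),
])
order, scores = rank_slices_by_uncertainty(volume)
```

The annotator would be shown the slices in `order`, so that corrections go first to the slice where the model is least sure.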

Anchor Graph Structure Fusion Hashing for Cross-Modal Similarity Search

arXiv, 2022

Authors: Wang, Lu; Yang, Jie; Zareapoor, Masoumeh; Zheng, Zhonglong. Affiliations: Institute of Image Processing and Pattern Recognition, Department of Automation, Shanghai Jiao Tong University, Shanghai 201100, China; Zhejiang Normal University, China

Cross-modal hashing (CMH) has been widely applied to retrieve items across modalities due to its superiority in fast computation and low storage. However, some challenges still need to be addressed: (1) most existing CMH methods take graphs, which are always predefined separately in each modality, as input to model data distribution. These methods fail to consider the correlation of graph structure among multiple modalities. Besides, cross-modal retrieval results highly rely on the quality of predefined affinity graphs; (2) most existing CMH methods deal with the preservation of intra- and inter-modal affinity independently to learn the binary codes, which ignores the fusion affinity among multi-modal data; (3) most existing CMH methods relax the discrete constraints to solve the optimization objective, which could significantly degrade the retrieval performance. To solve the above limitations, in this paper, we propose a novel Anchor Graph Structure Fusion Hashing (AGSFH). AGSFH constructs the anchor graph structure fusion matrix from different anchor graphs of multiple modalities with the Hadamard product, which can fully exploit the geometric property of underlying data structure across multiple modalities. Specifically, based on the anchor graph structure fusion matrix, AGSFH makes an attempt to directly learn an intrinsic anchor graph, where the structure of the intrinsic anchor graph is adaptively tuned so that the number of components of the intrinsic graph is exactly equal to the number of clusters. Based on this process, training instances can be clustered into semantic space. Besides, AGSFH preserves the anchor fusion affinity into the common binary Hamming space, capturing intrinsic similarity and structure across modalities by hash codes. Furthermore, a discrete optimization framework is designed to learn the unified binary codes across modalities. Extensive experimental results on three public social datasets demonstrate the superiority of AGS

Keywords: Graphic methods
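The Hadamard-product fusion step named in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: anchor graphs are built here as dense, row-normalized Gaussian affinities (practical anchor graphs are often sparsified to the nearest anchors), the first two samples serve as anchors, and the names `anchor_graph` and `fuse_anchor_graphs` are hypothetical. The intrinsic-graph learning and hashing stages described in the abstract are not covered.

```python
import numpy as np

def anchor_graph(X, anchors, sigma=1.0):
    """Row-normalized Gaussian affinities between samples X (n, d) and anchors (m, d)."""
    sq_dist = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    Z = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)

def fuse_anchor_graphs(graphs):
    """Element-wise (Hadamard) product of same-shaped anchor graphs."""
    fused = np.ones_like(graphs[0])
    for Z in graphs:
        fused = fused * Z
    return fused

# Toy paired data: 6 items, each with an image feature and a text feature.
rng = np.random.default_rng(0)
X_img = rng.standard_normal((6, 5))
X_txt = rng.standard_normal((6, 3))

# Assumption: the same item indices act as anchors in both modalities,
# so the two graphs share their (sample, anchor) layout.
Z_img = anchor_graph(X_img, X_img[:2])
Z_txt = anchor_graph(X_txt, X_txt[:2])
fused = fuse_anchor_graphs([Z_img, Z_txt])
```

Because every entry lies in (0, 1], a fused affinity is large only when a sample is close to the anchor in both modalities.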

Cross-Modal De-Deviation for Enhancing Few-Shot Classification

SSRN, 2023

Authors: Pan, Mei-Hong; Shen, Hong-Bin. Affiliations: School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China

Few-shot learning poses a critical challenge due to the deviation problem caused by the scarcity of available samples. In this work, we aim to address deviations in both feature representations and prototypes. To achieve this, we propose a cross-modal de-deviation framework that leverages class semantic information to provide robust prior knowledge for the samples. This framework begins with a visual-to-semantic autoencoder trained on the labeled samples to predict semantic features for the unlabeled samples. Then, we devise a binary linear programming model to incorporate the initial prototypes with the cluster centers of the unlabeled samples. To avoid a mismatch between the cluster centers and the initial prototypes, we conduct the label assignment process in the semantic space using the class ground truth semantic features as the reference points, with the cluster centers transformed into semantic representations. Moreover, we model a linear classifier with the concatenation of the refined prototypes and the class ground truth semantic features serving as the initial weights. Then we propose a novel optimization strategy based on the alternating least squares (ALS) model. From the ALS model, we can derive two closed-form solutions with respect to the features and weights, facilitating alternating optimization of them. Extensive experiments conducted on three standard benchmarks demonstrate the competitive advantages of our CMDD method over the state-of-the-art few-shot classification methods, confirming its effectiveness in reducing deviation. © 2023, The Authors. All rights reserved.

Keywords: Semantics
