Search results - Inner Mongolia University Library

LTSP: Long-term slice propagation for accurate airway segmentation

arXiv, 2022

Authors: Wu, Yangqian; Zhang, Minghui; Yu, Weihao; Zheng, Hao; Xu, Jiasheng; Gu, Yun. Affiliations: Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China

Purpose: Bronchoscopic intervention is a widely used clinical technique for pulmonary diseases, which requires an accurate and topologically complete airway map for localization and guidance. The airway map can be extracted automatically from chest computed tomography (CT) scans by airway segmentation methods. Due to the complex tree-like structure of the airway, preserving its topological completeness while maintaining segmentation accuracy is a challenging task. Methods: In this paper, a long-term slice propagation (LTSP) method is proposed for accurate airway segmentation from pathological CT scans. We also design a two-stage end-to-end segmentation framework that uses the LTSP method in the decoding process. Stage 1 generates a coarse feature map with an encoder-decoder architecture. Stage 2 adopts the proposed LTSP method to exploit continuity information and enhance the weak airway features in the coarse feature map. The final segmentation result is predicted from the refined feature map. Results: Extensive experiments were conducted to evaluate the performance of the proposed method on 70 clinical CT scans. The results demonstrate considerable improvements over some state-of-the-art methods, as most breakages are eliminated and more tiny bronchi are detected. The ablation studies further confirm the effectiveness of the constituents of the proposed method. Conclusion: Slice continuity information is beneficial to accurate airway segmentation. Furthermore, by propagating the long-term slice feature, airway topological connectivity is preserved while overall segmentation accuracy is maintained. Copyright © 2022, The Authors. All rights reserved.

Keywords: Computerized tomography

Unsupervised motion representation enhanced network for action recognition

arXiv, 2021

Authors: Yang, Xiaohang; Kong, Lingtong; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Learning reliable motion representations between consecutive frames, such as optical flow, has proven to greatly benefit video understanding. However, the TV-L1 method, an effective optical flow solver, is time-consuming, and caching the extracted optical flow is expensive in storage. To fill the gap, we propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator. UF-TSN estimates motion cues from adjacent frames in a coarse-to-fine manner and focuses on small displacements at each level by extracting a feature pyramid and warping one frame's features to the other according to the flow estimated at the previous level. Because action datasets lack labeled motion, we constrain the flow prediction with multi-scale photometric consistency and edge-aware smoothness. Compared with state-of-the-art unsupervised motion representation learning methods, our model achieves better accuracy while maintaining efficiency, and is competitive with some supervised or more complicated approaches. Copyright © 2021, The Authors. All rights reserved.

Keywords: Optical flows
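
The two unsupervised constraints named in the abstract above (photometric consistency and edge-aware smoothness) can be illustrated with a minimal single-scale, one-dimensional sketch. This is not the UF-TSN implementation: the function names, the linear-interpolation warp, the absolute-difference penalty, and the `alpha` weight are all assumptions chosen for illustration, and the abstract's multi-scale aspect is omitted.

```python
import math


def warp_1d(signal, flow):
    """Backward-warp a 1D signal: sample it at x + flow[x] with linear interpolation."""
    last = len(signal) - 1
    warped = []
    for x, displacement in enumerate(flow):
        pos = min(max(x + displacement, 0.0), float(last))  # clamp to the valid range
        left = int(math.floor(pos))
        right = min(left + 1, last)
        frac = pos - left
        warped.append((1.0 - frac) * signal[left] + frac * signal[right])
    return warped


def photometric_loss(frame1, frame2, flow):
    """Mean absolute difference between frame1 and frame2 warped by the flow."""
    warped = warp_1d(frame2, flow)
    return sum(abs(a - b) for a, b in zip(frame1, warped)) / len(frame1)


def edge_aware_smoothness(frame, flow, alpha=10.0):
    """Penalize flow gradients, down-weighted where the image itself has an edge."""
    terms = []
    for x in range(len(flow) - 1):
        flow_grad = abs(flow[x + 1] - flow[x])
        image_grad = abs(frame[x + 1] - frame[x])
        terms.append(flow_grad * math.exp(-alpha * image_grad))
    return sum(terms) / len(terms)


# Hypothetical toy frames: the bright pixel moves one position to the right.
frame1 = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
frame2 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
true_flow = [1.0] * 6
zero_flow = [0.0] * 6

print(photometric_loss(frame1, frame2, true_flow))  # the correct flow aligns the frames
print(photometric_loss(frame1, frame2, zero_flow))  # a wrong flow leaves a residual
```

In this toy setting the correct flow gives a zero photometric residual, while a flow discontinuity costs less when it coincides with an image edge than when it falls in a flat region.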

Unsupervised Motion Representation Enhanced Network for Action Recognition

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Xiaohang Yang; Lingtong Kong; Jie Yang. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

ISBN: (print) 9781728176055; 9781728176062

Keywords: Learning systems; Visualization; Image edge detection; Speech recognition; Signal processing; Feature extraction; Reliability

UniGNN: A unified framework for graph and hypergraph neural networks

arXiv, 2021

Authors: Huang, Jing; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

The hypergraph, an expressive structure with the flexibility to model higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt powerful GNN variants directly to hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. In this framework, meticulously designed architectures aiming to deepen GNNs can also be incorporated into hypergraphs with the least effort. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, where it outperforms the state-of-the-art approaches by a large margin. In particular, for the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://***/OneForward/UniGNN. © 2021, CC BY-NC-SA.

Keywords: Graph neural networks
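
Message passing on a hypergraph, as mentioned in the abstract above, is commonly described in two stages: vertices send features to their hyperedges, then hyperedges send features back to their vertices. The sketch below shows one such round with plain mean aggregation. The function name and the choice of mean aggregators are assumptions for illustration; the abstract does not specify UniGNN's actual aggregation functions, and no learned weights are included.

```python
def hypergraph_message_passing(features, hyperedges):
    """One round of two-stage mean aggregation on a hypergraph.

    features:   list of per-vertex feature vectors (lists of floats)
    hyperedges: list of vertex-index lists, one per hyperedge
    """
    dim = len(features[0])

    def mean(vectors):
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    # Stage 1: each hyperedge aggregates the features of its member vertices.
    edge_features = [mean([features[v] for v in edge]) for edge in hyperedges]

    # Stage 2: each vertex aggregates the features of its incident hyperedges.
    updated = []
    for v, own in enumerate(features):
        incident = [edge_features[e] for e, edge in enumerate(hyperedges) if v in edge]
        # A vertex that belongs to no hyperedge keeps its own feature.
        updated.append(mean(incident) if incident else list(own))
    return updated


# Hypothetical toy hypergraph: three vertices, two hyperedges sharing vertex 1.
features = [[1.0], [3.0], [5.0]]
hyperedges = [[0, 1], [1, 2]]
print(hypergraph_message_passing(features, hyperedges))  # [[2.0], [3.0], [4.0]]
```

An ordinary graph is the special case in which every hyperedge contains exactly two vertices, which is one way to see why a single formulation can cover both settings.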

OAS-Net: Occlusion aware sampling network for accurate optical flow

arXiv, 2021

Authors: Kong, Lingtong; Yang, Xiaohang; Yang, Jie. Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

Optical flow estimation is an essential step for many real-world computer vision tasks. Existing deep networks have achieved satisfactory results, mostly by employing a pyramidal coarse-to-fine paradigm, where a key process is to warp the target feature based on the previous flow prediction and correlate it with the source feature to build a 3D matching cost volume. However, the warping operation can lead to a troublesome ghosting problem that results in ambiguity. Moreover, occluded areas are treated the same as non-occluded regions in most existing works, which may cause performance degradation. To deal with these challenges, we propose a lightweight yet efficient optical flow network, named OAS-Net (occlusion aware sampling network), for accurate optical flow. First, a new sampling-based correlation layer is employed without the noisy warping operation. Second, a novel occlusion aware module is presented to make the raw cost volume conscious of occluded regions. Third, a shared flow and occlusion awareness decoder is adopted for structural compactness. Experiments on the Sintel and KITTI datasets demonstrate the effectiveness of the proposed approaches. Copyright © 2021, The Authors. All rights reserved.

Keywords: Optical flows
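
One plausible reading of a sampling-based correlation layer is that, instead of warping the whole target feature map first, each source position looks up target features directly at its flow-displaced location and a few neighboring offsets. The one-dimensional sketch below follows that reading. It is an assumption, not the OAS-Net layer: the function name, the integer flow, the dot-product similarity, and the zero cost for out-of-range samples are all illustrative choices.

```python
def sampled_cost_volume(source, target, flow, radius=1):
    """Correlate each source feature with target features sampled near x + flow[x].

    source, target: lists of feature vectors, one per position
    flow:           list of integer displacements from a coarser prediction
    Returns one row of 2 * radius + 1 matching costs per source position.
    Out-of-range samples contribute a cost of 0.0.
    """
    costs = []
    for x, src in enumerate(source):
        row = []
        for offset in range(-radius, radius + 1):
            pos = x + flow[x] + offset
            if 0 <= pos < len(target):
                row.append(sum(a * b for a, b in zip(src, target[pos])))
            else:
                row.append(0.0)
        costs.append(row)
    return costs


# Hypothetical toy features: the target holds the source content shifted right by one.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
costs = sampled_cost_volume(source, target, flow=[0, 0, 0], radius=1)
print(costs[0])  # [0.0, 0.0, 1.0]: the best match is at offset +1
```

Because the lookup reads the original target features, no interpolated (warped) copy of the target is created, which is the property the abstract associates with avoiding ghosting.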

Learn What You Need in Personalized Federated Learning

arXiv, 2024

Authors: Lv, Kexin; Ye, Rui; Huang, Xiaolin; Yang, Jie; Chen, Siheng. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai AI Laboratory, Shanghai 200232, China

Personalized federated learning aims to address data heterogeneity across local clients in federated learning. However, current methods blindly incorporate either full model parameters or predefined partial parameters in personalized federated learning. They fail to customize the manner of collaboration according to each local client's data characteristics, causing unsatisfactory aggregation results. To address this essential issue, we propose Learn2pFed, a novel algorithm-unrolling-based personalized federated learning framework, enabling each client to adaptively select which part of its local model parameters should participate in collaborative training. The key novelty of the proposed Learn2pFed is to optimize each local model parameter's degree of participation in collaboration as a learnable parameter via algorithm unrolling methods. This approach brings two benefits: 1) mathematically determining the participation degree of local model parameters in the federated collaboration, and 2) obtaining more stable and improved solutions. Extensive experiments on various tasks, including regression, forecasting, and image classification, demonstrate that Learn2pFed significantly outperforms previous personalized federated learning methods. Copyright © 2024, The Authors. All rights reserved.

Keywords: Parameter estimation
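
The idea of a per-parameter degree of participation can be pictured with a toy blend between a client's local parameters and the aggregated ones. This is only an illustration of the concept, not the Learn2pFed update rule: the function name and the convex-combination form are assumptions, and the participation values are fixed by hand here, whereas the abstract describes learning them through algorithm unrolling.

```python
def blend_parameters(local_params, global_params, participation):
    """Mix each local parameter with its aggregated counterpart.

    participation[i] lies in [0, 1]: 0 keeps parameter i fully local,
    1 replaces it with the aggregated (collaborative) value.
    """
    if not (len(local_params) == len(global_params) == len(participation)):
        raise ValueError("all inputs must have the same length")
    return [
        (1.0 - p) * w_local + p * w_global
        for w_local, w_global, p in zip(local_params, global_params, participation)
    ]


# Hypothetical values: the first parameter stays local, the last is fully shared.
local_params = [0.2, -1.0, 3.0]
global_params = [0.6, 1.0, 1.0]
participation = [0.0, 0.5, 1.0]
print(blend_parameters(local_params, global_params, participation))  # [0.2, 0.0, 1.0]
```

Setting every participation value to 1 corresponds to sharing the full model, and fixing a 0/1 mask in advance corresponds to predefined partial sharing, the two strategies the abstract contrasts with an adaptive choice.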

Generating Cartoon Images from Face Photos with Cycle-Consistent Adversarial Networks

Computers, Materials & Continua, 2021, vol. 69, no. 11, pp. 2733-2747

Authors: Tao Zhang; Zhanjie Zhang; Wenjing Jia; Xiangjian He; Jie Yang. Affiliations: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China; Key Laboratory of Artificial Intelligence, Jiangsu 214000, China; The Global Big Data Technologies Centre, University of Technology Sydney, Ultimo, NSW 2007, Australia; The Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 201100, China

The generative adversarial network (GAN) was first proposed in 2014. This kind of network model is a machine learning system that can learn to measure a given distribution of data, and one of its most important applications is style *** transfer is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output ***-GAN is a classic GAN model, which has a wide range of scenarios in style *** its unsupervised learning characteristics, the mapping is easily learned between an input image and an output ***, it is difficult for CYCLE-GAN to converge and generate high-quality *** order to solve this problem, spectral normalization is introduced into each convolutional kernel of the *** convolutional kernel reaches the Lipschitz stability constraint when spectral normalization is added, and the value of the convolutional kernel is limited to [0, 1], which promotes the training process of the proposed ***, we use a pretrained model (VGG16) to control the loss of image content in the position of l1 *** avoid overfitting, an l1 regularization term and an l2 regularization term are both used in the object loss *** terms of Frechet Inception Distance (FID) score evaluation, our proposed model achieves outstanding performance and preserves more discriminative *** results show that the proposed model converges faster and achieves better FID scores than the state of the art.

Keywords: Generative adversarial network; spectral normalization; Lipschitz stability constraint; VGG16; l1 regularization term; l2 regularization term; Frechet inception distance
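
Spectral normalization, which the abstract above applies to the convolutional kernels, is usually implemented by estimating a weight matrix's largest singular value with power iteration and dividing the weights by it. The sketch below shows that generic procedure on a small matrix. It is not the paper's code: the helper names are hypothetical, a convolutional kernel is assumed to have been reshaped to a 2D matrix beforehand, and the all-ones starting vector is a simplification (practical implementations typically start from a random vector).

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def l2_normalize(vector, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / (norm + eps) for x in vector]


def spectral_norm(weight, n_iters=50):
    """Estimate the largest singular value of `weight` by power iteration."""
    weight_t = transpose(weight)
    u = l2_normalize([1.0] * len(weight))  # simplification: fixed starting vector
    for _ in range(n_iters):
        v = l2_normalize(matvec(weight_t, u))
        u = l2_normalize(matvec(weight, v))
    # sigma is approximately u^T W v once u and v have converged
    return sum(ui * wi for ui, wi in zip(u, matvec(weight, v)))


def spectrally_normalize(weight, n_iters=50):
    """Rescale `weight` so that its largest singular value is approximately 1."""
    sigma = spectral_norm(weight, n_iters)
    return [[w / sigma for w in row] for row in weight]


# Hypothetical 2x2 weight matrix with singular values 3 and 1.
weight = [[3.0, 0.0], [0.0, 1.0]]
print(spectral_norm(weight))                        # approximately 3.0
print(spectral_norm(spectrally_normalize(weight)))  # approximately 1.0
```

Bounding the largest singular value by 1 makes the corresponding linear map 1-Lipschitz, which is the sense in which this operation relates to a Lipschitz constraint.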

Online LiDAR-Camera Extrinsic Parameters Self-checking

arXiv, 2022

Authors: Wei, Pengjin; Yan, Guohang; Li, Yikang; Fang, Kun; Yang, Jie; Liu, Wei. Affiliations: The Institute of Image Processing and Pattern Recognition, Department of Automation, Shanghai Jiao Tong University, China; The Autonomous Driving Group, Shanghai AI Laboratory, China

With the development of neural networks and the increasing popularity of autonomous driving, the calibration of the LiDAR and the camera has attracted more and more attention. This calibration task is multi-modal: the rich color and texture information captured by the camera and the accurate three-dimensional spatial information from the LiDAR are both highly significant for downstream tasks. Current research mainly focuses on obtaining accurate calibration results through information fusion. However, it seldom analyzes whether the calibrated results are correct, which could be of significant importance in real-world applications. For example, in large-scale production, the LiDARs and cameras of each smart car have to be well calibrated as the car leaves the production line, while for the rest of the car's life, the poses of the LiDARs and cameras should also be continually monitored to ensure safety. To this end, this paper proposes a self-checking algorithm to judge whether the extrinsic parameters are well calibrated by introducing a binary classification network based on the fused information from the camera and the LiDAR. Moreover, since no dataset exists for this task, we further generate a new dataset branch from the KITTI dataset tailored for the task. Our experiments on the proposed dataset branch demonstrate the performance of our method. To the best of our knowledge, this is the first work to address the significance of continually checking the calibrated extrinsic parameters for autonomous driving. The code is open-sourced on GitHub at https://***/OpenCalib/LiDAR2camera_self-check. Copyright © 2022, The Authors. All rights reserved.

Keywords: Cameras
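
To see why extrinsic parameters matter for fusing the two sensors, it helps to project a LiDAR point into the image with a standard pinhole model. The sketch below does this for one point and shows how a small error in the extrinsic translation shifts the projected pixel. It illustrates only the geometry, not the paper's binary classification network; the function name, the intrinsic values, and the test point are hypothetical.

```python
def project_lidar_point(point, rotation, translation, fx, fy, cx, cy):
    """Project one 3D LiDAR point into pixel coordinates.

    rotation (3x3) and translation (length 3) are the extrinsic parameters that
    map LiDAR coordinates to camera coordinates; fx, fy, cx, cy are pinhole
    intrinsics. Returns (u, v), or None if the point lies behind the camera.
    """
    cam = [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0.0:
        return None
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


# Hypothetical setup: identity rotation, a point 10 m in front of the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
point = [1.0, 0.5, 10.0]
intrinsics = dict(fx=700.0, fy=700.0, cx=600.0, cy=180.0)

calibrated = project_lidar_point(point, identity, [0.0, 0.0, 0.0], **intrinsics)
miscalibrated = project_lidar_point(point, identity, [0.5, 0.0, 0.0], **intrinsics)
print(calibrated, miscalibrated)
```

With these assumed numbers, a 0.5 m error in the extrinsic translation moves the projected point by tens of pixels, so LiDAR depth no longer lines up with the image content at that location.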

A New Head Pose Estimation Method Using Vision Transformer Model 2021

7th International Conference on Computing and Artificial Intelligence, ICCAI 2021

Authors: Ling, Xufeng; Wang, Dong; Yang, Jie. Affiliations: Shanghai Normal University Tianhua College, AI School, No. 1661 North Sheng Xin Road, China; Institute of Image Processing and Pattern Recognition, Shanghai Normal University Tianhua College, China

ISBN: (print) 9781450389501

In this paper, a self-attention-based Vision Transformer (VIT) method is introduced to estimate human head pose parameters. First, the head pose image is divided into 32×32 patches; each image patch is regarded as a word, and the whole image is treated by the VIT as a paragraph composed of n words. Image recognition can then be regarded as the semantic recognition of this paragraph. Next, we redesign the regression VIT to estimate the parameters. Then we select the Head Pose Database as the training and validation dataset. The VIT is trained on the enhanced and normalized dataset. Finally, the trained VIT is used to regress the head pose parameters on testing samples. Experimental results show that VIT has high accuracy and good generalization ability for head pose estimation. © 2021 ACM.

Keywords: Image recognition
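
The patch step described in the abstract above, where an image is cut into patches that act as "words", can be sketched as follows. The sketch assumes that "32×32" refers to a patch size of 32 by 32 pixels and that the image dimensions are multiples of the patch size; the function names and the toy image are hypothetical, and the learned projection and position embeddings a Vision Transformer applies afterwards are left out.

```python
def split_into_patches(image, patch_size=32):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Returns the patches in row-major order; each patch is a list of rows.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


def flatten_patch(patch):
    """Flatten one patch into a 1D token vector (before any learned projection)."""
    return [value for row in patch for value in row]


# A hypothetical 64x96 grayscale image yields (64 / 32) * (96 / 32) = 6 patch tokens.
image = [[(r * 96 + c) % 256 for c in range(96)] for r in range(64)]
tokens = [flatten_patch(p) for p in split_into_patches(image)]
print(len(tokens), len(tokens[0]))  # 6 1024
```

Each token here is the raw 1024-value pixel vector of one patch, so the sequence length n in the abstract's "paragraph of n words" corresponds to the number of patches.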