Search Results - Inner Mongolia University Library

Watermarking for Out-of-Distribution Detection

Proceedings of the 36th International Conference on Neural Information Processing Systems

Authors: Qizhou Wang, Feng Liu, Yonggang Zhang, Jing Zhang, Chen Gong, Tongliang Liu, Bo Han. Affiliations: Department of Computer Science, Hong Kong Baptist University; School of Mathematics and Statistics, The University of Melbourne; School of Computer Science, The University of Sydney; PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of MoE and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology; TML Lab, The University of Sydney.

ISBN: (print) 9781713871088

Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models. However, existing methods largely ignore the reprogramming property of deep models and thus may not fully unleash their intrinsic strength: without modifying parameters of a well-trained deep model, we can reprogram this model for a new purpose via data-level manipulation (e.g., adding a specific feature perturbation to the data). This property motivates us to reprogram a classification model to excel at OOD detection (a new task), and thus we propose a general methodology named watermarking in this paper. Specifically, we learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking. Extensive experiments verify the effectiveness of watermarking, demonstrating the significance of the reprogramming property of deep models in OOD detection.
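
The reprogramming idea in the abstract above can be sketched in a few lines: the classifier's parameters stay frozen and only an additive input pattern is learned. This is a minimal illustration, not the authors' training procedure. The linear classifier, the maximum-softmax score, the `separation` objective, and the finite-difference ascent are all assumptions chosen to keep the example self-contained.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def frozen_logits(x, weights, biases):
    # Hypothetical frozen linear classifier; its parameters are never updated.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def msp_score(x, watermark, weights, biases):
    # Maximum softmax probability of the watermarked input
    # (higher is read as "more in-distribution").
    marked = [v + w for v, w in zip(x, watermark)]
    return max(softmax(frozen_logits(marked, weights, biases)))

def separation(watermark, id_data, ood_data, weights, biases):
    # Assumed objective: mean score on in-distribution data minus mean score on OOD data.
    id_mean = sum(msp_score(x, watermark, weights, biases) for x in id_data) / len(id_data)
    ood_mean = sum(msp_score(x, watermark, weights, biases) for x in ood_data) / len(ood_data)
    return id_mean - ood_mean

def learn_watermark(id_data, ood_data, weights, biases, steps=100, lr=0.5, eps=1e-4):
    # Finite-difference gradient ascent over the watermark only; the best
    # watermark seen so far is kept, starting from the all-zero pattern.
    dim = len(id_data[0])
    current = [0.0] * dim
    best = current[:]
    best_value = separation(current, id_data, ood_data, weights, biases)
    for _ in range(steps):
        base = separation(current, id_data, ood_data, weights, biases)
        grad = []
        for i in range(dim):
            bumped = current[:]
            bumped[i] += eps
            grad.append((separation(bumped, id_data, ood_data, weights, biases) - base) / eps)
        current = [c + lr * g for c, g in zip(current, grad)]
        value = separation(current, id_data, ood_data, weights, biases)
        if value > best_value:
            best, best_value = current[:], value
    return best
```

Calling `learn_watermark(id_data, ood_data, weights, biases)` returns one pattern that is added to every input before scoring with `msp_score`. The model weights passed in are read but never changed.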

Divide and Conquer: Hybrid Pre-training for Person Search

arXiv, 2023

Authors: Tian, Yanling; Chen, Di; Liu, Yunan; Yang, Jian; Zhang, Shanshan. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; School of Artificial Intelligence, Dalian Maritime University, China.

Large-scale pre-training has proven to be an effective method for improving performance across different tasks. Current person search methods use ImageNet pre-trained models for feature extraction, yet this is not an optimal solution due to the gap between the pre-training task and the person search task (as a downstream task). Therefore, in this paper, we focus on pre-training for person search, which involves detecting and re-identifying individuals simultaneously. Although labeled data for person search is scarce, datasets for its two sub-tasks, person detection and re-identification, are relatively abundant. To this end, we propose a hybrid pre-training framework specifically designed for person search using sub-task data only. It consists of a hybrid learning paradigm that handles data with different kinds of supervision, and an intra-task alignment module that alleviates domain discrepancy under limited resources. To the best of our knowledge, this is the first work that investigates how to support full-task pre-training using sub-task data. Extensive experiments demonstrate that our pre-trained model can achieve significant improvements across diverse protocols, such as person search method, fine-tuning data, pre-training data and model backbone. For example, our model improves ResNet50-based NAE by 10.3% in relative mAP. Our code and pre-trained models are released for plug-and-play usage to the person search community (https://***/personsearch/PretrainPS). Copyright © 2023, The Authors. All rights reserved.

Keywords: Machine learning

Center-Based Decoupled Point Cloud Registration for 6D Object Pose Estimation

International Conference on Computer Vision (ICCV)

Authors: Haobo Jiang, Zheng Dang, Shuo Gu, Jin Xie, Mathieu Salzmann, Jian Yang. Affiliations: PCA Lab, Nanjing University of Science and Technology, China; PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; CVLab, EPFL, Switzerland.

In this paper, we propose a novel center-based decoupled point cloud registration framework for robust 6D object pose estimation in real-world scenarios. Our method decouples the translation from the entire transformation by predicting the object center and estimating the rotation in a center-aware manner. This center offset-based translation estimation is correspondence-free, freeing us from the difficulty of constructing correspondences in challenging scenarios, thus improving robustness. To obtain reliable center predictions, we use a multi-view (bird’s eye view and front view) object shape description of the source-point features, with both views jointly voting for the object center. Additionally, we propose an effective shape embedding module to augment the source features, largely completing the missing shape information due to partial scanning, thus facilitating the center prediction. With the center-aligned source and model point clouds, the rotation predictor utilizes feature similarity to establish putative correspondences for SVD-based rotation estimation. In particular, we introduce a center-aware hybrid feature descriptor with a normal correction technique to extract discriminative, part-aware features for high-quality correspondence construction. Our experiments show that our method outperforms the state-of-the-art methods by a large margin on real-world datasets such as TUD-L, LINEMOD, and Occluded-LINEMOD. Code is available at https://***/JiangHB/CenterReg.
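
The decoupling described above, translation from object centers and rotation from center-aligned correspondences, can be illustrated in 2D. This is a rough sketch under strong assumptions rather than the paper's pipeline: the centroid of the observed points stands in for the network-predicted object center, correspondences are taken as given, and a closed-form 2D least-squares angle replaces the SVD step used for 3D rotations. All function names are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def estimate_rotation_2d(src_centered, dst_centered):
    # Least-squares rotation angle between corresponding, center-aligned 2D points.
    # (In 3D this step would be solved with an SVD of the cross-covariance matrix.)
    sin_sum = sum(s[0] * d[1] - s[1] * d[0] for s, d in zip(src_centered, dst_centered))
    cos_sum = sum(s[0] * d[0] + s[1] * d[1] for s, d in zip(src_centered, dst_centered))
    return math.atan2(sin_sum, cos_sum)

def register_decoupled(model_pts, scene_pts):
    # Step 1: centers. The scene centroid is a stand-in for a predicted object center.
    c_model = centroid(model_pts)
    c_scene = centroid(scene_pts)
    # Step 2: rotation from the center-aligned point sets.
    model_c = [(x - c_model[0], y - c_model[1]) for x, y in model_pts]
    scene_c = [(x - c_scene[0], y - c_scene[1]) for x, y in scene_pts]
    theta = estimate_rotation_2d(model_c, scene_c)
    # Step 3: translation that maps the rotated model center onto the scene center.
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    tx = c_scene[0] - (cos_t * c_model[0] - sin_t * c_model[1])
    ty = c_scene[1] - (sin_t * c_model[0] + cos_t * c_model[1])
    return theta, (tx, ty)
```

With complete, noise-free point sets the centroid is an exact center, so the sketch recovers the transform exactly. With partial scans the centroid is biased, which is the situation the learned center prediction in the abstract is meant to handle.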

UnDeepLIO: Unsupervised deep lidar-inertial odometry

arXiv, 2021

Authors: Tu, Yiming; Xie, Jin. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing, China; Jiangsu Key Lab of Image Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Extensive research efforts have been dedicated to deep learning-based odometry. Nonetheless, few efforts have been made on unsupervised deep lidar odometry. In this paper, we design a novel framework for unsupervised lidar odometry with the IMU, which has not been used in other deep methods. First, a pair of siamese LSTMs is used to obtain the initial pose from the linear acceleration and angular velocity of the IMU. With the initial pose, we perform a rigid transform on the current frame and align it to the last frame. Then we extract vertex and normal features from the transformed point clouds and their normals. Next, a two-branch attention module is proposed to estimate residual rotation and translation from the extracted vertex and normal features, respectively. Finally, our model outputs the sum of the initial and residual poses as the final pose. For unsupervised training, we introduce an unsupervised loss function that is applied to the voxelized point clouds. The proposed approach is evaluated on the KITTI odometry estimation benchmark and achieves performance comparable to other state-of-the-art methods. Copyright © 2021, The Authors. All rights reserved.

Keywords: Optical radar

Ultra-High-Definition Image HDR Reconstruction via Collaborative Bilateral Learning

International Conference on Computer Vision (ICCV)

Authors: Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Tao Wang, Xiuyi Jia. Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology; Jiangsu Key Laboratory of Image and Video Understanding for Social Safety, Nanjing University of Science and Technology; SKLOIS, IIE, CAS; Huawei Noah’s Ark Lab.

ISBN: (print) 9781665428132

Existing single-image high dynamic range (HDR) reconstruction methods attempt to expand the range of illuminance. They are not effective in generating plausible textures and colors in the reconstructed results, especially for high-density pixels in ultra-high-definition (UHD) images. To address these problems, we propose a new HDR reconstruction network for UHD images by collaboratively learning color and texture details. First, we propose a dual-path network to extract the content and chromatic features at a reduced resolution of the low dynamic range (LDR) input. These two types of features are used to fit bilateral-space affine models for real-time HDR reconstruction. To extract the main data structure of the LDR input, we propose to use 3D Tucker decomposition and reconstruction to prevent pseudo edges and noise amplification in the learned bilateral grid. As a result, the high-quality content and chromatic features can be reconstructed by capitalizing on guided bilateral upsampling. Finally, we fuse these two full-resolution feature maps into the HDR reconstructed results. Our proposed method can achieve real-time processing for UHD images (about 160 fps). Experimental results demonstrate that the proposed algorithm performs favorably against the state-of-the-art HDR reconstruction approaches on public benchmarks and real-world UHD images.

Keywords: Visualization; Three-dimensional displays; Image resolution; Image color analysis; Image edge detection; Reconstruction algorithms; Dynamic range

Cascading Enhancement Representation for Face Anti-Spoofing

SSRN, 2023

Authors: Ma, Yimei; Dong, Yangwei; Qian, Jianjun; Wong, Wai Keung; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; Institute of Textiles and Clothing, The Hong Kong Polytechnic University, Hong Kong.

Face anti-spoofing (FAS) is the first line of security defense in face recognition systems. The majority of current methods focus on distinguishing live faces from spoof faces by designing adaptive networks with auxiliary information to enhance feature discrimination, which shows that achieving a discriminative representation is also vital to solving the FAS task. In this paper, motivated by the idea of cascading enhancement, we propose a novel cascading enhancement representation network (CERN) for effective FAS. Specifically, CERN utilizes two branches to obtain multi-level features in the cascading enhancement feature extraction stage. The first branch employs the backbone network to concatenate the multi-scale features in conjunction with attention modules. The second branch utilizes the shared attention modules to enhance the input space for learning the multi-level refinement features. In the cascading enhancement feature fusion stage, we transmit the high-level feature to the middle level (mid-level) to enhance the mid-level representation. The new mid-level feature is then used to enhance the low-level feature. Moreover, a weight map learning scheme is proposed to further enhance the discrimination of the predicted binary map. Additionally, we use meta-learning to extend CERN to cross-database testing. Experiments on five benchmark databases demonstrate the effectiveness of our method against state-of-the-art methods. © 2023, The Authors. All rights reserved.

Keywords: Face recognition

SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation

arXiv, 2023

Authors: Jiang, Haobo; Salzmann, Mathieu; Dang, Zheng; Xie, Jin; Yang, Jian. Affiliations: PCA Lab, Nanjing University of Science and Technology, China; CVLab, EPFL, Switzerland; ClearSpace, Switzerland; PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China.

In this paper, we introduce an SE(3) diffusion model-based point cloud registration framework for 6D object pose estimation in real-world scenarios. Our approach formulates the 3D registration task as a denoising diffusion process, which progressively refines the pose of the source point cloud to obtain a precise alignment with the model point cloud. Training our framework involves two operations: an SE(3) diffusion process and an SE(3) reverse process. The SE(3) diffusion process gradually perturbs the optimal rigid transformation of a pair of point clouds by continuously injecting noise (perturbation transformation). By contrast, the SE(3) reverse process focuses on learning a denoising network that refines the noisy transformation step by step, bringing it closer to the optimal transformation for accurate pose estimation. Unlike standard diffusion models used in linear Euclidean spaces, our diffusion model operates on the SE(3) manifold. This requires exploiting the linear Lie algebra se(3) associated with SE(3) to constrain the transformation transitions during the diffusion and reverse processes. Additionally, to effectively train our denoising network, we derive a registration-specific variational lower bound as the optimization objective for model learning. Furthermore, we show that our denoising network can be constructed with a surrogate registration model, making our approach applicable to different deep registration networks. Extensive experiments demonstrate that our diffusion registration framework presents outstanding pose estimation performance on the real-world TUD-L, LINEMOD, and Occluded-LINEMOD datasets. Code is available at https://***/JiangHB/DiffusionReg. Copyright © 2023, The Authors. All rights reserved.

Keywords: Diffusion
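
To make the role of the Lie algebra in the SE(3) diffusion entry above concrete, the sketch below samples Gaussian noise in se(3), maps it to a rigid transformation with the exponential map, and composes it with a pose. It is only an illustration of one forward noising step. The isotropic noise, the `sigma` values, and the left-composition convention are assumptions, not the schedule or parameterization used in the paper.

```python
import math
import random

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def hat(w):
    # Skew-symmetric matrix of a 3-vector.
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

def se3_exp(omega, u):
    # Exponential map se(3) -> SE(3): omega is the rotational part, u the translational part.
    theta = math.sqrt(sum(c * c for c in omega))
    K = hat(omega)
    K2 = mat_mul(K, K)
    if theta < 1e-8:
        a, b, c = 1.0, 0.5, 1.0 / 6.0  # series limits for small angles
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    eye = identity()
    R = [[eye[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * K[i][j] + c * K2[i][j] for j in range(3)] for i in range(3)]
    return R, mat_vec(V, u)

def perturb_pose(R, t, sigma, rng):
    # One noising step: sample xi in se(3), then left-compose exp(xi) with (R, t).
    xi = [rng.gauss(0.0, sigma) for _ in range(6)]
    R_n, t_n = se3_exp(xi[:3], xi[3:])
    R_new = mat_mul(R_n, R)
    t_new = [x + y for x, y in zip(mat_vec(R_n, t), t_n)]
    return R_new, t_new

def forward_diffuse(R, t, sigmas, rng):
    # Apply a sequence of noising steps and keep every intermediate pose.
    poses = [(R, t)]
    for sigma in sigmas:
        R, t = perturb_pose(R, t, sigma, rng)
        poses.append((R, t))
    return poses
```

Because the noise is sampled in the linear space se(3) and only then mapped to the group, every intermediate pose stays a valid rigid transformation, which would not hold if noise were added directly to the matrix entries.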

Graph Matching Optimization Network for Point Cloud Registration

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Qianliang Wu, Yaqi Shen, Haobo Jiang, Guofeng Mei, Yaqing Ding, Lei Luo, Jin Xie, Jian Yang. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology; Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia.

Point cloud registration is a fundamental and challenging problem in 3D computer vision. Recent works often utilize geometric structure features in downsampled points (patches) to seek correspondences, then propagate these sparse patch correspondences to the dense level in the corresponding patches' neighborhoods. However, they neglect the explicit global-scale rigid constraint in dense-level point matching. We claim that the explicit isometry-preserving constraint at the dense level on a global scale is also important for improving feature representation in the training stage. To this end, we propose a Graph Matching Optimization based Network (GMONet for short), which utilizes a graph-matching optimizer to explicitly exert isometry-preserving constraints in point feature training to improve the point feature representation. Specifically, we exploit a partial graph-matching optimizer to enhance the super point (i.e., down-sampled key point) features and a full graph-matching optimizer to improve the dense-level point features in the overlap region. Meanwhile, we leverage the inexact proximal point method and the mini-batch sampling technique to accelerate these two graph-matching optimizers. Given highly discriminative point features in the evaluation stage, we utilize the RANSAC approach to estimate the transformation between the scanned pairs. The proposed method has been evaluated on the 3DMatch/3DLoMatch and the KITTI datasets. The experimental results show that our method performs competitively compared to state-of-the-art baselines.
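
The isometry-preserving constraint mentioned above rests on a simple fact: a rigid transformation keeps pairwise distances, so two correct correspondences must agree on the distance between their endpoints. The sketch below only checks that pairwise compatibility and counts votes per match. It is not the graph-matching optimizer from the paper, and the function names and the tolerance `tol` are hypothetical.

```python
import math

def pairwise_residual(src, dst, match_a, match_b):
    # |dist(src_i, src_k) - dist(dst_j, dst_l)| for matches (i, j) and (k, l).
    (i, j), (k, l) = match_a, match_b
    return abs(math.dist(src[i], src[k]) - math.dist(dst[j], dst[l]))

def consistency_scores(src, dst, matches, tol=0.05):
    # For each match, count how many other matches are distance-compatible with it.
    scores = [0] * len(matches)
    for a in range(len(matches)):
        for b in range(a + 1, len(matches)):
            if pairwise_residual(src, dst, matches[a], matches[b]) <= tol:
                scores[a] += 1
                scores[b] += 1
    return scores
```

Matches with low scores violate the rigid constraint against most of the others and are likely outliers. A graph-matching formulation would optimize over such pairwise terms jointly instead of thresholding them one pair at a time.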

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

arXiv, 2022

Authors: Zhang, Chuang; Shen, Li; Yang, Jian; Gong, Chen. Affiliations: PCA Lab, The Key Lab. of Intelligent Percept. and Syst. for High-Dimensional Info. of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; Jiangsu Key Lab of Image and Video Understanding for Social Security, China; JD Explore Academy, China; College of Computer Science, Nankai University, China.

The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods. To exploit this effect, model prediction-based methods have been widely adopted, which use the outputs of DNNs in the early stage of learning to correct noisy labels. However, we observe that the model will make mistakes during label prediction, resulting in unsatisfactory performance. By contrast, the features produced in the early stage of learning show better robustness. Inspired by this observation, in this paper, we propose a novel feature embedding-based method for deep learning with label noise, termed LabEl Noise Dilution (LEND). To be specific, we first compute a similarity matrix based on the current embedded features to capture the local structure of the training data. Then, the noisy supervision signals carried by mislabeled data are overwhelmed by nearby correctly labeled ones (i.e., label noise dilution), the effectiveness of which is guaranteed by the inherent robustness of feature embedding. Finally, the training data with diluted labels are further used to train a robust classifier. Empirically, we conduct extensive experiments on both synthetic and real-world noisy datasets by comparing our LEND with several representative robust learning approaches. The results verify the effectiveness of our LEND. © 2022, CC BY.

Keywords: Deep neural networks
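
The label noise dilution step in the LEND entry above can be approximated with a small example: build similarities from feature embeddings, then replace each sample's label with a similarity-weighted average over its nearest samples. The abstract does not give the exact similarity matrix or dilution rule, so the cosine similarity, the k-nearest-neighbour truncation, and the parameter `k` here are assumptions for illustration only.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def dilute_labels(features, labels, num_classes, k=3):
    # For each sample, average the one-hot labels of its k most similar samples
    # (itself included), weighted by non-negative cosine similarity.
    diluted = []
    for f in features:
        sims = [(max(cosine(f, g), 0.0), j) for j, g in enumerate(features)]
        sims.sort(reverse=True)
        top = sims[:k]
        total = sum(s for s, _ in top)
        soft = [0.0] * num_classes
        for s, j in top:
            soft[labels[j]] += s / total
        diluted.append(soft)
    return diluted

def hard_labels(soft_labels):
    # Class with the largest diluted weight for each sample.
    return [max(range(len(row)), key=lambda c: row[c]) for row in soft_labels]
```

A mislabeled sample that sits inside a cluster of correctly labeled neighbours receives most of its diluted weight from them, so its noisy label is outvoted. The diluted labels could then be used as training targets for a classifier.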