Search results - Inner Mongolia University Library

International Conference on Computer Vision (ICCV)

Authors: Aiming Hao, Yuecong Min, Xilin Chen. Affiliations: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

ISBN: (print) 9781665428132

In recent years, deep learning has moved video-based Continuous Sign Language Recognition (CSLR) significantly forward. A typical CSLR network currently combines a visual module, which focuses on spatial and short-term temporal information, with a contextual module, which focuses on long-term temporal information, and is trained with the Connectionist Temporal Classification (CTC) loss. However, owing to the limitation of the chain rule in back-propagation, the visual module is hard to adjust toward optimized visual features. As a result, the contextual module is forced to optimize contextual information only, rather than balancing efficient visual and contextual information. In this paper, we propose a Self-Mutual Knowledge Distillation (SMKD) method, which makes the visual and contextual modules focus on short-term and long-term information respectively and enhances the discriminative power of both modules simultaneously. Specifically, the visual and contextual modules share the weights of their corresponding classifiers and are trained with the CTC loss simultaneously. Moreover, the spike phenomenon is widespread with the CTC loss. Although it helps select a few key frames of a gloss, it drops the other frames of the gloss and causes the visual features to saturate at an early stage. A gloss segmentation is developed to relieve the spike phenomenon and decrease saturation in the visual module. We conduct experiments on two CSLR benchmarks: PHOENIX14 and PHOENIX14-T. Experimental results demonstrate the effectiveness of SMKD.

Keywords: Training; Deep learning; Visualization; Computer vision; Gesture recognition; Assistive technologies; Benchmark testing
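The spike phenomenon mentioned in the abstract above can be illustrated with standard greedy CTC decoding, which collapses consecutive repeats and then drops blanks. This is a minimal sketch, not the authors' code: the blank symbol `-`, the helper names, and the per-frame labels are made up for illustration.

```python
from itertools import groupby

BLANK = "-"  # hypothetical symbol for the CTC blank label


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse consecutive repeated labels, then drop blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


def spike_frames(frame_labels, blank=BLANK):
    """Indices of frames whose best label is not the blank."""
    return [i for i, label in enumerate(frame_labels) if label != blank]


# Made-up per-frame argmax labels for a nine-frame clip containing two glosses.
frames = ["-", "-", "HELLO", "-", "-", "-", "WORLD", "WORLD", "-"]
decoded = ctc_greedy_decode(frames)
spikes = spike_frames(frames)
print(decoded, spikes)
```

In this toy clip only three of nine frames carry a gloss label; the remaining frames are assigned to the blank, which is the behaviour the abstract describes as dropping the other frames of a gloss.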


A Prompting-based Approach for Adversarial Example Generation and Robustness Enhancement

arXiv, 2022

Authors: Yang, Yuting; Huang, Pei; Cao, Juan; Li, Jintao; Lin, Yun; Dong, Jin Song; Ma, Feifei; Zhang, Jian. Affiliations: Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Beijing, China; National University of Singapore, Singapore; Laboratory of Parallel Software and Computational Science, ISCAS, Beijing, China

Recent years have seen the wide application of NLP models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness and vulnerabilities. In this paper, we propose a novel prompt-based adversarial attack to compromise NLP models, together with a robustness enhancement technique. We first construct malicious prompts for each instance and generate adversarial examples via mask-and-filling under the effect of a malicious purpose. Our attack technique targets the inherent vulnerabilities of NLP models, allowing us to generate samples even without interacting with the victim NLP model, as long as it is based on pre-trained language models (PLMs). Furthermore, we design a prompt-based adversarial training method to improve the robustness of PLMs. As our training method does not actually generate adversarial samples, it can be applied to large-scale training sets efficiently. The experimental results show that our attack method achieves a high attack success rate with more diverse, fluent and natural adversarial examples. In addition, our robustness enhancement method significantly improves the robustness of models against adversarial attacks. Our work indicates that the prompting paradigm has great potential for probing some fundamental flaws of PLMs and for fine-tuning them for downstream tasks. Copyright © 2022, The Authors. All rights reserved.

Keywords: Natural language processing systems


Deep Learning for Logo Detection: A Survey

arXiv, 2022

Authors: Hou, Sujuan; Li, Jiacheng; Min, Weiqing; Hou, Qiang; Zhao, Yanna; Zheng, Yuanjie; Jiang, Shuqiang. Affiliations: School of Information Science and Engineering, Shandong Normal University, Shandong 250358, China; The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China

As more and more logos are created, logo detection has gradually become a research hotspot across many domains and tasks. Recent advances in this area are dominated by deep learning-based solutions, where many datasets, learning strategies, network architectures, etc. have been employed. This paper reviews the advances in applying deep learning techniques to logo detection. First, we give a comprehensive account of public datasets designed to facilitate performance evaluation of logo detection algorithms, which tend to be more diverse, more challenging, and more reflective of real life. Next, we perform an in-depth analysis of the existing logo detection strategies and the strengths and weaknesses of each learning strategy. Subsequently, we summarize the applications of logo detection in various fields, from intelligent transportation and brand monitoring to copyright and trademark compliance. Finally, we analyze the potential challenges and present future directions for the development of logo detection to complete this survey. Copyright © 2022, The Authors. All rights reserved.

Keywords: Surveys


Gaussian-Hermite Moment Invariants of General Multi-Channel Functions

arXiv, 2022

Authors: Mo, Hanlin; Li, Hua; Zhao, Guoying. Affiliations: The Center for Machine Vision and Signal Analysis, University of Oulu, Oulu FI-90014, Finland; The Key Lab of Intelligent Information Processing, The Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; The School of Information and Technology, Northwest University, Xi'an 710069, China

With the development of data acquisition technology, large amounts of multi-channel data are collected and widely used in many fields. Most of them, such as RGB images and vector fields, can be expressed as different types of multi-channel functions. Feature extraction from multi-channel data for identifying patterns of interest is a critical but challenging task. This paper focuses on constructing moment-based features of general multi-channel functions. Specifically, we define two transform models, rotation-affine transform and total rotation transform, to describe real deformations of multi-channel data. Then, we design a structural framework to generate Gaussian-Hermite moment invariants for these two transform models systematically. It is the first time that a unified framework has been proposed in the literature to construct orthogonal moment invariants of general multi-channel functions. Given a specific type of multi-channel data, we demonstrate how to use the new method to derive all possible invariants and eliminate dependences among them. We obtain independent sets of invariants with low orders and low degrees for RGB images, 2D vector fields and color volume data. Based on synthetic and real multi-channel data, we conduct extensive experiments to evaluate the stability and discriminability of these invariants and their robustness to noise. The results show that the new moment invariants significantly outperform previous moment invariants of multi-channel data in RGB image classification and vortex detection in 2D vector fields. Copyright © 2022, The Authors. All rights reserved.

Keywords: Image classification


Modeling human travel and social contact with multi-layer networks for epidemic prediction

9th IEEE International Conference on Bioinformatics and Computational Biology, ICBCB 2021

Authors: Duan, Wei; Wang, Tao; Wang, Peng; Ju, Rusheng; Wang, Xiao; Yang, Tian. Affiliations: College of Systems Engineering, National University of Defense Technology, Changsha City, China; State Key Laboratory of Complex Systems Management and Control, Institute of Automation, Chinese Academy of Sciences, Beijing City, China; Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha City, China

ISBN: (print) 9780738132020

Representing human travel and social contact reasonably is a key issue in epidemic models. Various approaches have been used to model human mobility and contact over long or short ranges, such as Brownian motion, random walks, spatial networks, gravity models, and contact networks. We propose a method of representing human daily movement and social contact using multi-layer networks with temporal edge weights. We combine bipartite networks with social networks to describe human daily trips and social contact, respectively. Temporal edge weights of the multi-layer networks denote the propensity of individual movement and contact. We also verify our models and parameters by incorporating regularities of human daily travel and contact, and by comparing experimental results with statistical laws of human behavior. Finally, we use a Chinese university campus as a case study to investigate students' daily travel and social contact, and study the transmission and control strategies of the COVID-19 virus. We find that stricter control strategies are needed to mitigate the transmission of the COVID-19 virus in a university. Once a case emerges in a university, it is better to close the campus and quarantine all students. Partial control strategies, such as quarantining some of the students and buildings, do not substantially mitigate the transmission of the COVID-19 virus. Our work is beneficial for practitioners in the field of computational epidemiology. © 2021 IEEE.

Keywords: Viruses


Learning to Distill Global Representation for Sparse-View CT


International Conference on Computer Vision (ICCV)

Authors: Zilong Li, Chenglong Ma, Jie Chen, Junping Zhang, Hongming Shan. Affiliations: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China

Sparse-view computed tomography (CT), which uses a small number of projections for tomographic reconstruction, enables a much lower radiation dose to patients and accelerated data acquisition. The reconstructed images, however, suffer from strong artifacts, greatly limiting their diagnostic value. Current trends for sparse-view CT turn to the raw data for better information recovery. The resultant dual-domain methods, nonetheless, suffer from secondary artifacts, especially in ultra-sparse view scenarios, and their generalization to other scanners/protocols is greatly limited. A crucial question arises: have the image post-processing methods reached the limit? Our answer is not yet. In this paper, we stick to image post-processing methods because of their great flexibility and propose a global representation (GloRe) distillation framework for sparse-view CT, termed GloReDi. First, we propose to learn GloRe with Fourier convolution, so each element in GloRe has an image-wide receptive field. Second, unlike methods that only use the full-view images for supervision, we propose to distill GloRe from intermediate-view reconstructed images that are readily available but not explored in previous literature. The success of GloRe distillation is attributed to two key components: representation directional distillation to align the GloRe directions, and band-pass-specific contrastive distillation to gain clinically important details. Extensive experiments demonstrate the superiority of the proposed GloReDi over the state-of-the-art methods, including dual-domain ones. The source code is available at https://***/longzilicart/GloReDi.
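The claim that a Fourier-domain operation gives each output element an image-wide receptive field can be checked on a toy 1-D signal. The sketch below is an illustration only and is not taken from GloReDi: it uses a naive DFT, hand-picked spectral weights, and hypothetical function names, whereas the paper's Fourier convolution is a learned 2-D layer.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def spectral_layer(x, weights):
    """Scale each frequency by a weight, then return to the signal domain."""
    mixed = [w * s for w, s in zip(weights, dft(x))]
    return [v.real for v in idft(mixed)]


# Hand-picked symmetric weights (an assumption), so the output stays real.
weights = [1.0, 0.5, 0.25, 0.5]
base = spectral_layer([0.0, 0.0, 0.0, 0.0], weights)
bumped = spectral_layer([1.0, 0.0, 0.0, 0.0], weights)
changed = [abs(a - b) > 1e-9 for a, b in zip(base, bumped)]
print(changed)
```

Perturbing a single input position changes every output position, because per-frequency scaling is equivalent to a circular convolution whose kernel spans the whole signal.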


Twin Contrastive Learning with Noisy Labels

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhizhong Huang, Junping Zhang, Hongming Shan. Affiliations: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; Institute of Science and Technology for Brain-inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Shanghai Center for Brain Science and Brain-inspired Technology, Shanghai, China

Learning from noisy data is a challenging task that significantly degenerates the model performance. In this paper, we present TCL, a novel twin contrastive learning model to learn robust representations and handle noisy labels for classification. Specifically, we construct a Gaussian mixture model (GMM) over the representations by injecting the supervised model predictions into GMM to link label-free latent variables in GMM with label-noisy annotations. Then, TCL detects the examples with wrong labels as the out-of-distribution examples by another two-component GMM, taking into account the data distribution. We further propose a cross-supervision with an entropy regularization loss that bootstraps the true targets from model predictions to handle the noisy labels. As a result, TCL can learn discriminative representations aligned with estimated labels through mixup and contrastive learning. Extensive experimental results on several standard benchmarks and real-world datasets demonstrate the superior performance of TCL. In particular, TCL achieves 7.5% improvements on CIFAR-10 with 90% noisy labels, an extremely noisy scenario. The source code is available at https://***/Hzzone/TCL.


Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective

arXiv, 2022

Authors: Zhao, Yunrui; Xu, Qianqian; Jiang, Yangbangyan; Wen, Peisong; Huang, Qingming. Affiliations: School of Computer Science and Technology, University of Chinese Academy of Sciences, China; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS, China; State Key Laboratory of Information Security, Institute of Information Engineering, CAS, China; School of Cyber Security, University of Chinese Academy of Sciences, China; Key Laboratory of Big Data Mining and Knowledge Management, University of Chinese Academy of Sciences, China

Positive-Unlabeled (PU) learning tries to learn binary classifiers from a few labeled positive examples with many unlabeled ones. Compared with ordinary semi-supervised learning, this task is much more challenging due to the absence of any known negative labels. While existing cost-sensitive-based methods have achieved state-of-the-art performance, they explicitly minimize the risk of classifying unlabeled data as negative samples, which might result in a negative-prediction preference of the classifier. To alleviate this issue, we resort to a label distribution perspective for PU learning in this paper. Because the label distribution of unlabeled data is fixed when the class prior is known, it can naturally be used as learning supervision for the model. Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions, which is formulated by aligning their expectations. Moreover, we further adopt entropy minimization and Mixup regularization to avoid the trivial solution of the label distribution consistency on unlabeled data and to mitigate the consequent confirmation bias. Experiments on three benchmark datasets validate the effectiveness of the proposed method. Code available at: https://***/Ray-rui/Dist-PU-Positive-UnlabeledLearning-from-a-label-Distribution-Perspective. Copyright © 2022, The Authors. All rights reserved.

Keywords: Entropy
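The expectation alignment and the trivial solution described in the abstract above can be made concrete with a small numerical sketch. The absolute-difference form of the loss, the function names, and the toy predictions are assumptions made for illustration; the abstract does not give the exact formulation.

```python
import math


def expectation_alignment_loss(pos_probs, unl_probs, class_prior):
    """Align mean predictions with the known label expectations.

    Labeled positives should average to 1; unlabeled data should average
    to the class prior. The absolute difference is an assumed penalty.
    """
    mean_pos = sum(pos_probs) / len(pos_probs)
    mean_unl = sum(unl_probs) / len(unl_probs)
    return abs(mean_pos - 1.0) + abs(mean_unl - class_prior)


def mean_binary_entropy(probs, eps=1e-12):
    """Average binary entropy of the predicted probabilities."""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
    return total / len(probs)


prior = 0.3
positives = [1.0, 1.0]
trivial = [0.3] * 10                 # every unlabeled sample predicted at the prior
confident = [1.0] * 3 + [0.0] * 7    # 30% confidently positive, 70% negative

loss_trivial = expectation_alignment_loss(positives, trivial, prior)
loss_confident = expectation_alignment_loss(positives, confident, prior)
print(loss_trivial, loss_confident)
print(mean_binary_entropy(trivial), mean_binary_entropy(confident))
```

Both prediction sets match the expected label distribution almost exactly, so the alignment term alone cannot tell them apart; the entropy term is much larger for the constant prediction, which is why an entropy penalty discourages the trivial solution.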


Hard-instance learning for quantum adiabatic prime factorization

Physical Review A, 2022, vol. 105, no. 6, p. 062455

Authors: Jian Lin, Zhengfeng Zhang, Junping Zhang, Xiaopeng Li. Affiliations: State Key Laboratory of Surface Physics, Institute of Nanoelectronics and Quantum Computing, and Department of Physics, Fudan University, Shanghai 200433, China; Shanghai Key Lab of Intelligent Information Processing and School of Computer Science, Fudan University, Shanghai 200433, China; Shanghai Qi Zhi Institute, Xuhui District, Shanghai 200032, China; Shanghai Research Center for Quantum Sciences, Shanghai 201315, China

Prime factorization is a difficult problem for classical computing, whose exponential hardness is the foundation of Rivest-Shamir-Adleman cryptography. With programmable quantum devices, adiabatic quantum computing has been proposed as a plausible approach to solve prime factorization, with a promising advantage over classical computing. Here, we find there are certain hard instances that are consistently intractable for both classical simulated annealing and unconfigured adiabatic quantum computing (AQC). Aiming at an automated architecture for optimal configuration of quantum adiabatic factorization, we apply a deep reinforcement learning (RL) method to configure the AQC algorithm. By setting the success probability of the worst-case problem instances as the reward to RL, we show the AQC performance on the hard instances is dramatically improved by RL configuration. The success probability also becomes more evenly distributed over different problem instances, meaning the configured AQC is more stable than the unconfigured case. Through a technique of transfer learning, we find prominent evidence that the framework of AQC configuration is scalable: the configured AQC trained on five qubits keeps working efficiently on nine qubits with a minimal amount of additional training cost.

Keywords: Machine learning; Quantum algorithms; Quantum computation; Quantum simulation
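Both simulated annealing and adiabatic quantum computing treat factorization as the minimization of a cost over bit strings. The sketch below shows one common way to write such a cost, the squared residual (N - p*q)^2 with both factors forced to be odd. This encoding and all function names are assumptions, since the abstract does not state the cost function, and the exhaustive search merely stands in for the annealing or adiabatic search.

```python
from itertools import product


def bits_to_odd_int(bits):
    """Decode bits into an odd integer; the lowest bit is fixed to 1."""
    value = 1
    for position, bit in enumerate(bits, start=1):
        value += bit << position
    return value


def factoring_cost(n, p_bits, q_bits):
    """Squared residual; it is zero exactly when p * q == n."""
    return (n - bits_to_odd_int(p_bits) * bits_to_odd_int(q_bits)) ** 2


def zero_cost_factor_pairs(n, n_p_bits, n_q_bits):
    """Enumerate all bit strings and keep the zero-cost (ground-state) ones."""
    pairs = []
    for bits in product((0, 1), repeat=n_p_bits + n_q_bits):
        p_bits, q_bits = bits[:n_p_bits], bits[n_p_bits:]
        if factoring_cost(n, p_bits, q_bits) == 0:
            pairs.append((bits_to_odd_int(p_bits), bits_to_odd_int(q_bits)))
    return pairs


pairs = zero_cost_factor_pairs(35, 2, 2)
print(pairs)
```

With two free bits per factor, each candidate ranges over 1, 3, 5, 7, and the only zero-cost assignments for 35 are the two orderings of 5 and 7.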


Adaptive Nonlinear Latent Transformation for Conditional Face Editing


International Conference on Computer Vision (ICCV)

Authors: Zhizhong Huang, Siteng Ma, Junping Zhang, Hongming Shan. Affiliations: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China

Recent works on face editing usually manipulate the latent space of StyleGAN via linear semantic directions. However, they usually suffer from the entanglement of facial attributes, need to tune the optimal editing strength, and are limited to binary attributes with strong supervision signals. This paper proposes a novel adaptive nonlinear latent transformation for disentangled and conditional face editing, termed AdaTrans. Specifically, AdaTrans divides the manipulation process into several finer steps; i.e., the direction and size at each step are conditioned on both the facial attributes and the latent codes. In this way, AdaTrans describes an adaptive nonlinear transformation trajectory to manipulate the faces into target attributes while keeping other attributes unchanged. Then, AdaTrans leverages a predefined density model to constrain the learned trajectory in the distribution of latent codes by maximizing the likelihood of the transformed latent code. Moreover, we also propose a disentangled learning strategy under a mutual information framework to eliminate the entanglement among attributes, which can further relax the need for labeled data. Consequently, AdaTrans enables controllable face editing with the advantages of disentanglement, flexibility with non-binary attributes, and high fidelity. Extensive experimental results on various facial attributes demonstrate the qualitative and quantitative effectiveness of the proposed AdaTrans over existing state-of-the-art methods, especially in the most challenging scenarios with a large age gap and few labeled examples. The source code is available at https://***/Hzzone/AdaTrans.

