Search Results - Inner Mongolia University Library

Proceedings of the 37th International Conference on Neural Information Processing Systems

Authors: Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, and Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China, and Guangdong Key Laboratory of Information Security Technology, and Tencent AI Lab; Tencent AI Lab; The Hong Kong University of Science and Technology; School of Computer Science and Engineering, Sun Yat-sen University, and Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China, and Guangdong Key Laboratory of Information Security Technology

There is strong demand for customizing pretrained large text-to-image models, e.g. Stable Diffusion, to generate novel concepts, such as the users themselves. However, a concept newly added by previous customization methods often shows weaker combination abilities than the original concepts, even when several images are given during training. We thus propose a new personalization method that seamlessly integrates a unique individual into the pre-trained diffusion model using just one facial photograph and only 1024 learnable parameters, in under 3 minutes. We can then generate images of this person in any pose or position, interacting with anyone and doing anything imaginable from text prompts. To achieve this, we first analyze and build a well-defined celeb basis from the embedding space of the pre-trained large text encoder. Then, given one facial photo as the target identity, we generate its embedding by optimizing the weights of this basis while locking all other parameters. Empowered by the proposed celeb basis, the new identity in our customized model shows better concept combination ability than previous personalization methods. In addition, our model can learn several new identities at once and have them interact with each other, where previous customization models fail. Project page is at: http://***. Code is at: https://***/ygtxr1997/CelebBasis.
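The idea of learning only the weights of a fixed basis can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: the names `combine` and `fit_weights`, the three-dimensional vectors, and the squared-error objective against a known `target` embedding are all hypothetical stand-ins (the abstract does not state the training objective). Only the weights are updated; the mean and the basis stay frozen.

```python
def combine(mean, basis, weights):
    """Return mean + sum_k weights[k] * basis[k], using plain lists of floats."""
    out = list(mean)
    for w, vec in zip(weights, basis):
        for i, v in enumerate(vec):
            out[i] += w * v
    return out


def fit_weights(mean, basis, target, lr=0.1, steps=500):
    """Fit only the basis weights by gradient descent; mean and basis are frozen.

    Assumption: a squared-error loss against a known target vector stands in
    for whatever training signal the real method uses.
    """
    weights = [0.0] * len(basis)
    for _ in range(steps):
        emb = combine(mean, basis, weights)
        residual = [e - t for e, t in zip(emb, target)]
        # Gradient of 0.5 * ||emb - target||^2 w.r.t. weights[k] is <residual, basis[k]>.
        grads = [sum(r * b for r, b in zip(residual, vec)) for vec in basis]
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Hypothetical toy data: a 3-dimensional embedding space with two basis vectors.
mean = [0.5, 0.5, 0.5]
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [1.5, -0.5, 0.5]
weights = fit_weights(mean, basis, target)
```

Because the number of learnable values equals the number of basis vectors, the parameter count stays small regardless of the embedding dimension.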


A light rule-based approach to English subject-verb agreement errors on the third person singular forms


29th Pacific Asia Conference on Language, Information and Computation, PACLIC 2015

Authors: Wang, Yuzhu; Zhao, Hai; Shi, Dan. Affiliations: Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Key Lab. of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; LangYing NLP Research Institute, Shanghai Lang Ying Education Technology Co. Ltd, China

Verb errors are among the most common grammar errors made by non-native writers of English. This work focuses on an important type of verb usage error, subject-verb agreement for the third person singular forms, which accounts for a high proportion of the errors made by non-native English learners. Existing work has not given a satisfactory solution for this task: methods using supervised learning usually fail to deliver good enough performance, and rule-based methods depend on advanced linguistic resources such as syntactic parsers. In this paper, we propose a rule-based method to detect and correct these errors. The proposed method relies on a series of rules to automatically locate the subject and predicate in four types of sentences. The evaluation shows that the proposed method gives state-of-the-art performance with quite limited linguistic resources.
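To make the task concrete, here is a minimal sketch of one possible agreement rule. It is not the paper's rule set, which locates subject and predicate across four sentence types. The function names, the pronoun list, and the caller-supplied `base_verbs` set are hypothetical, and the rule only handles a singular pronoun immediately followed by a base-form verb.

```python
SINGULAR_PRONOUNS = {"he", "she", "it"}


def third_person_form(verb):
    """Inflect a regular base-form verb for the third person singular."""
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    if verb.endswith("y") and len(verb) > 1 and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    return verb + "s"


def correct_agreement(sentence, base_verbs):
    """Fix 'he/she/it + base verb' pairs; base_verbs is a caller-supplied lexicon."""
    tokens = sentence.split()
    for i in range(len(tokens) - 1):
        subject = tokens[i].lower()
        verb = tokens[i + 1].lower()
        if subject in SINGULAR_PRONOUNS and verb in base_verbs:
            tokens[i + 1] = third_person_form(verb)
    return " ".join(tokens)


verbs = {"walk", "go", "study"}
fixed = correct_agreement("She walk to school", verbs)
```

A rule this narrow misses noun subjects, intervening adverbs, and irregular verbs such as "have", which is why a fuller system needs rules for locating the subject and predicate.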

Keywords: Errors


On comparing different metric learning schemes for deep feature based person re-identification with camera adaption


2019 IEEE International Conference on Real-time Computing and Robotics, RCAR 2019

Authors: Wu, Wanyin; Yang, Zhao; Tao, Dapeng; Zhang, Qieshi; Cheng, Jun. Affiliations: FIST LAB, School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650091, China; CAS Key Lab. of Hum.-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; School of Mechanical and Electric Engineering, Guangzhou University, Guangzhou, Guangdong 510006, China; Chinese University of Hong Kong, Hong Kong 999077, Hong Kong

ISBN: (print) 9781728137261

Person re-identification, as a branch of image retrieval, has an extremely important application in public safety. In the past few decades, researchers have improved its accuracy through a variety of methods, including increasing data volumes and developing feature extraction or metric learning schemes. As data volumes increased, researchers began to turn from handcrafted feature algorithms to deep learning. Now, with the emergence of deep learning and large-scale data, we need to discuss which kind of metric learning performs better on the re-identification problem. In this paper, to address the data shortage problem and learn properties that are invariant across cameras, we choose a camera adaptation method to increase the data volume. We conduct comprehensive experiments on two datasets with camera adaptation and combine recent advances in feature extraction and metric learning. To ensure fairness, all methods use a unified code library that includes 4 deep feature extraction networks and 8 metric learning methods. The experimental results show that, even as deep learning continues to improve, the traditional metric method based on Euclidean distance also achieves commendable results. © 2019 IEEE.
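The Euclidean-distance baseline mentioned in the abstract amounts to ranking gallery features by their distance to a query feature. The sketch below shows that ranking step only. The function names, person identifiers, and two-dimensional feature vectors are hypothetical, and in practice the features would come from a deep feature extraction network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Return gallery ids sorted from nearest to farthest from the query."""
    return sorted(gallery, key=lambda pid: euclidean(query, gallery[pid]))


# Hypothetical 2-dimensional features; real re-identification features are much larger.
gallery = {
    "person_a": [0.9, 0.1],
    "person_b": [0.1, 0.9],
    "person_c": [0.6, 0.4],
}
ranking = rank_gallery([1.0, 0.0], gallery)
```

A learned metric would replace `euclidean` with a trained distance function while leaving the ranking step unchanged.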

Keywords: Cameras


Online training through time for spiking neural networks


Proceedings of the 36th International Conference on Neural Information Processing Systems

Authors: Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, Zhouchen Lin. Affiliations: Key Lab. of Machine Perception (MoE), School of Intelligence Science and Technology, Peking University; The Chinese University of Hong Kong, Shenzhen, and Shenzhen Research Institute of Big Data; Center for Data Science, Academy for Advanced Interdisciplinary Studies, Peking University; Key Lab. of Machine Perception (MoE), School of Intelligence Science and Technology, Peking University, and Institute for Artificial Intelligence, Peking University, and Peng Cheng Laboratory, China

ISBN: (print) 9781713871088

Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency. Particularly, backpropagation through time (BPTT) with surrogate gradients (SG) is popularly used to enable models to achieve high performance in a very small number of time steps. However, it is at the cost of large memory consumption for training, lack of theoretical clarity for optimization, and inconsistency with the online property of biological learning rules and rules on neuromorphic hardware. Other works connect the spike representations of SNNs with equivalent artificial neural network formulation and train SNNs by gradients from equivalent mappings to ensure descent directions. But they fail to achieve low latency and are also not online. In this work, we propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning by tracking presynaptic activities and leveraging instantaneous loss and gradients. Meanwhile, we theoretically analyze and prove that the gradients of OTTT can provide a similar descent direction for optimization as gradients from equivalent mapping between spike representations under both feedforward and recurrent conditions. OTTT only requires constant training memory costs agnostic to time steps, avoiding the significant memory costs of BPTT for GPU training. Furthermore, the update rule of OTTT is in the form of three-factor Hebbian learning, which could pave a path for online on-chip learning. With OTTT, it is the first time that the two mainstream supervised SNN training methods, BPTT with SG and spike representation-based training, are connected, and meanwhile it is in a biologically plausible form. Experiments on CIFAR-10, CIFAR-100, ImageNet, and CIFAR10-DVS demonstrate the superior performance of our method on large-scale static and neuromorphic datasets in a small number of time steps.
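The memory argument in the abstract can be illustrated with a running trace of presynaptic activity. This is a rough sketch under assumptions, not the paper's algorithm: the exponentially decayed trace, the `decay` value, and the placeholder `post_grad` (standing in for an instantaneous loss gradient at the postsynaptic neuron) are all hypothetical choices made for illustration.

```python
def update_trace(trace, spikes, decay):
    """Decay the running presynaptic trace and add the current input spikes."""
    return [decay * a + s for a, s in zip(trace, spikes)]


def weight_gradient(post_grad, trace):
    """Outer product of a per-neuron postsynaptic gradient and the input trace."""
    return [[g * a for a in trace] for g in post_grad]


decay = 0.5                               # hypothetical decay factor
trace = [0.0, 0.0]                        # one entry per presynaptic neuron
spike_train = [[1, 0], [1, 1], [0, 1]]    # input spikes over three time steps
grad_sum = [[0.0, 0.0]]                   # one postsynaptic neuron, two inputs

for spikes in spike_train:
    trace = update_trace(trace, spikes, decay)
    # Placeholder gradient of 1.0 for the single postsynaptic neuron.
    step = weight_gradient([1.0], trace)
    grad_sum = [
        [acc + new for acc, new in zip(acc_row, new_row)]
        for acc_row, new_row in zip(grad_sum, step)
    ]
```

Only `trace` and `grad_sum` are carried between steps, so storage does not grow with the number of time steps, in contrast to storing the full history for backpropagation through time.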


ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation


arXiv, 2023

Authors: Fu, Shenghao; Yan, Junkai; Gao, Yipeng; Xie, Xiaohua; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Pengcheng Lab, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China

Recent sparse detectors with multiple, e.g. six, decoder layers achieve promising performance but require much inference time due to complex heads. Previous works have explored using dense priors as initialization and built one-decoder-layer detectors. Although they gain remarkable acceleration, their performance still lags behind their six-decoder-layer counterparts by a large margin. In this work, we aim to bridge this performance gap while retaining fast speed. We find that the architecture discrepancy between dense and sparse detectors leads to feature conflict, hampering the performance of one-decoder-layer detectors. Thus we propose Adaptive Sparse Anchor Generator (ASAG), which predicts dynamic anchors on patches rather than grids in a sparse way so that it alleviates the feature conflict problem. For each image, ASAG dynamically selects which feature maps and which locations to predict, forming a fully adaptive way to generate image-specific anchors. Further, a simple and effective Query Weighting method eases the training instability from adaptiveness. Extensive experiments show that our method outperforms dense-initialized ones and achieves a better speed-accuracy trade-off. The code is available at https://***/iSEE-laboratory/ASAG. Copyright © 2023, The Authors. All rights reserved.

Keywords: Decoding


Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training


arXiv, 2023

Authors: Wu, Xiao-Ming; Zheng, Dian; Liu, Zuhao; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Pengcheng Lab, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China

Binarization of neural networks is a dominant paradigm in neural network compression. The pioneering work BinaryConnect uses the Straight Through Estimator (STE) to mimic the gradients of the sign function, but it also causes a crucial inconsistency problem. Most previous methods design different estimators in place of STE to mitigate it. However, they ignore the fact that reducing the estimating error concomitantly decreases gradient stability. These highly divergent gradients harm model training and increase the risk of vanishing and exploding gradients. To take gradient stability fully into consideration, we present a new perspective on BNN training, regarding it as an equilibrium between the estimating error and the gradient stability. In this view, we first design two indicators to quantitatively demonstrate the equilibrium phenomenon. In addition, to balance the estimating error and the gradient stability well, we revise the original straight through estimator and propose a power function based estimator, the Rectified Straight Through Estimator (ReSTE for short). Compared to other estimators, ReSTE is rational and capable of flexibly balancing the estimating error with the gradient stability. Extensive experiments on the CIFAR-10 and ImageNet datasets show that ReSTE has excellent performance and surpasses the state-of-the-art methods without any auxiliary modules or losses. Copyright © 2023, The Authors. All rights reserved.
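For readers unfamiliar with the straight through estimator, the sketch below shows the basic mechanism on scalars: the forward pass applies the sign function, and the backward pass lets the upstream gradient through as if the function were the identity. The clipping window is a commonly used variant and is an assumption here. The function names are hypothetical, and ReSTE's power-function estimator is not reproduced because the abstract does not give its form.

```python
def sign(x):
    """Forward pass: binarize a real value to +1.0 or -1.0 (zero maps to +1.0)."""
    return 1.0 if x >= 0 else -1.0


def ste_backward(x, upstream_grad, clip=1.0):
    """Backward pass of a clipped STE.

    The true derivative of sign is zero almost everywhere, so the estimator
    passes the upstream gradient through unchanged when |x| <= clip and
    blocks it otherwise. The clip window is an assumed, commonly used variant.
    """
    return upstream_grad if abs(x) <= clip else 0.0


latent_weights = [0.3, -2.0, -0.7]
binary_weights = [sign(w) for w in latent_weights]
grads = [ste_backward(w, 0.5) for w in latent_weights]
```

The gap between the identity surrogate and the true sign function is the estimating error the abstract refers to; sharper surrogates reduce that gap but produce larger, less stable gradients near zero.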

Keywords: Stability


Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training


International Conference on Computer Vision (ICCV)

Authors: Xiao-Ming Wu, Dian Zheng, Zuhao Liu, Wei-Shi Zheng. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Pengcheng Lab, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China


MIST: Multiple instance self-training framework for video anomaly detection


arXiv, 2021

Authors: Feng, Jia-Chang; Hong, Fa-Ting; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-Sen University; Peng Cheng Laboratory, Shenzhen, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; Pazhou Lab, Guangzhou, China

Weakly supervised video anomaly detection (WS-VAD) aims to distinguish anomalies from normal events based on discriminative representations. Most existing works are limited by insufficient video representations. In this work, we develop a multiple instance self-training framework (MIST) to efficiently refine task-specific discriminative representations with only video-level annotations. In particular, MIST is composed of 1) a multiple instance pseudo label generator, which adapts a sparse continuous sampling strategy to produce more reliable clip-level pseudo labels, and 2) a self-guided attention boosted feature encoder that aims to automatically focus on anomalous regions in frames while extracting task-specific representations. Moreover, we adopt a self-training scheme to optimize both components and finally obtain a task-specific feature encoder. Extensive experiments on two public datasets demonstrate the efficacy of our method, which performs comparably to or even better than existing supervised and weakly supervised methods, specifically obtaining a frame-level AUC of 94.83% on ShanghaiTech. © 2021, CC BY.

Keywords: Anomaly detection


ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation


International Conference on Computer Vision (ICCV)

Authors: Shenghao Fu, Junkai Yan, Yipeng Gao, Xiaohua Xie, Wei-Shi Zheng. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; Pengcheng Lab, China

Recent sparse detectors with multiple, e.g. six, decoder layers achieve promising performance but require much inference time due to complex heads. Previous works have explored using dense priors as initialization and built one-decoder-layer detectors. Although they gain remarkable acceleration, their performance still lags behind their six-decoder-layer counterparts by a large margin. In this work, we aim to bridge this performance gap while retaining fast speed. We find that the architecture discrepancy between dense and sparse detectors leads to feature conflict, hampering the performance of one-decoder-layer detectors. Thus we propose Adaptive Sparse Anchor Generator (ASAG), which predicts dynamic anchors on patches rather than grids in a sparse way so that it alleviates the feature conflict problem. For each image, ASAG dynamically selects which feature maps and which locations to predict, forming a fully adaptive way to generate image-specific anchors. Further, a simple and effective Query Weighting method eases the training instability from adaptiveness. Extensive experiments show that our method outperforms dense-initialized ones and achieves a better speed-accuracy trade-off. The code is available at https://***/iSEE-laboratory/ASAG.


Rotation Exploration Transformer for Aerial Person Re-identification


IEEE International Conference on Multimedia and Expo (ICME)

Authors: Lei Wang, Quan Zhang, Junyang Qiu, Jianhuang Lai. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; Pazhou Lab (HuangPu), Guangzhou, China

ISBN: (digital) 9798350390155

ISBN: (print) 9798350390162

Aerial person re-identification (AReID) focuses on accurately matching target person images within a UAV camera network. Challenges arise due to the broad field of view and arbitrary movement of UAVs, leading to foreground target rotation and background style variation. Existing AReID methods have provided limited solutions for the former, while the latter remains largely unexplored. This paper proposes a Rotation Exploration Vision Transformer (RoExViT) to tackle these dual challenges. Specifically, we design Multiple Rotation Tokens (MRT) to explore diverse rotational representations at the feature level, addressing foreground target rotation. To handle background style variation, we propose a Cross-Camera Similarity (CCS) loss to effectively minimize the view gap among different cameras. Furthermore, we propose an Iteratively Adaptive Batch Construction (IABC) strategy to mitigate overfitting on small datasets. Extensive experiments show that our method outperforms the state-of-the-art methods on PRAI-1581 and UAV-Human while also exhibiting outstanding performance on Market1501.

Keywords: Computer vision; Transformers; Cameras; Autonomous aerial vehicles; Identification of persons
