Long-tailed data classification is prevalent in real-world scenarios, but training on such datasets can lead to biased classifications and poor performance. We address this challenge by focusing on improving the feature representation of tail classes, which is often of lower quality because tail-class features lie close to those of other, distinct classes. Inspired by the similarity between head and tail classes, we propose Class-wise Knowledge Distillation (CKD) to help tail classes learn prediction distributions from head classes, thus calibrating their features. Additionally, we introduce Hard Negative Samples Sampling (HNSS) to enhance feature separation by selecting challenging negative examples for contrastive learning. Our Feature Calibration and Feature Separation (FCFS) method achieves competitive results on the CIFAR10-LT, CIFAR100-LT, and ImageNet-LT benchmarks, demonstrating effective feature learning for long-tailed classification. This approach leverages both knowledge distillation and hard negative sampling to improve model performance.
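The abstract gives no formulas, so as a rough illustration of the CKD idea only (not the authors' exact loss), the sketch below distills a paired head class's softened prediction distribution into tail-class samples. The head-class prototype logits, the tail-to-head pairing, and the temperature are all assumed inputs.

```python
import torch
import torch.nn.functional as F

def class_wise_kd_loss(student_logits, labels, head_proto_logits, tail_to_head, T=2.0):
    """Sketch of a class-wise KD term: samples from tail classes are pushed to
    match the softened prediction distribution of a paired head-class prototype.
    head_proto_logits: (num_classes, num_classes) prototype logits per head class.
    tail_to_head: dict mapping tail class id -> paired head class id (assumed given)."""
    losses = []
    for i, y in enumerate(labels.tolist()):
        if y not in tail_to_head:                      # head classes are left untouched
            continue
        teacher = F.softmax(head_proto_logits[tail_to_head[y]] / T, dim=-1)
        student = F.log_softmax(student_logits[i] / T, dim=-1)
        losses.append(F.kl_div(student, teacher, reduction="sum") * T * T)
    if not losses:
        return student_logits.new_zeros(())
    return torch.stack(losses).mean()
```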
ISBN: (Print) 9789819784899; 9789819784905
The selection of activation functions in visual recognition significantly influences training dynamics and task performance. This study introduces the local spatial and global context activation (SCeLU), a conceptually simple yet effective activation function. SCeLU extends the Rectified Linear Unit (ReLU) and FReLU to a 3D activation by incorporating spatial and context conditions at negligible overhead. ReLU and FReLU take the forms f(x) = max(x, 0) and f(x) = max(x, T(x)), respectively, where T(·) is a 2D spatial condition. SCeLU instead takes the form f(x) = max(x, Π(x)·Γ(x)), where Π(·) is a 3D global context condition and Γ(·) is a 2D local spatial condition. Intuitively, the context condition facilitates the modeling of global information, while the spatial condition enhances the capacity for local pixel-wise modeling. By appropriately combining spatial and context conditions, SCeLU adapts to complex visual layouts in various image recognition tasks. By simply changing the activation function, experiments on ImageNet show significant and robust gains from SCeLU, particularly for small models, and some improvement even for highly optimized large models. Furthermore, SCeLU extends seamlessly to object detection and semantic segmentation, underscoring its value as an effective alternative in various visual recognition tasks. Our model is open-sourced at https://***/YunDuanFei/SCeLU.
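Taking the formula f(x) = max(x, Π(x)·Γ(x)) at face value, one plausible parameterization (not necessarily the paper's; the open-source repository holds the real one) models Γ as an FReLU-style depthwise convolution and Π as a squeeze-and-excite style global channel gate. The module name, reduction ratio, and layer choices below are assumptions.

```python
import torch
import torch.nn as nn

class SCeLUSketch(nn.Module):
    """Illustrative reading of f(x) = max(x, Pi(x) * Gamma(x)):
    Gamma(.) -- 2D local spatial condition, modeled here as a depthwise 3x3 conv + BN;
    Pi(.)    -- 3D global context condition, modeled here as a squeeze-and-excite
                style channel gate. The paper's exact parameterization may differ."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # local spatial condition Gamma
        self.gamma = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
            nn.BatchNorm2d(channels),
        )
        # global context condition Pi: global pooling -> bottleneck MLP -> sigmoid gate
        self.pi = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # elementwise max between the identity branch and the conditioned branch
        return torch.max(x, self.pi(x) * self.gamma(x))
```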
Current vision-inspired spiking neural networks (SNNs) face key challenges due to their model structures typically focusing on single mechanisms and neglecting the integration of multiple biological features. These limitations, coupled with limited synaptic plasticity, hinder their ability to implement biologically realistic visual processing. To address these issues, we propose Spike-VisNet, a novel retina-inspired framework designed to enhance visual recognition capabilities. This framework simulates both the functional and layered structure of the retina. To further enhance this architecture, we integrate the FocusLayer-STDP learning rule, allowing Spike-VisNet to dynamically adjust synaptic weights in response to varying visual stimuli. This rule combines channel attention, inhibition mechanisms, and competitive mechanisms with spike-timing-dependent plasticity (STDP), significantly improving synaptic adaptability and visual recognition performance. Comprehensive evaluations on benchmark datasets demonstrate that Spike-VisNet outperforms other STDP-based SNNs, achieving precision scores of 98.6% on MNIST, 93.29% on ETH-80, and 86.27% on CIFAR-10. These results highlight its effectiveness and robustness, showcasing Spike-VisNet's potential to simulate human visual processing and its applicability to complex real-world visual challenges.
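For readers unfamiliar with STDP, the core pair-based rule that FocusLayer-STDP builds on can be sketched as below; the channel attention, inhibition, and competition terms described in the abstract are omitted, and the constants are illustrative rather than the paper's.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau_plus=20.0, tau_minus=20.0, w_min=0.0, w_max=1.0):
    """Minimal pair-based STDP weight update (not the full FocusLayer-STDP rule).
    t_pre, t_post: spike times (ms) of the pre- and post-synaptic neurons."""
    dt = t_post - t_pre
    if dt >= 0:                      # pre fires before post -> potentiation
        dw = a_plus * np.exp(-dt / tau_plus)
    else:                            # post fires before pre -> depression
        dw = -a_minus * np.exp(dt / tau_minus)
    return float(np.clip(w + dw, w_min, w_max))
```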
To solve the problem of finding a target object in a complicated maze, an intelligent robot with visual recognition and automatic driving is designed, which uses an OpenMV visual recognition module to identify the...
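Since the abstract is truncated, the sketch below only illustrates a typical OpenMV (MicroPython) recognition loop of the kind such a module runs: detect a colored target blob and report its image coordinates to the driving controller. The LAB threshold, UART port, and message format are placeholders, not values from the paper.

```python
# Illustrative OpenMV loop: find a color-thresholded target and send its position.
import sensor, time
from pyb import UART

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)              # let the sensor settle

uart = UART(3, 115200)                     # assumed wiring to the drive MCU
TARGET_THRESHOLD = (30, 100, 15, 127, 15, 127)   # placeholder LAB color range

while True:
    img = sensor.snapshot()
    blobs = img.find_blobs([TARGET_THRESHOLD], pixels_threshold=200,
                           area_threshold=200, merge=True)
    if blobs:
        target = max(blobs, key=lambda b: b.pixels())   # largest matching blob
        img.draw_rectangle(target.rect())
        uart.write("%d,%d\n" % (target.cx(), target.cy()))
```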
This paper describes GREFIT (Gesture recognition based on Finger Tips), a neural-network-based system which recognizes continuous hand postures from gray-level video images (posture capturing). Our approach yields a full identification of all finger joint angles (making, however, some assumptions about joint couplings to simplify computations). This allows a full reconstruction of the three-dimensional (3-D) hand shape, using an articulated hand model with 16 segments and 20 joint angles. GREFIT uses a two-stage approach to solve this task. In the first stage, a hierarchical system of artificial neural networks (ANNs), combined with a priori knowledge, locates the two-dimensional (2-D) positions of the finger tips in the image. In the second stage, the 2-D position information is transformed by an ANN into an estimate of the 3-D configuration of an articulated hand model, which is also used for visualization. This model is designed according to the dimensions and movement possibilities of a natural human hand. The virtual hand imitates the user's hand with remarkable accuracy and can follow postures from gray-scale images at a frame rate of 10 Hz.
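The second stage described above is essentially a learned mapping from 2-D fingertip positions to the 20 joint angles of the hand model. A minimal sketch of such a mapping is shown below; the layer sizes and activation are assumptions, not the original network.

```python
import torch
import torch.nn as nn

class FingertipToJointAngles(nn.Module):
    """Sketch of GREFIT's second stage: an ANN mapping 2-D fingertip positions
    (5 tips x 2 coordinates = 10 inputs) to the 20 joint angles of the
    articulated hand model. Hidden size and activation are illustrative."""
    def __init__(self, n_tips=5, n_angles=20, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_tips * 2, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_angles),
        )

    def forward(self, tip_xy):               # tip_xy: (batch, 5, 2) image coordinates
        return self.net(tip_xy.flatten(1))   # (batch, 20) estimated joint angles
```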
Visual perception is a fundamental component for most robotic systems operating in human environments. Specifically, visual recognition is a prerequisite to a large variety of tasks such as tracking, manipulation, and human-robot interaction. As a consequence, the lack of successful recognition often becomes a bottleneck for the application of robotic systems to real-world situations. In this paper we aim at improving the robot's visual perception capabilities in a natural, human-like fashion, with a very limited amount of constraints on the acquisition scenario. In particular, our goal is to build and analyze a learning system that can rapidly be re-trained in order to incorporate new evidence if available. To this purpose, we review state-of-the-art coding-pooling pipelines for visual recognition and propose two modifications which allow us to improve the quality of the representation while maintaining real-time performance: a coding scheme, Best Code Entries (BCE), and a new pooling operator, Mid-Level Classification Weights (MLCW). The former focuses entirely on sparsity, improving the stability and computational efficiency of the coding phase; the latter increases the discriminability of the visual representation, and therefore the overall recognition accuracy of the system, by exploiting data supervision. The proposed pipeline is assessed from a qualitative perspective in a Human-Robot Interaction (HRI) application on the iCub platform. Quantitative evaluation of the proposed system is performed both on in-house robotics data sets (iCubWorld) and on established computer vision benchmarks (Caltech-256, PASCAL VOC 2007). As a byproduct of this work, we provide the robotics community with an implementation of the proposed visual recognition pipeline which can be used as a perceptual layer for more complex robotics applications. (C) 2016 Published by Elsevier B.V.
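The abstract does not give the exact BCE formulation, but a coding step that "focuses entirely on sparsity" can be sketched as keeping only each descriptor's best-matching dictionary entries; MLCW is only indicated in a comment, since its supervised reweighting is not specified here. All names and the top-k choice are assumptions.

```python
import numpy as np

def best_code_entries(x, dictionary, k=1):
    """Hedged sketch in the spirit of BCE: for each local descriptor, keep only
    its k best-matching dictionary atoms and zero the rest.
    x: (n, d) local descriptors; dictionary: (m, d) codebook."""
    sims = x @ dictionary.T                          # (n, m) similarity scores
    codes = np.zeros_like(sims)
    top = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    rows = np.arange(x.shape[0])[:, None]
    codes[rows, top] = sims[rows, top]               # retain only the best entries
    return codes

def max_pool(codes):
    """Baseline max pooling over local codes; MLCW would additionally reweight
    the codes with supervised mid-level classifier scores before pooling."""
    return codes.max(axis=0)
```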
In this paper, we propose a deep model for visual recognition based on a hybrid KPCA Network (H-KPCANet), which combines a one-stage KPCANet and a two-stage KPCANet. The proposed model consists of four types of basic components: the input layer, the one-stage KPCANet, the two-stage KPCANet, and the fusion layer. The one-stage KPCANet computes KPCA filters for its convolution layer, while the two-stage KPCANet learns PCA filters in its first stage and KPCA filters in its second stage. After binary quantization mapping and block-wise histogramming, the features from the two different types of KPCANets are fused in the fusion layer. The final feature of the input image is obtained by a weighted serial combination of the two types of features. The performance of the proposed algorithm is tested on digit recognition and object classification, and the experimental results on the visual recognition benchmarks MNIST and CIFAR-10 validate the effectiveness of the proposed H-KPCANet.
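The output encoding and fusion described above follow the usual PCANet-style recipe, which a short sketch makes concrete. Block-wise partitioning is omitted for brevity, and the mixing weight alpha is an assumption, not a value from the paper.

```python
import numpy as np

def binary_hash_histogram(filter_responses, n_bins=None):
    """PCANet/KPCANet-style encoding sketch: binarize each of the L filter
    response maps (sign > 0), pack them into an integer map with weights 2^l,
    then histogram the integer values. Block-wise histograms are omitted here."""
    L = len(filter_responses)
    binary = [(r > 0).astype(np.int64) for r in filter_responses]
    packed = sum((2 ** l) * b for l, b in enumerate(binary))   # values in [0, 2^L)
    n_bins = n_bins or 2 ** L
    hist, _ = np.histogram(packed, bins=n_bins, range=(0, 2 ** L))
    return hist

def fuse_features(feat_one_stage, feat_two_stage, alpha=0.5):
    """Fusion layer as described: weighted serial combination (concatenation) of
    the one-stage and two-stage KPCANet feature vectors; alpha is assumed."""
    return np.concatenate([alpha * feat_one_stage, (1.0 - alpha) * feat_two_stage])
```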
A series of novel six-coordinated terpyridine zinc complexes, containing ammonium salts and a thymine fragment at the two terminals, have been designed and synthesized; they can function as highly sensitive visualized sensors for melamine detection via selective metallo-hydrogel formation. After full characterization by various techniques, the complementary triple hydrogen bonding between the thymine fragment and melamine, as well as π-π stacking interactions, may be responsible for the selective metallo-hydrogel formation. In light of the possible interference arising from milk ingredients (proteins, peptides and amino acids) and legal/illegal additives (urine, sugars and vitamins), a series of control experiments were therefore conducted. To our delight, this visual recognition is highly selective: no gelation was observed with the selected milk ingredients or additives. Moreover, this newly developed protocol enables convenient and highly selective visual recognition of melamine at a concentration as low as 10 ppm in raw milk without any tedious pretreatment.
Comparative studies of memory in monkey and human subjects suggest similarities in visual recognition memory across human and nonhuman primates. In order to investigate developmental aspects of visual recognition memory in monkey infants, the familiarization‐novelty procedure, developed for use with human infants, was employed with pigtailed monkey infants to study long‐delay recognition memory. Subjects were familiarized with a black‐and‐white abstract pattern. Twenty‐four hours later they were tested with the familiar pattern paired with a novel one. Results indicated a significant visual preference for the novel stimulus, providing evidence for recognition memory. These results parallel those obtained with human infants, suggesting further similarities in the development of visual recognition memory.
Visual recognition in monkeys appears to involve the participation of two limbothalamic pathways, one including the amygdala and the magnocellular portion of the medial dorsal nucleus (MDmc) and the other, the hippocampus and the anterior nuclei of the thalamus (Ant N). Both MDmc and Ant N project, in turn, to the prefrontal cortex, mainly to its ventral and medial portions. To test whether the prefrontal projection targets of the two limbothalamic pathways also participate in memory functions, performance on a variety of learning and memory tasks was assessed in monkeys with lesions of the ventromedial prefrontal cortex (Group VM). Normal monkeys and monkeys with lesions of dorsolateral prefrontal cortex (Group DL) served as controls. Group VM was severely impaired on a test of object recognition, whereas Group DL did not differ appreciably from normal animals. Conversely, the animals in Group VM were able to learn a spatial delayed response task, whereas 2 of the 3 animals in Group DL could not. Neither group was impaired in the acquisition of visual discrimination habits, even though the successive trials on a given discrimination were separated by 24-h intervals. The patterns of deficits suggest that ventromedial prefrontal cortex constitutes another station in the limbothalamic system underlying cognitive memory processes, whereas the dorsolateral prefrontal cortex lies outside this system. The results support the view that the classical delayed-response deficit observed after dorsolateral prefrontal lesions represents a perceptuo-mnemonic impairment in spatial functions selectively rather than a memory loss of a more general nature.