Deep neural networks have surpassed human performance in key visual challenges such as object recognition, but they require large amounts of energy, computation, and memory. In contrast, spiking neural networks (SNNs) have the potential to improve both the efficiency and the biological plausibility of object recognition systems. Here we present an SNN model that uses spike-latency coding and winner-take-all inhibition (WTA-I) to efficiently represent visual stimuli using multi-scale parallel processing. Mimicking neuronal response properties in early visual cortex, images were preprocessed with three different spatial frequency (SF) channels before being fed to a layer of spiking neurons whose synaptic weights were updated using spike-timing-dependent plasticity. We investigate how the quality of the represented objects changes under different SF bands and WTA-I schemes. We demonstrate that a network of 200 spiking neurons tuned to three SFs can efficiently represent objects with as few as 15 spikes per neuron. Studying how core object recognition may be implemented using biologically plausible learning rules in SNNs may not only further our understanding of the brain, but also lead to novel and efficient artificial vision systems.
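The three mechanisms named in this abstract (spike-latency coding, winner-take-all inhibition, and spike-timing-dependent plasticity) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the linear latency mapping, the non-leaky integrator, the simplified multiplicative STDP rule, and all parameter values (`t_max`, `threshold`, `a_plus`, `a_minus`) are assumptions chosen only for illustration.

```python
def latency_encode(intensities, t_max=20.0):
    """Spike-latency coding (assumed linear mapping): stronger input fires earlier.

    Intensities are clipped to at most 1; non-positive input never spikes (None).
    """
    times = []
    for x in intensities:
        if x <= 0.0:
            times.append(None)
        else:
            times.append(t_max * (1.0 - min(x, 1.0)))
    return times


def winner_take_all(drives, threshold=1.0, dt=1.0, t_max=20.0):
    """Hard WTA (assumed): non-leaky neurons integrate a constant drive.

    The first neuron to reach threshold fires and silences all others.
    Returns (winner_index, spike_time), or (None, None) if nobody fires.
    """
    potentials = [0.0] * len(drives)
    t = 0.0
    while t < t_max:
        t += dt
        for i, d in enumerate(drives):
            potentials[i] += d * dt
        best = max(range(len(drives)), key=lambda i: potentials[i])
        if potentials[best] >= threshold:
            return best, t
    return None, None


def stdp_update(weights, pre_times, post_time, a_plus=0.05, a_minus=0.03):
    """Simplified STDP (assumed rule): potentiate synapses whose input spiked
    at or before the postsynaptic spike, depress all others.

    The w * (1 - w) factor keeps weights inside [0, 1].
    """
    updated = []
    for w, t_pre in zip(weights, pre_times):
        if t_pre is not None and t_pre <= post_time:
            w = w + a_plus * w * (1.0 - w)
        else:
            w = w - a_minus * w * (1.0 - w)
        updated.append(min(1.0, max(0.0, w)))
    return updated


# Usage: encode three inputs, pick a winner, update the winner's weights.
spike_times = latency_encode([1.0, 0.5, 0.0])
winner, t_post = winner_take_all([0.1, 0.25, 0.2])
new_weights = stdp_update([0.5, 0.5, 0.5], spike_times, t_post)
```

In this toy setting only the winning neuron's weights would be updated after each stimulus, so inhibition and plasticity together push different neurons toward different input patterns.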
Experimental data suggest that a first hypothesis about the content of a complex visual scene is available as early as 150 ms after stimulus presentation. Other evidence suggests that recognition in the visual cortex of mammals is a bidirectional, often top-down driven process. Here, we present a spiking neural network model that demonstrates how the cortex can use both strategies: faced with a new stimulus, the cortex first tries to catch the gist of the scene. The gist is then fed back as a global hypothesis to influence and redirect further bottom-up processing. We propose that these two modes of processing are carried out in different layers of the cortex. A cortical column may thus be primarily defined by the specific connectivity that links neurons in different layers into a functional circuit. Given an input, our model generates an initial hypothesis after only a few milliseconds. The first wave of action potentials traveling up the hierarchy activates representations of features and feature combinations. In most cases, the correct feature representation is the most strongly activated and precedes all other candidates with millisecond precision. Thus, our model codes the reliability of a response in the relative latency of spikes. In the subsequent refinement stage, in which high-level activity modulates lower stages, this activation dominance is propagated back, influencing its own afferent activity to establish a unique decision. Thus, top-down influence de-activates representations that have contributed to the initial hypothesis about the current stimulus, comparable to predictive coding. Features that do not match the top-down prediction trigger an error signal that can be the basis for learning new representations. © 2009 Elsevier Ltd. All rights reserved.
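The two processing modes described in this abstract, a fast bottom-up hypothesis coded in relative spike latency followed by top-down suppression of explained features, can be sketched as a toy example. This is not the published model. The template dictionary, the linear overlap-to-latency mapping, and the function names `first_spike_hypothesis` and `top_down_residual` are hypothetical and serve only to illustrate the idea.

```python
def first_spike_hypothesis(features, templates, t_max=10.0):
    """Bottom-up pass (assumed mapping): each candidate representation spikes
    at a latency that shrinks as its normalized overlap with the input grows.

    The earliest spike is taken as the initial hypothesis.
    Returns (winner_name, latencies_by_name).
    """
    latencies = {}
    for name, template in templates.items():
        overlap = sum(f * w for f, w in zip(features, template))
        norm = sum(template) or 1.0
        latencies[name] = t_max * (1.0 - overlap / norm)
    winner = min(latencies, key=latencies.get)
    return winner, latencies


def top_down_residual(features, template):
    """Top-down pass (assumed): features predicted by the winning hypothesis
    are de-activated; whatever remains is the unexplained error signal.
    """
    return [max(0.0, f - w) for f, w in zip(features, template)]


# Usage: two hypothetical object templates over four binary features.
templates = {
    "A": [1.0, 1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 1.0, 0.0],
}
features = [1.0, 1.0, 0.0, 1.0]  # matches "A" plus one unexpected feature

winner, latencies = first_spike_hypothesis(features, templates)
error = top_down_residual(features, templates[winner])
```

Here the fourth feature survives the top-down subtraction, which corresponds to the mismatch that the abstract describes as a possible basis for learning new representations.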