Highly discriminative feature expression for non-rigid shape recognition is an important and challenging task that requires shape descriptors that are both abstract and robust. However, most existing low-level descriptors are hand-crafted and are sensitive to local changes and large deformations. To address this issue, this paper proposes a bag-of-shape descriptor based on unsupervised deep learning and Bag of Words (BoW) for shape recognition. Unlike existing pipelines, our method is specifically designed to learn high-level, hierarchical shape features from multi-scale context structures. It effectively overcomes obstacles such as irregular topology, orientation ambiguity, and rigid or non-rigid transformation in the hierarchical learning of contour fragments. Specifically, an improved decomposition strategy splits the shape into a series of valuable contour fragments, which enables local-to-global feature learning. An unsupervised learning framework is then applied to each contour fragment to express its features, based on the context structure and a stacked sparse autoencoder (SSAE). For shape representation, a high-level shape dictionary is learned by K-clustering to achieve discriminative feature coding. In addition, to obtain a compact and simplified shape representation, spatial pyramid matching (SPM) with max-pooling is adopted, which effectively incorporates the spatial layout information of the given shape. Experiments demonstrate that the proposed method achieves state-of-the-art performance on several public shape datasets compared with the latest approaches. The method also performs well under noise and occlusion.
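The dictionary, coding, and pooling stages named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain k-means as the clustering step, hard (one-hot) assignment as the coding step, and a two-level pyramid. The function names `learn_dictionary`, `encode`, and `spm_max_pool` are hypothetical, and random vectors stand in for the SSAE fragment features.

```python
import numpy as np

def learn_dictionary(features, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns k codewords (assumed clustering step)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every feature to every center.
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def encode(features, centers):
    """Hard-assignment coding (an assumption): one-hot code per feature."""
    d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    codes = np.zeros((len(features), len(centers)))
    codes[np.arange(len(features)), d.argmin(axis=1)] = 1.0
    return codes

def spm_max_pool(codes, positions, levels=(1, 2)):
    """Max-pool codes inside each cell of each pyramid level, then concatenate.

    positions: (n, 2) array of fragment locations normalized to [0, 1).
    """
    pooled = []
    for g in levels:
        cell = np.minimum((positions * g).astype(int), g - 1)
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            mask = cell_id == c
            if mask.any():
                pooled.append(codes[mask].max(axis=0))
            else:
                pooled.append(np.zeros(codes.shape[1]))  # empty cell
    return np.concatenate(pooled)

# Stand-in data: 50 fragment features of dimension 8 with random positions.
rng = np.random.default_rng(1)
feats = rng.normal(size=(50, 8))
pos = rng.random((50, 2))
centers = learn_dictionary(feats, k=4)
shape_vector = spm_max_pool(encode(feats, centers), pos)
```

With 4 codewords and pyramid levels of 1 and 4 cells, the resulting shape vector has 4 × (1 + 4) = 20 entries; the finer level is what carries the spatial layout information.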
Authors:
Jang, ES (Hanyang Univ, Coll Informat & Commun, Software Div, Seoul 133791, South Korea)
Although frame-based MPEG-4 video services have been successfully deployed since 2000, MPEG-4 video coding now faces strong competition in becoming a dominant player in the market. Object-based coding is one of the key functionalities of MPEG-4 video coding, and real-time object-based video encoding is also important for multimedia broadcasting in the near future. Object-based video services using MPEG-4 have not yet made a successful debut, for several reasons. One of the critical problems is the higher coding complexity of object-based video coding compared with frame-based video coding. Because a video object is described with an arbitrary shape, the bitstream contains not only motion and texture data but also shape data. This introduces additional complexity on the decoder side as well as on the encoder side. In this paper, we analyze the current MPEG-4 video encoding tools and propose efficient coding technologies that reduce the complexity of the encoder. Using the proposed coding schemes, we obtained a 56 percent reduction in shape-coding complexity over the MPEG-4 video reference software (Microsoft version, 2000 edition).
In a rectangular-block division-based coding method, there is a problem that, in the blocks where different regions coexist, the coding efficiency decreases. In the segmentation-based coding method, which is promising...
Visual shape perception is central to many everyday tasks, from object recognition to grasping and handling tools [1-10]. Yet how shape is encoded in the visual system remains poorly understood. Here, we probed shape representations using visual aftereffects: perceptual distortions that occur following extended exposure to a stimulus [11-17]. Such effects are thought to be caused by adaptation in neural populations that encode both simple, low-level stimulus characteristics [17-20] and more abstract, high-level object features [21-23]. To tease these two contributions apart, we used machine-learning methods to synthesize novel shapes in a multidimensional shape space, derived from a large database of natural shapes [24]. Stimuli were carefully selected such that low-level and high-level adaptation models made distinct predictions about the shapes that observers would perceive following adaptation. We found that adaptation along vector trajectories in the high-level shape space predicted shape aftereffects better than simple low-level processes. Our findings reveal the central role of high-level statistical features in the visual representation of shape. The findings also hint that human vision is attuned to the distribution of shapes experienced in the natural environment.