View-based 3D model classification has become a hot research topic. If all projection views of a 3D model are treated equally, the importance of and differences between views, as well as the complementary and correlated information among views, are ignored. To address these issues, this paper proposes a 3D model classification method based on a Deep Residual Shrinkage Network (DRSN) and multi-view feature fusion. First, the 3D model is projected into six 2D views. Second, the DRSN extracts view features from the 2D views. Third, the shape distribution features D1, D2, and D3 of each 2D view are integrated with the view features to obtain a fused feature. Fourth, the fused feature is fed into a softmax function to obtain discriminative features, and Shannon entropy is used to compute the uncertainty of the view classification as a measure of view saliency. Fifth, the fused features of the 2D views, ordered by descending view saliency, are fed in sequence into a Long Short-Term Memory (LSTM) network to fuse the multi-view features. Finally, a softmax function classifies the 3D model based on the multi-view fused feature. Experimental results show that the proposed method achieves an accuracy of 93.28% on the ModelNet10 dataset, demonstrating higher accuracy than competing methods.
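The abstract does not give the exact saliency rule, but a minimal sketch of the idea is straightforward: score each view by the Shannon entropy of its softmax output and reorder the per-view features before LSTM fusion. The function names, the choice of negative entropy as the saliency score, and the classifier interface below are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def view_saliency(fused_views: torch.Tensor, classifier: torch.nn.Module) -> torch.Tensor:
    """Score each view by the Shannon entropy of its class distribution.

    fused_views: (num_views, feat_dim) fused per-view features.
    Saliency is taken here as the negative entropy, i.e. more confident
    views are more salient (an assumption; the paper only states that
    entropy measures the uncertainty of the view classification).
    """
    probs = F.softmax(classifier(fused_views), dim=-1)          # (V, C)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)   # (V,)
    return -entropy                                              # higher = more salient

def order_views_for_lstm(fused_views, classifier):
    """Reorder per-view features by descending saliency before LSTM fusion."""
    saliency = view_saliency(fused_views, classifier)
    order = torch.argsort(saliency, descending=True)
    return fused_views[order].unsqueeze(0)  # (1, V, feat_dim) sequence for a batch_first LSTM
```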
3D models are widely used in industrial manufacturing, virtual reality, medical diagnosis, and other fields, and view-based 3D model classification has become an important research topic. However, a single view feature cannot describe the overall shape of a 3D model, and when multiple views are fused, useful information is confounded, which interferes with determining the 3D model's category. To solve these problems, a novel 3D model classification method based on the RegNet design space and a voting algorithm is proposed. First, the 2D views of a 3D model are fed into a RegNet design space with an attention mechanism to extract high-level semantic features (HSF). Second, the HSF are fused with the corresponding low-level shape features (LSF) of each view, including D1, D2, D3, Fourier descriptors, and Zernike moments. Third, an LSTM combined with a softmax function extracts more representative features from the fused feature. Finally, based on the discriminative features, an improved voting algorithm based on Shannon entropy determines the 3D model's category. Experimental results show that the average accuracy of the proposed method on ModelNet10 reaches 94.93%, with outstanding classification performance.
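The abstract does not spell out the voting rule; one plausible reading of "voting improved with Shannon entropy" is to down-weight each view's vote by its uncertainty. The sketch below shows that interpretation only, with an illustrative weighting formula.

```python
import numpy as np

def entropy_weighted_vote(view_probs: np.ndarray) -> int:
    """Combine per-view class distributions into a single prediction.

    view_probs: (num_views, num_classes) softmax outputs, one row per view.
    Each view's vote is down-weighted by its Shannon entropy, so confident
    views dominate the final decision (an illustrative rule; the paper only
    states the voting algorithm is improved with Shannon entropy).
    """
    eps = 1e-12
    entropy = -(view_probs * np.log(view_probs + eps)).sum(axis=1)   # (V,)
    weights = 1.0 / (1.0 + entropy)                                   # confident views weigh more
    weighted = (weights[:, None] * view_probs).sum(axis=0)            # (num_classes,)
    return int(np.argmax(weighted))
```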
Nowadays, driven by growing interest in 3D techniques and the resulting large-scale 3D data, 3D model classification has attracted enormous attention from both the research and industry communities. Most current methods depend heavily on sufficient labeled 3D models, which substantially restricts their scalability to novel classes with few annotated training samples, since scarce data increases the chance of overfitting. Besides, they leverage only single-modal information (either point clouds or multi-view images), and few works integrate this complementary information for 3D model representation. To overcome these problems, we propose a multi-modal meta-transfer fusion network (M³TF), the key of which is to perform few-shot multi-modal representation for 3D model classification. Specifically, we first convert the original 3D data into both multi-view and point cloud modalities, and pre-train individual encoding networks on a large-scale dataset to obtain good initial parameters, which benefits few-shot learning tasks. Then, to adapt the network to few-shot learning tasks, we update the parameters of the Scaling and Shifting (SS) operation, the multi-modal representation fusion (MMRF) module, and the 3D model classifier to obtain optimal initialization parameters. Since the large number of trainable parameters in the feature extractors would increase the chance of overfitting, we freeze the feature extractors and introduce an SS operation to adjust their weights. Specifically, SS can reduce the number of trainable parameters by up to 20%, which effectively avoids overfitting. MMRF can adaptively integrate the multi-modal information according to its significance to the 3D model, yielding a more robust 3D representation. Since t
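To make the Scaling and Shifting idea concrete, here is a minimal sketch of an SS-style adapter around a frozen convolution: only a per-channel scale on the frozen kernel and an additive shift are trained. The module name and exact formulation are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScalingShiftingConv2d(nn.Module):
    """Freeze a pre-trained conv layer and learn only per-channel scale/shift.

    The frozen kernel W is modulated channel-wise by a learned scale, and a
    learned shift is added to the output, so only 2 * out_channels parameters
    are trained per layer (any original bias is ignored here for brevity).
    """
    def __init__(self, pretrained_conv: nn.Conv2d):
        super().__init__()
        self.conv = pretrained_conv
        for p in self.conv.parameters():
            p.requires_grad_(False)                       # freeze the feature extractor
        out_ch = self.conv.out_channels
        self.scale = nn.Parameter(torch.ones(out_ch, 1, 1, 1))
        self.shift = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):
        w = self.conv.weight * self.scale                 # channel-wise scaling of frozen weights
        return F.conv2d(x, w, self.shift,
                        stride=self.conv.stride,
                        padding=self.conv.padding,
                        dilation=self.conv.dilation,
                        groups=self.conv.groups)
```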
Most existing methods for 3D model classification and retrieval rely on a fully supervised training scheme, yet collecting and labeling 3D models across a wide range of categories is prohibitive and time-consuming. How to make full use of existing known data to represent unknown data is therefore a crucial topic. Inspired by zero-shot learning in the 2D image domain, we propose a semantically guided projection method to classify and retrieve unseen 3D models by exploring the semantic relationship between seen and unseen 3D models. First, we exploit the multi-view information of 3D models to construct semantic attributes as prior knowledge for representing 3D models. Then, we learn bidirectional projections from visual features to semantics and from semantics to visual features, which helps close the gap between the seen and unseen domains. Extensive experiments on zero-shot 3D model classification and retrieval on two popular datasets, ModelNet40 and ShapeNetCore55, demonstrate the effectiveness and superiority of the proposed method.
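The abstract does not specify the projection model or loss; a minimal sketch under the assumption of plain linear maps trained with a reconstruction-style objective looks as follows. The class, dimensions, and loss are illustrative, not the paper's formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BidirectionalProjection(nn.Module):
    """Linear projections between visual features and semantic attributes.

    Maps visual features to the semantic space and semantic attributes back
    to the visual space, training both directions jointly so that the two
    spaces stay consistent across seen and unseen classes.
    """
    def __init__(self, vis_dim: int, sem_dim: int):
        super().__init__()
        self.v2s = nn.Linear(vis_dim, sem_dim)   # visual -> semantic
        self.s2v = nn.Linear(sem_dim, vis_dim)   # semantic -> visual

    def forward(self, visual, semantic):
        loss = F.mse_loss(self.v2s(visual), semantic) \
             + F.mse_loss(self.s2v(semantic), visual)
        return loss
```

At test time a zero-shot prediction would typically project an unseen model's visual feature into the semantic space and match it to the nearest unseen-class attribute vector; that inference step is standard zero-shot practice rather than something stated in the abstract.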
With the development of multimedia technology, 3D models have been applied in many fields such as mechanical design, the construction industry, the entertainment industry, and medical treatment, and the number of 3D models in our lives keeps growing. Effective automatic management and classification of 3D models is therefore becoming increasingly important. In this paper, we propose a dual-meta-learner model based on LSTM that learns the optimization algorithm used to train two learner neural-network classifiers in the few-shot regime. The parametrization of our model allows it to learn parameter updates suited specifically to the scenario where a fixed number of updates will be made, while also learning a general initialization of the learner (classifier) network that enables fast convergence during training. Our method attains state-of-the-art performance by significant margins. (C) 2019 Published by Elsevier B.V.
Introduction: Existing multi-view-based 3D model classification methods suffer from insufficient extraction of refined view features and poor generalization ability of the network model, which makes it difficult to further improve classification accuracy. To this end, this paper proposes a multi-view SoftPool attention convolutional network for 3D model classification. Methods: The method extracts multi-view features through ResNeSt and adaptive pooling modules, and the extracted features better represent 3D models. The multi-view features processed with SoftPool are then used as the Query in the self-attention calculation, which enables subsequent refined feature extraction. The attention scores computed from the Query and Key in the self-attention calculation are then fed into a mobile inverted bottleneck convolution, which effectively improves the generalization of the network model. Based on the proposed method, a compact 3D global descriptor is finally generated, achieving high-accuracy 3D model classification. Results: Experimental results show that the method achieves 96.96% OA and 95.68% AA on ModelNet40, and 98.57% OA and 98.42% AA on ModelNet10. Discussion: Compared with many popular methods, the proposed model achieves state-of-the-art classification accuracy.
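For reference, the SoftPool operator named above is the standard softmax-weighted pooling: each activation in a window is weighted by its own exponential, so strong responses dominate without discarding the rest. A minimal sketch (function name and kernel defaults are illustrative):

```python
import torch
import torch.nn.functional as F

def softpool2d(x: torch.Tensor, kernel_size: int = 2, stride: int = 2) -> torch.Tensor:
    """SoftPool: exponentially weighted average over each pooling window.

    Output per window is sum_i(e^{x_i} * x_i) / sum_j(e^{x_j}); computing both
    sums with avg_pool2d lets the window-size factors cancel in the ratio.
    """
    e = torch.exp(x)
    num = F.avg_pool2d(x * e, kernel_size, stride)   # window mean of e(x) * x
    den = F.avg_pool2d(e, kernel_size, stride)       # window mean of e(x)
    return num / den
```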
The rapid development of information technology has also brought new vitality to art design, and 3D animation modeling is a new multimedia technology based on computer technology. To organize and utilize 3D model resources efficiently, researchers focus on achieving effective retrieval and classification. To recognize and classify 3D models, this paper proposes a novel network model called 3DSmallPCapsNet, building on the fact that the Capsule Network (CapsNet) exploits vector neurons to store feature-space information. The proposed method can extract more representative features while reducing model complexity. To evaluate the method, it is compared with three other approaches: MeshNet, Shape-DNA, and GPS-embedding. Experimental results on the SHREC10 and SHREC15 datasets show that the proposed method performs better.
Unsupervised 3D model analysis has attracted tremendous attention with the rapid growth of 3D model data and the burden of extensive human annotation. Many effective methods have been designed for 3D model analysis with labeled information, while few address unsupervised deep learning because of the difficulty of mining reliable information. In this paper, we propose a novel unsupervised deep learning method named joint local correlation and global contextual information (LCGC) for 3D model retrieval and classification, which mines a reliable triplet set and uses a triplet loss to optimize the deep neural network. Our method comprises two schemes: 1) local self-correlation information learning, which adopts intra- and inter-view information to construct a view-level triplet set; and 2) global neighbor contextual information learning, which employs neighbor contextual information to explore reliable relations among 3D models and construct a model-level triplet set. These schemes ensure that the selected triplet set can be used to improve the discriminability of the learned features. Extensive evaluations on two large-scale datasets, ModelNet40 and ShapeNet55, demonstrate the effectiveness of the proposed method.
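The triplet loss referred to above is the standard margin-based objective; a minimal sketch is shown below, with the triplets assumed to come from the mined view-level or model-level sets (the margin value and Euclidean distance are illustrative defaults).

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 0.2):
    """Margin-based triplet loss over embedding batches of shape (batch, dim)."""
    d_ap = F.pairwise_distance(anchor, positive)   # pull anchor toward positive
    d_an = F.pairwise_distance(anchor, negative)   # push anchor away from negative
    return F.relu(d_ap - d_an + margin).mean()
```

PyTorch's built-in torch.nn.TripletMarginLoss implements the same objective and could be used instead of this hand-rolled version.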
We design a view-pooling method named learning-based multiple pooling fusion (LMPF) and apply it to the multi-view convolutional neural network (MVCNN) for 3D model classification and retrieval. In this way, the multi-view feature maps projected from a 3D model can be compiled into a simple and effective feature descriptor. LMPF fuses max pooling and mean pooling by learning a set of optimal weights. Compared with hand-crafted approaches such as max pooling or mean pooling alone, LMPF reduces information loss effectively thanks to its learning ability. Experiments on the ModelNet40 and McGill datasets verify that LMPF outperforms those previous methods by a large margin.
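The core idea of combining max and mean view pooling with learned weights is compact enough to sketch directly; the module below uses a softmax-normalized pair of weights, which is an illustrative choice rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class LearnedPoolingFusion(nn.Module):
    """Fuse max pooling and mean pooling over views with learned weights.

    Input: (batch, num_views, feat_dim) per-view features from an MVCNN-style
    backbone. The two pooled descriptors are blended with softmax-normalized
    learnable weights, so the mix is adapted during training instead of fixed.
    """
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))   # one weight per pooling branch

    def forward(self, view_feats: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.logits, dim=0)
        max_pool = view_feats.max(dim=1).values      # (batch, feat_dim)
        mean_pool = view_feats.mean(dim=1)           # (batch, feat_dim)
        return w[0] * max_pool + w[1] * mean_pool    # fused shape descriptor
```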
In this paper, we address the problem of correcting the upright orientation of a reconstructed object for search. We first reconstruct an input object appearing in an image sequence and generate a query shape using multi-view object co-segmentation. Next, we use a Convolutional Neural Network (CNN) architecture to determine the category-specific upright orientation of the queried shape for 3D model classification and retrieval. As a practical application of our system, a shape style and pose are obtained from the inferred category and up-vector by comparing 3D shape similarity with candidate 3D models and aligning their projections with a set of 2D co-segmentation masks. We quantitatively and qualitatively evaluate the presented system on more than 720 upright-aligned 3D models and five sets of multi-view image sequences. (C) 2020 Published by Elsevier B.V.