Photomosaic images are composite images composed of many small images called *** its overall visual effect,a photomosaic image is similar to the target image,and photomosaics are also called“montage art”.Noisy block...
详细信息
Photomosaic images are composite images composed of many small images called *** its overall visual effect,a photomosaic image is similar to the target image,and photomosaics are also called“montage art”.Noisy blocks and the loss of local information are the major obstacles in most methods or programs that create photomosaic *** solve these problems and generate a photomosaic image in this study,we propose a tile selection method based on error minimization.A photomosaic image can be generated by partitioning the target image in a rectangular pattern,selecting appropriate tile images,and then adding them with a weight *** on the principles of montage art,the quality of the generated photomosaic image can be evaluated by both global and local *** the proposed framework,via an error function analysis,the results show that selecting a tile image using a global minimum distance minimizes both the global error and the local error ***,the weight coefficient of the image superposition can be used to adjust the ratio of the global and local ***,to verify the proposed method,we built a new photomosaic creation dataset during this *** experimental results show that the proposed method achieves a low mean absolute error and that the generated photomosaic images have a more artistic effect than do the existing approaches.
Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from i...
Fine-grained 3D shape classification poses challenges in effectively capturing and integrating discriminative features residing in subtle local regions. Previous methods typically extract features independently from individual views of 3D shapes, with a focus on various strategies for fusing these extracted view features. However, this approach neglects interview correlations and potential redundancies among different views. In this study, we introduce $$\hbox {C}^2$$ DFL, which consists of two primary modules: cross-view discriminative feature extraction (CV-DFE) and cross-layer discriminative feature fusion (CL-DFF). CV-DFE integrates discriminative features by merging inputs from multiple views, mitigating limitations associated with isolated feature extraction. CL-DFF dynamically selects key tokens using a transformer model to interactively fuse discriminative features from various levels. Extensive experiments conducted on three categories of the FG3D dataset demonstrate the exceptional efficacy of $$\hbox {C}^2$$ DFL in capturing and integrating discriminative features of 3D shapes. The proposed method achieves state-of-the-art accuracy in fine-grained 3D shape classification (FGSC).
In deep learning, supervised learning techniques usually require a large amount of expensive labeled data to train the network, and the feature representations extracted by the model usually mix multiple attributes, r...
详细信息
The window-based attention is used to alleviate the problem of abrupt increase in computation as the input image resolution grows and shows excellent performance. However, the problem that aggregating global features ...
详细信息
The attention mechanism plays a pivotal role in designing advanced super-resolution (SR) networks. In this work, we design an efficient SR network by improving the attention mechanism. We start from a simple pixel att...
详细信息
Automatically understanding funny moments (i.e., the moments that make people laugh) when watching comedy is challenging, as they relate to various features, such as body language, dialogues and culture. In this paper...
详细信息
This paper focuses on the image composition of transparent objects, where existing image matting methods suffer from composition errors due to the lack of accurate foreground during the composition process. We propose...
详细信息
Fine-grained 3D shape classification (FGSC) remains challenging due to the difficulty of adaptively capturing global structure differences and subtle inter-class distinctions. This paper directly extends vision Transf...
详细信息
ISBN:
(数字)9798350368741
ISBN:
(纸本)9798350368758
Fine-grained 3D shape classification (FGSC) remains challenging due to the difficulty of adaptively capturing global structure differences and subtle inter-class distinctions. This paper directly extends vision Transformer (ViT) to FGSC, proposing a pure Transformer network FG3DFormer that fully leverages ViT’s global correlation and local attention abilities. FG3Dformer comprises the Hierarchical Feature Extraction (HFE) and the Hierarchical Feature Refinement (HFR), interconnected through the Adaptive View Region Selection (AVRS). Firstly, the HFE comprehensively evaluates the significance of intra-view patches and views driven by inter-view and intraview attention. Then, the AVRS adaptively selects crucial patch Tokens from different views to serve as sources of subtle local features. Finally, the HFR refines the 3D shape descriptor, capturing more discriminative global and subtle local features by leveraging both the view and selected crucial patch Tokens. Extensive experiments on FG3D and ModelNet40 demonstrate the superiority of FG3Dformer in FGSC and meta-category 3D shape classification tasks.
Facial Action Unit (AU) detection is a crucial task in affective computing and social robotics as it helps to identify emotions expressed through facial expressions. Anatomically, there are innumerable correlations be...
详细信息
Under-Display Camera (UDC) has been widely exploited to help smartphones realize full-screen displays. However, as the screen could inevitably affect the light propagation process, the images captured by the UDC syste...
详细信息
暂无评论