Existing Multimodal Large Language Models (MLLMs) encounter significant challenges in modeling the temporal context within long videos. Currently, mainstream Agent-based methods use external tools (e.g., search engine...
详细信息
Generalization remains a significant challenge for low-level vision models, which often struggle with unseen degradations in real-world scenarios despite their success in controlled benchmarks. In this paper, we revis...
详细信息
Photomosaic images are composite images composed of many small images called *** its overall visual effect,a photomosaic image is similar to the target image,and photomosaics are also called“montage art”.Noisy block...
详细信息
Photomosaic images are composite images composed of many small images called *** its overall visual effect,a photomosaic image is similar to the target image,and photomosaics are also called“montage art”.Noisy blocks and the loss of local information are the major obstacles in most methods or programs that create photomosaic *** solve these problems and generate a photomosaic image in this study,we propose a tile selection method based on error minimization.A photomosaic image can be generated by partitioning the target image in a rectangular pattern,selecting appropriate tile images,and then adding them with a weight *** on the principles of montage art,the quality of the generated photomosaic image can be evaluated by both global and local *** the proposed framework,via an error function analysis,the results show that selecting a tile image using a global minimum distance minimizes both the global error and the local error ***,the weight coefficient of the image superposition can be used to adjust the ratio of the global and local ***,to verify the proposed method,we built a new photomosaic creation dataset during this *** experimental results show that the proposed method achieves a low mean absolute error and that the generated photomosaic images have a more artistic effect than do the existing approaches.
Video stylization transfers a source video into an artistic version while maintaining temporal coherence between adjacent frames. In this paper, we formulate the unsupervised example-based video stylization with Marko...
详细信息
ISBN:
(纸本)9781450306164
Video stylization transfers a source video into an artistic version while maintaining temporal coherence between adjacent frames. In this paper, we formulate the unsupervised example-based video stylization with Markov random field model. In our algorithm, we implement an improved optical flow algorithm to maintain temporal coherence while improve the accuracy of estimation along motion boundaries. We also extend our algorithm to the application of video personalization, in which human faces keep clear and distinguishable. A series of techniques are fused in video personalization, including face detection and alignment, motion flow, skin detection, and illumination blending. Given a source video and a style template image, our algorithm produces the stylized and/or personalized video(s) automatically. Experimental results demonstrate that our algorithm performs excellently in both video stylization and personalization. Copyright 2011 ACM.
This paper proposes a novel approach to single image super-resolution. First, an image up-sampling scheme is proposed which takes the advantages of both bilateral filtering and mean shift image segmentation. Then we u...
详细信息
ISBN:
(纸本)9781450306164
This paper proposes a novel approach to single image super-resolution. First, an image up-sampling scheme is proposed which takes the advantages of both bilateral filtering and mean shift image segmentation. Then we use a shock filter to enhance strong edges in the initial up-sampling result and obtain an intermediate high-resolution image. Finally, we enforce a reconstruction constraint on the high-resolution image so that fine details can be inferred by back projection. Since strong edges in the intermediate result are enhanced, ringing artifacts can be suppressed in the back projection step. We compare our algorithm with several state-of-the-art image super-resolution algorithms. Qualitative and quantitative experimental results demonstrate that our approach performs the best. Copyright 2011 ACM.
Long-range contextual information is essential for achieving high-performance semantic segmentation. Previous feature re-weighting methods demonstrate that using global context for re-weighting feature channels can ef...
详细信息
The attention mechanism plays a pivotal role in designing advanced super-resolution (SR) networks. In this work, we design an efficient SR network by improving the attention mechanism. We start from a simple pixel att...
详细信息
Both performance and efficiency are important to semantic segmentation. State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN), which adopt dilated conv...
详细信息
Real-time 3D sensing plays a critical role in robotic navigation, video surveillance and human-computer interaction, etc. When computing 3D structures of dynamic scenes from stereo sequences, spatiotemporal stereo and...
详细信息
This paper presents a system that can automatically segment objects in large scale 3D point clouds obtained from urban ranging images. The system consists of three steps: The first one involves a ground detection proc...
详细信息
ISBN:
(纸本)9781450306164
This paper presents a system that can automatically segment objects in large scale 3D point clouds obtained from urban ranging images. The system consists of three steps: The first one involves a ground detection process that can detect relatively complex terrain and separate it from other objects. The second step superpixelizes the remaining objects to speed up the segmentation process. In the final step, a manifold embedded mode seeking method is adopted to segment the point clouds. Even though the segmentation of urban objects is a challenging problem in terms of accuracy and problem scale, our system can efficiently generate very good segmentation results. The proposed manifold learning effectively improves the segmentation performance due to the fact that continuous artificial objects often have manifold-like structures. Copyright 2011 ACM.
暂无评论