Document images captured by digital cameras are often warped and distorted due to varying camera angles or document surfaces. A robust technique is needed to correct this kind of distortion. The research on dewarpi...
We present a task from the critical infrastructure field in materials engineering. We created a surrogate model of a bridge structure to determine the values of its material parameters. The work aims to use...
Deep learning has rapidly advanced, enabling new applications such as object detection, text recognition, and occlusion handling. However, challenges remain in detecting objects in complex environments such as aerial images, where motion blur, low light, and significant occlusion occur. This paper addresses this challenge by introducing a novel supervised framework, the Graphically Residual Attentive Network (GRESIDAN). GRESIDAN integrates three synergistic pipelines in a single model: object detection, occlusion detection, and occlusion removal. It uses a residually attentive block combining ResNet-18 and a multi-headed attention mechanism to improve feature extraction and detection accuracy in low-quality, occluded aerial images. A graphically attentive occlusion detection pipeline is implemented to handle occlusion, improve segmentation, and mask out the occluder in the aerial image. The GRESIDAN model is validated on the COCO-2017 dataset and a custom private aerial object detection dataset, outperforming state-of-the-art methods in handling occlusion and detecting objects. Our contributions provide a robust solution to the problem of detecting and handling occluded objects in aerial imagery, pushing the boundaries of automated visual recognition in challenging real-world scenarios. The training and testing code is publicly available on GitHub.
This paper presents a new method for automatic palmprint recognition based on the kernel PCA method, integrating the Gabor wavelet representation of palm images. Gabor wavelets are first applied to derive desirable palmprint features. The Gabor-transformed palm images exhibit strong spatial locality, scale selectivity, and orientation selectivity, and yield salient features well suited to palmprint recognition. The kernel PCA method then nonlinearly maps the Gabor-wavelet image into a high-dimensional feature space. The proposed algorithm has been successfully tested on two different public data sets from the PolyU palmprint databases, for which the samples were collected in two different sessions.
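The two stages described in this abstract (Gabor filtering, then kernel PCA) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the filter-bank parameters, the downsampling step, the RBF kernel, and all function names are assumptions made for the example.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * xr / wavelength)

def gabor_features(image, wavelengths=(4, 8), n_orient=4, size=9):
    """Magnitude responses over scales and orientations, downsampled and flattened."""
    spectrum = np.fft.fft2(image)
    feats = []
    for lam in wavelengths:
        for k in range(n_orient):
            g = gabor_kernel(size, lam, np.pi * k / n_orient, sigma=0.5 * lam)
            # Circular convolution via the FFT (kernel zero-padded to image size).
            response = np.abs(np.fft.ifft2(spectrum * np.fft.fft2(g, s=image.shape)))
            feats.append(response[::4, ::4].ravel())
    return np.concatenate(feats)

def kernel_pca(X, n_components, gamma=None):
    """Project the rows of X onto the leading kernel principal components.

    An RBF kernel is assumed here; the abstract does not name the kernel.
    """
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas
```

Calling `kernel_pca(np.stack([gabor_features(img) for img in images]), n_components=2)` on a list of equally sized grayscale arrays returns one low-dimensional feature vector per image, which a nearest-neighbour matcher could then compare.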
This special issue of the Journal of Intelligent & Fuzzy Systems is a selected collection of papers submitted to the IEEE International Conference on Algorithms, methodology, models and applications in emerging technologies and the International Conference on Telecommunication, Power analysis and Computing Techniques, held on February 16-18, 2017 and April 6-8, 2017 in Chennai, India. These papers have been reviewed and accepted for presentation at the conference and for publication in the Journal of Intelligent & Fuzzy Systems (JIFS). This special issue contains 50 papers covering a wide range of tools, techniques, and applications of artificial intelligence.
ISBN (print): 076951278X
We present a modification of the Mumford-Shah functional and its cartoon limit which allows the incorporation of statistical shape knowledge in a single energy functional. We show segmentation results on artificial and real-world images with and without prior shape information. In the case of occlusion and strongly cluttered background, the shape prior significantly improves segmentation. Finally, we compare our results to those obtained by a level-set implementation of geodesic active contours.
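For orientation, the functional referred to above can be written out in its standard textbook form. The notation here is an assumption: input image f, approximation u, contour C, weights λ, ν, α. The abstract does not give the form of the shape term, so the Gaussian prior in the last line is only one illustrative choice, with contour parameters z, mean shape z_0, and covariance Σ.

```latex
% Mumford-Shah functional (standard form)
E_{\mathrm{MS}}(u, C) = \frac{1}{2}\int_{\Omega} (f - u)^2 \, dx
  + \frac{\lambda^2}{2}\int_{\Omega \setminus C} |\nabla u|^2 \, dx
  + \nu \, |C|

% Cartoon limit: u is constant (u_i) on each region \Omega_i separated by C
E_{\mathrm{cartoon}}(\{u_i\}, C) = \frac{1}{2}\sum_i \int_{\Omega_i} (f - u_i)^2 \, dx
  + \nu \, |C|

% Adding shape knowledge in a single energy (illustrative Gaussian prior)
E(u, C) = E_{\mathrm{MS}}(u, C) + \alpha \, E_{\mathrm{shape}}(C), \qquad
E_{\mathrm{shape}}(C) = \frac{1}{2}\,(z - z_0)^{\top} \Sigma^{-1} (z - z_0)
```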
ISBN (print): 9781605580852
With the explosive growth of the Web and recent developments in digital media technology, the number of images on the Web has grown tremendously. Consequently, Web image clustering has emerged as an important application. Initial efforts in this direction clustered Web images using either their visual features or textual features drawn from the text surrounding the images. However, little work has been done on using multimodal information for clustering Web images. In this paper, we propose a graph theoretical framework for simultaneously integrating visual and textual features for efficient Web image clustering. Specifically, we model visual features, images, and words from surrounding text using a tripartite graph. Partitioning this graph leads to a clustering of the Web images. Although graph partitioning approaches have been adopted before, the main contribution of this work lies in a new algorithm that we propose, Consistent Isoperimetric High-order Co-clustering (CIHC), for partitioning the tripartite graph. Computationally, CIHC is very fast, as it requires only the solution of a sparse system of linear equations. Our theoretical analysis and extensive experiments on real Web images demonstrate the performance of CIHC in terms of quality, efficiency, and scalability in partitioning the visual feature-image-word tripartite graph.
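To illustrate why isoperimetric partitioning reduces to a single linear solve, here is a minimal sketch of the plain isoperimetric algorithm on a toy feature-image-word graph. It is not CIHC itself: the consistency constraints of the proposed method are not reproduced, the function name and toy weights are hypothetical, and a dense solver stands in for the sparse solver a real system would use.

```python
import numpy as np

def isoperimetric_partition(W, ground=0):
    """Two-way partition of a connected weighted graph with adjacency matrix W."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    d = W.sum(axis=1)
    L = np.diag(d) - W
    keep = np.array([i for i in range(n) if i != ground])
    # Removing the ground node's row and column makes the Laplacian nonsingular.
    x = np.zeros(n)
    x[keep] = np.linalg.solve(L[np.ix_(keep, keep)], d[keep])
    # Sweep thresholds over the sorted potentials; keep the lowest isoperimetric ratio.
    order = np.argsort(x)
    best_ratio, best_mask = np.inf, None
    for k in range(1, n):
        mask = np.zeros(n, dtype=bool)
        mask[order[:k]] = True
        cut = W[mask][:, ~mask].sum()
        vol = min(d[mask].sum(), d[~mask].sum())
        if cut / vol < best_ratio:
            best_ratio, best_mask = cut / vol, mask
    return best_mask, best_ratio

def toy_tripartite_graph():
    """Nodes 0-1: visual features, 2-5: images, 6-7: words (hypothetical weights)."""
    W = np.zeros((8, 8))
    edges = [(0, 2, 1.0), (0, 3, 1.0), (6, 2, 1.0), (6, 3, 1.0),
             (1, 4, 1.0), (1, 5, 1.0), (7, 4, 1.0), (7, 5, 1.0),
             (0, 4, 0.1)]  # one weak link between the two groups
    for i, j, w in edges:
        W[i, j] = W[j, i] = w
    return W
```

Running `isoperimetric_partition(toy_tripartite_graph())` and reading entries 2 to 5 of the returned mask gives the cluster assignment of the four images; features and words are partitioned at the same time, which is the co-clustering aspect.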
Head detection is an important but difficult task when no restrictions such as static illumination, frontal face appearance, or uniform background can be assumed. We present a system that is able to perform head detect...
ISBN (print): 9781605586083
In Content-based Image Retrieval (CBIR) research, techniques that fuse heterogeneous information into image clustering have drawn extensive attention recently. However, using multiple features to co-cluster images without any user feedback is a challenging problem. In this paper, we propose a Semi-Supervised Non-negative Matrix Factorization (SS-NMF) framework for image co-clustering. Our method computes new relational matrices by incorporating user-provided feedback into images through simultaneous distance metric learning and feature selection for different low-level visual features. Using an iterative algorithm, we perform tri-factorizations of the new matrices to infer image clusters. Theoretically, we show the convergence and correctness of SS-NMF co-clustering and its advantages over existing approaches. Through extensive experiments conducted on image data sets, we demonstrate that SS-NMF provides an effective and efficient solution for image co-clustering. Copyright 2009 ACM.
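The tri-factorization step mentioned above can be sketched with the standard multiplicative updates for non-negative matrix tri-factorization, X ≈ F S Gᵀ. This is only the plain unsupervised building block under assumed names and defaults; the user feedback, metric learning, and feature selection that define SS-NMF are not modelled here.

```python
import numpy as np

def tri_factorize(X, k_rows, k_cols, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix X as F @ S @ G.T with non-negative factors."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_rows)) + 0.1  # row-cluster indicators (e.g. images)
    S = rng.random((k_rows, k_cols)) + 0.1  # cluster association matrix
    G = rng.random((m, k_cols)) + 0.1  # column-cluster indicators (e.g. features)
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep every factor non-negative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors
```

Row clusters can then be read off as `F.argmax(axis=1)` and column clusters as `G.argmax(axis=1)`; the `errors` list makes it easy to check that the reconstruction error shrinks over the iterations.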