ISBN (digital): 9781538661000
ISBN (print): 9781538661000
Semantic segmentation of satellite images is one of the most challenging problems in computer vision, as it requires a model capable of capturing both local and global information at each pixel. Current state-of-the-art methods are based on Fully Convolutional Neural Networks (FCNN) with two main components: an encoder, which is a pretrained classification model that gradually reduces the input spatial size, and a decoder, which transforms the encoder's feature map into a predicted mask of the original size. We change this conventional architecture to a model that makes use of full-resolution information. NU-Net is a deep FCNN that is able to capture wide field-of-view global information around each pixel while maintaining localized full-resolution information throughout the model. We evaluate our model on the Land Cover Classification and Road Extraction tracks in the DeepGlobe competition.
ISBN (print): 9781509014378
In this paper, we present a distributed embedded vision system that enables surround scene analysis and vehicle threat estimation. The proposed system analyzes the surroundings of the ego-vehicle using four cameras, each connected to a separate embedded processor. Each processor runs a set of optimized vision-based techniques to detect surrounding vehicles, so that the entire system operates at real-time speeds. This setup has been demonstrated on multiple vehicle testbeds with high levels of robustness under real-world driving conditions and is scalable to additional cameras. Finally, we present a detailed evaluation which shows over 95% accuracy and operation at nearly 15 frames per second.
ISBN (print): 9780769549903
Recently released depth cameras provide effective estimation of 3D positions of skeletal joints in temporal sequences of depth maps. In this work, we propose an efficient yet effective method to recognize human actions based on the positions of joints. First, the body skeleton is decomposed into a set of kinematic chains, and the position of each joint is expressed in a locally defined reference system, which makes the coordinates invariant to body translations and rotations. A multi-part bag-of-poses approach is then defined, which permits the separate alignment of body parts through a nearest-neighbor classification. Experiments conducted on the Florence 3D Action dataset and the MSR Daily Activity dataset show promising results.
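The idea of expressing joints in a local reference system can be illustrated with a minimal sketch. This is a 2D simplification written for illustration only, not the authors' implementation: the function name `to_local_frame` and the choice of one joint as origin and a second joint as the axis direction are assumptions.

```python
import math

def to_local_frame(joints, origin_idx, axis_idx):
    """Express 2D joints in a frame anchored at one joint (hypothetical helper).

    The origin joint becomes (0, 0) and the axis joint lies on the positive
    x-axis, so the result does not change under translation or rotation of
    the whole skeleton.
    """
    ox, oy = joints[origin_idx]
    ax, ay = joints[axis_idx][0] - ox, joints[axis_idx][1] - oy
    norm = math.hypot(ax, ay)
    if norm == 0:
        raise ValueError("origin and axis joints coincide")
    c, s = ax / norm, ay / norm  # cosine and sine of the axis angle
    local = []
    for x, y in joints:
        dx, dy = x - ox, y - oy            # remove translation
        local.append((dx * c + dy * s,     # rotate by minus the axis angle
                      -dx * s + dy * c))
    return local

# The same pose, before and after a 90-degree rotation plus a shift.
pose = [(0, 0), (1, 0), (1, 1)]
moved = [(5, -2), (5, -1), (4, -1)]
print(to_local_frame(pose, 0, 1))
print(to_local_frame(moved, 0, 1))
```

Both calls give the same coordinates up to floating-point error, which is the invariance property the abstract relies on before poses are compared.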
ISBN (print): 9780769549903
"Big Data" analysis is an emerging topic in computer vision and pattern recognition. As one example problem of big data, we study semantic age labels and facial aging pattern analysis on a large database. In aging analysis, one of the great challenges is the lack of a large number of face images with ground truth age labels. Unlike many other example-based recognition problems, where human annotations can be used as the ground truth labels for both training and testing, it is quite difficult for human annotators to label the exact ages in face images. An alternative is to exploit face images with unlabeled ages to enhance age estimation performance. However, it is unclear whether face images with unlabeled ages can be used for age estimation, and how to use the unlabeled data. In this paper, we study these two problems comprehensively under two paradigms: semi-supervised learning and unsupervised learning for aging pattern analysis. We emphasize the importance of using ground truth age labels and a large database in order to derive a meaningful measure in the context of big data. Our study can make an impact on the collection of aging patterns, which is very expensive and time-consuming in practice.
ISBN (print): 9781728125060
With one in four individuals afflicted with malnutrition, computer vision may provide a way of introducing a new level of automation in the nutrition field to reliably monitor food and nutrient intake. In this study, we present a novel approach to modeling the link between color and vitamin A content using transmittance imaging of a pureed foods dilution series in a computer-vision-powered nutrient sensing system via a fine-tuned deep autoencoder network, which in this case was trained to predict the relative concentration of sweet potato purees. Experimental results show the deep autoencoder network can achieve an accuracy of 80% across beginner (6 month) and intermediate (8 month) commercially prepared pureed sweet potato samples. Prediction errors may be explained by fundamental differences in optical properties, which are further discussed.
ISBN (print): 9798350365474
Low-rank adaptation (LoRA) and its variants are widely employed in fine-tuning large models, including large language models for natural language processing and diffusion models for computer vision. This paper proposes a generalized framework called SuperLoRA that unifies and extends different LoRA variants, which can be realized under different hyper-parameter settings. Introducing new options with grouping, folding, shuffling, projection, and tensor decomposition, SuperLoRA offers high flexibility and demonstrates superior performance, with up to 10-fold gain in parameter efficiency for transfer learning tasks.
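For orientation, the plain LoRA update that such variants build on can be sketched as follows. This shows only the standard low-rank form W + (alpha / r) * B A, not SuperLoRA's grouping, folding, shuffling, projection, or tensor decomposition options. The helper names and the list-of-rows matrix representation are assumptions made for this illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_update(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank (rows of A)."""
    r = len(a)
    delta = matmul(b, a)      # low-rank weight change, same shape as W
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(wrow, drow)]
            for wrow, drow in zip(w, delta)]

def param_counts(d_out, d_in, r):
    """Trainable parameters for full fine-tuning versus a rank-r adapter."""
    return d_out * d_in, r * (d_out + d_in)

# Rank-1 adapter on a 2x2 identity weight.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]            # shape d_out x r
a = [[3.0, 4.0]]              # shape r x d_in
print(lora_update(w, b, a, alpha=1.0))
print(param_counts(1024, 1024, 8))
```

The parameter count makes the efficiency argument concrete: a rank-8 adapter on a 1024 x 1024 layer trains 16,384 values instead of 1,048,576.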
ISBN (print): 9781665448994
While makeup virtual try-on is now widespread, parametrizing a computer graphics rendering engine for synthesizing images of a given cosmetics product remains a challenging task. In this paper, we introduce an inverse computer graphics method for automatic makeup synthesis from a reference image, by learning a model that maps an example portrait image with makeup to the space of rendering parameters. This method can be used by artists to automatically create realistic virtual cosmetics image samples, or by consumers to virtually try on makeup extracted from their favorite reference image.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
In this paper, we analyze the impact of training strategies, transfer learning, and domain knowledge on two biometric problems: three-class oculus classification and fingerprint sensor classification. To analyze these problems, we consider a deep-learning-based architecture and evaluate our results on benchmark contact-lens datasets such as IIIT-D, ND, and IIT-K (our model is publicly available) and on fingerprint datasets such as FVC-2002, FVC-2004, FVC-2006, IIITD-MOLF, and IIT-K. An in-depth feature analysis of the proposed deep-learning models indicates that training in different ways, together with transfer learning and domain knowledge, plays a vital role in determining the learning ability of a network.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
We present a semantic segmentation algorithm for RGB remote sensing images. Our method is based on the Dilated Stacked U-Nets architecture. This state-of-the-art method has been shown to have good performance in other applications. We perform additional post-processing by blending image tiles and degridding the result. Our method gives competitive results on the DeepGlobe dataset.
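One common way to blend overlapping tile predictions is a weighted average that trusts tile centres more than tile edges. The sketch below is a 1-D illustration under that assumption. The abstract does not specify the authors' blending or degridding procedure, so the function name `blend_tiles` and the triangular weighting are hypothetical choices.

```python
def blend_tiles(length, tiles):
    """Weighted average of overlapping 1-D tile predictions.

    tiles is a list of (start, values) pairs. Each value is weighted by a
    triangular window that peaks at the tile centre, which softens seams
    where neighbouring tiles overlap.
    """
    total = [0.0] * length
    weight = [0.0] * length
    for start, values in tiles:
        n = len(values)
        for i, v in enumerate(values):
            w = min(i + 1, n - i)   # 1 at the edges, larger toward the centre
            total[start + i] += w * v
            weight[start + i] += w
    # Positions covered by no tile are left at 0.0.
    return [t / w if w > 0 else 0.0 for t, w in zip(total, weight)]

# Two tiles of width 4 that overlap on positions 2 and 3.
print(blend_tiles(6, [(0, [1, 1, 1, 1]), (2, [3, 3, 3, 3])]))
```

In the overlap the output moves gradually from the first tile's value toward the second's instead of jumping at a tile boundary. The same weighting extends to 2-D tiles by multiplying row and column windows.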
ISBN (print): 9780769549903
Symmetry is a pervasive phenomenon presenting itself in all forms and scales in natural and manmade environments. Its detection plays an essential role at all levels of human as well as machine perception. The recent resurging interest in computational symmetry for computer vision and computer graphics applications has motivated us to conduct a US NSF-funded symmetry detection algorithm competition as a workshop affiliated with the Computer Vision and Pattern Recognition (CVPR) conference, 2013. This competition sets a more complete benchmark for computer vision symmetry detection algorithms. In this report we explain the evaluation metric and the automatic execution of the evaluation workflow. We also present and analyze the algorithms submitted, and show their results on three test sets of real-world images depicting reflection, rotation, and translation symmetries respectively. This competition establishes a performance baseline for future work on symmetry detection.