ISBN (print): 9781728134604
This paper describes the inclusion of a visual servoing project in an undergraduate robotics course. The robotic platform is a vision-enhanced VEX robot, constructed by integrating the commercially available VEX robot with the Raspberry Pi single-board computer (plus its camera module). The project takes students through several key steps in a vision-based control task, including camera calibration, depth recovery, image processing, communication, and basic robot control. Vision-based control, once a graduate-level subject, is thereby made accessible to undergraduate students. The project will strengthen students' understanding of relevant topics, expose them to more complex robotic systems, give them opportunities to program and design in multidisciplinary areas, and equip them with a more complete set of knowledge, skills, and hands-on experience that benefits their other activities at school and beyond. Student feedback confirms that the project is effective in enhancing their learning. The use of commercially available robotic kits and components allows the project to be easily reproduced at other institutions.
ISBN (print): 9781450366151
This paper addresses Bayesian Sparse Signal Recovery (SSR) for Multiple Measurement Vectors when the elements of each row of the solution matrix are correlated. We propose a standard linear Gaussian observation model and a three-level hierarchical estimation framework, based on a Gaussian Scale Mixture (GSM) model with some random and some deterministic parameters, to model each row of the unknown solution matrix. This hierarchical model induces a heavy-tailed marginal distribution over each row, which encompasses several choices of distribution, viz. the Laplace distribution, Student's t distribution, and the Jeffreys prior. The Automatic Relevance Determination (ARD) phenomenon introduces sparsity into the model. Notably, the Block Sparse Bayesian Learning framework is a special case of the proposed framework when the induced marginal is the Jeffreys prior. Experimental results on synthetic signals demonstrate its effectiveness. We also explore the possibility of using Multiple Measurement Vectors to model the Dynamic Hand Posture Database, which consists of sequences of temporally correlated hand postures. By exploiting the temporal correlation present in successive image samples, the proposed framework can reconstruct the data with high fidelity from fewer random linear measurements.
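For orientation, the setting described in this abstract can be written down in standard MMV notation. The symbols below ($Y$, $\Phi$, $X$, $V$, $\gamma_i$, $B$) are assumptions chosen for illustration and are not taken from the paper; the abstract does not give its notation or the details of its three-level hierarchy.

```latex
% Linear Gaussian observation model with L measurement vectors:
%   Y: N x L measurements, \Phi: N x M sensing matrix,
%   X: M x L row-sparse solution matrix, V: Gaussian noise.
Y = \Phi X + V

% Gaussian Scale Mixture on the i-th row x_i of X:
% conditionally Gaussian, with a row-wise scale \gamma_i and a
% covariance B that captures the correlation within the row.
x_i \mid \gamma_i \sim \mathcal{N}(0, \gamma_i B), \qquad
p(x_i) = \int \mathcal{N}(x_i; 0, \gamma_i B)\, p(\gamma_i)\, d\gamma_i
```

Different mixing densities $p(\gamma_i)$ yield different heavy-tailed marginals $p(x_i)$. Under ARD, a scale $\gamma_i$ driven to zero prunes the whole row $x_i$, which is how row sparsity arises.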
ISBN (print): 9781450366151
Monocular SLAM refers to using a single camera to estimate robot ego-motion while building a map of the environment. While Monocular SLAM is a well-studied problem, automating it by integrating it with trajectory planning frameworks is particularly challenging. This paper presents a novel formulation based on Reinforcement Learning (RL) that generates fail-safe trajectories, along which the SLAM outputs do not deviate significantly from their true values. In essence, the RL framework learns the otherwise complex relation between perceptual inputs and motor actions and uses this knowledge to generate trajectories that do not cause SLAM to fail. We show systematically in simulation how the quality of SLAM improves dramatically when trajectories are computed using RL. Our method scales effectively across Monocular SLAM frameworks, both in simulation and in real-world experiments with a mobile robot.
ISBN (print): 9781450366151
This article presents an algorithm for salient object detection that leverages the Bayesian surprise of the Restricted Boltzmann Machine (RBM). An RBM is trained on patches sampled randomly from the input image. Because of this random sampling, the RBM is likely to be exposed to more background patches than object patches. The trained RBM will therefore minimize the free energy of its hidden states with respect to the background patches rather than the object. According to the free energy principle, this amounts to minimizing Bayesian surprise, a measure of saliency based on the Kullback-Leibler divergence between the input and reconstructed patch distributions. Hence, when the trained RBM is exposed to patches from the object region, it yields a high divergence and, in turn, a high Bayesian surprise. Pixels with high Bayesian surprise can thus be considered salient. For each pixel, a neighborhood (of the same size as the training patch) is fed to the trained RBM to obtain the reconstructed patch. The Kullback-Leibler divergence between the input and reconstructed neighborhood of each pixel is then computed to measure the Bayesian surprise, and is stored at the corresponding position in a matrix to form the saliency map. Experiments are carried out on three datasets, namely MSRA-10K, ECSSD, and DUTS. The results show promising performance for the proposed approach.
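The per-pixel step described in this abstract (take a neighborhood, reconstruct it, store the Kullback-Leibler divergence in a saliency matrix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `reconstruct` is a hypothetical stand-in for the trained RBM, and normalising patch intensities into distributions, the border clamping, and the `eps` smoothing are all assumptions.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) after smoothing and normalising both patches to sum to 1."""
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))

def saliency_map(image, reconstruct, patch=3):
    """Fill a matrix with the KL divergence between each pixel's
    neighbourhood and its reconstruction.

    image: 2D list of non-negative intensities.
    reconstruct: hypothetical callable standing in for the trained RBM;
        it maps a flattened patch to a reconstructed flattened patch.
    """
    h, w = len(image), len(image[0])
    r = patch // 2
    sal = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood around (y, x); borders are clamped (assumption).
            nb = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                  for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            sal[y][x] = kl_divergence(nb, reconstruct(nb))
    return sal

# Toy stand-in for a model that only learned "flat background":
# it reconstructs every patch as its own mean intensity.
def flat_background(nb):
    mean = sum(nb) / len(nb)
    return [mean] * len(nb)

image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0  # a single bright "object" pixel
sal = saliency_map(image, flat_background)
```

In this toy setup, flat regions are reconstructed exactly and score near zero, while the neighborhood centered on the bright pixel is reconstructed poorly and receives the highest score.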
The article takes critical aim at one of the most popular uses of Augmented Reality technology: enriching literary works. The authors look at popular publications by analyzing the methods of communication u...
ISBN (print): 9781450366151
Zero-shot learning (ZSL) for visual recognition aims to identify samples from previously unseen classes, given a model trained on labeled visual samples of seen classes and additional class-level semantic side information for all classes. ZSL is often tackled by learning an embedding function from the visual to the semantic space or vice versa. However, learning this mapping often causes the learned embedding space to lose its discriminative property, severely compromising recognition performance on the test samples. To improve discrimination in the embedding space, we introduce a ZSL framework that leverages the intuitive idea of cross-domain, triplet-based metric learning to learn such a space. Additionally, we introduce a novel graph-Laplacian-based regularizer that aligns the graph structures of the visual and semantic spaces in the learned embedding space. Simultaneously optimizing both criteria results in a compact, discriminative, and meaningful embedding space, which is experimentally found to be superior to most of its existing counterparts on both the standard ZSL (AwA and CUB) and the challenging generalized ZSL (AwA1, AwA2, CUB) settings.
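The idea of a cross-domain triplet can be illustrated with the generic hinge-style triplet loss. This is only a sketch under stated assumptions: the abstract does not give the authors' exact loss, so the Euclidean distance, the `margin` value, and the choice of a visual embedding as anchor with class-level semantic embeddings as positive and negative are all illustrative.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic hinge triplet loss: the anchor should be closer to the
    positive than to the negative by at least `margin`."""
    return max(0.0, margin + euclidean(anchor, positive) - euclidean(anchor, negative))

# Hypothetical embeddings in a shared space:
visual_sample = [0.0, 0.0]    # embedded image feature (anchor)
own_class_sem = [0.0, 1.0]    # semantic embedding of the sample's class
other_class_sem = [3.0, 4.0]  # semantic embedding of a different class

satisfied = triplet_loss(visual_sample, own_class_sem, other_class_sem)
violated = triplet_loss(visual_sample, other_class_sem, own_class_sem)
```

The loss is zero once the correct class embedding is closer than the wrong one by the margin, and grows linearly with the size of the violation otherwise.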
In this paper, we propose a geometric feature and frame segmentation based approach for video summarization. Video summarization aims to generate a summarized video with all the salient activities of the input video. ...
This paper proposes an efficient method of character segmentation for handwritten text. The main challenge in character segmentation of handwritten text is the varied size of each letter in different documents, conne...
The content of document images often shows a hierarchical, multi-layered tree structure. Further, the algorithms for document image applications like line detection, paragraph detection, word recognition, layout anal...
ISBN (digital): 9781510630765
ISBN (print): 9781510630765
Image segmentation has always been a key research issue in the field of computer vision. Image segmentation networks that use deep learning methods require a large number of finely labeled samples, which are difficult to obtain. In this paper, we combine the focal loss function with fully convolutional networks to improve network performance. We also collected and built a dataset containing 1500 samples with complex backgrounds. Trained on this dataset, the improved network achieves 81.55% mean average precision and 76.13% mean intersection over union.
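For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of a binary focal loss and of mean intersection over union on flattened label lists. It is not the authors' network or training code; the `gamma` and `alpha` defaults are commonly used values and an assumption here, since the abstract does not state the settings used.

```python
import math

def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over pixels.

    probs: predicted foreground probabilities, one per pixel.
    targets: ground-truth labels (1 = foreground, 0 = background).
    gamma down-weights easy pixels; alpha balances the two classes.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
        p_t = p if t == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

easy = focal_loss([0.9], [1])   # confident, correct pixel
hard = focal_loss([0.6], [1])   # uncertain pixel
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

With `gamma=0` and `alpha=0.5` the loss reduces to half the ordinary binary cross-entropy, which is a convenient sanity check; larger `gamma` shifts the training signal toward hard, misclassified pixels.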