In this paper, we propose a new lossy image compression technique suitable for computationally challenged platforms. Extensive development in moving platforms creates a need for encoding images in real time with less com...
Content-based image retrieval (CBIR) is an accurate characterization of visual information. In this paper, we propose a new technique called Transformed Directional Tri Concomitant Triplet Patterns (TdtCTp)...
Iris and periocular (collectively termed ocular) biometric modalities have been the most sought-after modalities due to their increased discrimination ability. Moreover, both these modalities can be captured through...
In this paper, we propose a novel 3D CNN with localized residual connections for hyperspectral image classification. Our work presents a comparative study with the existing methods employed for abstracting deeper...
Stroke is a major life-threatening disease that mostly occurs in people aged 65 years and above, but nowadays it also occurs at younger ages due to unhealthy diet. If we can predict a stroke in its early stage, then it can ...
Suggesting clothing items using query images is very convenient in online shopping; however, feature extraction is not easy for the intended query due to other objects and background variations present in these im...
Cyber attackers develop new malicious software to attack their targets every year. Recent sophisticated malware targets financial data and steals the credentials of users. Security analysts design novel methods to def...
ISBN: 9781728163956 (print)
This paper introduces the application of a visual relationship classifier as a standalone system that is meant to be used with external detectors. Through this lens, we propose a training scheme that uses unannotated pairs of objects as negative samples in order to improve precision. The proposed network architecture combines common techniques from related state-of-the-art solutions with a novel positional encoding scheme. We evaluate the proposed training method and architecture on the Open Images dataset and improve mAP from 34.6% to 78.2% when considering all possible object pairings in each image. For the case where only ground-truth pairs are considered, our method shows a small decrease, from 91.0% to 88.8% mAP.
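The idea of treating unannotated object pairs as negative samples can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_pair_labels`, the `"no_relation"` label, and the toy object names are all hypothetical, and the abstract does not say how negatives are sampled or weighted.

```python
from itertools import permutations


def build_pair_labels(object_ids, annotated, negative_label="no_relation"):
    """Label every ordered (subject, object) pair in one image.

    Pairs present in `annotated` keep their relationship label; every
    other pairing is assigned `negative_label` (an assumed convention).
    """
    labels = {}
    for subj, obj in permutations(object_ids, 2):
        labels[(subj, obj)] = annotated.get((subj, obj), negative_label)
    return labels


# Toy example: three detected objects, one annotated relationship.
objects = ["person", "horse", "hat"]
annotated = {("person", "horse"): "rides"}
pairs = build_pair_labels(objects, annotated)
```

With three objects there are six ordered pairings, of which one is a positive and five become negatives. Evaluating over all pairings, as the abstract describes, is the setting in which a classifier trained without such negatives would be expected to produce many false positives.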
Steganography or "Covered Writing" is a security tool for hiding secret information inside any media or object of interest. It provides secure communication between the sender and the receiver. The spatial d...
ISBN: 9781728163956 (print)
Visual localization is a fundamental problem in computer vision and robotics. Recently, deep learning has been shown to be effective for robust monocular localization. Most deep learning-based methods use a convolutional neural network (CNN) to regress the global 6 degree-of-freedom (DoF) pose. However, these methods suffer from pose sparsity, leading to over-fitting during training and poor localization performance on unseen data. In this paper, we try to alleviate this issue by applying random geometric augmentation (RGA) during training. Specifically, we first estimate the depth map of the initial training image using a depth estimation network. Combining the estimated depth, the RGB image, and its corresponding pose, we can randomly synthesize new images from different views. The synthesized and initial images are used to train the pose regression network. Experimental results show that our geometric augmentation strategy can significantly improve localization accuracy.
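The view-synthesis step can be sketched as a depth-based forward warp. This is an illustration under stated assumptions, not the paper's method: the abstract does not specify the warping procedure, so the pinhole intrinsics `K`, the relative motion `(R, t)`, the nearest-pixel splatting, and the function name `synthesize_view` are all hypothetical choices.

```python
import numpy as np


def synthesize_view(image, depth, K, R, t):
    """Forward-warp `image` into a camera displaced by (R, t).

    image: (H, W, C) array; depth: (H, W) depths in the source camera;
    K: (3, 3) pinhole intrinsics (assumed known); R (3, 3) and t (3,)
    map source-camera coordinates to new-camera coordinates.
    Returns the warped image and a mask of pixels that received a value.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)], axis=0)  # 3 x N
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()  # back-project to 3D
    pts_new = R @ pts + t.reshape(3, 1)  # move into the new camera frame
    z = pts_new[2]
    front = z > 1e-6  # keep only points in front of the new camera
    proj = K @ pts_new[:, front]
    u2 = np.round(proj[0] / proj[2]).astype(int)
    v2 = np.round(proj[1] / proj[2]).astype(int)
    inside = (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)
    src = np.flatnonzero(front)[inside]
    u2, v2, z2 = u2[inside], v2[inside], z[front][inside]
    order = np.argsort(-z2)  # write far points first so near ones overwrite
    out = np.zeros_like(image)
    mask = np.zeros((H, W), dtype=bool)
    colors = image.reshape(-1, image.shape[-1])
    out[v2[order], u2[order]] = colors[src[order]]
    mask[v2[order], u2[order]] = True
    return out, mask


# Toy example: unit depth, focal length 2, and a 0.5-unit shift along x,
# which moves every pixel one column to the right.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
depth = np.ones((4, 5))
K = np.array([[2.0, 0.0, 2.0], [0.0, 2.0, 1.5], [0.0, 0.0, 1.0]])
shifted, mask = synthesize_view(image, depth, K, np.eye(3), np.array([0.5, 0.0, 0.0]))
```

In a training pipeline of the kind the abstract describes, `(R, t)` would be sampled at random, and the pose label of the synthesized image would be the original pose composed with that motion. Pixels where `mask` is false are holes left by the warp, and how to fill or ignore them is a further design choice.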