ISBN: 9781509063529 (print)
Image scoring covers visual classification and regression tasks that assign each image a category or a precise score. As deep learning becomes more popular, large manually labeled datasets are required, which makes the task time-consuming and is hard to satisfy for small or medium-sized datasets. In this paper, we present a simple yet effective oversampling technique that considers both the entire image and local patches. Oversampling is a well-known trick in deep learning; in this paper it focuses on small patches rather than large patches that cover the entire object in the image. We first crop each image into many small patches, which augments the initial dataset and partially reduces over-fitting during training. The initial dataset and the expanded dataset can be seen as global and local information, respectively, for each image. Based on the expanded dataset, we can train a standard Convolutional Neural Network (CNN) and a patch-based CNN (patCNN). To integrate the global and local information, we further combine the two models into one, called comCNN, to obtain better performance. The experimental results show the effectiveness of patCNN and comCNN compared with state-of-the-art methods.
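The patch-cropping step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the patch size, the stride, or how patches are labeled, so `patch`, `stride`, and the rule that every patch inherits its source image's label are all hypothetical choices, and the function names are invented for this sketch.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def crop_patches(image: Image, patch: int, stride: int) -> List[Image]:
    """Slide a patch x patch window over the image and collect every crop."""
    height, width = len(image), len(image[0])
    patches: List[Image] = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            rows = image[top:top + patch]
            patches.append([row[left:left + patch] for row in rows])
    return patches


def expand_dataset(
    images: Sequence[Image], labels: Sequence[int], patch: int, stride: int
) -> Tuple[List[Image], List[int]]:
    """Build the 'local' dataset: each patch gets its source image's label.

    Label inheritance is an assumption; the abstract does not say how
    patches are labeled.
    """
    out_patches: List[Image] = []
    out_labels: List[int] = []
    for image, label in zip(images, labels):
        crops = crop_patches(image, patch, stride)
        out_patches.extend(crops)
        out_labels.extend([label] * len(crops))
    return out_patches, out_labels
```

With a stride smaller than the patch size the crops overlap, which multiplies the number of training samples further; whether the authors use overlapping patches is not stated.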
ISBN: 9781728136790 (digital)
ISBN: 9781728136790 (print)
In this paper, we propose a novel visual similarity-based phishing detection scheme that uses hue information with an automatically updated database. Since a PWS (phishing website) is created from a targeted legitimate website or from other subspecies whose hue information is similar to one another, many PWSs can be exhaustively detected by tracing similarly colored subspecies. Based on this notion, the proposed scheme detects a new PWS whose hue information is similar to that of already detected PWSs. By repeating this procedure, the detection scope can be effectively expanded. To avoid misdetecting legitimate websites whose hue information is similar to entries in the database, the proposed scheme exploits the fact that the combination of colors used is unlikely to be similar between legitimate websites and PWSs. Through computer simulation with a real dataset, we demonstrate that the proposed scheme improves detection performance as the number of detected PWSs increases.
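One way to picture a hue-based comparison between two page screenshots is a normalized hue histogram compared by histogram intersection. This is a generic sketch, not the scheme from the paper: the abstract does not specify the hue signature, the similarity measure, or any threshold, so the bin count and the function names below are assumptions made for illustration.

```python
import colorsys
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit RGB


def hue_histogram(pixels: Sequence[Pixel], bins: int = 12) -> List[float]:
    """Return a normalized histogram of pixel hues (bin count is an assumption)."""
    counts = [0] * bins
    for r, g, b in pixels:
        # colorsys expects floats in [0, 1] and returns hue in [0, 1).
        hue, _sat, _val = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        counts[min(int(hue * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def hue_similarity(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

In a setup like the one the abstract describes, a newly seen page whose similarity to a stored PWS signature exceeds some chosen threshold would be flagged and its own signature added to the database; the threshold and the additional color-combination check for legitimate sites are left out here because the abstract gives no details.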
ISBN: 9781467363587 (print)
We tackle the problem of globally localizing a camera-equipped micro aerial vehicle flying within urban environments for which a Google Street View image database exists. To avoid the caveats of current image-search algorithms in case of severe viewpoint changes between the query and the database images, we propose to generate virtual views of the scene, which exploit the air-ground geometry of the system. To limit the computational complexity of the algorithm, we rely on a histogram-voting scheme to select the best putative image correspondences. The proposed approach is tested on a 2 km image dataset captured with a small quadrocopter flying in the streets of Zurich. The success of our approach shows that our new air-ground matching algorithm can robustly handle extreme changes in viewpoint, illumination, perceptual aliasing, and over-season variations, thus outperforming conventional visual place-recognition approaches.
ISBN: 9781424400324 (print)
Retinal prostheses represent the best near-term hope for individuals with chronic blinding disease of the outer retina. However, the small number of stimulating electrodes produces a poor, low-resolution image. We propose a new preprocessing method for epi-retinal implants and validate it through a novel simulation of the implanted blind perception. Twenty-one normally sighted, untrained subjects performed a face recognition test. Three different electrode grids were simulated: rectangular, hexagonal, and log-polar. The results show that the proposed preprocessing method yields a good and statistically significant performance improvement.
ISBN: 9781479903566 (print)
Sparse coding is an active research topic in the machine learning and signal processing communities. In this paper, we propose a novel local sparse model for multi-label image annotation. Existing feature descriptors and extraction algorithms pay little attention to semantic information, and the extracted feature dimension is usually high, which leads to heavy computation. Noise and redundant information often reduce the performance of sparse models. To address these issues, we combine label and visual information for feature selection, whereas most previous work utilizes only labels and ignores the visual information itself. First, we use label sets to find neighbor relations among images and generate the Gaussian kernel matrix over these neighbor images, then use the LLP (Local Learning Projection) algorithm to obtain the minimal local estimation error. After that, for each query image, we find its K nearest neighbors in the transformed space and use these neighbors to reconstruct it via sparse coding. Moreover, during coding, we penalize the corresponding reconstruction coefficients to implicitly reflect the neighbor relations. Finally, we propagate tags from the training data to the test data. Image annotation experiments on the Corel5k dataset show that the performance of our approach is comparable to several state-of-the-art algorithms.
In this paper, we propose a method to improve image retrieval for visual localization by structuring the database. We are studying a cloud-based positioning infrastructure system that we call Universal Map. It can reduce...
ISBN: 9789811026690; 9789811026683 (print)
This paper is concerned with the experimental analysis of a visual inverted pendulum servoing system. First, the visual inverted pendulum servoing system is introduced, and three typical image processing algorithms are described. These three algorithms are then employed to process the image of the inverted pendulum captured by a camera. Comparative experiments are conducted, and the detection precision and real-time performance are analyzed. This lays a solid foundation for future control research on visual inverted pendulum servoing systems.
Performing Content-Based Image Retrieval (CBIR) on Internet-connected databases through a Peer-to-Peer (P2P) network (P2P-CBIR) effectively explores the large-scale image database distributed over connected peers. In ad...
ISBN: 9781424414369 (print)
Practical sparse approximation algorithms (particularly greedy algorithms) suffer two significant drawbacks: they are difficult to implement in hardware, and they are inefficient for time-varying stimuli (e.g., video) because they produce erratic temporal coefficient sequences. We present a class of locally competitive algorithms (LCAs) that correspond to a collection of sparse approximation principles minimizing a weighted combination of reconstruction MSE and a coefficient cost function. These systems use thresholding functions to induce local nonlinear competitions in a dynamical system. Simple analog hardware can implement the required nonlinearities and competitions. We show that our LCAs are stable under normal operating conditions and can produce sparsity levels comparable to existing methods. Additionally, these LCAs can produce coefficients for video sequences that are more regular (i.e., smoother and more predictable) than the coefficients produced by greedy algorithms.
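A locally competitive algorithm of this kind can be simulated in software as a small dynamical system. The sketch below assumes the commonly stated LCA form, in which each node's internal state u is driven by its match to the input, leaks toward zero, and is inhibited by the thresholded outputs of overlapping nodes. The soft-threshold nonlinearity (which corresponds to an L1 coefficient cost), the forward-Euler integration, the unit-norm atoms, and the values of `lam`, `tau`, and `steps` are all assumptions chosen for illustration, not details taken from the abstract.

```python
from typing import List, Sequence


def soft_threshold(u: float, lam: float) -> float:
    """Shrink u toward zero by lam; values inside [-lam, lam] become 0."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def dot(p: Sequence[float], q: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(p, q))


def lca(
    atoms: Sequence[Sequence[float]],  # atoms[k] is the k-th dictionary element
    signal: Sequence[float],
    lam: float = 0.1,    # threshold level (assumed value)
    tau: float = 10.0,   # time constant in Euler steps (assumed value)
    steps: int = 2000,
) -> List[float]:
    """Simulate du/dt = (b - u - (G - I) a) / tau with a = soft_threshold(u)."""
    n = len(atoms)
    drive = [dot(atom, signal) for atom in atoms]                    # b = Phi^T x
    gram = [[dot(atoms[i], atoms[j]) for j in range(n)] for i in range(n)]
    u = [0.0] * n
    for _ in range(steps):
        a = [soft_threshold(v, lam) for v in u]
        # Each active node inhibits the others in proportion to their overlap.
        inhibit = [
            sum(gram[i][j] * a[j] for j in range(n) if j != i) for i in range(n)
        ]
        u = [u[i] + (drive[i] - u[i] - inhibit[i]) / tau for i in range(n)]
    return [soft_threshold(v, lam) for v in u]
```

For example, with atoms `[1, 0]` and `[0.6, 0.8]` and the signal `[1, 0]`, the first node wins the competition and suppresses the second, so only one coefficient stays nonzero. Because the state evolves continuously, small changes in the input produce small changes in the coefficients, which is the property the abstract highlights for video.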
ISBN: 9798350388961 (digital)
ISBN: 9798350388978; 9798350388961 (print)
This study explores the use of the Pyramid Scene Parsing Network (PSPNet) architecture to achieve accurate segmentation of brain tumors in magnetic resonance (MR) images. Experimental evaluations were conducted on different pre-trained backbone network models, including VGG16, InceptionV3, MobileNetV2, EfficientNetB0, ResNet18, ResNet34, ResNet50, ResNet101, ResNeXt50, and ResNeXt101, assessing the performance of each model in brain tumor segmentation. The results highlight the VGG16-PSPNet model as the most successful, showing high F1-score, mIoU, precision, recall, and accuracy values.
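The metrics named in this abstract can all be derived from pixel-wise confusion counts. The sketch below assumes a binary tumor/background setting with flattened 0/1 masks and computes mIoU as the mean of the foreground and background IoU; the abstract does not say how the study defines its classes or averages its scores, so treat these as illustrative conventions. The function name is made up for this example.

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Compute pixel-wise metrics for flattened binary masks (1 = tumor)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when a class is absent instead of dividing by zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    iou_tumor = ratio(tp, tp + fp + fn)
    iou_background = ratio(tn, tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "miou": (iou_tumor + iou_background) / 2,  # two-class mean (assumption)
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }
```

Accuracy tends to look high on MR slices where the tumor occupies few pixels, which is why IoU and F1 are usually reported alongside it.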