ISBN: (print) 9781509050475
Artificial vision is a branch of artificial intelligence that aims to simulate human vision, that is, the acquisition, processing, analysis and interpretation of images by an intelligent system. This work presents the creation of prototypes under the game jam model as a software product. The objective was to apply basic artificial vision algorithms such as linear discriminant analysis (LDA), principal component analysis (PCA), Fisherface, Otsu and CamShift, together with color spaces such as RGB and HSV, to perform motion detection of objects, face recognition and pedestrian detection. Applying this model to rapid prototyping, we identified significant factors (participatory design, light construction, product value approach, aesthetics and technology) in the implementation of innovative strategies for creating prototypes focused on software development.
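To illustrate one of the algorithms named in this abstract: Otsu's method chooses the gray-level threshold that maximizes the between-class variance of the two pixel groups it creates. The following is a minimal pure-Python sketch and not the prototypes' implementation; the function name `otsu_threshold` and the flat list of 8-bit intensities are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is assumed to be an iterable of ints in [0, levels).
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0      # number of pixels in the lower class so far
    sum_low = 0    # sum of intensities in the lower class so far

    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue            # lower class still empty
        w_high = total - w_low
        if w_high == 0:
            break               # upper class empty from here on

        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t

    return best_t
```

For a clearly bimodal input the returned threshold falls between the two modes, so comparing each pixel with it yields a binary foreground/background mask.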
ISBN: (print) 9781509061839
Lossy image compression algorithms are pervasively used to reduce the size of images transmitted over the web and recorded on data storage media. However, we pay for their high compression rate with visual artifacts that degrade the user experience. Deep convolutional neural networks have become a widespread and very successful tool for high-level computer vision tasks. Recently, they have found their way into low-level computer vision and image processing, where they solve regression problems, mostly with relatively shallow networks. We present a novel 12-layer deep convolutional network for image compression artifact suppression with hierarchical skip connections and a multi-scale loss function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an improvement of up to 0.36 dB over the best previous ConvNet result. We show that a network trained for a specific quality factor (QF) is resilient to the QF used to compress the input image: a single network trained for QF 60 provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.
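The results in this abstract are stated as PSNR gains in dB. For reference, PSNR is computed from the mean squared error (MSE) between a reference image and a distorted one as 10·log10(peak² / MSE), and a "gain" is the difference between two such values measured against the same reference. The sketch below is a minimal illustration, not the authors' evaluation code; the function name `psnr` and the flat pixel lists are assumptions.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    `peak` is the maximum possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(reference, compressed, restored, peak=255.0):
    """How many dB the restored image gains over the compressed one."""
    return psnr(reference, restored, peak) - psnr(reference, compressed, peak)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the MSE.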
ISBN: (print) 9781467399616
Increasing spatial resolution is often required in many applications such as entertainment systems or video surveillance. Apart from using higher resolution sensors, it is also possible to apply super resolution algorithms to realize an increased resolution. Those methods can be divided into approaches that rely on only a single low resolution image or on multiple low resolution video frames. While incorporating more frames into the super-resolution is beneficial for the resolution enhancement in principle, it is also likely to introduce more artifacts from inaccurate motion estimation. To alleviate this problem, various weightings have been proposed in the literature. In this paper, we propose an extended dual weighting scheme for an interpolation-based super-resolution method based on Voronoi tessellation that relies on both a motion confidence weight and a distance weight. Compared to non-weighted super-resolution, the proposed method yields an average gain in luminance PSNR of up to 1.29 dB and 0.61 dB for upscaling factors of 2 and 4, respectively. Visual comparisons substantiate the objective results.
ISBN: (print) 9781467385879
Optical character recognition (OCR) is one of the most active and interesting areas of text and character recognition and pattern-based image recognition. Today OCR is used successfully in finance, legal, banking, health care and home appliance applications. An OCR system consists of several processing stages, such as image pre-acquisition, classification, post-acquisition, pre-level processing, segmented processing, post-level processing and feature extraction. Many researchers have proposed different methodologies and approaches for these stages, for different languages, with the help of modern and traditional technologies. This paper presents a detailed study and analysis of various character recognition methods and approaches, covering the flow and type of methodology used, the type of algorithm built, the supporting technology, the background of each proposed methodology and its best outcomes. The paper also describes the main objectives and ideas of various OCR algorithms, such as neural network, structural, support vector, statistical and template matching algorithms, along with how they classify, identify, form rules and infer in order to recognize characters and symbols.
ISBN: (print) 9783319238142; 9783319238135
In this paper we present a new, publicly available database of color, high resolution images useful in evaluation of various algorithms in the field of video surveillance. The additional data provided with the images facilitates the evaluation of tracking, recognition and reidentification across sequences of images.
ISBN: (print) 9781509045839
We address the problem of position control of micro-chips (chiplets) immersed in dielectric fluid. An electric field, shaped by controlling the voltages of spiral-shaped electrodes, is used to reliably and accurately transport and position chiplets using dielectrophoretic forces. A lumped, capacitive-based (nonlinear) motion model is used to generate an open loop control policy. The open loop policy is generated using a one-step model predictive control approach. By exploiting the spatial symmetry and periodicity of the open loop control solution, a real-time control scheme is designed by applying simple algebraic operations to a base function defined on a finite domain. The chiplet position is tracked using image processing algorithms. We demonstrate the validity of our approach by describing an experimental result in which real-time control is used to move a chiplet 1000 μm in a controlled manner.
ISBN: (print) 9781467388153
An unprecedented growth in data generation is taking place. Data about larger dynamic systems is being accumulated, capturing finer granularity events, and thus processing requirements are increasingly approaching real-time. To keep up, data-analytics pipelines need to be viable at massive scale, and switch away from static, offline scenarios to support fully online analysis of dynamic systems. This paper uses a challenge problem, graph colouring, to explore massive-scale analytics for dynamic graph processing. We present an event-based infrastructure, and a novel, online, distributed graph colouring algorithm. Our implementation for colouring static graphs, used as a performance baseline, is up to an order of magnitude faster than previous results and handles massive graphs with over 257 billion edges. Our framework supports dynamic graph colouring with performance at large scale better than GraphLab's static analysis. Our experience indicates that online solutions are feasible, and can be more efficient than those based on snapshotting.
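To make the challenge problem concrete: a graph colouring assigns each vertex a colour such that no edge joins two vertices of the same colour. The paper's algorithm is online and distributed; the sketch below is neither. It is a sequential greedy baseline plus a naive local repair when a new edge arrives, given only to show what "static" and "dynamic" colouring mean here. The names `greedy_colouring`, `add_edge` and the adjacency-dictionary layout are assumptions made for this example.

```python
def smallest_free_colour(adjacency, colours, node):
    """Smallest non-negative colour not used by any coloured neighbour."""
    used = {colours[nb] for nb in adjacency[node] if nb in colours}
    colour = 0
    while colour in used:
        colour += 1
    return colour


def greedy_colouring(adjacency):
    """Colour a static undirected graph given as {node: set_of_neighbours}.

    Nodes are visited in order of decreasing degree, a common heuristic.
    """
    colours = {}
    order = sorted(adjacency, key=lambda n: len(adjacency[n]), reverse=True)
    for node in order:
        colours[node] = smallest_free_colour(adjacency, colours, node)
    return colours


def add_edge(adjacency, colours, u, v):
    """Insert edge (u, v) and repair the colouring locally if it breaks.

    Assumes u != v. Only one endpoint is recoloured, so the number of
    colours may grow; this is a naive repair, not an optimal one.
    """
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)
    for node in (u, v):
        colours.setdefault(node, 0)  # previously unseen nodes start at 0
    if colours[u] == colours[v]:
        colours[v] = smallest_free_colour(adjacency, colours, v)
```

The repair touches only one vertex per inserted edge, which hints at why an event-based approach can avoid recolouring a whole snapshot, although the paper's distributed scheme is necessarily more involved than this.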
ISBN: (print) 9781509016792
Emotion recognition systems have an important role to play in human-computer interaction (HCI) applications. These systems use facial features extracted from face images to verify or identify emotions. In this study, emotion identification algorithms are improved by using only the mouth region features of a face. The region of interest (the mouth region) is detected by the Viola-Jones algorithm in video frames that contain different emotional facial expressions. The outer boundaries of the lip shapes are extracted manually, and the scalar Fourier descriptors (FDs) of the boundaries are calculated. Classification and recognition of the emotions are performed according to the scalar FDs of the lip contours. Tests yield an accuracy rate of 93.9% for scalar FDs.
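The abstract does not define how its scalar Fourier descriptors are computed, so the following sketch assumes one common formulation: boundary points are treated as complex numbers, their discrete Fourier transform is taken, the DC term is dropped (translation invariance), only magnitudes are kept (rotation and start-point invariance), and everything is divided by the magnitude of the first harmonic (scale invariance). The function name `fourier_descriptors` and the `count` parameter are assumptions made for this example.

```python
import cmath


def fourier_descriptors(boundary, count=8):
    """Return `count` normalized Fourier descriptor magnitudes of a contour.

    `boundary` is an ordered list of (x, y) points along a closed contour.
    """
    n = len(boundary)
    if n < count + 2:
        raise ValueError("contour has too few points for the requested count")

    # Represent each boundary point as the complex number x + iy.
    z = [complex(x, y) for x, y in boundary]

    # Plain O(n^2) discrete Fourier transform; fine for short contours.
    coeffs = [
        sum(z[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n)) / n
        for u in range(n)
    ]

    # coeffs[0] depends on position only, so it is skipped.
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")

    # Magnitudes of harmonics 2 .. count + 1, normalized by harmonic 1.
    return [abs(c) / scale for c in coeffs[2:2 + count]]
```

The resulting vector could then be passed to any classifier; which classifier the study used is not stated in the abstract.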
ISBN: (print) 9781450347860
We are interested in building scalable computer vision systems for distributed processing of big visual data. We apply data streaming concepts, namely stream algebra operators, which have been proven effective in the database literature. The operators collectively form an algebra over data streams with well-defined semantics. The algebra naturally describes online computer vision algorithms as well as their feedback control and tuning algorithms. In this work, we present the first implementation of such an algebra at large scale. Our implementation provides a high-level programming interface for constructing and executing vision workflow graphs while hiding the data transfer and concurrency details. It also allows feedback control and dynamic reconfiguration of vision algorithms. A case study shows a streaming workflow for online lane and road boundary detection and describes the flexibility and effectiveness of the algebra for building complex distributed applications.
Classification of structural brain magnetic resonance (MR) images is a crucial task for many neurological phenotypes, and machine learning tools have increasingly been developed and applied to this problem in recent years. In this study, binary classification of T1-weighted structural brain MR images is performed using state-of-the-art machine learning algorithms when there is no information about the clinical context or specifics of neuroimaging. Image-derived features and clinical labels provided by the International Conference on Medical Image Computing and Computer-Assisted Intervention 2014 machine learning challenge are used. These morphological summary features are obtained from four different datasets (each N > 70) with clinically relevant phenotypes and are automatically extracted from the MR imaging scans using FreeSurfer, a freely distributed brain MR image processing software package. Widely used machine learning tools, namely back-propagation neural networks, self-organizing maps, support vector machines and k-nearest neighbors, are used as classifiers. Clinical prediction accuracy is obtained via cross-validation on the training data (N = 150), and predictions are made on the test data (N = 100). Classification accuracy (the fraction of cases where the prediction is accurate) and area under the ROC curve are used as the performance metrics, both for tuning the training hyperparameters and for evaluating the classifiers. The experiments revealed that support vector machines perform better than the other methods on clinical predictions using summary morphological features in the absence of any information about the phenotype. Prediction accuracy would increase greatly if contextual information were integrated into the system. (C) 2017 Wiley Periodicals, Inc.
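The two performance metrics named in this abstract can be sketched as follows. Accuracy is the fraction of correct predictions; the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. This is a minimal pure-Python illustration, not the study's evaluation code; the function names and the 0/1 label encoding are assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `y_true` holds binary labels (1 = positive, 0 = negative) and
    `scores` holds the classifier's confidence for the positive class.
    Runs in O(P * N) time, which is fine for a few hundred cases.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # Count positive/negative pairs ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Unlike accuracy, the AUC does not depend on a decision threshold, which is one reason the two are often reported together.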