Search Results - Inner Mongolia University Library

Shape Recognition and Pose Estimation for Mobile Augmented Reality

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2011, Vol. 17, No. 10, pp. 1369-1379

Authors: Hagbi, Nate; Bergig, Oriel; El-Sana, Jihad; Billinghurst, Mark. Affiliations: Ben Gurion Univ Negev, Visual Media Lab, IL-84105 Beer Sheva, Israel; Univ Canterbury, HIT Lab NZ, Canterbury, New Zealand

Nestor is a real-time recognition and camera pose estimation system for planar shapes. The system allows shapes that carry contextual meanings for humans to be used as Augmented Reality (AR) tracking targets. The user can teach the system new shapes in real time. New shapes can be shown to the system frontally, or they can be automatically rectified according to previously learned shapes. Shapes can be automatically assigned virtual content by classification according to a shape class library. Nestor performs shape recognition by analyzing contour structures and generating projective-invariant signatures from their concavities. The concavities are further used to extract features for pose estimation and tracking. Pose refinement is carried out by minimizing the reprojection error between sample points on each image contour and its library counterpart. Sample points are matched by evolving an active contour in real time. Our experiments show that the system provides stable and accurate registration, and runs at interactive frame rates on a Nokia N95 mobile phone.

Keywords: Multimedia information systems; artificial, augmented, and virtual realities; image processing and computer vision; scene analysis; tracking
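
The reprojection error mentioned in the abstract above can be illustrated with a small sketch. It assumes the planar shape is mapped into the image by a 3x3 homography, which is an assumption of this example and not a detail confirmed by the abstract; the function names and sample points are hypothetical. The code only evaluates the error for a given pose, it does not perform the minimization.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Homography = Sequence[Sequence[float]]  # 3x3 matrix, row-major (assumed pose model)


def project(h: Homography, p: Point) -> Point:
    """Map a library-plane point into the image using homogeneous coordinates."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def mean_reprojection_error(
    h: Homography, library_pts: List[Point], image_pts: List[Point]
) -> float:
    """Mean squared distance between projected library points and matched image points."""
    if not library_pts or len(library_pts) != len(image_pts):
        raise ValueError("need the same non-zero number of matched points")
    total = 0.0
    for lib_pt, img_pt in zip(library_pts, image_pts):
        u, v = project(h, lib_pt)
        total += (u - img_pt[0]) ** 2 + (v - img_pt[1]) ** 2
    return total / len(library_pts)


# Hypothetical sample points on a library contour and their image matches.
library_contour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
image_contour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(mean_reprojection_error(identity, library_contour, image_contour))
```

A pose refinement step would search for the homography (or camera pose) that makes this value as small as possible.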

PADS: A Probabilistic Activity Detection Framework for Video Data

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, Vol. 32, No. 12, pp. 2246-2261

Authors: Albanese, Massimiliano; Chellappa, Rama; Cuntoor, Naresh; Moscato, Vincenzo; Picariello, Antonio; Subrahmanian, V. S.; Udrea, Octavian. Affiliations: Univ Maryland, Inst Adv Comp Studies, Dept Comp Sci, College Pk, MD 20742, USA; Kitware Inc, Clifton Pk, NY 12065, USA; Univ Naples Federico II, Dipartimento Informat & Sistemist, I-80125 Naples, Italy; IBM TJ Watson Res Ctr, Hawthorne, NY 10532, USA

There is now a growing need to identify various kinds of activities that occur in videos. In this paper, we first present a logical language called Probabilistic Activity Description Language (PADL) in which users can specify activities of interest. We then develop a probabilistic framework which assigns to any subvideo of a given video sequence a probability that the subvideo contains the given activity, and we finally develop two fast algorithms to detect activities within this framework. OffPad finds all minimal segments of a video that contain a given activity with a probability exceeding a given threshold. In contrast, the OnPad algorithm examines a video during playout (rather than afterwards as OffPad does) and computes the probability that a given activity is occurring (even if the activity is only partially complete). Our prototype Probabilistic Activity Detection System (PADS) implements the framework and the two algorithms, building on top of existing image processing algorithms. We have conducted detailed experiments and compared our approach to four different approaches presented in the literature. We show that, for complex activity definitions, our approach outperforms all the other approaches.

Keywords: Applications and expert knowledge-intensive systems; computer vision; vision and scene understanding; video analysis; image processing and computer vision; applications
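
The task attributed to OffPad above (finding all minimal segments whose activity probability exceeds a threshold) can be illustrated with a brute-force sketch. This is not the paper's fast algorithm and does not use PADL; the detections, the "enter then exit" activity, and the way probabilities are combined are all hypothetical and chosen only to make the idea concrete.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int]  # inclusive (start_frame, end_frame)

# Hypothetical low-level detections: (frame, event label, detection probability).
DETECTIONS = [(1, "enter", 0.9), (3, "exit", 0.8), (4, "enter", 0.7), (6, "exit", 0.9)]


def toy_probability(start: int, end: int) -> float:
    """Toy model: best product of an 'enter' followed by an 'exit' inside the segment."""
    best = 0.0
    for f1, label1, p1 in DETECTIONS:
        for f2, label2, p2 in DETECTIONS:
            if label1 == "enter" and label2 == "exit" and start <= f1 < f2 <= end:
                best = max(best, p1 * p2)
    return best


def minimal_segments(
    n_frames: int, prob: Callable[[int, int], float], threshold: float
) -> List[Segment]:
    """Return segments whose probability exceeds the threshold and that
    contain no smaller qualifying segment (exhaustive search)."""
    qualifying = [
        (s, e)
        for s in range(n_frames)
        for e in range(s, n_frames)
        if prob(s, e) > threshold
    ]
    return [
        seg
        for seg in qualifying
        if not any(
            other != seg and seg[0] <= other[0] and other[1] <= seg[1]
            for other in qualifying
        )
    ]


print(minimal_segments(8, toy_probability, 0.6))
```

Raising the threshold in this toy setting drops the weaker enter/exit pairs, so fewer and sometimes longer minimal segments remain.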

Truthful Color Reproduction in Spatial Augmented Reality Applications

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2013, Vol. 19, No. 2, pp. 236-248

Authors: Menk, Christoffer; Koch, Reinhard. Affiliations: Volkswagen AG, D-38446 Wolfsburg, Germany; Univ Kiel, Inst Informat, D-24098 Kiel, Germany

Spatial augmented reality is especially interesting for the design process of a car, because a lot of virtual content and corresponding real objects are used. One important issue in such a process is that the designer must be able to trust the visualized colors on the real object, because design decisions are made on the basis of the projection. In this paper, we present an interactive visualization technique which is able to exactly compute the RGB values for the projected image, so that the resulting colors on the real object are perceived as the desired colors. Our approach uses a physically based computation to determine the influence of the ambient light, the material, the pose, and the color model of the projector on the colors that result from the projected RGB values. This information allows us to compute the adjustment for the RGB values for varying projector positions at interactive rates. Since the range of projectable colors depends not only on the material and the ambient light, but also on the pose of the projector, our method can be used to interactively adjust the range of projectable colors by moving the projector to arbitrary positions around the real object. We further extend the method so that it is applicable to multiple projectors. All methods are evaluated in a number of experiments.

Keywords: computer graphics; picture/image generation; display algorithms; raytracing; augmented reality; image processing and computer vision; radiometry; color

Automatic labeling of facial zones for digital clinical application: An ensemble of semantic segmentation models

SKIN RESEARCH AND TECHNOLOGY, 2024, Vol. 30, No. 2, e13625

Authors: Tuazon, Rafael; Mortezavi, Siavash. Affiliations: AbbVie, Irvine, CA, USA; AbbVie, 2525 Dupont Dr, Irvine, CA 92612, USA

Introduction: The application of artificial intelligence to facial aesthetics has been limited by the inability to discern facial zones of interest, as defined by complex facial musculature and underlying structures. Although semantic segmentation models (SSMs) could potentially overcome this limitation, existing facial SSMs distinguish only three to nine facial zones of *** developed a new supervised SSM, trained on 669 high-resolution clinical-grade facial images; a subset of these images was used in an iterative process between facial aesthetics experts and manual annotators that defined and labeled 33 facial zones of *** some zones overlap, some pixels are included in multiple zones, violating the one-to-one relationship between a given pixel and a specific class (zone) required for SSMs. The full facial zone model was therefore used to create three sub-models, each with completely non-overlapping zones, generating three outputs for each input image that can be treated as standalone models. For each facial zone, the output demonstrating the best Intersection Over Union (IOU) value was selected as the winning *** new SSM demonstrates mean IOU values superior to manual annotation and landmark analyses, and it is more robust than landmark methods in handling variances in facial shape and structure.

Keywords: computer vision; face and gesture recognition; image processing and computer vision; pixel classification; segmentation
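
The selection step described in the abstract above (keeping, for each facial zone, the output with the best Intersection Over Union) can be sketched as follows. Masks are represented here as sets of (row, column) pixel coordinates, and the sub-model names and toy masks are hypothetical; the abstract does not specify how the authors represent masks or compute IOU in practice.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def iou(predicted: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection Over Union of two pixel sets; 0.0 when both are empty."""
    union = predicted | truth
    if not union:
        return 0.0
    return len(predicted & truth) / len(union)


def pick_best_output(outputs: Dict[str, Set[Pixel]], truth: Set[Pixel]) -> str:
    """Return the name of the output whose mask has the highest IOU for this zone."""
    return max(outputs, key=lambda name: iou(outputs[name], truth))


# Hypothetical annotated mask for one zone and three candidate outputs.
zone_truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
zone_outputs = {
    "sub_model_a": {(0, 0), (0, 1)},
    "sub_model_b": {(0, 0), (0, 1), (1, 0)},
    "sub_model_c": {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)},
}

for name, mask in zone_outputs.items():
    print(name, iou(mask, zone_truth))
print("selected:", pick_best_output(zone_outputs, zone_truth))
```

Repeating this choice for every zone gives one selected output per zone, which is the per-zone selection the abstract describes.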

Deep learning-based object recognition in multispectral satellite imagery for real-time applications

MACHINE VISION AND APPLICATIONS, 2021, Vol. 32, No. 4, p. 98

Authors: Gudzius, Povilas; Kurasova, Olga; Darulis, Vytenis; Filatovas, Ernestas. Affiliation: Vilnius Univ, Inst Data Sci & Digital Technol, Akad St 4, LT-08412 Vilnius, Lithuania

Satellite imagery is changing the way we understand and predict economic activity in the world. Advancements in satellite hardware and low-cost rocket launches have enabled near-real-time, high-resolution images covering the entire Earth. It is too labour-intensive, time-consuming and expensive for human annotators to analyse petabytes of satellite imagery manually. Current computer vision research exploring this problem still lacks accuracy and prediction speed, both important metrics for latency-sensitive automated industrial applications. Here we address both of these challenges by proposing a set of improvements to the object recognition model design, training and complexity regularisation, applicable to a range of neural networks. Furthermore, we propose a fully convolutional neural network (FCN) architecture optimised for accurate and accelerated object recognition in multispectral satellite imagery. We show that our FCN exceeds human-level performance with state-of-the-art 97.67% accuracy over multiple sensors, generalizes across dispersed scenery, and outperforms other proposed methods to date. Its computationally light architecture delivers a fivefold improvement in training time and rapid prediction, which is essential to real-time applications. To illustrate practical model effectiveness, we analyse it in an algorithmic trading environment. Additionally, we publish a proprietary annotated satellite imagery dataset for further development in this research field. Our findings can be readily implemented for other real-time applications too.

Keywords: Pattern Recognition; image processing and computer vision; Communications Engineering, Networks

Data Extraction of Circular-Shaped and Grid-like Chart Images

JOURNAL OF IMAGING, 2022, Vol. 8, No. 5, p. 136

Authors: Bajic, Filip; Job, Josip. Affiliations: Univ Zagreb, Univ Comp Ctr, Zagreb 10000, Croatia; Fac Elect Engn Comp Sci & Informat Technol Osijek, Osijek 31000, Croatia

Chart data extraction is a crucial research field in recovering information from chart images. With the recent rise in image processing and computer vision algorithms, researchers presented various approaches to tackle this problem. Nevertheless, most of them use different datasets, often not publicly available to the research community. Therefore, the main focus of this research was to create a chart data extraction algorithm for circular-shaped and grid-like chart types, which will accelerate research in this field and allow uniform result comparison. A large-scale dataset is provided containing 120,000 chart images organized into 20 categories, with corresponding ground truth for each image. Through the undertaken extensive research and to the best of our knowledge, no other author reports the chart data extraction of the sunburst diagrams, heatmaps, and waffle charts. In this research, a new, fully automatic low-level algorithm is also presented that uses a raster image as input and generates an object-oriented structure of the chart of that image. The main novelty of the proposed approach is in chart processing on binary images instead of commonly used pixel counting techniques. The experiments were performed with a synthetic dataset and with real-world chart images. The obtained results demonstrate two things: First, a low-level bottom-up approach can be shared among different chart types. Second, the proposed algorithm achieves superior results on a synthetic dataset. The achieved average data extraction accuracy on the synthetic dataset can be considered state-of-the-art within multiple error rate groups.

Keywords: chart data extraction; chart image processing; data visualization; image processing and computer vision

Unsupervised Document Binarization of Engineering Drawings via Multi Noise CycleGAN

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, Vol. 14, No. 7, pp. 838-844

Authors: Rosli, Luqman Hakim; Hooi, Yew Kwang; Bin, Ong Kai. Affiliation: Univ Teknol PETRONAS, Dept Comp & Informat Sci, Seri Iskandar, Malaysia

The task of document binarization of degraded complex documents is tremendously challenging due to the various forms of noise often present in these documents. While the current state-of-the-art deep learning approaches are capable of removing various noise types in documents with high accuracy, they employ a supervised learning scheme that requires matching pairs of clean and noisy document images, which are difficult and costly to obtain for complex documents such as engineering drawings. In this paper, we propose a method for document binarization of engineering drawings using 'Multi Noise CycleGAN'. The method, which uses unsupervised learning with adversarial and cycle-consistency losses, is trained on unpaired noisy document images with various noise and image conditions. Experimental results for the removal of various noise types demonstrate that the method is able to reliably produce a clean image for any given noisy image and, for certain noisy images, achieves significant improvements over existing methods.

Keywords: image processing and computer vision; generative adversarial networks; document binarization; deep learning

Chart Classification Using Siamese CNN

JOURNAL OF IMAGING, 2021, Vol. 7, No. 11, p. 220

Authors: Bajic, Filip; Job, Josip. Affiliations: Univ Zagreb, Univ Comp Ctr, Zagreb 10000, Croatia; Comp Sci & Informat Technol Osijek, Fac Elect Engn, Osijek 31000, Croatia

In recovering information from a chart image, the first step should be chart type classification. Many approaches have been used over time, and some achieve better results than others. Recent articles use a Support Vector Machine (SVM) in combination with a Convolutional Neural Network (CNN), which achieves almost perfect results on datasets of a few thousand images per class. The datasets containing chart images are primarily synthetic and lack real-world examples. To overcome the problem of small datasets, we use a Siamese CNN architecture for chart type classification; to our knowledge, this is the first such report. Multiple network architectures are tested, and the results for different dataset sizes are compared. The network verification is conducted using Few-shot learning (FSL). Many of the described advantages of Siamese CNNs are shown in examples. In the end, we show that the Siamese CNN can work with one image per class, and that a 100% average classification accuracy is achieved with 50 images per class, whereas the CNN achieves an average classification accuracy of only 43% on the same dataset.

Keywords: chart classification; chart image processing; data visualization; Siamese neural network; image processing and computer vision

Advances in Discrete Tomography and Its Applications 1

Series: Applied and Numerical Harmonic Analysis

2007

Authors: Gabor T. Herman; Attila Kuba

Synthesis of Screentone Patterns of Manga Characters

IEEE International Symposium on Multimedia

Authors: Koki Tsubota; Daiki Ikami; Kiyoharu Aizawa. Affiliation: The University of Tokyo, Tokyo, Japan

ISBN (digital): 9781728156064

ISBN (print): 9781728156071

Manga or Japanese comics are a popular medium and their images comprise line drawings and screentones. This study investigates the screentone synthesis task that involves translation from line drawings to manga images. Screentones have regular patterns that are difficult to synthesize. To address this problem, we propose a method to translate line drawings into manga images by generating pixel-wise screentone class labels instead of generating manga images directly. To train a screentone label generator, we create paired data of line drawings and pixel-wise screentone class labels that we obtain by applying to manga images a screentone removal and a screentone classifier, respectively. We train the screentone classifier using paired data of simulated manga images and pixel-wise screentone class labels. In tests, we conduct post-processing to reduce noise in the generated pixel-wise screentone labels. Experiments show that our proposed method produces reasonable screentone patterns. In comparison with results obtained using a baseline method of image-to-image translations, our results are comparable or more visually appealing.

Keywords: Manga; Screentone; image processing and computer vision; image generation; Line drawings; Paired Data; images; baseline method; computer-Assisted image processing; synthesis
