ISBN (digital): 9783031018107
ISBN (print): 9783031006821
Being able to recover the shape of 3D deformable surfaces from a single video stream would make it possible to field reconstruction systems that run on widely available hardware without requiring specialized devices. However, because many different 3D shapes can have virtually the same projection, such monocular shape recovery is inherently ambiguous. In this survey, we will review the two main classes of techniques that have proved most effective so far: template-based methods, which rely on establishing correspondences with a reference image in which the shape is already known, and non-rigid structure-from-motion techniques, which exploit points tracked across the sequence to reconstruct a completely unknown shape. In both cases, we will formalize the approach, discuss its inherent ambiguities, and present the practical solutions that have been proposed to resolve them. To conclude, we will suggest directions for future research. Table of Contents: Introduction / Early Approaches to Non-Rigid Reconstruction / Formalizing Template-Based Reconstruction / Performing Template-Based Reconstruction / Formalizing Non-Rigid Structure from Motion / Performing Non-Rigid Structure from Motion / Future Directions
ISBN (digital): 9783031018152
ISBN (print): 9783031006876
Because circular objects are projected to ellipses in images, ellipse fitting is a first step for 3-D analysis of circular objects in computer vision applications. For this reason, the study of ellipse fitting began as soon as computers came into use for image analysis in the 1970s, but it is only recently that optimal computation techniques based on the statistical properties of noise were established. These include renormalization (1993), which was then improved as FNS (2000) and HEIV (2000). Later, further improvements, called hyperaccurate correction (2006), HyperLS (2009), and hyper-renormalization (2012), were presented. Today, these are regarded as the most accurate fitting methods among all known techniques. This book describes these algorithms as well as implementation details and applications to 3-D scene analysis. We also present general mathematical theories of statistical optimization underlying all ellipse fitting algorithms, including rigorous covariance and bias analyses and the theoretical accuracy limit. The results can be directly applied to other computer vision tasks, including computing fundamental matrices and homographies between images. This book can serve not simply as a reference on ellipse fitting algorithms for researchers, but also as learning material for beginners who want to start computer vision research. The sample program codes are downloadable from the website: https://***/a/***/ellipse-fitting-for-computer-vision-implementation-and-applications.
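To make the fitting problem concrete, the following is a minimal sketch of plain algebraic least-squares conic fitting, the simplest baseline and not one of the statistically optimal methods (renormalization, FNS, HyperLS, hyper-renormalization) the abstract names. It assumes the conic is normalized as A x^2 + B xy + C y^2 + D x + E y = 1, which breaks down if the ellipse passes through the origin. All function names here are hypothetical and are not taken from the book's sample code.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_conic(points):
    """Least-squares fit of A x^2 + B xy + C y^2 + D x + E y = 1.

    Assumption: the constant term is fixed to -1, so the curve must not
    pass through the origin. Returns [A, B, C, D, E].
    """
    rows = [[x * x, x * y, y * y, x, y] for x, y in points]
    # Normal equations (M^T M) v = M^T 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    atb = [sum(r[i] for r in rows) for i in range(5)]
    return solve_linear(ata, atb)


def conic_center(coeffs):
    """Center of the fitted conic: the point where its gradient vanishes."""
    A, B, C, D, E = coeffs
    det = 4 * A * C - B * B
    return (B * E - 2 * C * D) / det, (B * D - 2 * A * E) / det


def ellipse_points(cx, cy, a, b, theta, n):
    """Sample n noise-free points on a rotated ellipse (for demonstration)."""
    pts = []
    for k in range(n):
        t = 2 * math.pi * k / n
        u, v = a * math.cos(t), b * math.sin(t)
        pts.append((cx + u * math.cos(theta) - v * math.sin(theta),
                    cy + u * math.sin(theta) + v * math.cos(theta)))
    return pts
```

On noise-free samples, for example `fit_conic(ellipse_points(3.0, 2.0, 2.0, 1.0, 0.4, 12))`, this baseline recovers the ellipse up to rounding error. With noisy points it is known to be biased, which is the kind of shortcoming the statistically motivated methods listed above are designed to address.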
ISBN (digital): 9783031018121
ISBN (print): 9783031006845
In its early years, the field of computer vision was largely motivated by researchers seeking computational models of biological vision and solutions to practical problems in manufacturing, defense, and medicine. For the past two decades or so, there has been an increasing interest in computer vision as an input modality in the context of human-computer interaction. Such vision-based interaction can endow interactive systems with visual capabilities similar to those important to human-human interaction, in order to perceive non-verbal cues and incorporate this information in applications such as interactive gaming, visualization, art installations, intelligent agent interaction, and various kinds of command and control tasks. Enabling this kind of rich, visual, and multimodal interaction requires interactive-time solutions to problems such as detecting and recognizing faces and facial expressions, determining a person's direction of gaze and focus of attention, tracking movement of the body, and recognizing various kinds of gestures. In building technologies for vision-based interaction, there are choices to be made as to the range of possible sensors employed (e.g., single camera, stereo rig, depth camera), the precision and granularity of the desired outputs, the mobility of the solution, usability issues, etc. Practical considerations dictate that there is not a one-size-fits-all solution to the variety of interaction scenarios; however, there are principles and methodological approaches common to a wide range of problems in the domain. While new sensors such as the Microsoft Kinect are having a major influence on the research and practice of vision-based interaction in various settings, they are just a starting point for continued progress in the area. In this book, we discuss the landscape of history, opportunities, and challenges in this area of vision-based interaction; we review the state-of-the-art and seminal works in detecting and recognizing the human body
ISBN (digital): 9783031018138
ISBN (print): 9783031006852
Background subtraction is a widely used concept for detection of moving objects in videos. In the last two decades there has been a lot of development in designing algorithms for background subtraction, as well as wide use of these algorithms in various important applications, such as visual surveillance, sports video analysis, motion capture, etc. Various statistical approaches have been proposed to model scene backgrounds. The concept of background subtraction has also been extended to detect objects in videos captured from moving cameras. This book reviews the concept and practice of background subtraction. We discuss several traditional statistical background subtraction models, including the widely used parametric Gaussian mixture models and non-parametric models. We also discuss the issue of shadow suppression, which is essential for human motion analysis applications. This book discusses approaches and tradeoffs for background maintenance. This book also reviews many of the recent developments in the background subtraction paradigm. Recent advances in developing algorithms for background subtraction from moving cameras are described, including motion-compensation-based approaches and motion-segmentation-based approaches. For links to the videos to accompany this book, please see ***/a/***/backgroundsubtraction/ Table of Contents: Preface / Acknowledgments / Figure Credits / Object Detection and Segmentation in Videos / Background Subtraction from a Stationary Camera / Background Subtraction from a Moving Camera / Bibliography / Author's Biography
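As a rough illustration of statistical background modeling for a stationary camera, the sketch below keeps a single running Gaussian (mean and variance) per pixel. This is a deliberately simplified stand-in for the Gaussian mixture and non-parametric models the abstract mentions, not the book's method. Frames are assumed to be grayscale images stored as lists of rows, and the function names and the parameters `alpha`, `k`, `init_var`, and `min_var` are hypothetical choices for this example.

```python
def make_model(first_frame, init_var=25.0):
    """Initialize per-pixel mean and variance from the first frame."""
    mean = [[float(p) for p in row] for row in first_frame]
    var = [[init_var for _ in row] for row in first_frame]
    return mean, var


def update_and_segment(model, frame, alpha=0.05, k=2.5, min_var=4.0):
    """Return a foreground mask and update the background model in place.

    A pixel is marked foreground when it lies more than k standard
    deviations from the background mean. Only background pixels update
    the model, so foreground objects are not absorbed immediately.
    """
    mean, var = model
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, p in enumerate(row):
            d = p - mean[y][x]
            foreground = d * d > k * k * var[y][x]
            mask_row.append(foreground)
            if not foreground:
                # Exponential running update of mean and variance
                mean[y][x] += alpha * d
                var[y][x] = max(min_var,
                                (1 - alpha) * var[y][x] + alpha * d * d)
        mask.append(mask_row)
    return mask
```

A single Gaussian per pixel cannot represent backgrounds that alternate between several appearances (swaying foliage, flickering screens), which is one motivation for the mixture and non-parametric models discussed in the book.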
ISBN (digital): 9783031018145
ISBN (print): 9783031006869
Modeling data from visual and linguistic modalities together creates opportunities for better understanding of both, and supports many useful applications. Examples of dual visual-linguistic data include images with keywords, video with narrative, and figures in documents. We consider two key task-driven themes: translating from one modality to another (e.g., inferring annotations for images) and understanding the data using all modalities, where one modality can help disambiguate information in another. The multiple modalities can either be essentially semantically redundant (e.g., keywords provided by a person looking at the image), or largely complementary (e.g., metadata such as the camera used). Redundancy and complementarity are two endpoints of a scale, and we observe that good performance on translation requires some redundancy, and that joint inference is most useful where some information is complementary. Computational methods discussed are broadly organized into ones for simple keywords, ones going beyond keywords toward natural language, and ones considering sequential aspects of natural language. Methods for keywords are further organized based on localization of semantics, going from words about the scene taken as a whole, to words that apply to specific parts of the scene, to relationships between parts. Methods going beyond keywords are organized by the linguistic roles that are learned, exploited, or generated. These include proper nouns, adjectives, spatial and comparative prepositions, and verbs. More recent developments in dealing with sequential structure include automated captioning of scenes and video, alignment of video and text, and automated answering of questions about scenes depicted in images.
ISBN (digital): 9783031446566
ISBN (print): 9783031446559; 9783031446580
This book provides a thorough overview of recent progress in video object segmentation, giving researchers and industrial practitioners detailed information on the most important problems and developed technologies in the area. Video segmentation is a fundamental topic for video understanding in computer vision. Segmenting unique objects in a given video is useful for a variety of applications, including video conferencing, video editing, surveillance, and autonomous driving. Given the revolution of deep learning in computer vision problems, numerous new tasks, datasets, and methods have been recently proposed in the domain of segmentation. The book includes these recent results and findings in large-scale video object segmentation as well as benchmarks in large-scale human-centric video analysis in complex events. The authors provide readers with a comprehensive understanding of the challenges involved in video object segmentation, as well as the most effective methods for resolving them.