ISBN: 9781450366151 (print)
As much as good representation and theory are needed to explain human actions, so are the action videos used for learning good segmentation techniques. To accurately model complex actions such as diving, figure skating, and yoga practices, videos depicting actions performed by human experts are required. A lack of experts in a domain leads to a reduced number of videos and hence to improper learning. In this work, we attempt to utilize imperfect amateur performances to obtain more confident representations of human action sequences. We introduce a novel community-detection-based unsupervised framework that provides mechanisms to interpret video data and address its limitations to produce better action representations. Human actions are composed of distinguishable key poses, which form dense communities in graph structures. Anomalous poses performed for a longer duration can also form such dense communities, but they can be identified by their rare occurrence across action videos and rejected. Further, we propose a technique to learn the temporal order of these key poses from the imperfect videos, where the inter-community links help reduce the search space of the many possible pose sequences. Our framework is seen to improve the segmentation performance of complex human actions with the help of some imperfect performances. The efficacy of our approach is illustrated on two complex action datasets, Sun Salutation and Warm-up Exercise, that were developed using random executions from amateur performers.
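The rejection of anomalous poses by their rare occurrence across videos could be sketched roughly as follows. This is only an illustration under stated assumptions: it presumes each frame has already been assigned a community id by some community detection step, and the function name `filter_rare_communities` and the threshold `min_video_fraction` are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def filter_rare_communities(video_labels, min_video_fraction=0.5):
    """Keep community ids that occur in at least min_video_fraction of videos.

    video_labels: list of per-video lists of community ids (one id per frame).
    The threshold is an assumed parameter, not a value from the abstract.
    """
    n_videos = len(video_labels)
    videos_containing = defaultdict(int)
    for labels in video_labels:
        # Count each community once per video, however long it lasts there.
        for community in set(labels):
            videos_containing[community] += 1
    return {
        community
        for community, count in videos_containing.items()
        if count / n_videos >= min_video_fraction
    }

# Hypothetical frame labels for three amateur performances.
videos = [
    ["stand", "bend", "plank", "stand"],
    ["stand", "bend", "stumble", "stumble", "plank"],
    ["stand", "bend", "plank"],
]
key_poses = filter_rare_communities(videos)
```

Here the `stumble` community is dense within one video but appears in only one of three videos, so it is dropped while the shared poses are kept.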
Recognition of human actions is one of the important tasks in various computer vision applications, including video surveillance, human-computer interaction, etc. Traditionally, RGB or depth cameras are utilized for this...
ISBN: 9783030948931 (digital)
ISBN: 9783030948924 (print)
This book constitutes thoroughly revised and selected papers from the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2020, held in Valletta, Malta, in February 2020. The 25 thoroughly revised and extended papers presented in this volume were carefully reviewed and selected from 455 submissions. The papers contribute to the understanding of relevant trends of current research on computer graphics; human computer interaction; information visualization; computer vision.
ISBN: 9783030012403; 9783030012397 (print)
3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data. We propose two anatomically inspired loss functions and use them with a weakly-supervised learning framework to jointly learn from large-scale in-the-wild 2D and indoor/synthetic 3D data. We also present a simple temporal network that exploits temporal and structural cues present in predicted pose sequences to temporally harmonize the pose estimations. We carefully analyze the proposed contributions through loss surface visualizations and sensitivity analysis to facilitate deeper understanding of their working mechanism. Jointly, the two networks capture the anatomical constraints in static and kinetic states of the human body. Our complete pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M and MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics card.
Patch-based techniques are proven to generate promising results and outperform many of the existing state-of-the-art techniques for most applications in digital image processing. In this work we develop a patch bas...
ISBN: 9781450347532 (print)
Recently, interest in Micro Aerial Vehicles (MAVs) and their autonomous flight has increased tremendously, and significant advances have been made. The monocular camera has turned out to be the most popular sensing modality for MAVs, as it is lightweight, does not consume much power, and encodes rich information about the surrounding environment. In this paper, we present DeepFly, our framework for autonomous navigation of a quadcopter equipped with a monocular camera. Navigable space detection and waypoint selection are fundamental components of an autonomous navigation system. They have a broader meaning than just detecting and avoiding immediate obstacles. Finding the navigable space places equal emphasis on avoiding obstacles and on detecting ideal regions to move to next. An ideal region can be defined by two properties: 1) all the points in the region have approximately the same high depth value, and 2) the area covered by the points of the region in the disparity map is considerably large. The waypoints selected from these navigable spaces ensure a collision-free path, which is safer than a path obtained from other waypoint selection methods that do not consider neighboring information. In our approach, we obtain a dense disparity map by performing a translation maneuver. This disparity map is the input to a deep neural network, which predicts bounding boxes for multiple navigable regions. Our deep convolutional neural network with shortcut connections regresses a variable number of outputs without any complex architectural add-on. Our autonomous navigation approach has been successfully tested in both indoor and outdoor environments and in a range of lighting conditions.
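The two properties of an ideal region can be read as a simple predicate over the depth values of a candidate region. The sketch below is an illustration only: DeepFly predicts regions with a neural network, and the function name `is_ideal_region` and the thresholds `min_depth`, `max_spread`, and `min_area` are hypothetical values chosen for the example, not taken from the abstract.

```python
def is_ideal_region(depths, min_depth, max_spread, min_area):
    """Check the two stated properties on a candidate region.

    depths: depth values of the pixels belonging to the region.
    Property 1: all points have approximately the same high depth value
                (every depth >= min_depth, and max - min <= max_spread).
    Property 2: the region covers a considerably large area
                (at least min_area pixels).
    All three thresholds are assumptions made for this sketch.
    """
    if not depths or len(depths) < min_area:
        return False
    lowest, highest = min(depths), max(depths)
    return lowest >= min_depth and (highest - lowest) <= max_spread

# Hypothetical regions: far and uniform, near and uniform, far but too small.
far_open = is_ideal_region([9.0, 9.5, 10.0, 9.8], min_depth=8.0, max_spread=1.5, min_area=4)
near_wall = is_ideal_region([2.0, 2.1, 2.0, 2.2], min_depth=8.0, max_spread=1.5, min_area=4)
too_small = is_ideal_region([9.0, 9.5], min_depth=8.0, max_spread=1.5, min_area=4)
```

Only the first region satisfies both properties; the second fails on depth and the third on area.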
ISBN: 9788086943022 (print)
Region features in colour images are of interest in applications such as mapping, climatology, change detection, medicine, etc. This research work is an attempt to automate the process of extracting feature boundaries from colour images. The aim is to eventually replace the manual digitization process with computer-assisted boundary detection and conversion to a vector layer in a spatial database. In colour images, various features can be distinguished based on their colour. The features thus extracted as object borders can be stored as vector maps in a spatial database after labelling and editing. Here, we present a complete methodology for the boundary extraction and skeletonization process from colour imagery using a colour image segmentation algorithm, a crust extraction algorithm, and our new skeleton extraction algorithm. We also present a prototype application for completely automated or semi-automated processing of (satellite) imagery and scanned maps, with an application to coastline extraction. Other applications include extraction of fields, clear cuts, and clouds, as well as heating or pollution monitoring and dense forest mapping, among others.
ISBN: 9783319028958; 9783319028941 (print)
This paper discusses how to combine a particle filter (PF) with particle swarm optimization (PSO) to achieve better object tracking. Owing to multi-swarm based mode seeking, the algorithm is capable of maintaining multimodal probability distributions, and the tracking accuracy is far better than the accuracy of PF or PSO alone. We propose a parallel resampling scheme for particle filtering running on a GPU. We show the efficiency of the parallel PF-PSO algorithm on 3D model based human motion tracking. The 3D model is rasterized in parallel, and a single thread processes one column of the image. Such a level of parallelism allows us to efficiently utilize the GPU resources and to perform tracking of the full human body at rates of 15 frames per second. The GPU achieves an average speedup of 7.5 over the CPU. For a marker-less motion capture system consisting of four calibrated cameras, the computations were conducted on four CPU cores and four GTX GPUs on two cards.
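For readers unfamiliar with the resampling step of a particle filter, a minimal sequential sketch is given below. It implements standard systematic resampling, which is a swapped-in textbook technique: the abstract does not describe the paper's parallel GPU scheme, so this is not that scheme. The function name and the `rng` parameter are illustrative choices.

```python
import random

def systematic_resample(weights, rng=random):
    """Return particle indices drawn by systematic resampling.

    weights: non-negative particle weights (need not be normalized).
    rng: any object with a random() method returning a float in [0, 1).
    """
    n = len(weights)
    step = sum(weights) / n
    # One random offset in (0, step]; the n pointers are then evenly spaced.
    start = (1.0 - rng.random()) * step
    indices = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        pointer = start + k * step
        # Advance to the particle whose cumulative weight covers the pointer.
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices

# All weight on particle 2: every resampled index points to it.
concentrated = systematic_resample([0.0, 0.0, 1.0, 0.0], random.Random(0))
# Equal weights: each particle is kept exactly once.
uniform = systematic_resample([1.0, 1.0, 1.0, 1.0], random.Random(0))
```

Because the pointers are evenly spaced, each particle is copied a number of times close to its share of the total weight, which keeps the variance of the resampling step low.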
The problem of retargeting a larger image in a small display is to maintain recognizability of the objects. The retargeting scheme proposed in this paper provides a suitable solution to this. The input image is partit...
In this paper, we demonstrate a computer vision application on mobile phones. One can take a picture at a heritage site/monument and obtain associated annotations on a mid-end mobile phone instantly. This does not req...