Urban heterogeneity is influenced by population density, cultural factors, and historical background. Three-dimensional mapping using LiDAR is essential for capturing structural and spatial changes in complex urban areas. Machine learning-based algorithms for processing point clouds play a vital role in transforming unprocessed LiDAR data into relevant information suitable for urban applications such as object recognition and 3D mapping. This underscores the importance of having a range of benchmark LiDAR datasets to enhance the development, testing, and validation of algorithms tailored to various urban contexts. However, existing Airborne LiDAR Scanned (ALS) datasets represent a limited range of global land cover diversity. To address this gap, we introduce the Thiruvananthapuram Aerial LiDAR Dataset (TALD), a benchmark dataset covering 9 square kilometers of Thiruvananthapuram, Kerala, India. This South Indian region exhibits high-density mixed urban development integrating both built and vegetative elements. TALD, derived from ALS point clouds, has an average point density of 12 points/m² and includes colored LiDAR points classified into buildings, trees, shrubs, and ground. The dataset is created through systematic pre-processing, classification using automated algorithms and manual corrections, and instance segmentation for noise removal. It includes X, Y, Z coordinates (UTM 43N), RGB values, return number, number of returns, scan angle rank, and class designation. The Land-cover Diversity Index (LDI) is 1.49 for TALD, significantly higher than for DALES (0.23) and ISPRS Vaihingen 3D (0.37), highlighting its focus on tropical urban environments with dense vegetation and complex infrastructure. TALD serves as a valuable benchmark for advancing point cloud processing and supporting urban mapping in challenging landscapes.
A wide variety of methods have been developed to predict the posture of the human body at a given point in time based on data on previous movements. More recently, prediction models based on deep learning have become a topic of active research and development. In this study, we adopt the strategy of separating spatial and temporal information based on an existing STGCN model to extract features effectively in both space and time, and we analyze the effects of signed or unsigned and directed or undirected forecasts of the positions of human joints with this approach. We propose a method using an encoder based on a modified graph adjacency matrix in a graph convolutional network model and focus especially on the signs and directions of data on the locations of the joints in space and time. We also introduce a global residual block. An experimental evaluation of our proposed method showed that we obtained better performance by applying the signed and directed features independently to the spatial and temporal adjacency matrices. The proposed model exhibited noticeable improvements in several aspects. In future research, we expect these features of the modified adjacency matrix to help learning models understand the correlation between symbols and directions for various actions and poses.
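As a rough illustration of what a signed, directed adjacency matrix over a skeleton graph can look like, the sketch below builds one for a toy four-joint skeleton. The abstract does not specify the construction, so the sign convention (+1 along the parent-to-child bone direction, -1 against it), the function name `signed_directed_adjacency`, and the toy bone list are all assumptions made for this example, not the authors' method.

```python
def signed_directed_adjacency(num_joints, bones):
    """Build a signed, directed adjacency matrix for a skeleton.

    bones: list of (parent, child) joint index pairs.
    Assumed convention (hypothetical): +1 for parent -> child,
    -1 for child -> parent, 0 where no bone connects the joints.
    """
    adjacency = [[0] * num_joints for _ in range(num_joints)]
    for parent, child in bones:
        adjacency[parent][child] = 1    # edge followed along the bone direction
        adjacency[child][parent] = -1   # same edge followed against it
    return adjacency


# Toy skeleton: joint 0 -> joint 1, which branches to joints 2 and 3.
toy_bones = [(0, 1), (1, 2), (1, 3)]
A = signed_directed_adjacency(4, toy_bones)
for row in A:
    print(row)
```

An unsigned, undirected variant would instead set both entries to 1, which is the symmetric matrix commonly used in graph convolutional networks.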
ISBN: 9781665426886 (print)
With the recent growth of urban mapping and autonomous driving efforts, there has been an explosion of raw 3D data collected from terrestrial platforms with lidar scanners and color cameras. However, due to high labeling costs, ground-truth 3D semantic segmentation annotations are limited in both quantity and geographic diversity, while also being difficult to transfer across sensors. In contrast, large image collections with ground-truth semantic segmentations are readily available for diverse sets of scenes. In this paper, we investigate how to use only those labeled 2D image collections to supervise training 3D semantic segmentation models. Our approach is to train a 3D model from pseudo-labels derived from 2D semantic image segmentations using multiview fusion. We address several novel issues with this approach, including how to select trusted pseudo-labels, how to sample 3D scenes with rare object categories, and how to decouple input features from 2D images from pseudo-labels during training. The proposed network architecture, 2D3DNet, achieves significantly better performance (+6.2-11.4 mIoU) than baselines during experiments on a new urban dataset with lidar and images captured in 20 cities across 5 continents.
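The idea of fusing per-view 2D predictions into 3D pseudo-labels, and keeping only the trusted ones, can be sketched with a simple majority vote. This is a minimal illustration, not the paper's procedure: the function name `fuse_pseudo_labels` and the thresholds `min_votes` and `min_agreement` are hypothetical, and the sketch assumes each 3D point has already been projected into the images so that its per-view 2D labels are known.

```python
from collections import Counter


def fuse_pseudo_labels(per_point_votes, min_votes=2, min_agreement=0.7):
    """Fuse 2D labels seen from several views into one label per 3D point.

    per_point_votes: for each 3D point, the list of 2D class labels it
    received from the images in which it is visible.
    Returns one label per point, or None where the vote is not trusted.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    fused = []
    for votes in per_point_votes:
        if len(votes) < min_votes:
            fused.append(None)          # too few views to trust
            continue
        label, count = Counter(votes).most_common(1)[0]
        agreement = count / len(votes)
        fused.append(label if agreement >= min_agreement else None)
    return fused


votes = [
    ["car", "car", "car"],                         # unanimous
    ["road", "sidewalk"],                          # views disagree
    ["tree"],                                      # seen in one view only
    ["building", "building", "building", "road"],  # 75% agreement
]
print(fuse_pseudo_labels(votes))
```

Points left as None would simply be excluded from the 3D training loss in a setup like this.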
ISBN: 0819450235 (print)
This paper presents some ideas that extend the functionality and the application fields of spatially selective coding within a JPEG2000 framework. First, the image quality drop between the Regions of Interest (ROI) and the background (BG) is considered. In a conventional approach, the reconstructed image quality drops steeply along the ROI boundary; however, this effect could be perceived as objectionable in some use cases. A simple quality decay management is proposed here, which makes use of concentric ROI with different scaling factors. This allows the technique to be perfectly consistent with the JPEG2000 Part 2 ROI definition and description. Another issue considered is the extension of selective ROI coding to 3D Volume of Interest (VOI) coding. This extension is currently under consideration for Part 10 of JPEG2000, JP3D. An easy and effective 2D to 3D extension for the VOI definition and description is proposed here: a VOI is defined by a set composition of ROI-generated solids, where ROI are defined along one or more volume cutting directions, and is described by the corresponding set of ROI parameters. Moreover, the quality decay management can be applied to this extension. The proposed techniques could have a significant impact on the selective coding of medical images and volumes. Image quality is an important and critical factor in that field, which also constitutes the dominant market for 3D applications. Therefore, some experiments are presented on medical images and volumes in order to evaluate the benefits of the proposed approaches in terms of diagnostic quality improvement with respect to conventional ROI coding usage.
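The concentric-ROI idea can be illustrated with a small sketch that assigns each pixel the scaling factor of the innermost ROI containing it, so quality decays in steps from the center outward instead of dropping at a single boundary. Everything here is an assumption made for illustration: the function name `roi_shift_map`, the circular ROI shape, and the example shift values are hypothetical, and the map is built in the pixel domain, whereas an actual JPEG2000 codec applies ROI scaling to wavelet coefficients.

```python
def roi_shift_map(width, height, center, rings):
    """Return a per-pixel scaling (bit-shift) map for concentric circular ROI.

    rings: list of (radius, shift) pairs; a pixel takes the shift of the
    innermost ring that contains it, and 0 in the background.
    """
    cx, cy = center
    ordered = sorted(rings)  # innermost (smallest radius) first
    shift_map = []
    for y in range(height):
        row = []
        for x in range(width):
            dist_sq = (x - cx) ** 2 + (y - cy) ** 2
            shift = 0  # background: no up-scaling
            for radius, ring_shift in ordered:
                if dist_sq <= radius * radius:
                    shift = ring_shift
                    break
            row.append(shift)
        shift_map.append(row)
    return shift_map


# Three concentric ROI with decreasing scaling factors (illustrative values).
for row in roi_shift_map(7, 7, (3, 3), [(1, 6), (2, 4), (3, 2)]):
    print(row)
```

Under the same assumptions, a stack of such maps along a cutting direction would give a simple volume-of-interest description.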