Dimensionality reduction (DR) techniques for multidimensional data serve as powerful tools for visualizing and understanding the structure of the data. Over the years, various DR methods have been developed to extract specific features of the data. However, selecting the optimal DR method and fine-tuning its parameters remain challenging, as these choices depend on the characteristics of the dataset. Consequently, data scientists often rely on their experience or undertake extensive experimentation to identify the most suitable approach. This paper proposes a semi-automatic method for selecting appropriate DR techniques through scatterplot evaluation. First, our approach applies a range of DR methods to the given multidimensional data to compute two-dimensional coordinates. Next, we generate scatterplots from the two-dimensional data and calculate scores reflecting the distribution and spatial relationships among the points. Scatterplots that provide insights achieve higher scores, enabling efficient selection of DR methods based on their visualizations. We demonstrate the effectiveness of the presented method through two case studies: the first uses an e-commerce review dataset, and the second uses a dataset derived from music feature extraction.
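The pipeline described in the abstract (apply several DR methods, build a scatterplot from each two-dimensional result, score each scatterplot, then rank the methods) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three projection methods, the function names, the synthetic data, and in particular the score, which substitutes the Clark-Evans nearest-neighbor ratio as a simple stand-in for the paper's own scatterplot scores.

```python
import numpy as np


def pca_2d(X):
    """Project onto the top two principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T


def random_projection_2d(X, seed=0):
    """Project onto two random Gaussian directions (baseline method)."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], 2)) / np.sqrt(2)
    return (X - X.mean(axis=0)) @ R


def first_two_features(X):
    """Trivial baseline: keep the first two original dimensions."""
    return X[:, :2].copy()


def clustering_score(Y):
    """Stand-in score (NOT the paper's): 1 minus the Clark-Evans ratio.

    Points are rescaled to the unit square; the mean nearest-neighbor
    distance is compared with its expectation under a uniform random
    layout, 0.5 / sqrt(n). Clumped layouts score higher than evenly
    spread ones.
    """
    span = Y.max(axis=0) - Y.min(axis=0)
    span[span == 0] = 1.0  # guard against a degenerate axis
    Z = (Y - Y.min(axis=0)) / span
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # ignore self-distances
    mean_nn = d.min(axis=1).mean()
    expected = 0.5 / np.sqrt(len(Z))
    return 1.0 - mean_nn / expected


def rank_methods(X, methods):
    """Apply every DR method, score its scatterplot, sort best first."""
    scored = [(name, clustering_score(fn(X))) for name, fn in methods.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidate set; a real study would include t-SNE, UMAP, etc.
METHODS = {
    "pca": pca_2d,
    "random_projection": random_projection_2d,
    "first_two_features": first_two_features,
}

# Synthetic 10-dimensional data with two groups (illustrative only).
rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 1.0, size=(60, 10))
group_b = rng.normal(0.0, 1.0, size=(60, 10))
group_b[:, :3] += 6.0  # separate the groups along three dimensions
X = np.vstack([group_a, group_b])

ranking = rank_methods(X, METHODS)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The ranking only orders the candidates; in a semi-automatic workflow of the kind the abstract describes, the analyst would still inspect the top-scoring scatterplots before settling on a method.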