Building an efficient Big Data Warehouse (DW) from a traditional data warehouse that was previously used to handle large datasets remains a major challenge. Several proposed solutions concentrate on converting a standard DW to a columnar model, especially for direct and traditional data sources. Although many successful algorithms apply data clustering methods, these approaches come with their own limitations. This paper provides a comprehensive review of the existing methods, both tuned and out-of-the-box, exposing their strengths and weaknesses. Further, a comparative study of the different options is conducted to assess them.
ISBN: 9781479938407 (print)
The emergence of large volumes of data, driven by the major players of the web, requires new management models and new data storage and processing architectures able to find information quickly in a large volume of data. Column-oriented NoSQL (Not Only SQL) databases provide, for big data, the model best suited to the data warehouse and to the structure of multidimensional data in OLAP cube form. However, since these systems lack OLAP cube computation operators, we propose in this paper a new aggregation operator called CN-CUBE (Columnar NoSQL CUBE), which allows data cubes to be computed from data warehouses stored in a column-oriented NoSQL database management system. We implemented the CN-CUBE operator using the Phoenix SQL interface of the HBase DBMS and conducted experiments on a public data warehouse in a distributed environment built on the Hadoop platform. The experiments show that the CN-CUBE operator achieves OLAP cube computation times well suited to NoSQL warehouses.
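To make concrete what a CUBE operator computes, here is a minimal Python sketch of a naive cube aggregation over data held column by column. It is not the CN-CUBE algorithm, whose details the abstract does not give. The function name `compute_cube`, the `ALL` placeholder, and the sample `sales` data are hypothetical and serve only as illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder for a dimension that has been aggregated away


def compute_cube(columns, dimensions, measure):
    """Naive CUBE: sum `measure` for every subset of `dimensions`.

    `columns` maps a column name to its list of values, mimicking a
    column-oriented layout (one list per column, aligned by row index).
    """
    n_rows = len(columns[measure])
    cube = {}
    # A cube over n dimensions contains 2**n group-by combinations.
    for k in range(len(dimensions) + 1):
        for group in combinations(dimensions, k):
            totals = defaultdict(int)
            for i in range(n_rows):
                # Keep the value of grouped dimensions, replace the rest by ALL.
                key = tuple(
                    columns[d][i] if d in group else ALL for d in dimensions
                )
                totals[key] += columns[measure][i]
            cube[group] = dict(totals)
    return cube


# Hypothetical sample data in column-oriented form.
sales = {
    "region": ["EU", "EU", "US", "US"],
    "product": ["A", "B", "A", "A"],
    "amount": [10, 20, 30, 40],
}

cube = compute_cube(sales, ["region", "product"], "amount")
for group, cells in cube.items():
    print(group, cells)
```

This sketch rescans the data once per group-by combination, which is the cost a dedicated operator would aim to reduce; it only reads the columns that serve as dimensions or as the measure, which is the access pattern a columnar layout favors.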