The availability of big data sets in research, industry, and society in general has opened up many possibilities for using this data. In many applications, however, it is not the data itself that is of interest; rather, we want to answer some question about it. These answers can sometimes be phrased as solutions to an optimization problem. We survey some algorithmic methods that optimize over large-scale data sets, beyond the realm of machine learning.
The volume and availability of data in Intelligent Transportation Systems (ITS) create a need for data-driven approaches. Big data algorithms are applied to further enhance the intelligence of applications in the transportation field, and their use has received increasing attention in both the academic and industrial ITS communities. Big data algorithms in ITS have a wide range of applications, including but not limited to signal recognition, object detection, traffic flow prediction, travel time planning, travel route planning, and vehicle and road safety. This survey aims to provide a bibliography, a comprehensive review of ITS applications, and a review of the most recognized big data models used in the context of ITS. It reviews 586 papers covering the period 1997-2019. The study provides deep insight into applications of big data algorithms in ITS, revealing the different areas of those applications and integrating models and applications. Its results identify research gaps and directions for future work.
This survey paper provides a comprehensive analysis of big data algorithms in recommendation systems, addressing the lack of depth and precision in existing *** proposes a two-pronged approach: a thorough analysis of current algorithms and a novel, hierarchical taxonomy for precise *** taxonomy is based on a tri-level hierarchy, starting with the methodology category and narrowing down to specific *** a framework allows for a structured and comprehensive classification of algorithms, assisting researchers in understanding the interrelationships among diverse algorithms and *** a wide range of algorithms, this taxonomy first categorizes algorithms into four main analysis types: user and item similarity based methods, hybrid and combined approaches, deep learning and algorithmic methods, and mathematical modeling methods, with further subdivisions into sub-categories and *** paper incorporates both empirical and experimental evaluations to differentiate between the *** empirical evaluation ranks the techniques based on four *** experimental assessments rank the algorithms that belong to the same category, sub-category, technique, and ***, the paper illuminates the future prospects of big data techniques in recommendation systems, underscoring potential advancements and opportunities for further research in this field.
ISBN: 9781665439022 (print)
Software testing is an important process for evaluating whether developed software applications meet the required specifications. There is an emerging need for testing frameworks for big data software projects to ensure the quality of big data applications and satisfy user requirements. In this study, we propose a software testing framework that can be used in big data projects in both e-science and e-commerce. In particular, we design the proposed framework to test big data-based recommendation applications. To show the usability of the proposed framework, we provide a reference prototype implementation and use the prototype to test a big data recommendation application. We apply the prototype implementation to test both functional and non-functional methods of the recommendation application. The results indicate that the proposed testing framework is usable and efficient for testing recommendation systems that use big data processing techniques.
The vast majority of social science research uses small (megabyte- or gigabyte-scale) datasets. These fixed-scale datasets are commonly downloaded to the researcher's computer, where the analysis is performed. The data can be shared, archived, and cited with well-established technologies, such as the Dataverse Project, to support the published results. The trend toward big data, including large-scale streaming data, is starting to transform research and has the potential to impact policymaking as well as our understanding of the social, economic, and political problems that affect human societies. However, big data research poses new challenges to the execution of the analysis, the archiving and reuse of the data, and the reproduction of the results. Downloading these datasets to a researcher's computer is impractical, so analyses take place in the cloud and require unusual expertise, collaboration, and tool development. The increased amount of information in these large datasets is an advantage, but it also poses an increased risk of revealing personally identifiable sensitive information. In this article, we discuss solutions to these new challenges so that the social sciences can realize the potential of big data.
With the increasingly serious global climate change, low-carbon development has become a common concern of all countries. As the largest developing country in the world, China actively responds to climate change and v...
With the continuous development of the Internet and information technology, big data has become an important topic in today's society. In the power industry, big data algorithms are gradually being applied. This paper...
Structural and metallurgical factors that cause the differences in the fracture resistance of steels and alloys are studied, which is necessary for predicting the destruction of media with an inhomogeneous structure. The prospects for digitalization of measuring the parameters of the structure and fracture surfaces using big data algorithms are considered.
ISBN: 9783319914763; 9783319914756 (print)
Discovering new trends and co-occurrences in massive data is a key step when analysing social media, data coming from sensors, etc. Traditional data mining techniques are often unable to handle such amounts of data. For this reason, several approaches have arisen in the last decade to develop parallel and distributed versions of previously known techniques. Frequent itemset mining is no exception: the literature contains several proposals that use not only parallel approximations but also Spark and Hadoop implementations following the MapReduce philosophy of big data. When processing fuzzy data sets or extracting fuzzy associations from crisp data, such big data solutions become crucial, because the execution time and memory consumption of the available algorithms increase when the items are not Boolean. In this paper, we first review existing parallel and distributed algorithms for frequent itemset and association rule mining in the crisp and fuzzy cases, and then develop a preliminary proposal for mining not only frequent fuzzy itemsets but also fuzzy association rules. We also study the performance of the proposed algorithm on several datasets that have been conveniently fuzzified, obtaining promising results.
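The MapReduce idea behind frequent fuzzy itemset mining can be illustrated with a minimal single-machine sketch. It is not the algorithm proposed in the paper. It assumes that each transaction maps items to membership degrees in [0, 1], that the degree of an itemset in a transaction is the minimum of its items' degrees, and that fuzzy support is the average of those degrees over all transactions. The function names and the `max_size` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_partition(transactions, max_size=2):
    """Map step: partial fuzzy support sums for one data partition.

    Each transaction is a dict {item: membership degree in [0, 1]}.
    Assumption: an itemset's degree is the minimum of its items' degrees.
    """
    partial = defaultdict(float)
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                partial[itemset] += min(transaction[i] for i in itemset)
    return partial


def reduce_partials(partials):
    """Reduce step: add up the partial sums emitted by every partition."""
    total = defaultdict(float)
    for partial in partials:
        for itemset, degree_sum in partial.items():
            total[itemset] += degree_sum
    return total


def frequent_fuzzy_itemsets(partitions, min_support, max_size=2):
    """Return {itemset: fuzzy support} for itemsets meeting min_support."""
    n = sum(len(p) for p in partitions)
    totals = reduce_partials(map_partition(p, max_size) for p in partitions)
    return {s: v / n for s, v in totals.items() if v / n >= min_support}


if __name__ == "__main__":
    partitions = [
        [{"a": 1.0, "b": 0.5}],
        [{"a": 0.5, "b": 0.5}, {"a": 1.0}],
    ]
    print(frequent_fuzzy_itemsets(partitions, min_support=0.5))
```

Because the partial sums are additive, each partition can be processed independently and only the per-itemset totals need to be exchanged. This sketch enumerates all candidate itemsets up to `max_size` without Apriori-style pruning, so it only suits small examples.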
ISBN: 9781467379106 (print)
We provide a parallel implementation of the Gaussian Mixture Model (GMM) Expectation Maximization (EM) algorithm using the Apache Hama Bulk Synchronous Parallel approach. Apache Hama is suitable for iterative, compute-intensive tasks. EM is an iterative algorithm that converges to a local optimum after many iterations. We provide an approach for distributing the workload of the Expectation and Maximization tasks across cluster nodes in the case of big data. The approach is compared with Hadoop MapReduce and Apache Spark implementations, using different datasets.
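To illustrate why EM for a GMM distributes well, here is a minimal single-machine sketch for one-dimensional data. It does not use the Apache Hama API and is not the authors' implementation. Each partition plays the role of a worker that computes local sufficient statistics in the E-step, and the M-step combines them, much like one superstep followed by a synchronization. All function names, the `min_var` floor, and the fixed iteration count are assumptions.

```python
import math


def e_step_partition(xs, weights, means, variances):
    """Local E-step: responsibility-weighted sufficient statistics.

    Returns (n_k, s1, s2): per-component sums of r, r*x and r*x*x.
    Note: no log-sum-exp trick is used, so a point that is extremely far
    from every component could underflow to a zero total density.
    """
    k = len(weights)
    n_k, s1, s2 = [0.0] * k, [0.0] * k, [0.0] * k
    for x in xs:
        dens = [
            w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        total = sum(dens)
        for j in range(k):
            r = dens[j] / total  # responsibility of component j for x
            n_k[j] += r
            s1[j] += r * x
            s2[j] += r * x * x
    return n_k, s1, s2


def m_step(stats, min_var=1e-6):
    """Global M-step: combine the partitions' statistics into parameters."""
    k = len(stats[0][0])
    n_k = [sum(s[0][j] for s in stats) for j in range(k)]
    s1 = [sum(s[1][j] for s in stats) for j in range(k)]
    s2 = [sum(s[2][j] for s in stats) for j in range(k)]
    n = sum(n_k)
    weights = [n_k[j] / n for j in range(k)]
    means = [s1[j] / n_k[j] for j in range(k)]
    # Floor the variance so a collapsed component cannot divide by zero.
    variances = [max(s2[j] / n_k[j] - means[j] ** 2, min_var) for j in range(k)]
    return weights, means, variances


def fit_gmm(partitions, weights, means, variances, iterations=20):
    """Alternate local E-steps and a global M-step for a fixed iteration count."""
    for _ in range(iterations):
        stats = [e_step_partition(p, weights, means, variances) for p in partitions]
        weights, means, variances = m_step(stats)
    return weights, means, variances


if __name__ == "__main__":
    parts = [[0.0, 0.1, -0.1, 10.0], [9.9, 10.1, 0.05, 10.05]]
    print(fit_gmm(parts, [0.5, 0.5], [1.0, 8.0], [1.0, 1.0]))
```

Only three numbers per component leave each partition in every iteration, so the communication cost is independent of partition size. A real implementation would add a convergence test on the log-likelihood instead of a fixed iteration count.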