The big data generated by tunnel boring machines (TBMs) are widely used with machine learning (ML) algorithms to reveal complex rock-machine interactions. Data preprocessing plays a crucial role in improving ML accuracy. To this end, this study proposes a TBM big data preprocessing method for ML that emphasizes the accurate division of the TBM tunneling cycle and the optimization of feature extraction. Its effectiveness was demonstrated by applying it to TBM performance prediction, using data collected from a TBM water conveyance tunnel in China. First, the Score-Kneedle (S-K) method was proposed to divide a TBM tunneling cycle into five phases. Tested on 500 TBM tunneling cycles, the S-K method accurately divided all five phases in 458 cycles (accuracy of 91.6%), outperforming the conventional duration division method (accuracy of 74.2%). Additionally, the S-K method accurately divided the stable phase in 493 cycles (accuracy of 98.6%), outperforming two state-of-the-art division methods, namely the histogram discriminant method (accuracy of 94.6%) and the cumulative sum change point detection method (accuracy of 92.8%). Second, features were extracted from the divided phases. Specifically, TBM tunneling resistances were extracted from the free rotating phase and the free advancing phase. The resistances were subtracted from the total forces to represent the true rock-fragmentation forces. The secant slope and the mean value were extracted as features of the increasing phase and the stable phase, respectively. Finally, an ML model integrating a deep neural network and a genetic algorithm (GA-DNN) was established to learn from the preprocessed data. The GA-DNN used 6 secant slope features extracted from the increasing phase to predict the mean field penetration index (FPI) and torque penetration index (TPI) in the stable phase, guiding TBM drivers to make better decisions in advance. The results indicate that the proposed TBM big data preproce...
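The feature-extraction step described in this abstract (subtracting resistance from the total force, a secant slope for the increasing phase, a mean for the stable phase) can be illustrated with a minimal Python sketch. It assumes the phase boundaries are already known and that samples are evenly spaced; the function names, the `dt` parameter, and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def subtract_resistance(total_force, resistance):
    """Remove a constant tunneling resistance from each total-force sample."""
    return [f - resistance for f in total_force]


def secant_slope(values, dt=1.0):
    """Slope of the line joining the first and last samples of a phase.

    dt is the (assumed constant) sampling interval.
    """
    if len(values) < 2:
        raise ValueError("need at least two samples to compute a secant slope")
    return (values[-1] - values[0]) / ((len(values) - 1) * dt)


# Hypothetical samples for one tunneling cycle.
increasing_phase = [0.0, 2.0, 4.0, 6.0]
stable_phase_total = [10.0, 12.0]
resistance = 3.0  # assumed to be estimated from the free advancing phase

slope_feature = secant_slope(increasing_phase)
stable_feature = mean(subtract_resistance(stable_phase_total, resistance))
```

Here `slope_feature` stands in for one of the secant slope inputs and `stable_feature` for a stable-phase mean; how the paper actually computes FPI and TPI from such quantities is not stated in the abstract.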
ISBN: 9781728112824 (print)
Customer satisfaction is an essential concern for industry in the 21st century, sometimes known as the information age. However, understanding customer expectations remains a problem for today's businesses. The internet has enabled people to share their thoughts through Social Media (SM) platforms, forums, news comments, and blogs. Consequently, those platforms generate immense and exponentially growing amounts of data. Extracting opinions from this big data makes it possible to rate organizations, learn consumer needs, and adjust business strategies. This paper presents a concept for building a rating system, using Big Data Analytics (BDA) techniques, that applies existing Sentiment Analysis (SA) algorithms to gain insight into reviews gathered from SM applications. The system will list the various categories of services and evaluate them based on the customers' reactions. This study also aims to manage a large volume of information in order to rank institutions, provide a practical solution for competitive and marketing analysis, and track the improvement of customer satisfaction within both the public and private sectors, so as to boost excellent service delivery in Rwanda.
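The rating idea in this abstract, scoring reviews with an existing sentiment analysis algorithm and then ranking service categories, might be sketched as follows. This is only an illustration under assumptions: each review is taken to be already scored in the range -1 to 1 by some SA algorithm, and the category names and the `rank_categories` function are hypothetical.

```python
from collections import defaultdict


def rank_categories(scored_reviews):
    """Rank service categories by their average sentiment score.

    scored_reviews is an iterable of (category, score) pairs, where the
    score is assumed to come from an existing sentiment analysis step.
    """
    totals = defaultdict(lambda: [0.0, 0])  # category -> [score sum, count]
    for category, score in scored_reviews:
        totals[category][0] += score
        totals[category][1] += 1
    averages = {c: s / n for c, (s, n) in totals.items()}
    # Highest average sentiment first.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, already-scored reviews.
reviews = [("banking", 1.0), ("banking", 0.0), ("telecom", -1.0)]
ranking = rank_categories(reviews)
```

A plain average is the simplest aggregation; the abstract does not say how the proposed system weights or combines review scores.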
ISBN: 9781479966646 (print)
Internet traffic destined to routable yet unallocated IP addresses is commonly referred to as telescope or darknet data. Such unsolicited traffic is frequently, abundantly, and effectively exploited to generate various kinds of cyber threat intelligence related to, but not limited to, scanning activities, distributed denial of service attacks, and malware identification. However, such data typically contains a significant amount of misconfiguration traffic caused by network/routing or hardware/software faults. This misconfiguration traffic not only severely affects the purity of darknet data, which hinders the accuracy of inference algorithms that operate on such data, but also wastes valuable storage resources. This paper proposes a probabilistic model to preprocess darknet data in order to prepare it for effective use. The aim is to fingerprint darknet misconfiguration traffic and subsequently filter it out. The model is advantageous as it does not rely on arbitrary cut-off thresholds, provides separate likelihood models to distinguish between misconfiguration and other darknet traffic, and is independent of the nature of the source of the traffic. To the best of our knowledge, the proposed model is the first attempt to formally tackle the problem of preprocessing darknet traffic. Through empirical evaluations using real darknet traffic and by comparing the proposed model against the baseline and a heuristic approach, we demonstrate the accuracy and effectiveness of the model.
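The general idea of using separate likelihood models and comparing them, rather than applying a fixed cut-off, can be shown with a generic sketch. This is not the paper's model, whose details are not given in the abstract: the choice of destination ports as observations, the probability tables, and the `floor` value for unseen observations are all assumptions made for illustration.

```python
import math


def log_likelihood(observations, model, floor=1e-6):
    """Sum of log-probabilities of the observations under a categorical model.

    floor is an assumed small probability for values the model has not seen.
    """
    return sum(math.log(model.get(o, floor)) for o in observations)


def is_misconfiguration(observations, misconfig_model, other_model):
    """Label a source by whichever likelihood model explains it better."""
    return (log_likelihood(observations, misconfig_model)
            > log_likelihood(observations, other_model))


# Hypothetical probability tables over destination ports.
misconfig_model = {53: 0.9, 80: 0.1}
other_model = {53: 0.1, 80: 0.9}

flag_a = is_misconfiguration([53, 53, 80], misconfig_model, other_model)
flag_b = is_misconfiguration([80, 80], misconfig_model, other_model)
```

Sources flagged as misconfiguration would then be filtered out before the remaining darknet traffic is passed to inference algorithms.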
ISBN: 9783030311292; 9783030311285 (print)
The rapid increase in Internet use has led to huge amounts of data. Traditional data technologies, techniques, and even applications cannot cope with the volume, structure, and types of the new data. Big data concepts have emerged to absorb this non-stop flood. The big data analysis process is used to extract the useful data and exclude the rest, which provides better results with minimum resource utilization, time, and cost. Feature selection is a traditional technique for reducing data dimensionality, and big data analytics provides modern technologies and frameworks with which feature selection can be integrated, both to improve the performance of feature selection itself and to help in the preprocessing of big data. The main objective of this paper is to survey the most recent research challenges in big data analysis and preprocessing. The analysis is carried out by acquiring data from sources, storing it, filtering it to keep the useful data and discard the unwanted data, and then extracting information. Before data is analyzed, it must be prepared by removing noise, fixing incomplete data, and putting it in a suitable form. This is done in the preprocessing step through various operations such as data reduction, cleaning, normalization, preparation, integration, and transformation.
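Two of the preprocessing operations listed in this abstract, cleaning incomplete data and normalization, can be illustrated with a small sketch. Mean imputation and min-max scaling are chosen here only as common examples; the abstract does not prescribe specific techniques, and the function names are hypothetical.

```python
def fill_missing(values):
    """Replace missing entries (None) with the mean of the present values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    column_mean = sum(present) / len(present)
    return [column_mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical column with one missing reading.
raw = [1.0, None, 3.0]
cleaned = fill_missing(raw)
normalized = min_max_normalize(cleaned)
```

In a distributed setting the same two passes would typically run per column inside a big data framework, but the arithmetic stays the same.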
ISBN: 9781728114101 (print)
To explore the hidden value of big data in depth and to improve the processes and decision-making of complex product manufacturing enterprises, the construction of a big data analysis service platform for complex product manufacturing was studied. Based on a requirements analysis of complex products, we set up a service-oriented manufacturing big data access platform architecture, introduced the key technologies of the platform, and elaborated on the platform's functions. The platform implements big data access, preprocessing, storage, and analysis, and in particular provides distributed computing engines, algorithmic artifacts, visual artifacts, and visual analysis tools for complex product manufacturing. It supports the rapid construction of complex product big data analysis applications in the form of service assembly.