ISBN (digital): 9781728161754
ISBN (print): 9781728161761
Fog computing, which works complementarily to cloud computing, is being developed to overcome issues of cloud computing such as latency in cases where data must be retrieved immediately. But along with solving the problem of latency, fog computing brings a different set of security issues than cloud computing. The storage and processing capabilities of fog computing are limited, so security issues must be solved within these resource constraints. One problem faced when data is stored outside the internal network is loss of confidentiality; for this, the data must be encrypted. But whenever a document needs to be searched, all related documents must first be decrypted and only then can the required document be fetched, and within this time frame the document data can be accessed by an unauthorized person. So, in this paper, a searchable symmetric encryption scheme is proposed wherein authorized members of an organization can search over the encrypted data and retrieve the required document, preserving the security and privacy of the data. Also, the searching complexity of the algorithm is low enough to make it suitable for a fog computing environment.
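The core idea of searchable symmetric encryption can be sketched as follows. This is a minimal illustrative toy, not the paper's scheme: each keyword is mapped to a deterministic HMAC "trapdoor" under a shared key, so the index can be searched without exposing plaintext keywords or decrypting unrelated documents. All names and the key-handling are assumptions for illustration.

```python
import hashlib
import hmac
from secrets import token_bytes

KEY = token_bytes(32)  # shared secret held by authorized members (illustrative)

def trapdoor(keyword: str) -> bytes:
    """Deterministic search token for a keyword under the shared key."""
    return hmac.new(KEY, keyword.encode(), hashlib.sha256).digest()

def build_index(docs: dict) -> dict:
    """Map each keyword trapdoor to the ids of documents containing it."""
    index = {}
    for doc_id, keywords in docs.items():
        for kw in keywords:
            index.setdefault(trapdoor(kw), set()).add(doc_id)
    return index

def search(index: dict, keyword: str) -> set:
    """Constant-time lookup: only matching documents are ever touched."""
    return index.get(trapdoor(keyword), set())

docs = {"d1": ["fog", "latency"], "d2": ["cloud", "latency"]}
index = build_index(docs)
print(search(index, "latency"))  # both documents match
```

Document payloads would be encrypted separately; the server storing the index learns only which (opaque) tokens repeat, not the keywords themselves.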
ISBN (digital): 9781728181967
ISBN (print): 9781728181974
PT XYZ is a company engaged in the retail minimarket business in Indonesia. In running its business, one of the key activities is opening new minimarket stores. The audit team analyzes proposals to predict the sales of would-be new stores; however, the results of these predictions often do not match reality, so research is needed to predict sales more accurately. This study analyzes the prediction of minimarket store sales using a deep learning technique to evaluate new minimarket stores. The model predicts 53.18% of stores achieving the sales target and 28.32% of stores with the potential to achieve the target in the future, an improvement in on-target prediction over the branch office method, which predicts only 31.2% of stores achieving the target and 31.62% of potential stores. Thus, the approval decision for new minimarket stores predicted to achieve their target can be more accurate than with the branch office method. The audit team will use the model to predict store sales and consider the result when approving proposals. The factors with a significant influence on sales were rack size, store age, distance to competitors, domain location, and store type.
ISBN (digital): 9781728141084
ISBN (print): 9781728141091
We are living in unprecedented times, and anyone in this world could be impacted by natural disasters in some way or another. Life is unpredictable and what is to come is unforeseeable; nobody knows what the very next moment will hold, and it could be a disastrous one. The past cannot be changed, but it can act constructively toward the betterment of the current situation: 'precaution is better than cure'. To address this uncertain dilemma of life-and-death situations, 'Automated Identification of Disaster News for Crisis Management using Machine Learning and Natural Language Processing' is proposed: a software solution that helps disaster management websites dynamically show disaster-relevant news, which can then be shared to other social media handles through their sites. The objective is to automatically scrape news from English news websites and identify disaster-relevant news using natural language processing techniques and machine learning concepts, which can further be dynamically displayed on crisis management websites. The complete model is automated and requires no manual labor at all. The architecture is based on machine learning principles: news scraped from top news websites using a spider-scraper is classified into two categories, disaster-relevant and disaster-irrelevant, and the relevant disaster news is eventually displayed on the crisis management website.
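The classification stage described above can be sketched with a standard text pipeline. This is an illustrative minimal example, not the paper's model: the headlines, labels, and choice of TF-IDF plus logistic regression are assumptions standing in for whatever classifier and training corpus the authors used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set: 1 = disaster-relevant, 0 = irrelevant.
train_texts = [
    "Flood submerges hundreds of homes after dam breach",
    "Earthquake of magnitude 6.1 strikes coastal region",
    "Wildfire forces evacuation of three towns",
    "Cyclone warning issued for eastern seaboard",
    "Stock markets rally as tech shares climb",
    "Local team wins championship final",
    "New smartphone model launched this week",
    "Parliament debates annual budget proposal",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Vectorize scraped headlines with TF-IDF, then classify.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(["Earthquake strikes northern city overnight"]))
```

In the described system, the positive predictions would feed directly into the crisis management website's news feed.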
ISBN (digital): 9781728162515
ISBN (print): 9781728162522
Streaming Analytics (SA) and Complex Event Recognition (CER) are of paramount importance in the search for an ultimate Big Data solution that can simultaneously address data Velocity, Variety, and Volume. Indeed, the growing popularity of streaming data has pushed the boundaries of existing data systems, fostering the rise of Stream Processing Engines (SPEs). However, data Velocity never appears in isolation: streams are huge, heterogeneous, and noisy, as they come from multiple sources. Horizontally scalable SPEs like Flink and KSQL-DB allow continuous stream analytics using SQL-like languages. On the other hand, CER engines like OracleCEP and DroolFusion use regular languages for (parallel) pattern detection over heterogeneous streams. This paper takes a first step towards a unifying solution. To this end, we present KELPr, an in-memory distributed CER engine designed by extending the Dual Streaming Model and implemented on top of Kafka Streams.
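The kind of regular-language pattern a CER engine compiles from a rule can be illustrated with a small sliding-window detector. This is a toy sketch, not KELPr or Kafka Streams code: the rule "three consecutive readings above a threshold" and all values are assumptions for illustration.

```python
from collections import deque

def detect_spikes(stream, threshold=30.0, run_length=3):
    """Yield the index at which each run of `run_length` consecutive
    readings above `threshold` completes (a simple CER-style pattern)."""
    window = deque(maxlen=run_length)
    for i, value in enumerate(stream):
        window.append(value)
        if len(window) == run_length and all(v > threshold for v in window):
            yield i

readings = [25.1, 31.0, 32.5, 33.2, 28.9, 31.5, 34.0, 35.1]
print(list(detect_spikes(readings)))  # → [3, 7]
```

A real CER engine generalizes this idea to parallel pattern matching over many heterogeneous, partitioned streams rather than a single in-memory list.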
ISBN (digital): 9781728185262
ISBN (print): 9781728185279
Agent-based modeling (ABM) has been proposed to simulate real-world situations where autonomous agents make their own decisions based on simple rules and data from the environment. The strawberry market in California is a challenging example, as prices can vary suddenly due to changes in supply and the fruit cannot be stored for long durations. The microeconomic theory expected to model this market is implemented within the simulation model to predict the strawberry price based on the difference between total supply and total demand. In this study, the observed strawberry yield of the two main suppliers in California is taken as total supply, and for predicting demand, different demand functions are presented. To estimate the ABM parameters, two optimization methods are applied with Python-Netlogo. Finally, computational results are presented to show the performance of the prediction model, with directions for future research to improve the results.
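The pricing rule described above, price moving with the difference between total demand and total supply, can be sketched in a few lines. The demand function and adjustment coefficient below are illustrative assumptions, not the paper's calibrated values.

```python
def linear_demand(price, intercept=1000.0, slope=50.0):
    """Illustrative demand curve: quantity falls linearly as price rises."""
    return max(intercept - slope * price, 0.0)

def update_price(price, total_supply, alpha=0.001):
    """Raise the price when demand exceeds supply, lower it otherwise."""
    excess = linear_demand(price) - total_supply
    return max(price + alpha * excess, 0.0)

price = 5.0
for _ in range(200):  # iterate toward the market-clearing price
    price = update_price(price, total_supply=600.0)
print(round(price, 2))  # converges to the clearing price 8.0 (1000 - 50*8 = 600)
```

In the full ABM, supply would come from the observed yields of the two suppliers and each agent would apply such a rule locally rather than in a single global loop.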
ISBN (digital): 9781728162515
ISBN (print): 9781728162522
Changes in the data distribution of streaming data (i.e., concept drifts) constitute a central issue in online data mining, mainly because these changes outdate stream learning models, reducing their predictive performance over time. A common approach adopted by real-time adaptive systems to deal with concept drifts is to employ detectors that indicate the best time for updates. However, most detectors make the unrealistic assumption that labels become available immediately after the data arrives. In this paper, we introduce an unsupervised and model-independent concept drift detector suitable for high-speed, high-dimensional data streams in realistic scenarios with label scarcity. We propose a straightforward two-dimensional representation of the data aimed at faster detection. On this visual representation we develop a simple adaptive drift detector that is efficient for fast streams with thousands of features and as accurate as existing costly methods that perform various statistical tests. Our method achieves better performance, measured by execution time and classification accuracy, for different types of drift, including abrupt, oscillating, and incremental. Experimental evaluation demonstrates the versatility of the method in several domains, including astronomy, entomology, public health, political science, and medical science.
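The general shape of an unsupervised detector on a two-dimensional representation can be sketched as follows. This is an illustrative toy inspired by the idea above, not the paper's method: the particular 2-D mapping (per-point feature mean and standard deviation), the window size, and the threshold are all assumptions.

```python
import numpy as np

def to_2d(x):
    """Illustrative 2-D representation: per-point feature mean and std."""
    return np.array([x.mean(), x.std()])

def detect_drift(stream, window=100, threshold=0.2):
    """Return starting indices of windows whose 2-D centroid drifted
    away from the current reference centroid (no labels needed)."""
    ref = None
    alarms = []
    for start in range(0, len(stream) - window + 1, window):
        pts = np.array([to_2d(x) for x in stream[start:start + window]])
        centroid = pts.mean(axis=0)
        if ref is None:
            ref = centroid
        elif np.linalg.norm(centroid - ref) > threshold:
            alarms.append(start)
            ref = centroid  # adapt: the new concept becomes the reference
    return alarms

rng = np.random.default_rng(0)
before = rng.normal(0.0, 1.0, size=(300, 1000))  # stable concept
after = rng.normal(0.5, 1.0, size=(300, 1000))   # abrupt mean shift
print(detect_drift(np.vstack([before, after])))  # → [300]
```

The appeal of such a scheme is that detection cost depends on the cheap 2-D summary, not on the thousands of original features, which is the efficiency argument the abstract makes.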
ISBN (digital): 9781728160245
ISBN (print): 9781728160252
With the rapid development of the Internet, intelligent QA (Question Answering) systems have been widely used in telecom, financial services, e-commerce, and other industries, but there is little research on, or application of, intelligent QA systems in the field of Chinese classical poetry. In view of this situation, this paper aims to implement an automatic QA system based on a knowledge graph of Chinese classical poetry, combined with natural language processing technology. For the construction of the knowledge graph, common triples of Chinese classical poetry knowledge were extracted from classical poetry websites, and a knowledge graph of Chinese classical poetry stored in Neo4j was constructed. For question recognition and multi-round dialogue, the Rasa framework was adopted to extract entities and identify the intent of users' questions about Chinese classical poetry, so as to realize multi-round dialogue.
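The pipeline from triples to answers can be sketched with an in-memory stand-in. This toy replaces Neo4j with a Python list of (subject, predicate, object) triples and replaces Rasa's NLU with pre-recognized intents and entities; the poems and relation names are illustrative assumptions.

```python
# Toy triple store: (subject, predicate, object), as a knowledge graph would hold.
TRIPLES = [
    ("静夜思", "author", "李白"),
    ("静夜思", "dynasty", "唐"),
    ("春晓", "author", "孟浩然"),
    ("春晓", "dynasty", "唐"),
]

def query(subject, predicate):
    """Return all objects matching a (subject, predicate, ?) pattern."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def answer(intent, entity):
    """Dispatch a recognized intent + entity to the matching graph pattern,
    standing in for Rasa's intent classification and entity extraction."""
    if intent == "ask_author":
        return query(entity, "author")
    if intent == "ask_dynasty":
        return query(entity, "dynasty")
    return []

print(answer("ask_author", "静夜思"))
```

In the actual system, `query` would be a Cypher pattern run against Neo4j, and Rasa would supply `intent` and `entity` from the user's free-text question while tracking multi-round dialogue state.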
ISBN (digital): 9781728169323
ISBN (print): 9781728169330
We propose to assign F-transform kernels to the CNN weights and compare them with commonly used initializations. In this way, we develop a new initialization mechanism where F-transform convolution kernels are used in the convolutional layers. Based on a series of experiments, we demonstrate the suitability of the F-transform-based deep neural network in the domain of image processing, with a focus on classification. Moreover, we support our insight by revealing the similarity between the F-transform and the first-layer kernels of certain deep neural networks.
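A kernel of the kind involved can be sketched from the classic triangular (hat) basis functions used by the F-transform. This is an illustrative construction, not the paper's exact initialization: the kernel size, the separable outer-product form, and the sum-to-one normalization are assumptions.

```python
import numpy as np

def triangular_kernel(size):
    """Separable 2-D triangular (hat) basis function, normalized to sum 1,
    of the kind that could seed a CNN's first convolutional layer."""
    half = (size - 1) / 2
    # 1-D hat function peaking at the center, positive everywhere in the window.
    hat = 1.0 - np.abs(np.arange(size) - half) / (half + 1)
    kernel = np.outer(hat, hat)
    return kernel / kernel.sum()

k = triangular_kernel(5)
print(k.shape, round(float(k.sum()), 6))
```

Assigning such kernels (and shifted/scaled variants) to a framework's convolution weights would replace random initial filters with smooth, interpretable averaging filters, which is the mechanism the abstract describes.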
ISBN (digital): 9781728162515
ISBN (print): 9781728162522
Analysts prefer simpler interpreted languages to program their computations; prominent languages include R, Python, and Matlab. On the other hand, analysts aim to compute mathematical models as fast as possible, especially with large data sets. Data summarization remains a fundamental technique for accelerating machine learning computations. Based on this motivation, we propose a novel summarization mechanism computed via a single matrix multiplication in the statistical R language. We show our summarization benefits a large family of linear models, including Linear Regression, PCA, and Naive Bayes. We present a subsystem that exploits summarization by detecting Gramian matrix products in R, optimizing existing R source code by overriding R's internal matrix multiplication algorithm with ours. Our solution can be plugged into R to solve problems where such a matrix multiplication appears, much faster and without RAM limitations. Moreover, our solution can benefit from parallel processing of the summarization matrix. We present an experimental validation showing that our subsystem incurs little overhead, since it works at the source-code level, while providing much faster speeds than R's built-in functions. To round out our comparisons, we also compare our subsystem with Spark on parallel machines, assuming the data can be in HDFS, on disk, or already partitioned. Our solution outperforms Spark in most cases, proving we can also compete in the big data space.
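The Gramian summarization idea can be made concrete with a small worked example, shown here in numpy as a stand-in for the paper's R subsystem. With the augmented matrix Z = [1 | X | y], the single product Γ = ZᵀZ contains n, the column sums, XᵀX, Xᵀy, and yᵀy, which is enough to solve linear regression without revisiting the data. The dimensions and coefficients below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 3
X = rng.normal(size=(n, d))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + 3.0          # intercept 3, no noise, for an exact check

# One matrix multiplication produces the full (d+2) x (d+2) summary.
Z = np.hstack([np.ones((n, 1)), X, y[:, None]])
Gamma = Z.T @ Z

# Recover OLS coefficients from the summary alone (normal equations):
A = Gamma[:d + 1, :d + 1]        # [1|X]^T [1|X]
b = Gamma[:d + 1, d + 1]         # [1|X]^T y
beta_hat = np.linalg.solve(A, b)
print(np.round(beta_hat, 6))     # intercept first, then the slopes
```

The summary is tiny (here 5×5) regardless of n, which is why the technique sidesteps RAM limits and parallelizes naturally: partial Gramians from data partitions simply add up.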
ISBN (digital): 9781728126807
ISBN (print): 9781728126814
The Semantic Web allows machines to understand the meaning of data and make better use of it. The Resource Description Framework (RDF) is the lingua franca of the Semantic Web. While Big Data handles the problem of storing and processing massive data, it still does not provide support for RDF data. In this paper, we present a new Big Data semantic web system comprising a classical Big Data system with a semantic layer. As a proof of concept of our approach, we use mobile learning as a case study. The architecture we propose is composed of two main parts: a knowledge server and an adaptation model. The knowledge server allows trainers and business experts to represent their expertise using business rules and an ontology to handle heterogeneous knowledge. Then, in a mobility environment, the knowledge server makes it possible to take into account the constraints of the environment and of the user, thanks to the RDF exchange format. The adaptation model, based on RDF graphs, corresponds to combinatorial optimization algorithms whose objective is to propose to the learner a relevant combination of Learning Objects based on contextual constraints. Our solution guarantees scalability and high data availability through replication. The results obtained in system evaluation experiments on a large number of servers show the efficiency, scalability, and robustness of our system when the amount of data processed is very large.