ISBN: 9781728103235 (print)
Compute efficiency is an important consideration for traffic flow prediction models. Machine learning algorithms adjust model parameters automatically based on the data, but often require users to set additional parameters, known as hyperparameters. Hyperparameters can significantly impact prediction accuracy. Traffic measurements, typically collected online by sensors, are serially correlated. Moreover, the data distribution may change gradually. A typical adaptation strategy is periodically re-tuning the model hyperparameters, at the cost of computational burden. In this work, we present an efficient and principled online hyperparameter learning algorithm for kernel-based traffic prediction models. In tests with real traffic measurement data, our approach requires as little as one-seventh of the computation time of other tuning methods, while achieving better or similar prediction accuracy.
This paper describes a software tool, DBMine, developed to assist industrial engineers in data mining. The tool implements three common data mining methodologies: Bacon's algorithm, Decision Trees and DB-Learn. Implemented in Microsoft Visual Basic 3.0(C), DBMine can use data in Microsoft Access 2.0(C) and Watcom SQL(C) databases. The paper also presents an example session in which job shop sequences produced by a Genetic Algorithm are explored for regularity. (C) 1997 Elsevier Science Ltd.
ISBN: 9783030623647; 9783030623654 (print)
The problem of community detection in a network with features at its nodes takes into account both the graph structure and node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. We apply the so-called data recovery approach to the problem by combining the least-squares recovery criteria for both the graph structure and node features. In this way, we obtain a new clustering criterion and a corresponding algorithm for finding clusters/communities one-by-one. We show that our proposed method is effective on real-world data, as well as on synthetic data involving either only quantitative features or only categorical attributes or both. Our algorithm appears competitive against state-of-the-art algorithms.
ISBN: 9783030623647; 9783030623654 (print)
Analysing credit data using a neural network has hitherto proved to be very resilient to attempts to improve success rates in prediction. We present a technique using simulated data which results in a marginal improvement in success rate. The empirical probability distribution for each feature of the training data is determined, and random samples are drawn from those distributions. The result is termed 'artificial' data. It is then possible to generate equal volumes of data for each of the binary outcomes (default or not), thereby alleviating the class imbalance problem. The simulation method uses a copula (to preserve the correlation structure of the original data) and optimal feature weighting to give acceptable results. The results indicate that overall percentage success rates are improved only for the more common outcome, but there is a more significant improvement in the AUC metric. The significance of this result in the context of assessing creditworthiness is discussed.
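The simulation idea described above can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not say which copula family is used, so a Gaussian copula is assumed here, the optimal feature weighting step is omitted, and the function names `sample_artificial` and `balance_classes` are hypothetical.

```python
from statistics import NormalDist

import numpy as np

# Vectorised standard normal CDF and inverse CDF (standard library only).
_STD_NORMAL = NormalDist()
_norm_cdf = np.vectorize(_STD_NORMAL.cdf)
_norm_ppf = np.vectorize(_STD_NORMAL.inv_cdf)


def sample_artificial(data, n_samples, rng):
    """Draw artificial rows from empirical marginals joined by a Gaussian copula.

    Assumption: a Gaussian copula; `data` has at least two rows and two columns.
    """
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    # Rank-transform each column into (0, 1), then into normal scores.
    ranks = data.argsort(axis=0).argsort(axis=0) + 1
    z = _norm_ppf(ranks / (n + 1))
    # The correlation of the normal scores defines the copula.
    corr = np.corrcoef(z, rowvar=False)
    # Sample correlated normals, map them to uniforms, then map the uniforms
    # through the empirical quantile function of each original column.
    z_new = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u_new = _norm_cdf(z_new)
    columns = [np.quantile(data[:, j], u_new[:, j]) for j in range(d)]
    return np.column_stack(columns)


def balance_classes(X, y, n_per_class, seed=0):
    """Generate the same number of artificial rows for each binary outcome."""
    rng = np.random.default_rng(seed)
    parts_X, parts_y = [], []
    for label in (0, 1):
        parts_X.append(sample_artificial(X[y == label], n_per_class, rng))
        parts_y.append(np.full(n_per_class, label))
    return np.vstack(parts_X), np.concatenate(parts_y)
```

Fitting one copula per outcome, as done here, is itself a design assumption; it keeps the within-class correlation structure while letting both classes be generated in equal volume.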
ISBN: 9789869401265 (print)
With the popularity of Massive Open Online Courses (MOOCs), large enrollments generate a great deal of educational big data in the form of online activities and logs, which might be valuable for academia and practitioners. A more personalized and intelligent online learning environment could potentially be created through educational data mining and learning analytics techniques. Based on Item Response Theory (IRT), the current study builds an item analysis system to identify alternative concepts/misconceptions from learners' responses in exams. By calculating the difficulty parameter and the discrimination parameter from massive numbers of learners, our system is believed to benefit both teaching faculty and online learners. With the affordances of the system, teaching faculty could assess learners' performance and the quality of test items, while learners' alternative conceptions would be identified for strategic learning. Other practical and technical implications will be discussed in this paper.
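For readers unfamiliar with the two item parameters mentioned above, the following sketch shows how they enter the widely used two-parameter logistic (2PL) IRT model. The abstract does not state which IRT model the system uses, so the choice of 2PL is an assumption, and the function name `p_correct_2pl` is hypothetical.

```python
import math


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model (assumed here).

    theta: learner ability
    a:     item discrimination (steepness of the curve at theta == b)
    b:     item difficulty (ability at which the probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# A learner whose ability equals the item difficulty has an even chance.
even_chance = p_correct_2pl(theta=0.0, a=1.5, b=0.0)
# A more discriminating item separates nearby abilities more sharply.
gap_low_a = p_correct_2pl(0.5, 0.5, 0.0) - p_correct_2pl(-0.5, 0.5, 0.0)
gap_high_a = p_correct_2pl(0.5, 2.0, 0.0) - p_correct_2pl(-0.5, 2.0, 0.0)
```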
ISBN: 9781728117645 (print)
Automated vehicles are AI-based safety-critical robots that fulfill transportation needs while interacting with the general public in traffic. Software engineering for automated vehicles requires a DevOps-style process with special considerations for functions based on machine learning and incremental safety assurance at vehicle and fleet level. This technical briefing reviews current challenges, industry practices, and opportunities for future research in software engineering for automated vehicles.
ISBN: 9780996452762 (print)
When sensor fusion operations are conducted in coalition environments, the security of the data and infrastructure used for model fusion is very important. AI-enabled sensor fusion infrastructure can be attacked on many fronts, including attacks on the data used for sensor information fusion and disruption of the communication between devices and the fusion nodes, in addition to traditional security attacks. As the infrastructure for sensor fusion becomes more automated, with multiple intelligent assistants for data collection, different types of attacks become possible. AI-enabled approaches can be used to improve the security and resiliency of federated networks, and the data that is shared across coalition problems. In this paper, we discuss the challenges associated with the security of coalition infrastructures, and approaches to improve security using AI and machine learning techniques.
ISBN: 9783030623616; 9783030623623 (print)
For the management and operation of a Wastewater Treatment Plant (WWTP), the influent flow is one of the most important variables. Hence, this paper presents an evaluation of multiple deep learning models to forecast the influent flow in WWTPs for the next three days, taking into account previous influent observations as well as historical climatological data. Long Short-Term Memory networks (LSTMs) and one-dimensional Convolutional Neural Networks (CNNs), following a channels-last approach, were conceived to tackle this time series problem. The best candidate LSTM model was able to forecast the influent flow with an approximate overall error of 200 m³ for the three forecast days. On the other hand, the best candidate CNN model presented a slightly higher error, being outperformed by LSTM-based models. Nonetheless, CNNs, which are typically applied in the computer vision domain, also showed interesting performance for time series forecasting.
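To make the channels-last, multi-step setup above concrete, the sketch below shows one common way to cut a multivariate daily series into arrays of shape (samples, timesteps, features) with a three-step target. It is an illustrative assumption about data preparation, not the paper's pipeline; the function name `make_windows`, the lag length, and the convention that column 0 holds the influent flow are all hypothetical.

```python
import numpy as np


def make_windows(series, n_lags, horizon):
    """Slice a series into channels-last input windows and multi-step targets.

    series:  array of shape (time, features); column 0 is assumed to be the
             variable to forecast (for example, influent flow).
    n_lags:  number of past timesteps in each input window.
    horizon: number of future timesteps to predict (three days in the abstract).
    Returns X with shape (samples, n_lags, features) and y with shape
    (samples, horizon).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a 1-D series as a single feature
    n_samples = series.shape[0] - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + n_lags] for i in range(n_samples)])
    y = np.stack(
        [series[i + n_lags:i + n_lags + horizon, 0] for i in range(n_samples)]
    )
    return X, y


# Ten days of two features (flow and one climate variable, both synthetic).
demo = np.column_stack([np.arange(10.0), np.arange(10.0) * 0.1])
X_demo, y_demo = make_windows(demo, n_lags=4, horizon=3)
```

Both LSTM and one-dimensional CNN layers in common deep learning libraries accept this (samples, timesteps, features) layout, which is what channels-last means for sequence data.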
ISBN: 9781665424271 (print)
Negation is a linguistic phenomenon that usually occurs in a text to express denial or refutation of some occurrence. Detection of such negative assertions is an essential sub-task in various applications of information extraction and data mining. In this paper, we present a deep multitask learning (MTL) framework to enhance the performance of negation scope detection using part-of-speech (POS) tagging as an auxiliary task. We show how the relationship between these two tasks, which do not seem to be easily linked from a linguistic point of view, is mutually beneficial.
ISBN: 9781728103235 (print)
We propose a new deep-learning-based system for short-term prediction of pedestrian behavior in front of a vehicle. To achieve this, we first develop a framework for class-specific object tracking and short-term path prediction based on a variant of a Variational Recurrent Neural Network (VRNN), which incorporates latent variables corresponding to a dynamic state space model. The low-level visual features learned from this system were found to be highly informative for the discrete intention prediction task (i.e., predicting whether a pedestrian is stopping or crossing), and achieved high performance on the Daimler benchmark. This is despite a much smaller training dataset than is normally used for training deep learning models. To the best of our knowledge, we are the first to apply deep learning to this problem without using externally trained pedestrian pose estimation systems. Our system performs comparably to the state-of-the-art approach that relies on pose estimation, and runs in real time.