ISBN (print): 9781665438070
University institutions currently face challenges in conducting the admissions process for new students. The process must be carried out correctly to ensure that accepted prospective students have the abilities needed to meet academic targets in their chosen field of study. Admission is based on predetermined criteria, with the weight of each requirement set according to the policy for that period. There is often a mismatch between the abilities of the accepted candidates and the skills needed in the chosen field, which puts these students at risk of dropping out. One way to avoid this is to identify the essential criteria in the admission test. We create two models based on the chosen algorithms. The Random Forest algorithm achieves a better accuracy rate of 85.17%, compared with the 80.27% accuracy rate of the Neural Network algorithm. This study finds that the most important feature of the admission process is school ranking, which has the most significant influence of all features, with an importance rate of more than 20%. The study also finds a significant difference in the gender distribution of applicants: among accepted applicants the ratio is 24% male to 76% female, and among rejected applicants 25% male to 75% female. Because this study is based on admission test data, the most important feature it identifies can be used as a basis for policymaking in future admission tests.
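A minimal sketch of the kind of comparison the abstract describes, using scikit-learn; the file name, feature columns (school_rank, test_score, gender) and label column (accepted) are assumptions, and all features are assumed to be numerically encoded.

# Sketch: compare Random Forest and a neural network on admission-test data,
# then inspect feature importances. Column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("admissions.csv")                        # assumed file name
X = df[["school_rank", "test_score", "gender"]]           # assumed, numerically encoded features
y = df["accepted"]                                        # assumed binary label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
nn = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0).fit(X_tr, y_tr)

print("Random Forest accuracy:", accuracy_score(y_te, rf.predict(X_te)))
print("Neural Network accuracy:", accuracy_score(y_te, nn.predict(X_te)))

# The abstract reports school ranking as the most important feature (over 20%).
for name, importance in zip(X.columns, rf.feature_importances_):
    print(name, round(importance, 3))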
Purpose The impact of cyberattacks around the world has been increasing at a constant rate every year. Performing exploratory analysis helps organizations to identify, manage and safeguard the information that could be vulnerable to cyberattacks. It encourages the creation of a plan for security controls that can help protect data, keep constant tabs on threats and monitor the organization's networks for breaches. Design/methodology/approach The purpose of this experimental study is to demonstrate the use of data science in analyzing data and to provide a more detailed view of the most common cybersecurity attacks, the most accessed logical ports, visible patterns, and the trends and occurrence of attacks. The data to be processed was obtained by aggregating data provided by a company's technology department, which includes network flow data produced by nine different types of attacks within everyday user activities. This could be insightful for many companies in measuring the damage caused by these breaches, and it also gives a foundation for future comparisons and serves as a basis for proactive measures within industry and organizations. Findings The most common cybersecurity attacks, the most accessed logical ports and their visible patterns were found in the acquired data set. The strategies that attackers have used with respect to time, type of attack, specific ports, IP addresses and their relationships have been determined. A statistical hypothesis test was also performed to check whether attackers were confined to performing random attacks or targeted specific machines with some pattern. Originality/value Policies can be suggested such that, if an attack is conducted on a specific machine, it can be prevented by identifying the machines, ports and duration of the attacks the attacker is targeting, and such policies can be formulated for the organization to follow in order to tackle these targeted attacks in the future.
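An illustrative sketch (not the paper's code) of the exploratory steps described: counting attack types and destination ports, and a chi-square test of whether attacks are spread uniformly across target machines. The file and column names (attack_type, dst_port, dst_ip) are assumptions.

# Sketch: exploratory analysis of aggregated network-flow records.
# File and column names are hypothetical.
import pandas as pd
from scipy.stats import chisquare

flows = pd.read_csv("network_flows.csv")                  # assumed file name

print(flows["attack_type"].value_counts())                # most common attack types
print(flows["dst_port"].value_counts().head(10))          # most accessed logical ports

# Hypothesis test: H0 = attackers hit target machines uniformly at random.
attack_counts = flows.loc[flows["attack_type"] != "benign", "dst_ip"].value_counts()
stat, p_value = chisquare(attack_counts)
print("chi-square p-value:", p_value)  # a small p-value suggests specific machines are targeted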
ISBN (digital): 9781728192512
ISBN (print): 9781728192512
Data analysis helps travel organizations to provide better recommendations for investing in future trips based on customers' business and personal trips. This paper presents the basic concepts, the various types and levels of data analysis, predictive modeling techniques and appropriate performance measures. There are basically three types of prediction algorithms considered: linear regression (a machine learning model), analysis of variance (a statistical model) and artificial neural networks (a machine learning model). Data analysis is used in many fields such as health care, manufacturing, information technology and so on. A travel dataset provided by Uber on Kaggle is used to study the performance of the chosen prediction algorithms. The primary methodology of this study is to analyze and find the accuracy of predicting the most frequent category of trip among all trips taken by a customer in a region using data analysis. The parameters taken into consideration are the category, purpose, total distance and speed of the travel. The results for precision, recall, F1 score, Area Under the Curve (AUC) and the Receiver Operating Characteristic (ROC) curve show that the artificial neural network (ANN) based prediction performs comparatively better than the other algorithms.
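A sketch of the evaluation described above; the trip file, its columns (MILES, SPEED, CATEGORY) and the binary framing of the category label are assumptions.

# Sketch: predict the trip category with an ANN and report the metrics named
# in the abstract (precision, recall, F1, AUC). Column names are hypothetical.
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score

trips = pd.read_csv("uber_trips.csv")                     # assumed file name
X = trips[["MILES", "SPEED"]]                             # assumed numeric features
y = (trips["CATEGORY"] == "Business").astype(int)         # assumed binary label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
ann = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=1000, random_state=1).fit(X_tr, y_tr)

print(classification_report(y_te, ann.predict(X_te)))     # precision, recall, F1 score
print("AUC:", roc_auc_score(y_te, ann.predict_proba(X_te)[:, 1]))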
ISBN (print): 9798350307696
Failure detection and isolation (FDI) is a crucial step in diagnostics and is quickly shifting towards analytical techniques such as machine learning and deep learning, rather than traditional rule-based approaches. This is partially due to the availability of sensor systems, hardware and networking that allow vast amounts of data to be collected and processed. However, this information is prone to issues such as noise, corruption, and poor formatting and recording practices. In most cases, a diagnostics project may stall midway due to late discovery of these problems. This paper proposes exploring the data beforehand to locate issues in the data and/or optimize data quality, in order to maximize performance or explain possible performance loss. Various techniques such as data visualization, statistical analysis and feature importance are discussed. Most importantly, domain knowledge should be integrated with such correlation-based methods to ensure that data quality decisions are made with an understanding of the system. The limitations of such analysis, including scalability and interpretation issues, are discussed as well, leading to proposals of possible future paths to improvement such as sensor fusion and AI-based recommendations.
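A small sketch of the pre-modelling data-quality checks the paper argues for (missing values, dead sensors, value ranges, a correlation-based relevance screen); the file name, sensor columns and failure label are assumptions, and all sensor columns are assumed numeric.

# Sketch: data-quality screening before building an FDI model.
# File name, sensor columns and the failure label are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("sensor_log.csv")                      # assumed file name

print(data.isna().mean().sort_values(ascending=False))    # fraction of missing values per column
print(data.nunique()[data.nunique() <= 1])                # constant ("dead") sensors
print(data.describe().T[["mean", "std", "min", "max"]])   # ranges and obvious outliers

# Correlation-based relevance screen, to be cross-checked with domain knowledge.
X = data.drop(columns=["failure"]).fillna(data.median(numeric_only=True))
y = data["failure"]
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(sorted(zip(X.columns, rf.feature_importances_), key=lambda t: -t[1])[:10])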
ISBN (print): 3540223436
We illustrate with two simple examples how Interactive Evolutionary Computation (IEC) can be applied to exploratory data analysis (EDA). IEC is particularly valuable in an EDA context because the objective function is by definition either unknown a priori or difficult to formalize. The first example involves what is probably the simplest possible transformation of data: linear projections. While the concept of linear projections is simple to grasp, in practice finding the appropriate two-dimensional projection that reveals important features of high-dimensional data is no easy task. We show how IEC can be used to quickly find the most informative linear projection(s). In another, more complex example, IEC is used to evolve the "true" metric of attribute space. Indeed, the assumed distance function in attribute space strongly conditions the information content of a two-dimensional display of the data, regardless of the dimension reduction approach. The goal here is to evolve the attribute space distance function until "interesting" features of the data are revealed when a clustering algorithm is applied.
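An illustrative sketch of the first example's interactive loop (not the authors' implementation): candidate 2-D linear projections are evolved, and the analyst's rating of each scatter plot plays the role of the objective function. The data, population size and the input()-based rating step are assumptions.

# Sketch: interactive evolution of a 2-D linear projection for EDA.
# The analyst's visual rating replaces a formal objective function.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                  # placeholder high-dimensional data

def random_projection(d, rng):
    Q, _ = np.linalg.qr(rng.normal(size=(d, 2)))    # orthonormal d x 2 projection
    return Q

def user_score(Y):
    # Stand-in for displaying the scatter plot Y and asking the analyst to rate it.
    return float(input("Rate this projection from 0 to 10: "))

population = [random_projection(X.shape[1], rng) for _ in range(6)]
for generation in range(5):
    scores = [user_score(X @ W) for W in population]
    best = population[int(np.argmax(scores))]
    # Next generation: keep the best projection and mutate it with small perturbations.
    population = [best] + [np.linalg.qr(best + 0.1 * rng.normal(size=best.shape))[0]
                           for _ in range(5)]

print("Best projection matrix:\n", best)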
ISBN (print): 9783031254765; 9783031254772
Mobile health involves gathering smartphone-sensor data passively from users' phones as they live their lives "in the wild", periodically annotating the data with health labels. Such data is used by machine learning models to predict health. Purely computational approaches generally do not support interpretability of the results produced by such models. In addition, interpreting such results may become difficult with larger study cohorts, which makes population-level insights desirable. We propose Population Level Exploration and analysis of smartphone DEtected Symptoms (PLEADES), an interactive visual analytics framework for presenting smartphone-sensed data. Our approach uses clustering and dimension reduction to discover similar days based on objective smartphone sensor data, across participants, for population-level analyses. PLEADES enables analysts to apply various clustering and projection algorithms to several smartphone-sensed datasets. PLEADES overlays human-labelled symptom and contextual information from in-the-wild collected smartphone-sensed data to empower the analyst to interpret findings. Such views enable the contextualization of the symptoms that can manifest in smartphone sensor data. We used PLEADES to visualize two real-world, in-the-wild collected datasets with objective sensor data and human-provided health labels. We validate our approach through evaluations with data visualization and human context recognition experts.
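A rough sketch of the computation underneath such a view (standardize per-day sensor features, reduce to 2-D, cluster, then overlay the self-reported labels); the file, feature and label column names are assumptions, and PCA/KMeans stand in for whichever projection and clustering algorithms the analyst selects.

# Sketch: project per-day smartphone-sensor feature vectors to 2-D and cluster
# them, so similar days can be inspected alongside self-reported symptoms.
# File, feature and label column names are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

days = pd.read_csv("daily_features.csv")                  # assumed: one row per participant-day
features = ["steps", "screen_time", "location_entropy"]   # assumed sensor-derived features

Z = StandardScaler().fit_transform(days[features])
embedding = PCA(n_components=2).fit_transform(Z)          # 2-D coordinates for plotting
days["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(Z)

# Overlay the human-provided labels on the discovered clusters, as the tool does visually.
print(days.groupby("cluster")["symptom_label"].value_counts())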
The main aim of every student and parent of the current generation is to gain the best education in order to have a bright future and the best career opportunities. Undoubtedly there are multiple career opportunities these day...
Analyzing software repositories with thousands of artifacts is data intensive, which makes interactive exploration analysis of such data infeasible. We introduce a novel approach, Dominoes, that can support automated ...
We review two forms of immediate reward reinforcement learning: in the first of these, the learner is a stochastic node, while in the second the individual unit is deterministic but has stochastic synapses. We illustrate the first method on the problem of Independent Component Analysis. Four learning rules have been developed from the second perspective, and we investigate the use of these learning rules to perform linear projection techniques such as principal component analysis, exploratory projection pursuit and canonical correlation analysis. The method is very general and simply requires a reward function that is specific to the function we require the unit to perform. We also discuss how the method can be used to learn kernel mappings and conclude by illustrating its use on a topology preserving mapping. Crown Copyright (C) 2008 Published by Elsevier Ltd. All rights reserved.
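As an illustrative sketch (not the paper's exact learning rules), the first form can be imitated with a stochastic linear node trained by an immediate-reward REINFORCE-style update: with reward equal to the squared output and a normalized weight vector, the node is pushed toward the leading principal component of the data. The synthetic data, noise level and learning rate below are assumptions.

# Sketch: immediate-reward reinforcement learning with a stochastic linear node.
# Reward = squared output, so maximizing expected reward aligns the (normalized)
# weight vector with the direction of greatest variance (a PCA-like projection).
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with most of its variance along the first axis.
X = rng.normal(size=(5000, 5)) @ np.diag([3.0, 1.0, 0.5, 0.5, 0.5])

w = rng.normal(size=5)
w /= np.linalg.norm(w)
sigma, lr = 0.5, 0.001

for x in X:
    mean = w @ x
    y = mean + sigma * rng.normal()               # stochastic node output
    reward = y ** 2                               # reward chosen for the target function
    w += lr * reward * (y - mean) / sigma**2 * x  # REINFORCE-style immediate update
    w /= np.linalg.norm(w)                        # keep the projection normalized

# Compare with the true first principal component of the data.
pc1 = np.linalg.eigh(np.cov(X.T))[1][:, -1]
print("alignment with the first principal component:", abs(w @ pc1))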