Big-data systems are increasingly shared by diverse, data-intensive applications from different domains. However, existing systems lack support for I/O management, and the performance of big-data applications degra...
ISBN: 9781450341905 (print)
An enormous amount of educational data has been accumulated through Massive Open Online Courses (MOOCs) and through commercial and non-commercial learning platforms. This is in addition to the educational data that the US government has released since 2012 to facilitate disruption in education by making data freely available. The high volume, variety, and velocity of the collected data necessitate the use of big data tools and storage systems, such as distributed databases for storage and Apache Spark for analysis. This tutorial will introduce researchers and faculty to real-world applications of data mining and predictive analytics in the learning sciences. In addition, the tutorial will introduce the statistics required to validate and accurately report results. Topics will cover how big data is being used to transform education. Specifically, we will demonstrate how exploratory data analysis, data mining, predictive analytics, machine learning, and visualization techniques are being applied to educational big data to improve learning and to scale insights derived from millions of student records. The tutorial will be held over a half day and will be hands-on, with pre-posted material. Because the work is interdisciplinary, the tutorial appeals to researchers from a wide range of backgrounds, including big data, predictive analytics, learning sciences, and educational data mining, and in general to anyone interested in how big data analytics can transform learning. As a prerequisite, attendees are required to be familiar with at least one programming language.
ISBN: 9783319440651 (print)
The proceedings contain 24 papers. The special focus in this conference is on Short Papers, Big Data Applications and Principles, Data Centered Smart Applications and ADBIS Doctoral Consortium. The topics include: Towards automated performance optimization of BPMN business processes; pixel-based analysis of information dashboard attributes; towards adaptive distributed top-k query processing; basis functions as pivots in space of users preferences; towards semi-structured JSON big data; skyline algorithms on streams of multidimensional data; canonical data model for data warehouse; a quality-based query rewriting algorithm for data integration; towards spatial crowdsourcing in vehicular networks using mobile agents; shift of image processing technologies to column-oriented databases; influence of parallelism property of streaming engines on their performance; reducing big data by means of context-aware tailoring; feature ranking and selection for big data sets; a bagged associative classifier for big data frameworks; a new parallel approximate subspace clustering algorithm; smart modeling for lightweight mobile application development methods; an implementation method of an information credibility calculation system for emergency such as natural disasters; model capsules for research and engineering networks; usage of aspect-oriented programming in adaptive application structure; short-term user behaviour changes modelling; and framework for managing distinct versions of data in relational databases.
ISBN: 9781509036837 (print)
Many advances have been made in the design of full replication protocols in distributed systems, and causal consistency in such systems has received great interest. However, most existing works focus on implementations under full replication because it simplifies the design of the algorithm. More recently, interest has shifted from full replication to the development of partial replication protocols, which emphasize better utilization of network capacity. In this paper, we present analytic data comparing the performance of three proposed protocols under partial replication and full replication. We also give simulation results that show the advantage of partial replication over full replication.
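The network-capacity argument in this abstract can be illustrated with a small counting sketch. It is not one of the three protocols from the paper: it only counts update messages, ignores the dependency metadata that a causal protocol must also carry, and the function name `update_messages`, the placement map, and the workload are all hypothetical.

```python
def update_messages(writes, placement, n_sites):
    """Count update messages under full and partial replication.

    writes:    list of (origin_site, variable) pairs (hypothetical workload)
    placement: dict mapping each variable to the set of sites that hold a
               replica of it under partial replication (an assumption)
    n_sites:   total number of sites in the system
    """
    full = 0
    partial = 0
    for origin, var in writes:
        # Full replication: every other site stores the variable,
        # so each write is propagated to n_sites - 1 sites.
        full += n_sites - 1
        # Partial replication: only the replicas of this variable
        # (excluding the writer itself) receive the update.
        partial += len(placement[var] - {origin})
    return full, partial


placement = {"x": {0, 1}, "y": {1, 2, 3}, "z": {4}}
writes = [(0, "x"), (1, "y"), (4, "z"), (2, "x")]
full, partial = update_messages(writes, placement, n_sites=5)
print(full, partial)  # 16 5
```

With four writes on five sites, full replication sends 16 update messages while this placement sends 5. The gap narrows as the replication factor of each variable approaches the number of sites.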
ISBN: 9783319480565 (print)
The proceedings contain 29 papers. The special focus in this conference is on Big Data Analytics, Cloud Data Management, Internet of Things, Security, Privacy Engineering, Data Protection, Data Hiding, Context-Based Data Analysis, Emerging Data Management Systems and Applications. The topics include: Incorporating trust, certainty and importance of information into knowledge processing systems - an approach; incremental parallel support vector machines for classifying large-scale multi-class image datasets; a large-scale two-level clustering similarity search with MapReduce; immune approach to the protection of IoT devices; heuristic-guided verification for fast congestion detection on wireless sensor networks; security risk management in the aviation turnaround sector; a novel encryption mechanism for door lock; information and identity theft without ARP spoofing in LAN environments; a watermarking framework for outsourced and distributed relational databases; face quality measure for face authentication; using graph database for evidence correlation on Android smartphones; a secure token-based communication for authentication and authorization servers; an enhancement of the Rew-XAC model for workflow data access control in healthcare; trust and risk-based access control for privacy preserving threat detection systems; fine grained attribute based access control model for privacy protection; automatic extraction of semantic relations from text documents; the present and future of large-scale systems modeling and engineering; non-disjoint multi-agent scheduling problem on identical parallel processors; and an evaluative model to assess the organizational efficiency in training corporations.
ISBN: 9781509041534 (print)
Big data denotes voluminous amounts of semi-structured and/or unstructured data that grow enormously and are difficult to handle because of the associated complexities. Big data is too large to process using traditional database systems. MapReduce is a framework for processing big data in parallel on a large number of commodity computers. This model has inspired many parallel computing frameworks, such as Hadoop. Data outsourcing is a frugal option for organizations that would otherwise have to build and maintain their own data management systems. However, precautions have to be taken to ensure that their data is not compromised. It is a challenge to protect the data and safeguard the privacy of users, especially in distributed environments. The focus of this paper is to reduce the overhead of searchable encryption on Hadoop by processing huge volumes of data in parallel, which minimizes the time taken for encryption. The encryption and search operations support multiple users, and the framework is designed to make adding and revoking user access privileges easy. The increase in size owing to encryption is offset by enabling compression.
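The general idea of a searchable index built in parallel, with compression to limit size, can be sketched as follows. This is not the paper's scheme and does not use Hadoop: a thread pool stands in for parallel map tasks, deterministic HMAC-SHA256 keyword tokens stand in for a searchable-encryption index, record bodies are only compressed (the payload cipher is omitted because the Python standard library provides none), and the key, the record format, and all function names are hypothetical.

```python
import hashlib
import hmac
import zlib
from concurrent.futures import ThreadPoolExecutor

KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def keyword_token(word: str) -> str:
    """Deterministic search token: HMAC-SHA256 of the lowercased keyword."""
    return hmac.new(KEY, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def process_record(record):
    """Map step: return (record id, keyword tokens, compressed body)."""
    rec_id, text = record
    tokens = {keyword_token(w) for w in text.split()}
    # The body is compressed only; a real system would also encrypt it.
    body = zlib.compress(text.encode("utf-8"))
    return rec_id, tokens, body


def build_index(records, workers=4):
    """Process records in parallel and build a token -> record ids index."""
    index = {}
    store = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for rec_id, tokens, body in pool.map(process_record, records):
            store[rec_id] = body
            for token in tokens:
                index.setdefault(token, set()).add(rec_id)
    return index, store


def search(index, word):
    """Look up a keyword by its token; the index never stores the keyword."""
    return sorted(index.get(keyword_token(word), set()))


records = [
    (1, "hadoop cluster logs"),
    (2, "encrypted hadoop storage"),
    (3, "user access list"),
]
index, store = build_index(records)
print(search(index, "Hadoop"))  # [1, 2]
```

Because the tokens are deterministic, equal keywords produce equal tokens, which leaks keyword frequency to whoever holds the index. Per-user keys and revocation, which the abstract mentions, are outside the scope of this sketch.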