Purpose: This study aims to empirically investigate the impact of adopting big data and data analytics (BD&A) on audit quality (AQ). Design/methodology/approach: A questionnaire was distributed among audit practitioners working at audit firms in Egypt, and 205 responses were collected. Partial least squares structural equation modeling (PLS-SEM) was used to analyze the data and test the research hypotheses. Findings: The results reveal that BD&A has a direct, significant positive effect on the audit process (AP) and auditor competence (AC). However, BD&A is found to have an insignificant impact on audit fees (AF). In addition, the results indicate that BD&A has significant positive direct and indirect impacts on AQ. Research limitations/implications: The results of this study will benefit several auditing stakeholders, such as audit firms, audit regulators, novice financial auditors and academic researchers. Originality/value: This research is one of the earliest to empirically address the role of BD&A in enhancing AQ. It incorporates AP, AC and AF as mediators in a single model to explain the impact of BD&A on AQ. It also attempts to provide empirical evidence from a developing country with a less-regulated audit environment.
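The mediation structure described above (BD&A affecting AQ directly and indirectly through mediators such as AP) can be illustrated with a much simpler regression-based sketch. The snippet below is not the authors' PLS-SEM model; it is a minimal, hypothetical Python illustration of estimating a direct and an indirect (mediated) path on simulated data, with variable names (bda, ap, aq) chosen here purely for illustration.

```python
# Illustrative sketch only, NOT the authors' PLS-SEM model: a simple regression-based
# mediation check. Variable names and the synthetic data are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 205  # same sample size as the study; the data themselves are simulated
bda = rng.normal(size=n)                                     # BD&A adoption composite
ap = 0.5 * bda + rng.normal(scale=0.8, size=n)               # audit process, partly driven by BD&A
aq = 0.4 * ap + 0.3 * bda + rng.normal(scale=0.8, size=n)    # audit quality
df = pd.DataFrame({"bda": bda, "ap": ap, "aq": aq})

# Path a: BD&A -> AP; path b: AP -> AQ controlling for BD&A; direct path c': BD&A -> AQ.
a_path = smf.ols("ap ~ bda", data=df).fit()
b_path = smf.ols("aq ~ ap + bda", data=df).fit()

indirect = a_path.params["bda"] * b_path.params["ap"]
print("direct effect (c'):", round(b_path.params["bda"], 3))
print("indirect effect via AP (a*b):", round(indirect, 3))
```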
ISBN (print): 9781728126074
Predictive analytics is of great interest when it comes to enhancing Business Intelligence. Businesses have already started to use big data analytics, particularly predictive and prescriptive analytics, to strengthen and increase their business yields. Not only has analytics resulted in business growth, but it has also provided a significant competitive edge over others. The voluminous data generated from various sources is highly unstructured in nature, and adding structure to it would leverage the actual potential of the data. New techniques and frameworks should serve as human aids in automatically and intelligently analyzing large datasets in order to acquire useful information. In this paper, we attempt to perform big data analytics on data from one of the most important and growing sources, namely, telecommunication. To keep pace with the growing telecommunication market and the ever-increasing demands of consumers for quality service, telecom service providers are required to observe and estimate various trends in customers' usage to plan future upgrades and deployments driven by real data. We have attempted to use several data mining techniques to find hidden and interesting patterns in the telecom data generated by the Telecom Italia cellular network for the city of Milano, Italy. K-means clustering is used to categorize the usage statistics, while several machine learning algorithms such as Decision Tree, Random Forest, Logistic Regression and SVM are used for predicting the usage of telecom services. In the end, a performance comparison matrix is generated to rate the performance of these algorithms on the given dataset. All these experiments are performed on the big data environment set up at the supercomputing infrastructure of CDAC. Given such a matrix, the results can be applied to similar datasets pertaining to other domains as well.
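As a rough illustration of the workflow described in this abstract (K-means to categorize usage, then several classifiers compared), the sketch below uses scikit-learn on synthetic data. The feature layout, cluster count and evaluation setup are assumptions made here for illustration; the actual Telecom Italia/Milano dataset and the CDAC big data environment are not reproduced.

```python
# Illustrative sketch of the general workflow: K-means to bucket usage levels,
# then several classifiers compared on predicting those buckets. Synthetic data only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))  # hypothetical usage features (e.g. SMS, calls, internet, time of day)

# Step 1: derive usage categories with K-means (e.g. low/medium/high usage).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: compare classifiers at predicting the usage category.
models = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(random_state=0),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, labels, cv=5)
    print(f"{name:20s} mean accuracy = {scores.mean():.3f}")
```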
ISBN (print): 9781538632505
In the era of big data, new scientific applications such as those used in astronomy [1] are emerging and challenging High Performance Computing (HPC) systems and software. Traditionally, HPC applications were compute-bound, with light use of the I/O capabilities at the start and end of the execution. In contrast, emergent applications exhibit data-intensive behaviors, raising several new challenges to be faced by hardware and software.
Background: Chronic diseases, such as opioid use disorder (OUD), require a multifaceted scientific approach to address their evolving complexity. The Council for the Advancement of Nursing Science's (Council) four nursing science priority areas (precision health; global health; determinants of health; and big data/data analytics) were established to provide a framework to address current complex health problems. Purpose: To examine OUD research through the nursing science priority areas and evaluate the appropriateness of the priority areas as a framework for research on complex health conditions. Method: OUD was used as an exemplar to explore the relevance of the nursing science priorities for future research. Findings: Research in the four priority areas is advancing knowledge in OUD identification, prevention, and treatment. Intersections among the priority areas were identified in the population focus and methodological approaches of OUD research. Discussion: The Council priorities provide a relevant framework for nurse scientists to address complex health problems like OUD.
ISBN (print): 9780769561493
The emergence of next-generation DNA sequencers has raised interest in short-read de novo assembly of whole genomes. Though numerous frameworks have been developed in the field, the presence of errors in reads as well as the increasing size of datasets call for scalable preprocessing methods for noise filtering. In this paper we present a filtering algorithm that targets the determination of valid k-mers in a de Bruijn graph built from short reads. Such preprocessing will help increase accuracy and reduce the memory footprint of further assembly procedures by removing erroneous k-mers from the datasets at an early stage. The algorithm leverages GraphLab, a scalable graph-processing framework not previously used in traditional assembly toolchains. The accuracy of the algorithm was evaluated with synthetic datasets exhibiting various error rates, and it was shown to determine large parts of de Bruijn graphs on datasets with error levels greater than those of real-life datasets. The implementation is executed on a distributed cluster, and a study of its scalability and operating performance is conducted, exhibiting interesting scaling properties and hence demonstrating the relevance of GraphLab in such a context.
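The core preprocessing idea, keeping only k-mers that occur often enough to be trusted, can be sketched on a single machine without GraphLab. The following Python snippet is a hypothetical, non-distributed illustration of counting k-mers and discarding rare ones as likely sequencing errors; it is not the paper's GraphLab implementation, and the threshold `min_count` is an assumed parameter.

```python
# Minimal single-machine sketch of k-mer noise filtering: count k-mers in a set of
# reads and drop those seen fewer than `min_count` times, on the usual assumption
# that rare k-mers stem from sequencing errors. NOT the paper's distributed algorithm.
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def valid_kmers(reads, k, min_count=2):
    """Return the set of k-mers kept as nodes of the cleaned de Bruijn graph."""
    return {kmer for kmer, c in kmer_counts(reads, k).items() if c >= min_count}

# Toy example: the single-base error in the third read produces k-mers seen only once,
# which are filtered out.
reads = ["ACGTACGT", "CGTACGTA", "ACGTTCGT"]
print(sorted(valid_kmers(reads, k=4)))
```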
ISBN (print): 9781509020881
Graph mining is a class of data-mining problems in which programs involve the processing of data modeled as graphs. These applications often exhibit irregular and data-dependent communication patterns, hampering parallelization opportunities on distributed architectures. Many tools and frameworks have been created for the scalable processing of graphs, but comparing them on distributed architectures is non-trivial, as there are no efficiency metrics with respect to distributed resource usage. Considering an in-house use case, program trace analysis for parallelization optimizations, we study the benefits and limits of a graph-processing framework for a tangible application. The algorithm was implemented using GraphLab and executed on a humble 7-node commodity cluster with input instances of up to 40 million vertices and 50 million edges. We propose in this paper an in-depth analysis of the GraphLab system to evaluate its performance and scalability in the context of program trace analysis. The analysis is driven by both traditional and domain-specific metrics and contributes to a better understanding of the system's behavior.
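For readers unfamiliar with the "traditional metrics" mentioned above, scalability studies of this kind typically report speedup and parallel efficiency as a function of node count. The sketch below is purely illustrative, with hypothetical timings rather than measurements from the paper's 7-node cluster.

```python
# Illustrative only: speedup and parallel efficiency versus node count.
# The timings are hypothetical, not results from the paper.
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_nodes):
    return speedup(t_serial, t_parallel) / n_nodes

# Hypothetical wall-clock times (seconds) for increasing cluster sizes.
timings = {1: 4200.0, 2: 2300.0, 4: 1350.0, 7: 900.0}
t1 = timings[1]
for nodes, t in sorted(timings.items()):
    print(f"{nodes} node(s): speedup = {speedup(t1, t):.2f}, "
          f"efficiency = {efficiency(t1, t, nodes):.2f}")
```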