This research-to-practice full paper compares data science education strategies in China and the United States, exploring whether different approaches can achieve similar educational outcomes. In the U.S., data scienc...
Data science is a complex and evolving field, but most agree that it can be defined as a combination of expertise drawn from three broad areas: computer science and technology, math and statistics, and domain knowledge - ...
Consensus on the definition of data science remains low despite the widespread establishment of academic programs in the field and continued demand for data scientists in industry. Definitions range from rebranded sta...
Data and code working together is fundamental to machine learning (ML), but the context around datasets and the interactions between datasets and code are generally captured only rudimentarily. Context such as how the dataset was prepared and created, what source data were used, what code was used in processing, how the dataset evolved, and where it has been used and reused can provide much insight, but this information is often poorly documented. That is unfortunate, since it turns datasets into black boxes with potentially hidden characteristics that have downstream consequences. We argue that making dataset preparation more accessible and dataset usage easier to record and document would have significant benefits for the ML community: it would allow for greater diversity in datasets by inviting modification to published sources, simplify the use of alternative datasets and, in doing so, make results more transparent and robust, while allowing all contributions to be adequately credited. We present a platform, Renku, designed to support and encourage such sustainable development and use of data, datasets, and code, and we demonstrate its benefits through a few illustrative projects that span the spectrum from dataset creation to dataset consumption and showcasing.
Health data and cutting-edge technologies empower medicine and improve *** has become even more true during the COVID-19 *** coronavirus data sharing and worldwide collaboration, the speed of vaccine development for COVID-19 is *** and data technologies were quickly adopted during the pandemic, showing how those technologies can be harnessed to enhance public health and healthcare. A wide range of digital data sources are being utilized and visually presented to enhance the epidemiological surveillance of *** contact tracing mobile apps have been adopted by many countries to control community *** learning has been utilized to achieve various solutions for COVID-19 disruption, including outbreak prediction and virus spread tracking.
ISBN (digital): 9798331542825
ISBN (print): 9798331542832
The research presented in this paper is motivated by the need for comprehensive quality control and assessment of the downlinked raw data from space science satellites, utilizing limited information before data calibration. We propose a framework for space science satellite raw data quality control and assessment, which includes data quality requirements, data quality control, data quality assessment, and data quality improvement. Based on essential data quality indices (such as continuity, integrity, consistency, and timeliness), we introduce a three-layer data quality control method and a multidimensional quality assessment model; data quality is then further improved through data improvement actions. To validate the proposed framework, we used raw data from the Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) as a case study. The results demonstrate that the framework is reasonable and feasible, and that the data quality control and assessment information it provides is accurate and reliable.
An essential component of an Intelligent Transportation System (ITS) is anomaly detection. There is an increasing need for the identification of unusual occurrences in the traffic network due to the yearly growth in v...
This paper aims to address some challenges that prevent monitoring, evaluation and learning systems from effectively supporting adaptive management. Using a case study of monitoring the drinking water quality c...