Because the marine ecosystem is highly complex and non-linear, its internal functional mechanisms are not yet fully understood. This gap restricts the use of traditional deterministic aquatic and ecological dynamic models that are based on mechanisms or assumptions. Drawing on soft computing, a marine ecological module is constructed by combining an intelligent algorithm, the support vector machine (SVM), with a spatial display mode, cellular automata (CA). Simulation results from the mechanism-based hydrodynamic module, such as water level, water depth, and flow velocity, are imported into the corresponding cells. The current state of each cell, together with the states of its neighbors, spatial location variables, and extrinsic factors, jointly determines the state of that cell at the next time step; in this way a coupled model based on the numerical method and soft computing is constructed. With remote-sensed chlorophyll concentration data for Bohai Bay as the study object, the root-mean-square error (RMSE) of the model simulation results was within the range of 0.042 μg/L to 0.373 μg/L, and the difference in the spatial autocorrelation indicator Moran's I was within the range of 0.003 μg/L to 0.176 μg/L. The results indicate that the coupled model properly simulates the temporal and spatial variation of the chlorophyll concentration. This study thus integrates a traditional hydrodynamic numerical module with an ecological module based on soft computing methods, providing new means and ideas for simulating the marine environment.
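The two evaluation measures named in this abstract, RMSE and Moran's I, can be illustrated with a minimal sketch. It assumes the chlorophyll field is a small rectangular grid stored as a list of rows and that Moran's I uses binary rook (4-neighbour) weights; the abstract does not state which weight matrix the authors used, and the function names `rmse` and `morans_i` are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equally shaped grids (lists of rows)."""
    squared = [
        (o - s) ** 2
        for row_o, row_s in zip(observed, simulated)
        for o, s in zip(row_o, row_s)
    ]
    return math.sqrt(sum(squared) / len(squared))


def morans_i(grid):
    """Global Moran's I with binary rook (4-neighbour) weights.

    Assumption: rook contiguity is an illustrative choice, not taken from
    the abstract. A constant grid has zero variance and raises
    ZeroDivisionError.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[value - mean for value in row] for row in grid]

    cross = 0.0   # sum of w_ij * dev_i * dev_j over neighbouring pairs
    w_total = 0   # sum of all weights w_ij
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    cross += dev[i][j] * dev[ni][nj]
                    w_total += 1

    variance_sum = sum(d * d for row in dev for d in row)
    return (n / w_total) * (cross / variance_sum)


# A checkerboard is perfectly negatively autocorrelated under rook weights.
checkerboard = [[1.0, 0.0], [0.0, 1.0]]
print(morans_i(checkerboard))  # -1.0
```

The "difference of Moran's I" reported in the abstract would then correspond to `abs(morans_i(observed) - morans_i(simulated))` for each compared pair of fields, under the same assumptions.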
A columnar data representation is known to be an efficient way to store data, specifically when analysis is often based on only a small fragment of the available data structures. A data representation such as Apache Parquet goes a step beyond a purely columnar representation by also splitting data horizontally, which allows for easy parallelization of data analysis. Based on the general idea of columnar data storage, and working on the FNAL LDRD Project FNAL-LDRD-2016-032, we have developed a striped data representation, which, we believe, is better suited to the needs of High Energy Physics data analysis. A traditional columnar approach allows for efficient data analysis of complex structures. While keeping all the benefits of columnar data representations, the striped mechanism goes further by enabling easy parallelization of computations without requiring special hardware. We will present an implementation and some performance characteristics of such a data representation mechanism using a distributed no-SQL database or a local file system, unified under the same API and data representation model. The representation is efficient and at the same time simple, so it allows for a common data model and APIs across a wide range of underlying storage mechanisms, such as distributed no-SQL databases and local file systems. Striped storage adopts Numpy arrays as its basic data representation format, which makes it easy and efficient to use in Python applications. The Striped data Server is a web service, which hides the server implementation details from the end user, easily exposes data to WAN users, and makes it possible to use well-known and mature data caching solutions to further increase data access efficiency. We are considering the Striped data Server as the core of an enterprise-scale data analysis platform for High Energy Physics and similar areas of data processing. We have been testing this architecture with a 2TB dataset from a CMS dark matter search and
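The idea of splitting columns horizontally so that stripes can be processed independently can be sketched as follows. This is only an illustration under stated assumptions: it uses plain Python lists rather than the Numpy arrays the abstract mentions, so that it runs without third-party packages, and the names `make_stripes`, `count_above`, and `parallel_count` are hypothetical and not part of the Striped data Server API.

```python
from concurrent.futures import ThreadPoolExecutor


def make_stripes(columns, stripe_size):
    """Split a column-oriented dataset into horizontal stripes.

    `columns` maps a column name to a list of values; all columns are
    assumed to have the same length. Each stripe keeps the columnar
    layout but holds at most `stripe_size` rows.
    """
    n_rows = len(next(iter(columns.values())))
    return [
        {name: values[start:start + stripe_size] for name, values in columns.items()}
        for start in range(0, n_rows, stripe_size)
    ]


def count_above(stripe, column, threshold):
    """Per-stripe work: reads only the one column the analysis needs."""
    return sum(1 for value in stripe[column] if value > threshold)


def parallel_count(stripes, column, threshold, workers=4):
    """Map the per-stripe function over all stripes, then reduce by summing."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(
            lambda stripe: count_above(stripe, column, threshold), stripes
        )
        return sum(partial_counts)


# Hypothetical event data: only the "pt" column is touched by the analysis.
events = {
    "pt": [10.0, 55.0, 31.5, 70.2, 5.1],
    "eta": [0.1, -1.2, 2.0, 0.4, -0.7],
}
stripes = make_stripes(events, stripe_size=2)
print(len(stripes))                          # 3
print(parallel_count(stripes, "pt", 30.0))   # 3
```

Because each stripe is self-contained, the same map-and-reduce pattern would apply whether stripes live in a local file system or a distributed store, which is the unification the abstract describes.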
The vast majority of security breaches encountered in recent years are a direct result of insecure source code. Therefore, the protection of software critically depends on the identification of security defect in source co...
ISBN: (print) 9781509036776
The proceedings contain 53 papers. The topics discussed include: justifying SSD storage in enterprise cloud environments; an energy-aware workload balancing method for cloud video data storage management; ABS: agent-based scheduling for data-intensive workflow in software-as-a-service environments; template-based genetic algorithm for QoS-aware task scheduling in cloud computing; parallelizing k-means-based clustering on Spark; taming big data scheduling with locality-aware scheduling; a data streams analysis strategy based on Hoeffding tree with concept drift on Hadoop system; fast construction of an index tree for large non-ordered discrete datasets using multi-way top-down split and MapReduce; a study of age distribution inference in Sina Weibo; and a rule-based knowledge discovery engine embedded semantic graph knowledge repository for retail business.
ISBN: (print) 9783319671802; 9783319671796
Public health in developed countries is heavily affected by pollution, especially in highly populated areas. Among the pollutants with the greatest impact on health, ozone is specifically addressed in this paper because of its effect on cardiovascular and respiratory problems and their prevalence in developed societies. Local authorities are compelled to provide satisfactory predictions of ozone levels, and thus the need for proper estimation tools arises. A data-driven approach to prediction demands high-quality data, but the observations collected by weather stations usually fail to meet this requirement. This paper reports a new approach to robust prediction of ozone levels that uses an outlier detection technique in an innovative way. The aim is to assess the feasibility of using raw data without preprocessing to obtain results similar to or better than those of traditional outlier removal techniques. An experimental dataset from a location in Spain, Ponferrada, is used in an experimental stage in which this approach provides satisfactory results in a difficult case.
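For context, the kind of "traditional outlier removal" preprocessing that this abstract uses as its point of comparison might look like the sketch below. It is an assumption for illustration only: the abstract names neither the baseline technique nor the authors' own method, so the median absolute deviation (MAD) rule, the 3.5 threshold, and the function names here are all hypothetical choices.

```python
import statistics


def mad_outlier_flags(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. If MAD is zero (at least half of the values
    are identical), no value is flagged.
    """
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = statistics.median(abs_dev)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * d / mad > threshold for d in abs_dev]


def drop_outliers(values, threshold=3.5):
    """Return the values that are not flagged as outliers."""
    flags = mad_outlier_flags(values, threshold)
    return [v for v, is_outlier in zip(values, flags) if not is_outlier]


# Hypothetical hourly ozone readings with one sensor glitch.
readings = [40, 42, 41, 43, 39, 400]
print(drop_outliers(readings))  # [40, 42, 41, 43, 39]
```

The approach reported in the abstract aims to avoid such a preprocessing step altogether by feeding the raw observations to the prediction stage.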
3D ultra-high-resolution videos can be downloaded within seconds by deploying state-of-the-art 5G technology. It handles big data with less delay and provides more bandwidth. The Internet of Things (IoT) is the phys...
This paper proposes and improves a model for China's cotton reserves trading market, generates a more accurate price level table than the widely used China Cotton Association (CCA) table, and predic...
ISBN: (digital) 9783030050900
ISBN: (print) 9783030050900; 9783030050894
Irrelevant or noisy features in high-dimensional data present significant challenges to methods that detect mislabeled instances in high-dimensional data by means of feature selection. Traditional methods often perform two dependent steps: first, searching for the relevant subspace, and second, training a model on the feature subspace obtained in the first step. However, feature subspaces that are unrelated to the noise scores can degrade detection performance. In this paper, we propose a novel sequential ensemble method, SENF, that aggregates these two phases. Our method learns sequential ensembles to obtain refined feature subspaces and improves detection accuracy through iterative sparse modeling with the noise scores as the regression target attribute. Through extensive experiments on 8 real-world high-dimensional datasets from the UCI machine learning repository [3], we show that SENF performs significantly better than, or at least similarly to, the individual baselines as well as the existing state-of-the-art label noise detection method.
How to analyze, interpret, and make use of marine big data has become a serious challenge for the geophysical community. A statistical technique, vertical EOF (VEOF), is introduced to extract vertical pattern of the ocean...
Geological disaster recognition in optical images is one of the key techniques in disaster control and disaster relief. Compared with optical images, remote sensing images offer much higher resolution and more visualized content. In this paper, we propose a landslide recognition framework that trains a deep auto-encoder network in the compressed domain. An ANN or SVM is used as the classifier for decision making. In addition, to meet the requirements of some real-time applications, a high-performance training network on CUDA-enabled GPUs is designed and implemented. Experiments are conducted on optical images from Google Earth. (C) 2018 The Authors. Published by Elsevier B.V.