Data clustering is a method of placing similar data objects into groups. A clustering rule partitions a data set into groups according to the principle of maximizing intra-class similarity and minimizing inter-class similarity. Finding clusters in data, particularly high-dimensional data, is difficult when the clusters have different shapes, sizes, and densities, and when the data contain noise and outliers. This paper presents a new clustering algorithm for normalized data sets and shows that the proposed approach works efficiently when the data set is normalized. (C) 2016 The Authors. Published by Elsevier B.V.
Traditional relational database management systems are increasingly being replaced by NoSQL data stores as a result of the growing demand for big data applications. The emergence of a large number of implementations of such systems is one indicator of this trend. This paper analyses some key design characteristics of NoSQL systems and uses them to characterize the systems by their capabilities. Furthermore, it highlights the relationship between NoSQL systems and cloud infrastructures and explains the impact that each has on the other. (C) 2016 The Authors. Published by Elsevier B.V.
ISBN: 9781509042975 (print)
Computing an aggregate on big data is a common application of MapReduce. The map step groups values by some key, and the reduce step computes the aggregate for each group. Temporal data, however, cannot be grouped effectively in this way, so computing a temporal aggregate requires a new strategy. Temporal data is data annotated with time metadata, usually a temporal period representing the lifetime of the data in some time dimension. Since periods have extent, they are not directly amenable to grouping on a value-based key. The main contribution of this paper is to show how temporal aggregates can be computed using CouchDB (and, by extension, in other MapReduce systems). We introduce a new kind of timestamp, which we call a log-segmented timestamp, and we show how to use the timestamp to compute a temporal aggregate. Our technique reuses and extends existing MapReduce techniques.
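As a rough illustration of the contrast this abstract draws, the Python sketch below first runs an ordinary value-keyed group aggregate, then computes a temporal count by emitting +1/-1 change events at period boundaries and sweeping over them in time order. This is a generic sweep-based approach, not the paper's log-segmented timestamp, whose details are not given in the abstract. The helper names (`run_mapreduce`, `map_period`, `temporal_count`) and the half-open `[start, end)` period convention are assumptions, and the in-memory driver only stands in for a real MapReduce system such as CouchDB.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Tiny in-memory stand-in for a MapReduce job (illustrative only)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def map_by_key(record):
    """Ordinary grouping: the record already carries a value-based key."""
    key, value = record
    yield key, value


def map_period(record):
    """Assumed convention: a record is a half-open period [start, end).

    A period has extent, so instead of one key it emits two change events.
    """
    start, end = record
    yield start, +1   # one more record alive from `start`
    yield end, -1     # one fewer record alive from `end`


def temporal_count(periods):
    """Count of live records over each constant interval (sweep over events)."""
    deltas = run_mapreduce(periods, map_period, sum)
    times = sorted(deltas)
    result, active = [], 0
    for t, t_next in zip(times, times[1:]):
        active += deltas[t]
        if active > 0:
            result.append((t, t_next, active))
    return result


grouped = run_mapreduce([("a", 1), ("b", 2), ("a", 3)], map_by_key, sum)
counts = temporal_count([(1, 5), (3, 8), (5, 6)])
```

Here `grouped` holds one sum per key, while `counts` holds `(start, end, count)` triples. The final sweep needs the events in time order, which is the part that plain key-based grouping does not provide.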
ISBN: 9781467398237 (print)
Reliability assessment is a key step in ensuring the quality of a product. As semiconductor technology continues to evolve, the reliability test process also becomes more complicated, involving engineers and technical assistants responsible for different test tasks. In this paper, we propose a design for a comprehensive Reliability Management and Index System that integrates online test requests, database management and test data analysis. In addition, the big data collected by the system could inspire data-mining applications for new reliability data-analysis approaches.
Feel-Phy is a computerized, unmanned question answering system that is able to solve open-ended Physics problems, provide adaptive guidance, and retrieve resources relevant to user inputs. Latent Semantic Indexin...
An electroencephalography (EEG) machine is a piece of medical equipment used to diagnose seizures. The EEG signal is recorded as a graph that contains abnormal patterns such as spikes, sharp waves and also spike...
The Free Flow of Data is an emerging challenge on which the European Commission is currently working, with a legislative proposal due for the end of 2016 as part of the Digital Single Market (DSM) strategy. The proposal aims at tackling unjustified "restrictions on the free movement of data" among Member States. This paper analyses a number of challenges for trustworthy inter-cloud environments identified by ongoing EU-funded research initiatives dealing with the security, privacy and data protection issues of cloud solutions. (C) 2016 Published by Elsevier B.V.
ISBN: 9781509042975 (print)
Many application areas, such as biodiversity monitoring and speech recognition, produce several gigabytes of audio data per day; over weeks and months, a large volume of data accumulates for analysis. However, current methods and frameworks can only process a few megabytes at a time. Moreover, these methods require a lot of manual processing before they can give relevant results to biodiversity researchers. In this paper, we present a novel scalable framework called B2P2 for handling large audio files and processing them efficiently. The proposed framework can process large audio data and indicate whether there are sounds of potential interest anywhere in large audio recordings. We implemented the proposed framework using big data platforms such as Spark and HDFS. Experimental results demonstrate an increase in speed-up and performance when processing audio files as the number of nodes and the data sizes increase in a cloud computing environment.
This paper introduces the EGI Open Data Platform and the EGI DataHub, outlines their functionality and explains how they meet the requirements of EGI end users. The paper also explains how these new services can support the European Open Science Cloud and will fit into the future European Strategy Report on Research Infrastructures (ESFRI). (C) 2016 The Authors. Published by Elsevier B.V.
ISBN: 9781467398237 (print)
As is well known, earthquakes often inflict severe casualties and property losses. Because the occurrence of earthquakes cannot be reliably predicted with current technology, earthquake emergency response and rescue are an important part of protecting against and mitigating earthquake disasters. Basic geographical data play an important role in earthquake emergency work. However, it is hard to acquire spatial data for the earthquake site in a short time. Google Maps data can therefore be applied in the early stage of post-earthquake emergency work. This paper discusses the principles of Google Maps and the technologies for applying Google Maps data in earthquake emergency work. A downloading and merging algorithm is designed and implemented. Using Google Maps data and the program, we produced thematic maps in real earthquake emergency work. The results show that the methods are feasible and of practical value.
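The abstract does not describe the authors' downloading and merging algorithm, so the following Python sketch shows only the generic bookkeeping such a tool would plausibly need. It assumes the widely used Web Mercator XYZ tile scheme with 256-pixel tiles. The function names, the bounding-box argument order and the tile size are assumptions, and no download call or service URL is shown.

```python
import math

TILE_SIZE = 256          # assumed tile edge length in pixels
MAX_LAT = 85.05112878    # latitude limit of the Web Mercator projection


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile index containing a point at a zoom level."""
    n = 2 ** zoom
    lat_deg = max(min(lat_deg, MAX_LAT), -MAX_LAT)
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_covering(south, west, north, east, zoom):
    """List the tiles covering a bounding box, row by row from the top."""
    x_min, y_min = latlon_to_tile(north, west, zoom)
    x_max, y_max = latlon_to_tile(south, east, zoom)
    return [(x, y)
            for y in range(y_min, y_max + 1)
            for x in range(x_min, x_max + 1)]


def mosaic_offsets(tiles):
    """Pixel offset of each tile inside the merged image."""
    x_min = min(x for x, _ in tiles)
    y_min = min(y for _, y in tiles)
    return {(x, y): ((x - x_min) * TILE_SIZE, (y - y_min) * TILE_SIZE)
            for x, y in tiles}


tiles = tiles_covering(south=-10.0, west=-10.0, north=10.0, east=10.0, zoom=1)
offsets = mosaic_offsets(tiles)
```

Merging then amounts to pasting each downloaded tile image at its offset in a canvas sized to the tile grid. Fetching the tiles is left out deliberately, since it depends on the map service's terms and API.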