ISBN: 9781728116389; 9781728116372 (print)
This paper proposes and assesses a Big Data platform for the effective storage and analysis of On Board Unit (OBU) data related to the mobility of trucks in Belgium. The large volume and streaming nature of the OBU data require a big data platform for efficient collection, storage and analysis. The solution relies on (i) the Hadoop Distributed File System (HDFS) to store data, (ii) the Apache Parquet format for data compression and columnar storage, and (iii) Spark for parallel and streaming processing of data. Data replication, compression and columnar storage ensure robustness to node failure, data distribution, and faster access to data.
ISBN: 9781538637906 (print)
Cloud computing is one of the most popular technologies today because of its wide utility and the benefits it brings to IT companies all over the world. However, faced with users' increasing requests for computing services, cloud providers are encouraged to deploy large data centers, which consume very large amounts of energy and contribute to high operational costs. Among the effects, carbon dioxide emissions are growing each day due to the huge amount of power consumed. Energy efficiency is thus an important issue in cloud computing, mainly because of the electrical power required to run these systems and to cool them. Therefore, energy consumption has become a major concern for the widespread deployment of Cloud data centers. The growing importance of parallel applications in the Cloud introduces significant challenges in reducing the energy consumption of hosted servers. This paper addresses the problem of placing independent applications on the physical servers (hosts) of a Cloud infrastructure. We propose a novel heuristic to allocate applications so that total energy consumption is reduced. Our proposal respects various constraints, e.g. machine availability, capability and the duplication of applications. Experiments are presented to validate the potential of our approach.
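As a rough illustration of the placement problem described in this abstract, the sketch below uses a generic first-fit-decreasing greedy rule that prefers hosts that are already powered on. It is not the heuristic proposed in the paper, which the abstract does not detail. The `Host` fields, the capacity units and the idle-power figures are hypothetical, and the application-duplication constraint is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: float      # hypothetical capacity units (e.g. CPU cores)
    idle_power: float    # hypothetical watts drawn when the host is switched on
    apps: list = field(default_factory=list)
    used: float = 0.0


def place(apps, hosts):
    """Greedy first-fit-decreasing placement (illustrative assumption only).

    apps: dict mapping application name -> resource demand.
    Returns the hosts that end up hosting at least one application.
    """
    # Place the most demanding applications first.
    for name, demand in sorted(apps.items(), key=lambda kv: -kv[1]):
        # Prefer hosts that are already active, then hosts with lower idle power.
        candidates = sorted(hosts, key=lambda h: (not h.apps, h.idle_power))
        for host in candidates:
            if host.used + demand <= host.capacity:
                host.apps.append(name)
                host.used += demand
                break
        else:
            raise ValueError(f"no host can fit application {name!r}")
    return [h for h in hosts if h.apps]


hosts = [Host("h1", 5, 100.0), Host("h2", 5, 80.0), Host("h3", 5, 120.0)]
active = place({"a": 4, "b": 3, "c": 2, "d": 1}, hosts)
```

Packing applications onto as few active hosts as possible leaves the remaining hosts free to be switched off, which is one common way to reduce total energy draw.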
ISBN: 9781450355490 (print)
Volumetric DDoS attacks continue to inflict serious damage. Many proposed defenses for mitigating such attacks assume that a monitoring system has already detected the attack. However, many proposed DDoS monitoring systems do not focus on efficiently analyzing high volume network traffic to provide important characterizations of the attack in real-time to downstream traffic filtering systems. We propose a scalable real-time framework for an effective volumetric DDoS monitoring system that leverages modern big data technologies for streaming analytics of high volume network traffic to accurately detect and characterize attacks.
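To make the idea of volumetric monitoring concrete, here is a minimal sketch that sums traffic per destination over fixed time windows and flags windows that exceed a byte threshold. It is a simplified batch illustration rather than the framework described in the abstract. The packet tuple layout, the window length and the threshold are all assumptions.

```python
from collections import defaultdict


def detect_volumetric(packets, window_s=1.0, threshold_bytes=1_000_000):
    """Flag (window_index, dst_ip) pairs whose traffic volume exceeds a threshold.

    packets: iterable of (timestamp_seconds, dst_ip, size_bytes) tuples
    (a hypothetical record layout chosen for this example).
    """
    volume = defaultdict(int)
    for ts, dst, size in packets:
        # Assign each packet to a fixed (tumbling) window by its timestamp.
        volume[(int(ts // window_s), dst)] += size
    return {key for key, total in volume.items() if total > threshold_bytes}


packets = [
    (0.1, "10.0.0.1", 600_000),
    (0.5, "10.0.0.1", 600_000),
    (0.7, "10.0.0.2", 100),
    (1.2, "10.0.0.1", 500),
]
alerts = detect_volumetric(packets)
```

In a streaming engine the same per-key windowed aggregation would typically run continuously and in parallel across partitions of the traffic.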
ISBN: 9781538619933 (print)
We present here the results of our investigation of a transactional model of parallel programming on cluster computing systems. This model is specifically targeted at graph applications, with the goal of harnessing the unstructured parallelism inherently present in many such problems. In this model, tasks for vertex-centric computations are executed optimistically in parallel as serializable transactions. A key-value based globally shared object store is implemented in the main memory of the cluster nodes for storing the graph data. Task computations read and modify data in the distributed global store, without any explicitly programmed message-passing in the application code. Based on this model, we developed a framework for parallel programming of graph applications on computing clusters. We present here the programming abstractions provided by this framework and its architecture. Using several graph problems, we illustrate the simplicity of the abstractions provided by this model. These problems include graph coloring, k-nearest neighbors, and single-source shortest path computation. We also illustrate how incremental computations can be supported by this programming model. Using these problems, we evaluate the transactional programming model and the mechanisms provided by this framework.
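The optimistic, vertex-centric execution described above can be sketched for graph coloring as follows. This is a single-process simulation under stated assumptions, not the framework's actual API: the versioned dictionary stands in for the shared key-value store, each round simulates tasks running in parallel against one snapshot, and a task commits only if none of the versions it read has changed.

```python
def color_graph(adj):
    """Color an undirected graph with simulated optimistic transactions.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns a dict mapping each vertex to a color (a small integer).
    """
    # Hypothetical stand-in for the shared store: vertex -> (color, version).
    store = {v: (None, 0) for v in adj}
    pending = list(adj)
    while pending:
        # Phase 1: all pending tasks execute against the same snapshot.
        snapshot = dict(store)
        proposals = []
        for v in pending:
            read_set = {u: snapshot[u][1] for u in adj[v]}
            taken = {snapshot[u][0] for u in adj[v]}
            # A vertex has fewer than len(adj) neighbours, so a free color exists.
            color = next(c for c in range(len(adj)) if c not in taken)
            proposals.append((v, color, read_set))
        # Phase 2: validate and commit in order; stale readers abort and retry.
        retry = []
        for v, color, read_set in proposals:
            if all(store[u][1] == version for u, version in read_set.items()):
                store[v] = (color, store[v][1] + 1)
            else:
                retry.append(v)
        pending = retry
    return {v: color for v, (color, _) in store.items()}


graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {2}}
colors = color_graph(graph)
```

The first task validated in each round always commits, so the loop makes progress, and a task that read a neighbour's stale state is simply re-executed in the next round.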
ISBN: 9781450355490 (print)
A multitenant Storm cluster runs multiple stream processing applications and uses the default Isolation Scheduler to schedule them. The Isolation Scheduler assigns resources to topologies based on a static resource configuration and does not provide any means of prioritizing topologies according to their varying business requirements. Thus, performance degradation, and even complete starvation, of high-priority topologies is possible when the cluster is resource constrained and comprises an inadequate number of resources. Two priority-based resource scheduling techniques are proposed to overcome these problems. A performance analysis based on prototyping and measurements demonstrates the effectiveness of the proposed techniques.
ISBN: 9781450351959 (print)
Distributed computing systems cover a broad range of computing infrastructures, which are heterogeneous, inter-connected and architected around stack-based deployments. Failure occurrences within such tightly-coupled systems, while expected, do not lend themselves easily to predictive modeling due to the complex interactions between interconnected service layers. This work examines service level instabilities occurring within data centers participating in High Energy Physics (HEP) scientific research. We present a stability measure, based on which a failure event selection process is deployed to detect periods of instability within individual data centers. Experts recognize that understanding the conditions for failure is crucial when designing recovery procedures. For distributed computing systems, risk and failure analysis facilitates the implementation of measures for service availability, subsystem recovery and network redundancy.
ISBN: 9781538627617 (print)
Transcoding an SHVC video is time-consuming, especially at high-definition resolutions. In this paper, we propose a distributed SHVC video transcoding system that can speed up the transcoding of SHVC videos. By employing distributed computing technologies, the proposed architecture can reduce the transcoding time of SHVC videos without modifying existing SHVC encoding algorithms, and can satisfy the video requirements of high-definition television and high-resolution mobile devices.
Using multilevel concatenation, long block codes can be constructed from shorter component codes, resulting in much less decoding complexity. The component codes can also be constructed from multilevel concatenati...
ISBN: 9781450355490 (print)
Transactional frequent subgraph mining identifies frequent structural patterns in a collection of graphs. This research problem has wide applicability and increasingly requires higher scalability than single-machine solutions can offer to address the needs of Big Data use cases. We introduce DIMSpan, an advanced approach to frequent subgraph mining that utilizes the features provided by distributed in-memory dataflow systems such as Apache Flink or Apache Spark. It determines the complete set of frequent subgraphs from arbitrary string-labeled directed multigraphs as they occur in social, business and knowledge networks. DIMSpan is optimized for runtime and minimal network traffic while remaining memory-aware. An extensive performance evaluation on large graph collections shows the scalability of DIMSpan and the effectiveness of its optimization techniques.
Skeletal parallelism offers a good trade-off between programming productivity and execution efficiency. In this style of parallelism, an application is a composition of algorithmic skeletons. An algorithmic skeleton c...