ISBN: (print) 9789813296909; 9789813296893
Today's internet world generates data on the petabyte-to-exabyte scale: enormous volumes of unstructured datasets produced by diverse social sites, IoT, Google, Twitter, Yahoo, sensor-based monitoring of surroundings, and similar sources. This is big data (BD). Dataset volume doubles from second to second, yet there is a shortage of techniques for smooth dynamic processing, analysis, and scalability. Over the recent high-speed decade, only existing methods and common tools built for gigabyte-scale processing have been applied to computations on the world's huge data. Apache Hadoop, a free and open-source BD tool, can process zettabyte-scale datasets through its most developed and popular components, HDFS and MapReduce (MR), which provide excellent storage and reliable processing. MR is a well-known, popular framework for handling BD problems in a fully parallel, highly distributed, and scalable manner. Nevertheless, Hadoop's map and reduce tasks have many limitations, such as poor allocation of custom resources, weak stream processing, latency problems, inefficient performance, imperfect optimization, and limited support for real-time computation and diverse logic. We highlight the most modern and progressive computing procedures. This paper shows that Apache Spark and Apache Storm, among the latest and fastest tools, provide efficient frameworks for overcoming those limitations.
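As an illustration of the map and reduce tasks mentioned in this abstract, the following is a minimal single-process sketch of the MapReduce pattern in plain Python. It is not Hadoop code and uses no Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit one (word, 1) pair per word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here it is sequential.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = word_count(["big data big tools", "data tools"])
```

In a real cluster the map calls and the per-key reduce calls run on different machines, and the shuffle step moves data across the network, which is one source of the latency the abstract refers to.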
In recent years, the amount of data is growing extensively. In companies, spreadsheets are one common approach to conduct data processing and statistical analysis. However, especially when working with massive amounts...
This paper discusses the establishment, management and application of remote sensing image database. A systematical structure for remote sensing image database is presented and analyzed. This architecture is based on ...
ISBN: (print) 9783030576752; 9783030576745
Workloads with precedence constraints due to data dependencies are common in various applications. These workloads can be represented as directed acyclic graphs (DAGs), and are often data-intensive, meaning that data loading cost is the dominant factor and thus cache misses should be minimized. We address the problem of parallel scheduling of a DAG of data-intensive tasks to minimize makespan. To do so, we propose greedy online scheduling algorithms that take load balancing, data dependencies, and data locality into account. Simulations and an experimental evaluation using an Apache Spark cluster demonstrate the advantages of our solutions.
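To make the idea of locality-aware greedy scheduling concrete, here is a small sketch under stated assumptions. It is not the algorithm from the paper: the cost model (a fixed `load_cost` whenever a worker has not yet cached a task's data item), the task format, and the name `greedy_schedule` are all hypothetical. Each ready task is placed on the worker that would finish it earliest, which trades off load balance against cache hits.

```python
def greedy_schedule(tasks, num_workers, load_cost):
    """Greedy list scheduling of a DAG with a simple data-locality penalty.

    tasks: {name: (deps, data_item, run_time)}; returns (makespan, placement).
    """
    free_at = [0.0] * num_workers                 # time each worker becomes idle
    cache = [set() for _ in range(num_workers)]   # data items held by each worker
    finish = {}                                   # task -> finish time
    placement = {}                                # task -> worker index
    remaining = dict(tasks)
    while remaining:
        # A task is ready once all of its predecessors have finished.
        ready = sorted(t for t, (deps, _, _) in remaining.items()
                       if all(d in finish for d in deps))
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        for task in ready:
            deps, item, run_time = remaining.pop(task)
            earliest = max((finish[d] for d in deps), default=0.0)

            def completion(worker):
                start = max(free_at[worker], earliest)
                penalty = 0.0 if item in cache[worker] else load_cost
                return start + penalty + run_time

            best = min(range(num_workers), key=completion)
            finish[task] = completion(best)
            free_at[best] = finish[task]
            cache[best].add(item)                 # the item is now cached there
            placement[task] = best
    return max(finish.values(), default=0.0), placement


example = {
    "a": ((), "x", 1.0),
    "b": ((), "y", 1.0),
    "c": (("a",), "x", 1.0),
    "d": (("b",), "y", 1.0),
}
makespan, placement = greedy_schedule(example, num_workers=2, load_cost=2.0)
```

In this example, tasks `c` and `d` land on the workers that already hold their data, so neither pays the load cost a second time.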
ISBN: (print) 9783030483401; 9783030483395
Cloud environments can provide virtualized, elastic, controllable and high-quality on-demand infrastructure services for supporting complex distributed applications. However, existing IaaS (Infrastructure-as-a-Service) solutions mainly focus on the automated integration or deployment of generic applications; they lack flexible infrastructure planning and provisioning solutions and do not have rich support for the high service quality and trustworthiness required by social network applications. This paper introduces an automated cloud virtual infrastructure solution for social network applications, called Co-located and Orchestrated Network Fabric (CONF), which was developed in a recently funded EU H2020 project, ARTICONF. CONF aims to improve the existing infrastructure support in the DevOps lifecycle of social network applications to optimize QoS performance metrics as well as ensure fast recovery in the presence of faults or performance drops.
ISBN: (print) 9783030483401; 9783030483395
Cloud computing is widely recognized as a distributed computing paradigm for the next generation of dynamically scalable applications. Recently, a novel service model called Function-as-a-Service (FaaS) has been proposed that enables users to exploit the computational power of cloud infrastructures without the need to configure and manage complex computing systems. The FaaS paradigm represents an opportunity to easily develop and execute extreme-scale applications, as it allows fine-grained decomposition of the application with much more efficient scheduling on the cloud provider's infrastructure. We introduce FLY, a domain-specific language for designing, deploying and executing scientific computing applications by exploiting the FaaS service model on different cloud infrastructures. In this paper, we present the design and the language definition of FLY on several computing (local and FaaS) back-ends: symmetric multiprocessing (SMP), Amazon AWS Lambda, Microsoft Azure Functions, Google Cloud Functions, and IBM Bluemix/Apache OpenWhisk. We also present the first FLY source-to-source compiler, publicly available on GitHub, which supports SMP and AWS back-ends.
Big data is a commodity which is highly valued in the entire globe. It is not just regarded as data but in the world of experts, we can derive intelligence from it. Because of its characteristics which are Variet...
ISBN: (digital) 9783030518257
ISBN: (print) 9783030518257; 9783030518240
Cube and conquer is currently the most effective approach to solving hard combinatorial problems in parallel. It organizes the search in two phases. First, a look-ahead solver splits the problem into many sub-problems, called cubes, which are then solved in parallel by incremental CDCL solvers. In this tool paper, we present Paracooba, the first fully integrated and automatic distributed cube-and-conquer solver, targeting cluster and cloud computing. Previous work was limited to multi-core parallelism or relied on manual orchestration of the solving process. Our approach uses one master per problem to initialize the solving process and automatically discovers and releases compute nodes through elastic resource usage. Multiple problems can be solved in parallel on shared compute nodes, controlled by a custom peer-to-peer based load-balancing protocol. Experiments show the scalability of our approach.
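The two-phase structure of cube and conquer can be sketched in a few lines. This is a toy under strong assumptions and is unrelated to Paracooba's implementation: the cubes are simply all assignments to a hand-picked set of split variables (a real look-ahead solver chooses them heuristically), and a brute-force search stands in for the CDCL solver. Clauses are lists of non-zero integers in the usual DIMACS style; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def satisfies(clauses, assignment):
    """True if every clause has at least one literal made true."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def make_cubes(split_vars):
    """Phase 1 stand-in: one cube per assignment to the split variables."""
    return [dict(zip(split_vars, bits))
            for bits in product([False, True], repeat=len(split_vars))]


def solve_cube(clauses, variables, cube):
    """Phase 2 stand-in: brute-force search of the remaining variables."""
    free = [v for v in variables if v not in cube]
    for bits in product([False, True], repeat=len(free)):
        assignment = {**cube, **dict(zip(free, bits))}
        if satisfies(clauses, assignment):
            return assignment
    return None


def cube_and_conquer(clauses, variables, split_vars):
    cubes = make_cubes(split_vars)
    # Cubes are independent, so they can be handed to separate workers.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda c: solve_cube(clauses, variables, c), cubes)
        return next((r for r in results if r is not None), None)


model = cube_and_conquer([[1, 2], [-1, 3], [-2, -3]], [1, 2, 3], split_vars=[1])
unsat = cube_and_conquer([[1], [-1]], [1], split_vars=[1])
```

Threads are used here only to keep the sketch self-contained; in CPython they give no real speed-up for CPU-bound search, whereas a distributed solver would send each cube to a separate process or compute node.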
ISBN: (print) 9783030602482; 9783030602475
Tangle is a novel directed acyclic graph (DAG)-based distributed ledger preferred over traditional linear ledgers in blockchain applications because of better transaction throughput. Earlier techniques have mostly focused on comparing the performance of graph chains over linear chains and incorporating the Markov Chain Monte Carlo process in probabilistic traversals to detect unverified transactions in DAG chains. In this paper, we present a parallel detection method for unverified transactions. Experimental evaluation of the proposed parallel technique demonstrates a significant, scalable average speed-up of close to 70%, and a peak speed-up of approximately 73% for a large number of transactions.
ISBN: (print) 9781665421263
Payload anomaly detection can discover malicious behaviors hidden in network packets. It is hard to handle payload due to its various possible characters and complex semantic context, and thus identifying abnormal payload is also a non-trivial task. Prior art only uses the n-gram language model to extract features, which directly leads to ultra-high-dimensional feature space and also fails to capture the context semantics fully. Accordingly, this paper proposes a word embedding-based context-sensitive network flow payload anomaly detection method (termed WECAD). First, WECAD obtains the initial feature representation of the payload through the word embedding-based method. Then, we propose a corpus pruning algorithm, which applies the cosine similarity clustering and frequency distribution to prune inconsequential characters. We only keep the essential characters to reduce the calculation space. Subsequently, we propose a context learning algorithm. It employs the co-occurrence matrix transformation technology and introduces the backward step size to consider the order relationship of essential characters. Comprehensive experiments on real-world intrusion detection datasets validate the effectiveness of our method.
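The abstract names two building blocks, a co-occurrence matrix with a backward step size and cosine similarity, without giving their definitions. The sketch below shows one plausible reading and should not be taken as WECAD's actual algorithm: `backward_cooccurrence` counts, for each character, the characters that appeared up to `step` positions before it, and `cosine` compares two such sparse count vectors. Both function names and the sample payload are hypothetical.

```python
import math
from collections import defaultdict


def backward_cooccurrence(tokens, step):
    """For each token, count the tokens seen up to `step` positions before it."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        for j in range(max(0, i - step), i):
            counts[tok][tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


payload = list("GET /a GET /b")          # characters of a toy payload
matrix = backward_cooccurrence(payload, step=2)
similarity = cosine(matrix["a"], matrix["b"])
```

Because only preceding characters are counted, the vectors are order-sensitive: `a` and `b` both follow a space and a slash, so their context vectors point in the same direction and their similarity is close to 1.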