ISBN:
(Print) 9781509007684
Frequent items in high-speed streaming data are important to many applications such as network monitoring and anomaly detection. To cope with the high arrival rate of streaming data, it is desirable that such systems support high processing throughput with tight guarantees on errors. In this paper, we address the problem of finding frequent and top-k items, and present a parallel version of the Space Saving algorithm in the context of an open-source distributed computing system. Based on theoretical analysis, the errors of our algorithm are strictly bounded, and our parallel design achieves high throughput. Taking advantage of distributed computing resources, our evaluation reveals that the design delivers linear speedup with remarkable scalability.
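The abstract above parallelizes the Space Saving algorithm; as background, a minimal sketch of the standard sequential version (counter capacity k and the example stream are illustrative, not from the paper):

```python
# Sequential Space Saving: track approximate counts of frequent items
# using at most k counters; evicted items' counts bound the error.

def space_saving(stream, k):
    """Return item -> estimated count, using at most k counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Evict the item with the minimum count; the newcomer
            # inherits that count + 1, which bounds the overestimation.
            victim = min(counters, key=counters.get)
            min_count = counters.pop(victim)
            counters[item] = min_count + 1
    return counters

stream = ["a", "b", "a", "c", "a", "b", "d", "a"]
print(space_saving(stream, k=3))
```

Each estimated count overestimates the true frequency by at most the minimum counter value, which is the error bound a parallel design must preserve when merging per-worker summaries.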
ISBN:
(Print) 9783319470993; 9783319470986
In the field of parallel computing, the late leader Ken Kennedy raised a concern in the early 1990s: "Is Parallel Computing Dead?" Now, we have witnessed the tremendous momentum of the "second spring" of parallel computing in recent years. But what lesson should we learn from the history of parallel computing as we walk out of the field's low point? To this end, this paper examines the disappointing state of work on parallel Turing machine models over the past 50 years of parallel computing research. The lack of a solid yet intuitive parallel Turing machine model will continue to be a serious challenge. Our paper presents an attempt to address this challenge by proposing a parallel Turing machine model, the PTM model. We also discuss why we start our work from a parallel Turing machine model rather than other choices.
ISBN:
(Print) 9783319470993; 9783319470986
Software-defined networking (SDN) is an emerging network architecture that has drawn the attention of academia and industry in recent years. Affected by investment protection, risk control, and other factors, the full deployment of SDN will not be finished in the short term; the result is a coexistence of traditional IP networks and SDN, known as hybrid SDN. In this paper, we formulate the SDN controller's load-balancing optimization problem as a mathematical model. We then propose a routing algorithm, Dijkstra-Repeat, for SDN nodes, which can offer disjoint multipath routing. To make it computationally feasible for large-scale networks, we develop a new Fast Fully Polynomial Time Approximation Scheme (FPTAS) based on Lazy Routing Update (LRU).
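The abstract does not specify Dijkstra-Repeat's internals; one plausible reading (an assumption, not the paper's definition) is to run Dijkstra repeatedly, removing each found path's edges to obtain edge-disjoint paths. A sketch under that assumption, with an illustrative graph:

```python
import heapq

def dijkstra(graph, src, dst):
    """Shortest path by total weight; graph: node -> {neighbor: weight}."""
    dist, prev, pq = {src: 0}, {}, [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

def disjoint_paths(graph, src, dst, k):
    """Hypothetical 'Dijkstra-Repeat': collect up to k edge-disjoint
    paths by deleting each shortest path's edges and re-running."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}
    paths = []
    for _ in range(k):
        p = dijkstra(g, src, dst)
        if p is None:
            break
        paths.append(p)
        for a, b in zip(p, p[1:]):
            g[a].pop(b, None)  # remove the used (directed) edge
    return paths

g = {"A": {"B": 1, "C": 1}, "B": {"D": 1}, "C": {"D": 2}, "D": {}}
print(disjoint_paths(g, "A", "D", 3))
```

Splitting a flow across such disjoint paths is one way a controller can balance load across a hybrid topology.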
Recent advances in deep learning have shown that Binary Neural Networks (BNNs) can provide satisfying accuracy on various image datasets with a significant reduction in computation and memory cost. With both weights and activations binarized to +1 or -1 in BNNs, the high-precision multiply-and-accumulate (MAC) operations can be replaced by XNOR and bit-counting operations. In this work, we present two computing-in-memory (CIM) architectures with parallelized weighted-sum operation for accelerating BNN inference: 1) parallel XNOR-SRAM, where a customized 8T-SRAM cell is used as a synapse; 2) parallel XNOR-RRAM, where a customized bit-cell consisting of 2T2R cells is used as a synapse. For large-scale weight matrices in neural networks, array partitioning is necessary, with multi-level sense amplifiers (MLSAs) employed as the intermediate interface for accumulating partial weighted sums. We explore various design options with different sub-array sizes and sensing bit-levels. Simulation results with a 65nm CMOS PDK and RRAM models show that the system with 128x128 sub-array size and 3-bit MLSA can achieve 87.46% accuracy for a VGG-inspired network on the CIFAR-10 dataset, less than 1% degradation compared to the ideal software accuracy. The estimated energy efficiency of XNOR-SRAM and XNOR-RRAM shows ~30X improvement compared to the corresponding conventional SRAM and RRAM architectures with sequential row-by-row read-out.
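The XNOR/bit-counting substitution the abstract mentions can be verified in software: for two {+1, -1} vectors encoded as bit vectors (+1 as bit 1, -1 as bit 0), the dot product equals 2·popcount(XNOR(w, x)) − n. A minimal sketch with illustrative vectors:

```python
# Dot product of binarized vectors via XNOR + popcount, the identity
# that lets BNN hardware replace MAC units with bitwise logic.

def pack(vec):
    """Encode +1 -> bit 1, -1 -> bit 0 (LSB = first element)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(w_bits, x_bits, n):
    """Dot product of two {+1,-1} vectors packed as n-bit integers."""
    mask = (1 << n) - 1
    xnor = ~(w_bits ^ x_bits) & mask   # bit is 1 where signs agree
    matches = bin(xnor).count("1")     # popcount = number of agreements
    return 2 * matches - n             # agreements minus disagreements

w = [1, -1, 1, 1, -1]
x = [1, 1, -1, 1, -1]
assert binary_dot(pack(w), pack(x), len(w)) == sum(a * b for a, b in zip(w, x))
```

In the CIM arrays described above, this popcount is performed in the analog domain along a bitline, with the MLSAs quantizing the partial sums from each sub-array.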
ISBN:
(Print) 9781509028603
Full-batch update and mini-batch update are the two most widely used algorithms in back-propagation (BP) neural networks for dealing with the huge training time and computation cost of the learning process. Parallel computing can improve computational efficiency, and both algorithms have previously been implemented on the MapReduce framework. In this paper, we implement these two algorithms on the Spark framework and evaluate their performance through extensive experiments. We verify that Spark outperforms MapReduce in implementing the full-batch update algorithm, owing to its innovative design philosophy and lazy evaluation mechanisms. In addition, the mini-batch update algorithm costs less training time than the full-batch update, and we also observe that a moderate batch size k performs better, although a precise analysis remains open. Our work provides novel insight into the performance of these two algorithms on the Spark framework.
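The two update schemes differ only in how much data feeds each gradient step. A minimal sketch on a one-parameter linear model (not the paper's Spark implementation; data, learning rate, and epoch count are illustrative):

```python
# Full-batch vs mini-batch gradient descent for y ~ w*x with squared
# error; k = len(data) gives full-batch, smaller k gives mini-batch.

def grad(w, batch):
    """Mean gradient of (w*x - y)^2 over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train(data, k, lr=0.05, epochs=50):
    """One weight update per batch; full-batch does one per epoch,
    mini-batch does len(data)/k noisier updates per epoch."""
    w = 0.0
    for _ in range(epochs):
        for i in range(0, len(data), k):
            w -= lr * grad(w, data[i:i + k])
    return w

data = [(x, 3.0 * x) for x in [1, 2, 3, 4]]   # true weight is 3
w_full = train(data, k=len(data))
w_mini = train(data, k=2)
assert abs(w_full - 3.0) < 0.1 and abs(w_mini - 3.0) < 0.1
```

The trade-off the paper measures follows from this structure: mini-batch performs more updates per pass over the data, at the cost of more synchronization rounds on a cluster.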
ISBN:
(Print) 9783901882852
Web pages and web-based services are becoming more and more complex. The average page size for the Alexa top 1000 websites in 2016 reached 2.1 MB, and fetching a page requires requests for 128 different objects. Although bandwidth has been increasing exponentially in the last few years, the web experience is not improving at the same pace because of latency issues in HTTP/1. The HTTP/2 protocol aims to solve these issues by allowing clients and servers to multiplex HTTP requests and responses on a single TCP connection. If HTTP/2 is widely adopted, it can have enormous benefits not only for the user experience but also for servers and the network. Since clients do not have to open multiple parallel connections to avoid the head-of-line blocking problem of HTTP/1.1, the number of concurrent TCP sessions can be significantly reduced. However, although multiplexing is one of the main features of HTTP/2, nothing actually prevents a client from opening multiple HTTP/2 connections to a server. In this paper we investigate the behavior of HTTP/2 traffic in the wild. We perform experiments to examine whether web browsers use a single connection per domain over HTTP/2 in practice. Contrary to popular belief, our experiments on the traffic of a large university campus network and a residential network show that a significant number of HTTP/2 accesses are performed using parallel connections to a single domain on a server. We present two possible hypotheses for this behavior and discuss its implications for the future of the web.
ISBN:
(Digital) 9783319449449
ISBN:
(Print) 9783319449432
This book constitutes the refereed proceedings of the 12th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2016, and three parallel workshops, held in Thessaloniki, Greece, in September 2016. The workshops are the Third Workshop on New Methods and Tools for Big Data, MT4BD 2016; the 5th Mining Humanistic Data Workshop, MHDW 2016; and the First Workshop on 5G - Putting Intelligence to the Network Edge, 5G-PINE 2016. The 30 revised full papers and 8 short papers presented at the main conference were carefully reviewed and selected from 65 submissions. The 17 revised full papers and 7 short papers presented at the 3 parallel workshops were selected from 33 submissions. The papers cover a broad range of topics such as artificial neural networks, classification, clustering, control systems - robotics, data mining, engineering applications of AI, environmental applications of AI, feature reduction, filtering, financial-economic modeling, fuzzy logic, genetic algorithms, hybrid systems, image and video processing, medical AI applications, multi-agent systems, ontology, optimization, pattern recognition, support vector machines, text mining, and Web/social-media data AI modeling.
ISBN:
(Print) 9783319470993; 9783319470986
Statistical analysis of aggregated records is widely used in various domains such as market research, sociological investigation, and network analysis. Stratified sampling (SS), which samples the population divided into distinct groups separately, is preferred in practice for its high effectiveness and accuracy. In this paper, we propose a scalable and efficient algorithm named DSS for SS over large datasets. DSS executes all the sampling operations in parallel by calculating the exact subsample size for each partition according to the data distribution. We implement DSS on Spark, a big-data processing system, and show through large-scale experiments that it achieves lower data-transmission cost and higher efficiency than state-of-the-art methods while maintaining high sample representativeness.
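As background for the per-partition size calculation DSS parallelizes, a minimal sketch of sequential stratified sampling with proportional allocation (the strata, sample size, and largest-remainder rounding here are illustrative, not the paper's exact scheme):

```python
import random

def stratified_sample(strata, n, seed=42):
    """Sample n records overall, allocating to each stratum in
    proportion to its size, rounding by largest remainder so the
    per-stratum sizes sum exactly to n."""
    rng = random.Random(seed)
    total = sum(len(s) for s in strata.values())
    quotas = {k: n * len(s) / total for k, s in strata.items()}
    sizes = {k: int(q) for k, q in quotas.items()}
    leftover = n - sum(sizes.values())
    # Give the remaining slots to the strata with the largest remainders.
    for k in sorted(quotas, key=lambda k: quotas[k] - sizes[k],
                    reverse=True)[:leftover]:
        sizes[k] += 1
    return {k: rng.sample(strata[k], sizes[k]) for k in strata}

population = {"north": list(range(60)),
              "south": list(range(30)),
              "west": list(range(10))}
print({k: len(v) for k, v in stratified_sample(population, 10).items()})
```

Computing each stratum's exact quota up front is what lets every partition sample independently, with no cross-partition coordination during the sampling pass.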
ISBN:
(Print) 9781509043354
Nowadays, cryptography is one of the most common security mechanisms. Cryptography algorithms are used to secure data transmission over unsecured networks. Critical applications require techniques that encrypt/decrypt big data in an acceptable time, because the data to be encrypted/decrypted are of variable and usually large size. In this paper, to address these requirements, the counter-mode cryptography (CTR) algorithm with a Data Encryption Standard (DES) core is parallelized using a Graphics Processing Unit (GPU). As a secondary part of our work, this parallel CTR algorithm is applied to a special network-on-chip (NoC) architecture designed with the Heracles toolkit. The results of numerical comparison show that the GPU-based implementation achieves better runtime than the CPU-based one. Furthermore, our final implementations show that parallel CTR-mode cryptography achieves better runtime using the special NoC implemented on an FPGA board than the GPU- and CPU-based versions.
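CTR mode parallelizes well because each keystream block depends only on the key, nonce, and counter, never on the previous ciphertext block. A minimal sketch of that structure, with a hash-based stand-in for the block function (NOT real DES; the key, nonce, and 8-byte block size are illustrative):

```python
import hashlib

BLOCK = 8  # bytes, matching DES's 64-bit block size

def block_cipher(key, counter_block):
    """Toy keyed pseudorandom block standing in for the DES core."""
    return hashlib.sha256(key + counter_block).digest()[:BLOCK]

def ctr_crypt(key, nonce, data):
    """Encrypt (or decrypt -- CTR is its own inverse) under CTR mode.
    Each loop iteration is independent, so a GPU can run one thread
    per block with no inter-block communication."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        counter_block = nonce + (i // BLOCK).to_bytes(4, "big")
        keystream = block_cipher(key, counter_block)
        out.extend(b ^ k for b, k in zip(data[i:i + BLOCK], keystream))
    return bytes(out)

key, nonce = b"secret-key", b"nonce"
msg = b"counter mode parallelizes well"
ct = ctr_crypt(key, nonce, msg)
assert ctr_crypt(key, nonce, ct) == msg  # same operation decrypts
```

Chaining modes such as CBC lack this property on encryption, which is why the paper picks CTR for its GPU and NoC implementations.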
ISBN:
(Print) 9781509051540
This paper focuses on using big data technology to solve the operational reliability evaluation problem in power distribution systems. Operational reliability evaluation means the capability to forecast future reliability from the current system state. This has become a challenging problem as the power distribution network has grown ever more complex. Meanwhile, safety and reliability are considered more and more critical for the network. Therefore, a new approach is needed to evaluate the operational state of the power distribution system. For this purpose, this paper proposes an operational reliability evaluation method based on the big data of the distribution network and big data processing technology. Based on the operational reliability index system of the distribution network, this paper first analyzes the main influencing factors of each reliability index using a parallel association-rule mining method. Then, using these factors as input variables, an evaluation model is established based on an artificial neural network. Based on the evaluation model and real-time data, the operational reliability can be obtained.