ISBN (Print): 9783662523315
The field of Intelligent Systems and Applications has expanded enormously during the last two decades. Theoretical and practical results in this area are growing rapidly, driven by many successful applications and by new theories derived from diverse problems. This book is dedicated to Intelligent Systems and Applications in their many different aspects; in particular, it provides highlights of current research in the field. It consists of research papers on the following topics:
- Authentication, Identification, and Signature
- Intrusion Detection
- Steganography, Data Hiding, and Watermarking
- Database, System, and Communication Security
- Computer Vision, Object Tracking, and Pattern Recognition
- Image Processing, Medical Image Processing, and Video Coding
- Digital Content, Digital Life, and Human-Computer Interaction
- Parallel, Peer-to-Peer, Distributed, and Cloud Computing
- Software Engineering and Programming Languages
The book serves as a reference to theoretical problems as well as practical solutions and state-of-the-art results on the topics above. Both the academic community (graduate students, postdocs, and faculty in Electrical Engineering, Computer Science, and Applied Mathematics) and the industrial community (engineers, engineering managers, programmers, research lab staff and managers, and security managers) will find this book interesting.
ISBN (Print): 9781509036837
Data-driven science is becoming increasingly common and complex, and is placing tremendous stress on visualization and analysis frameworks. Data sources producing 10 GB per second (and more) are becoming increasingly commonplace in the simulation, sensor, and experimental sciences. These data sources, which are often distributed around the world, must be analyzed by teams of scientists that are likewise distributed. Enabling scientists to view, query, and interact with such large volumes of data in near-real time requires a rich fusion of visualization and analysis techniques, middleware, and workflow systems. This paper discusses initial research into visualization and analysis of distributed data workflows that enables scientists to make near-real-time decisions about large volumes of time-varying data.
ISBN (Print): 9781509036837
The continuous evolution of digital services is resulting in the generation of extremely large data sets that are created in almost real time. Exploring new opportunities for improving the quality of these digital services, and providing better-personalized experiences to digital users, are two major challenges to be addressed. Different methods, tools, and techniques exist today to generate actionable insights from digital-services data. Traditionally, big data problems are handled on historical data sets. However, there is a growing demand for real-time data analytics to offer new services to users and to provide proactive customer care, personalized ads, and emergency aid, to give a few examples. Despite the fact that a few frameworks for real-time analytics exist, utilizing them to solve distributed real-time big data analytical problems still remains a challenge: existing real-time data analytics (RTDA) frameworks do not cover all the features required for distributed computation in real time. Therefore, in this paper, we present a qualitative overview and analysis of some of the most widely used existing RTDA frameworks. Specifically, Apache Spark, Apache Flink, Apache Storm, and Apache Samza are covered and discussed.
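As a concrete flavor of the programming model these frameworks expose, the sketch below uses Apache Spark's Structured Streaming API (PySpark) to maintain running word counts over a stream. The socket source, host, and port are illustrative assumptions, not details from the paper.

```python
# Minimal PySpark Structured Streaming sketch: running word counts over a
# socket stream. Source type, host, and port are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split

spark = SparkSession.builder.appName("rtda-sketch").getOrCreate()

# Unbounded DataFrame: one row per line received on localhost:9999
lines = (spark.readStream.format("socket")
         .option("host", "localhost")
         .option("port", 9999)
         .load())

# Tokenize and maintain a running count per word
words = lines.select(explode(split(lines.value, " ")).alias("word"))
counts = words.groupBy("word").count()

# Emit the full updated result table to the console on every trigger
query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```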
ISBN (Print): 9781509021413
Modeling multi-way data can be accomplished using tensors, which are data structures indexed along three or more dimensions. Tensors are increasingly used to analyze extremely large and sparse multi-way datasets in life sciences, engineering, and business. The canonical polyadic decomposition (CPD) is a popular tensor factorization for discovering latent features and is most commonly found via the method of alternating least squares (CPD-ALS). The computational time and memory required to compute CPD limits the size and dimensionality of the tensors that can be solved on a typical workstation, making distributed solution approaches the only viable option. Most methods for distributed-memory systems have focused on distributing the tensor in a coarse-grained, one-dimensional fashion that prohibitively requires the dense matrix factors to be fully replicated on each node. Recent work overcomes this limitation by using a fine-grained decomposition of the tensor nonzeros, at the cost of computationally expensive hypergraph partitioning. To that end, we present a medium-grained decomposition that avoids complete factor replication and communication, while eliminating the need for expensive pre-processing steps. We use a hybrid MPI+OpenMP implementation that exploits multi-core architectures with a low memory footprint. We theoretically analyze the scalability of the coarse-, medium-, and fine-grained decompositions and experimentally compare them across a variety of datasets. Experiments show that the medium-grained decomposition reduces communication volume by 36-90% compared to the coarse-grained decomposition, is 41-76x faster than a state-of-the-art MPI code, and is 1.5-5.0x faster than the fine-grained decomposition with 1024 cores.
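The alternating least squares procedure behind CPD-ALS is compact enough to sketch for a small dense third-order tensor. The NumPy/SciPy code below is a minimal serial illustration of the algorithm the abstract names, not the paper's distributed, sparse, medium-grained implementation; the MTTKRP (the `unfold(X, mode) @ kr` product) is exactly the step whose communication the decompositions above aim to reduce.

```python
# Minimal dense CP-ALS sketch; illustrative only, not the paper's code.
import numpy as np
from scipy.linalg import khatri_rao

def unfold(X, mode):
    """Mode-n matricization of a 3-way tensor."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])        # Khatri-Rao product
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # MTTKRP followed by a solve of the R x R normal equations
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
    return factors

# Usage: recover rank-4 structure from a synthetic low-rank tensor
A, B, C = (np.random.rand(d, 4) for d in (30, 40, 50))
X = np.einsum('ir,jr,kr->ijk', A, B, C)
factors = cp_als(X, rank=4)
```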
ISBN (Print): 9781509036837
The rise of big data systems has created a need for benchmarks to measure and compare the capabilities of these systems. Big data benchmarks present unique scalability challenges. The supercomputing community has wrestled with these challenges for decades and developed methodologies for creating rigorous scalable benchmarks (e.g., HPC Challenge). The proposed PageRank pipeline benchmark employs supercomputing benchmarking methodologies to create a scalable benchmark that is reflective of many real-world big data processing systems. The PageRank pipeline benchmark builds on prior scalable benchmarks (Graph500, Sort, and PageRank) to create a holistic benchmark with multiple integrated kernels that can be run together or independently. Each kernel is well defined mathematically and can be implemented in any programming environment. The linear algebraic nature of PageRank makes it well suited to being implemented using the GraphBLAS standard. The computations are simple enough that performance predictions can be made based on simple computing hardware models. The surrounding kernels provide the context for each kernel that allows rigorous definition of both the input and the output for each kernel. Furthermore, since the proposed PageRank pipeline benchmark is scalable in both problem size and hardware, it can be used to measure and quantitatively compare a wide range of present-day and future systems. Serial implementations in C++, Python, Python with Pandas, Matlab, Octave, and Julia have been written, and their single-threaded performance has been measured.
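To illustrate how simple the core arithmetic is, the sketch below is a generic PageRank power iteration in NumPy; it is not the benchmark's reference implementation, and the damping factor and dense adjacency representation are conventional assumptions.

```python
# Generic PageRank power-iteration sketch (not the benchmark's reference
# code). adj is a dense 0/1 adjacency matrix; d is the damping factor.
import numpy as np

def pagerank(adj, d=0.85, n_iter=100):
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-stochastic transition matrix; dangling rows jump uniformly
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1)[:, None],
                 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r = (1 - d) / n + d * (P.T @ r)   # rank mass flows along edges
    return r

# Tiny 4-vertex example graph
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 0],
                [0, 0, 1, 0]])
print(pagerank(adj))
```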
ISBN (Print): 9781450340397
Developing scalable real-time systems that can simultaneously process massive amounts of noisy multi-sensory data, while being energy efficient, is a dominant challenge in the new era of cognitive computing. Low-power, flexible neurosynaptic architectures offer tremendous promise in this area. To this end, we developed TrueNorth, a 65mW brain-inspired processor that implements a non-von Neumann, parallel, distributed, event-driven, modular, scalable, defect-tolerant architecture. With 4096 neurosynaptic cores, the TrueNorth chip contains 1 million digital neurons and 256 million synapses tightly interconnected by an event-driven routing infrastructure. The fully digital 5.4 billion transistor implementation leverages existing CMOS scaling trends, while ensuring one-to-one correspondence between hardware and software. Given that the TrueNorth architecture breaks path with prevailing architectures, conventional tool flows could not be used for the design. Therefore, we developed a novel design methodology that includes mixed asynchronous-synchronous circuits, interfaces, and a complete tool flow for building an event-driven, low-power neurosynaptic chip. Further, we have adapted existing VLSI CAD placement tools for mapping logical neural networks to the physical core locations on the TrueNorth chip to reduce the network's communication energy. The TrueNorth chip's low power consumption is ideal for use not only in large-scale computationally intensive applications, but also for embedded battery-powered mobile applications. The chip is fully configurable in terms of connectivity and neural parameters to allow custom configurations for a wide range of cognitive and sensory perception applications. We have successfully demonstrated the use of TrueNorth chips in multiple applications, including visual object recognition, with higher performance and orders of magnitude lower power than the same algorithms run on von Neumann architectures.
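To give a flavor of the event-driven computation such neurosynaptic cores perform, the sketch below simulates a generic leaky integrate-and-fire update; the neuron model, parameters, and core size are illustrative assumptions, not TrueNorth's actual design.

```python
# Generic event-driven leaky integrate-and-fire sketch; model, parameters,
# and core size are illustrative, not TrueNorth's actual neuron equations.
import numpy as np

N, LEAK, THRESH = 256, 1, 64                 # per-"core" sizes (assumed)
rng = np.random.default_rng(0)
weights = rng.integers(-2, 3, size=(N, N))   # synaptic crossbar (assumed)
potential = np.zeros(N, dtype=np.int64)

def tick(spike_events):
    """Advance one time step given the indices of axons that spiked."""
    global potential
    for axon in spike_events:                # event-driven: only touched rows
        potential = potential + weights[axon]
    potential = np.maximum(potential - LEAK, 0)   # constant leak, floor at 0
    fired = np.flatnonzero(potential >= THRESH)
    potential[fired] = 0                     # reset membrane after a spike
    return fired                             # spike events routed onward

print(tick([3, 17, 42]))
```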
Applying scientific workflows to perform in-silico experiments is an increasingly prevalent practice among scientific communities. Because of the data- and compute-intensive behavior of scientific workflows, parallel and distributed systems (grids, clusters, clouds, and supercomputers) are required to execute them. However, the complexity of these infrastructures and their continuously changing environment significantly encumber, or even prevent, the repeatability and reproducibility that is often needed for sharing results or judging scientific claims. The data and parameters necessary for re-execution can originate from different sources (infrastructural, third-party, or related to the binaries), which may change or become unavailable over the years. Our ultimate goal is to compensate for the lack of the original parameters by replacing, evaluating, or simulating the values of the parameters in question. To create these methods, we determined the levels of re-execution and defined a descriptor space that collects all the parameters needed for reproducibility. These procedures incur some extra cost, and this average reproducibility cost can be computed or at least estimated. In this paper we give a method to evaluate the average cost of making a workflow reproducible when exact computation is not possible.
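The abstract does not spell out the cost model. One plausible reading, sketched below under loudly labeled assumptions, is an expected cost summed over descriptor-space parameters, weighting each parameter's compensation cost by the probability that it becomes unavailable.

```python
# Hypothetical sketch of an average reproducibility cost estimate; the
# descriptor fields, probabilities, and costs are invented for illustration
# and are not the paper's actual model.
def average_reproducibility_cost(descriptors):
    """descriptors: list of (p_unavailable, cost_to_compensate) pairs."""
    return sum(p * cost for p, cost in descriptors)

# Example descriptor space: infrastructure, third-party source, binary
descriptors = [
    (0.30, 5.0),   # infrastructural parameter: cheap to re-evaluate
    (0.10, 40.0),  # third-party data set: costly to replace
    (0.05, 12.0),  # binary dependency: moderate cost to simulate
]
print(average_reproducibility_cost(descriptors))
```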
ISBN (Print): 9781509036837
High-Level Synthesis (HLS) tools have been developed to raise the abstraction level of the hardware design process by using models such as high-level programming languages (e.g., C/C++), domain-specific languages, and graphs. However, despite their advances in the last decade, the available HLS tools still require broad hardware knowledge from the designer, which prevents a larger reduction in design time. In this work, we propose a method to be used on top of current high-level synthesis tools, allowing for a speed-up in the development process. The method starts with the application described in a subset of the ANSI C language. We then generate a graph from the C source code. There is a finite number of possible node types, which allows the creation of a database of alternative hardware models for each node type. A simple optimization algorithm selects the combination of nodes that best fits the constraints (power consumption, resource use, speed). If a node is not in the database, or the constraints are not met, the designer can use any commercial high-level synthesis tool (or a direct RTL description) to create a new hardware model and include it in the database. Some test cases were implemented using both the proposed methodology and a commercial HLS tool. The results obtained indicate that the method can reduce design time while still providing fair results compared to the commercial tool.
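A minimal sketch of the selection step follows; the database contents, cost fields, and the greedy strategy are assumptions for illustration, since the abstract does not fix them.

```python
# Hypothetical sketch of per-node hardware-model selection under a resource
# budget; database contents and the greedy strategy are illustrative.
# database: node type -> list of (name, luts, latency_cycles) alternatives
database = {
    "add": [("add_fast", 120, 1), ("add_small", 40, 3)],
    "mul": [("mul_dsp", 10, 4), ("mul_lut", 600, 2)],
}

def select_models(dataflow_nodes, lut_budget):
    chosen, luts_used = {}, 0
    for node_id, node_type in dataflow_nodes:
        # Prefer the fastest alternative that still fits the budget
        for name, luts, cycles in sorted(database[node_type],
                                         key=lambda m: m[2]):
            if luts_used + luts <= lut_budget:
                chosen[node_id] = name
                luts_used += luts
                break
        else:
            raise ValueError(f"no model for {node_id} fits the budget")
    return chosen

print(select_models([("n0", "add"), ("n1", "mul")], lut_budget=200))
```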
ISBN (Print): 9781479980062
In recent years, the Hadoop Distributed File System (HDFS) has been deployed as the bedrock of many parallel big data processing systems, such as graph processing systems, MPI-based parallel programs, and Scala/Java-based Spark frameworks, which can efficiently support iterative and interactive data analysis in memory. The first part of my dissertation focuses on studying parallel data access on distributed file systems such as HDFS. Since the distributed I/O resources and the global data distribution are often not taken into consideration, the data requests from parallel processes/executors are unfortunately often served in a remote or imbalanced fashion on the storage servers. To address these problems, we develop I/O middleware systems and matching-based algorithms that map parallel data requests to storage servers such that local and balanced data access can be achieved. The last part of my dissertation presents our plans to improve the performance of interactive data access in big data analysis. Specifically, most interactive analysis programs scan through the entire data set regardless of which data is actually required. We plan to develop a content-aware method to quickly access the required data without this laborious scanning process.
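A toy version of the mapping idea is sketched below: a greedy, locality-first assignment with a per-server load cap. The dissertation itself uses matching-based algorithms; the strategy, names, and cap here are illustrative assumptions.

```python
# Toy sketch of mapping parallel read requests to replica-holding servers,
# preferring local replicas while capping per-server load; illustrative
# only, not the dissertation's matching algorithm.
from collections import Counter
import math

def assign(requests, cap=None):
    """requests: list of (process_host, [replica_hosts]) pairs."""
    cap = cap or math.ceil(len(requests) / 2)
    load = Counter()
    plan = []
    for proc_host, replicas in requests:
        # Local replica first, then the least-loaded remote replica
        candidates = sorted(replicas, key=lambda s: (s != proc_host, load[s]))
        server = next(s for s in candidates if load[s] < cap)
        load[server] += 1
        plan.append((proc_host, server))
    return plan

reqs = [("h1", ["h1", "h2"]), ("h1", ["h1", "h3"]), ("h2", ["h1", "h2"])]
print(assign(reqs, cap=2))
```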
ISBN (Print): 9781450327589
Database systems running on a cluster of machines, i.e., rack-scale databases, are a common architecture for many large databases and data appliances. As data movement across machines is often a significant bottleneck, these systems typically use a low-latency, high-throughput network such as InfiniBand. To achieve the necessary performance, parallel join algorithms must take advantage of the primitives provided by the network to speed up data transfer. In this paper we focus on implementing parallel in-memory joins using Remote Direct Memory Access (RDMA), a communication mechanism that transfers data directly into the memory of a remote machine. The results of this paper are, to our knowledge, the first detailed analysis of parallel hash joins using RDMA. To capture their behavior independently of the network characteristics, we develop an analytical model and test our implementation on two different types of networks. The experimental results show that the model is accurate and the resulting distributed join exhibits good performance.
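The partitioning logic of such a distributed hash join can be sketched without the network layer. In the sketch below, plain Python lists stand in for RDMA-written remote buffers, so it shows only the standard hash-partition and local build/probe phases, not the paper's RDMA implementation.

```python
# Sketch of a hash-partitioned equi-join; the per-node buffer lists stand
# in for RDMA-written remote memory, so only the partitioning and local
# build/probe logic is shown.
N_NODES = 4

def partition(tuples, key_idx):
    """Route each tuple to the node owning its hash partition."""
    buffers = [[] for _ in range(N_NODES)]
    for t in tuples:
        buffers[hash(t[key_idx]) % N_NODES].append(t)
    return buffers

def local_hash_join(r_part, s_part):
    """Classic build/probe hash join within one node's partition."""
    table = {}
    for r in r_part:                         # build phase on R
        table.setdefault(r[0], []).append(r)
    return [(r, s) for s in s_part           # probe phase with S
            for r in table.get(s[0], [])]

R = [(i, f"r{i}") for i in range(8)]
S = [(i % 4, f"s{i}") for i in range(8)]
R_parts, S_parts = partition(R, 0), partition(S, 0)
result = [pair for n in range(N_NODES)
          for pair in local_hash_join(R_parts[n], S_parts[n])]
print(len(result))   # 8 pairs: each S tuple matches exactly one R tuple
```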