ISBN: 9783319831688 (print)
This book constitutes the refereed proceedings of the 12th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2016, and three parallel workshops, held in Thessaloniki, Greece, in September 2016. The workshops are the Third Workshop on New Methods and Tools for Big Data, MT4BD 2016, the 5th Mining Humanistic Data Workshop, MHDW 2016, and the First Workshop on 5G - Putting Intelligence to the Network Edge, 5G-PINE 2016. The 30 revised full papers and 8 short papers presented at the main conference were carefully reviewed and selected from 65 submissions. The 17 revised full papers and 7 short papers presented at the 3 parallel workshops were selected from 33 submissions. The papers cover a broad range of topics such as artificial neural networks, classification, clustering, control systems - robotics, data mining, engineering application of AI, environmental applications of AI, feature reduction, filtering, financial-economics modeling, fuzzy logic, genetic algorithms, hybrid systems, image and video processing, medical AI applications, multi-agent systems, ontology, optimization, pattern recognition, support vector machines, text mining, and Web-social media data AI modeling.
ISBN: 9783319654829 (digital)
ISBN: 9783319654829; 9783319654812 (print)
Processing big scale-free graphs on parallel architectures with high parallelization opportunities involves considerable overhead. Due to the skewed degree distribution, each thread receives a different amount of computational workload. In this paper we present a method that addresses this challenge by modifying the CSR data structure and redistributing work across threads. The method was implemented in breadth-first search and single-source shortest path algorithms for the GPU architecture.
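The load-imbalance problem described above can be illustrated with a minimal Python sketch. It builds a plain CSR representation and then splits the edge array, rather than the vertex array, into nearly equal chunks. This is a generic edge-balanced partitioning, not the modified CSR structure of the paper; the function names `build_csr` and `edge_balanced_chunks` and the example graph are assumptions made for illustration.

```python
from bisect import bisect_right


def build_csr(num_vertices, edges):
    """Build CSR arrays (row_ptr, col_idx) from a directed edge list."""
    row_ptr = [0] * (num_vertices + 1)
    for src, _ in edges:
        row_ptr[src + 1] += 1          # count out-degree of each vertex
    for v in range(num_vertices):
        row_ptr[v + 1] += row_ptr[v]   # prefix sum gives row offsets
    col_idx = [0] * len(edges)
    cursor = row_ptr[:-1].copy()       # next free slot per vertex
    for src, dst in edges:
        col_idx[cursor[src]] = dst
        cursor[src] += 1
    return row_ptr, col_idx


def edge_balanced_chunks(row_ptr, num_workers):
    """Split the edge range into num_workers nearly equal chunks.

    Returns (first_edge, end_edge_exclusive, first_vertex) per worker,
    where first_vertex owns first_edge. Assumes the graph has at least
    one edge per worker.
    """
    total = row_ptr[-1]
    chunks = []
    for w in range(num_workers):
        lo = total * w // num_workers
        hi = total * (w + 1) // num_workers
        # Largest vertex v with row_ptr[v] <= lo owns edge lo.
        first_vertex = bisect_right(row_ptr, lo) - 1
        chunks.append((lo, hi, first_vertex))
    return chunks


# Hypothetical skewed graph: vertex 0 is a hub with six neighbours.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (2, 3)]
row_ptr, col_idx = build_csr(7, edges)
chunks = edge_balanced_chunks(row_ptr, 4)
```

With a vertex-per-thread assignment the hub vertex would carry six of the eight edges; with the edge-based split each of the four workers handles two edges, at the cost of a binary search to find the owning vertex.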
ISBN: 9781509054541 (print)
Derivation of input sequences for distinguishing states of a finite state machine (FSM) specification is well studied in the context of FSM-based functional testing. We present a parallel multithreaded implementation of the exact algorithm using Open Multi-processing (OpenMP). Experiments are conducted to assess the performance of the parallel implementation as compared to the sequential implementation using both execution time speedup and efficiency.
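The two metrics named above follow their standard definitions: speedup is sequential time divided by parallel time, and efficiency is speedup divided by the number of threads. A minimal sketch is given below; the timings in the usage lines are hypothetical and are not taken from the paper.

```python
def speedup(t_sequential, t_parallel):
    """Execution-time speedup of the parallel run over the sequential run."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, num_threads):
    """Speedup per thread; 1.0 would mean ideal linear scaling."""
    return speedup(t_sequential, t_parallel) / num_threads


# Hypothetical timings in seconds, for illustration only.
s = speedup(12.0, 2.0)
e = efficiency(12.0, 2.0, 8)
```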
The Discrete Periodic Radon Transform (DPRT) has many important applications in reconstructing images from their projections and has recently been used in fast and scalable architectures for computing 2D convolutions. Unfortunately, the direct computation of the DPRT involves O(N^3) additions and memory accesses that can be very costly in single-core architectures. The current paper presents new and efficient algorithms for computing the DPRT and its inverse on multi-core CPUs and GPUs. The results are compared against specialized hardware implementations (FPGAs/ASICs) and provide significant evidence of the success of the new algorithms. On an 8-core CPU (Intel Xeon), with support for two threads per core, FastDirDPRT and FastDirInvDPRT achieve a speedup of approximately 10× (up to 12.83×) over the single-core CPU implementation. On a 2048-core GPU (GTX 980), FastRayDPRT and FastRayInvDPRT achieve speedups in the range of 526 (for 127 × 127) to 873 (for 1021 × 1021), which approach the ideal achievable speedups. The DPRT can be computed exactly and in real time (30 frames per second) for 1471 × 1471 images using FastRayDPRT on the GPU. Furthermore, the GPU algorithms approximate the performance of an efficient FPGA implementation using 2N parallel cores at 100 MHz.
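The O(N^3) cost of the direct computation can be seen in a minimal Python sketch. It assumes one common convention for an N × N image with N prime: projection m (for m < N) sums f(i, (d + m·i) mod N) over i, and the extra projection N holds the row sums. Index conventions differ between papers, so this is an illustration of the direct method and not the FastDirDPRT or FastRayDPRT algorithms; the function name `dprt_direct` is hypothetical.

```python
def dprt_direct(image):
    """Direct DPRT of an N x N image (N assumed prime).

    Produces N + 1 projections of N rays each; every ray needs N
    additions, hence O(N^3) additions and memory accesses overall.
    """
    n = len(image)
    projections = [[0] * n for _ in range(n + 1)]
    for m in range(n):                      # N "sloped" projections
        for d in range(n):                  # N rays per projection
            projections[m][d] = sum(
                image[i][(d + m * i) % n] for i in range(n)
            )
    for d in range(n):                      # final projection: row sums
        projections[n][d] = sum(image[d])
    return projections


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
projections = dprt_direct(image)
```

Each ray within a projection, and each projection, is independent of the others, which is the source of the parallelism that multi-core and GPU implementations can exploit. A quick sanity check is that every projection sums to the image total, because each projection visits every pixel exactly once.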
ISBN: 9781450351362 (print)
The proceedings contain 11 papers. The topics discussed include: overcoming load imbalance for irregular sparse matrices; optimizing Word2Vec performance on multicore systems; parallel depth-first search for directed acyclic graphs; progressive load balancing of asynchronous algorithms; a case for migrating execution for irregular applications; pressure-driven hardware managed thread concurrency for irregular applications; an efficient data layout transformation algorithm for locality-aware parallel sparse FFT; spherical region queries on multicore architectures; evaluation of Knights Landing high bandwidth memory for HPC workloads; enabling work-efficiency for high performance vertex-centric graph analytics on GPUs; and accelerating energy games solvers on modern architectures.
Dynamic programming techniques are well established and employed by various practical algorithms, including the edit-distance algorithm and the dynamic time warping algorithm. These algorithms usually operate in an iteration-based manner where new values are computed from values of the previous iteration. The data dependencies enforce synchronization, which limits the possibilities for internal parallel processing. In this paper, we investigate parallel approaches to processing matrix-based dynamic programming algorithms on modern multicore CPUs, Intel Xeon Phi accelerators, and general-purpose GPUs. We address both the problem of computing a single distance on large inputs and the problem of computing a number of distances of smaller inputs simultaneously (e.g., when a similarity query is being resolved). Our proposed solutions yielded significant improvements in performance and achieved a speedup of two orders of magnitude when compared to the serial baseline. (C) 2016 Elsevier Ltd. All rights reserved.
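The data dependencies mentioned above can be made concrete with a minimal Python sketch of the edit distance computed along anti-diagonals. Each cell depends only on its left, upper, and upper-left neighbours, so all cells with the same i + j are mutually independent. This wavefront ordering is a textbook approach and is not claimed to be the scheme used in the paper; the function name `edit_distance_wavefront` is hypothetical, and the code runs serially to show the ordering only.

```python
def edit_distance_wavefront(a, b):
    """Levenshtein distance, filling the DP matrix one anti-diagonal at a time."""
    n, m = len(a), len(b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i                      # delete all of a[:i]
    for j in range(m + 1):
        dist[0][j] = j                      # insert all of b[:j]
    for k in range(2, n + m + 1):           # anti-diagonal: i + j == k
        lo = max(1, k - m)
        hi = min(n, k - 1)
        # Cells on this diagonal read only diagonals k-1 and k-2, so this
        # inner loop is the part a parallel version could distribute.
        for i in range(lo, hi + 1):
            j = k - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[n][m]


d = edit_distance_wavefront("kitten", "sitting")
```

A synchronization point is still needed between consecutive diagonals, which is the limitation on internal parallelism the abstract refers to. Computing many independent distances at once sidesteps it, since each pair can be processed without any cross-pair synchronization.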
The ultrafast electron beam X-ray computed tomography (CT) measuring system of the Helmholtz-Zentrum Dresden-Rossendorf (HZDR) is primarily operated for fundamental multiphase flow investigations, e.g. in various technical devices, and for validation of enhanced flow simulation models, e.g. those developed for computational fluid dynamics (CFD) codes. The CT scanner delivers cross-sectional material distributions by contactless measurements with a spatial resolution of approximately 1 mm and a temporal resolution of up to 8 kHz. Currently, two central time-consuming processes have been identified as limiting the efficient usage of this globally unique CT technique: a) the data transfer from the detector system to central data storage (e.g. a computer or database) and b) the data processing. Thus, data pre-processing and data reconstruction algorithms have been adapted for use on multi-core central processing units (CPUs) and many-core graphics processing units (GPUs). For optimal data processing results, a high-performance PC with two high-performance graphics processing units operated in parallel, a six-core processor, a high internal data bus speed, and a large memory block has been assembled. The newly developed data processing algorithms yield a performance improvement by a factor of approximately 137 for the entire data processing sequence compared to the previous universally applicable single-core CPU-based data processing tool. (C) 2016 Elsevier Ltd. All rights reserved.
ISBN: 9783319561103 (print)
The proceedings contain 9 papers. The special focus in this conference is on Accelerating Data Analysis and Data Management Systems Using Modern Processor and Storage Architectures. The topics include: efficient range queries on modern CPUs; vectorized time series algorithms on modern commodity CPUs; compression-aware in-memory query processing; overtaking CPU DBMSes with a GPU in whole-query analytic processing with parallelism-friendly execution plan optimization; making in-memory databases fast on modern NICs; an analysis on modern hardware; locality-adaptive parallel hash joins using hardware transactional memory; an embedded in-memory DBMS enabling instant snapshot sharing and runtime fragility in main memory.
ISBN: 9783319749464 (print)
The proceedings contain 29 papers. The special focus in this conference is on Measurement, Modelling and Evaluation of Computing Systems. The topics include: Evaluating a Single-Server Queue with Asynchronous Speed Scaling; Active Queue Management Based on Congestion Policing (CP-AQM); Deficit Round Robin with Limited Deficit Savings (DRR-LDS) for Fairness Among TCP Users; Modeling the Performance of ARQ Error Control in an LTE Transmission System; Catching Corner Cases in Network Calculus – Flow Segregation Can Improve Accuracy; QoE Analysis of the Setup of Different Internet Services for FIFO Server Systems; VirtuWind – An SDN- and NFV-Based Architecture for Softwarized Industrial Networks; A Modular Environment to Test SCADA Solutions for Wind Parks; Evaluation of Single-Hop Beaconing with Congestion Control in IEEE WAVE and ETSI ITS-G5; Markov Automata on Discount!; Practical QoE Evaluation of Adaptive Video Streaming; A Domain-Specific Language and Toolchain for Performance Evaluation Based on Measurements; SLA Tool; A Tool for Generating Automata of IEC60870-5-104 Implementations; A Software Tool for the Compact Solution of the Chemical Master Equation; Logical PetriNet A Tool to Model Digital Circuit Petri Nets and Transform them into Digital Circuits; ClassCast: A Tool for Class-Based Forecasting; Collider – Parallel Experiments in Silico; FunSpec4DTMC – A Tool for Modelling Discrete-Time Markov Chains Using Functional Specification; Model-Based System Design and Evaluation of Image Processing Architectures with SimTAny Framework; Markov Analysis of Optimum Caching as an Equivalent Alternative to Belady’s Algorithm Without Look-Ahead; Intrusion Detection for Sequence-Based Attacks with Reduced Traffic Models; Performance Benchmarking of Network Function Chain Placement Algorithms.
High-quality translation is a time-consuming and expensive process. Named Entity (NE) Translation, including proper names, remains a very important task for multilingual natural language processing. Most of the gold ...