ISBN: 9798331506476 (print)
Integer linear programming (ILP) is an important mathematical approach for solving time-sensitive real-life optimization problems, including network routing, map routing, and traffic scheduling. However, the algorithms for solving ILPs are typically sparse and branch-intensive, and are not CPU/GPU friendly. In the paper 'What could a million cores do to solve Integer programs', Koch et al. [40] presented data illustrating that ILP applications take tens of hours of execution time even on the largest parallel computers. Long execution time is a problem because many real-life applications need a decision in seconds or minutes. Widely used ILP solvers such as Gurobi (optimized for CPUs) perform software-based optimizations to handle the inherent sparsity in ILPs, but still do not meet decision thresholds because of the limited throughput of CPUs. GPUs are suited to large dot-product computations; however, GPU-based ILP solvers also fail to meet decision thresholds because (i) GPUs are not sparsity friendly and (ii) GPUs incur thread divergence on branching, resulting in under-utilization of streaming engines and periodic host-GPU interaction. We propose SPARK, a sparsity-aware, reuse-aware, energy-efficient, reconfigurable, near-cache ILP architecture that (i) reconfigures the existing L1 cache present in CPUs to perform near-cache acceleration, integrating easily into the baseline CPU pipeline with minimal area overhead (∼1.4% of a CPU), (ii) performs near-cache sparsity detection and sparsity-aware compute, reducing the number of insignificant computations and the energy overhead of data movement, (iii) leverages the computational patterns present in algorithms used for solving ILPs to realize a reuse-aware architecture, and (iv) is applicable to solving both sparse and dense ILPs and LPs (linear programs). We observe 15×/20× and 152×/740× performance/energy improvements over AMD's Zen3 CPU and Nvidia's Tesla V100 GPU, respectively, for sparse real-life ILPs.
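As a rough software illustration of why sparsity-aware compute reduces insignificant computations, the sketch below evaluates one constraint row of an ILP against a candidate solution, first multiplying every coefficient and then skipping the zeros. It is not SPARK's near-cache hardware mechanism, which the abstract does not detail; the function names `dense_dot` and `sparse_dot` and the sample data are assumptions made for this example.

```python
def dense_dot(row, x):
    """Multiply every coefficient, zeros included; return (value, multiply count)."""
    total = 0
    for a, v in zip(row, x):
        total += a * v
    return total, len(row)


def sparse_dot(row, x):
    """Detect and skip zero coefficients; return (value, multiply count)."""
    total = 0
    mults = 0
    for a, v in zip(row, x):
        if a != 0:  # sparsity check: a zero coefficient contributes nothing
            total += a * v
            mults += 1
    return total, mults


# Hypothetical constraint row (mostly zeros) and candidate integer solution.
row = [0, 3, 0, 0, -2, 0, 0, 1]
x = [4, 1, 7, 2, 5, 9, 6, 3]

print(dense_dot(row, x))   # same value, 8 multiplications
print(sparse_dot(row, x))  # same value, 3 multiplications
```

Both functions return the same value; only the number of multiplications differs, and the gap grows as the row becomes sparser.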
ISBN: 9781467391177 (print)
The proceedings contain 40 papers. The topics discussed include: efficient heuristic for placing monitors on flow networks; efficient implementation of genetic algorithms on GP-GPU with scheduled persistent CUDA threads; exploiting pure superword level parallelism for array indirections; modeling binary oriented software buffer overflow vulnerability in process algebra; OpenISMA: an approach of achieving a scalable OpenFlow network by identifiers separating and mapping; an efficient tolerant-anisotropic localization for large-scale wireless sensor network; distributed processing of approximate range queries in wireless sensor networks; vector localization algorithm based on signal strength in wireless sensor network; and parallel and improvement of pre-computation technique for approximation shortest distance query.
ISBN: 9781538694039 (print)
The proceedings contain 39 papers. The topics discussed include: face inpainting with dilated skip architecture and multi-scale adversarial networks; plaintext checkable encryption with check delegation: new models and simple constructions; state space model predictive control based on nuclear norm system identification; time series forecasting using sequence-to-sequence deep learning framework; gesture recognition based on tri-axis accelerometer using 1D Gabor filters; the application of clustering mining technology in e-commerce website; enhancing data availability through automatic replication in the hadoop cloud system; and mapping exceptions to high-level source code on a heterogeneous architecture.
ISBN: 9798350371024 (print)
The proceedings contain 30 papers. The topics discussed include: railway video inspection system based on ACP theory; DRGC: a dynamic redundant gradient coding method; personalized federated aggregation algorithm based on local attention mechanism; a hierarchical model loading strategy in WebBIM scene based on cloud-edge-terminal collaborative architecture; AGM: adaptive graph enhanced graph matching for planar tracking; development trends and countermeasures of China’s cloud artificial intelligence chip industry; multimodal sentiment analysis based on supervised contrastive learning and cross-modal translation under modalities missing; a fine-grained task execution scheme for terminal-edge-cloud cooperative networks; multimodal fake news detection based on multi-source heterogeneous data fusion; and FVHNet: homography matrix estimation for virtual camera.
ISBN: 9781665496391 (print)
The proceedings contain 36 papers. The topics discussed include: accelerating a lossy compression method with fine-grained parallelism on a GPU; a parallel algorithm to construct node-independent spanning trees on the line graph of locally twisted cube; FEAS: a faster event-driven accelerator supporting inhibitory spiking neural network; efficient distributed parallel aligning reads and reference genome with many repetitive subsequences using compact De Bruijn graph; efficient algorithm for home care scheduling with budget constraint; optimizations of a linear matrix solver in a composite simulation for a vector computer; a GA-based energy aware virtual machine placement algorithm for cloud data centers; joint optimization of path planning and task assignment for space robot; infectious disease dynamics model considering suspected population; on the adoption of metaheuristics for solving 0–1 knapsack problems; ternary optical computer: an overview and recent developments; and path integral Monte Carlo quantum annealing-based clustering and routes optimization of clustered UAV network.
ISBN: 9781479938445 (print)
The proceedings contain 56 papers. The topics discussed include: design and evaluation of dynamically-allocated multi-queue buffers with multiple packets for NoC routers; algorithmic aspects for bi-objective multiple-choice hardware/software partitioning; fault-tolerant distributed publish/subscribe using self-stabilization; a runtime framework for GPGPU; a sensitive and robust grid reputation system based on rating of recommenders; efficient FPGA-mapping of 1024 point FFT pipeline SDF processor; multi-parameter online identification algorithm of induction motor for hybrid electric vehicle applications; wide area power system fault detection using compressed sensing to reduce the WAN data traffic; PSO applied to optimal operation of a micro-grid with wind power; and a fuzzy multi-objective optimization method solving the output of energy storage system.
ISBN: 9780769545752 (print)
The proceedings contain 66 papers. The topics discussed include: option pricing on the GPU with backward stochastic differential equation; DHFS: a high-throughput heterogeneous file system based on mainframe for cloud storage; security-driven fault tolerant scheduling algorithm for high dependable distributed real-time system; a polynomial algorithm for the vertex disjoint min-min problem in planar graphs; improving parallel FDTD method performance using SSE instructions; communication-aware design space exploration for efficient run-time MPSoC management; energy minimization for software real-time systems with uncertain execution time; an efficient algorithm for privacy preserving maximal frequent itemsets mining; distributed network resources monitoring based on multi-agent and matrix grammar; an implementation of GPU-based parallel optimization for an extended uncertain data query algorithm; and job scheduling optimization for multi-user MapReduce clusters.
ISBN: 9781665452182 (print)
The proceedings contain 28 papers. The topics discussed include: passivity-based finite-time consensus for nonlinear fractional-order multi-agent systems; accelerating GNN inference by soft channel pruning; a multi-object detection sampling algorithm for large scenes; comparative study on data sovereignty guarantee technology; cluster-based federated learning framework for intrusion detection; graph-based multi-view partial multi-label learning; hypergraphs: concepts, applications and analysis; traffic speed prediction of road cluster with heterogeneous sampling frequency; multi-selection attention for multimodal aspect-level sentiment classification; deep just-in-time consistent comment update via source code changes; do not have enough data? an easy data augmentation for code summarization; leveraging graph to improve lexicon enhanced Chinese sequence labelling; and parallel accelerating ultra-long read alignment by vertical partitioning data.
ISBN: 9780769543123 (print)
The proceedings contain 54 papers. The topics discussed include: an efficient video program delivery algorithm in tree networks; an efficient content delivery algorithm for intermittently connected mobile ad hoc networks; one-hop neighbor transmission coverage information based distributed algorithm for connected dominating set; a novel P2P identification algorithm based on genetic algorithm and particle swarm optimization; accelerating reconfiguration for degradable mesh-connected processor arrays; a novel approach for multilevel fixed outline floorplanning; scheduling multiple multithreaded applications on asymmetric and symmetric chip multiprocessors; a hybrid fault tolerance model for reliable scheduling of critical real-time applications on grid systems; GTFTTS: a generalized tit-for-tat based corporative game for temperature-aware task scheduling in multi-core systems; and a scheduling strategy on load balancing of virtual machine resources in cloud computing environment.
ISBN: 9780769548982 (print)
The proceedings contain 44 papers. The topics discussed include: reduce data coherence cost with an area efficient double layer counting bloom filter; synchronization-aware dynamic thread scheduling for improving performance and saving energy in multi-core embedded systems; efficient and secure trust negotiation over the Internet; design a low-power scheduling mechanism for a multicore android system; energy-aware scheduling for weakly-hard real-time system with I/O device; sparse matrix-vector multiplication based on network-on-chip: on data mapping; monoecism watermarking algorithm; a new piecewise chaotic mapping and its application in image secure communication; formulistic detection of malicious fast-flux domains; task scheduling prediction algorithms for dynamic hardware/software partitioning; and triggering cascades on strongly connected directed graphs.