The proceedings contain 81 papers from the Eighth IEEE Symposium on Parallel and Distributed Processing. Topics discussed include: parallel logic simulation; distributed list coloring; fault-tolerant wormhole routing; efficient broadcast and multicast on multistage interconnection networks; an efficient file transmission algorithm; implementing cooperative software with high-level communication packages; a framework for modeling applications; a scalable scheduling algorithm; periodically regular chordal rings; index-shuffle graphs; optimistic parallel computation; load balancing in sparse matrix-vector multiplication; real-time sonar beamforming; a loop allocation policy; an adaptive loop scheduling algorithm; and logical disks.
ISBN: 9781728111414 (print)
The proceedings contain 161 papers. The topics discussed include: deep learning for phishing detection; a partition matching method for optimal attack path analysis; an energy and robustness adjustable optimization method of file distribution services; deriving the political affinity of Twitter users from their followers; on the usability of big (social) data; re-running large-scale parallel programs using two nodes; predicting hacker adoption on darkweb forums using sequential rule mining; an on-the-fly scheduling strategy for distributed stream processing platforms; deadlock-free adaptive routing based on the repetitive turn model for 3D network-on-chip; and radix: enabling high-throughput georeferencing for phenotype monitoring over voluminous observational data.
Transformer-based models are widely used in natural language processing tasks, and their application has been further extended to computer vision. Data security has become a crucial concern when deploying deep learning services on cloud platforms. To address these security concerns, multi-party computation (MPC) is employed to prevent data and model leakage during inference. However, the Transformer model introduces several challenges for MPC, including the time overhead of the Softmax (normalized exponential) function, the accuracy issue caused by the "dynamic range" of approximated division and exponential, and the high memory overhead when processing long sequences. To overcome these challenges, we propose MLformer, an MPC-based inference framework for transformer models in the semi-honest adversary model, built on CrypTen (Knott et al., Adv Neural Inf Process Syst 34: 4961-4973, 2021), a secure machine learning framework from the Facebook AI Research group. In this framework, we replace the softmax attention with linear attention, which has time and memory complexity linear in the input length. The modification eliminates the softmax function entirely, resulting in lower time and memory overhead. To ensure the accuracy of linear attention, we propose scaled linear attention to address the dynamic range issue caused by the MPC division, and a new approximate division function to reduce the computational time of the attention block. Furthermore, to improve the efficiency and accuracy of the MPC exponential and reciprocal, which are commonly used in transformer models, we propose a novel MPC exponential protocol and are the first to integrate the efficient reciprocal protocol of Bar-Ilan and Beaver (in Proceedings of the 8th annual ACM symposium on principles of distributed computing, pp. 201-209, 1989) into our framework. Additionally, we optimize the computation of causal linear attention, which is utilized in private in
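The substitution of linear attention for softmax attention mentioned in the abstract above can be illustrated in plaintext. The following is a minimal sketch only: it is not MLformer's scaled linear attention and involves no MPC. The feature map `phi` (elu(x) + 1) and all function names are assumptions chosen for illustration; the abstract does not state which feature map is used. The sketch shows why the cost is linear in sequence length: the key/value summaries `S` and `z` are accumulated once and reused for every query.

```python
import math

def phi(x):
    # Assumed feature map elu(x) + 1 (not taken from the abstract); keeps entries positive.
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def linear_attention(Q, K, V):
    """Non-causal linear attention: one pass over keys/values, one pass over queries."""
    d, dv = len(K[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]   # S = sum_j phi(k_j) v_j^T
    z = [0.0] * d                        # z = sum_j phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for b in range(dv):
                S[a][b] += fk[a] * v[b]
    out = []
    for q in Q:
        fq = phi(q)
        denom = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * S[a][b] for a in range(d)) / denom
                    for b in range(dv)])
    return out

def quadratic_reference(Q, K, V):
    """Same result computed pairwise, with cost quadratic in sequence length."""
    out = []
    for q in Q:
        fq = phi(q)
        w = [sum(x * y for x, y in zip(fq, phi(k))) for k in K]
        total = sum(w)
        out.append([sum(wj * v[b] for wj, v in zip(w, V)) / total
                    for b in range(len(V[0]))])
    return out
```

Both functions return the same values; only the order of summation differs, which is what removes the per-pair softmax step. One division per query remains, which is presumably where the division and reciprocal protocols discussed in the abstract come in.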
ISBN: 9781728111414 (print)
Microprocessor design space exploration is an unavoidable step in the early stages of microprocessor design. In [1], a design space exploration method based on critical path analysis is proposed. Critical path analysis on the instruction dependence graph is often used in research on the micro-architecture of the microprocessor instruction pipeline. The previous analysis method had to process the huge log file serially, so the analysis time was very long. In this paper, a parallel analysis algorithm based on multithreading is presented. By partitioning the log file into multiple blocks and using multiple threads to process them in parallel, the algorithm achieves nearly linear speedup in the number of threads.
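The partition-and-merge pattern described in this abstract can be sketched as follows. This is a hedged illustration, not the authors' algorithm: the log format (`"<instr_id> <latency>"` per line), the function names, and the per-block statistic (total and maximum latency) are all hypothetical, and the sketch does not handle dependence edges that cross block boundaries, which a real critical path analysis would need to resolve.

```python
from concurrent.futures import ThreadPoolExecutor

def split_blocks(lines, n_blocks):
    # Split the log into contiguous blocks of roughly equal size.
    size = max(1, -(-len(lines) // n_blocks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def analyze_block(block):
    # Hypothetical per-block analysis: total and worst latency in the block.
    total, worst = 0, 0
    for line in block:
        _, latency = line.split()
        latency = int(latency)
        total += latency
        worst = max(worst, latency)
    return total, worst

def analyze_log(lines, n_threads=4):
    # Process blocks concurrently, then merge the partial results.
    blocks = split_blocks(lines, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = list(pool.map(analyze_block, blocks))
    total = sum(t for t, _ in partials)
    worst = max((w for _, w in partials), default=0)
    return total, worst
```

In CPython, threads do not speed up CPU-bound work because of the global interpreter lock; swapping in `ProcessPoolExecutor` from the same module is the usual way to get real parallelism with this structure.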
ISBN: 0818676833 (print)
An efficient technique for mapping arbitrarily large Bayesian belief networks onto hypercubes with a deadlock-free implementation is presented. This technique shows that the speedup does not vary with the number of nodes in the Bayesian network and is limited by the height of the Peot-Shachter tree. This technique also shows that the overhead of implementing Bayesian networks on parallel machines such as hypercubes can be large.
ISBN: 0818676833 (print)
Recently, there has been growing interest in the simultaneous exploitation of task and data parallelism in scientific applications, and in compiler and runtime support for this combined form of parallelism. In this paper we report on the integration of task and data parallelism in an important irregular application from the VLSI computer-aided design field, namely VLSI layout verification. We report on the implementation and experimental results of our study on a SUN Sparcserver 1000 shared memory multiprocessor and a CM-5 distributed memory multiprocessor.
ISBN: 0818676833 (print)
Efficient divide and conquer algorithms can be mapped to a parallel computer using either task parallelism or data parallelism. The former involves significant data movement and the latter can lead to severe load imbalances. In this paper a new strategy, which we call Concatenated Parallelism, is proposed for the efficient parallel solution of problems resulting in divide and conquer trees. Our strategy is useful when the communication time due to data movement in distributing the subproblems is significant in comparison to the time required for subdivision.
ISBN: 9781728111414 (print)
This paper describes preliminary research on a visual system for underwater scene change detection and environment monitoring by an autonomous underwater drone. The system's sensor front-end is composed of a side-scan sonar, a set of video cameras, and a lighting module. The system contains a number of processing blocks. The first is responsible for signal filtering and conditioning. The main processing unit is based on a change detection module operating with our tensor-based scene change detection unit. Thanks to our parallel algorithm for tensor-based model construction, the system is able to find abrupt scene changes, as well as the presence of previously unseen objects which may be of interest and which are left for further monitoring. This work-in-progress report presents the system architecture, theoretical foundations, and preliminary experimental results.