ISBN: (Print) 9781728111902
Nowadays, many organizations pay attention to Big Data technologies in order to analyze data more accurately, quickly, and efficiently. Real-time Big Data analytics is challenging due to the massive volume of complex data that must be distributed for processing. Therefore, in this research, we investigate two state-of-the-art architectures: Lambda and Kappa. The Kappa architecture is simply the Lambda architecture without the batch layer. To help businesses choose the right architecture, their processing time and resource utilization need to be measured in the same environment. Experiments were carried out with data sizes of 3 MB, 30 MB, and 300 MB. The results showed that the Lambda architecture outperforms the Kappa architecture by around 9% on the accuracy test while using approximately 2.2 times more processing time. The Lambda architecture also used 10-20% more CPU and 0.5 GB more RAM than the Kappa architecture.
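As a rough illustration of the contrast described in this abstract, the following minimal sketch runs a toy word count through both architectures: Kappa as a single streaming path, Lambda as a batch layer plus a speed layer merged by a serving step. The data and function names are illustrative, not either architecture's real implementation.

```python
from collections import Counter

events = ["a b", "b c", "a c", "b b"]

# Kappa: one streaming path processes every event incrementally.
def kappa(stream):
    counts = Counter()
    for line in stream:
        counts.update(line.split())
    return counts

# Lambda: a batch layer recomputes over the full history, while a speed
# layer covers only the events that arrived after the last batch run.
def lambda_arch(history, recent):
    batch_view = Counter()
    for line in history:            # batch layer: full recomputation
        batch_view.update(line.split())
    speed_view = Counter()
    for line in recent:             # speed layer: incremental tail
        speed_view.update(line.split())
    return batch_view + speed_view  # serving layer merges both views

# Both paths yield the same answer; they differ in latency and resource cost.
assert kappa(events) == lambda_arch(events[:3], events[3:])
```

The experiment in the abstract effectively measures the cost of Lambda's duplicated computation (batch plus speed) against Kappa's single stream.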
ISBN: (Print) 9781728104492
A programmable hardware accelerator structure and an information processing method based on a new model of computation with parallel, conflict-free, ordered data and command access are proposed in this paper. The advantages of the proposed hardware accelerator structure over traditional ones are highlighted.
The aim of this paper is to present a new high-performance implementation of Marsa-LFIB4, which is an example of high-quality multiple recursive pseudorandom number generators. We propose a new algorithmic approach tha...
ISBN: (Print) 9781950737901
One way to reduce network traffic in multi-node data-parallel stochastic gradient descent is to exchange only the largest gradients. However, doing so damages the gradient and degrades the model's performance. Transformer models degrade dramatically, while the impact on RNNs is smaller. We restore gradient quality by combining the compressed global gradient with the node's locally computed uncompressed gradient. Neural machine translation experiments show that Transformer convergence is restored while RNNs converge faster. With our method, training on 4 nodes converges up to 1.5x as fast as with uncompressed gradients and scales 3.5x relative to single-node training.
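The two ingredients this abstract combines, top-k gradient sparsification and mixing the sparse global gradient back with the node's own dense gradient, can be sketched as follows. This is a simplified NumPy illustration under my own assumptions (single vector, hypothetical `alpha` mixing weight), not the paper's exact update rule.

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude entries; zero out the rest.

    This is the compression step: only these k values would be exchanged
    between nodes, cutting network traffic.
    """
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

def combined_gradient(local_dense, global_sparse, alpha=1.0):
    """Mix the compressed global gradient with the node's locally computed
    uncompressed gradient to restore gradient quality (alpha is an
    illustrative mixing weight, not from the paper)."""
    return global_sparse + alpha * local_dense
```

The key point is that the dense local gradient is free (each node already computed it), so recovering quality costs no extra communication.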
ISBN: (Print) 9781728150758
Cloud computing is a computational model that provides computing services and their required resources over the Internet, so computation is always available without the burden of maintaining large-scale hardware and software. Resource utilization has been decreasing due to the growth of parallel processing in most parallel applications. Accordingly, job scheduling, one of the fundamental issues in cloud computing, should be managed more efficiently. The accuracy of parallel job scheduling is greatly important for cloud providers in order to guarantee the quality of their service, given that optimal scheduling improves resource utilization, reduces response time, and satisfies user requirements. Most current parallel job scheduling algorithms do not use the consolidation of parallel workloads to improve scheduling performance. This paper introduces a scheduling algorithm that enriches the powerful ACFCFS algorithm. To begin with, we employ tentative runs, workload consolidation, and a two-tier virtual machine architecture. In particular, we consider deadlines for jobs in order to prevent starvation of parallel jobs and improve performance. The simulation results indicate that our algorithm considerably reduces the makespan and the maximum waiting time; therefore, it improves scheduling compared to the basic algorithm (ACFCFS). Overall, it can be employed as a strong and effective method for scheduling parallel jobs in the cloud.
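To make the deadline idea concrete, here is a minimal sketch of deadline-aware dispatch over a fixed machine pool: jobs are ordered by deadline so that jobs close to their deadline cannot starve. All names (`Job`, `schedule`) and the earliest-deadline-first policy are my own simplification; the paper's ACFCFS-based algorithm with tentative runs and consolidation is considerably richer.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    deadline: float                       # only field used for ordering
    name: str = field(compare=False)
    runtime: float = field(compare=False)

def schedule(jobs, machines=2):
    """Earliest-deadline-first dispatch onto the machine that frees up
    soonest; returns (job name, start, finish) tuples."""
    free = [0.0] * machines               # next-free time of each machine
    heapq.heapify(free)
    plan = []
    for job in sorted(jobs):              # deadline order prevents starvation
        start = heapq.heappop(free)       # soonest-available machine
        finish = start + job.runtime
        heapq.heappush(free, finish)
        plan.append((job.name, start, finish))
    return plan
```

The makespan is then simply the maximum finish time in the returned plan, which is the quantity the paper's simulations try to reduce.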
ISBN: (Print) 9781950737901
The scarcity of annotated data poses a great challenge for event detection (ED). Cross-lingual ED aims to tackle this challenge by transferring knowledge between different languages to boost performance. However, previous cross-lingual methods for ED demonstrated a heavy dependency on parallel resources, which might limit their applicability. In this paper, we propose a new method for cross-lingual ED that demonstrates a minimal dependency on parallel resources. Specifically, to construct a lexical mapping between different languages, we devise a context-dependent translation method; to treat the word-order difference problem, we propose a shared syntactic order event detector for multilingual co-training. The efficiency of our method is studied through extensive experiments on two standard datasets. Empirical results indicate that our method is effective in 1) performing cross-lingual transfer in different directions and 2) tackling the extremely annotation-poor scenario.
ISBN: (Print) 9781950737901
This paper presents BiPaR, a bilingual parallel novel-style machine reading comprehension (MRC) dataset, developed to support multilingual and cross-lingual reading comprehension. The biggest difference between BiPaR and existing reading comprehension datasets is that each triple (Passage, Question, Answer) in BiPaR is written in parallel in two languages. We collect 3,667 bilingual parallel paragraphs from Chinese and English novels, from which we construct 14,668 parallel question-answer pairs via crowdsourced workers following a strict quality-control procedure. We analyze BiPaR in depth and find that BiPaR offers good diversification in prefixes of questions, answer types, and relationships between questions and passages. We also observe that answering questions about novels requires reading comprehension skills such as coreference resolution, multi-sentence reasoning, and understanding of implicit causality. With BiPaR, we build monolingual, multilingual, and cross-lingual MRC baseline models. Even for relatively simple monolingual MRC on this dataset, experiments show that a strong BERT baseline is over 30 points behind humans in terms of both EM and F1 score, indicating that BiPaR provides a challenging testbed for monolingual, multilingual, and cross-lingual MRC on novels. The dataset is available at https://***/BiPaR/.
ISBN: (Print) 9789811394430; 9789811394423
Speech-based interactive systems, such as virtual personal assistants, inevitably use complex architectures, with a multitude of modules working in series (or, less often, in parallel) to perform a task (e.g., giving personalized movie recommendations via dialog). Adding modules for evoking and sustaining sociability with the user means that the accumulation of processing latencies through the modules results in considerable turn-taking delays. We introduce incremental speech processing into the generation pipeline of the system to overcome this challenge with only minimal changes to the system architecture, through partial underspecification that is resolved as necessary. A user study with a sociable movie recommendation agent shows that turn-taking delays are objectively diminished; furthermore, users not only rate the incremental system as more responsive, but also rate its recommendation performance as higher.
ISBN: (Print) 9781450362955
In this paper, we investigate the performance of Parallel Discrete Event Simulation (PDES) on a cluster of many-core Intel KNL processors. Specifically, we analyze the impact of different Global Virtual Time (GVT) algorithms in this environment and contribute three significant results. First, we show that it is essential to isolate the thread performing MPI communications from the task of processing simulation events; otherwise, the simulation is significantly imbalanced and performs poorly. This applies to both synchronous and asynchronous GVT algorithms. Second, we demonstrate that a synchronous GVT algorithm based on barrier synchronization is a better choice for communication-dominated models, while asynchronous GVT based on Mattern's algorithm performs better for computation-dominated scenarios. Third, we propose the Controlled Asynchronous GVT (CA-GVT) algorithm, which selectively adds synchronization to Mattern-style GVT based on simulation conditions. We demonstrate that CA-GVT outperforms both barrier and Mattern's GVT and achieves about an 8% performance improvement on mixed computation-communication models. This is a reasonable improvement for a simple modification to a GVT algorithm.
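For readers unfamiliar with GVT, the core computation behind the barrier-style variant discussed above can be sketched in a few lines: at a synchronization point, GVT is the minimum over every process's local virtual time and the timestamps of messages still in flight (which could otherwise roll a process back below the computed bound). This is a generic textbook-style illustration, not the paper's CA-GVT algorithm.

```python
def barrier_gvt(local_times, in_flight_timestamps):
    """Compute GVT at a global barrier.

    local_times: each process's current local virtual time.
    in_flight_timestamps: timestamps of messages sent but not yet received;
    these must be included, or GVT could overshoot and break rollback safety.
    """
    return min(list(local_times) + list(in_flight_timestamps))

# An in-flight message at time 6.0 holds GVT below every local clock.
assert barrier_gvt([10.0, 7.5, 12.0], [6.0]) == 6.0
```

Events with timestamps below GVT can never be rolled back, so GVT is the fossil-collection horizon; the paper's contribution is about how often and how synchronously this minimum is computed.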
ISBN: (Print) 9781950737901
We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source -> target translation. We propose three methods to increase the relation among the source, pivot, and target languages during pre-training: 1) step-wise training of a single model for different language pairs, 2) an additional adapter component to smoothly connect the pre-trained encoder and decoder, and 3) cross-lingual encoder training via autoencoding of the pivot language. Our methods greatly outperform multilingual models by up to +2.6% BLEU on the WMT 2019 French -> German and German -> Czech tasks. We show that our improvements are also valid in zero-shot/zero-resource scenarios.