ISBN: 9781728127453 (print)
Advanced Driving Assistance Systems (ADAS) are vehicle-based intelligent safety systems that help drivers drive more safely. One important building block of ADAS is the Camera Mirror System (CMS). A CMS enhances rearview mirrors with digital information such as detected obstacles and collision avoidance alerts. Efficient implementation of a CMS requires dedicated hardware processing to ensure minimal latency and coexistence with other algorithms in heterogeneous hardware environments. In this paper we analyze the CMS algorithm data pipeline and provide insights on how each phase can be efficiently accelerated using dedicated hardware blocks on present-day high-performance microcontrollers. We give an early evaluation of the presented approach on two architectures: TI TDA2x and NVIDIA Xavier.
ISBN: 9781950737901 (print)
Most machine translation systems generate text autoregressively from left to right. We, instead, use a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation. This approach allows for efficient iterative decoding, where we first predict all of the target words non-autoregressively, and then repeatedly mask out and regenerate the subset of words that the model is least confident about. By applying this strategy for a constant number of iterations, our model improves state-of-the-art performance levels for non-autoregressive and parallel decoding translation models by over 4 BLEU on average. It is also able to reach within about 1 BLEU point of a typical left-to-right transformer model, while decoding significantly faster.
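The iterative decoding loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Predictor` signature, the `MASK` symbol, and the linearly decaying re-masking schedule are all assumptions made for the example, and the target length is taken as given.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"  # hypothetical mask symbol

# Hypothetical predictor: given the source tokens and a partially masked
# target, return a (token, confidence) pair for every target position.
Predictor = Callable[[List[str], List[str]], List[Tuple[str, float]]]


def mask_predict(source: List[str], target_len: int, predict: Predictor,
                 iterations: int = 4) -> List[str]:
    """Refine a fully masked target over a constant number of passes."""
    target = [MASK] * target_len
    confidence = [0.0] * target_len
    for t in range(iterations):
        predictions = predict(source, target)
        for i, (token, prob) in enumerate(predictions):
            # Only positions that are currently masked are regenerated.
            if target[i] == MASK:
                target[i], confidence[i] = token, prob
        # Assumed schedule: the re-masked count decays linearly to zero.
        n_mask = target_len * (iterations - t - 1) // iterations
        if n_mask == 0:
            break
        # Re-mask the positions the model is least confident about.
        worst = sorted(range(target_len), key=lambda i: confidence[i])[:n_mask]
        for i in worst:
            target[i] = MASK
    return target
```

Every pass predicts all masked positions in parallel, so the number of model calls is bounded by `iterations` rather than by the target length.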
With the technology improvements of analog-to-digital converters in terms of sampling rate and achievable resolution, direct digitization of beam signals is of growing interest in the field of beam diagnostics. The se...
ISBN: 9781728143927 (print)
The evolution of cluster computers based on multicores, many cores and GPGPU accelerators is encouraging application developers to write hybrid parallel programs. Hybrid parallel programming is quite complex, as it requires the use of multiple programming paradigms such as MPI, OpenMP, and CUDA/OpenCL to exploit the varied computational power available in a system. The paper brings out the challenges faced by application developers who wish to use heterogeneous HPC clusters. It describes a unified development environment which eases the complete development lifecycle of hybrid parallel programs on an HPC cluster. The software can provide access to multiple clusters of different architectures, owing to its modular design and web-based approach. The paper also serves as a good resource for researchers interested in gaining insight into hybrid parallel programming.
ISBN: 9781665420952 (print)
This paper investigates the problem of actuator identification of a 3-DoF Delta parallel robot, by means of linear AutoRegressive Moving Average with eXogenous input (ARMAX) and nonlinear dynamic Neural Network AutoRegressive with eXogenous input (NN-ARX) methods. To this end, the ARMAX and NN-ARX approaches are used to develop a scheme capable of identifying a model for each actuator. Based on the ARMAX structure, an accurate model of the actuation system is derived. The model is then trained and tested using data collected from a real robotic setup. Using the capabilities of a dynamic neural network, an identification and prediction scheme is designed for modeling the nonlinear dynamic behavior of the system. The NN-ARX is trained on the data collected from the system, and new trajectory data is used to validate both methods. The experimental results show that the three servo motors have different dynamic behavior, which was expected from the outset given the uncertainty in the fabrication of the motor components and gearboxes. In the identification and prediction stages, the Root Mean Square Error (RMSE) is used to validate and analyze the performance of each method on the validation data from new trajectories. In predicting the output of the system, NN-ARX performed better than ARMAX, with an RMSE of 0.001441 compared to 0.0886. Owing to their high accuracy, the obtained models can be used in the design of motion controllers and in modelling disturbances in the system.
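For reference, the RMSE used to compare the two methods is the square root of the mean squared difference between measured and predicted outputs. The helper below is a minimal sketch; the function name and the plain-list inputs are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared_error / len(measured))
```

A lower value means the model output tracks the measured trajectory more closely, which is the sense in which NN-ARX outperforms ARMAX above.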
ISBN: 9781728184852 (digital)
ISBN: 9781728184869 (print)
Designing subject-independent Brain-Computer Interfaces remains an open question for developing systems that can capture the inter-subject intrinsic brain features and classify them with reasonable accuracy. This paper presents the application of state-of-the-art deep transfer learning architectures to classifying ERP signals. We report 66.87%, 67.64%, 65.58%, and 71.93% test classification accuracy for DenseNet121, DenseNet201, Xception, and VGG-16 models, respectively. The experimental results demonstrate the viability of our approach in subject-independent ERP signal classification and suggest that models with fewer layers perform better in classifying ERP signals.
ISBN: 9781728116013 (print)
Processing-in-memory (PIM) provides massive parallelism with high energy efficiency and has become a promising solution to the "memory wall" problem. Recently, the emerging metal-oxide resistive random access memory (RRAM) has shown its potential for designing a PIM architecture. Several stateful logic operations, e.g., NOR and NAND, can be executed in parallel in an RRAM crossbar. Although previous works have designed some algorithms using stateful logic, it is still an open question how to fully exploit its potential high parallelism and design an asymptotically fast algorithm for a given function. In this work, we theoretically analyze the parallelism in an RRAM crossbar and design several asymptotically optimal arithmetic algorithms. In detail, we first propose the Single Instruction Multiple Lines (SIML) model to unify the stateful logic families and prove three lower bounds on the time complexity of a parallel RRAM algorithm. Then, we design three algorithms for integer addition functions with the stateful logic, guided by the lower bound analysis. All of them reach the time complexity lower bound. Finally, we make two extensions of the integer addition algorithms, supporting multiplication functions by decomposing them into additions and supporting the flex-point data type by proposing an exponent and mantissa update flow. Experimental evaluation shows that our integer algorithms achieve a speedup of up to 13.79x over the previous RRAM algorithms. Our flex-point implementation achieves a 26.60x speedup and saves 73.68% energy compared to an ARM.
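To make concrete why a single stateful primitive such as NOR suffices for integer addition, the sketch below builds a full adder from NOR alone and chains it into a ripple-carry adder. It is a sequential software illustration under assumed names (`nor`, `full_adder`, `add`); it does not model crossbar parallelism and is not one of the asymptotically optimal algorithms proposed in the paper.

```python
from typing import Tuple


def nor(a: int, b: int) -> int:
    """The only primitive used; inputs and output are single bits."""
    return 1 - (a | b)


# Every other gate is derived from NOR.
def not_(a: int) -> int:
    return nor(a, a)


def or_(a: int, b: int) -> int:
    return not_(nor(a, b))


def and_(a: int, b: int) -> int:
    return nor(not_(a), not_(b))


def xor_(a: int, b: int) -> int:
    return and_(or_(a, b), not_(and_(a, b)))


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """Return (sum_bit, carry_out) for three input bits."""
    partial = xor_(a, b)
    return xor_(partial, carry_in), or_(and_(a, b), and_(partial, carry_in))


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry addition of two unsigned ints, modulo 2**width."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result
```

The ripple-carry chain takes a number of steps linear in the bit width, which is the kind of sequential cost that a parallel formulation aims to reduce.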
ISBN: 9781950737901 (print)
Traditionally, most data-to-text applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. By contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between. This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. Both architectures were implemented using encoder-decoder Gated Recurrent Units (GRU) and the Transformer, two state-of-the-art deep learning methods. Automatic and human evaluations, together with a qualitative analysis, suggest that having explicit intermediate steps in the generation process results in better texts than those generated by end-to-end approaches. Moreover, the pipeline models generalize better to unseen inputs. Data and code are publicly available.
ISBN: 9781728116440 (print)
Software applications for biological network analysis rely on graphs to model the structure interactions. Many of them require searching for subgraphs in a target graph or in collections of graphs. Even though very efficient algorithms have been defined to solve this subgraph isomorphism problem, the complexity of current real biological networks makes their sequential execution time prohibitive. On the other hand, parallel architectures, from multi-core to many-core, have become pervasive for dealing with the problem of data size. Nevertheless, the sequential nature of graph searching algorithms makes their implementation on parallel architectures very challenging. This paper presents three different parallel solutions for the graph searching problem. The first two target exact search on multi-core CPUs and many-core GPUs, respectively. The third targets approximate search on GPUs and handles node, edge, and node label mismatches. The paper shows how different techniques have been developed in all the solutions to reduce the search space complexity, and reports the performance of the proposed solutions on representative biological networks containing antiviral chemical compounds and protein interaction networks.
ISBN: 9781950737901 (print)
We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like Grammatical Error Correction (GEC). Recent approaches are based on the popular encoder-decoder (ED) model for sequence-to-sequence learning. The ED model auto-regressively captures full dependency among output tokens but is slow due to sequential decoding. The PIE model does parallel decoding, giving up the advantage of modelling full dependency in the output, yet it achieves accuracy competitive with the ED model for four reasons: 1. predicting edits instead of tokens, 2. labeling sequences instead of generating sequences, 3. iteratively refining predictions to capture dependencies, and 4. factorizing logits over edits and their token argument to harness pretrained language models like BERT. Experiments on tasks spanning GEC, OCR correction and spell correction demonstrate that the PIE model is an accurate and significantly faster alternative for local sequence transduction.
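The idea of labeling each input token with an edit, rather than generating the output sequence token by token, can be illustrated with a small sketch. The edit vocabulary (`COPY`, `DELETE`, `REPLACE`, `APPEND`) and the `apply_edits` helper are assumptions chosen for this example; the abstract does not specify the paper's exact edit set.

```python
from typing import List, Tuple

# Hypothetical edit label: (operation, argument); argument is "" when unused.
Edit = Tuple[str, str]


def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply one edit label per input token and return the edited sequence."""
    if len(tokens) != len(edits):
        raise ValueError("expected exactly one edit per token")
    output: List[str] = []
    for token, (operation, argument) in zip(tokens, edits):
        if operation == "COPY":
            output.append(token)
        elif operation == "DELETE":
            continue
        elif operation == "REPLACE":
            output.append(argument)
        elif operation == "APPEND":
            # Keep the token and insert the argument right after it.
            output.extend([token, argument])
        else:
            raise ValueError(f"unknown edit operation: {operation}")
    return output
```

Because each label depends only on its own position, all labels can be predicted in one parallel pass, and the result can be fed back in for further refinement passes.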