ISBN: 9783642131356 (print)
A real-time emotional architecture (RTEA) for building parallel robotic applications is presented. RTEA allows the application developer to focus on the design and implementation of the agent processes, because the architecture itself autonomously decides how much attention to pay to each of these processes. From the functional point of view, an RTEA selects and adapts its objectives depending on its physical (actuators) and mental (processing) capabilities. This characteristic makes the architecture a useful solution for applications that have to deal with several simultaneous tasks, that have real-time constraints, and where the objectives are defined in a flexible way. From the viewpoint of the design and development of applications, RTEA defines its different entities as independent modules. This modularity makes it easier for the programmer to develop each part of the project. To control the processing capacity of the agent and to guarantee that the temporal constraints of the processes are met, RTEA has been implemented in a real-time kernel (RT-Linux). Mobile robot experiments have been carried out to show how the emotional system influences the mental organisation of the robot when it performs navigational tasks under different environmental conditions.
ISBN: 9783642131219 (print)
We investigate the problem of achieving higher instruction-level parallelism in string matching algorithms. In particular, starting from an algorithm based on bit-parallelism, we propose two flexible approaches for boosting it with a higher level of parallelism. These approaches are general enough to be applied to other bit-parallel algorithms. It turns out that higher levels of parallelism lead to more efficient solutions in practical cases, as demonstrated by extensive experimentation.
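For readers unfamiliar with bit-parallelism, the following is a minimal sketch of the classic Shift-And matcher, a standard bit-parallel string matching algorithm. It is offered only as background: the abstract does not say which bit-parallel algorithm the authors start from, and the function name `shift_and` is an illustrative choice, not taken from the paper.

```python
def shift_and(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0              # bit i set: pattern[0..i] matches a suffix of the text read so far
    hits = []
    for j, ch in enumerate(text):
        # One shift, one OR and one AND advance all partial matches at once.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


print(shift_and("aba", "ababa"))
```

Each text character is processed with a constant number of word operations, which is what makes this family of algorithms a natural starting point for adding further parallelism.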
Developing parallel or distributed applications is a hard task and it requires advanced algorithms, realistic modeling, efficient design tools, high-level programming abstractions, high-performance implementations, an...
ISBN: 9783642131189 (print)
Co-clustering has been extensively used in varied applications because of its potential to discover latent local patterns that are not apparent to conventional unsupervised algorithms such as k-means. Recently, a unified view of co-clustering algorithms, called Bregman co-clustering (BCC), has provided a general framework that contains several existing co-clustering algorithms, so we expect this framework to be applied to more varied data types. However, the amount of data collected from real-life application domains easily grows too big to fit in the main memory of a single-processor machine. Accordingly, enhancing the scalability of BCC can be a critical challenge in practice. To address this and eventually enhance its potential for rapid deployment to wider applications with larger data, we parallelize all twelve co-clustering algorithms in the BCC framework using the Message Passing Interface (MPI). In addition, we validate their scalability on eleven synthetic datasets as well as one real-life dataset, where we demonstrate their speedup performance under varied parameter settings.
OpenMP is a widely used parallel programming model on traditional multi-core processors. Generally, OpenMP is used to develop fine-grained parallelism through a multithread model. Stream programming model is a new kin...
ISBN: 9783642159787 (print)
An overview and experimental comparative study of parallel algorithms for asynchronous cellular automata simulation is presented. The algorithms are tested on a model of the physicochemical process of the surface CO + O2 reaction over supported Pd nanoparticles on different parallel computers. For testing we use shared memory computers, distributed memory computers (i.e. clusters), and a graphics processing unit. A characterization of these algorithms with respect to their methods of parallelism maintenance is given.
ISBN: 9783642152764 (print)
Empirical search is an emerging strategy used in systems like ATLAS, FFTW and SPIRAL to find the parameter values of the implementation that deliver near-optimal performance for a particular machine. However, this approach has only proven successful for scientific kernels or serial symbolic sorting. Even commercial libraries like Intel MKL or IBM ESSL do not include parallel versions of sorting routines. In this paper we study empirical search in the generation of parallel sorting routines for multi-core systems. Parallel sorting presents new challenges, because the relative performance of the algorithms depends not only on the characteristics of the architectures and input data, but also on the data partitioning schemes and thread interactions. We have studied parallel sorting algorithms including quick sort, cache-conscious radix sort, multi-way merge sort, sample sort and quick-radix sort, and have built a sorting library using empirical search and an artificial neural network. Our results show that this sorting library could generate the best parallel sorting algorithms for different input sets on both x86 and SPARC multi-core architectures, with a peak speedup of 2.2x and 3.9x, respectively.
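The core idea of empirical search can be illustrated with a toy sketch: time each candidate routine on a sample input on the target machine and keep the fastest. Everything named here (`pick_fastest`, `insertion_sort`, `CANDIDATES`) is hypothetical and serial; the library described in the abstract searches over parallel sorting algorithms and also uses a neural network, neither of which is modelled.

```python
import time
from typing import Callable, Dict, List, Sequence

SortFn = Callable[[List[int]], List[int]]


def insertion_sort(values: List[int]) -> List[int]:
    """Simple quadratic sort, used only as a second candidate."""
    out = list(values)
    for i in range(1, len(out)):
        key = out[i]
        j = i - 1
        while j >= 0 and out[j] > key:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = key
    return out


def pick_fastest(candidates: Dict[str, SortFn], sample: Sequence[int], repeats: int = 3) -> str:
    """Time every candidate on the sample and return the name of the fastest."""
    expected = sorted(sample)
    best_name, best_time = "", float("inf")
    for name, sort_fn in candidates.items():
        fastest_run = float("inf")
        for _ in range(repeats):
            data = list(sample)  # fresh copy so runs do not affect each other
            start = time.perf_counter()
            result = sort_fn(data)
            fastest_run = min(fastest_run, time.perf_counter() - start)
            if result != expected:
                raise ValueError(f"candidate {name!r} returned a wrong result")
        if fastest_run < best_time:
            best_name, best_time = name, fastest_run
    return best_name


# Hypothetical candidate set; a real library would register tuned parallel routines.
CANDIDATES: Dict[str, SortFn] = {"builtin": sorted, "insertion": insertion_sort}
print(pick_fastest(CANDIDATES, [5, 3, 1, 4, 2] * 200))
```

The winner depends on the machine and the input, which is the reason the selection is made by measurement and not fixed in advance.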
ISBN: 9783642131189 (print)
The Smith-Waterman algorithm is a classic dynamic programming algorithm for the problem of biological sequence alignment. However, with the rapid growth in the number of DNA and protein sequences, the originally sequential algorithm is very time consuming, because the same computing task is performed repeatedly on large-scale data. Today's GPU (graphics processing unit) consists of hundreds of processors, so it has more computing capacity than current multicore CPUs. As the programmability of GPUs has continuously improved, using them for general-purpose computing is becoming very popular. In order to accelerate sequence alignment, previous researchers used the parallelism of the anti-diagonals of the similarity matrix to parallelize the Smith-Waterman algorithm on the GPU. In this paper, we design a new parallel algorithm which exploits the parallelism of the columns of the similarity matrix to parallelize the Smith-Waterman algorithm on a heterogeneous system based on a CPU and a GPU. The experimental results show that our new parallel algorithm is more efficient than previous ones: it takes full advantage of the features of both the CPU and the GPU and obtains approximately 37 times speedup compared with the sequential algorithm named OSEARCH implemented on an Intel dual-core E2140 processor.
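As background, the sequential recurrence that these parallel schemes start from can be sketched as follows. This is a plain reference version with a linear gap penalty; the scoring values and the function name `smith_waterman_score` are assumptions chosen for illustration, and the sketch does not reproduce the column-based CPU and GPU algorithm of the paper.

```python
def smith_waterman_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap    # gap in b
            left = h[i][j - 1] + gap  # gap in a
            h[i][j] = max(0, diag, up, left)
            best = max(best, h[i][j])
    return best


print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Each cell depends only on its upper, left and upper-left neighbours, so all cells on one anti-diagonal are mutually independent; that is the property the earlier anti-diagonal schemes mentioned above rely on.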
ISBN: 9783642131189 (print)
Parallel database technology has already shown its efficiency in supporting high-performance Online Analytical Processing (OLAP) applications. This scenario implies achieving query optimization over relational Data Warehouses (RDW), on top of which typical OLAP functionalities, such as roll-up, drill-down and aggregate query answering, can be implemented. As a result, there is an emerging need for a comprehensive methodology able to support the design of RDW over parallel and distributed environments in all phases, including data partitioning, fragment allocation, and data replication. Existing design approaches have an important limitation: the fragmentation and allocation phases are performed in isolation. In order to overcome this limitation, in this paper we propose a new methodology for designing parallel RDW over distributed environments for query optimization purposes. The methodology is illustrated on database clusters, as a notable case of distributed environments. Contrary to state-of-the-art approaches, where allocation is performed after fragmentation, in our approach we propose allocating fragments during the partitioning phase itself. Also, a naive replication algorithm that takes into account the heterogeneous characteristics of our reference architecture is proposed.
ISBN: 9783642131189 (print)
SIMD architectures are ubiquitous in general-purpose and embedded processors to achieve future multimedia performance goals. However, limited by on-chip resources and off-chip memory bandwidth, current SIMD extensions only work on short sets of SIMD elements. This leads to large parallelization overhead, such as loop handling and address generation, for small loops in multimedia applications. This paper presents the SIMD-Vector (SV) architecture to enhance the exploitation of SIMD parallelism. It attempts to gain the benefits of both SIMD instructions and more traditional vector instructions, which work on numerous values. Several instructions are extended to allow the programmer to work on large vectors of data, and those large vectors are executed on smaller SIMD hardware by a loop controller. To avoid enlarging the register file to hold much longer vectors, we introduce a technique in which long vector references are performed on only one SIMD register over many iterations. We provide a detailed description of the SV architecture and a comparison with traditional vector architecture. We also present a quantitative analysis of the decrease in dynamic instruction count and the performance improvement of the SV architecture.
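The idea of running one long-vector operation on narrower SIMD hardware can be pictured in software as strip-mining: the long vector is processed in chunks of the hardware width. The sketch below is only an analogy under that assumption; `long_vector_add` and `simd_width` are hypothetical names, and the abstract does not describe how the SV loop controller is actually implemented.

```python
from typing import List, Sequence


def long_vector_add(x: Sequence[float], y: Sequence[float], simd_width: int = 4) -> List[float]:
    """Add two long vectors by iterating over chunks of simd_width elements."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if simd_width < 1:
        raise ValueError("simd_width must be positive")
    out: List[float] = []
    # Each iteration stands in for one pass of the narrow SIMD unit;
    # the final chunk may be shorter than simd_width.
    for start in range(0, len(x), simd_width):
        xs = x[start:start + simd_width]
        ys = y[start:start + simd_width]
        out.extend(a + b for a, b in zip(xs, ys))
    return out


print(long_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], simd_width=2))
```

In this picture the programmer issues a single long-vector operation, and the chunking loop, which would otherwise appear as explicit loop handling and address generation in the program, is handled once by the controller.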