ISBN: 9781424426577 (print)
This paper justifies the use of carry estimation and prediction to increase the performance of functional units built by replicating full adders, while keeping the area penalty low. Adders and multipliers are the most representative modules in this group of functional units. These design techniques allow the implementation of modules with performance improvements ranging from 20% to 50% at an area overhead of only around 5%. Such functional units are suitable for asynchronous circuits, but they could also be introduced into synchronous circuits using speculative techniques. The basic idea consists of estimating the carry out of some parts of the functional unit, allowing every part to operate independently and in parallel. These modules are then connected to build bigger ones. Simulation results show that for some applications it is possible to make predictions that are even more accurate than the bit-based estimation. Predictions also have the advantage that they can be used in multiplier designs, whereas estimators cannot. These predictions are similar to those used for branch prediction in a processor.
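The idea of letting each block of an adder start from an estimated carry-in can be modelled in software. The sketch below is only an illustration under stated assumptions: the function name, the 8-bit block size, and the specific estimator (carry-in guessed as the AND of the most significant bits of the previous block) are hypothetical choices, not the paper's design. The loop computes the true carry sequentially only to check each guess; in hardware the blocks would run in parallel and a wrong guess would trigger a correction step.

```python
def add_with_estimated_carries(a, b, width=32, block=8):
    """Software model of a block-wise adder with estimated carry-ins.

    Assumption (not from the paper): the carry into a block is estimated
    as the AND of the top bits of the previous block's operands.
    Returns (sum modulo 2**width, number of wrong estimates).
    """
    mask = (1 << block) - 1
    result = 0
    mispredictions = 0
    true_carry = 0  # actual carry into the current block
    for i in range(0, width, block):
        xa = (a >> i) & mask
        xb = (b >> i) & mask
        if i == 0:
            est = 0  # no carry enters the least significant block
        else:
            # Bit-based estimate: a carry out is certain if both MSBs are 1.
            est = ((a >> (i - 1)) & 1) & ((b >> (i - 1)) & 1)
        block_sum = xa + xb + est  # speculative block result
        if est != true_carry:
            # Wrong estimate: redo this block with the real carry.
            mispredictions += 1
            block_sum = xa + xb + true_carry
        result |= (block_sum & mask) << i
        true_carry = block_sum >> block
    return result & ((1 << width) - 1), mispredictions
```

For example, `add_with_estimated_carries(0xFF, 0x01)` needs one correction, because the carry out of the lowest block is produced by propagation rather than by both top bits being set.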
ISBN: 9781475788259 (print)
This book presents the latest worldwide results in the theory and practice of formal techniques for networked and distributed systems. The theme of the book is addressed by specialized papers in the following areas: Formal Methods in Software Development, Process Algebra, Timed Automata, Theories and Applications of Verification, Distributed Systems Testing, and Test Sequence Derivation. In addition, the last part of the book contains special contributions by leading researchers in these areas, which add breadth and give further perspective to the results. This volume contains the selected proceedings of the International Conference on Formal Techniques for Networked and Distributed Systems (FORTE 2001), which was sponsored by the International Federation for Information Processing (IFIP) and held in Cheju Island, Korea in August 2001. FORTE 2001 combines two prestigious conferences, FORTE (Formal Description Techniques for Distributed Systems and Communication Protocols) and PSTV (Protocol Specification, Testing and Verification), and has more than 20 years of history. Formal Techniques for Networked and Distributed Systems will be essential reading for researchers and engineers working in the fields of communications, test equipment R&D, and telecommunications, as well as for developers of software engineering tools.
ISBN: 9783031744426; 9783031744433 (print)
Weddle's rule is one of the best high-order numerical integration techniques. For better accuracy, more subintervals are used to compute the end result, but this increases the number of computations. Parallel and multicore computation can help offset this cost. To maximize the performance of Weddle's rule on contemporary processors, this research investigates the role of multiprocessing. Multiprocessing and parallel computing techniques can significantly enhance the efficiency, accuracy, and scalability of these computations. In theory, dividing the workload among multiple processors should reduce the computation time of the integration process. In practice, however, applications often reveal a discrepancy between expected and actual performance gains. This study analyzes the computation time as a function of the number of processors (1 to 12) and the number of subintervals used during the integration process. In addition, the machine learning technique Gradient Descent is used to predict time consumption from features such as time taken, input size, and number of processors. The model achieves 95% accuracy.
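As a rough illustration of how the workload can be divided, the sketch below implements composite Weddle's rule, (3h/10)[f0 + 5f1 + f2 + 6f3 + f4 + 5f5 + f6] per panel of six subintervals, and splits the panels into independent chunks. The function names, the chunking scheme, and the `mapper` parameter are assumptions made for this example and are not taken from the study.

```python
import math

# Weddle's rule weights for one panel of six subintervals.
WEDDLE_WEIGHTS = (1, 5, 1, 6, 1, 5, 1)


def weddle_panel(f, a, h):
    """Apply Weddle's rule to the six subintervals starting at a."""
    return 3.0 * h / 10.0 * sum(
        w * f(a + i * h) for i, w in enumerate(WEDDLE_WEIGHTS)
    )


def weddle_chunk(task):
    """Integrate a run of consecutive panels; one task per worker."""
    f, start, h, panels = task
    return sum(weddle_panel(f, start + 6 * k * h, h) for k in range(panels))


def composite_weddle(f, a, b, n, workers=1, mapper=map):
    """Composite Weddle's rule on [a, b] with n subintervals (n % 6 == 0).

    The panels are split into up to `workers` chunks that do not depend
    on each other, so `mapper` may be any map-like callable.
    """
    if n <= 0 or n % 6 != 0:
        raise ValueError("n must be a positive multiple of 6")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    h = (b - a) / n
    base, extra = divmod(n // 6, workers)
    tasks, first = [], 0
    for w in range(workers):
        count = base + (1 if w < extra else 0)
        if count:
            tasks.append((f, a + 6 * first * h, h, count))
            first += count
    return sum(mapper(weddle_chunk, tasks))
```

With the default `mapper=map` the chunks run sequentially. For a picklable integrand such as `math.sin`, the `map` method of a `multiprocessing.Pool` could be passed as `mapper` to run the chunks in separate processes. Whether that actually reduces wall-clock time depends on process start-up and communication overhead, which is the kind of discrepancy the abstract describes.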
Deep learning models for food image classification rely on vast amounts of data to effectively recognize and differentiate between various food items. However, training these models on such extensive datasets presents...
ISBN: 9781538621653 (print)
With the development of information technology in university libraries, the mass data of the university library has taken on the basic characteristics of Big Data. However, university libraries currently lack a distributed storage and computing model for massive data, lack the capacity to handle diverse data sources (structured, semi-structured, and unstructured data), and lack a simple, flexible application model for big data services. In order to solve these problems in the service innovation of university libraries in China, namely the distributed storage and computation of massive data, the distributed management of diverse data sources, and the simple and flexible application of big data services, this paper analyzes research on big data processing, the Hadoop ecosystem, and the demand for big data services in university libraries, and presents a technology framework for big data services in university libraries based on Hadoop. The framework includes a distributed storage and parallel computing model for mass data, a distributed management model for diverse data sources, and a model of diversified service applications for university libraries. The framework takes full account of how university library services are changing in the big data environment, including data storage and computation, data management, and service applications. It can solve the key technical problems of big data services in university libraries to a certain extent.
ISBN: 9783662480960; 9783662480953 (print)
When do you trust a performance model? More specifically, when can a particular model be used for a specific application? Once a stochastic model is selected, its parameters must be determined. This involves instrumentation, data collection, and finally interpretation, all of which are very time consuming. Even when done correctly, the results hold only for the conditions under which the system was characterized. For modern, dynamic stream processing systems, this is far too slow if a model-based approach to performance tuning is to be considered. This work demonstrates the use of a Support Vector Machine (SVM) to determine whether a stochastic queueing model is usable for a particular queueing station within a streaming application. When combined with methods for online service rate approximation, our SVM approach can select models while the application is executing (online). The method is tested on a variety of hardware and software platforms. The technique is shown to be highly effective for determining the applicability of M/M/1 and M/D/1 queueing models to stream processing applications.
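To make concrete what is at stake in choosing between the two models, the sketch below evaluates the standard closed-form mean waiting time in queue for M/M/1 and M/D/1 stations. It does not reproduce the SVM classifier described in the abstract; the function names and the arrival rate `lam` and service rate `mu` parameters are assumptions for this example.

```python
def utilization(lam, mu):
    """Server utilization rho = lambda / mu; must be below 1 for stability."""
    if lam < 0 or mu <= 0:
        raise ValueError("rates must satisfy lam >= 0 and mu > 0")
    rho = lam / mu
    if rho >= 1.0:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return rho


def mm1_mean_wait(lam, mu):
    """Mean time waiting in queue for M/M/1: rho / (mu - lam)."""
    rho = utilization(lam, mu)
    return rho / (mu - lam)


def md1_mean_wait(lam, mu):
    """Mean time waiting in queue for M/D/1: rho / (2 * mu * (1 - rho))."""
    rho = utilization(lam, mu)
    return rho / (2.0 * mu * (1.0 - rho))
```

At the same utilization the M/D/1 wait is half the M/M/1 wait, so applying the wrong model to a station can misestimate queueing delay by a factor of two.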
ISBN: 9781467379526 (print)
The amount of big data from high-throughput Next-Generation Sequencing (NGS) techniques presents various challenges, such as the storage, analysis, and transmission of massive datasets. One solution for storage and transmission is compression using specialized compression algorithms. The existing specialized algorithms scale poorly with increasing dataset size, and the best available solutions can take hours to compress gigabytes of data. Compression and decompression of peta-scale data sets using these techniques is prohibitively expensive in terms of time and energy. In this paper we introduce paraDSRC, a parallel implementation of the DNA Sequence Reads Compression (DSRC) application using a message passing model that reduces the compression time complexity by a factor of O(1/p) (where p is the number of processing units). Our experimental results show that paraDSRC achieves compression times that are 43% to 99% faster than DSRC and compression throughputs of up to 8.4GB/s on a moderate-size cluster. For many of the datasets used in our experiments, super-linear speedups have been registered, making the implementation strongly scalable. We also show that paraDSRC is more than 25.6x faster than comparable parallel compression algorithms.
Algorithmic skeletons are polymorphic higher-order functions representing common parallelization patterns and implemented in parallel. They can be used as the building blocks of parallel and distributed applications b...
Approximate string matching using the k-difference technique has been widely applied to many fields such as pattern recognition and computational biology. Data dependency exists in the traditional sequential algorithm...
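For reference, a common sequential formulation of k-difference matching is the dynamic-programming recurrence usually attributed to Sellers, in which each cell depends on its left, upper, and upper-left neighbours. The sketch below is a minimal version of that textbook recurrence, not the algorithm of this particular paper; the function name and the choice to return 1-based end positions are assumptions for this example.

```python
def k_difference_matches(pattern, text, k):
    """Return 1-based end positions in text where pattern matches
    with at most k differences (insertions, deletions, substitutions).

    Uses a single column of the DP table: col[i] holds the minimum
    number of differences between pattern[:i] and some substring of
    text ending at the current position.
    """
    m = len(pattern)
    col = list(range(m + 1))  # column for the empty text prefix
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # value at (i-1, j-1)
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            left = col[i]   # value at (i, j-1), before overwriting
            col[i] = min(
                prev_diag + (pattern[i - 1] != ch),  # match or substitute
                left + 1,                            # extra text character
                col[i - 1] + 1,                      # missing pattern character
            )
            prev_diag = left
        if col[m] <= k:
            ends.append(j)
    return ends
```

The dependence of `col[i]` on `col[i - 1]` within the same column is the kind of data dependency that makes straightforward parallelization of the inner loop difficult.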
ISBN: 9781479979837 (print)
Circuit simulation is very important and time-consuming. Parallel computing can substantially accelerate the computation of many applications. GPU cards have thousands of threads, so using GPU cards to perform circuit simulation is a good strategy. This paper uses appropriate numerical methods, including techniques used by SPICE and Nonlinear Relaxation (a relaxation-based method for solving nonlinear equations), to perform the circuit simulation. We assign the heaviest calculation portion of the simulation program, i.e. the solution of the nonlinear equations (derived by Newton-Raphson iteration), to the GPU card. The complete circuit simulation program based on the proposed methods has been coded and tested by solving some MOSFET circuits. The resulting speedup justifies the success of this research.
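The Newton-Raphson step at the core of such simulators can be shown on a single nonlinear node equation. The sketch below is a CPU-only, scalar illustration rather than the paper's GPU implementation or its MOSFET models: it solves a hypothetical series resistor and diode circuit, and the component values, function name, and starting guess are all assumptions.

```python
import math


def solve_diode_node(vs=5.0, r=1000.0, i_s=1e-12, vt=0.025852,
                     v0=0.6, tol=1e-12, max_iter=100):
    """Find the diode voltage V in a source-resistor-diode loop.

    Node equation (Kirchhoff's current law):
        f(V) = (V - vs) / r + i_s * (exp(V / vt) - 1) = 0
    Newton-Raphson update:
        V <- V - f(V) / f'(V)
    """
    v = v0
    for _ in range(max_iter):
        exp_term = math.exp(v / vt)
        f = (v - vs) / r + i_s * (exp_term - 1.0)
        df = 1.0 / r + i_s / vt * exp_term  # derivative of f with respect to V
        dv = -f / df
        v += dv
        if abs(dv) < tol:
            return v
    raise RuntimeError("Newton-Raphson did not converge")
```

A full simulator applies the same update to a system of node equations, which means solving a linearized system at every iteration; that repeated solve is the heavy portion the abstract describes offloading to the GPU.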