Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power capacity and machines are starting to ...
ISBN: (print) 159593264X
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a Map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a Reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks are expressible in this model. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The MapReduce run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: thousands of MapReduce programs have been implemented and several thousand MapReduce jobs are executed on Google's clusters every day. In this talk I'll describe the basic programming model, discuss our experience using it in a variety of domains, and talk about the implications of programming models like MapReduce as one paradigm to simplify development of parallel software for multi-core microprocessors.
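The Map/Reduce contract described in this abstract can be illustrated with a minimal single-process sketch in Python. It is not Google's implementation: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and the partitioning, scheduling, and fault handling that the real run-time system provides are deliberately left out.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in the text."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    map_fn: Callable[[str, str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Hypothetical sequential driver: map, group by key, then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for inter_key, inter_value in map_fn(key, value):
            # Grouping step: collect values under their intermediate key.
            groups[inter_key].append(inter_value)
    return {k: reduce_fn(k, values) for k, values in groups.items()}


documents = [
    ("a.txt", "the quick brown fox"),
    ("b.txt", "the lazy dog the end"),
]
word_counts = run_mapreduce(documents, map_words, reduce_counts)
```

Because each call to the map function is independent, and each key is reduced independently, a real run-time system is free to spread both phases across many machines; the sequential driver above only shows the data flow.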
We propose a novel second order cone programming formulation for designing robust classifiers which can handle uncertainty in observations. Similar formulations are also derived for designing regression functions which are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
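To make the role of the second order moments concrete, here is a sketch of how a distribution-free constraint of this kind is commonly derived. The notation is an assumption, not taken from the abstract: observation $x_i$ has mean $\bar{x}_i$ and covariance $\Sigma_i$, $y_i \in \{-1, +1\}$ is its label, $\xi_i$ is a slack variable, and $\kappa_i \in (0, 1)$ is the required probability of satisfying the margin.

```latex
% Chance constraint required for every distribution with the given moments:
\inf_{x_i \sim (\bar{x}_i, \Sigma_i)}
  \Pr\!\left[ y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i \right] \ge \kappa_i

% A multivariate Chebyshev bound turns it into a second order cone constraint:
y_i \left( w^\top \bar{x}_i + b \right) \ge 1 - \xi_i
  + \gamma_i \left\lVert \Sigma_i^{1/2} w \right\rVert_2 ,
\qquad
\gamma_i = \sqrt{\frac{\kappa_i}{1 - \kappa_i}}
```

Only $\bar{x}_i$ and $\Sigma_i$ appear in the final constraint, which is why no further distributional assumption is needed. Setting $\Sigma_i = 0$ recovers the ordinary soft-margin constraint for an observation known exactly.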
The significance of programmable technologies in meeting the needs of digital home entertainment platforms is discussed. Programmable technologies have helped to develop devices capable of providing dynamic video, audio, and data from a single device. Hitachi has developed a digital video recorder using the Blackfin processor to improve the viewing experience of TV watchers. The programmable technology also enabled Hitachi to develop the Wooo D series of DVRs, which recognize, extract, and play highlights from recorded video automatically, eliminating the need for fast forwarding. Toshiba has developed a multichannel audio signal chain powered by SHARC processors in HD DVDs, which provide high-definition audio and video. These devices handle multichannel streams in a wide array of audio formats, simultaneously decoding Dolby Digital Plus streams and converting them to backward-compatible Dolby Digital streams.
The networking and polling routines of the Apache Portable Runtime (APR), which provide native-code portability for C/C++ programmers, are discussed. Source code is said to be portable if it compiles and runs in several environments without modification, and applications built on portable code have a potentially wide user base. Native-code portability usually involves messy preprocessor macros to detect and compensate for OS-specific oddities. TCP socket communication involves a steady connection between the two socket endpoints. Datagram-oriented UDP connections pass data in small packets, the order and delivery of which may be unreliable. APR's networking routines are more convenient to use than the UNIX-style builtins, although both use data structures to represent connections and address information. Beyond its OS-neutral networking and polling APIs, APR includes OS-neutral abstractions for threading, files, and even process handling.
Worklist Productivity Monitor (WPM) was developed for the healthcare sector to provide an automated way of tracking staff productivity without custom coding by the customer or the vendor. WPM, based on Sun Java™ Composite Application Suite (JCAPS), is broadly split into two functional areas: data collection and retention, and representation of data in various graphical formats. The Smart Logic business engine rules provide several features, including a Dashboard Graphical User Interface (GUI) that allows managers and supervisors to view productivity data in real time through a web portal, and the ability to view productivity data in different layers through GUI filters. The WPM method provides many advantages, such as derivation of productivity information from existing data, consolidation of productivity data in a single view, and identification of problem areas requiring management intervention. WPM also provides trend analysis to predict seasonal variations in task load and the resultant changes in productivity.
A trace is a record of the execution of a computer program, showing the sequence of operations executed. Dynamic traces are obtained by executing the program and depend upon the input. Static traces, on the other hand, describe potential sequences of operations extracted statically from the source code. Static traces offer the advantage that they do not depend upon input data. This paper describes a new automatic technique to extract static traces for individual stack and heap objects. The extracted static traces can be used in many ways, such as protocol recovery and validation in particular and program understanding in general. In addition, this article describes four case studies we conducted to explore the efficiency of our algorithm, the size of the resulting static traces, and the influence of the underlying points-to analysis on this size. (c) 2004 Elsevier Inc. All rights reserved.
Event-based middleware is currently being applied for application component integration in a range of application domains. As a result, a variety of event services have been proposed to address different requirements. To aid the understanding of the relationships between these systems, this paper presents a taxonomy of distributed event-based programming systems. The taxonomy is structured as a hierarchy of the properties of a distributed event system and may be used as a framework to describe such a system according to its properties. The taxonomy identifies a set of fundamental properties of event systems and categorizes them according to the event model supported and the structure of the event service. Event services are further classified according to their organization and their interaction models, as well as other functional and non-functional features.
There has been a rapid expansion of computer use in medicine recently in the US and Japan. The reasons are availability of high speed and wireless connections, decreasing cost, demands for increased quality of care an...
Constructing high performance computing systems and providing easy-to-use programming models for users are the two main parts of parallel computing, but the latter has been in a sorry state for a long time. The LilyTask programming model is based on tasks, which can be easily mapped to the decomposition of a computation. Without explicit synchronization and mutual exclusion, users can concentrate on the inherent parallelism in a problem instead of lock/unlock programming techniques. Most existing implementations of the task pool, a flexible data structure for task parallelism, lack a basic representation of relations among tasks. LilyTask introduces task groups and task relations to help users easily map subproblems to tasks. The kernel of the LilyTask system, a distributed task pool, automatically exploits potential parallelism among tasks at runtime. With runtime task assignment and task stealing, the LilyTask system achieves dynamic load balancing. Our performance evaluation shows that the LilyTask system with its task pool outperforms sequential programs and BSPlib in solving both regular and irregular problems.
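The idea of a task pool that records relations among tasks can be sketched in a few lines of Python. This is not the LilyTask API: the class `TaskPool` and its methods `add_task` and `run` are hypothetical names, and the sketch runs ready tasks one after another in a single process, whereas the system described above distributes them and balances load through task stealing.

```python
from collections import deque


class TaskPool:
    """Hypothetical task pool that records 'runs after' relations between tasks."""

    def __init__(self):
        self.funcs = {}        # task name -> callable
        self.waiting_on = {}   # task name -> number of unfinished prerequisites
        self.dependents = {}   # task name -> tasks unblocked when it finishes

    def add_task(self, name, func, after=()):
        """Register a task; 'after' names the tasks it must wait for."""
        self.funcs[name] = func
        self.waiting_on[name] = len(after)
        self.dependents.setdefault(name, [])
        for dep in after:
            self.dependents.setdefault(dep, []).append(name)

    def run(self):
        """Run every task once its prerequisites are done; return the order."""
        ready = deque(n for n, count in self.waiting_on.items() if count == 0)
        order = []
        while ready:
            # Every task in 'ready' is independent of the others, so a
            # parallel runtime could hand them to different workers here.
            name = ready.popleft()
            self.funcs[name]()
            order.append(name)
            for child in self.dependents[name]:
                self.waiting_on[child] -= 1
                if self.waiting_on[child] == 0:
                    ready.append(child)
        return order


results = {}
pool = TaskPool()
pool.add_task("left", lambda: results.update(left=sum(range(0, 50))))
pool.add_task("right", lambda: results.update(right=sum(range(50, 100))))
pool.add_task(
    "combine",
    lambda: results.update(total=results["left"] + results["right"]),
    after=("left", "right"),
)
order = pool.run()
```

The user code contains no locks: the only synchronization is the declared relation that `combine` runs after `left` and `right`, which is the kind of information a runtime can use to find parallelism on its own.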