ISBN: (Print) 9781509010530
The problem of developing adaptive control systems based on microelectromechanical systems (MEMS) to reduce turbulent flow around an aircraft is considered. A combined solution based on MEMS and the NVIDIA Tegra graphics processing platform, which includes massively parallel technology, is proposed. Parallel processing software is developed based on algorithms that deliver scaling and speedup on distributed memory systems for high-performance applications. The distributed MEMS technique involves spreading integrated sense-compute-act modules over areas to sense the physical world and act upon it. Tiny elements can reduce drag by sensing vortices and interacting with them. This will help to achieve better maneuverability and to increase the range of aircraft payload and capability.
ISBN: (Print) 9781665497473
The concept of memory disaggregation has recently been gaining traction in research. With memory disaggregation, data center compute nodes can directly access memory on adjacent nodes and are therefore able to overcome local memory restrictions, introducing a new data management paradigm for distributed computing. This paper proposes and demonstrates a memory-disaggregated in-memory object store framework for big data applications by leveraging the newly introduced ThymesisFlow memory disaggregation system. The framework extends the functionality of the pre-existing Apache Arrow Plasma object store framework to distributed systems by enabling clients to easily and efficiently produce and consume data objects across multiple compute nodes. This allows big data applications to increasingly leverage parallel processing at reduced development costs. In addition, the paper includes latency and throughput measurements that indicate only a modest performance penalty is incurred for remote disaggregated memory access as opposed to local access (~6.5 vs ~5.75 GiB/s). The results can be used to guide the design of future systems that leverage memory disaggregation as well as the newly presented framework. This work is open-source and publicly accessible at https://***/10.5281/zenodo.6368998.
ISBN: (Print) 9780769546766
Heterogeneous parallel systems that include accelerators such as Graphics Processing Units (GPUs) are expected to play a major role in architecting the largest systems in the world, as well as the most powerful embedded devices. Impressive computational speedups have been reported for numerous algorithms in the fields of medical image processing, digital signal processing, astrophysics, modeling, and simulation. However, it is frequently assumed that the working data set of the application fits in the memory of the accelerator. In this paper, we first lift this constraint by presenting a simple and scalable compile-time approach for processing large data sets based on I/O tiling. Second, we combine tiling with streaming in our asynchronous execution model, which enables efficient data-driven processing of large data sets on heterogeneous platforms with accelerators. Finally, we present results for several microbenchmarks and three data-parallel kernels.
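The tiling idea in the abstract above can be illustrated with a minimal sketch: a data set too large for the accelerator is cut into fixed-size tiles, and a kernel is applied to one tile at a time. This is only an illustration under stated assumptions. It runs sequentially at run time in plain Python, whereas the paper describes a compile-time approach with asynchronous streaming; the names `tiles`, `process_tiled`, and `scale_kernel` are hypothetical and do not come from the paper.

```python
from typing import Callable, Iterator, List, Sequence


def tiles(data: Sequence[float], tile_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of at most tile_size elements."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for start in range(0, len(data), tile_size):
        yield data[start:start + tile_size]


def process_tiled(
    data: Sequence[float],
    tile_size: int,
    kernel: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Apply kernel tile by tile so only one tile is 'resident' at a time."""
    out: List[float] = []
    for tile in tiles(data, tile_size):
        # In a real system this is where a tile would be copied to the
        # accelerator, processed, and copied back (hypothetical step).
        out.extend(kernel(tile))
    return out


def scale_kernel(tile: Sequence[float]) -> List[float]:
    """Hypothetical element-wise kernel: double every value."""
    return [2.0 * x for x in tile]
```

Because the kernel is element-wise, the tiled result equals the result of processing the whole data set at once; kernels with neighbor dependencies would need overlapping tiles, which this sketch does not cover.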
ISBN: (Print) 9781479986484
Graph coloring is used to identify subsets of independent tasks in parallel scientific computing applications. Traditional coloring heuristics aim to reduce the number of colors used, as that number also corresponds to the number of parallel steps in the application. However, if the color classes produced are skewed in size, utilization of hardware resources becomes inefficient, especially for the smaller color classes. Equitable coloring is a theoretical formulation of coloring that guarantees a perfect balance among color classes, and its practical relaxation is referred to as balanced coloring. In this paper, we revisit the problem of balanced coloring in the context of parallel computing. The goal is to achieve a balanced coloring of an input graph without increasing the number of colors that an algorithm oblivious to balance would have used. We propose and study multiple heuristics that aim to achieve such a balanced coloring, present parallelization approaches for multi-core and manycore architectures, and cross-evaluate their effectiveness with respect to the quality of balance achieved and performance. Furthermore, we study the impact of the proposed balanced coloring heuristics on a concrete application, namely parallel community detection, which is an example of an irregular application. The thorough treatment of balanced coloring presented in this paper, from algorithms to application, is expected to serve as a valuable resource for parallel application developers who seek to improve the parallel performance of their applications using coloring.
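To make the difference between ordinary and balanced coloring concrete, here is a minimal sequential sketch of one possible balancing rule: a greedy pass that, among the colors already in use and permissible for a vertex, picks the one with the fewest vertices so far, and opens a new color only when none is permissible. This rule is an assumption chosen for illustration and is not claimed to be one of the paper's heuristics; the function name and the adjacency-dictionary input format are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable


def balanced_greedy_coloring(
    adj: Dict[Hashable, Iterable[Hashable]]
) -> Dict[Hashable, int]:
    """Greedy coloring that prefers the least-used permissible color."""
    color: Dict[Hashable, int] = {}
    class_size: Dict[int, int] = defaultdict(int)
    num_colors = 0
    for v in adj:
        # Colors already taken by colored neighbors are forbidden for v.
        forbidden = {color[u] for u in adj[v] if u in color}
        permissible = [c for c in range(num_colors) if c not in forbidden]
        if permissible:
            # Tie-break on the color index to keep the result deterministic.
            c = min(permissible, key=lambda k: (class_size[k], k))
        else:
            c = num_colors
            num_colors += 1
        color[v] = c
        class_size[c] += 1
    return color
```

On a graph with one edge and four isolated vertices, a first-fit greedy pass would put five vertices in the first color class and one in the second, while this rule splits them three and three using the same two colors.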
ISBN: (Print) 9781665435741
Over the years, researchers have leveraged patients' discourse on social media to inform clinical and digital interventions. We contribute to these research efforts by mining and analyzing diabetes-related public posts on two social networks (online forums) with the aim of identifying management strategies adopted by diabetes patients and suggesting appropriate interventions. We develop a medical named entity recognition framework, MediNER, to identify named entities related to diabetes management and classify them into Food, Medication, Therapeutic Procedure, and Supplement. Our analysis shows that food-related strategies are most prevalent among both Type 1 and Type 2 diabetes patients, followed by medication-related strategies. Strategies involving supplements (such as vitamins) are the least utilized. We also investigate gender differences in the strategies employed by diabetes patients. Finally, we offer design recommendations for digital interventions aimed at diabetes self-management based on our findings.
ISBN: (Print) 0818677937
Interactive program steering is a promising technique for improving the performance of parallel and distributed applications. Steering decisions are typically based on visual presentations of some subset of the computation's current state, a historical view of the computation's behavior, or views of metrics based on the program's performance. As in any endeavor, good decisions require accurate information. However, the distributed nature of the collection process may result in distortions in the portrayal of the program's execution. These distortions stem from the merging of streams of information from distributed collection points into a single stream without enforcing the ordering relationships that held among the program components that produced the information. An ordering filter placed at the point at which the streams are merged can ensure a valid ordering, leading to more accurate visualizations and better-informed steering decisions. In this paper, we describe the implementation of such filters in the Falcon interactive steering toolkit and present a methodology for their specification for automated generation.
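The role of an ordering filter at the merge point can be sketched as follows. The sketch assumes that every event carries a logical timestamp that is consistent with the ordering relationships among the producers (for example, a Lamport clock) and that each collection point emits its own events in timestamp order. Both are assumptions made for illustration; the abstract does not say how Falcon's filters are specified, and the `Event` type and `ordering_filter` function are hypothetical names.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int  # assumed logical clock value, consistent with causality
    source: str     # identifier of the collection point
    payload: str


def ordering_filter(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge per-source streams into one stream ordered by timestamp.

    Each input stream must already be sorted by timestamp; ties are
    broken by source name so the output is deterministic.
    """
    return heapq.merge(*streams, key=lambda e: (e.timestamp, e.source))
```

A naive merge that simply concatenated or interleaved arrivals could show a receive event before the matching send event; merging on the logical timestamp avoids that under the stated assumptions.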
ISBN: (Print) 9781728174457
Since their introduction, Java streams have been quickly embraced by industry and are currently used at a large scale. The parallelism they enable is very easy to achieve, but it is constrained either by the parallelism model used (in some cases) or by the set of operations that can be specified using streams. In this paper, we investigate the possibility of extending the types of computation that can be defined using the Java streams API by introducing PowerList-theory-based computation into this infrastructure. PowerLists are recursive data structures that, together with their associated algebraic theory, offer both abstractions that ease the development of parallel applications and a methodology for designing parallel algorithms. The Java streaming infrastructure can be adapted to support them to a large extent. We present such an adaptation here, and we analyse and discuss its advantages and constraints. The analysis is illustrated by application examples.
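For readers unfamiliar with PowerLists, the following minimal sketch shows the two standard ways of deconstructing a list whose length is a power of two, tie (first half and second half) and zip (even-indexed and odd-indexed elements), and a divide-and-conquer reduction built on the tie decomposition. It is written in Python for brevity and is not the paper's Java streams adaptation; all function names are hypothetical.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def tie_split(p: List[T]) -> Tuple[List[T], List[T]]:
    """Inverse of tie: split into first half and second half."""
    half = len(p) // 2
    return p[:half], p[half:]


def zip_split(p: List[T]) -> Tuple[List[T], List[T]]:
    """Inverse of zip: split into even-indexed and odd-indexed elements."""
    return p[0::2], p[1::2]


def pl_reduce(op: Callable[[T, T], T], p: List[T]) -> T:
    """Reduce a PowerList with an associative operator via tie splitting."""
    if not is_power_of_two(len(p)):
        raise ValueError("PowerList length must be a power of two")
    if len(p) == 1:
        return p[0]
    left, right = tie_split(p)
    # The two recursive calls are independent, which is what makes the
    # structure attractive for parallel execution (run sequentially here).
    return op(pl_reduce(op, left), pl_reduce(op, right))
```

The two halves produced at every level have equal length, so the recursion forms a balanced binary tree of depth log2(n).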
ISBN: (Print) 9780769549712
Reductions matter and they are here to stay. Wide adoption of parallel processing hardware in a broad range of computer applications has encouraged recent research efforts on their efficient parallelization. Furthermore, trends towards high-productivity languages in mainstream computing increase the demand for efficient programming support. In this paper, we present a new approach to parallel reductions for distributed memory systems that provides both scalability and programmability. Using OmpSs, a task-based parallel programming model, the developer can express scalable reductions through a single pragma annotation. This pragma annotation is applicable to tasks as well as to work-sharing constructs (with implicit tasking) and instructs the compiler to generate the required runtime calls. The supporting runtime handles data and task distribution, parallel execution, and data reduction. Scalability is achieved through a software cache that maximizes local and temporal data reuse and allows overlapped computation and communication. Results confirm scalability for up to 32 12-core cluster nodes.
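The general pattern behind a parallel reduction, independent of OmpSs, is sketched below: each worker reduces a private chunk of the data, and the partial results are then combined. This is a shared-memory Python illustration using the standard library, not the OmpSs pragma or its distributed-memory runtime; the function name `parallel_reduce` and its parameters are hypothetical, and the operator is assumed to be associative with `identity` as its neutral element.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def parallel_reduce(
    values: Sequence[T],
    op: Callable[[T, T], T],
    identity: T,
    workers: int = 4,
) -> T:
    """Reduce values with op using per-chunk partial results."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    # Ceiling division so that at most `workers` chunks are created.
    chunk = max(1, -(-len(values) // workers))
    chunks: List[Sequence[T]] = [
        values[i:i + chunk] for i in range(0, len(values), chunk)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task reduces its own private chunk; no shared accumulator.
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    # Final combine step over the small list of partial results.
    return reduce(op, partials, identity)
```

Threads are used here only to show the structure; in CPython they do not speed up pure-Python arithmetic, so this sketch should not be read as a performance example.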
ISBN: (Print) 9780769552071
Multi-core phones are now pervasive. Yet existing applications rely predominantly on a client-server computing paradigm, using phones only as thin clients that send sensed information via the cellular network to servers for processing. This makes the cellular network the bottleneck, limiting overall application performance. In this paper, we propose MobiStreams, a Distributed Stream Processing System (DSPS) that runs directly on smartphones. MobiStreams can offload computing from remote servers to local phones and thus alleviate the pressure on the cellular network. Implementing a DSPS on smartphones faces significant challenges: 1) multiple phones can readily fail simultaneously, and 2) the phones' ad-hoc WiFi network has low bandwidth. MobiStreams tackles these challenges through two new techniques: 1) token-triggered checkpointing, and 2) broadcast-based checkpointing. Our evaluations, driven by two real-world applications deployed in the US and Singapore, show that migrating from a server platform to a smartphone platform eliminates the cellular network bottleneck, leading to a 0.78X to 42.6X throughput increase and a 10% to 94.8% latency decrease. Also, MobiStreams' fault tolerance scheme increases throughput by 230% and reduces latency by 40% compared with prior state-of-the-art fault-tolerant DSPSs.
DCABES is a community working in the area of Distributed Computing and Applications in Business, Engineering, and Sciences, and is responsible for organizing meetings and symposia related to the field. DCABES inte...