ISBN: 9782954035109 (print)
We describe techniques for implementing real-time partitioned convolution algorithms on conventional operating systems using two different scheduling paradigms: time-distributed (cooperative) and multi-threaded (preemptive). We discuss the optimizations applied to both implementations and present measurements of their performance for a range of impulse response lengths on a recent high-end desktop machine. We find that while the time-distributed implementation is better suited for use as a plugin within a host audio application, the preemptive version was easier to implement and significantly outperforms the time-distributed version despite the overhead of frequent context switches.
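To illustrate the underlying idea of partitioned convolution, here is a minimal sketch rather than the authors' implementation. It splits the impulse response into fixed-size partitions, convolves the input with each partition, and adds each partial result into the output at that partition's offset. Real-time implementations typically process the input in blocks and use FFT-based convolution for each partition; this sketch assumes direct time-domain convolution for brevity. The names `direct_convolve`, `partitioned_convolve`, and `block` are hypothetical.

```python
def direct_convolve(x, h):
    """Plain time-domain convolution of two non-empty sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def partitioned_convolve(x, h, block):
    """Convolve x with h by splitting h into partitions of `block` taps.

    Assumption for illustration: each partition is convolved directly.
    A real-time engine would typically use an FFT per partition instead.
    """
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(h), block):
        part = h[start:start + block]
        # The partition begins `start` samples into h, so its contribution
        # is delayed by `start` samples in the output.
        for k, v in enumerate(direct_convolve(x, part)):
            y[start + k] += v
    return y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    h = [1.0, 0.0, -1.0, 2.0, 0.5, 3.0, 1.0]
    print(partitioned_convolve(x, h, block=3))
```

Because convolution is linear, the sum of the delayed partial results equals the full convolution. This is what lets the partitions be computed at different times (time-distributed) or on different threads (preemptive).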
Aiming at the problems of insufficient utilization of information about elite particles in archive and instability of particle motion in the population in the multi-objective artificial physics optimization algorithm ...
Personal computing on mobile platforms such as laptops and personal digital assistants, rather than in a traditional desktop environment, is becoming increasingly common. In this paper we address the issue of application session transfer for uninterrupted data access across this diverse range of platforms. This work is part of the iMASH project, a multi-year, multi-discipline collaborative effort focused on enabling mobile client platforms and incorporating them into existing legacy networked systems for use by medical practitioners. We have developed a tiered architecture that includes a middleware server layer positioned between existing application servers and multiple clients to make session transfer transparent to the user. Any client application executing our Middleware-Aware Remote Code library can save and restore its session by interacting with a middleware server. As a proof of concept, we have implemented the transfer of bookmarks, history, web cache, and user preferences with the Mozilla open source web browser. From this effort we have established baseline performance metrics and have found that the overhead is within reasonable bounds: just a few seconds of latency.
Large models have achieved impressive performance in many downstream tasks. Using pipeline parallelism to fine-tune large models on commodity GPU servers is an important way to make the excellent performance of large ...
There is increasing interest in computing models that support extensibility of systems through code migration. Although appealing both from the system design and extensibility points of view, extensible systems are vu...
As deep learning grows rapidly, model training heavily relies on parallel methods and there exist numerous cluster configurations. However, current preferences for parallel training focus on data centers, overlooking ...
ISBN: 0769503403 (print)
There is considerable interest in developing runtime infrastructures for programs that can migrate from one host to another. Mobile programs are appealing because they support efficient utilization of network resources and extensibility of information servers. This paper presents a scheduling scheme for allocating resources to a mix of real-time and non-real-time mobile programs. Within this framework, both mobile programs and hosts can specify constraints on how the CPU should be allocated. On the basis of these constraints, the scheme constructs a scheduling graph to which it applies several scheduling algorithms. In case of conflicts between constraints specified by a mobile program and those specified by the host, the scheme implements a policy that resolves the conflicts in favor of the host. The resulting scheduling scheme is adaptive and flexible, and it enforces both program-specified and host-specified constraints.
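The host-favoring conflict policy can be pictured with a small sketch. It is an assumption-laden illustration, not the paper's scheme: it does not model the scheduling graph or the scheduling algorithms, and it assumes each constraint is a single CPU share between 0 and 1. The function name `resolve_shares` and the dictionaries `requests` and `host_caps` are hypothetical.

```python
def resolve_shares(requests, host_caps):
    """Combine program-requested CPU shares with host-imposed caps.

    Assumption for illustration: a conflict exists when a program asks
    for more CPU than the host allows it, and the host's cap wins.
    Programs without a host cap keep their requested share.
    """
    effective = {}
    for program, requested in requests.items():
        cap = host_caps.get(program, requested)
        effective[program] = min(requested, cap)
    return effective


if __name__ == "__main__":
    requests = {"video_agent": 0.5, "indexer": 0.2}   # program-specified
    host_caps = {"video_agent": 0.3}                  # host-specified
    print(resolve_shares(requests, host_caps))
```

Here the hypothetical `video_agent` asks for half the CPU but is held to the host's cap, while `indexer` is unaffected because the host places no constraint on it.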
In this paper, we study the electrical characteristics of gate-all-around (GAA) silicon-germanium (SiGe) nanowire field-effect transistors (NWFETs) with different channel aspect ratios (AR). Device characteristics: the ...
Computation reuse is known as an effective optimization technique. However, due to the complexity of modern GPU architectures, there is not yet enough understanding of how the interplay of computation reuse and hardware specifics affects application performance. In this paper, we propose an automatic code generator for a class of stencil codes with inherent computation reuse on GPUs. For such applications, the proper reuse of intermediate results, combined with careful register and on-chip local memory usage, has profound implications for performance. The current state of the art does not address this problem in depth, partly due to the lack of a good program representation that can expose all potential computation reuse. We leverage the computation overlap graph (COG), a simple representation of data dependence and data reuse with an "element view", to expose potential reuse opportunities. Using COG, we propose a portable code generation and tuning framework for GPUs. Compared with current state-of-the-art code generators, our experimental results show up to 56.7% performance improvement on modern GPUs such as the NVIDIA C2050.
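As a rough illustration of what computation reuse means for a stencil, the sketch below compares two ways of computing a 3x3 box sum. It assumes plain CPU-side Python rather than generated GPU code, and it does not model the COG or the authors' code generator. The function names `box3_naive` and `box3_reuse` are hypothetical.

```python
def box3_naive(grid):
    """3x3 box sum at every interior point, recomputing all nine terms."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = sum(grid[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
    return out


def box3_reuse(grid):
    """Same stencil, but each vertical 3-point sum is computed only once.

    Horizontally adjacent outputs overlap in two of their three columns,
    so the column sums are intermediate results that can be shared.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        col_sum = [grid[i - 1][j] + grid[i][j] + grid[i + 1][j]
                   for j in range(cols)]
        for j in range(1, cols - 1):
            out[i][j] = col_sum[j - 1] + col_sum[j] + col_sum[j + 1]
    return out


if __name__ == "__main__":
    grid = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(5)]
    print(box3_reuse(grid) == box3_naive(grid))
```

The reuse version performs about four additions per output point instead of eight. On a GPU, where the shared intermediate values are kept (registers or on-chip local memory) is the kind of hardware-specific decision the abstract refers to.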
Neural Radiance Field (NeRF) has received widespread attention for its photo-realistic novel view synthesis quality. Current methods mainly represent the scene based on point sampling of ray casting, ignoring the infl...