Search Results - Inner Mongolia University Library

14th International Conference on Digital Audio Effects, DAFx 2011

Authors: Battenberg, Eric; Avižienis, Rimas. Affiliation: Parallel Computing Laboratory, Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, United States

ISBN: 9782954035109 (print)

We describe techniques for implementing real-time partitioned convolution algorithms on conventional operating systems using two different scheduling paradigms: time-distributed (cooperative) and multi-threaded (preemptive). We discuss the optimizations applied to both implementations and present measurements of their performance for a range of impulse response lengths on a recent high-end desktop machine. We find that while the time-distributed implementation is better suited for use as a plugin within a host audio application, the preemptive version was easier to implement and significantly outperforms the time-distributed version despite the overhead of frequent context switches.

Keywords: Impulse response
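
The partitioning idea named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: real-time partitioned convolution normally transforms each partition with an FFT and processes the input block by block, whereas this sketch convolves in the time domain for clarity. The function names `direct_convolve` and `partitioned_convolve` and the `block` parameter are hypothetical.

```python
def direct_convolve(x, h):
    """Reference: full linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def partitioned_convolve(x, h, block):
    """Convolve x with h by splitting h into partitions of `block` taps.

    Each partition is convolved with the input on its own, and the partial
    result is added into the output delayed by the partition's offset.
    """
    y = [0] * (len(x) + len(h) - 1)
    for start in range(0, len(h), block):
        part = h[start:start + block]          # one partition of the impulse response
        partial = direct_convolve(x, part)     # sub-convolution for this partition
        for n, value in enumerate(partial):
            y[start + n] += value              # delay by the partition offset and sum
    return y


x = [1, 2, 3, 4]
h = [1, 0, -1, 2, 5]
# The partitioned result matches the direct convolution for any block size.
print(partitioned_convolve(x, h, block=2) == direct_convolve(x, h))
```

Because convolution is linear, summing the delayed sub-convolutions reproduces the full result; the choice of partition size only changes how the work is divided and scheduled.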

A Multi-Objective Artificial Physics Optimization Algorithm Based on Two-Phase Search

8th International Symposium on Computer Science and Intelligent Control, ISCSIC 2024

Authors: Zhang, Huihua; Xie, Liping. Affiliation: Taiyuan University of Science and Technology, Shanxi Key Laboratory of Big Data Analysis and Parallel Computing, Taiyuan, China

ISBN: 9798350380286 (print)

To address the insufficient use of information about elite particles in the archive and the unstable particle motion in the population when the multi-objective artificial physics optimization algorithm (MOAPO) solves multi-objective optimization problems, a multi-objective artificial physics optimization algorithm based on two-phase search (TP-MOAPO) is proposed. First, the algorithm improves the calculation of particle mass, so that the strength and weakness of the particles are accurately transformed into the corresponding masses while the efficiency of the mass calculation also improves. Next, a two-phase search strategy is proposed: the first phase gives the algorithm strong exploration ability, and the second phase gradually enhances its exploitation capability over the iterations, which resolves the unstable motion of particles during the search. Finally, the simulated binary crossover (SBX) and polynomial-based mutation (PM) operators are applied to the archive to further enhance the search capability of the algorithm. To verify the performance of TP-MOAPO, 21 benchmark functions were selected for comparison with the classical multi-objective particle swarm optimization algorithms MOPSO, dMOPSO, SMPSO, MMOPSO, and NMPSO; the experimental results show the superiority of TP-MOAPO on these functions. © 2024 IEEE.

Keywords: Optimization algorithms
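
The SBX and PM operators mentioned in the abstract above have well-known textbook forms, sketched below for a single real-valued variable. This is a minimal illustration, not the TP-MOAPO code: the function names, the distribution indices `eta`, and the use of the basic (non-boundary-aware) polynomial mutation are all assumptions.

```python
import random


def sbx_pair(p1, p2, eta=15.0, rng=random):
    """Simulated binary crossover on two real values; returns two children."""
    u = rng.random()
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    # Children are placed symmetrically around the parents' midpoint.
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2


def polynomial_mutation(x, lower, upper, eta=20.0, rng=random):
    """Basic polynomial mutation of one value, clipped to [lower, upper]."""
    u = rng.random()
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
    y = x + delta * (upper - lower)
    return min(max(y, lower), upper)


rng = random.Random(42)  # seeded generator so runs are repeatable
child_a, child_b = sbx_pair(1.0, 3.0, rng=rng)
mutated = polynomial_mutation(child_a, lower=0.0, upper=5.0, rng=rng)
print(child_a, child_b, mutated)
```

A larger `eta` keeps children and mutants closer to their parents, which is one common way to shift from exploration toward exploitation.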

Handoff of application sessions across time and space

International Conference on Communications (ICC2001)

Authors: Phan, T.; Xu, K.; Guy, R.; Bagrodia, R. Affiliation: Parallel Computing Laboratory, Computer Science Department, University of California, Los Angeles, CA 90095, United States

Personal computing on mobile platforms such as laptops and personal digital assistants, rather than in a traditional desktop environment, is becoming increasingly common. In this paper we address the issue of application session transfer for uninterrupted data access across this diverse range of platforms. This work is part of the iMASH project, a multi-year, multi-discipline collaborative effort focused on enabling mobile client platforms and incorporating them into existing legacy networked systems for use by medical practitioners. We have developed a tiered architecture that includes a middleware server layer positioned between existing application servers and multiple clients to make session transfer transparent to the user. Any client application executing our Middleware-Aware Remote Code library can save and restore its session by interacting with a middleware server. As a proof of concept, we have implemented the transfer of bookmarks, history, web cache, and user preferences with the Mozilla open source web browser. From this effort we have established baseline performance metrics and have found that the overhead is within reasonable bounds of just a few seconds of latency.

Keywords: Mobile computing

Efficient Large Models Fine-tuning on Commodity Servers via Memory-balanced Pipeline Parallelism

25th IEEE International Conferences on High Performance Computing and Communications, 9th International Conference on Data Science and Systems, 21st IEEE International Conference on Smart City and 9th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC/DSS/SmartCity/DependSys 2023

Authors: Liu, Yujie; Lai, Zhiquan; Liu, Weijie; Wang, Wei; Li, Dongsheng. Affiliation: College of Computer, National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China

ISBN: 9798350330014 (print)

Large models have achieved impressive performance in many downstream tasks. Using pipeline parallelism to fine-tune large models on commodity GPU servers is an important way to make the excellent performance of large models available to the general public. Previous solutions fail to achieve an efficient memory-balanced pipeline parallelism. In this poster, we introduce a memory load-balanced pipeline parallel solution. This solution balances memory consumption across stages on commodity GPU servers via NVLink bridges. It establishes a new pathway to offload data from GPU to CPU by using the PCIe link of adjacent GPUs connected by the NVLink bridge. Furthermore, our method orchestrates offload operations to minimize the offload latency during large model fine-tuning. Experiments demonstrate that our solution can balance the memory footprint among pipeline stages without sacrificing training performance. © 2023 IEEE.

Keywords: Program processors

Secure execution of mobile programs

DARPA Information Survivability Conference and Exposition, DISCEX 2000

Authors: Pandey, R.; Hashii, B.; Lal, M. Affiliation: Parallel and Distributed Computing Laboratory, Computer Science Department, University of California, Davis, CA 95616, United States

ISBN: 0769504906 (print)

There is increasing interest in computing models that support extensibility of systems through code migration. Although appealing both from the system design and extensibility points of view, extensible systems are vulnerable to an external program's aberrant execution behaviors. We examine the problems of resource access control and resource consumption. We propose solutions for these problems and analyze their effectiveness. © 2000 IEEE.

Rethinking the Distributed DNN Training Cluster Design from the Cost-effectiveness View

Authors: Lai, Zhiquan; Liu, Yujie; Wang, Wei; Hao, Yanqi; Li, Dongsheng. Affiliation: College of Computer, National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China

ISBN: 9798350330014 (print)

As deep learning grows rapidly, model training relies heavily on parallel methods, and numerous cluster configurations exist. However, current preferences for parallel training focus on data centers, overlooking the financial constraints faced by most researchers. To attain the best performance within the cost limitation, we introduce a throughput-cost metric to accurately characterize clusters' cost-effectiveness. Based on this metric, we design a cost-effective cluster featuring the 3090 with NVLink. The experimental results demonstrate that our cluster achieves remarkable cost-effectiveness in various distributed model training schemes. © 2023 IEEE.

Keywords: Cost effectiveness

CPU resource control for mobile programs

1st International Symposium on Agent Systems and Applications and 3rd International Symposium on Mobile Agents, ASA/MA 1999

Authors: Lal, Manoj; Pandey, Raju. Affiliation: Computer Science Department, Parallel and Distributed Computing Laboratory, University of California, Davis, CA 95616, United States

ISBN: 0769503403 (print)

There is considerable interest in developing runtime infrastructures for programs that can migrate from one host to another. Mobile programs are appealing because they support efficient utilization of network resources and extensibility of information servers. This paper presents a scheduling scheme for allocating resources to a mix of real-time and non-real-time mobile programs. Within this framework, both mobile programs and hosts can specify constraints on how the CPU should be allocated. On the basis of the constraints, the scheme constructs a scheduling graph on which it applies several scheduling algorithms. In case of conflicts between constraints specified by a mobile program and those specified by the host, the scheme implements a policy that resolves the conflicts in favor of the host. The resulting scheduling scheme is adaptive and flexible, and it enforces both program-specified and host-specified constraints.

Keywords: Scheduling algorithms

Impact of geometry aspect ratio on 10-nm gate-all-around silicon-germanium nanowire field effect transistors

2014 14th IEEE International Conference on Nanotechnology, IEEE-NANO 2014

Authors: Chao, Pei-Jung; Li, Yiming. Affiliation: Parallel and Scientific Computing Laboratory, Institute of Biomedical Engineering, National Chiao Tung University, Hsinchu 300, Taiwan

ISBN: 9781479956227 (print)

In this paper, we study the electrical characteristics of gate-all-around (GAA) silicon-germanium (SiGe) nanowire field effect transistors (NWFETs) with different channel aspect ratios (AR). Three device characteristics, the subthreshold swing (SS), the drain-induced barrier lowering (DIBL), and the ION/IOFF ratio, are simulated using three-dimensional quantum-mechanically corrected device simulation. The electrical characteristics of 10-nm-gate GAA Si1-xGex NWFET devices are explored with respect to different SiGe thicknesses and Ge mole fractions. We find that an ellipse-shaped channel with a small aspect ratio has better DC characteristics than one with a large AR, owing to its good gate controllability. © 2014 IEEE.

Keywords: Aspect ratio

Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs

Journal of Computer Science & Technology, 2016, Vol. 31, No. 6, pp. 1262-1274

Authors: Wen-Jing Ma; Kan Gao; Guo-Ping Long. Affiliations: Laboratory of Parallel Software and Computing Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; Information Center, China Association for Science and Technology, Beijing 100863, China

Computation reuse is known as an effective optimization technique. However, due to the complexity of modern GPU architectures, there is not yet enough understanding of how the interplay of computation reuse and hardware specifics affects application performance. In this paper, we propose an automatic code generator for a class of stencil codes with inherent computation reuse on GPUs. For such applications, the proper reuse of intermediate results, combined with careful register and on-chip local memory usage, has profound implications for performance. The current state of the art does not address this problem in depth, partly due to the lack of a good program representation that can expose all potential computation reuse. We leverage the computation overlap graph (COG), a simple representation of data dependence and data reuse with an "element view", to expose potential reuse opportunities. Using COG, we propose a portable code generation and tuning framework for GPUs. Compared with current state-of-the-art code generators, our experimental results show up to 56.7% performance improvement on modern GPUs such as the NVIDIA C2050.

Keywords: GPGPU; OpenCL; stencil; code generation; computation reuse
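
The notion of computation reuse in a stencil, as discussed in the abstract above, can be illustrated with a small CPU-side sketch. It does not reproduce the paper's COG representation or its GPU code generator. It only shows, for a hypothetical 3x3 box-sum stencil, how neighbouring outputs can share intermediate column sums instead of re-adding all nine inputs. The function names are assumptions.

```python
def box3x3_naive(a):
    """3x3 box sum over interior points: nine loads and eight adds per output."""
    rows, cols = len(a), len(a[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i - 1][j - 1] = sum(
                a[i + di][j + dj] for di in (-1, 0, 1) for dj in (-1, 0, 1)
            )
    return out


def box3x3_reuse(a):
    """Same stencil, but vertical partial sums are computed once and shared."""
    rows, cols = len(a), len(a[0])
    # Vertical 3-point sums: one intermediate value per interior row and column.
    col = [
        [a[i - 1][j] + a[i][j] + a[i + 1][j] for j in range(cols)]
        for i in range(1, rows - 1)
    ]
    # Each output combines three neighbouring column sums, so every column
    # sum is reused by up to three horizontally adjacent outputs.
    return [
        [row[j - 1] + row[j] + row[j + 1] for j in range(1, cols - 1)]
        for row in col
    ]


grid = [[r * 5 + c for c in range(5)] for r in range(4)]
print(box3x3_reuse(grid) == box3x3_naive(grid))
```

On a GPU the shared intermediates would typically live in registers or on-chip local memory, which is where the trade-offs described in the abstract arise.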

Area-NeRF: Area-based Neural Radiance Fields

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Ye, Zonxin; Li, Wenyu; Qiao, Peng; Dou, Yong. Affiliation: National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, School of Computer, Changsha, China

ISBN: 9798350331417 (print)

Neural Radiance Field (NeRF) has received widespread attention for its photo-realistic novel view synthesis quality. Current methods mainly represent the scene through point sampling along cast rays, ignoring how the observed area changes with distance. In addition, current sampling strategies all focus on the distribution of sampling points along the ray and pay no attention to the sampling of the rays themselves. We found that the current ray sampling strategy severely reduces convergence speed for scenes in which the camera moves forward. In this work, we extend the point representation to an area representation by using relative positional encoding, and propose a ray sampling strategy suited to forward-moving camera trajectories. We validated the effectiveness of our method on multiple public datasets. © 2023 IEEE.

Keywords: NeRF; neural radiance field; neural rendering; novel view synthesis
