ISBN: (Print) 9780769537474
In this paper, we discuss procedures for making a Viterbi decoder faster. Our implementation on an Intel CPU with the SSE4 parallel-processing instruction set, combined with several other methods, achieves a decoding speed of 47.05 Mbps (up from 0.64 Mbps originally). The DVB-T mode used in Taiwan needs 13.27 Mbps for real-time reception, so our software Viterbi decoder takes only 28% of the CPU load.
Overlapping Reconfiguration is currently the most efficient method to reconfigure an interconnection network, but is only valid for systems that apply distributed routing. This paper proposes a solution which enables...
ISBN: (Print) 9780769537474
The Graphics Processing Unit (GPU), with many lightweight data-parallel cores, can provide substantial parallel computational power to accelerate general-purpose applications. But this computing capacity cannot be fully utilized for memory-intensive applications, which are limited by off-chip memory bandwidth and latency. Stencil computation has abundant parallelism and low computational intensity, which makes it a useful architectural evaluation benchmark. In this paper, we propose memory optimizations for mgrid, a stencil-based application from the SPEC 2K benchmarks. By exploiting data locality across the three-level memory hierarchy and tuning the thread granularity, we reduce the pressure on off-chip memory bandwidth. To hide the long off-chip memory access latency, we further prefetch data during computation through double buffering. To fully exploit the CPU-GPU heterogeneous system, we redistribute the computation between these two computing resources. Through all these optimizations, we gain a 24.2x speedup over the simple mapping version, and as much as a 34.3x speedup over a CPU implementation.
ISBN: (Print) 9781424437504
Tools are becoming increasingly important for efficiently utilizing the computing power available in contemporary large-scale systems. The drastic increase in the size and complexity of systems requires tools to be scalable while producing meaningful and easily digestible information that helps the user pinpoint problems at scale. The goal of this tutorial is to introduce state-of-the-art performance tools from three different organizations to a diverse audience. Together these tools provide a broad spectrum of capabilities necessary to analyze the performance of scientific and engineering applications on a variety of large- and small-scale systems.
ISBN: (Print) 9780769536804
With the ever-increasing demand for high-quality 3D image processing in markets such as cinema and gaming, the capabilities of graphics processing units (GPUs) have advanced tremendously. Although GPU-based cluster computing, which uses GPUs as the processing units, is one of the most promising high-performance parallel computing platforms, there is currently no programming environment, interface, or library designed to use these multiple computing resources to compute tasks in parallel. This paper proposes CaravelaMPI, a new message-passing interface targeted at GPU cluster computing, providing a unified and transparent interface to manage both communication and GPU execution. Experimental results show that the transparent interface of CaravelaMPI makes it possible to program GPU-based clusters efficiently, not only decreasing the required programming effort but also increasing the performance of GPU-based cluster computing platforms.
ISBN: (Print) 9780769537474
Caches play a major role in the performance of high-speed computer systems. Trace-driven simulation is the most widely used method to evaluate cache architectures. However, as cache designs move to more complicated architectures and trace sizes grow larger and larger, traditional simulation methods are no longer practical due to their long simulation cycles. Several techniques have been proposed to reduce the simulation time of sequential trace-driven simulation. This paper considers the use of a general-purpose GPU to accelerate cache simulation, exploiting set-partitioning as the main source of parallelism. We develop more efficient parallel simulation techniques by introducing more knowledge into the Compute Unified Device Architecture (CUDA) program on the GPU. Our experimental results show that the new algorithm gains a 2.76x performance improvement over the traditional CPU-based sequential algorithm.
ISBN: (Print) 9780769537474
Large data sets are replicated at more than one site for better availability to the nodes in a grid. Downloading a data set from these replicated locations has practical difficulties due to network traffic, congestion, frequently changing server performance, and so on. To speed up the download, complex server selection techniques based on network and server loads are used. However, consistent performance is not guaranteed due to the shared nature of network links and the load on them, which can vary unpredictably. In this paper, we present a bandwidth-sensitive co-allocation scheme for parallel downloading under grid economics. The proposed technique aims to serve grid applications efficiently and economically in data grids. Taking the cost factor into consideration, we present a novel mechanism for server selection, dynamic file decomposition, and co-allocation. With costs taken into account, our server selection mechanism, combined with various techniques, is able to significantly reduce economic costs. We compared our scheme with existing schemes, and the preliminary results show a notable improvement in the overall completion time of data transfer.
ISBN: (Print) 9780769537474
In in-network storage wireless sensor networks, sensed data are stored locally for the long term and retrieved on demand instead of in real time. To maximize data survival, the sensed data are normally stored distributively at multiple nearby nodes. This raises the problem of how to check and guarantee the integrity of distributed data storage under resource constraints. In this paper, a technique called Two Granularity Linear Code (TGLC), which consists of Intra-codes and Inter-codes, is presented. An efficient and lightweight data integrity check scheme based on TGLC is proposed. Data integrity can be checked by anyone who holds the short Inter-codes, and the checking credential is a short, dynamically generated Intra-code. The proposed scheme is efficient and lightweight with respect to storage and communication overhead, and yet checking validity is maintained. Our conclusion is justified by extensive analysis.
Web servers often need to manage encrypted transfers of *** encryption activity is computationally intensive, and exposes a significant degree of parallelism. At the same time, cheap multicore processors are readily a...
ISBN: (Print) 9780769537474
For applications like 3D seismic migration, it is quite important to improve the I/O performance of a cluster computing system. Such seismic data processing applications are I/O-intensive: for example, a large 3D data volume cannot be held entirely in memory, so the input data files have to be divided into many fine-grained chunks. Intermediate results are written out at various stages during execution, and final results are written out by the master process. This paper describes a novel method for optimizing the parallel I/O data access strategy and load balancing for this particular program model. The optimization, based on an application-defined API, reduces the number of I/O operations and the amount of communication compared to the original model. This is done by forming groups of threads with "group roots" that read input data (determined by an index retrieved from the master process) and then send it to their group members; in the original model, each process or thread reads the whole input data and outputs its own results. Moreover, loads are balanced through on-line dynamic scheduling of access requests to the migration data. In the actual performance test, the performance improvement is often more than 60% compared with the original model.