ISBN (print): 9781581132441
Recently, a number of heuristic-based system-level synthesis algorithms have been proposed. Though these algorithms quickly generate good solutions, how close those solutions are to optimal is a question that is difficult to answer. Current exact techniques produce optimal results but fail to produce them in reasonable time. This paper presents a synthesis algorithm that produces solutions of guaranteed quality (optimal in most cases, or within a known bound) with practical synthesis times (a few seconds to minutes). It takes a unified look (the lack of which is one of the main sources of sub-optimality in the heuristic techniques) at different aspects of system synthesis such as pipelining, selection, allocation, scheduling, and FPGA reconfiguration. Our technique can handle both time-constrained and resource-constrained synthesis problems. We present results of our algorithm implemented as part of the Match project at Northwestern University.
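To make the contrast with heuristics concrete, the sketch below poses a toy resource-constrained scheduling problem as a time-indexed ILP, the style of exact formulation that yields optimal or provably bounded schedules. This is an illustration, not the paper's actual model: the tasks, durations, horizon, single resource type, and the use of the PuLP solver are all assumptions.

```python
# A sketch of exact, time-indexed resource-constrained scheduling with PuLP.
# x[t, s] == 1 iff task t starts at cycle s; minimize the schedule makespan.
import pulp

dur = {"a": 2, "b": 3, "c": 2}     # task -> latency in cycles (assumed)
preds = {"b": ["a"], "c": ["a"]}   # precedence edges (assumed)
H, R = 10, 1                       # scheduling horizon and resource count

prob = pulp.LpProblem("rc_scheduling", pulp.LpMinimize)
x = {(t, s): pulp.LpVariable(f"x_{t}_{s}", cat="Binary")
     for t in dur for s in range(H - dur[t] + 1)}
makespan = pulp.LpVariable("makespan", 0, H)
prob += makespan                   # objective: minimize makespan

def start(t):                      # linear expression for t's start time
    return pulp.lpSum(s * x[t, s] for s in range(H - dur[t] + 1))

for t in dur:                      # each task starts exactly once ...
    prob += pulp.lpSum(x[t, s] for s in range(H - dur[t] + 1)) == 1
    prob += makespan >= start(t) + dur[t]   # ... and finishes by makespan
for t, ps in preds.items():        # respect data dependences
    for p in ps:
        prob += start(t) >= start(p) + dur[p]
for u in range(H):                 # at most R tasks active in any cycle
    active = [x[t, s] for t in dur
              for s in range(max(0, u - dur[t] + 1), min(u, H - dur[t]) + 1)]
    prob += pulp.lpSum(active) <= R

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], {t: int(start(t).value()) for t in dur})
```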
ISBN (print): 9781450318679
The challenging aspect of building neuromorphic circuits in mature CMOS technology to match brain-like architectures is two-fold: scalability and connectivity. Scalability means that the circuits have to be expandable to match biological brains in terms of synaptic and neuronal densities. The challenge here is to implement 10^6 neurons and 10^10 synapses, with an average fanout of 10^4, in a square centimeter of CMOS. Connectivity means that the circuit has to offer both short- and long-range (by physical distance) connections between neurons. A large part of this challenge is how to implement a connectivity of 10^4 synapses per neuron. Unfortunately, even the exponential transistor density growth being experienced today is not sufficient to realize such massive connectivity and synaptic densities in a traditional CMOS process. Recent approaches to these challenges have integrated CMOS with nanotechnology in order to achieve the required synaptic densities. These solutions predominantly use crossbar architectures, but the connectivity challenge remains a daunting task for them. To meet these challenges, a novel synaptic time-multiplexing (STM) concept was developed along with a neural fabric design. This combination has the advantage of offering greater flexibility and long-range connectivity. It also provides a method to overcome the limitations of conventional CMOS technology in matching the synaptic density and connectivity requirements found in mammalian brains while maintaining nonlinear synapses and learning. In order to program neuromorphic hardware for any desired brain architecture, the topology would first have to be converted into a connectivity matrix or a graph representation. This matrix, along with statistics on the number of neurons and synapses, is provided as input to a neuromorphic compiler. The neuromorphic compiler compiles the neural network structure description into: 1) an assignment of the network'
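As a concrete illustration of the compiler's described input stage, the sketch below converts a randomly generated topology into a Boolean connectivity matrix plus the neuron/synapse statistics the abstract says are fed to the neuromorphic compiler. The network size and fanout are scaled-down assumptions, far below the 10^6-neuron, 10^4-fanout targets.

```python
# A sketch of building the compiler's input: a connectivity matrix + stats.
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 1000        # far below the 10^6 target, for illustration
fanout = 100            # stand-in for the 10^4 biological fanout

# Boolean connectivity matrix: conn[i, j] is True if neuron i synapses onto j.
# Self-connections may occur here; acceptable for this illustration.
conn = np.zeros((n_neurons, n_neurons), dtype=bool)
for i in range(n_neurons):
    targets = rng.choice(n_neurons, size=fanout, replace=False)
    conn[i, targets] = True

stats = {
    "neurons": n_neurons,
    "synapses": int(conn.sum()),
    "mean_fanout": float(conn.sum(axis=1).mean()),
}
# 'conn' and 'stats' together form the neuromorphic compiler's input.
print(stats)
```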
ISBN (digital): 9798350350579
ISBN (print): 9798350350586
Deep Neural Networks (DNNs) have achieved tremendous success in the past few years. However, their training and inference demand exceptional computational and memory resources. Quantization has been shown to be an effective approach to mitigating this cost, with the mainstream data types reduced from FP32 to FP16/BF16 and, recently, FP4 in the latest NVIDIA B100 GPUs. With increasingly aggressive quantization, however, conventional floating-point formats suffer from limited precision in representing numbers around zero. Recently, NVIDIA demonstrated the potential of using a Logarithmic Number System (LNS) for the next generation of tensor cores. While LNS mitigates the hurdles in representing small numbers, in this work we observed a mismatch between LNS and emerging Large Language Models (LLMs), which exhibit significant outliers when the LNS format is adopted directly. In this paper, we present a data-format/architecture co-design to bridge this gap. On the format side, we propose a dynamic LNS format that flexibly represents outliers at higher precision, by exploiting asymmetry in the LNS representation and identifying outliers on a per-block basis. On the architecture side, for demonstration, we realize the dynamic LNS format in a systolic array, which can handle the irregularity of the outliers at runtime. We implement our approach on an Alveo U280 FPGA as a prototype. Experimental results show that our design can effectively handle the outliers and resolve the mismatch between LNS and LLMs, contributing to accuracy improvements of 15.4% and 16% over the floating-point and original LNS baselines, and up to 15.3% over state-of-the-art quantization methods across four LLM models. Our observation and design lay a solid foundation for the large-scale adoption of the LNS format in the next generation of deep learning hardware.
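The sketch below illustrates the per-block outlier idea under simplified assumptions: every value is quantized to a low-precision log2 code, except the few largest magnitudes in each block, which keep a finer log grid. The bit widths, block size, and largest-k outlier rule are illustrative guesses, not the paper's actual format parameters.

```python
# A sketch of a dynamic-LNS-style quantizer with per-block outlier handling.
import numpy as np

def lns_quantize(x, frac_bits):
    """Round log2|x| to a fixed-point grid with 2^-frac_bits resolution."""
    sign = np.sign(x)
    mag = np.abs(x) + 1e-30                      # avoid log2(0)
    step = 2.0 ** -frac_bits
    logq = np.round(np.log2(mag) / step) * step
    return sign * 2.0 ** logq

def dynamic_lns(x, block=16, lo_bits=2, hi_bits=6, n_outliers=2):
    """Per block: the n_outliers largest magnitudes get hi_bits of log
    fraction, the rest get lo_bits -- the 'dynamic' part of the format."""
    out = np.empty_like(x)
    for s in range(0, len(x), block):
        blk = x[s:s + block]
        order = np.argsort(np.abs(blk))
        inliers, outliers = order[:-n_outliers], order[-n_outliers:]
        q = np.empty_like(blk)
        q[inliers] = lns_quantize(blk[inliers], lo_bits)
        q[outliers] = lns_quantize(blk[outliers], hi_bits)
        out[s:s + block] = q
    return out

vals = np.random.default_rng(1).normal(size=64)
vals[7] *= 50.0                                  # inject an LLM-style outlier
err = np.abs(dynamic_lns(vals) - vals).max()
print(f"max abs error with per-block outlier handling: {err:.4f}")
```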
ISBN (digital): 9798350350579
ISBN (print): 9798350350586
As the scaling of memory density slows physically, a promising solution is to scale memory logically by enhancing the CPU's memory controller to encode and store data more densely in memory. This is known as hardware memory compression. Hardware memory compression decouples OS-managed physical memory from actual memory (i.e., DRAM); the memory controller spends a dynamically varying amount of DRAM on each physical page, depending on the compressibility of the page's content. The newly decoupled actual memory effectively forms a new layer of memory beyond the traditional layers of virtual, pseudo-physical, and physical memory. We note that, unlike these traditional memory layers, each with its own specialized allocation interface (e.g., malloc/mmap for virtual memory, page tables+MMU for physical memory), this new layer of memory introduced by hardware memory compression still awaits its own allocation interface; its absence makes the allocation of actual memory imprecise and, sometimes, even impossible. Imprecisely allocating less actual memory, and/or being unable to allocate more, can harm performance. Even imprecisely allocating more actual memory to some jobs can be harmful, as it can result in allocating less actual memory to other jobs in highly occupied memory systems, where compression is useful. To restore precise memory allocation, we design a new memory allocation interface specialized for this new layer of memory and, subsequently, architect a new MMU-like component in the memory controller, tackling the corresponding design challenges. We create a full-system FPGA prototype of a hardware-compressed memory system with precise memory allocation. Our evaluations using the prototype show that jobs perform stably under colocation. The performance variation is only 1%-2%; in comparison, it is 19%-89% under the prior art.
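The toy model below illustrates the decoupling the abstract describes and why a dedicated allocation interface makes actual-memory allocation precise: each stored page is charged its content-dependent compressed footprint against a per-job budget. The page/block sizes, budget policy, and class names are illustrative assumptions, not the paper's design.

```python
# A sketch of precise actual-memory allocation under hardware compression.
PAGE = 4096
BLOCK = 64                                    # DRAM allocation granularity

class ActualMemoryAllocator:
    def __init__(self, dram_bytes):
        self.free = dram_bytes
        self.budget = {}                      # job -> remaining actual bytes

    def set_budget(self, job, actual_bytes):  # the new allocation interface
        self.budget[job] = actual_bytes

    def store_page(self, job, compressed_size):
        # Charge the compressed footprint, rounded up to DRAM blocks,
        # against both the job's budget and the free actual memory.
        cost = -(-min(compressed_size, PAGE) // BLOCK) * BLOCK
        if self.budget.get(job, 0) < cost or self.free < cost:
            return False                      # precise: deny, never overrun
        self.budget[job] -= cost
        self.free -= cost
        return True

alloc = ActualMemoryAllocator(dram_bytes=1 << 20)
alloc.set_budget("job_a", 256 * 1024)
print(alloc.store_page("job_a", compressed_size=1200))   # compressible page
print(alloc.store_page("job_a", compressed_size=4096))   # incompressible page
```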
ISBN (digital): 9798350350128
ISBN (print): 9798350350135
Congestion Control (CC) plays a vital role in deploying lossless datacenter networks based on Remote Direct Memory Access (RDMA). A high-performance CC scheme should provide low-latency and precise feedback on congestion events. However, no existing CC scheme achieves both features simultaneously. In this paper, we propose LHCC, a Low-latency and Hi-precision Congestion Control scheme for RDMA datacenter networks. LHCC uses out-of-band signaling to report network status, so a packet sender can detect congestion events within an RTT. In addition, LHCC adjusts the packet sending rate by taking into consideration all queues along the entire path that a packet has traversed. Accordingly, it provides more precise congestion control than existing schemes, especially when there are multiple bottlenecks in the network. We build an LHCC prototype on a real testbed with NVIDIA Bluefield-3 NICs and AGM39D FPGAs. Both testbed experiments and extensive simulations show that LHCC can reduce the Flow Completion Time (FCT) slowdown and the buffer usage (i.e., the queue lengths) by up to 62.5% and 58%, respectively, compared with the state-of-the-art high-precision CC scheme, HPCC.
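As a rough illustration of the multi-bottleneck point, the sketch below scales the sending rate by the worst utilization seen across every queue on the path, rather than reacting to a single congestion signal. The utilization model, gains, and parameters are assumptions in the spirit of HPCC-style fine-grained CC, not LHCC's actual algorithm.

```python
# A sketch of path-wide, multi-bottleneck rate adjustment.
def adjust_rate(rate_gbps, hop_signals, target_util=0.95, max_rate_gbps=100.0):
    """hop_signals: per-hop (queue_bytes, tx_bytes_per_rtt, capacity_per_rtt)
    collected out-of-band for every queue the packet traversed."""
    worst = 0.0
    for qlen, txed, cap in hop_signals:
        # Effective utilization counts both standing queue and link load.
        worst = max(worst, (qlen + txed) / cap)
    if worst > target_util:
        return max(rate_gbps * target_util / worst, 0.1)   # multiplicative decrease
    return min(rate_gbps + 1.0, max_rate_gbps)             # additive increase

# Two bottlenecks on the path: the sender reacts to the more congested one.
path = [(40_000, 110_000, 125_000), (5_000, 60_000, 125_000)]
print(adjust_rate(50.0, path))
```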