ISBN (print): 9798400702341
Serverless offers a scalable and cost-effective service model for users to run applications without focusing on underlying infrastructure or physical servers. While the Serverless architecture is not designed to address the unique challenges posed by resource-intensive workloads, e.g., Machine Learning (ML) tasks, it is highly scalable. Due to the limitations of Serverless function deployment and resource provisioning, the combination of ML and Serverless is a complex undertaking. We tackle this problem through decomposition of large ML models into smaller sub-models, referred to as slices. We set up ML inference tasks using these slices as a Serverless workflow, i.e., a sequence of functions. Our experimental evaluations are performed on the Serverless offering by AWS for demonstration purposes, considering an open-source format for ML model representation, the Open Neural Network Exchange (ONNX). The results show that our decomposition method enables the execution of ML inference tasks on Serverless regardless of model size, benefiting from the high scalability of this architecture while lowering the strain on computing resources, such as required run-time memory.
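The slicing idea above can be illustrated with a minimal software sketch: a layered model is split into sub-models, and inference runs as a chain of independent function calls, as in a Serverless workflow. All names here are illustrative assumptions, not the paper's implementation (which operates on ONNX models).

```python
# Minimal sketch: decompose a layered model into "slices" and run
# inference as a sequential workflow of independent calls, mimicking
# a chain of serverless functions. Names are illustrative only.

def make_slices(layers, slice_size):
    """Split an ordered list of layer functions into smaller sub-models."""
    return [layers[i:i + slice_size] for i in range(0, len(layers), slice_size)]

def run_slice(slice_layers, tensor):
    """One 'serverless function': applies its sub-model to the input."""
    for layer in slice_layers:
        tensor = layer(tensor)
    return tensor

def workflow_inference(slices, tensor):
    """Chain the slices: each function's output is the next one's input."""
    for s in slices:
        tensor = run_slice(s, tensor)
    return tensor

# Toy model: four elementwise layers split into two slices of two.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
slices = make_slices(layers, slice_size=2)
print(workflow_inference(slices, 5))  # ((5 + 1) * 2 - 3) ** 2 = 81
```

Because each slice only needs its own parameters in memory, per-function memory demand shrinks with slice size, which is the property the decomposition exploits.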
ISBN (digital): 9781665462723
ISBN (print): 9781665462723
Data movement between the main memory and the processor is a key contributor to execution time and energy consumption in memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM). One category of PiM is Processing-using-Memory (PuM), in which computation takes place inside the memory array by exploiting intrinsic analog properties of the memory device. PuM yields high performance and energy efficiency, but existing PuM techniques support a limited range of operations. As a result, current PuM architectures cannot efficiently perform some complex operations (e.g., multiplication, division, exponentiation) without large increases in chip area and design complexity. To overcome these limitations of existing PuM architectures, we introduce pLUTo (processing-using-memory with lookup table (LUT) operations), a DRAM-based PuM architecture that leverages the high storage density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs). The key idea of pLUTo is to replace complex operations with low-cost, bulk memory reads (i.e., LUT queries) instead of relying on complex extra logic. We evaluate pLUTo across 11 real-world workloads that showcase the limitations of prior PuM approaches and show that our solution outperforms optimized CPU and GPU baselines by an average of 713x and 1.2x, respectively, while simultaneously reducing energy consumption by an average of 1855x and 39.5x. Across these workloads, pLUTo outperforms state-of-the-art PiM architectures by an average of 18.3x. We also show that different versions of pLUTo provide different levels of flexibility and performance at different additional DRAM area overheads (between 10.2% and 23.1%). pLUTo's source code and all scripts required to reproduce the results of this paper are openly and fully available at https://***/CMU-SAFARI/pLUTo.
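The key idea, replacing an arithmetic operation with a table query, can be sketched in a few lines of software; this is only a conceptual analogue of the in-DRAM mechanism, with illustrative names.

```python
# Minimal software analogue of the pLUTo idea: replace a "complex"
# operation (here unsigned 8-bit multiplication) with a precomputed
# lookup table, so each result is obtained by a table read instead of
# arithmetic. In pLUTo the table lives inside DRAM and queries are
# massively parallel bulk reads; here it is just a Python list.

# Precompute once: a 256x256 table covering all 8-bit operand pairs.
MUL_LUT = [[a * b for b in range(256)] for a in range(256)]

def lut_mul(a, b):
    """'Compute' a * b by querying the table rather than multiplying."""
    return MUL_LUT[a][b]

print(lut_mul(13, 21))  # 273
```

The trade-off is exactly the one the abstract describes: table storage (area) is spent to avoid complex logic, which pays off when the memory substrate is dense and reads are cheap.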
ISBN (print): 9798400702341
Video identification in encrypted network traffic has become a trending field of research for user behavior and Quality of Experience (QoE) analysis. However, the traditional methods of video identification have become ineffective with the usage of Hypertext Transfer Protocol Secure (HTTPS). This paper presents a video identification method for encrypted network traffic that uses the number of packets received at the user's end in a second. For this purpose, video streams are captured, and a feature is extracted from the video streams in the form of a Packets-per-Second (PPS) series. This feature is provided as input to a Convolutional Neural Network (CNN), which learns the pattern from the network traffic feature and successfully identifies the video even if the pattern differs from the training sample. The results show that PPS outperforms the other video identification techniques with a high accuracy of 90%. Moreover, the results show that CNN outperforms its counterpart in video identification with a 25% performance increase.
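The PPS feature described above is straightforward to derive from packet arrival timestamps; the sketch below shows one plausible binning scheme (1-second buckets), with names and details assumed for illustration rather than taken from the paper.

```python
# Illustrative sketch: derive the Packets-per-Second (PPS) series used
# as the CNN input from a list of packet arrival timestamps (seconds
# since capture start). The 1-second binning is an assumption.

def pps_series(timestamps, duration):
    """Count packets received in each 1-second bin over the capture."""
    counts = [0] * duration
    for t in timestamps:
        bucket = int(t)
        if 0 <= bucket < duration:
            counts[bucket] += 1
    return counts

# Toy capture: 6 packets over a 3-second window.
print(pps_series([0.1, 0.5, 1.2, 1.3, 1.9, 2.7], duration=3))  # [2, 3, 1]
```

Crucially, this feature needs only packet counts and timing, not payload contents, which is why it survives HTTPS encryption.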
The presence of noisy labels has always been a primary factor affecting the effectiveness of federated learning (FL). Conventional FL approaches relying on Supervised Learning (SL) tend to overfit the noisy labels, re...
ISBN (print): 9781665433334
Serverless computing has become a fact of life on modern clouds. A serverless function may process sensitive data from clients, so protecting such a function against untrusted clouds using a hardware enclave is attractive for user privacy. In this work, we run existing serverless applications in SGX enclaves and observe that the performance degradation can be as high as 5.6x to even 422.6x. Our investigation identifies that these slowdowns are related to architectural features, mainly page-wise enclave initialization. Leveraging insights from our overhead analysis, we revisit the SGX hardware design and make minimal modifications to its enclave model. We extend SGX with a new primitive, region-wise plugin enclaves, which can be mapped into existing enclaves to reuse attested common states among functions. By remapping plugin enclaves, an enclave allows in-situ processing to avoid expensive data movement in a function chain. Experiments show that our design reduces enclave function latency by 94.74-99.57% and boosts autoscaling throughput by 19-179x.
In this study, the situations in which the Perturb and Observe Method (POM) can be used to determine the points at which a wind turbine operates at maximum power were examined. Among them, the points in which there ...
ISBN (print): 9781665433334
With the end of Dennard scaling, highly parallel and specialized hardware accelerators have been proposed to improve the throughput and energy efficiency of deep neural network (DNN) models for various applications. However, collective data movement primitives such as multicast and broadcast, which are required for multiply-and-accumulate (MAC) computation in DNN models, are expensive and require excessive energy and latency when implemented with electrical networks. This consequently limits the scalability and performance of electronic hardware accelerators. Emerging technology such as silicon photonics can inherently provide efficient implementations of multicast and broadcast operations, making photonics more amenable to exploiting parallelism within DNN models. Moreover, when coupled with other unique features such as low energy consumption, high channel capacity with wavelength-division multiplexing (WDM), and high speed, silicon photonics could potentially provide a viable technology for scaling DNN acceleration. In this paper, we propose Albireo, an analog photonic architecture for scaling DNN acceleration. By characterizing photonic devices such as microring resonators (MRRs) and Mach-Zehnder modulators (MZMs) using photonic simulators, we develop realistic device models and outline their capability for system-level acceleration. Using the device models, we develop an efficient broadcast combined with multicast data distribution by leveraging parameter sharing through unique WDM dot-product processing. We evaluate the energy and throughput performance of Albireo on DNN models such as ResNet18, MobileNet, and VGG16. When compared to current state-of-the-art electronic accelerators, Albireo increases throughput by 110x and improves energy-delay product (EDP) by an average of 74x with current photonic devices. Furthermore, when considering moderate and aggressive photonic scaling, the proposed Albireo design shows that EDP can be reduced by at least 229x.
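The WDM dot-product idea can be modeled arithmetically: each wavelength channel carries one input activation, a modulator scales it by a weight, and a photodetector sums all channels, while broadcast reuses the same activations across several weight rows. The sketch below models only the arithmetic, not the optics, and all names are illustrative assumptions.

```python
# Conceptual software analogue of a WDM dot product: each wavelength
# carries one activation, per-wavelength modulators apply the weights,
# and the photodetector output is the sum over all channels. Broadcast
# reuses one activation vector across many weight rows (MAC rows).

def wdm_dot(activations, weights):
    """One 'photodetector' output: sum of per-wavelength products."""
    return sum(a * w for a, w in zip(activations, weights))

def broadcast_dot(activations, weight_rows):
    """Broadcast the same activations to several weight rows at once."""
    return [wdm_dot(activations, row) for row in weight_rows]

print(broadcast_dot([1.0, 2.0, 3.0], [[1, 0, 1], [0.5, 0.5, 0.5]]))  # [4.0, 3.0]
```

In the electrical domain this broadcast costs wires and energy per fan-out; in the photonic domain the same light can feed many rows, which is the parallelism the abstract highlights.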
ISBN (digital): 9798331522124
ISBN (print): 9798331522131
Convolutional Neural Networks (CNNs) are widely used for optical character recognition of vehicle license plates in automatic license plate recognition (ALPR) systems. However, their high computational complexity makes meeting the time and cost requirements of specific ALPR applications challenging. This work aimed to develop a CNN architecture and select a hardware acceleration technique to create a low-cost optical character recognition (OCR) system capable of real-time vehicle identification. We designed the CNN architecture with accuracy and simplicity in mind, and we chose the hardware acceleration technique based on silicon cost and performance. Our 8-bit quantized CNN achieved an accuracy of 97.11%, and the accelerator resulted in a latency of 4.21 ms and a throughput of 598 FPS. The solution offers accuracy and performance comparable to related methods, using less than 20% of the hardware resources.
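For readers unfamiliar with 8-bit quantization, the sketch below shows one common symmetric per-tensor scheme; the paper's exact quantization method is not specified here, so this scheme and all names are assumptions for illustration.

```python
# Minimal sketch of symmetric 8-bit weight quantization, a common
# choice for quantized CNN inference. The per-tensor symmetric scale
# used here is an assumption, not necessarily the paper's method.

def quantize_int8(weights):
    """Map float weights to int8 values with a shared symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

q, s = quantize_int8([0.5, -1.0, 0.25, 0.0])
print(q)  # [64, -127, 32, 0]
```

Storing weights as int8 cuts memory traffic to a quarter of float32 and lets the accelerator use small integer multipliers, which is where the silicon-cost savings come from.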
ISBN (print): 9781665442787
De novo assembly of genomes for which there is no reference is essential for novel species discovery and metagenomics. In this work, we accelerate two key performance bottlenecks of de Bruijn graph (DBG)-based assembly, graph construction and graph traversal, with a near-data processing (NDP) architecture based on 3D stacking. The proposed framework distributes key operations across NDP cores to exploit a high degree of parallelism and high memory bandwidth. We propose several optimizations based on domain-specific properties to improve the performance of our design. We integrate the proposed techniques into an existing DBG assembly tool, and our simulation-based evaluation shows that the proposed NDP implementation can improve the performance of graph construction by 33× and traversal by 16× compared to the state of the art.
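The graph-construction stage being accelerated can be illustrated with a serial sketch: each read is decomposed into overlapping k-mers, and edges connect the (k-1)-length prefix of each k-mer to its suffix. Names and the adjacency representation are illustrative assumptions, not the tool's implementation.

```python
# Illustrative serial sketch of de Bruijn graph (DBG) construction,
# the stage the NDP design parallelizes across cores: decompose each
# read into k-mers and link each k-mer's (k-1)-prefix to its suffix.

from collections import defaultdict

def build_dbg(reads, k):
    """Map each (k-1)-mer node to the set of its successor nodes."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

g = build_dbg(["ACGT", "CGTA"], k=3)
print(sorted((node, sorted(succ)) for node, succ in g.items()))
```

The inner loop is dominated by hash insertions over a huge key space, so it is memory-bandwidth-bound, which is why moving it next to memory pays off.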
Modern classification problems tackled using Decision Tree (DT) models often impose demanding constraints in terms of accuracy and scalability. These are often hard to achieve due to the ever-increasing volume of data used for training and testing. Bayesian approaches to DTs using Markov Chain Monte Carlo (MCMC) methods have demonstrated great accuracy in a wide range of applications. However, the inherently sequential nature of MCMC makes it unsuitable to meet both accuracy and scaling constraints. One could run multiple MCMC chains in an embarrassingly parallel fashion; despite the improved run-time, this approach sacrifices accuracy in exchange for strong scaling. Sequential Monte Carlo (SMC) samplers are another class of Bayesian inference methods that have the appealing property of being parallelizable without trading off accuracy. Nevertheless, finding an effective parallelization for the SMC sampler is difficult, due to the challenges in parallelizing its bottleneck, redistribution, in such a way that the workload is equally divided across the processing elements, especially when dealing with variable-size models such as DTs. This study presents a parallel SMC sampler for DTs on Shared Memory (SM) architectures, with an $O(\log_2 N)$ parallel redistribution for variable-size samples. On an SM machine with 32 cores, the experimental results show that our proposed method scales up to a factor of 16 compared to its serial implementation, and provides accuracy comparable to MCMC while being 51 times faster.
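The redistribution bottleneck mentioned above is the SMC resampling step: high-weight samples are replicated and low-weight ones dropped, keeping the population size fixed. The serial sketch below uses systematic resampling, a common choice and an assumption here, with a fixed offset instead of a random one so its output is deterministic; the paper's contribution is doing this step in $O(\log_2 N)$ parallel time for variable-size samples, which this sketch does not attempt.

```python
# Serial sketch of SMC redistribution (resampling): replicate samples
# in proportion to their normalized weights while keeping N fixed.
# Systematic resampling with a fixed offset u0 (normally drawn from
# U[0,1)) is used here so that the result is deterministic.

def systematic_resample(samples, weights, u0=0.5):
    """Redistribute N samples in proportion to their weights."""
    n = len(samples)
    total = sum(weights)
    positions = [(i + u0) / n for i in range(n)]   # evenly spaced points
    out, cumulative, j = [], weights[0] / total, 0
    for p in positions:
        while p > cumulative:                      # advance to the sample
            j += 1                                 # whose weight interval
            cumulative += weights[j] / total       # contains position p
        out.append(samples[j])
    return out

print(systematic_resample(["a", "b", "c", "d"], [0.7, 0.1, 0.1, 0.1]))
# ['a', 'a', 'a', 'c']
```

With equal weights every sample survives exactly once; as weights skew, copies concentrate on the heavy samples, which is what makes the memory traffic of this step irregular and hard to parallelize for variable-size DT samples.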