Search Results - Inner Mongolia University Library

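Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1 (subject term "library science" or "information science", author Fan Bingsi, years 1982-2016): (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2 (publication "Computer Applications and Software", any field C++ or Basic, excluding subject term Visual, years 2011-2016): P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016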
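
The field codes and operator rules above can be illustrated with a small string-building sketch. The helper names `term` and `combine` are hypothetical and are not part of any library interface; the code only assembles an expression in the documented syntax and does not send it anywhere.

```python
# Hypothetical helpers for composing expressions in the syntax described above.
# They only build strings; they do not call any library service.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return one FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str, group: bool = False) -> str:
    """Join parts with an uppercase operator padded by single spaces."""
    if operator not in OPERATORS:
        raise ValueError("operator must be AND, OR, or NOT in uppercase")
    joined = f" {operator} ".join(parts)
    return f"({joined})" if group else joined


# Rebuild the first search example from the help text.
subjects = combine("OR", term("K", "图书馆学"), term("K", "情报学"), group=True)
example_one = combine("AND", subjects, term("A", "范并思"), term("Y", "1982-2016"))
print(example_one)
# prints: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
```

Validating the operator against an uppercase set mirrors the rule that lowercase operators are not accepted.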


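Refine search results

Document type

4,280 records: Conference papers
47 records: Journal articles
3 volumes: Books

Collection scope

4,330 records: Electronic resources
0 titles: Print holdings

Subject classification

2,431 records: Engineering
- 2,167 records: Computer Science and Technology...
- 1,080 records: Software Engineering
- 607 records: Electrical Engineering
- 586 records: Information and Communication Engineering
- 185 records: Instrument Science and Technology
- 138 records: Control Science and Engineering
- 120 records: Electronic Science and Technology (may...
- 64 records: Bioengineering
- 59 records: Mechanical Engineering
- 43 records: Power Engineering and Engineering Thermo...
- 38 records: Transportation Engineering
- 31 records: Biomedical Engineering (may confer...
- 27 records: Safety Science and Engineering
- 26 records: Cyberspace Security
- 24 records: Optical Engineering
- 23 records: Civil Engineering
- 22 records: Materials Science and Engineering (may...
- 20 records: Architecture
- 19 records: Aeronautical and Astronautical Science and Tech...
418 records: Science
- 253 records: Mathematics
- 86 records: Physics
- 72 records: Biology
- 51 records: Systems Science
- 42 records: Statistics (may confer Science, ...
261 records: Management
- 179 records: Management Science and Engineering (may...
- 97 records: Library, Information and Archives Manage...
- 65 records: Business Administration
35 records: Medicine
- 30 records: Clinical Medicine
33 records: Law
- 19 records: Sociology
18 records: Economics
- 18 records: Applied Economics
11 records: Agriculture
7 records: Education
7 records: Literature
2 records: Military Science
1 record: Art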

Subject

543 records: parallel process...
259 records: computer archite...
238 records: computational mo...
207 records: distributed comp...
186 records: concurrent compu...
179 records: computer network...
164 records: network topology
157 records: hardware
157 records: graphics process...
148 records: routing
130 records: bandwidth
123 records: application soft...
120 records: neural networks
120 records: protocols
107 records: scalability
104 records: multicore proces...
103 records: throughput
102 records: algorithm design...
102 records: kernel
101 records: computer science

Institution

15 records: institute of inf...
14 records: institute of com...
14 records: science and tech...
13 records: college of compu...
13 records: school of cyber ...
13 records: shandong fundame...
12 records: school of comput...
10 records: science and tech...
9 records: univ chinese aca...
9 records: university of ch...
9 records: natl univ def te...
9 records: natl univ def te...
8 records: shenyang aerospa...
8 records: beijing univ pos...
8 records: nanjing univ aer...
8 records: beijing universi...
7 records: shanghai jiao to...
7 records: college of compu...
7 records: chinese acad sci...
7 records: national key lab...

Author

17 records: bagherzadeh nade...
16 records: kotenko igor
14 records: j. duato
12 records: wang yijie
11 records: daneshtalab maso...
10 records: ebrahimi masoume...
10 records: yijie wang
10 records: nader bagherzade...
10 records: igor kotenko
10 records: lexi xu
9 records: li dongsheng
9 records: marco aldinucci
9 records: matsutani hiroki
8 records: hiroki matsutani
8 records: masoumeh ebrahim...
8 records: plosila juha
8 records: marco danelutto
8 records: wang huaimin
7 records: tenhunen hannu
7 records: xinzhou cheng

Language

4,282 records: English
39 records: Other
11 records: Chinese
1 record: Ukrainian

Search condition: "Any field=Euromicro Conference on Parallel, Distributed and Network-Based Processing"

4,330 records in total; showing records 131-140


Straggler Mitigation in Distributed Deep Learning: A Cluster-based Hybrid Synchronization Approach

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Mustafa Burak Senyigit; Deniz Turgay Altilar
Affiliations: Department of Computer Engineering, Istanbul Technical University, Istanbul, Turkey; Aselsan A.S., Ankara, Turkey

ISBN (digital): 9798331524937

ISBN (print): 9798331524944

The rapid growth in model sizes and training datasets has led researchers to focus on distributed deep learning to accelerate the training process. Bulk Synchronous Parallel (BSP) and Asynchronous Parallel (ASP) are two fundamental synchronization paradigms employed in distributed training. BSP allows workers to iterate synchronously but is prone to the straggler problem. In contrast, ASP enables asynchronous iteration, but training with stale gradients can reduce statistical efficiency. This paper introduces a cluster-based, hierarchical, and hybrid synchronization scheme designed to mitigate the straggler effect and enhance resource utilization in heterogeneous training workloads. We define performance metrics for communication and computation capabilities of workers, and then cluster them based on their performance scores. The clusters are placed on a hierarchical tree where the slower clusters are placed at the deeper levels, and the performant clusters are positioned closer to the root. Workers within the same cluster adopt BSP utilizing ring allreduce, while inter-cluster communication is facilitated asynchronously through the master node in each cluster. This approach aims to minimize waiting times among workers and effectively overlap communication and computation. Experiments conducted on a toy CNN model and the Fashion MNIST dataset demonstrate that our method achieves convergence 1.76 and 1.93 times faster than BSP and ASP, respectively.

Keywords: Deep learning; Training; Metalearning; Computational modeling; Prevention and mitigation; Toy manufacturing industry; Computer architecture; Synchronization; Resource management; Convergence

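
The clustering step described in this abstract (score each worker, group workers with similar scores, put faster groups nearer the root) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not give the scoring formula or the clustering algorithm, so the weighted-sum `score` and the equal-size split in `cluster_by_score` are hypothetical stand-ins, as are all names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    compute: float    # assumed metric, higher is faster (e.g. samples/s)
    bandwidth: float  # assumed metric, higher is faster (e.g. MB/s)


def score(w: Worker, alpha: float = 0.5) -> float:
    """Assumed scoring rule: weighted sum of the two metrics."""
    return alpha * w.compute + (1.0 - alpha) * w.bandwidth


def cluster_by_score(workers: list[Worker], k: int) -> list[list[Worker]]:
    """Sort by score (fastest first) and cut into k contiguous groups.

    Index 0 holds the fastest cluster (closest to the root); higher
    indices stand for deeper tree levels.
    """
    if not 1 <= k <= len(workers):
        raise ValueError("k must be between 1 and the number of workers")
    ranked = sorted(workers, key=score, reverse=True)
    base, extra = divmod(len(ranked), k)
    clusters, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        clusters.append(ranked[start:start + size])
        start += size
    return clusters


workers = [
    Worker("a", 100, 100),
    Worker("b", 10, 10),
    Worker("c", 90, 80),
    Worker("d", 20, 5),
]
levels = [[w.name for w in c] for c in cluster_by_score(workers, 2)]
print(levels)  # prints: [['a', 'c'], ['d', 'b']]
```

In the scheme the abstract describes, each inner list would synchronize internally with BSP while the lists exchange updates asynchronously; that communication layer is outside this sketch.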

On-Demand and Automatic Deployment of Microservice at the Edge based on NGSI-LD

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Francesco Martella; Valeria Lukaj; Maria Fazio; Antonio Celesti; Massimo Villari
Affiliations: University of Messina, Italy

This paper focuses on a new approach to conceiving “virtual sensors” operating in smart environments, which are abstracted components able to map different behaviours on the same Internet of Things (IoT)-based infrastructures according to the needs of the high-level applications. To realize “virtual sensors”, it is necessary to codify user requests in an automation process for the deployment at the Edge of the microservices (MSs) that satisfy such requests. We present a solution that implements all the necessary functionalities to bind the user application with the Edge device in charge of executing the “virtual sensors”. In particular, the solution we propose is based on the FIWARE NGSI-LD information model, which helps us to standardize the communication among the different entities involved in the process. Moreover, the paper describes the reference architecture we designed, provides the implementation details of our first prototype, and reports the results of our evaluation experiments.


Assessing the Performance of Docker in Docker Containers for Microservice-based Architectures

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Felipe Bedinotto Fava; Luiz Felipe Laviola Leite; Luís Fernando Alves Da Silva; Pedro Ramires Da Silva Amalfi Costa; Angelo Gaspar Diniz Nogueira; Amanda Fagundes Gobus Lopes; Claudio Schepke; Diego Luis Kreutz; Rodrigo Brandão Mansilha
Affiliations: Graduate Software Engineering (PPGES) - Laboratory of Advanced Studies in Computation (LEA), Federal University of Pampa (UNIPAMPA), Alegrete, Brazil

We provide a comprehensive and updated assessment of Docker versus Docker in Docker (DinD), evaluating its impact on CPU, memory, disk, and network. Using different workloads, we evaluate DinD's performance across distinct hardware platforms and GNU/Linux distributions on cloud Infrastructure as a Service (IaaS) platforms like Google Compute Engine (GCE) and traditional server-based environments. We developed an automated tool suite to achieve our goal. We execute four well-known benchmarks on Docker and its nested-container variant. Our findings indicate that nested containers require up to 7 seconds for startup, while standard Docker containers require less than 0.5 seconds for Debian and Alpine operating systems. Our results suggest that Docker containers based on Debian consistently outperform their Alpine counterparts, showing lower CPU latency. A key distinction among these Docker images lies in the varying number of installed libraries (e.g., ranging from 13 to 119) across different Linux distributions for the same system (e.g., MySQL). Furthermore, the number of events and the CPU latency indicate that the overhead of DinD over Docker is insignificant for both operating systems. In terms of memory, running containers of Debian-based images consume 20% more memory than those based on Alpine. There are no significant differences between nested containers and standard Docker containers for disk and network I/O. It is worth emphasizing that some of the disparities, such as a bigger memory footprint, appear to be a direct result of the software stack in use, including different kernel versions, libraries, and other essential packages.


UAV Swarm Collaborative Transmission Optimization for Machine Learning Tasks

30th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2024

Authors: Chao, Liangchen; Zhang, Bo; Guo, Hengpeng; Ji, Fangheng; Li, Junfeng
Affiliations: Zhengzhou University, School of Cyber Science and Engineering, Zhengzhou, China

ISBN (print): 9798331515966

Recent developments in artificial intelligence technologies have seen an increasing volume of real-time data collected by unmanned aerial vehicles (UAVs) and processed by machine learning (ML) techniques instead of human labor. Traditional transmission techniques aiming at low loss rates are hence rendered ineffective for ML. In this paper, we investigate a collaborative transmission optimization method among a group of UAVs, with the goal of maximizing the efficiency of server-side ML tasks. Towards this end, we propose a novel network-coding-enabled multi-agent deep reinforcement learning approach named DC-MAPPO for the UAV ad hoc network with limited resources. Our method deploys random linear network coding for source packet coding, with an improved multi-agent proximal policy optimization algorithm combining the dual-clip method for broadcasting strategy optimization. We adopt a handwriting recognition algorithm based on the MNIST dataset to verify the effectiveness of DC-MAPPO. Simulation results demonstrate that DC-MAPPO outperforms baseline schemes in terms of rewards, rate of convergence, and recognition efficiency of ML algorithms. © 2024 IEEE.

Keywords: Network coding


Dynamic Resource Partitioning for Multi-Tenant Systolic Array based DNN Accelerator

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Midia Reshadi; David Gregg
Affiliations: School of Computer Science and Statistics, Lero, Trinity College Dublin, Dublin 2, Ireland

Deep neural networks (DNNs) have become a significant application on both cloud servers and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes a dynamic partitioning algorithm to perform concurrent processing of multiple DNNs on a systolic-array-based accelerator. Sharing an accelerator's storage and processing resources across multiple DNNs increases resource utilization and reduces computation time and energy consumption. To this end, we propose a partitioned weight stationary dataflow with a minor modification in the logic of the processing element. We evaluate the energy consumption and computation time with both heavy and light workloads. Simulation results show a 35% and 62% improvement in energy consumption and 56% and 44% in computation time under heavy and light workloads, respectively, compared with single tenancy.


Fast Maximal Independent Sets on Dynamic Graphs

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Prajjwal Nijhara; Aditya Trivedi; Dip Sankar Banerjee
Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, NH62, Karwar, Jodhpur, Rajasthan, India

ISBN (digital): 9798331524937

ISBN (print): 9798331524944

Finding the Maximal Independent Set (MIS) in a graph is a well-known problem with applications in resource allocation, load balancing, and routing optimization. This task is particularly challenging for large graphs as it requires multiple iterations over the entire set of vertices. Recently, there has been significant interest in developing techniques to maintain the MIS dynamically in evolving graphs rather than re-computing from scratch. In this paper, we propose new data structures and techniques for computing MIS in parallel on dynamic graphs. We specifically propose techniques to handle insertions and deletions in a batched setting. We conducted detailed experiments on shared memory multicore CPUs using graphs ranging from 50 million to 1.2 billion edges. Our results show that using our technique for insertions and deletions can provide up to 15.64x and 10.57x speedups on average over comparable baselines. Additionally, the final MIS we produce varies by only about 0.18% in cardinality compared to the existing state-of-the-art.

Keywords: Multicore processing; Scalability; Graphics processing units; Routing; Load management; Data structures; Real-time systems; Distance measurement; Resource management; Optimization

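
For readers unfamiliar with the problem in this abstract, a maximal independent set on a static graph can be computed with a simple sequential greedy scan. The sketch below shows only that textbook baseline; it is not the parallel, batched dynamic technique the paper proposes, and the function names are chosen here for illustration.

```python
def greedy_mis(adjacency: dict[int, set[int]]) -> set[int]:
    """Return a maximal independent set by scanning vertices in sorted order."""
    chosen: set[int] = set()
    blocked: set[int] = set()
    for v in sorted(adjacency):
        if v not in blocked:
            chosen.add(v)                 # v has no chosen neighbor yet
            blocked.add(v)
            blocked.update(adjacency[v])  # its neighbors can no longer join
    return chosen


def is_maximal_independent(adjacency: dict[int, set[int]], candidate: set[int]) -> bool:
    """Check independence (no edge inside) and maximality (no vertex can be added)."""
    independent = all(adjacency[v].isdisjoint(candidate) for v in candidate)
    maximal = all(v in candidate or adjacency[v] & candidate for v in adjacency)
    return independent and maximal


# Path graph 0 - 1 - 2 - 3 - 4 (undirected, no self-loops).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
mis = greedy_mis(path)
print(sorted(mis))                        # prints: [0, 2, 4]
print(is_maximal_independent(path, mis))  # prints: True
```

A dynamic algorithm avoids rerunning this full scan after every edge insertion or deletion, which is the cost the abstract refers to as re-computing from scratch.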

Recursive Broadcasting Approach

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Hovhannes A. Harutyunyan; Narek Hovhannisyan
Affiliations: Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada; NAS RA Institute for Informatics and Automation Problems, Yerevan, Armenia

ISBN (digital): 9798331524937

ISBN (print): 9798331524944

Broadcasting is one of the fundamental information dissemination primitives in interconnection networks, where a message is passed from one node (called the originator) to all other nodes in the network. Following the increasing interest in interconnection networks, extensive research was dedicated to broadcasting. Two main research goals of this area are finding inexpensive network structures that maintain efficient broadcasting and finding the broadcast time for well-known and widely used network topologies. In the scope of this study, we will mainly focus on determining the broadcast time and near-optimal broadcasting schemes in networks. Determination of the broadcast time of any node in an arbitrary network is known to be NP-hard. Polynomial time solutions are known only for a few network topologies. There also exist various heuristic and approximation algorithms for different network topologies. In this study, we consider the broadcast time problem on graphs that comprise some recursive structures. We initiate a novel direction in designing broadcasting algorithms on recursively defined graphs. We provide a theoretical foundation for future broadcasting studies, as well as discuss several practical applications of the approach we introduce.

Keywords: Multicast algorithms; Network topology; Multiprocessor interconnection; Heuristic algorithms; Broadcasting; Approximation algorithms; Polynomials; Topology


Jointly Trajectory Representation Learning on Road Network and Semantics using On-road IoT Data

30th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2024

Authors: Gao, Longfei; Zhao, Ying; Li, Jiajia; Zhang, Jing; Yang, Yu; Guo, Na
Affiliations: Shenyang Aerospace University, Shenyang, China; The Education University of Hong Kong, Hong Kong

ISBN (print): 9798331515966

Internet of Things (IoT) data provides rich data sources and application scenarios for trajectory representation learning. Trajectory representation learning aims to transform the original trajectory information into a general low-dimensional vector representation for many different downstream tasks (trajectory similarity calculation, anomaly detection, etc.). Current road network-based trajectory learning methods mainly focus on the spatial structure of the road network and often ignore the semantic information and complex feature information embedded in IoT data. In addition, the spatial and semantic properties of trajectories cannot be adequately preserved simultaneously. To this end, we propose a Trajectory Representation Learning framework based on Road network (TRLR). It first uses a graph attention network to learn the topological and semantic properties of road network segments. It then fuses node vectors and segment vectors as representations of node units, which supports more data input types and enhances node characteristics. Lastly, it learns the travel semantics of the trajectories through an information-enhanced transformer model, which captures the sequence information in the trajectory and generates the trajectory representation vector. We also propose four data augmentation methods to ensure that the trajectories can maintain both their spatial and semantic properties. To validate the effectiveness of our modeling approach, we conduct experiments on real datasets for similar trajectory search and mask prediction tasks. The experimental results demonstrate the performance improvement of our model. © 2024 IEEE.

Keywords: Vectors


A Cross-Platform OpenVX Library for FPGA Accelerators

29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Authors: Angelica Davila-Guzman, Maria; Gran Tejero, Ruben; Villarroya-Gaudo, Maria; Suarez Gracia, Dario; Kalms, Lester; Gohringer, Diana
Affiliations: Univ Zaragoza, DIIS, I3A, HiPEAC Network Excellence, Zaragoza, Spain; Tech Univ Dresden, HiPEAC Network Excellence, Dresden, Germany

ISBN (print): 9781665414555

In Computer Vision, open programming standards such as OpenVX have emerged to bring together portability and acceleration across devices. Unfortunately, achieving both goals on FPGAs remains a challenge because FPGAs still require adapting the code with proprietary extensions. Exclusively for Xilinx devices, the HiFlipVX open source library partially solves this problem by offering a clean C++ OpenVX API that offers the performance of proprietary extensions without exposing its complexity to the programmer. While HiFlipVX enables portability within Xilinx devices, portability between FPGA manufacturers remains an open challenge. This work extends HiFlipVX's capabilities with a twofold goal: i) to support Intel FPGA devices with different memory configurations, and ii) to enable execution on FPGAs as discrete accelerators. To accomplish these goals, the proposed implementation combines two HLS programming models: C++, using Intel's system of tasks, which makes it possible to coalesce nodes and reduce control overhead, and OpenCL, which provides efficient compute kernel nodes. On Intel FPGAs, compared with pure OpenCL implementations, the proposed implementation reduces kernel dispatch resources, saving up to 24% of ALUT resources for each kernel in a graph, and improves performance. Gains are 2.6x on average for representative applications, such as Canny edge detector or Census transform, compared with state-of-the-art frameworks.

Keywords: FPGA; HiFlipVX; HLS; OpenVX; OpenCL


Yggdrasil: Reducing Network I/O Tax with (CXL-Based) Distributed Shared Memory

53rd International Conference on Parallel Processing (ICPP)

Authors: Tang, Wenda; Han, Ying; Ai, Tianxiang; Li, Guanghui; Yu, Bin; Yang, Xin
Affiliations: China Telecom eSurfing Cloud, Beijing, Peoples R China

ISBN (print): 9798400717932

In communication-intensive applications that run on hosts with high-speed network hardware, a common challenge arises from the significant burden placed on the native socket system within the OS. Researchers have devoted considerable effort to optimizing the kernel networking stack and moving the TCP/IP stack to user-space. In this paper, we describe a novel socket replacement solution, Yggdrasil, a CXL-based user-space high-performance socket system. Yggdrasil is fully compatible with Linux socket, making it a drop-in replacement for existing applications without the need for code modifications. In order to optimize performance, Yggdrasil employs CXL-based distributed shared memory (DSM) for inter-host communication whenever it is available. In cases where DSM is not accessible, Yggdrasil transparently switches back to Linux socket for communication. A key element in achieving isolation in Yggdrasil involves a trusted user-space monitoring daemon responsible for managing control plane operations like connection setup and access control. Within the data plane of Yggdrasil, a peer-to-peer model is adopted for communication between processes. To bridge the semantic gap between socket and DSM, we exploit several techniques to ensure compatibility and performance, including (1) transparent dynamic fast/slow data path navigation, (2) decentralized CXL memory management, (3) lock-free queue based QoS-aware dynamic data polling, and (4) semantics-aware memory page migration. By evaluating Yggdrasil on both emulated and real CXL hardware, we show that Yggdrasil outperforms Linux socket in Memcached throughput by 8.2x and reduces latency by 24x to 320x in a micro benchmark across different message sizes.

Keywords: CXL; Socket; Rack-Scale Communication; Disaggregation; Memory Management

