This paper presents a streamlined approach to address the challenges of integrating heterogeneous clusters in deep learning (DL) accelerators. As the demand for scalable and high-performance computing in DL continues t...
In this paper, a novel SoC FPGA-based approach is proposed that accelerates the development of energy control applications for decentralized converter nodes that are controlled by one central unit. In this particular ...
ISBN:
(Print) 9781665409674
This paper explores and evaluates the potential of deep neural network (DNN)-based machine learning algorithms on embedded many-core processors in cyber-physical systems, such as self-driving systems. To run applications in embedded systems, a platform characterized by low power consumption with high accuracy and real-time performance is required. Furthermore, a platform is required that allows the coexistence of DNN applications and other applications, including conventional real-time control software, to enable advanced embedded systems such as self-driving systems. Clustered many-core processors, such as the Kalray MPPA3-80 Coolidge, can run multiple applications on a single platform because each cluster can run applications independently. Moreover, the MPPA3-80 integrates multiple arithmetic elements that operate at low frequencies, thereby enabling high performance and low power consumption comparable to those of embedded graphics processing units. Furthermore, the Kalray Neural Network (KaNN) code generator, a deep learning inference compiler for the MPPA3-80 platform, can efficiently perform DNN inference on the MPPA3-80. This paper evaluates DNN models, including You Only Look Once (YOLO)-based and Single Shot MultiBox Detector (SSD)-based models, on the MPPA3-80. The evaluation examines the frame rate and power consumption in relation to the size of the input image, the computational accuracy, and the number of clusters.
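A minimal timing sketch of the kind of measurement this evaluation reports (frame rate and average power), assuming hypothetical run_inference and read_power_w callables that stand in for the KaNN runtime call and the board's power telemetry, neither of which is described in the abstract:

```python
# Hedged sketch: time repeated inference on one image and sample power per frame.
# run_inference and read_power_w are placeholders for the actual KaNN runtime
# call and the MPPA3-80 board's power sensor, which are not shown here.
import time
import statistics

def benchmark(run_inference, read_power_w, image, n_frames=100):
    """Return (frames_per_second, mean_power_watts) over n_frames inferences."""
    power_samples = []
    start = time.perf_counter()
    for _ in range(n_frames):
        run_inference(image)                  # one forward pass on the accelerator
        power_samples.append(read_power_w())  # sample board power after each frame
    elapsed = time.perf_counter() - start
    return n_frames / elapsed, statistics.mean(power_samples)
```

Repeating this over different input resolutions and cluster counts would yield the frame-rate and power curves the evaluation examines.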
It is generally assumed that elastic parallel applications, with the ability to dynamically resize their process count, would provide numerous benefits to High-Performance Computing (HPC) systems and applications. Sup...
Capturing the fundamental qualities and properties of the local spinach variant entails a thorough investigation. Examining the spinach variant's unique morphology, nutritional makeup, flavor character, an...
To enhance the environmental adaptability of small-sized robots, jumping is commonly employed to achieve high mobility and obstacle-clearing capabilities. However, the prevalent problem of flipping in jumping robots l...
This Special Issue addresses the evolving landscape of big data generated by sensors, devices, and services. The shift from centralized cloud infrastructures to distributed systems that involve cloud, edge, and Internet of Things (IoT) devices requires innovative approaches to managing and analyzing big data. The key challenges include privacy, security, energy efficiency, data quality, and trust. This Special Issue invited researchers to submit innovative solutions covering topics such as: Big Data Analytics and Machine Learning; Integrated, Heterogeneous, and Distributed Infrastructures for Big Data Management; Big Data Platforms and Technologies; Real-Time Big Data Services and Applications; Big Data Security and Privacy Preservation; Big Data Quality and Trust; Trustworthy Data Sharing; Sustainability and Energy Efficiency of Big Data Storage and Computation; Big Data and Analytics for Healthcare; and Big Data Applications and Experiences. This initiative expands on discussions from the IEEE Big Data Service (BDS) 2023 conference held in Athens, Greece, reaching a broader audience of researchers.
ISBN:
(Print) 9798350373981; 9798350373974
Embedding artificial intelligence onto low-power devices is a challenging task that has been partially overcome by recent advances in machine learning and hardware design. Currently, deep neural networks can be deployed on embedded targets to perform various tasks such as speech recognition, object detection, or human activity recognition. However, it is still possible to further optimize deep neural networks on embedded devices. These optimizations mainly concern energy consumption, memory, and real-time constraints, but also easier deployment at the edge. In addition, there is still a need for a better understanding of what can be achieved for different use cases. This work focuses on the quantization and deployment of deep neural networks on low-power 32-bit microcontrollers. In this article, the quantization method is based on solving an integer optimization problem derived from the neural network model, concerning the accuracy of the computations and results at each point of the network. We evaluate the performance of our quantization method on a collection of neural networks, measuring the analysis time and the time-to-solution improvement between the floating-point and fixed-point networks, considering a typical embedded platform employing an STM32 Nucleo-144 microcontroller.
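A generic fixed-point quantization sketch for illustration; it is not the integer-optimization formulation the article describes, only the basic mechanism of choosing a per-tensor number of fractional bits so that values fit a given integer width:

```python
# Hedged illustration of power-of-two fixed-point quantization, not the paper's
# integer-programming method: pick the largest fractional bit count that still
# covers the tensor's dynamic range, then round and clip symmetrically.
import math
import numpy as np

def choose_fractional_bits(values, total_bits=8):
    """Largest fractional bit count keeping max |value| representable (1 sign bit)."""
    max_abs = float(np.max(np.abs(values))) or 1.0
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return max(0, total_bits - 1 - int_bits)

def quantize(values, frac_bits, total_bits=8):
    """Round to fixed point with frac_bits fractional bits and clip to the int range."""
    scale = 2 ** frac_bits
    q = np.round(np.asarray(values, dtype=np.float64) * scale)
    lim = 2 ** (total_bits - 1)
    return np.clip(q, -lim, lim - 1).astype(np.int32), scale

weights = np.random.randn(64).astype(np.float32)
frac = choose_fractional_bits(weights, total_bits=8)
q_weights, scale = quantize(weights, frac, total_bits=8)
reconstruction_error = np.max(np.abs(weights - q_weights / scale))
```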
ISBN:
(Print) 9798350346855
In recent studies of object detection and tracking, neural networks have been widely used, and their accuracy has improved. However, their computational complexity is very high and requires the use of high-end GPUs. In order to achieve real-time inference on edge devices, it is necessary to reduce the computational complexity of the network by scaling it down, but this leads to a loss of accuracy. To avoid this loss of accuracy, a method has been proposed in which object detection is performed using a neural network at regular intervals, and in the frames in between, the detected object positions are interpolated using motion prediction. In this research, we propose a method that improves the accuracy of interpolation even when the camera is moving, by using an affine transformation as employed in image stabilization. We also show a real-time computation method on the Jetson TX2, one of the lowest-power embedded GPUs. The proposed method enables real-time object detection using YOLOv5s and tracking of the detected objects at the edge.
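A sketch of the interpolation idea under stated assumptions (function names and parameters are illustrative, not taken from the paper): a global affine transform is estimated between consecutive frames with sparse optical flow, as image-stabilization pipelines do, and the last detected box positions are warped by it until the next full detection frame:

```python
# Hedged sketch: estimate camera motion as a partial affine transform and move
# the previously detected boxes accordingly between detection frames.
import cv2
import numpy as np

def shift_boxes_by_camera_motion(prev_gray, curr_gray, boxes):
    """boxes: Nx4 float array of [x, y, w, h]; returns boxes shifted by estimated motion."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=10)
    if pts is None:
        return boxes
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    ok = status.flatten() == 1
    if ok.sum() < 3:
        return boxes
    M, _ = cv2.estimateAffinePartial2D(pts[ok], nxt[ok])
    if M is None:
        return boxes
    centers = (boxes[:, :2] + boxes[:, 2:] / 2).reshape(-1, 1, 2).astype(np.float32)
    warped = cv2.transform(centers, M).reshape(-1, 2)     # apply the 2x3 affine
    moved = boxes.astype(np.float32).copy()
    moved[:, :2] = warped - boxes[:, 2:] / 2              # back to top-left corners
    return moved
```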
ISBN:
(Print) 9798350377712; 9798350377705
Networked robotic systems balance compute, power, and latency constraints in applications such as self-driving vehicles, drone swarms, and teleoperated surgery. A core problem in this domain is deciding when to offload a computationally expensive task to the cloud, a remote server, at the cost of communication latency. Task offloading algorithms often rely on precise knowledge of system-specific performance metrics, such as sensor data rates, network bandwidth, and machine learning model latency. While these metrics can be modeled during system design, uncertainties in connection quality, server load, and hardware conditions introduce real-time performance variations, hindering overall performance. We introduce PEERNet, an end-to-end and real-time profiling tool for cloud robotics. PEERNet enables performance monitoring on heterogeneous hardware through targeted yet adaptive profiling of system components such as sensors, networks, deep-learning pipelines, and devices. We showcase PEERNet's capabilities through networked robotics tasks, such as image-based teleoperation of a Franka Emika Panda arm and querying vision language models using an Nvidia Jetson Orin. PEERNet reveals non-intuitive behavior in robotic systems, such as asymmetric network transmission and bimodal language model output. Our evaluation underscores the effectiveness and importance of benchmarking in networked robotics, demonstrating PEERNet's adaptability. Our code is open-source and available at ***/UTAustin-SwarmLab/PEERNet.
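A minimal stage-timing sketch in the spirit of the profiling described here; it is not PEERNet's actual interface, only an illustration of timing the sensing, network, and inference stages of an offloading round trip separately so their latencies can be compared:

```python
# Hedged sketch of per-stage latency profiling for an offloading pipeline.
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)   # stage name -> list of latencies in seconds

@contextmanager
def stage(name):
    """Record the wall-clock duration of one pipeline stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)

# Usage (hypothetical pipeline objects), repeated over many frames:
# with stage("sensor.read"):      frame = camera.read()
# with stage("net.uplink"):       channel.send(frame)
# with stage("model.inference"):  result = model(frame)
# with stage("net.downlink"):     result = channel.receive()
```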