ISBN: (print) 9798350361766; 9798350361759
Hyperdimensional computing (HDC) is a lightweight machine learning paradigm. Because HDC relies on bitwise operations instead of matrix multiplications, it is commonly used for classification tasks on edge computing devices. Numerous hardware architectures have therefore been proposed to accelerate HDC applications. However, existing solutions lack flexibility, which prevents the deployment of HDC across a wide range of applications. In this paper, we propose a general-purpose HDC accelerator, called GP-HDCA, which is suitable for FPGA implementation. To enable the efficient implementation of encoders, which are the most critical components in HDC, we define an instruction set tailored to ease the use of the accelerator as a coprocessor. Synthesis results show that our accelerator, configured with a 32-bit integer size and a 32-bit vector slice, requires only 7% of the resources available on a ZedBoard. Finally, our results show that a 12x speedup is achieved when processing a language detection application, demonstrating the suitability of the architecture for edge computing.
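To illustrate why HDC encoders reduce to bitwise operations, the following is a minimal software sketch of character n-gram encoding for language detection. It is not the GP-HDCA instruction set or the authors' encoder: the dimensionality `DIM`, the helper names (`random_hv`, `rotate`, `encode_text`, `classify`), the trigram size, and the tie-breaking rule are all assumptions chosen for illustration.

```python
import random

DIM = 2048  # hypervector dimensionality (illustrative assumption)


def random_hv(rng):
    """Random binary hypervector stored as a Python int bit field."""
    return rng.getrandbits(DIM)


def rotate(hv, k=1):
    """Cyclic left rotation by k bits (the permutation operation)."""
    k %= DIM
    mask = (1 << DIM) - 1
    return ((hv << k) | (hv >> (DIM - k))) & mask


def encode_text(text, item_memory, n=3):
    """Bind each n-gram with XOR and rotation, then bundle by bitwise majority."""
    counts = [0] * DIM
    total = 0
    for i in range(len(text) - n + 1):
        gram = 0
        for j, ch in enumerate(text[i:i + n]):
            # Rotation encodes the position of the character inside the n-gram.
            gram ^= rotate(item_memory[ch], n - 1 - j)
        for b in range(DIM):
            counts[b] += (gram >> b) & 1
        total += 1
    hv = 0
    for b in range(DIM):
        if 2 * counts[b] > total:  # strict majority; ties resolve to 0 (assumption)
            hv |= 1 << b
    return hv


def hamming(a, b):
    """Number of differing bits between two hypervectors."""
    return bin(a ^ b).count("1")


def classify(query, prototypes, item_memory):
    """Return the label whose prototype is nearest in Hamming distance."""
    q = encode_text(query, item_memory)
    return min(prototypes, key=lambda label: hamming(q, prototypes[label]))


rng = random.Random(0)
item_memory = {ch: random_hv(rng) for ch in "abcdefghijklmnopqrstuvwxyz "}
prototypes = {
    "english": encode_text("the quick brown fox jumps over the lazy dog", item_memory),
    "spanish": encode_text("el rapido zorro marron salta sobre el perro perezoso", item_memory),
}
print(classify("the lazy dog jumps", prototypes, item_memory))
```

Every step here is an XOR, a rotation, a population count, or a per-bit counter update, which is the kind of workload a sliced bitwise datapath can process in parallel.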
ISBN: (digital) 9781665471770
ISBN: (print) 9781665471770
As drone technology penetrates ever more application domains, Machine Learning (ML) is becoming a key driver enabling intelligence in the sky. However, ML practitioners and drone application operators face several challenges when they want to test ML-driven drone applications early in the design phase. These include developing and configuring experiment use-cases on a robotics simulator, and collecting and assessing the desired KPIs, which can range from ML algorithm accuracy to drone resource utilization and the impact of "intelligence" on the drone's energy footprint. This demonstration showcases FlockAI, an open, modular-by-design framework that supports rapid deployment and repeatable testing of ML-driven drone applications during the design phase, using the Webots robotics simulator. Through realistic use-cases, the demonstration will show how FlockAI can be used to design drone testbeds with "ready-to-go" drone templates, deploy ML models, configure on-board or remote inference, and monitor and export drone resource utilization, network overhead, and energy consumption, in order to pinpoint performance inefficiencies and understand whether various trade-offs can be exploited.
ISBN: (print) 9798350377712; 9798350377705
While visual sensing is often the predominant modality for a robot to localize objects in the environment, tactile and force sensing become crucial when objects are occluded, poorly visible, or buried. However, existing works on locating buried objects rely solely on force measurements at a single contact point on the robot end-effector, making 3D localization very challenging. This paper presents an alternative approach using a tactile sensor that measures both normal and shear forces (i.e., 3-axis) at distributed points. Three Long Short-Term Memory (LSTM) models are trained with real-world data to perform real-time 3D localization (i.e., distance, direction, and depth) of an object buried within a granular material. Our experimental results suggest that measuring both normal and shear forces (instead of just normal forces) at distributed contact points (instead of only one point) is essential for the accurate 3D localization of buried objects.
ISBN: (print) 9798350383638; 9798350383645
Neuromorphic computing, exemplified by spiking neural networks (SNNs), seeks to replicate human brain functionality through event-driven processes, encoding information via spikes, and adopting biological learning principles. Its comparative advantage over traditional computing lies in the event-driven nature of its computations, which promises notably high energy efficiency. However, the hardware implementation of SNNs poses limitations for various applications. This study proposes an in-memory computing (IMC) approach that uses a Resistive RAM (RRAM) based crossbar array to accelerate the SNN algorithm. The investigation examines the accuracy of three network variants (fp32, fp16, and int8) that use different data types. Remarkably, reducing the data size to one fourth of the original increased accuracy by 1.17% after retraining. Additionally, quantizing the network from fp32 to 8-bit fixed point and using an RRAM crossbar array yielded savings of ~1634x in memory access energy, ~1636x in memory access latency, and ~132x in computation energy. Furthermore, using the RRAM crossbar array to accelerate the quantized SNN algorithm yielded a ~10x reduction in average power consumption per inference and ~159x savings in required area.
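The "one fourth" figure follows from storage width alone: an fp32 weight occupies 4 bytes and an 8-bit code occupies 1 byte. The sketch below shows one common way to map floating-point weights to 8-bit integer codes. The abstract does not specify the quantization scheme, so the symmetric per-tensor scaling and the function names `quantize_int8` and `dequantize` are assumptions for illustration only.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Storage ratio: 4 bytes per fp32 value versus 1 byte per int8 code.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(codes)
print(int8_bytes / fp32_bytes)
```

With this scheme the reconstruction error of each weight is bounded by half the scale step, which is why retraining after quantization can often recover accuracy.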
ISBN: (digital) 9781665471770
ISBN: (print) 9781665471770
In this paper, we demonstrate how Hyperledger Fabric, one of the most popular permissioned blockchains, can benefit from network-attached acceleration. The scalability and peak performance of Fabric are primarily limited by the bottlenecks in its block validation/commit phase. We propose Blockchain Machine, a hardware accelerator coupled with a hardware-friendly communication protocol, to act as the validator peer. It can be adapted to applications and their smart contracts, and it targets a server with a network-attached FPGA acceleration card. The Blockchain Machine retrieves blocks and transactions in hardware directly from the network interface and validates them through a configurable and efficient block-level and transaction-level pipeline. The validation results are then transferred to the host CPU, where non-bottleneck operations are executed. From our implementation integrated with Fabric v1.4 LTS, we observed up to a 12x speedup in block validation compared to a software-only validator peer, with a commit throughput of up to 68,900 tps. Our work provides an acceleration platform that will foster further research on hardware acceleration of permissioned blockchains.
ISBN: (print) 9798350387186; 9798350387179
This paper presents a type II charge-pump PLL operating at 23.8 GHz in 0.13 µm SiGe BiCMOS technology for distributed beamforming applications. The PLL includes a differential Colpitts VCO that generates a highly reliable phase-locked signal. It provides a locking range of 3.5 GHz and a loop bandwidth of 1 MHz with a phase margin of 77 degrees. Additionally, at a 1 MHz offset, the PLL exhibits a measured phase noise of -94 dBc/Hz.
ISBN: (print) 9798350359329; 9798350359312
With the rapid evolution of the Internet of Things (IoT), both the number of edge devices and the volume of data they generate are surging. These edge devices are increasingly equipped with AI processors that use deep learning to augment their data processing capabilities. However, in edge environments, traditional federated learning methods typically send multiple models to a central server for aggregation, which raises several difficult challenges, such as low data transmission efficiency, privacy concerns, and the threat of model poisoning attacks. In this paper, we introduce a distributed machine learning framework with a collaborative voting mechanism that integrates the results of adaptively pruned models on various end devices for edge computing. The main goals of this framework are to mitigate data privacy risks and to strengthen the system's resilience against model poisoning attacks. Additionally, an adaptive model pruning mechanism tailors diverse models to the limited computational resources available on end devices in order to enhance training efficiency. Experiments reveal that our framework not only effectively mitigates the impact of poisoning attacks but also provides superior efficiency and accuracy for edge computing compared with other prevalent federated learning methods.
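The abstract does not describe how the collaborative voting mechanism works internally. As a rough intuition for why combining device outputs, rather than averaging model weights, can limit the influence of a poisoned model, here is a sketch that assumes plain majority voting over per-device predictions. The function name `majority_vote` and the example labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most devices.

    Ties go to the label that was seen first, because Counter.most_common
    keeps insertion order among equal counts.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Five hypothetical devices classify the same sample; one model is poisoned.
device_predictions = ["cat", "cat", "dog", "cat", "cat"]
print(majority_vote(device_predictions))
```

Under this assumption a single compromised device cannot change the result unless it can also sway a majority of the honest devices.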
ISBN: (print) 9798350373981; 9798350373974
This paper introduces a new Distributed State Variable Estimation Algorithm (DSVEA). The algorithm is based on the Kalman filter and estimates the state variables of large-scale systems. The approach decomposes the large-scale state estimation problem into several interconnected local sub-estimation problems. To demonstrate its effectiveness, the DSVEA is applied to estimate the state variables of a large-scale power system consisting of five synchronous generators. Simulation results confirm the excellent performance of the DSVEA and support its potential for practical implementation.
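For readers unfamiliar with the building block, the sketch below shows one predict/update cycle of a standard scalar Kalman filter. It does not reproduce the DSVEA decomposition or the coupling between local estimators, which the abstract does not detail; the function name `kalman_step` and the parameter names (`a`, `h`, `q`, `r`) are assumptions following the usual textbook notation.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new measurement
    a, h : state transition and measurement coefficients
    q, r : process and measurement noise variances
    """
    # Predict: propagate the estimate and its uncertainty through the model.
    x_pred = a * x
    p_pred = a * p * a + q
    # Update: weight the measurement residual by the Kalman gain.
    k = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new


x, p = kalman_step(x=0.0, p=1.0, z=2.0)
print(x, p)
```

In a distributed scheme of the kind the abstract describes, each local sub-problem would run a filter like this on its own states while exchanging information with its neighbors.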
The continued growth of Distributed Energy Resources (DER) presents challenges to electrical grids, not the least of which is electrical protection. This article develops a classification criterion to identify the fau...
ISBN: (print) 9798350364279; 9798350364262
In this paper, we design and develop a new multimedia distribution platform built mainly on containerization and microservice architecture technologies. With our approach, multimedia service source code stored in a repository such as Git can be built into a container image for distribution and management, and delivered to the target edge device through a pipeline. In addition, distributed edge devices can be grouped into clusters with various connection profiles and used for services. Real-time monitoring functions are provided to ensure stable operation after a service is deployed. To implement this complex service platform, we follow the microservice architecture approach. Stable operation was confirmed during an operational test period of over a year. This technology is expected to help deploy multimedia services conveniently and quickly and to manage them stably and efficiently.