In this paper, we present a novel method of downlink precoding for cell-free massive multiple-input multiple-output (MIMO) systems using over-the-air (OTA) training. By drawing analogies between a cell-free massive MI...
ISBN (Print): 9781665460125
The accessibility of real-time operational data, along with breakthroughs in processing power, has promoted the use of Machine Learning (ML) applications in current power systems. Prediction of device failures, meteorological data, system outages, and demand are among the applications of ML in the electricity grid. In this paper, a Reinforcement Learning (RL) method is utilized to design an efficient energy management system for grid-tied Energy Storage Systems (ESS). We implement a Deep Q-Learning (DQL) approach using Artificial Neural Networks (ANNs) to design a microgrid controller system simulated in the PSCAD environment. The proposed on-grid controller coordinates the main grid, aggregated loads, renewable generation, and Advanced Energy Storage (AES). To reduce the cost of operating AESs, the designed controller takes the hourly energy market price into account in addition to physical system characteristics.
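The abstract leaves the controller's implementation open; the following is a minimal Deep Q-Learning sketch in the spirit described, assuming a discretized charge/idle/discharge action space, a state of (hour, state of charge, price), and a reward equal to negative energy cost. The environment dynamics in `step` and all hyperparameters are illustrative stand-ins, not the paper's PSCAD model.

```python
import random
import torch
import torch.nn as nn

# Illustrative DQL loop for grid-tied storage scheduling (not the paper's code).
ACTIONS = [-1.0, 0.0, 1.0]  # discharge / idle / charge (normalized power)

qnet = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma, eps = 0.95, 0.1

def step(state, action):
    """Hypothetical environment: simple SoC update and price-based cost."""
    hour, soc, price = state
    soc = min(max(soc + 0.1 * action, 0.0), 1.0)
    reward = -price * action                        # buying costs, selling earns
    next_price = 0.5 + 0.4 * torch.rand(1).item()   # stand-in for market data
    return ((hour + 1) % 24, soc, next_price), reward

state = (0, 0.5, 0.5)
for t in range(1000):
    s = torch.tensor(state, dtype=torch.float32)
    # Epsilon-greedy action selection over the learned Q-values.
    a = random.randrange(len(ACTIONS)) if random.random() < eps \
        else int(qnet(s).argmax())
    next_state, r = step(state, ACTIONS[a])
    s2 = torch.tensor(next_state, dtype=torch.float32)
    target = r + gamma * qnet(s2).max().detach()    # one-step TD target
    loss = (qnet(s)[a] - target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
    state = next_state
```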
ISBN (Print): 9783903176591
In order to exchange information between systems, the information must be encoded into a predefined data format and transferred via a protocol that the communicating parties have agreed upon. This works well if all parties follow the same protocol standard and use the same data description schemes. If systems use different data formats or protocols, then some sort of translation is required. Protocol and data format translation has previously been attempted through rule-based approaches, ontologies, and machine learning (ML) techniques. Owing to recent advances in AI/ML methods, tools, and infrastructure, the accuracy and feasibility of "translation" with ML approaches have improved significantly. This paper introduces a generic approach and methodology for translating data formats and protocols with ML-based methods and presents our initial results through JSON-XML and JSON-SenML translation.
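The paper's model architecture is not given in the abstract; as a minimal illustration of how paired training data for such a translator could be produced, the sketch below derives (JSON, XML) pairs with a simple recursive converter. The converter itself is a rule-based stand-in for generating supervision, not the ML approach the paper proposes, and the sample record is hypothetical.

```python
import json

def json_to_xml(obj, tag="root"):
    """Recursively map a JSON value to an XML string (for pairing only)."""
    if isinstance(obj, dict):
        inner = "".join(json_to_xml(v, k) for k, v in obj.items())
        return f"<{tag}>{inner}</{tag}>"
    if isinstance(obj, list):
        return "".join(json_to_xml(v, tag) for v in obj)
    return f"<{tag}>{obj}</{tag}>"

# A (source, target) pair a hypothetical seq2seq translator could train on.
sample = {"sensor": {"name": "temp1", "value": 21.5, "unit": "Cel"}}
pair = (json.dumps(sample), json_to_xml(sample))
print(pair[0])
print(pair[1])
```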
ISBN (Print): 9781665481069
Temporal Graph Neural Networks (TGNNs) are powerful models for capturing temporal, structural, and contextual information on temporal graphs. The generated temporal node embeddings outperform other methods in many downstream tasks. Real-world applications require high-performance inference on real-time streaming dynamic graphs. However, these models usually rely on complex attention mechanisms to capture relationships between temporal neighbors. In addition, maintaining vertex memory suffers from an intrinsic temporal data dependency that hinders task-level parallelism, making it inefficient on general-purpose processors. In this work, we present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs. The key modeling optimizations we propose include a lightweight method to compute attention scores and a related temporal neighbor pruning strategy to further reduce computation and memory accesses. These are holistically coupled with key hardware optimizations that leverage FPGA hardware. We replace the temporal sampler with an on-chip FIFO-based hardware sampler and the time encoder with a lookup table. We train our simplified models using knowledge distillation to ensure similar accuracy vis-à-vis the original model. Taking advantage of the model optimizations, we propose a principled hardware architecture using batching, pipelining, and prefetching techniques to further improve performance. We also propose a hardware mechanism to ensure chronological vertex updating without sacrificing computation parallelism. We evaluate the performance of the proposed hardware accelerator on three real-world datasets. The proposed model reduces the computation complexity by 84% and memory accesses by 67% with less than 0.33% accuracy loss. Compared with CPU/GPU, our FPGA accelerator achieves a 16.4x/2.3x speedup in latency, and a 0.27% improvement in accuracy compared with the state-of-the-art inference algorithm. To the best of our knowledge, this...
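The abstract names, but does not specify, the lightweight attention and pruning scheme; the sketch below is one plausible reading, assuming plain dot-product scoring with exponential time decay and recency-based top-k neighbor pruning. Both choices, and the function `light_attention`, are assumptions for illustration rather than the paper's exact design.

```python
import numpy as np

def light_attention(q, neighbor_feats, neighbor_times, now, k=8, decay=0.1):
    """Illustrative lightweight attention over temporal neighbors.

    Keeps only the k most recent neighbors (pruning), scores them with a
    dot product damped by time decay, and aggregates with a softmax.
    """
    order = np.argsort(now - neighbor_times)[:k]   # prune to k most recent
    feats, times = neighbor_feats[order], neighbor_times[order]
    scores = (feats @ q) * np.exp(-decay * (now - times))
    weights = np.exp(scores - scores.max())        # numerically stable softmax
    weights /= weights.sum()
    return weights @ feats                         # aggregated embedding

rng = np.random.default_rng(0)
emb = light_attention(rng.normal(size=16),
                      rng.normal(size=(32, 16)),
                      rng.uniform(0, 100, size=32), now=100.0)
```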
Electroluminescence (EL) imaging has long been a standard tool for detecting defects in the photovoltaic (PV) industry. In recent years, deep learning technology has developed rapidly and been applied to in-line inspection of EL images of PV modules, but not without challenges. In addition to imbalanced data, the recall rate and inspection speed for detecting defects of PV modules are unable to meet industrial expectations. Therefore, this study proposes a hierarchical inspection system to address these challenges. To alleviate the problem of imbalanced data, an EL image of sc-Si solar modules is cropped into images of single solar cells. This allows us to reduce the processing size of a large EL image without overlooking small defects and to increase the number of data samples for learning fine-to-coarse defect features. To speed up detection, we design two lightweight convolutional neural network (CNN) models according to feature map analysis. Experimental results show the precision and recall rate of our CNN models on the testing dataset reach 99.36% and 98.77%, respectively, which are confirmed by t-distributed stochastic neighbor embedding visualization. With the same detection discriminability, our model needs only 49.63% of the detection time of ResNet-50. After alleviating the problem of imbalanced data, the performance of this hierarchical inspection system meets the requirements for in-line inspection in the industry.
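As a minimal illustration of the cropping step described above, the sketch below splits a module-level EL image into single-cell crops. The 6x10 cell grid is an assumed layout for a typical sc-Si module; the paper's exact geometry and preprocessing are not given in the abstract.

```python
import numpy as np

def crop_module_to_cells(el_image, rows=6, cols=10):
    """Split a module-level EL image into single-cell crops (assumed grid)."""
    h, w = el_image.shape[:2]
    ch, cw = h // rows, w // cols
    return [el_image[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
            for r in range(rows) for c in range(cols)]

module = np.zeros((1800, 3000), dtype=np.uint8)  # stand-in EL image
cells = crop_module_to_cells(module)             # 60 cell images for the CNNs
```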
One of the most important steps in the manufacturing of electronic goods is the creation of printed circuit boards (PCBs). An electronic device's PCBs are the first stage in production, so even a small error there ...
The success of deep neural networks suggests that cognition may emerge from indecipherable patterns of distributed neural activity. Yet these networks are pattern-matching black boxes that cannot simulate higher cognitive functions and lack numerous neurobiological features. Accordingly, they are currently insufficient computational models for understanding neural information processing. Here, we show how neural circuits can directly encode cognitive processes via simple neurobiological principles. To illustrate, we implemented this model in a non-gradient-based machine learning algorithm to train deep neural networks called essence neural networks (ENNs). Neural information processing in ENNs is intrinsically explainable, even on benchmark computer vision tasks. ENNs can also simulate higher cognitive functions such as deliberation, symbolic reasoning and out-of-distribution generalization. ENNs display network properties associated with the brain, such as modularity, distributed and localist firing, and adversarial robustness. ENNs establish a broad computational framework to decipher the neural basis of cognition and pursue artificial general intelligence.
Cryptocurrency phishing scams are a significant threat to Ethereum, one of the most popular blockchain platforms. Most existing Ethereum phishing detection methods are based on traditional machine learning or graph r...
This paper proposes a fully decentralized federated learning (FL) scheme for Internet of Everything (IoE) devices that are connected via multi-hop networks. Because FL algorithms can hardly guarantee convergence of the parameters of machine learning (ML) models, this paper focuses on the convergence of ML models in function spaces. Considering that representative loss functions of ML tasks, e.g., mean squared error (MSE) and Kullback-Leibler (KL) divergence, are convex functionals, algorithms that directly update functions in function spaces could converge to the optimal solution. The key concept of this paper is to tailor a consensus-based optimization algorithm to work in the function space and achieve the global optimum in a distributed manner. This paper first analyzes the convergence of the proposed algorithm in a function space, which is referred to as a meta-algorithm, and shows that spectral graph theory can be applied to the function space in a manner similar to that of numerical vectors. Then, consensus-based multi-hop federated distillation (CMFD) is developed for a neural network (NN) to implement the meta-algorithm. CMFD leverages knowledge distillation to realize function aggregation among adjacent devices without parameter averaging. An advantage of CMFD is that it works even with different NN models among the distributed learners. Although CMFD does not perfectly reflect the behavior of the meta-algorithm, the discussion of the meta-algorithm's convergence property promotes an intuitive understanding of CMFD, and simulation evaluations show that NN models converge using CMFD for several tasks. The simulation results also show that CMFD achieves higher accuracy than parameter aggregation for weakly connected networks and is more stable than parameter aggregation methods.
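CMFD's exact update rule is not reproduced in the abstract; the sketch below only illustrates the underlying idea of function-space aggregation via distillation, where each device fits its model's outputs on shared inputs to the average of its neighbors' outputs, so no parameters are averaged and architectures may differ. The three-device topology, MSE distillation loss, and function names are assumptions for illustration.

```python
import torch
import torch.nn as nn

def distill_to_neighbors(model, neighbor_models, shared_x, lr=1e-2, steps=10):
    """Fit model(shared_x) to the mean of neighbors' outputs (function space)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    with torch.no_grad():
        target = torch.stack([m(shared_x) for m in neighbor_models]).mean(0)
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(shared_x), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Heterogeneous models on adjacent devices (assumed fully connected here).
models = [nn.Sequential(nn.Linear(4, h), nn.Tanh(), nn.Linear(h, 1))
          for h in (8, 16, 32)]
shared_x = torch.randn(64, 4)  # shared unlabeled inputs for distillation
for i, m in enumerate(models):
    neighbors = [models[j] for j in range(len(models)) if j != i]
    distill_to_neighbors(m, neighbors, shared_x)
```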
ISBN (Print): 9798350339864
Deep neural networks (DNNs) are increasingly popular owing to their ability to solve complex problems such as image recognition, autonomous driving, and natural language processing. Their growing complexity, coupled with the use of larger volumes of training data (to achieve acceptable accuracy), has warranted the use of GPUs and other accelerators. Such accelerators are typically expensive, with users having to pay a high upfront cost to acquire them. For infrequent use, users can instead leverage the public cloud to mitigate the high acquisition cost. However, with the wide diversity of hardware instances (particularly GPU instances) available in the public cloud, it becomes challenging for a user to make an appropriate choice from a cost/performance standpoint. In this work, we try to address this problem by (i) introducing a comprehensive distributed deep learning (DDL) profiler, Stash, which determines the various execution stalls that DDL suffers from, and (ii) using Stash to extensively characterize various public cloud GPU instances by running popular DNN models on them. Specifically, it estimates two types of communication stalls, namely, interconnect and network stalls, that play a dominant role in DDL execution time. Stash is implemented on top of prior work, DS-analyzer, which computes only the CPU and disk stalls. Using our detailed stall characterization, we list the advantages and shortcomings of public cloud GPU instances to help users make an informed decision. Our characterization results indicate that the more expensive GPU instances may not be the most performant for all DNN models and that AWS can sometimes sub-optimally allocate hardware interconnect resources. Specifically, the intra-machine interconnect can introduce communication overheads of up to 90% of DNN training time, and the network-connected instances can suffer from up to 5x slowdown compared to training on a single instance. Furthermore, (iii) we also model the impact of DNN m...
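Stash's internals are only summarized above; the fragment below merely illustrates the accounting idea behind such a profiler, namely expressing compute and each stall source as a fraction of per-iteration time. The function name and the timing figures are hypothetical, not Stash's API or measurements.

```python
def stall_breakdown(total_s, compute_s, data_s, interconnect_s, network_s):
    """Express compute and each stall source as a fraction of iteration time."""
    parts = {"compute": compute_s, "cpu/disk stall": data_s,
             "interconnect stall": interconnect_s, "network stall": network_s}
    return {name: t / total_s for name, t in parts.items()}

# Hypothetical timings (seconds per training iteration) on a GPU instance.
print(stall_breakdown(total_s=1.0, compute_s=0.45, data_s=0.05,
                      interconnect_s=0.30, network_s=0.20))
```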