The widespread use of the Internet and digital services has significantly increased data collection and processing. Critical domains like healthcare rely on this data, but privacy and security concerns limit its usability, constraining the performance of intelligent systems, particularly those leveraging neural networks (NNs). NNs require high-quality data for optimal performance, but existing privacy-preserving methods, such as Federated Learning and Differential Privacy, often degrade model accuracy. While Homomorphic Encryption (HE) has emerged as a promising alternative, existing HE-based methods face challenges in computational efficiency and scalability, limiting their real-world application. To address these issues, we introduce ENNigma, a novel framework employing state-of-the-art Fully Homomorphic Encryption (FHE) techniques. This framework introduces optimizations that significantly improve the speed and accuracy of encrypted NN operations. Experiments conducted using the CIC-DDoS2019 dataset, a benchmark for Distributed Denial of Service attack detection, demonstrate ENNigma's effectiveness. The framework achieves classification performance with a maximum relative error of 1.01% compared to non-private models, while reducing multiplication time by up to 59% compared to existing FHE-based approaches. These results highlight ENNigma's potential for practical, privacy-preserving neural network applications.
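The abstract does not include ENNigma's code, so the following is only a minimal plaintext sketch of one standard ingredient of FHE-based inference it alludes to: replacing a non-polynomial activation with a low-degree polynomial (the kind of function schemes such as CKKS can evaluate) and measuring the relative error this introduces against the unmodified model. All names, weights, and the degree-2 coefficients are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))          # toy batch of 16-dimensional inputs
W = rng.normal(size=(16, 8)) * 0.1      # one dense layer with illustrative weights
b = rng.normal(size=8) * 0.01

def relu(z):
    return np.maximum(z, 0.0)

def poly_act(z):
    # degree-2 stand-in for ReLU, the kind of polynomial an FHE scheme can evaluate
    return 0.25 + 0.5 * z + 0.125 * z ** 2

plain = relu(x @ W + b)                  # what a non-private model would compute
fhe_like = poly_act(x @ W + b)           # what an encrypted evaluation could compute

rel_err = np.abs(fhe_like - plain).max() / (np.abs(plain).max() + 1e-12)
print(f"max relative error introduced by the polynomial activation: {rel_err:.2%}")
```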
Neurodynamic observations indicate that the cerebral cortex evolved by self-organizing into functional networks. These networks, or distributed clusters of regions, display varying degrees of attention depending on the input. Traditionally, the study of network self-organization has relied predominantly on static data, overlooking the temporal information in dynamic neuromorphic data. This paper proposes a Temporal Self-Organizing (TSO) method for neuromorphic data processing using a spiking neural network. The TSO method incorporates information from multiple time steps into the selection strategy of the Best Matching Unit (BMU) neurons. It enables the coupled BMUs to radiate their weights across the same layer of neurons, ultimately forming a hierarchical self-organizing topographic map of attention. Additionally, we simulate real neuronal dynamics, introduce a glial cell-mediated Glial-LIF (Leaky Integrate-and-Fire) model, and adjust multiple levels of BMUs to optimize the attention topological map. Experiments demonstrate that the proposed Self-organizing Glial Spiking Neural Network (SG-SNN) can generate attention topographies for dynamic event data from coarse to fine. A heuristic method based on cognitive science effectively guides the network's distribution of excitatory regions. Furthermore, the SG-SNN shows improved accuracy on three standard neuromorphic datasets: DVS128-Gesture, CIFAR10-DVS, and N-Caltech 101, with accuracy improvements of 0.3%, 2.4%, and 0.54%, respectively. Notably, the recognition accuracy on the DVS128-Gesture dataset reaches 99.3%, achieving state-of-the-art (SOTA) performance.
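The abstract does not spell out the TSO update rule, so the snippet below is a small NumPy sketch, under assumed notation, of its central idea: accumulating distances over several time steps before selecting the Best Matching Unit, then radiating a neighbourhood-weighted update across the same layer. Grid size, learning rate, and neighbourhood width are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, dim, T = 64, 32, 5
weights = rng.normal(size=(n_neurons, dim))      # one self-organizing layer
event_frames = rng.normal(size=(T, dim))         # T consecutive frames of one event sample

# accumulate distances over multiple time steps before picking the BMU
dist = np.zeros(n_neurons)
for t in range(T):
    dist += np.linalg.norm(weights - event_frames[t], axis=1)
bmu = int(np.argmin(dist))

# radiate the weight update across the same layer with a Gaussian neighbourhood
grid = np.arange(n_neurons)
neighbourhood = np.exp(-((grid - bmu) ** 2) / (2 * 3.0 ** 2))
lr = 0.1
weights += lr * neighbourhood[:, None] * (event_frames.mean(axis=0) - weights)
```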
At present, the commonly used active and passive islanding detection methods each have shortcomings, and their detection performance struggles to meet requirements. Therefore, this paper proposes a new detection method that combines wavelet signal processing and artificial intelligence to identify islanding. In this method, the feature quantities required for islanding detection are obtained by wavelet transform and signal processing, and these feature quantities are then fed to a neural network to determine whether islanding has occurred in the distributed generation system. The wavelet transform has a strong ability to extract signal features, while the neural network has strong learning and identification abilities, so combining the two improves the success rate of islanding detection. Simulation results show that the proposed islanding detection method can detect islanding quickly and accurately, and the detection performance is significantly improved.
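As a concrete illustration of the wavelet-plus-neural-network pipeline described above, here is a minimal sketch using PyWavelets and scikit-learn (tooling chosen for brevity, not necessarily what the paper used); the detail-band energy features and the synthetic "grid-connected vs. islanded" windows are illustrative stand-ins for the real feature quantities and measurements.

```python
import numpy as np
import pywt
from sklearn.neural_network import MLPClassifier

def wavelet_features(signal, wavelet="db4", level=4):
    """Energy of each detail band from a multilevel wavelet decomposition."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs[1:]])   # skip the approximation band

rng = np.random.default_rng(2)
# toy voltage windows: class 0 = grid-connected, class 1 = islanded (synthetic stand-ins)
X = np.array([wavelet_features(rng.normal(scale=s, size=512))
              for s in (1.0,) * 50 + (3.0,) * 50])
y = np.array([0] * 50 + [1] * 50)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)
print("training accuracy on the synthetic windows:", clf.score(X, y))
```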
Deep neural networks are gaining importance and popularity in applications and research. Due to the enormous number of learnable parameters and the size of datasets, the training of neural networks is computationally expensive. Parallel and distributed computation-based strategies are used to accelerate this training process. Generative Adversarial Networks (GANs) are a recent technological achievement in deep learning. These generative models are computationally expensive because a GAN consists of two neural networks and trains on enormous datasets. Typically, a GAN is trained on a single device. Existing deep learning accelerator designs are challenged by the unique properties of GANs, such as the enormous number of computation stages with non-traditional convolution operations. This work addresses the issue of distributing GANs so that they can train on datasets distributed over many TPUs (Tensor Processing Units); distributed training accelerates the learning process and decreases computation cost. In this paper, a Generative Adversarial Network is accelerated using a distributed multi-core TPU in a distributed, data-parallel synchronous setting. For adequate acceleration of the GAN, the data-parallel SGD (Stochastic Gradient Descent) model is implemented on a multi-core TPU using distributed TensorFlow with mixed precision, bfloat16, and XLA (Accelerated Linear Algebra). The study was conducted on the MNIST dataset for batch sizes varying from 64 to 512 over 30 epochs with distributed SGD on TPU v3 with 128×128 systolic arrays. The extensive batch technique is implemented in bfloat16 to decrease storage cost and speed up floating-point computations. The accelerated learning curve for the generator and discriminator networks is presented, and the training time was reduced by 79% by varying the batch size from 64 to 512 on the multi-core TPU.
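The paper's training scripts are not given; the sketch below shows, under assumptions about the environment, how a data-parallel synchronous GAN step with bfloat16 mixed precision and XLA could be set up on a multi-core TPU with TensorFlow. The TPU resolver arguments, model architectures, and hyperparameters are placeholders, and in practice the step would be launched per replica via strategy.run inside a distributed dataset loop.

```python
import tensorflow as tf

# Assumes the script runs where a TPU is reachable (e.g. a Cloud TPU VM);
# resolver arguments are environment-specific and omitted here.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")   # bfloat16 mixed precision

with strategy.scope():
    # placeholder generator/discriminator; the paper's architectures are not specified
    generator = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(100,)),
        tf.keras.layers.Dense(28 * 28, activation="tanh"),
    ])
    discriminator = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(28 * 28,)),
        tf.keras.layers.Dense(1),
    ])
    g_opt = tf.keras.optimizers.SGD(0.01)   # data-parallel synchronous SGD
    d_opt = tf.keras.optimizers.SGD(0.01)

@tf.function(jit_compile=True)   # XLA compilation of the per-replica step
def train_step(noise, real_images):
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    with tf.GradientTape(persistent=True) as tape:
        fake = generator(noise, training=True)
        d_real = discriminator(real_images, training=True)
        d_fake = discriminator(fake, training=True)
        d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
        g_loss = bce(tf.ones_like(d_fake), d_fake)
    d_opt.apply_gradients(zip(tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
# invoke as strategy.run(train_step, args=(noise_batch, image_batch)) over a distributed dataset
```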
Fiber-optic acoustic sensors, such as interferometric fiber-optic sensor arrays and phase-sensitive optical time-domain reflectometry, have excellent sensitivity and have been widely adopted in applications. However, their sampling rates are limited by the round-trip time of the laser in the sensor array or the sensing fiber, which ultimately restricts the dynamic range due to phase wrapping. Classical phase unwrapping requires the signal to obey the Itoh condition; otherwise, recovery of the true phase of strongly swinging signals fails. In this work, we propose a neural network for phase unwrapping of interferometric sensing signals, named the Prearranged Ascending Receptive Field Transformer (PARFT). A Transformer architecture with regressive output is employed, which is well suited to time-domain signal processing, and 1D convolution layers with ascending kernel sizes are designed to replace the positional encoding of the standard Transformer, providing inductive biases tailored to the phase unwrapping problem. On both a simulated dataset generated via the random matrix enlargement (RME) method and a real dataset recorded by a distributed acoustic sensor (DAS), the network shows high efficiency and competitive accuracy in unwrapping signals that violate the Itoh condition, compared with previously proposed neural networks and handcrafted algorithms.
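To make the role of the Itoh condition concrete, the short NumPy example below constructs a strongly swinging phase whose sample-to-sample increments exceed pi, wraps it, and shows that classical unwrapping (numpy.unwrap) then fails to recover the true phase; the signal parameters are arbitrary illustrations, not the paper's RME or DAS data.

```python
import numpy as np

# a "strongly swinging" phase: successive true-phase increments exceed pi,
# so the Itoh condition |phi[k+1] - phi[k]| < pi is violated
t = np.linspace(0.0, 1.0, 200)
true_phase = 40.0 * np.sin(2 * np.pi * 3 * t)          # radians, illustrative amplitude

wrapped = np.angle(np.exp(1j * true_phase))            # what the interferometer measures
itoh_ok = np.all(np.abs(np.diff(true_phase)) < np.pi)
recovered = np.unwrap(wrapped)                         # classical unwrapping

print("Itoh condition satisfied:", bool(itoh_ok))
print("max recovery error (rad):", np.max(np.abs(recovered - true_phase)))
```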
This paper presents a novel approach for executing the inference of a network of pre-trained deep neural networks (DNNs) on commercial off-the-shelf devices deployed at the edge. The problem is to partition the computation of the DNNs between an energy-constrained, performance-limited edge device E and an energy-unconstrained, higher-performance device C, referred to as the cloudlet, with the objective of minimizing the energy consumption of E subject to a deadline constraint. The proposed partitioning algorithm takes into account the performance profiles of executing DNNs on the devices, the power consumption profiles, and the variability in the delay of the wireless channel. The algorithm is demonstrated on a platform consisting of an NVIDIA Jetson Nano as the edge device E and a Dell workstation with a Titan Xp GPU as the cloudlet. Experimental results show significant improvements both in the energy consumption of E and in the processing delay of the application. Additionally, it is shown how the energy-optimal solution changes when the deadline constraint is altered. Moreover, the decision-making overhead of the proposed method is significantly lower than that of state-of-the-art Integer Linear Programming (ILP) solutions.
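The partitioning algorithm itself is not reproduced in the abstract; the following brute-force sketch only illustrates the optimization it solves, choosing the layer cut that minimizes the edge device's energy subject to a deadline, with every per-layer time, energy, transfer size, and channel figure being a made-up placeholder rather than a Jetson Nano / Titan Xp measurement.

```python
layer_time_edge   = [12.0, 30.0, 45.0, 20.0]   # ms to run each layer on the edge device E
layer_energy_edge = [0.10, 0.25, 0.40, 0.15]   # J to run each layer on E
layer_time_cloud  = [2.0, 5.0, 7.0, 3.0]       # ms to run each layer on the cloudlet C
transfer_mb       = [0.6, 0.4, 0.2, 0.1]       # MB uploaded after running `cut` layers locally (index 0 = raw input)
uplink_mbps, tx_power_w, deadline_ms = 50.0, 1.5, 100.0

best = None
for cut in range(len(layer_time_edge) + 1):     # first `cut` layers stay on E, the rest go to C
    tx_ms = 0.0 if cut == len(layer_time_edge) else transfer_mb[cut] * 8.0 / uplink_mbps * 1e3
    latency = sum(layer_time_edge[:cut]) + tx_ms + sum(layer_time_cloud[cut:])
    energy = sum(layer_energy_edge[:cut]) + tx_power_w * tx_ms / 1e3   # only E's energy counts
    if latency <= deadline_ms and (best is None or energy < best[1]):
        best = (cut, energy, latency)

print("layers kept on edge, edge energy (J), end-to-end latency (ms):", best)
```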
Authors:
Duan, Lian; Ma, Bowen; Zou, Weiwen (Shanghai Jiao Tong University)
Intelligent Microwave Lightwave Integration Innovation Center, Department of Electronic Engineering, State Key Lab of Advanced Optical Communication Systems & Networks, 800 Dongchuan Rd, Shanghai 200240, People's Republic of China
Time-of-flight (ToF) signal processing has become increasingly crucial in depth-perception applications. We propose a photonic spike processing method for ToF signals based on synaptic delay plasticity, which adjusts the spike timing of encoded signals to achieve low-latency processing without the need for weight loading and control. The method employs photonic neurons that directly encode the optical pulses of ToF signals into temporal spike sequences, eliminating the need for a time-to-digital converter (TDC). We use tunable optical delay lines to emulate the photonic synaptic regulation of spike timing. In addition, we demonstrate the efficacy of a photonic spiking neural network that trains the synaptic delay parameters on the ModelNet dataset, achieving an accuracy of 96.36%. In experiments, the processing delay of ToF signals is 58.66 ns, a reduction of two orders of magnitude compared with traditional TDC-based methods. This approach facilitates applying synaptic diversity in photonic neuromorphic information processing.
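As a rough, electronic-domain analogue of the synaptic-delay idea (not the photonic implementation itself), the toy below shifts each input spike time by a learned per-synapse delay and fires the output neuron when the delayed spikes coincide within a small window; all times and delays are illustrative.

```python
import numpy as np

spike_times = np.array([10.0, 22.0, 35.0])      # ns, arrival times of ToF echo pulses
syn_delay   = np.array([25.0, 13.0, 0.0])       # ns, trained per-synapse delays

delayed = spike_times + syn_delay               # delay plasticity shifts each spike in time
coincidence_window_ns = 1.0
fires = np.ptp(delayed) <= coincidence_window_ns   # aligned delayed spikes -> neuron fires
print("delayed spike times (ns):", delayed, "-> output neuron fires:", bool(fires))
```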
Optical neural networks (ONNs) have emerged as high-performance neural network accelerators owing to their broad bandwidth and low power consumption. However, most current ONN architectures still struggle to fully leverage their advantages in processing speed and energy efficiency. Here, we demonstrate a large-scale, ultra-high-speed, low-power ONN distributed parallel computing architecture implemented on a thin-film lithium niobate platform. It encodes image information at a modulation rate of 128 Gbaud and performs 16 parallel 2 x 2 convolution kernel operations, achieving 8.190 trillion multiply-accumulate operations per second (TMACs/s) with a power efficiency of 4.55 tera operations per second per watt (Tops/W). This work presents proof-of-concept experiments for image edge detection and three different ten-class dataset recognition tasks, showing performance comparable to digital computers. Thanks to its excellent scalability, high speed, and low power consumption, the integrated distributed parallel optical computing architecture shows great potential for far more sophisticated tasks in demanding applications such as autonomous driving and video action recognition.
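A back-of-envelope check of how the headline figures relate, assuming each 2 x 2 kernel contributes four multiply-accumulates per symbol (my counting convention, not stated in the abstract):

```python
baud_rate = 128e9          # symbols per second
parallel_kernels = 16
macs_per_kernel = 2 * 2    # one MAC per kernel element per symbol (assumed)

macs_per_s = baud_rate * parallel_kernels * macs_per_kernel
print(f"throughput: {macs_per_s / 1e12:.3f} TMACs/s")   # ~8.19, matching the reported value

tops = 2 * macs_per_s / 1e12                             # 1 MAC = 2 ops
print(f"implied power at 4.55 Tops/W: {tops / 4.55:.2f} W")
```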
ISBN (Print): 9798400716607
Growing dataset and model sizes for Deep Neural Network (DNN) training have necessitated distributed training. Despite a rich literature on designing better distributed training algorithms and frameworks, few of them touch the lower layers (e.g., the transport layer) of the TCP/IP model. A recent paper [11] calls for rethinking the transport layer for distributed training, suggesting by simulation the potential performance gain from redesigning a new transport layer. Building on this premise, our research identifies TCP's limitations for distributed ML in data centers and introduces a Parameter Server (PS) system with a bounded-loss communication layer. It validates gradient-loss-tolerant neural networks for ML tasks and enhances distributed training efficiency through a novel communication protocol. Preliminary results show that such a method could potentially improve performance by 8.2x, reinforcing the feasibility of scalable and efficient distributed ML systems.
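The paper's protocol is not described in detail here, so the snippet below is only a conceptual stand-in for bounded-loss aggregation at the parameter server: the server averages whichever gradient shards the loss-tolerant transport actually delivered and refuses to proceed if losses exceed a configured bound. The function name, bound, and data are all hypothetical.

```python
import numpy as np

def aggregate_with_bounded_loss(worker_grads, delivered_mask, max_loss_frac=0.25):
    """Average only the gradient shards that actually arrived, as a loss-tolerant
    transport would deliver them; refuse to proceed if too much was dropped."""
    delivered = [g for g, ok in zip(worker_grads, delivered_mask) if ok]
    loss_frac = 1.0 - len(delivered) / len(worker_grads)
    if loss_frac > max_loss_frac:
        raise RuntimeError(f"gradient loss {loss_frac:.0%} exceeds the configured bound")
    return np.mean(delivered, axis=0)

rng = np.random.default_rng(3)
grads = [rng.normal(size=1000) for _ in range(16)]   # one gradient vector per worker
mask = [True] * 15 + [False]                         # one shard dropped in transit (6.25% loss)
update = aggregate_with_bounded_loss(grads, mask)
print("aggregated update norm:", np.linalg.norm(update))
```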
Finding connectivity in graphs has numerous applications, such as social network analysis, data mining, intra-city or inter-city connectivity, neural networks, and many more. The deluge of graph applications makes graph connectivity problems extremely important and worthwhile to explore. Currently, there are many single-node algorithms for graph mining and analysis; however, those algorithms primarily apply to small graphs and are implemented on a single machine node. Finding 2-edge-connected components (2-ECCs) in massive graphs (billions of edges and vertices) is impractical and time-consuming, even with the best-known single-node algorithms. Processing a big graph in a parallel and distributed fashion saves considerable processing time. Moreover, it enables stream data processing by allowing quick results for vast and continuously arriving data sets. This research proposes a distributed and parallel algorithm for finding 2-ECCs in big undirected graphs (subsequently called "BiECCA") and presents its time complexity analysis. The proposed algorithm is implemented on a MapReduce framework and uses an existing algorithm for finding connected components (CCs) in a graph as a sub-step. Finally, we suggest a few novel ideas and approaches as extensions to our work.
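For reference, the single-node analogue of what BiECCA computes at scale can be expressed in a few lines with NetworkX: remove all bridge edges, and the connected components that remain are exactly the 2-edge-connected components. The toy graph below is illustrative; the paper's contribution is doing this distributively on MapReduce for billion-edge graphs.

```python
import networkx as nx

G = nx.Graph([(1, 2), (2, 3), (3, 1),      # a triangle (2-edge-connected)
              (3, 4),                      # a bridge
              (4, 5), (5, 6), (6, 4)])     # another triangle

H = G.copy()
H.remove_edges_from(nx.bridges(G))         # drop every bridge edge
two_eccs = list(nx.connected_components(H))
print(two_eccs)                            # e.g. [{1, 2, 3}, {4, 5, 6}]
```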