ISBN: (print) 9798350386257; 9798350386240
Bayesian Neural Networks (BayNNs) are powerful tools for making predictions while also providing uncertainty estimates, a crucial aspect in safety-critical systems. However, implementing BayNNs in edge applications poses challenges, often necessitating transformations such as quantification or approximation, especially for Gaussian processes, commonly achieved through Dropout methods. This paper explores a new implementation of the Dropout module using a Superparamagnetic Tunnel Junction (SMTJ). The stochastic circuit enables fine-tuning of the generated stochastic bitstreams used to drop different parameters in the BayNN. The proposed module implementation leads to better power efficiency and robustness against variability. The results demonstrate an improvement in energy efficiency of up to 34.8x compared to related works.
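To make the role of the stochastic bitstream concrete, the following is a minimal software sketch of Dropout-based uncertainty estimation. It is not the SMTJ circuit from the paper: a pseudo-random Bernoulli generator stands in for the hardware bitstream, the tiny two-layer network and all function names (`bernoulli_bitstream`, `forward`, `mc_dropout_predict`) are hypothetical, and the mask is applied to hidden activations as one possible reading of "dropping parameters".

```python
import random
import statistics

def bernoulli_bitstream(p_one, length, rng):
    """Software stand-in for a tunable stochastic bitstream: each bit is 1 with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(length)]

def forward(x, w_hidden, w_out, keep_bits, keep_prob):
    """One forward pass of a toy ReLU network with a dropout mask on the hidden layer."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    # Inverted dropout: zero the units whose bit is 0, rescale the survivors.
    masked = [h * b / keep_prob for h, b in zip(hidden, keep_bits)]
    return sum(w * h for w, h in zip(w_out, masked))

def mc_dropout_predict(x, w_hidden, w_out, keep_prob=0.8, passes=200, seed=0):
    """Repeat the forward pass with fresh masks; return (mean, standard deviation)."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(passes):
        bits = bernoulli_bitstream(keep_prob, len(w_hidden), rng)
        outputs.append(forward(x, w_hidden, w_out, bits, keep_prob))
    return statistics.mean(outputs), statistics.pstdev(outputs)

# Toy weights chosen only for illustration.
W_HIDDEN = [[1.0, 0.0], [0.0, 1.0]]
W_OUT = [1.0, 1.0]
mean, spread = mc_dropout_predict([2.0, 3.0], W_HIDDEN, W_OUT, keep_prob=0.5)
```

The spread across passes serves as the uncertainty estimate. Changing `keep_prob` plays the role that tuning the bitstream probability would play in hardware, under the assumptions above.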
ISBN: (print) 9798350386066; 9798350386059
Distributed Particle Swarm Optimization (DPSO) is a powerful algorithm that can instruct a group of AUVs to conduct search operations. However, the effectiveness of DPSO is constrained by local communication. In this paper, we propose an Ad-hoc communication strategy that can greatly extend the effective communication range of DPSO. By utilizing the Ad-hoc communication strategy, nbest can be shared with other AUVs several hops away. The preliminary results show the effectiveness of our proposed scheme.
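The idea of sharing nbest over several hops can be sketched as a hop-limited breadth-first search over the communication graph. This is an illustrative sketch only: the abstract does not describe the relaying protocol, so the graph construction, the minimization convention (lower fitness is better), and the names `neighbors_within` and `nbest_multi_hop` are all assumptions.

```python
import math
from collections import deque

def neighbors_within(positions, comm_range):
    """Build adjacency lists: two agents are linked if they are within comm_range."""
    n = len(positions)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= comm_range:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def nbest_multi_hop(adj, pbest_fitness, max_hops):
    """For each agent, return the index of the best personal best
    (lowest fitness, by assumption) reachable within max_hops relays."""
    result = []
    for src in range(len(adj)):
        seen = {src}
        frontier = deque([(src, 0)])
        best = src
        while frontier:
            node, hops = frontier.popleft()
            if pbest_fitness[node] < pbest_fitness[best]:
                best = node
            if hops == max_hops:
                continue  # do not relay beyond the hop limit
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, hops + 1))
        result.append(best)
    return result

# Four agents on a line, each only in direct range of its immediate neighbors.
positions = [(0, 0), (1, 0), (2, 0), (3, 0)]
fitness = [5.0, 4.0, 3.0, 1.0]
adj = neighbors_within(positions, comm_range=1.0)
one_hop = nbest_multi_hop(adj, fitness, max_hops=1)    # purely local communication
three_hop = nbest_multi_hop(adj, fitness, max_hops=3)  # multi-hop relaying
```

With `max_hops=1` each agent only learns from its direct neighbors, whereas with `max_hops=3` every agent in this toy chain learns of the best solution held by the agent at the far end.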
ISBN: (print) 9798350395679; 9798350395662
Most contemporary HPC programming models assume an inelastic runtime in which the resources allocated to an application remain fixed throughout its execution. Conversely, elastic runtimes can expand and shrink resources based on availability and/or dynamic application requirements. In this paper, we implement elasticity for PaRSEC, a task-based dataflow runtime, using inter-node GPU work stealing. In addition to supporting elasticity, we demonstrate that inter-node GPU work stealing can enhance the performance of imbalanced applications by up to 45%.
ISBN: (print) 9798350377330; 9798350377323
This paper explores second-order nonlinear disordered photonic media assembled from oxide nanoparticles, particularly barium titanate and lithium niobate nanocrystals. Thanks to the simultaneous linear scattering and second-harmonic generation, these media favor the emergence of a variety of unexplored optical processes with applications in optical frequency conversion and photonic neuromorphic computing.
ISBN: (print) 9798350377330; 9798350377323
In this paper, we compare the equalization performance of three different passive reservoir computing architectures using micro-ring resonators as reservoirs. The architectures consist of four resonators, arranged in a parallel 4x1, sequential 1x4, and deep 2x2 configuration. The reservoirs are simulated using the two-dimensional finite-difference time-domain method and are fed with simulation data from a comprehensive simulation setup with realistic component characteristics. We show that the parallel architecture outperforms both the sequential and the deep micro-ring resonator configurations. Moreover, the parallel structure with a linear readout outperforms state-of-the-art digital signal processing in terms of bit error ratio by up to an order of magnitude. This passive reservoir computing architecture could pave the way to more energy-efficient, high-symbol-rate intra-DCN and mobile fronthaul systems.
ISBN: (print) 9798350395679; 9798350395662
Streaming applications are expected to process an ever-increasing amount of data with high throughput and stringent latency requirements. Flooding these applications with incoming data may overload the stream processing engine, leading to a system with unstable queues and infinitely growing latencies. Existing stream processing systems are equipped to deal with such overload scenarios reactively, either through back pressure or load shedding mechanisms. These mechanisms, however, have considerable drawbacks: they consume additional system resources, incur non-negligible performance overheads, and may compromise the quality of application-level results. To address this gap, we propose a strategy based on reinforcement learning to throttle the input rate of data sources in streaming applications. The proposed strategy mitigates overload scenarios by addressing the source of the problem, thus allowing resources to be better utilized by application and system components and mitigating the performance overhead of system-level reactive mechanisms. Through our experiments with two different applications, we demonstrate that our proposed approach reduces end-to-end latencies by up to 82% and increases throughput by up to 10% compared to back pressure mechanisms implemented in state-of-the-art stream processing engines.
ISBN: (print) 9798350369458; 9798350369441
We study the Traveling Salesman Problem (TSP) in the Congested Clique Model (CCM) of distributed computing. We present a deterministic distributed algorithm that computes a tour for the TSP using O(1) rounds and O(m) messages for a given undirected weighted complete graph of n nodes and m edges, with an approximation factor of 2 relative to the optimal tour. The TSP has wide applications in logistics, planning, manufacturing and testing of microchips, DNA sequencing, etc., and we claim that our proposed O(1)-round approximation algorithm for the TSP, which is fast and efficient, can also be used to minimize energy consumption in Wireless Sensor Networks.
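For readers unfamiliar with factor-2 TSP approximation, the classic sequential "double-tree" heuristic illustrates where such a bound can come from: build a minimum spanning tree and visit the nodes in preorder. This sketch is the textbook centralized method, not the distributed Congested Clique algorithm of the paper, and its factor-2 guarantee assumes edge weights that satisfy the triangle inequality. All function names are illustrative.

```python
import math
from itertools import permutations

def mst_children(weights):
    """Prim's algorithm on a complete graph (symmetric weight matrix).
    Returns, for each node, the list of its children in an MST rooted at node 0."""
    n = len(weights)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return children

def double_tree_tour(weights):
    """Preorder walk of the MST, shortcutting repeated nodes, closed back at node 0."""
    children = mst_children(weights)
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]

def tour_length(weights, tour):
    return sum(weights[a][b] for a, b in zip(tour, tour[1:]))

def optimal_length(weights):
    """Brute-force optimum, for checking tiny instances only."""
    n = len(weights)
    return min(tour_length(weights, [0, *p, 0]) for p in permutations(range(1, n)))

# Euclidean points, so the triangle inequality holds.
points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0.5)]
W = [[math.dist(p, q) for q in points] for p in points]
tour = double_tree_tour(W)
```

On metric instances, the resulting tour is at most twice the MST weight, and the MST weight is at most the optimal tour length, which gives the factor of 2.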
ISBN: (print) 9798350386066; 9798350386059
Distributed Quantum Computing (DQC) has the potential to solve large-scale industrial problems by connecting multiple small quantum processors to form a larger computing system. In this emerging distributed paradigm, a pivotal challenge lies in crafting specialized network topologies that establish efficient connections among quantum processors while minimizing communication costs. In this paper, we propose a novel DQC topology generation algorithm (DQC-TG) to create optimal and near-optimal network topologies for homogeneous and heterogeneous quantum computers, respectively. Furthermore, for specific quantum circuits with diverse communication demands between each pair of qubits, we extend the original algorithm into DQC-TG-Plus, which designs network topologies tailored to these circuits to further enhance performance. We perform extensive simulations to evaluate the network topologies generated by our algorithms against baselines.
ISBN: (print) 9798350377330; 9798350377323
Reservoir computing is a recurrent computing architecture particularly suited for implementation on photonic hardware, especially in its time-delay (time-multiplexed) implementation. Here, we review our effort in time-delay reservoir computing based on a single nonlinear node consisting of a silicon microring resonator. We focus on relating the physical effects taking place in the resonator to the computing performance, as well as proposing extensions of the standard scheme to allow for increased computing efficiency, e.g. through task-multiplexing, or reduced footprint, e.g. through input-multiplexing.
ISBN: (print) 9798350377330; 9798350377323
The deployment of beyond-5G and 6G networks introduces many new services with stringent Quality of Service (QoS) requirements. Recently, machine learning has been shown to be a viable way to provide adaptable solutions. However, centralized machine-learning-based solutions still encounter hurdles in achieving real-time responsiveness due to their need for a global network view. In this paper, we explore a distributed approach aimed at optimizing network performance in real-time scenarios. By using Multi-Agent Systems (MAS), our method targets near-real-time end-to-end delay assurance across diverse network domains, without the need for prior knowledge of traffic profiles. Evaluation results highlight the effectiveness of our approach in reducing routing costs and ensuring desired end-to-end delay levels.