A simple yet new sensing method for measurements of refractive index based on microwave-photonic hybrid optical fiber interferometers (optically coherent and incoherent) is proposed and experimentally demonstrated. ...
Deep learning plays an increasingly important role in future wireless network management and optimization. Existing training methods such as label-based supervised learning and label-free learning have inherent limitatio...
This work proposes a Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates, called MKOR, that improves the training time and convergence properties of deep neural networks (DNNs). Second-order techniques, while enjoying higher convergence rates than their first-order counterparts, have cubic complexity with respect to the model size, the training batch size, or both. Therefore, they exhibit poor scalability and performance in transformer models, e.g. large language models (LLMs), because the batch sizes in these models scale by the attention mechanism sequence length, leading to large model sizes and batch sizes. MKOR's complexity is quadratic with respect to the model size, alleviating the computation bottlenecks in second-order methods. Because of their high computation complexity, state-of-the-art implementations of second-order methods can only afford to update the second-order information infrequently, and thus do not fully exploit the promise of better convergence from these updates. By reducing the communication complexity of the second-order updates, as well as achieving a linear communication complexity, MKOR increases the frequency of second-order updates. We also propose a hybrid version of MKOR (called MKOR-H) that falls back to a first-order optimizer mid-training if the second-order updates no longer accelerate convergence. Our experiments show that MKOR outperforms state-of-the-art first-order methods, e.g. the LAMB optimizer, and the best implementations of second-order methods, i.e. KAISA/KFAC, by up to 2.57× and 1.85×, respectively, on BERT-Large-Uncased on 64 GPUs.
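The abstract does not spell out how rank-1 updates bring the cost down from cubic to quadratic. One standard way to get that effect is the Sherman-Morrison identity, which refreshes a known inverse after a rank-1 change without re-inverting the matrix. The sketch below illustrates only that identity; whether MKOR uses it in this form is an assumption, and the function name and toy matrices are hypothetical.

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, using O(d^2) work.

    Illustrative helper (hypothetical name), not taken from the MKOR paper.
    """
    Au = A_inv @ u               # matrix-vector product, O(d^2)
    vA = v @ A_inv               # vector-matrix product, O(d^2)
    denom = 1.0 + v @ Au         # scalar; zero means the updated matrix is singular
    if abs(denom) < 1e-12:
        raise ValueError("rank-1 update makes the matrix singular")
    return A_inv - np.outer(Au, vA) / denom

# Toy check against a direct O(d^3) inversion.
rng = np.random.default_rng(0)
d = 5
A = 2.0 * np.eye(d)                      # stand-in for a damped factor matrix
A_inv = np.linalg.inv(A)
a = rng.standard_normal(d)               # stand-in for an activation vector

updated_inv = sherman_morrison_update(A_inv, a, a)
direct_inv = np.linalg.inv(A + np.outer(a, a))
max_abs_error = float(np.max(np.abs(updated_inv - direct_inv)))
```

Every step here is a matrix-vector or outer product, so the cost is quadratic in the dimension `d`, whereas the direct inversion used for the check is cubic.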
The distortion of skyrmions has attracted much attention recently due to the exotic topological and dynamic *** the formation mechanism and dynamical behavior of the deformed skyrmions promotes practical spintronic ***, as a typical form of deformation, has been discovered both in experiments and in ***, the intrinsic mechanism is ***, the coexistence of zero-field circular and elongated skyrmions in helimagnetic films is *** elongated skyrmions, which are determined by the intrinsic Dzyaloshinskii-Moriya interaction (DMI), carry the same topological charge as the circular ones and show the skyrmion Hall *** dynamics reveal again the significant role that the intrinsic DMI plays in the skyrmion elongation.
Recent developments in battery technology and the reduction of battery costs have resulted in battery energy storage systems (BESS) becoming more economically viable for various power sectors and automobile applicatio...
The perception modules of autonomous vehicles (AVs) are increasingly susceptible to attacks that exploit vulnerabilities in neural networks through adversarial inputs, thereby compromising AI safety. Some researc...
Multiview clustering has wide real-world applications because it can process data from multiple sources. However, these data often contain missing instances and noise, which are ignored by most multiview clustering m...
Drug repositioning is a crucial aspect of biomedical research, and predicting drug-disease associations (DDAs) is a critical step in this process. With the development of deep learning and neural network technologies,...
Collaborative machine learning involves training models on data from multiple parties but must incentivize their participation. Existing data valuation methods fairly value and reward each party based on shared data o...
We study the problem of out-of-distribution (OOD) detection, that is, detecting whether a machine learning (ML) model's output can be trusted at inference time. While a number of tests for OOD detection have been proposed in prior work, a formal framework for studying this problem is lacking. We propose a definition for the notion of OOD that includes both the input distribution and the ML model, which provides insights for the construction of powerful tests for OOD detection. We also propose a procedure, inspired by multiple hypothesis testing, to systematically combine any number of different statistics from the ML model using conformal p-values. We further provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that threshold-based tests proposed in prior work perform well in specific settings, but not uniformly well across different OOD instances. In contrast, our proposed method that combines multiple statistics performs uniformly well across different datasets and neural network architectures.
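To make the idea of combining several statistics through conformal p-values concrete, here is a minimal sketch. The abstract does not say which multiple-testing rule the authors use, so a Bonferroni correction is swapped in here as a simple stand-in. The function names, the synthetic Gaussian scores, and the convention that larger scores look more OOD are all assumptions for illustration.

```python
import numpy as np

def conformal_p_value(cal_scores, test_score):
    """Conformal p-value: rank of the test score among in-distribution
    calibration scores (larger score = more OOD-like, by assumption)."""
    cal_scores = np.asarray(cal_scores)
    return (1 + np.sum(cal_scores >= test_score)) / (len(cal_scores) + 1)

def is_ood_bonferroni(cal_matrix, test_scores, alpha=0.05):
    """Flag a sample as OOD if any of the K statistics has p <= alpha / K.

    Bonferroni is a stand-in here, not necessarily the paper's procedure.
    cal_matrix has shape (n_calibration, K); test_scores has length K.
    """
    cal_matrix = np.asarray(cal_matrix)
    num_stats = cal_matrix.shape[1]
    p_values = np.array([
        conformal_p_value(cal_matrix[:, k], test_scores[k])
        for k in range(num_stats)
    ])
    return bool(p_values.min() <= alpha / num_stats), p_values

# Synthetic calibration scores for two hypothetical statistics.
rng = np.random.default_rng(0)
cal = rng.standard_normal((999, 2))

flagged_typical, p_typical = is_ood_bonferroni(cal, [0.0, 0.0])
flagged_extreme, p_extreme = is_ood_bonferroni(cal, [0.0, 10.0])
```

With exchangeable calibration and test scores, each conformal p-value is at most `alpha / K` with probability at most `alpha / K`, so a union bound keeps the chance of flagging an in-distribution sample at or below `alpha`. In the toy run, the second test point is extreme on one statistic only, which is the kind of case a single threshold on the other statistic would miss.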