Anomaly detection (AD) has been extensively studied and applied across various scenarios in recent years. However, gaps remain between the current performance and the recognition accuracy required for practical applications. This paper analyzes two fundamental failure cases in the baseline AD model and identifies key reasons that limit the recognition accuracy of existing approaches. Specifically, from Case-1 we found that the main factor detrimental to current AD methods is that the inputs to the recovery model contain a large number of detailed features to be recovered, which causes the normal area to fail to be recovered to its original state while the abnormal area is recovered to its original state. From Case-2, we surprisingly found that an abnormal area that cannot be recognized in image-level representations can be easily recognized in feature-level representations. Based on the above observations, we propose a novel recover-then-discriminate (ReDi) framework for AD. ReDi takes a self-generated feature map (e.g., histogram of oriented gradients) and a selected prompted image as explicit input information to address the issues identified in Case-1. Additionally, a feature-level discriminative network is introduced to amplify abnormal differences between the recovered and input representations. Extensive experiments on two widely used yet challenging AD datasets demonstrate that ReDi achieves state-of-the-art recognition accuracy.
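The abstract names histogram of oriented gradients (HOG) as an example of a self-generated feature map. As a hedged illustration of what such a map looks like (the cell size and bin count below are illustrative choices, not the paper's settings), a minimal numpy sketch of a HOG-style per-cell orientation histogram:

```python
import numpy as np

def hog_feature_map(img, cell=8, bins=9):
    """HOG-style descriptor: per-cell histograms of gradient orientation,
    weighted by gradient magnitude (simplified, no block normalization)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    h, w = img.shape
    H, W = h // cell, w // cell
    feat = np.zeros((H, W, bins))
    for i in range(H):
        for j in range(W):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell].ravel()
            a = ang[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell].ravel()
            idx = np.minimum((a / np.pi * bins).astype(int), bins - 1)
            np.add.at(feat[i, j], idx, m)  # accumulate magnitude into orientation bins
    return feat

rng = np.random.default_rng(0)
feat = hog_feature_map(rng.random((32, 32)))  # (4, 4, 9) feature map for a 32x32 image
```

Feeding such a map (rather than the raw image alone) reduces the amount of fine detail the recovery model must reconstruct, which is the failure mode Case-1 identifies.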
We have witnessed the emergence of superhuman intelligence thanks to the fast development of large language models (LLMs) and multimodal language models. As the application of such superhuman models becomes increasingly popular, a critical question arises: how can we ensure that they remain safe, reliable, and well aligned with human values, encompassing moral values, Schwartz's Values, ethics, and many more? In this position paper, we discuss the concept of superalignment from a learning perspective to answer this question, outlining the learning-paradigm shift from large-scale pretraining and supervised fine-tuning to alignment training. We define superalignment as designing effective and efficient alignment algorithms to learn from noisily labeled data (point-wise samples or pair-wise preference data) in a scalable way, when the task is too complex for human experts to annotate and the model is stronger than human experts. We highlight some key research problems in superalignment, namely weak-to-strong generalization, scalable oversight, and evaluation. We then present a conceptual framework for superalignment that comprises three modules: an attacker, which generates adversarial queries that try to expose the weaknesses of a learner model; a learner, which refines itself by learning from scalable feedback generated by a critic model with minimal human expert involvement; and a critic, which generates critiques or explanations for a given query-response pair with the goal of improving the learner. We discuss some important research problems in each component of this framework and highlight some interesting research ideas that are closely related to our proposed framework, for instance, self-alignment, self-play, self-refinement, and more. Lastly, we highlight some future research directions for superalignment, including the identification of new emergent risks and multi-dimensional alignment.
Edge closeness and betweenness centralities are widely used path-based metrics for characterizing the importance of edges in graphs. In general graphs, edge closeness centrality indicates the importance of an edge by the shortest distances from the edge to all the other edges. Edge betweenness centrality ranks which edges are significant based on the fraction of all-pairs shortest paths that pass through the edge. Currently, extensive research efforts go into centrality computation over general graphs that omit time dimensions. However, numerous real-world networks are modeled as temporal graphs, where the nodes are related to each other at different time instants. The temporal property is important and should not be neglected because it guides the flow of information in the network. This state of affairs motivates the paper's study of edge centrality computation methods on temporal graphs. We introduce the concepts of the label and the label dominance relation, and then propose multi-thread parallel labeling-based methods on OpenMP to efficiently compute edge closeness and betweenness centralities with respect to different types of optimal temporal paths. For edge closeness centrality computation, a time-segmentation strategy and two observations are presented to aggregate related temporal edges for uniform processing. For edge betweenness centrality computation, to improve efficiency, temporal edge dependency formulas, a labeling-based forward-backward scanning strategy, and a compression-based optimization method are further proposed to iteratively accumulate centrality values. Extensive experiments using 13 real temporal graphs are conducted to provide detailed insights into the efficiency and effectiveness of the proposed methods. Compared with state-of-the-art methods, the labeling-based methods are capable of up to two orders of magnitude speedup.
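The paper's labeling-based temporal algorithms are beyond an abstract-sized sketch, but the underlying static definition of edge betweenness (the fraction of all-pairs shortest paths passing through an edge) can be illustrated with a brute-force numpy-free example for small graphs. This is the textbook definition, not the paper's method:

```python
import itertools
import collections

def edge_betweenness(n, edges):
    """Count, for each edge, the fraction of all-pairs shortest paths through it
    (brute force via BFS predecessor lists; suitable only for tiny graphs)."""
    adj = collections.defaultdict(list)
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    count = {tuple(sorted(e)): 0.0 for e in edges}
    for s, t in itertools.combinations(range(n), 2):
        dist = {s: 0}
        preds = collections.defaultdict(list)
        q = collections.deque([s])
        while q:                               # BFS recording shortest-path predecessors
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1; q.append(v)
                if dist[v] == dist[u] + 1:
                    preds[v].append(u)
        if t not in dist:
            continue
        paths = []
        def back(v, path):                     # enumerate all shortest s-t paths
            if v == s:
                paths.append(path); return
            for p in preds[v]:
                back(p, [(min(p, v), max(p, v))] + path)
        back(t, [])
        for path in paths:
            for e in path:
                count[e] += 1.0 / len(paths)
    return count

# Path graph 0-1-2-3: the middle edge lies on the most shortest paths.
bc = edge_betweenness(4, [(0, 1), (1, 2), (2, 3)])
```

On the path graph, edge (1, 2) scores 4.0 (it carries the 0-2, 0-3, 1-2, and 1-3 paths) while the end edges score 3.0; the temporal variants restrict the counted paths to time-respecting ones.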
Polysemy is a common phenomenon in linguistics. Quantum-inspired complex word embeddings based on Semantic Hilbert Space play an important role in natural language processing, as they can accurately define a genuine probability distribution over the word space. Existing quantum-inspired works operate on real-valued vectors to compose complex-valued word embeddings and therefore lack direct complex-valued pre-trained word representations. Motivated by quantum-inspired complex word embeddings, we propose a complex-valued pre-trained word embedding based on density matrices, called Word2State. Unlike existing static word embeddings, our proposed model provides non-linear semantic composition in the form of amplitude and phase, and it defines an authentic probability distribution. We evaluate this model on twelve datasets from the word similarity task and six datasets from relevant downstream tasks. The experimental results on different tasks demonstrate that our proposed pre-trained word embedding captures richer semantic information and exhibits greater flexibility in expressing uncertainty.
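The core objects here — density matrices built from amplitude/phase parameters — have a simple quantum-mechanical form. The sketch below is a generic illustration of how a pure-state density matrix yields a genuine probability distribution over basis states (the parameterization is illustrative, not Word2State's actual architecture):

```python
import numpy as np

def word_density_matrix(amplitudes, phases):
    """Build a pure-state density matrix rho = |psi><psi| from amplitude and
    phase parameters; its diagonal is a valid probability distribution."""
    amp = np.asarray(amplitudes, dtype=float)
    amp = amp / np.linalg.norm(amp)              # unit-norm amplitudes
    psi = amp * np.exp(1j * np.asarray(phases))  # complex-valued state vector
    return np.outer(psi, psi.conj())

rho = word_density_matrix([3.0, 4.0], [0.0, np.pi / 2])
probs = np.real(np.diag(rho))  # [0.36, 0.64]: probabilities over basis "senses"
```

Because rho is Hermitian with unit trace, the diagonal entries are nonnegative and sum to one, which is what lets such embeddings assign an authentic probability distribution over word senses.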
In the development of linear quadratic regulator (LQR) algorithms, the Riccati equation approach offers two important characteristics: it is recursive and readily meets the existence condition. However, these attributes apply only to transformed singular systems, and the efficiency of the regulator may be undermined if constraints are violated in nonsingular versions. To address this gap, we introduce a direct approach to the LQR problem for linear singular systems, avoiding the need for any transformation and eliminating the need for regularity assumptions. To achieve this goal, we begin by formulating a quadratic cost function to derive the LQR algorithm through a penalized and weighted regression framework, and then connect it to a constrained minimization problem using Bellman's criterion. We then employ a dynamic programming strategy in a backward approach within a finite horizon to develop an LQR algorithm for the original system. To accomplish this, we address the stability and convergence analysis under the reachability and observability assumptions of a hypothetical system constructed from the pencil of augmented matrices and connected using the Hamiltonian diagonalization technique.
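For reference, the backward dynamic-programming recursion that the paper generalizes can be sketched for the standard (nonsingular, state-space) finite-horizon case. This is the classical discrete-time Riccati recursion, not the paper's singular-system algorithm:

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, N):
    """Backward Riccati recursion for x_{k+1} = A x_k + B u_k with cost
    sum_k (x_k' Q x_k + u_k' R u_k); returns feedback gains K_0..K_{N-1}
    so that the optimal control is u_k = -K_k x_k."""
    P = Q.copy()                 # terminal cost-to-go matrix
    gains = []
    for _ in range(N):           # sweep backward in time
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)   # Riccati update of the cost-to-go
        gains.append(K)
    return gains[::-1]           # reorder so gains[0] applies at k = 0

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])      # illustrative double-integrator-like plant
B = np.array([[0.0], [0.1]])
gains = finite_horizon_lqr(A, B, np.eye(2), np.array([[1.0]]), N=50)
```

The paper's contribution is to obtain an analogous recursive regulator directly for singular systems (where the descriptor matrix is not invertible), without the transformations this standard recursion presupposes.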
The earthquake early warning (EEW) system provides advance notice of potentially damaging ground shaking. In EEW, early estimation of magnitude is crucial for timely rescue operations. A set of thirty-four features is extracted using the primary-wave earthquake precursor signal and site-specific information. In Japan's earthquake magnitude dataset, there is a chance of a high imbalance concerning earthquakes above strong impact. This imbalance causes a high prediction error while training advanced machine learning or deep learning models. In this work, Conditional Tabular Generative Adversarial Networks (CTGAN), a deep machine learning tool, is utilized to learn the characteristics of the first arrival of earthquake P-waves and to generate a synthetic dataset based on this information. The results obtained using actual and mixed (synthetic and actual) datasets are used for training the stacked ensemble magnitude prediction model, MagPred, designed specifically for this study. There are 13,295, 3,989, and 1,710 records designated for training, testing, and validation, respectively. The mean absolute errors on the test dataset for single-station magnitude detection using the early three, four, and five seconds of the P wave are 0.41, 0.40, and 0.38 MJMA. The study demonstrates that Generative Adversarial Networks (GANs) can provide good results for single-station magnitude prediction, and the approach can be effective where less seismic data is available. The study shows that the machine learning method yields better magnitude detection results compared with several regression models. The multi-station magnitude prediction study has been conducted on the prominent Osaka, Off Fukushima, and Kumamoto earthquakes. Furthermore, to validate the performance of the model, an inter-region study has been performed on earthquakes of the India or Nepal region. The study demonstrates that GANs can discover effective magnitude estimation compared with non-GAN-based methods. This has a high potential for wid
Industrial cyber-physical systems closely integrate physical processes with cyberspace, enabling real-time exchange of various information about system dynamics, sensor outputs, and control decisions. The connection between cyberspace and physical processes exposes industrial production information to unprecedented security risks. It is imperative to develop suitable strategies to ensure cyber security while meeting basic performance requirements. From the perspective of control engineering, this review presents the most up-to-date results for privacy-preserving filtering, control, and optimization in industrial cyber-physical systems. Fashionable privacy-preserving strategies and mainstream evaluation metrics are first presented in a systematic manner for performance evaluation and engineering applications. The discussion discloses the impact of typical filtering algorithms on filtering performance, specifically for privacy-preserving Kalman filtering. Then, the latest developments in industrial control are systematically investigated, covering consensus control of multi-agent systems, platoon control of autonomous vehicles, and hierarchical control of power systems. The focus thereafter is on the latest privacy-preserving optimization algorithms in the consensus framework and their applications in distributed economic dispatch and energy management of networked power systems. In the end, several topics for potential future research are highlighted.
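One common privacy-preserving pattern in the consensus literature surveyed here is noise injection with decaying magnitude: each agent masks the value it shares with neighbors, yet the network still reaches agreement. The sketch below is a generic illustration of that idea (the weight matrix, decay rate, and noise scale are illustrative assumptions, not taken from any specific surveyed algorithm):

```python
import numpy as np

def private_consensus(x0, W, steps=200, noise0=1.0, decay=0.8, seed=0):
    """Average consensus in which each node perturbs its shared value with
    geometrically decaying noise, masking its true initial state."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for k in range(steps):
        shared = x + noise0 * (decay ** k) * rng.standard_normal(x.size)
        x = W @ shared           # each node averages its neighbors' noisy values
    return x

# 3-node network with doubly stochastic mixing weights.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
x = private_consensus([1.0, 5.0, 9.0], W)
```

Because the injected noise decays geometrically, the nodes still converge to a common value close to the true average, while an eavesdropper observing early rounds sees only perturbed states; the privacy/accuracy trade-off is governed by the noise schedule.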
Binary neural networks have become a promising research topic due to their advantages of fast inference speed and low energy consumption. However, most existing studies focus on binary convolutional neural networks, while less attention has been paid to binary graph neural networks. A common drawback of existing studies on binary graph neural networks is that they still include many inefficient full-precision operations when multiplying three matrices and are therefore not efficient enough. In this paper, we propose a novel method, called re-quantization-based binary graph neural networks (RQBGN), for binarizing graph neural networks. Specifically, re-quantization, a necessary procedure contributing to the further reduction of superfluous full-precision operations, quantizes the result of the multiplication between any two matrices during the process of multiplying three matrices. To address the challenges introduced by re-quantization, in RQBGN we first study the impact of different computation orders to find an effective one, and then introduce a mixture of experts to increase the model capacity. Experiments on five benchmark datasets show that performing re-quantization in different computation orders significantly impacts the performance of binary graph neural network models, and that RQBGN can outperform other baselines to achieve state-of-the-art performance.
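The re-quantization idea can be made concrete with a schematic numpy sketch: in a three-matrix product (e.g. adjacency x features x weights in a GNN layer), the intermediate result is quantized again so the second multiplication also runs on binarized operands. The sign/mean binarizer below is a standard stand-in; the paper's actual quantizer and computation order may differ:

```python
import numpy as np

def binarize(M):
    """Sign binarization with a per-matrix scaling factor: alpha * sign(M)."""
    alpha = np.mean(np.abs(M))
    return alpha * np.sign(M)

def requantized_triple_product(A, X, W):
    """Compute (A @ X) @ W while keeping both multiplications on binarized
    operands: the intermediate product is re-quantized before the second matmul."""
    H = binarize(A) @ binarize(X)   # first binary multiplication
    H = binarize(H)                 # re-quantization of the intermediate result
    return H @ binarize(W)          # second binary multiplication

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))     # e.g. normalized adjacency
X = rng.standard_normal((4, 3))     # node features
W = rng.standard_normal((3, 2))     # layer weights
out = requantized_triple_product(A, X, W)
```

Without the middle `binarize(H)` step, the second multiplication would operate on a full-precision intermediate, which is exactly the inefficiency the abstract says re-quantization removes.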
This study proposes a malicious code detection model DTL-MD based on deep transfer learning, which aims to improve the detection accuracy of existing methods in complex malicious code and data scarcity. In the feature...
Stochastic gradient descent (SGD) and its variants have been the dominant optimization methods in machine learning. Compared with SGD with small-batch training, SGD with large-batch training can better utilize the computational power of current multi-core systems such as graphics processing units (GPUs) and can reduce the number of communication rounds in distributed training settings. Thus, SGD with large-batch training has attracted considerable attention. However, existing empirical results show that large-batch training typically leads to a drop in generalization accuracy. Hence, how to guarantee the generalization ability in large-batch training becomes a challenging task. In this paper, we propose a simple yet effective method, called stochastic normalized gradient descent with momentum (SNGM), for large-batch training. We prove that with the same number of gradient computations, SNGM can adopt a larger batch size than momentum SGD (MSGD), which is one of the most widely used variants of SGD, to converge to an ε-stationary point. Empirical results on deep learning verify that when adopting the same large batch size, SNGM can achieve better test accuracy than MSGD and other state-of-the-art large-batch training methods.
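One plausible reading of "stochastic normalized gradient descent with momentum" is momentum SGD in which the stochastic gradient is normalized before entering the momentum buffer; the exact update rule in the paper may differ, so the sketch below is a hedged illustration on a toy quadratic rather than the authors' algorithm:

```python
import numpy as np

def sngm_step(w, grad_fn, u, lr=0.01, beta=0.9):
    """One update of normalized-gradient momentum: the gradient direction
    (not its raw magnitude) is accumulated into the momentum buffer u."""
    g = grad_fn(w)
    u = beta * u + g / (np.linalg.norm(g) + 1e-12)  # normalized gradient
    return w - lr * u, u

# Toy objective f(w) = ||w||^2 / 2, whose gradient is simply w.
w = np.array([10.0, -4.0])
u = np.zeros_like(w)
for _ in range(300):
    w, u = sngm_step(w, lambda v: v, u)
```

Normalizing the gradient bounds the effective step size regardless of gradient scale, which is the property that the paper's analysis exploits to tolerate larger batch sizes than plain MSGD.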