Malware detection has attracted widespread attention due to the growing malware sophistication. Machine learning based methods have been proposed to find traces of malware by analyzing network traffic. However, networ...
Fully capturing contextual information and analyzing the association between entity semantics and type is helpful for joint extraction task: 1) The context can reflect the part of speech and semantics of entity. 2) Th...
Consistency assessment is an important measure to ensure that distributed modeling and simulation results are correct and credible, and is crucial for maintaining the global consistency of parallel and distributed Sim...
The core problem of Knowledge Base Question Answering (KBQA) is to map user questions to queries over a knowledge base. Specifically, a natural language question must be transformed into a structured query before it can be linked to the knowledge base, and the answer is then retrieved from the knowledge graph using that structured query. We found that structured queries in similar question domains tend to share repetitive reasoning steps. Humans also often draw on cases similar to the question, and on information from those cases, when answering new questions. Hence, we propose a new KBQA framework based on similar question domains. We design two separate modules: an inference information retriever that extracts cases with a structure similar to the question, and a relation information retriever that narrows the scope of reasoning relation extraction. Finally, we use the retrieved inference cases and relation candidate sets as auxiliary information and generate an executable Knowledge-oriented Programming Language (KoPL) program through the program generation module. Experiments show that the model can handle complex question answering and has strong reasoning ability. Our method achieves new state-of-the-art performance on the WebQSP and CWQ datasets.
Vehicle-to-grid (V2G) model has the potential for providing a distributed reserve to the power system developed for large scale implementation of Hybrid Electric Vehicle model. The authors proposed a modified V2G cont...
Managing the healthcare resources (e.g., healthcare workers, masks, ART-19 TestKit) required to tighten the grip on the COVID-19 pandemic raises the question of how to prepare the correct number of resources for distribution. Mathematical and computational forecasting models have served well as a means to address this question and to inform the resulting advisories to governments. This research proposes a workflow for developing a forecasting simulation that makes accurate predictions of COVID-19 confirmed cases in Singapore. Based on an analysis of prior work, six candidate forecasting models are evaluated and compared in the workflow: polynomial regression, linear regression, SVM, Prophet, Holt's linear, and LSTM models. The study's goal is to determine the most suitable forecasting model for COVID-19 cases in Singapore. Two algorithms are also proposed to improve the performance of two of the models: an order algorithm that determines the optimal degree for the polynomial regression model, and an optimizing algorithm that calculates the optimal smoothing parameters for the Holt's linear model. In experiments on the COVID-19 dataset, the Prophet model achieves the best performance of the six models, with the lowest Root Mean Square Error (RMSE) of 1557.744836 and the lowest Mean Absolute Percentage Error (MAPE) of 0.468827. The Prophet model, which achieves an average accuracy of around 90% when forecasting the number of confirmed COVID-19 cases in Singapore for the next 87 days, is chosen and recommended as the system model for forecasting COVID-19 confirmed cases in Singapore. The developed workflow will greatly assist the authorities in taking timely actions and making decisions to contain the COVID-19 pandemic.
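To make the two evaluation metrics and the idea of tuning Holt's smoothing parameters concrete, here is a minimal sketch. It is not the authors' optimizing algorithm, which the abstract does not describe: the exhaustive grid search, the function names, the one-step-ahead evaluation, and the initialization of level and trend from the first two observations are all assumptions made for illustration.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error as a fraction (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


def holt_one_step_forecasts(series, alpha, beta):
    """One-step-ahead forecasts for series[1:] using Holt's linear trend method.

    Assumption: level starts at series[0], trend at series[1] - series[0].
    """
    level, trend = series[0], series[1] - series[0]
    forecasts = []
    for y in series[1:]:
        forecasts.append(level + trend)          # forecast made before seeing y
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts


def grid_search_holt(series, step=0.1):
    """Hypothetical tuner: try every (alpha, beta) on a grid, keep the lowest RMSE."""
    grid = [round(i * step, 10) for i in range(1, int(round(1 / step)) + 1)]
    best = None
    for alpha in grid:
        for beta in grid:
            err = rmse(series[1:], holt_one_step_forecasts(series, alpha, beta))
            if best is None or err < best[0]:
                best = (err, alpha, beta)
    return best  # (rmse, alpha, beta)
```

Calling `grid_search_holt(daily_cases)` on a list of daily counts returns the lowest RMSE found and the pair of smoothing parameters that produced it. A grid search is only one of several ways to pick these parameters; a numerical optimizer would serve equally well.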
For distributed training, the communication overhead for parameter synchronization is heavy in the network. Data aggregation can efficiently alleviate network overheads. However, existing works on data aggregation are...
As the application scenarios of convolutional neural network (CNN) become more and more complex, the general CNN accelerator based on matrix multiplication has become a new research focus. The existing mapping methods...
Controlled thermonuclear fusion has always been a dream pursued by mankind. However, the physical processes of controlled thermonuclear fusion are complex and require numerical simulations on high performance computing systems. The data these processes generate across spatial, temporal, and temperature scales are too large for mainstream software tools to capture, manage, process, and collate within a reasonable time frame in support of more aggressive fusion physical design. At the same time, fusion ignition can fail because of the distortion of various key physical quantities. Only by decomposing the process step by step and clarifying how the key physical quantities change during the fusion physics process can an effective mechanism be formed to prevent such distortion from causing ignition failure in experimental physics. Using big data in collaboration with artificial intelligence and high performance computing to drive the physical design of fusion is a novel avenue. Data acquisition and pre-processing allow the creation of small sample libraries and deep learning. With the convergence of supervised learning functions, the incorporation of solid/fluid computational methods, and further network layering and cell expansion, the parameters of the physical model conforming to Lawson's criterion will become experimental physical parameters. We find that the new approach is superior to the traditional approach in data capacity, model combination approach, type of material, calculation speed, optimisation of design iteration times, and other respects.
ISBN: (print) 9781728170022
Stream clustering is an important data mining technique for capturing the evolving patterns in real-time data streams. Today's data streams, e.g., IoT events and Web clicks, are usually high-speed and contain dynamically changing patterns. Existing stream clustering algorithms usually follow an online-offline paradigm with a one-record-at-a-time update model, which was designed to run on a single machine. With this sequential update model, these algorithms cannot be efficiently parallelized and fail to deliver the high throughput that stream clustering requires. In this paper, we present DistStream, a distributed framework that can effectively scale out online-offline stream clustering algorithms. To parallelize these algorithms for high throughput, we develop a mini-batch update model with efficient parallelization approaches. To maintain high clustering quality, DistStream's mini-batch update model preserves the update order in all the computation steps during parallel execution, so that it reflects the recent changes in dynamically changing streaming data. We implement DistStream atop Spark Streaming, along with four representative stream clustering algorithms based on DistStream. Our evaluation on three real-world datasets shows that DistStream-based stream clustering algorithms can achieve sublinear throughput gain and comparable (99%) clustering quality to their single-machine counterparts.
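The idea of a mini-batch update that still respects arrival order can be illustrated with a small single-process sketch. This is not DistStream's implementation or its Spark Streaming API. The one-dimensional data, the `MicroCluster` fields, the exponential decay factor, and the radius threshold are all assumptions chosen to keep the example short. Step 1 works on a snapshot of the model and is independent per record, so it is the part that could run in parallel; step 2 applies the results in timestamp order.

```python
from dataclasses import dataclass


@dataclass
class MicroCluster:
    center: float       # 1-D center, for simplicity (assumption)
    weight: float       # decayed count of absorbed records
    last_update: float  # timestamp of the most recent update


def mini_batch_update(clusters, batch, decay=0.5, radius=1.0):
    """Update `clusters` (non-empty list) with `batch`, a list of (timestamp, value).

    Step 1 assigns every record against the batch-start snapshot.
    Step 2 applies the updates in timestamp order so newer records dominate.
    """
    snapshot = [c.center for c in clusters]

    # Step 1: nearest-cluster search per record; no shared state is modified here.
    assignments = []
    for t, x in batch:
        idx = min(range(len(snapshot)), key=lambda i: abs(snapshot[i] - x))
        assignments.append((t, x, idx, abs(snapshot[idx] - x) <= radius))

    # Step 2: order-preserving application of the updates.
    for t, x, idx, absorbed in sorted(assignments, key=lambda a: a[0]):
        if absorbed:
            mc = clusters[idx]
            faded = mc.weight * decay ** (t - mc.last_update)  # age the old weight
            mc.center = (mc.center * faded + x) / (faded + 1.0)
            mc.weight = faded + 1.0
            mc.last_update = t
        else:
            # Records far from every existing cluster seed a new micro-cluster.
            clusters.append(MicroCluster(center=x, weight=1.0, last_update=t))
    return clusters
```

For example, starting from two micro-clusters centered at 0.0 and 10.0 and feeding the batch `[(2.0, 0.5), (1.0, 10.5), (3.0, 50.0)]`, the record with timestamp 1.0 is applied first even though it arrives second in the list, and the value 50.0 opens a third micro-cluster.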