Anomaly detection and explanation in large volumes of real-world medical data, such as data pertaining to COVID-19, pose several challenges. First, the data are time series. Typical time-series data describe the behavior of a single object over time, whereas medical data contain time series belonging to multiple entities. There may therefore be multiple subsets of records such that the records in each subset, which belong to a single entity, are temporally dependent, while the records in different subsets are unrelated. Moreover, the records in a subset contain different types of attributes, some of which must be grouped in a particular manner to make the analysis meaningful. Anomaly detection techniques need to be customized for time-series data belonging to multiple entities. Second, anomaly detection techniques fail to explain the cause of outliers to experts. This is critical for new diseases and pandemics, where current knowledge is insufficient. We propose to address these issues by extending our existing work, IDEAL, an LSTM-autoencoder-based approach for data quality testing of sequential records that explains constraint violations in a manner understandable to end users. The extension (1) uses a novel two-level reshaping technique that splits COVID-19 data sets into multiple temporally dependent subsequences and (2) adds a data visualization plot to further explain the anomalies and evaluate the level of abnormality of the subsequences detected by IDEAL. We performed two systematic evaluation studies of our anomalous subsequence detection. One study uses aggregate data, including the number of cases, deaths, and recoveries and the hospitalization rate as a percentage, collected from a COVID tracking project, the New York Times, and Johns Hopkins for the same time period. The other study uses COVID-19 patient medical records obtained from the Anschutz Medical Center health data warehouse. The results are promising and indicate that
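The abstract does not spell out how the two-level reshaping works, so the following is only a minimal sketch of one plausible reading: level one groups records by entity, and level two cuts each entity's time-ordered records into fixed-length overlapping subsequences. The function name `two_level_reshape` and the parameters `window` and `stride` are hypothetical and are not taken from IDEAL.

```python
import numpy as np

def two_level_reshape(entity_ids, timestamps, features, window, stride=1):
    """Hypothetical sketch: group records by entity, then window each series.

    Returns an array of shape (n_windows, window, n_features) and the
    entity that owns each window.
    """
    entity_ids = np.asarray(entity_ids)
    timestamps = np.asarray(timestamps)
    features = np.asarray(features, dtype=float)
    n_features = features.shape[1]

    windows, owners = [], []
    # Level 1: one subset of records per entity (e.g. per patient or region).
    for entity in np.unique(entity_ids):
        mask = entity_ids == entity
        order = np.argsort(timestamps[mask], kind="stable")
        series = features[mask][order]  # this entity's records in time order
        # Level 2: overlapping subsequences within the entity's series.
        # Entities with fewer than `window` records contribute nothing.
        for start in range(0, len(series) - window + 1, stride):
            windows.append(series[start:start + window])
            owners.append(entity)

    if not windows:
        return np.empty((0, window, n_features)), np.array(owners)
    return np.stack(windows), np.array(owners)
```

Because windows never span two entities, records from unrelated entities are not treated as temporally dependent, which is the property the abstract asks for. The resulting three-dimensional array has the (samples, time steps, features) layout that sequence models commonly expect.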
Smart substations are crucial cyber-physical systems and are prone to cyber-attacks. In this paper, we propose a novel anomaly detection mechanism tailored to IEC 61850-based network traffic. Three types of traffic features are used to comprehensively characterize the network traffic within a time window. To eliminate the subjectivity of manually selecting traffic features, we apply the Discrete Wavelet Transform (DWT) in a second extraction step to derive deep features. We propose an improved Locally Linear Embedding (LLE) algorithm that reduces the dimensionality of the deep feature vectors more effectively. An LSTM (Long Short-Term Memory)-based autoencoder network then learns to reconstruct normal traffic time-series behavior and uses the reconstruction error to detect anomalies. We assess the performance of the proposed mechanism through comprehensive experiments on a real smart substation. The results indicate that the proposed mechanism runs quickly and achieves satisfactory detection performance.
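The detection step, using reconstruction error to flag anomalies, can be sketched independently of the autoencoder itself. The sketch below assumes the reconstructions have already been produced by some trained model, scores each window by mean squared error, and sets the threshold as a quantile of errors on normal traffic. The quantile rule and all function names are assumptions for illustration; the abstract does not state how the threshold is chosen.

```python
import numpy as np

def reconstruction_errors(windows, reconstructions):
    """Mean squared error per window, averaged over time steps and features."""
    windows = np.asarray(windows, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    return ((windows - reconstructions) ** 2).mean(axis=(1, 2))

def fit_threshold(normal_errors, quantile=0.99):
    """Assumed rule: take a high quantile of errors seen on normal traffic."""
    return float(np.quantile(normal_errors, quantile))

def flag_anomalies(errors, threshold):
    """A window is flagged when its error exceeds the threshold."""
    return np.asarray(errors) > threshold
```

The idea is that a model trained only on normal traffic reconstructs normal windows well and unfamiliar windows poorly, so a large error is treated as evidence of an anomaly. Lowering `quantile` flags more windows and typically raises the false-alarm rate.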
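Early warning is an important topic in online learning: identifying students at risk of failing early helps teachers deliver timely, personalized instructional interventions. This work uses deep learning models to analyze students' micro-level behavior patterns in order to improve early-warning effectiveness, and proposes an early warning model for students at risk of failing that combines LSTM-autoencoder feature processing with attention weight computation (LSTM-autoencoder and attention based early warning model, LAA). The method applies an LSTM-autoencoder to process features from student behavior time-series data and uses an attention mechanism to compute the key predictors. Experimental results show that LAA achieves higher recall than the baseline models, identifies low-interaction and non-persistent students more effectively, and allows instructional interventions to begin earlier. In addition, the method can identify the key weeks and behaviors that affect grades, and can be used to help teachers guide online teaching.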
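As a rough illustration of how attention weights can point to key weeks, the sketch below scores each weekly feature vector against a query vector, normalizes the scores with a softmax, and returns both the weights and the weighted summary. Dot-product scoring, the function name `attention_pool`, and the `query` parameter are assumptions; the abstract does not describe LAA's actual attention formulation.

```python
import numpy as np

def attention_pool(hidden_states, query):
    """Dot-product attention over time steps (assumed formulation).

    hidden_states: (T, d) array, one encoded vector per week.
    query:         (d,) vector; in a trained model this would be learned.
    Returns (weights, context): weights has shape (T,) and sums to 1,
    context is the weighted sum of hidden states with shape (d,).
    """
    hidden_states = np.asarray(hidden_states, dtype=float)
    query = np.asarray(query, dtype=float)
    scores = hidden_states @ query
    scores = scores - scores.max()  # stabilize the softmax numerically
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()
    context = weights @ hidden_states
    return weights, context
```

Weeks with larger weights contribute more to the summary vector passed to the classifier, which is one way such a model can surface the weeks that matter most for predicting failure.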