Construction safety monitoring is a significant issue in practical engineering. Unfortunately, current techniques in this field still depend heavily on manual monitoring. To detect abnormal scenarios during the construction process automatically, a method was proposed for detecting and localizing abnormal scenarios in time and space. The method consists of three components: (1) an I3D-AE video prediction model, which extracts video features from multiple I3Ds and reconstructs the video by 3D deconvolution; (2) a spatial localization module, AS-CAM, which determines the location of abnormal areas by back-propagating through the I3D-AE; (3) a temporal parameter, S-t, which is used to calculate the abnormal time period. The effectiveness of the method was verified on a dataset, and the results were plotted as ROC curves. Under the AUC evaluation metric, the proposed method exceeded 0.9 on the frame-level test and 0.76 on the pixel-level test. It can therefore assist construction managers in improving the efficiency of construction safety management.
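As a rough illustration of the frame-level evaluation mentioned above, the following sketch scores each frame by its mean squared reconstruction error and computes the ROC AUC from those scores. It is a minimal sketch under stated assumptions: the abstract does not define S-t or the exact anomaly score, so the reconstruction-error score, the helper names `frame_scores` and `roc_auc`, and the toy data are all hypothetical stand-ins, not the authors' method.

```python
from typing import List, Sequence


def frame_scores(originals: Sequence[Sequence[float]],
                 reconstructions: Sequence[Sequence[float]]) -> List[float]:
    """Mean squared reconstruction error per frame.

    Each frame is a flat sequence of pixel values (assumed representation).
    A higher score means the frame was reconstructed worse, which is taken
    here as a sign of abnormality.
    """
    return [
        sum((a - b) ** 2 for a, b in zip(frame, recon)) / len(frame)
        for frame, recon in zip(originals, reconstructions)
    ]


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC via the pairwise (Mann-Whitney) identity.

    labels: 1 for abnormal frames, 0 for normal frames.
    AUC is the fraction of (abnormal, normal) pairs in which the abnormal
    frame gets the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: four frames with hand-picked scores and ground-truth labels.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(roc_auc(toy_scores, toy_labels))  # 0.75
```

The pairwise form is quadratic in the number of frames, which is fine for a sketch; a sorted-rank implementation would be the usual choice for long videos.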
Authors: Zhang, Donglin; Wu, Xiao-Jun
Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Jiangsu, Peoples R China
Jiangnan Univ, Jiangsu Prov Engn Lab Pattern Recognit & Computat, Wuxi 214122, Jiangsu, Peoples R China
Hashing-based methods have achieved great success in cross-modal similarity search, owing to their fast query speed and low storage cost. However, several challenging problems remain to be solved: 1) Many approaches are sensitive to noise and outliers, because the ℓ2 norm is used in the objective function and the error may be amplified. 2) Most existing methods adopt a relaxation or rounding scheme to generate binary codes, causing a large quantization loss. 3) Many supervised cross-media algorithms use a large n x n matrix to preserve the similarity relationship, which leads to heavy computation and makes them unscalable. To mitigate these challenges, we develop a novel cross-media search algorithm, robust and discrete matrix factorization hashing, dubbed RDMH. The method takes a two-step strategy. In the first phase, the ℓ2,1 norm is used to improve robustness, which makes our model insensitive to noise and outliers. We learn the hash codes directly with the proposed discrete optimization method instead of a relaxation scheme, avoiding the large quantization loss. Moreover, RDMH correlates the hash codes and semantic labels directly instead of manipulating the large similarity matrix. In the second phase, we propose an autoencoder strategy to learn the hash functions, so that more valuable information is preserved and the hash functions become more powerful. Comprehensive experiments on several databases demonstrate the superior performance and efficacy of the developed RDMH. © 2021 Elsevier Ltd. All rights reserved.
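To make the robustness argument concrete, the sketch below compares the ℓ2,1 norm of a residual matrix with its squared Frobenius (ℓ2-type) loss. It is only an illustration under assumptions: rows are treated as samples (some papers use columns), and the function names and toy residuals are hypothetical, not taken from RDMH.

```python
import math
from typing import Sequence


def l21_norm(matrix: Sequence[Sequence[float]]) -> float:
    """l2,1 norm: the sum over rows of each row's Euclidean norm.

    Each sample (row) contributes its error magnitude linearly,
    so a single outlier row is not squared.
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def squared_frobenius(matrix: Sequence[Sequence[float]]) -> float:
    """Squared Frobenius norm: the sum of all squared entries.

    Each row contributes the square of its error magnitude,
    so outlier rows dominate the loss.
    """
    return sum(x * x for row in matrix for x in row)


# Toy residuals: one ordinary row and one outlier row ten times larger.
residual = [[3.0, 4.0], [30.0, 40.0]]
print(l21_norm(residual))           # 55.0   (5 + 50)
print(squared_frobenius(residual))  # 2525.0 (25 + 2500)
```

In this toy case the outlier row accounts for about 91% of the ℓ2,1 value but about 99% of the squared loss, which is the amplification effect the abstract refers to.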
We present a new network whereby an agent navigates in a 3D environment to find a target object according to a language-based instruction. Such a task is challenging because the agent has to understand the instruction correctly and take a series of actions to locate a target among others without colliding with obstacles. The core of our proposed network consists of a coarse-to-fine fusion model that fuses language and vision and an autoencoder that encodes visual information effectively. An asynchronous reinforcement learning algorithm is then used to coordinate detailed actions to complete the navigation task. Extensive evaluation using three different levels of the navigation task in the 3D Vizdoom environment suggests that our model outperforms the state of the art. To see whether the proposed network can deal with a real-world 3D environment for the navigation task, it is combined with Rec-BERT, which is based on REVERIE. The result suggests that it performs better, especially for unseen cases, and that it is also useful for visualizing what the agent pays attention to, and when, while it navigates in a complex indoor environment. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).
Research on recommender systems has grown over the last decade, and these systems provide valuable services that increase companies' revenue. While most existing recommender systems rely on either a content-based approach or a collaborative approach, hybrid approaches can improve recommendation accuracy by combining the two. Even though many algorithms have been proposed using such methods, further improvement is still necessary. This paper proposes a recommender system method using a graph-based model associated with the similarity of users' ratings in combination with users' demographic and location information. By utilizing the advantages of autoencoder feature extraction, we extract new features based on all combined attributes. Using the new set of features for clustering users, our proposed approach (GHRS) outperformed many existing recommendation algorithms on recommendation accuracy. The method also achieved significant results on the cold-start problem. All experiments were performed on the MovieLens dataset because it includes users' side information.
The autoencoder (AE) is widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract the features of images from different sensors/modalities without considering the differences between them. In addition, these methods cannot fuse the images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder and skip connections. The two encoders extract the representative features of images from the two sources respectively, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without using an additional fusion layer or any handcrafted fusion rules. Skip connections are added to help retain the details and salient features in the fused image. Moreover, W-Net is lightweight, with fewer parameters than the existing AE-based methods. The experimental results show that our fusion network performs well in terms of subjective and objective visual assessments compared with other state-of-the-art fusion methods. It can fuse the images very fast (e.g., the fusion time of 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed.
Electronic medical records (EMRs) support the development of machine learning algorithms for predicting disease incidence, patient response to treatment, and other healthcare events. So far, however, most algorithms have been centralized, taking little account of the decentralized, non-independent and identically distributed (non-IID), and privacy-sensitive characteristics of EMRs that can complicate data collection, sharing and learning. To address this challenge, we introduced a community-based federated machine learning (CBFL) algorithm and evaluated it on non-IID ICU EMRs. Our algorithm clustered the distributed data into clinically meaningful communities that captured similar diagnoses and geographical locations, and learnt one model for each community. Throughout the learning process, the data was kept local at hospitals, while locally computed results were aggregated on a server. Evaluation results show that CBFL outperformed the baseline federated machine learning (FL) algorithm in terms of Area Under the Receiver Operating Characteristic Curve (ROC AUC), Area Under the Precision-Recall Curve (PR AUC), and communication cost between hospitals and the server. Furthermore, the performance differences between communities could be explained by how dissimilar one community was to the others.
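The idea of learning one model per community while keeping data at the hospitals can be sketched as a server-side aggregation step. The abstract does not state the aggregation rule, so this sketch assumes a FedAvg-style sample-weighted average of locally computed weight vectors; the community assignment is taken as given, and the function name and tuple layout are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One update from one hospital: (community id, number of local samples, weights).
Update = Tuple[str, int, Sequence[float]]


def aggregate_by_community(updates: Sequence[Update]) -> Dict[str, List[float]]:
    """Average hospital weight vectors separately within each community.

    Only weight vectors and sample counts reach the server; raw records
    stay at the hospitals. Each community's model is the sample-weighted
    mean of the updates assigned to it (assumed FedAvg-style rule).
    """
    weighted_sums: Dict[str, List[float]] = {}
    sample_counts: Dict[str, int] = {}
    for community, n_samples, weights in updates:
        if community not in weighted_sums:
            weighted_sums[community] = [0.0] * len(weights)
            sample_counts[community] = 0
        for i, value in enumerate(weights):
            weighted_sums[community][i] += n_samples * value
        sample_counts[community] += n_samples
    return {
        community: [v / sample_counts[community] for v in sums]
        for community, sums in weighted_sums.items()
    }


# Toy round: two hospitals in community "a", one in community "b".
round_updates = [
    ("a", 1, [0.0, 2.0]),
    ("a", 3, [4.0, 2.0]),
    ("b", 2, [1.0, 1.0]),
]
print(aggregate_by_community(round_updates))
# {'a': [3.0, 2.0], 'b': [1.0, 1.0]}
```

Keeping a separate average per community means that a hospital's update influences only the model of hospitals with similar data, which is one plausible way to reduce the harm done by non-IID data in a single global average.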
Anomalies are samples that significantly deviate from the rest of the data and their detection plays a major role in building machine learning models that can be reliably used in applications such as data-driven design and novelty detection. The majority of existing anomaly detection methods either are exclusively developed for (semi) supervised settings, or provide poor performance in unsupervised applications where there are no training data with labeled anomalous samples. To bridge this research gap, we introduce a robust, efficient, and interpretable methodology based on nonlinear manifold learning to detect anomalies in unsupervised settings. The essence of our approach is to learn a low-dimensional and interpretable latent representation (aka manifold) for all the data points such that normal samples are automatically clustered together and hence can be easily and robustly identified. We learn this low-dimensional manifold by designing a learning algorithm that leverages either a latent map Gaussian process (LMGP) or a deep autoencoder (AE). Our LMGP-based approach, in particular, provides a probabilistic perspective on the learning task and is ideal for high-dimensional applications with scarce data. We demonstrate the superior performance of our approach over existing technologies via multiple analytic examples and real-world datasets.
Alzheimer's disease (AD) is a neurological disease characterized by complex molecular pathways and neural tissue complexity. Investigations into its molecular structure and mechanisms are ongoing, and no therapeutically useful genetic risk factors have been identified. As a result, brain imaging such as magnetic resonance imaging (MRI) and cognitive testing have been used to diagnose AD. Recently, various independent studies have generated and evaluated large-scale omics data from various brain regions, including the prefrontal cortex. Therefore, strategies for detecting or predicting AD must be developed using these data. In addition, integrating these omics data can be a valuable resource for gaining a more thorough understanding of the disease. This study developed a machine-learning-based approach for predicting AD using DNA methylation and gene expression datasets. Managing these data while building a prediction model is challenging, since they contain tens of thousands of features and have a high-dimensional, low-sample-size (HDLSS) characteristic. To address this problem, we employed an autoencoder (AE) to generate a reduced, continuous feature representation. We then applied multiple machine-learning approaches to the encoded data to predict AD and calculated the accuracy and area under the curve (AUC). Furthermore, we showed that combining DNA methylation and gene expression data can increase the prediction accuracy. Finally, we compared our method to a state-of-the-art technique and found that the proposed methodology outperformed it, improving the accuracy and AUC by 9.5% and 10.6%, respectively.
Networks that describe complex systems in nature are increasingly coupled and interacting, and effective modeling of complex coupling and interaction information is an important research direction in artificial intelligence. Representation learning provides a paradigm for solving such problems, but current network representation learning methods struggle to capture the coupling and interaction information in complex networks. In this paper, we propose a novel deep attributed network representation learning framework (RolEANE), which can effectively preserve the highly nonlinear coupling and interactive network topological structure and attribute information. We design two different structural role proximity enhancement strategies for the deep autoencoder in the framework, so that it can efficiently capture network topological structure and attribute information. In addition, the neighbor-modified Skip-Gram model in our framework can efficiently and seamlessly integrate network topological structure and attribute information, and selecting an appropriate representation learning output strategy can significantly improve the final performance of the algorithm. Experiments on four real datasets show that our method consistently outperforms the state-of-the-art network representation learning methods. On the node classification task, the average performance is 4.52%-10.28% higher than that of the best baseline method; on the link prediction task, the average performance is 4.63% higher than that of the best baseline method. © 2020 Elsevier B.V. All rights reserved.
Inspired by the ideas of topological mechanics and geometric phase, the topological phononic beam governed by topological invariants has seen growing research interest owing to the generation of a topologically protected interface state that can be characterized by the geometric Zak phase. The interface mode concentrates the maximum amount of wave energy at the interface of topologically variant beams, with minimal losses and with wave energy fields decaying away from it. The present study has developed a deep-learning-based autoencoder (AE) to inversely design topological phononic beams with invariants. By applying the transfer matrix method, a rigorous analytical model is developed to solve the wave dispersion relation for longitudinal and bending elastic waves. The geometric Zak phase is determined from the phase of the reflected wave. The developed analytical models are used to generate input data for training the AE. Upon successful training, the network prediction is validated by finite element numerical simulations and an experimental test on the manufactured prototype. The developed AE successfully predicts the interface modes for the combination of topologically variant phononic beams. The study findings may provide a new perspective for the inverse design of metamaterial beam and plate structures in solid and computational mechanics. The work is a step towards deep learning networks suitable for the inverse design of phononic crystals and metamaterials, enabling design optimization and performance enhancements.