Detecting pill defects remains challenging, despite recent extensive studies, because of the lack of defective data. In this paper, we propose a pipeline composed of a pill detection module and an autoencoder-based defect detection module to detect defective pills in pill packages. Furthermore, we created a new dataset to test our model. The pill detection module separates the pills in an aluminum-plastic package into individual pills: a shallow segmentation network first segments the pill regions, which are then divided into individual pills using the watershed algorithm. The defect detection module identifies defects in individual pills. It is trained only on normal data, so the module is expected to be unable to reconstruct defective data correctly. In practice, however, a conventional autoencoder reconstructs defective data better than expected, even when trained only on normal data. Hence, we introduce a patch division method to prevent this problem. Patch division splits the output of the convolutional encoder network into patch-wise features and then applies a patch-wise encoder layer in which each latent patch has its own independent weights and biases. This can be interpreted as reconstructing the input image with multiple local autoencoders. Patch division makes the network concentrate only on reconstructing local regions, thereby reducing its overall capacity and preventing it from reconstructing unseen data well. Experiments show that the proposed patch division technique indeed improves defect detection performance and outperforms existing deep-learning-based anomaly detection methods. The ablation study shows the efficacy of patch division and of compression following the concatenation of patch-wise features.
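The patch-wise encoder layer described above can be sketched in a few lines. This is a minimal numpy illustration, not the paper's implementation; the feature-map sizes, patch size, and latent dimension below are illustrative assumptions.

```python
import numpy as np

def patchwise_encode(feat, patch, codes):
    """Split an encoder feature map (C, H, W) into non-overlapping
    patches and project each patch with its OWN weight matrix, so every
    latent patch comes from an independent local autoencoder."""
    C, H, W = feat.shape
    latents = []
    for i, y in enumerate(range(0, H, patch)):
        for j, x in enumerate(range(0, W, patch)):
            p = feat[:, y:y+patch, x:x+patch].reshape(-1)  # flatten one patch
            W_ij, b_ij = codes[(i, j)]                     # independent weight/bias
            latents.append(W_ij @ p + b_ij)
    return np.stack(latents)                               # (num_patches, latent_dim)

# hypothetical sizes: 8x8 feature map with 4 channels, 4x4 patches, 16-d latents
rng = np.random.default_rng(0)
C, H, W, P, D = 4, 8, 8, 4, 16
codes = {(i, j): (rng.standard_normal((D, C * P * P)), np.zeros(D))
         for i in range(H // P) for j in range(W // P)}
z = patchwise_encode(rng.standard_normal((C, H, W)), P, codes)
print(z.shape)  # (4, 16): one latent per local autoencoder
```

Because the weights are not shared across patches, each local autoencoder can only model its own region, which is the capacity reduction the abstract relies on.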
Sparse code multiple access (SCMA) is a code-domain non-orthogonal multiple access (NOMA) technology proposed to meet the access needs of large-scale intelligent terminal devices with high spectrum utilization. To improve the accuracy and reduce the computational complexity of SCMA so that it can accommodate internet of things (IoT) scenarios, we design a new end-to-end autoencoder combining convolutional neural networks (CNNs) and residual networks. A residual network with multi-task learning improves the decoding accuracy, and CNN units are used for SCMA codeword mapping, with sparse connectivity and weight sharing to reduce the number of trainable parameters. Simulations show that this scheme outperforms existing autoencoder schemes in bit error rate (BER) and computational complexity.
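The sparsity that the CNN mapping exploits comes from the SCMA factor graph. As a point of reference (this is the textbook 6-user/4-resource configuration, not necessarily the one used in the paper), the indicator matrix looks like this:

```python
import numpy as np

# Classic 6-user / 4-resource SCMA factor graph (150% overload).
# F[k, j] = 1 means user j transmits on resource k; this sparsity is what
# allows sparsely connected CNN layers for codeword mapping.
F = np.array([[1, 1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])

print(F.sum(axis=0))            # each user occupies d_v = 2 resources
print(F.sum(axis=1))            # each resource carries d_f = 3 users
print(F.shape[1] / F.shape[0])  # overload factor 1.5
```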
The acoustic emission (AE) technique is a widely used nondestructive method for in-situ health monitoring of composite structures. Unlike metals, failure mechanisms in composite structures are complex, involving multiple damage modes, and each damage mode has a distinct AE signature. This work uses deep learning algorithms, namely a convolutional autoencoder (CAE) and a convolutional neural network (CNN), to classify damage modes in carbon fiber-reinforced polymer laminates using AE waveforms. Tensile experiments are carried out on laminates of various stacking sequences, and the acquired raw AE waveforms are transformed into time-frequency planes called spectrograms using the short-time Fourier transform. The CAE is used to retrieve deep features associated with damage modes in the latent space from these spectrograms. Subsequently, k-means is used to cluster the deep features in the latent space. Each cluster is labeled with a damage mode by inspecting its damage signatures in the scalograms. These labeled data are then used to train the CNN. The CNN, once trained, is applied to the AE data of pristine and notched quasi-isotropic specimens, and its ability to classify and identify the damage modes is investigated. The trained CNN achieves satisfactory classification accuracy of 96.9% on pristine quasi-isotropic specimen data and 96.4% on notched quasi-isotropic specimen data. Compared with the prediction accuracy for pure damage modes, the prediction accuracy for mixed-mode damage is slightly lower, at 92.5% for pristine and 91.3% for notched quasi-isotropic specimens. This reduction in accuracy is due to the spectrograms of mixed-mode damage containing energy distributed across multiple frequency bands. By classifying the AE waveforms of both pristine and notched quasi-isotropic specimens, the progression of different damage modes is analyzed through the cumulative count of AE waveforms associated with each damage type. These findings enhance the understanding of damage modes.
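The waveform-to-spectrogram step can be sketched with a plain windowed FFT. This is a minimal stand-in, not the authors' preprocessing; the window length, hop, and the decaying 50 kHz test burst are illustrative assumptions.

```python
import numpy as np

def spectrogram(x, n_fft=64, hop=16):
    """Short-time Fourier transform of a 1-D AE waveform: slide a Hann
    window along the signal and take the FFT magnitude of each frame,
    giving a time-frequency image suitable as CAE/CNN input."""
    win = np.hanning(n_fft)
    frames = [x[s:s+n_fft] * win
              for s in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T  # (freq, time)

# hypothetical AE burst: a decaying 50 kHz tone sampled at 1 MHz
fs = 1_000_000
t = np.arange(2048) / fs
wave = np.exp(-t * 4000) * np.sin(2 * np.pi * 50_000 * t)
S = spectrogram(wave)
print(S.shape)  # (n_fft//2 + 1 frequency bins, number of frames)
```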
Place recognition is a method for determining whether a robot has previously visited the place it currently observes, thus helping the robot correct its accumulated position error; ultimately, the robot can travel long distances more accurately. Conventional image-based place recognition uses features extracted from a bag-of-visual-words (BoVW) scheme or a pre-trained deep neural network. However, the BoVW scheme does not cope well with environmental changes, and a pre-trained deep neural network incurs a high computation time. Therefore, this paper proposes a novel place recognition scheme using an illumination-compensated image-based deep convolutional autoencoder (ICCAE) feature. Instead of reconstructing the raw image, the autoencoder designed to extract ICCAE features is trained to reconstruct an image whose illumination component has been compensated in the logarithm frequency domain. As a result, we can extract ICCAE features, based on a convolutional layer, that are robust to illumination and environmental changes. Additionally, ICCAE features allow faster feature matching than features extracted from existing deep networks. To evaluate the performance of ICCAE feature-based place recognition, experiments were conducted using a public dataset that includes various conditions.
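Compensation in the logarithm frequency domain is in the spirit of homomorphic filtering: illumination varies slowly, so it dominates the low frequencies of the log image. The sketch below is a generic homomorphic filter, not the paper's exact compensation; the filter shape, cutoff, and gains are illustrative assumptions.

```python
import numpy as np

def illumination_compensate(img, cutoff=0.1, gain_lo=0.5, gain_hi=1.5):
    """Homomorphic-style sketch: take the log of the image, attenuate low
    spatial frequencies (illumination) and boost high ones (reflectance
    detail) in the FFT domain, then exponentiate back."""
    log_img = np.log1p(img.astype(float))
    F = np.fft.fft2(log_img)
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    d = np.sqrt(fy**2 + fx**2)
    H = gain_lo + (gain_hi - gain_lo) * (1 - np.exp(-(d / cutoff) ** 2))
    out = np.expm1(np.real(np.fft.ifft2(F * H)))
    return np.clip(out, 0, None)

img = np.outer(np.linspace(10, 200, 32), np.ones(32))  # a pure illumination ramp
flat = illumination_compensate(img)
print(flat.shape)
```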
Accurate diagnosis of breast cancer in histopathology images is challenging due to the heterogeneity of cancer cell growth as well as the variety of benign breast tissue proliferative lesions. In this paper, we propose a practical and self-interpretable invasive cancer diagnosis solution. With minimal annotation information, the proposed method mines contrast patterns between normal and malignant images in a weakly supervised manner and generates a probability map of abnormalities to verify its reasoning. In particular, a fully convolutional autoencoder is used to learn the dominant structural patterns among normal image patches. Patches that do not share the characteristics of this normal population are detected and analyzed by a one-class support vector machine and a one-layer neural network. We apply the proposed method to a public breast cancer image set. Our results, reviewed in consultation with a senior pathologist, demonstrate that the proposed method outperforms existing methods. The obtained probability map could benefit pathology practice by providing visualized verification data and could potentially lead to a better understanding of data-driven diagnosis solutions.
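The core idea of scoring patches against a learned normal population can be demonstrated with a linear stand-in. Below, PCA plays the role of the fully convolutional autoencoder and a simple normalized reconstruction error replaces the OC-SVM/neural-network analysis; all sizes and data are synthetic assumptions.

```python
import numpy as np

def normal_model(patches, k=4):
    """Fit a linear 'autoencoder' (PCA) to flattened normal patches, a
    stand-in for the paper's fully convolutional autoencoder."""
    mu = patches.mean(axis=0)
    _, _, Vt = np.linalg.svd(patches - mu, full_matrices=False)
    return mu, Vt[:k]

def abnormality_map(patches, grid, model):
    """Score each patch by reconstruction error: patches that do not share
    the normal population's dominant structure reconstruct poorly."""
    mu, V = model
    rec = (patches - mu) @ V.T @ V + mu
    err = np.linalg.norm(patches - rec, axis=1)
    p = (err - err.min()) / (np.ptp(err) + 1e-9)   # normalize to [0, 1]
    return p.reshape(grid)

rng = np.random.default_rng(1)
normal = rng.normal(0, 0.1, (200, 16)) + np.linspace(0, 1, 16)  # shared structure
model = normal_model(normal)
test = np.vstack([normal[:15], rng.normal(0, 1.0, (1, 16))])    # 1 abnormal patch
pmap = abnormality_map(test, (4, 4), model)
print(pmap.shape, pmap.argmax())  # the abnormal patch gets the highest score
```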
This paper presents a pansharpening technique based on the non-subsampled contourlet transform (NSCT) and a convolutional autoencoder (CAE). NSCT is exceptionally proficient at representing orientation information and capturing the internal geometry of objects. First, it is used to decompose the multispectral (MS) and panchromatic (PAN) images into high-frequency and low-frequency components using the same number of decomposition levels. Second, a CAE network is trained to generate the original low-frequency PAN images from their spatially degraded versions. The low-resolution multispectral images are then fed into this trained network to generate estimated high-resolution multispectral images. Third, another CAE network is trained to generate the original high-frequency PAN images from their spatially degraded versions, and the output of the low-pass CAE is fed to the trained high-pass CAE to generate estimated high-resolution multispectral images. The final pansharpened image is obtained by injecting the detail map of the spectral bands into the corresponding estimated high-resolution multispectral bands. The proposed method is tested on QuickBird datasets and compared with some existing pansharpening techniques. Objective and subjective results demonstrate the efficiency of the proposed method.
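The final injection step can be sketched in isolation. This replaces the NSCT/CAE machinery with a simple box-filter low-pass and a scalar injection gain, so it is only a structural illustration of detail injection, not the paper's method; all parameters are assumptions.

```python
import numpy as np

def box_blur(img, k=5):
    """Separable box filter used here as a stand-in low-pass."""
    ker = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, ker, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, ker, mode="same"), 0, out)

def inject_details(ms_up, pan, gain=1.0):
    """Add the PAN detail map (PAN minus its low-pass), band by band, to
    the estimated high-resolution MS bands. `gain` is an illustrative
    injection weight."""
    details = pan - box_blur(pan)
    return ms_up + gain * details[None, :, :]

pan = np.random.default_rng(2).random((32, 32))
ms_up = np.random.default_rng(3).random((4, 32, 32))  # 4 upsampled MS bands
sharp = inject_details(ms_up, pan)
print(sharp.shape)  # (4, 32, 32)
```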
Brain network analysis is one of the most effective methods for brain disease diagnosis. Existing studies have shown that exploiting information from multimodal data is a valuable way to improve the effectiveness of brain network analysis. In recent years, deep learning has received increasing attention due to its powerful feature learning capabilities, so it is natural to introduce this tool into multi-modal brain network analysis. However, doing so faces two challenges. One is that brain networks lie in a non-Euclidean domain, so the convolution kernels of deep learning cannot be applied to brain networks directly. The other is that most existing multi-modal brain network analysis methods cannot make full use of the complementary information from distinct modalities. In this paper, we propose a multi-modal non-Euclidean brain network analysis method based on community detection and a convolutional autoencoder (M2CDCA), which solves the above two problems simultaneously in one framework. First, we construct the functional and structural brain networks, respectively. Second, we design a multi-modal interactive community detection method that exploits the structural modality to guide the functional modality in detecting community structure, and then readjusts the node distribution so that the adjusted brain network preserves the underlying community information and is more suitable for convolution kernels. Finally, we design a dual-channel autoencoder model with a self-attention mechanism to capture hierarchical and highly non-linear features, and then comprehensively use the information from both modalities for classification. We evaluate our method on an epilepsy dataset; the experimental results show that our method outperforms several state-of-the-art methods.
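The node-redistribution idea, that reordering nodes by community makes the adjacency matrix convolution-friendly, can be shown on a toy graph. This is a generic illustration with made-up labels, not the paper's interactive community detection.

```python
import numpy as np

def reorder_by_community(adj, labels):
    """Permute an adjacency matrix so nodes in the same community become
    contiguous; local convolution kernels then see communities as
    spatially coherent blocks instead of scattered entries."""
    order = np.argsort(labels, kind="stable")
    return adj[np.ix_(order, order)], order

# hypothetical 6-node network with two interleaved communities
adj = np.zeros((6, 6), int)
for i, j in [(0, 2), (2, 4), (0, 4), (1, 3), (3, 5), (1, 5)]:
    adj[i, j] = adj[j, i] = 1
labels = np.array([0, 1, 0, 1, 0, 1])
blocked, order = reorder_by_community(adj, labels)
print(order)     # [0 2 4 1 3 5]
print(blocked)   # block-diagonal: two fully connected 3-node communities
```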
Estimating the joint torques of the lower limbs in human gait, known as motion intent understanding, is of great significance in the control of lower limb exoskeletons. This study presents novel soft smart shoes designed for motion intent learning at unspecified walking speeds using long short-term memory with a convolutional autoencoder. The smart shoes serve as a wearable sensing system consisting of a soft instrumented sole and two 3D motion sensors that are nonintrusive to the human gait and comfortable for the wearers. A novel data structure, the "sensor image", is developed for the measured ground reaction force and foot motion. A convolutional autoencoder is established to fuse the multisensor datasets and extract the hidden features of the sensor images, which represent the spatial and temporal correlations among the data. Then, long short-term memory is exploited to learn the multiscale, highly nonlinear input-output relationships between the acquired features and the joint torques. Experiments were conducted on five subjects at three walking speeds (0.8 m/s, 1.2 m/s, and 1.6 m/s). Results showed that 98% of the r^2 values were acceptable in individual testing and 75% of the r^2 values were acceptable in inter-individual testing. The proposed method is able to learn the joint torques in human gait and has satisfactory generalization properties.
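A "sensor image" of this kind can be built by stacking synchronized channels over a time window. The channel counts and window length below are illustrative assumptions, not the shoes' exact sensor layout.

```python
import numpy as np

def build_sensor_image(grf, imu_a, imu_b, window=50):
    """Stack synchronized sensor channels row-wise, then cut the stream
    into windows: each (channels x time) image exposes spatial
    (across-sensor) and temporal correlations to 2-D convolutions."""
    frame = np.vstack([grf, imu_a, imu_b])       # (channels, T)
    return np.stack([frame[:, s:s+window]
                     for s in range(0, frame.shape[1] - window + 1, window)])

T = 200
grf = np.random.default_rng(4).random((8, T))    # e.g. 8 pressure cells in the sole
imu_a = np.random.default_rng(5).random((6, T))  # 3-axis accel + gyro, sensor A
imu_b = np.random.default_rng(6).random((6, T))  # 3-axis accel + gyro, sensor B
imgs = build_sensor_image(grf, imu_a, imu_b)
print(imgs.shape)  # (4, 20, 50): four 20-channel x 50-sample sensor images
```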
Derived from knowledge bases, knowledge graphs represent knowledge expressions in graphs, which use nodes and edges to denote entities and relations conceptually. A knowledge graph can be described in textual triple form, consisting of head entities, tail entities, and the relations between entities. To represent the elements of knowledge graphs, knowledge graph embedding techniques map entities and relations into continuous vector spaces as numeric vectors for computational efficiency. Convolution-based knowledge graph embedding models show promising performance for knowledge graph representation learning. However, the input of these neural-network-based models is frequently handcrafted, and the models may suffer from low efficiency in their feature extraction procedure. In this paper, a convolutional autoencoder is proposed for knowledge graph representation learning that takes entity pairs as input, aiming to obtain the corresponding hidden relation representation. In addition, a bi-directional relation encoding network is used to represent the semantics of entities under relation patterns in both directions, serving as an encoder whose output initializes the convolutional autoencoder. Experiments are conducted as a link prediction task on standard datasets, including WN18RR, Kinship, NELL-995, and FB15k-237. Moreover, input embedding matrices composed of different ingredients are designed to evaluate the performance of the convolutional autoencoder. The results demonstrate that our model is effective in learning representations from entity feature interactions.
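The entity-pair input and the convolutional feature interaction can be sketched as follows. The embeddings, the entity names, and the kernel sizes are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def entity_pair_input(emb, head, tail):
    """Stack head and tail entity embeddings into a 2-row matrix: the
    entity-pair input from which a convolutional autoencoder can learn a
    hidden relation representation."""
    return np.stack([emb[head], emb[tail]])          # (2, d)

def conv_features(x, kernels):
    """Slide each 2 x k kernel across the pair matrix so the two
    entities' features interact inside every receptive field."""
    d, k = x.shape[1], kernels.shape[2]
    return np.stack([[np.sum(x[:, s:s+k] * K) for s in range(d - k + 1)]
                     for K in kernels])

# hypothetical 8-d embeddings and 4 random 2x3 kernels
rng = np.random.default_rng(7)
emb = {"paris": rng.standard_normal(8), "france": rng.standard_normal(8)}
x = entity_pair_input(emb, "paris", "france")
feats = conv_features(x, rng.standard_normal((4, 2, 3)))
print(x.shape, feats.shape)  # (2, 8) (4, 6)
```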
The design and verification of memristor crossbar circuits and systems demand computationally efficient models. Solving a memristor crossbar with a conventional device-level memristor model in a circuit simulator such as the simulation program with integrated circuit emphasis (SPICE) is extremely time-consuming. Hence, we propose a neural-network-based memristor crossbar modeling method, XBarNet. By transforming memristor crossbar modeling into pixel-to-pixel regression, XBarNet avoids the iterative procedure of the conventional SPICE method, accelerating the runtime significantly. Meanwhile, XBarNet models the interconnect resistance and the nonlinear I-V behavior of memristor crossbars, which minimizes the simulation errors. We first propose a feature extraction method to bridge a memristor crossbar circuit and a neural network. Then, a network based on the convolutional autoencoder architecture is developed, and filter pruning is applied to XBarNet to reduce the runtime computational cost. Experimental results show that the proposed XBarNet achieves over a 78x runtime speedup and a 1.7x memory reduction with only 0.28% relative error compared to the SPICE simulator.
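For orientation, the quantity a crossbar model must predict is the set of column readout currents. Under the ideal zero-wire-resistance assumption these are just a matrix-vector product; the learned model's job is to capture the deviation from this ideal once interconnect resistance and device nonlinearity are included. The conductance and voltage values below are illustrative.

```python
import numpy as np

def ideal_crossbar_currents(G, v_in):
    """Ideal (zero wire-resistance, linear device) crossbar readout:
    column currents are I = G^T v. A SPICE solution with interconnect
    resistance and nonlinear I-V would deviate from this baseline."""
    return G.T @ v_in

G = np.array([[1e-4, 2e-4],
              [3e-4, 4e-4]])   # cell conductances in siemens (illustrative)
v = np.array([1.0, 0.5])       # row input voltages
I = ideal_crossbar_currents(G, v)
print(I)  # column currents: [2.5e-04 4.0e-04]
```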