ISBN: (print) 9781665464956
Understanding the factors that contribute to optimal hearing aid fitting and user experience is crucial for increasing the satisfaction and quality of life of patients with hearing loss, as well as for reducing societal and financial burdens. This work proposes a novel framework that uses an encoder-decoder with an attention mechanism (attn-ED) to predict future hearing aid usage and SHAP to explain the factors contributing to each prediction. Experiments demonstrate that attn-ED performs well at predicting future hearing aid usage and that SHAP can be used to calculate the contribution of the different factors affecting usage. The framework aims to establish confidence that AI models can be used in the medical domain when paired with explainable AI (XAI) methods. Moreover, the proposed framework can assist clinicians in determining the nature of interventions.
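SHAP attributions are grounded in Shapley values, so the idea of "contribution of different factors" can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature subsets. This is not the paper's pipeline: the toy model, its three feature names, and the all-zero baseline below are hypothetical, and real SHAP implementations rely on approximations rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one input, by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` switched on.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model (an assumption for illustration only):
# predicted daily usage hours from three made-up factors.
def toy_usage_model(features):
    hours_last_week, noise_exposure, age_decades = features
    return 0.8 * hours_last_week - 0.5 * noise_exposure + 0.1 * age_decades


x = [10.0, 2.0, 7.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_usage_model, x, baseline)
print(phi)
```

For a linear model with a zero baseline, each attribution equals the weight times the feature value, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline.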
An overhead contact system (OCS) is key to supplying power to high-speed railways, and OCS detection is an important measure for ensuring their safe operation. At present, OCS anomaly detection relies mainly on manual analysis of the images regularly collected by the 4C system, which is very inefficient and can easily miss anomalies. Although some deep-learning-based classification and object detection methods can be used for OCS anomaly detection, the small number of anomalous OCS image samples makes it difficult to train deep networks effectively. Considering that most OCS faults are abnormal fasteners, we propose an anomaly detection method based on normal images, called the nested residual encoder-decoder network (NRE-Net). The network consists of two nested encoder-decoder networks that share the encoder, and a residual structure is added to the encoding and decoding branches to enhance feature expression. Experimental results show that the method greatly improves anomaly detection accuracy on the CIFAR-10 dataset and the OCS fastener dataset. Compared with previous state-of-the-art approaches, the F1 score of the proposed method for the two fastener classes in the OCS fastener dataset increases by 10.8% and 11.9%, respectively.
Dear editor, Mathematical expressions are widely used in scientific research, finance, and statistics, and play a significant role in educational activities. For example, if a computer could recognize teachers' handwritten expressions and convert them into standard printed mathematical expressions, this would undoubtedly help improve the effectiveness of lectures. Thus, the question of how to make computers recognize mathematical expressions automatically is highly significant.
Hyperspectral images with very high resolution (VHR-HSI) have become considerably valuable due to their abundant spectral and spatial details. Classification of hyperspectral images (HSIs) is a basic and important procedure for diverse applications. However, low interclass spectral variability and high intraclass spectral variability in VHR-HSI, together with shadows, pedestrians, and a low signal-to-noise ratio, increase the fuzziness between categories. To address these known challenges of VHR-HSI classification, an effective classification method based on an encoder-decoder architecture is proposed. The proposed algorithm is an object-level contextual convolutional neural network built on an improved residual network backbone with 3-D convolution, which fully considers the spatial-spectral and contextual features of HSIs. Two aerial HSIs with different spatial resolutions are used as experimental data. The results show that the overall accuracy of the proposed method improves by 7.42% and 18.82%, respectively, compared with the pixelwise convolutional neural network and the DeepLabv3 algorithm, making the method well suited to HSI classification at very high spatial resolution.
Accurate traffic flow prediction is becoming increasingly important for successful transportation planning, control, management, and information services. Numerous existing models focus on short-term traffic forecasts, but effective long-term forecasting of traffic flows has become a challenging issue in recent years. To solve this problem, this paper proposes a deep learning architecture consisting of two parts: a long short-term memory encoder-decoder structure at the bottom and a calibration layer at the top. In the encoder-decoder model, we propose a hard attention mechanism based on learning similar patterns to enhance neuronal memory and reduce the accumulation of error propagation. To correct some of the missing details, we design a control gate in the calibration layer that learns the predicted data in groups according to different forms. The proposed method is evaluated on real-world datasets and compared with other state-of-the-art methods. The results verify that our model can accurately learn local features and long-term dependencies, and that it achieves better accuracy and stability in long-term sequence prediction.
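The distinction between soft and hard attention mentioned above can be sketched generically as follows. This is a minimal illustration and not the paper's mechanism: dot-product scoring, the helper names, and the toy encoder states are all assumptions. Soft attention blends every encoder state, while hard attention commits to the single best-matching one.

```python
import math


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def soft_attention(query, states):
    """Weighted average of encoder states; weights are a softmax of dot-product scores."""
    scores = [dot(query, s) for s in states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(sc - peak) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(states[0])
    context = [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]
    return context, weights


def hard_attention(query, states):
    """Select the single encoder state with the highest score (argmax)."""
    scores = [dot(query, s) for s in states]
    best = max(range(len(states)), key=lambda i: scores[i])
    return states[best], best


# Hypothetical encoder states and decoder query, for illustration only.
states = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [1.0, 0.0]

soft_context, soft_weights = soft_attention(query, states)
hard_context, hard_index = hard_attention(query, states)
print(soft_weights, hard_index)
```

Because hard attention copies one stored pattern unchanged, it avoids averaging dissimilar states together, which is one plausible reading of how it could limit error accumulation over long horizons.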
Encoder-decoder based automatic speech recognition (ASR) methods are increasingly popular due to their simplified processing stages and low reliance on prior knowledge. Conventional encoder-decoder based approaches usually learn a sequence-to-sequence mapping function from the source speech to target units (e.g., subwords, characters) in an end-to-end manner. However, it is still unclear how to choose the optimal target unit, or the granularity of multiple units. Because increasing the information available for learning sequence-to-sequence mapping functions can generally improve modeling effectiveness, we propose a multi-granularity sequence alignment (MGSA) approach. It aims to enhance cross-sequence interactions between units of different granularity in both the modeling and inference stages of encoder-decoder based ASR. Specifically, a decoder module is designed to generate multi-granularity sequence predictions. We then exploit the latent alignment mapping among units with different levels of granularity by using the decoded multi-level sequences as input for model prediction. The cross-sequence interaction can also be employed to re-calibrate output probabilities in the proposed post-inference algorithm. Experimental results on both the WSJ-80 hrs and Switchboard-300 hrs datasets show the superiority of the proposed method over traditional multi-task methods as well as single-granularity baseline systems.
The detection of bridge cracks is an important task in bridge maintenance, and it also reflects the health of the bridge. However, cracks usually appear as thin strips that differ from the surrounding concrete surface, and most crack detection algorithms do not adapt well to this situation. In this paper, original images of bridge cracks are collected and a data set is obtained through image processing. A bridge crack detection method based on an improved encoder-decoder and a mixed pooling module is proposed. The basic features of the crack images are extracted by an encoder with dilated convolution, which preserves the resolution of the feature map while providing a large receptive field. The feature map then passes through the mixed pooling module, which helps to capture long-range context information and establish long-range dependencies. Finally, the decoder restores the feature map to the original image size and integrates the original features. In comparison experiments under the same experimental conditions, we compare our method with classic image segmentation methods such as PSPNet, U-Net, FCN, and DeepLabv3+. Our method achieves 98.3%, 97.3%, 97.6%, and 84.5% in precision, recall, F1-score, and MIoU, respectively. The results show that our method has certain advantages in the field of crack detection and segmentation.
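The four reported metrics can be computed from a predicted mask and a ground-truth mask as in the following minimal sketch. The tiny 3 x 3 masks are made up, and averaging IoU over the crack and background classes is an assumption about how MIoU is defined here; the paper may use a different convention.

```python
def segmentation_metrics(pred, truth):
    """Precision, recall, F1 and mean IoU for binary masks (1 = crack, 0 = background)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Per-class intersection over union, then the mean over both classes.
    iou_crack = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_background = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    miou = (iou_crack + iou_background) / 2

    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}


# Hypothetical masks for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
truth = [[1, 0, 0],
         [0, 1, 0],
         [0, 1, 0]]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

Because crack pixels are a small fraction of most images, the crack-class IoU tends to be much lower than the background IoU, which may explain why MIoU sits well below the other three scores.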
In this work, two methods are proposed for solving the problem of one-dimensional barcode segmentation in images, with an emphasis on augmented reality (AR) applications. These methods take the partial discrete Radon transform as a building block. The first proposed method uses overlapping tiles for obtaining good angle precision while maintaining good spatial precision. The second one uses an encoder-decoder structure inspired by state-of-the-art convolutional neural networks for segmentation while maintaining a classical processing framework, thus not requiring training. It is shown that the second method's processing time is lower than the video acquisition time with a 1024 x 1024 input on a CPU, which had not been previously achieved. The accuracy it obtained on datasets widely used by the scientific community was almost on par with that obtained using the most-recent state-of-the-art methods using deep learning. Beyond the challenges of those datasets, the method proposed is particularly well suited to image sequences taken with short exposure and exhibiting motion blur and lens blur, which are expected in a real-world AR scenario. Two implementations of the proposed methods are made available to the scientific community: one for easy prototyping and one optimised for parallel implementation, which can be run on desktop and mobile phone CPUs.
Structural health monitoring methods can provide important information for evaluating the operational status of concrete dams by establishing accurate models that predict dam behavior from monitored data. This study proposes a model that uses an encoder-decoder based on a long short-term memory network with a dual-stage attention mechanism (DALSTM) to predict the displacement of concrete arch dams. The encoder-decoder based on a long short-term memory network is a deep learning technique that can perform time series prediction, and the dual-stage attention mechanism focuses on the key information in the dam displacement series to improve performance. The effectiveness and accuracy of the proposed prediction model are analyzed on a high arch dam, using temperature measured in the dam body instead of seasonal functions to represent the thermal effect. Compared with traditional stepwise regression, multiple linear regression models, radial basis function networks, and other deep learning models, the results show that the proposed approach is more accurate and robust for dam health monitoring.
Hydrologic signatures are quantitative metrics that describe a streamflow time series. Examples include annual maximum flow, baseflow index and recession shape descriptors. In this paper, we use machine learning (ML) to learn encodings that are optimal ML equivalents of hydrologic signatures, and that are derived directly from the data. We compare the learned signatures to classical signatures, interpret their meaning, and use them to build rainfall-runoff models in otherwise ungauged watersheds. Our model has an encoder-decoder structure. The encoder is a convolutional neural net mapping historical flow and climate data to a low-dimensional vector encoding, analogous to hydrological signatures. The decoder structure includes stores and fluxes similar to a classical hydrologic model. For each timestep, the decoder uses current climate data, watershed attributes and the encoding to predict coefficients that distribute precipitation between stores and store outflow coefficients. The model is trained end-to-end on the U.S. CAMELS watershed data set to minimize streamflow error. We show that learned signatures can extract new information from streamflow series, because using learned signatures as input to the process-informed model improves prediction accuracy over benchmark configurations that use classical signatures or no signatures. We interpret learned signatures by correlation with classical signatures, and by using sensitivity analysis to assess their impact on modeled store dynamics. Learned signatures are spatially correlated and relate to streamflow dynamics including seasonality, high and low extremes, baseflow and recessions. We conclude that process-informed ML models and other applications using hydrologic signatures may benefit from replacing expert-selected signatures with learned signatures.
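The store-and-flux idea behind the decoder can be sketched as a single timestep of a linear-reservoir update. This is a simplified illustration, not the paper's model: the two-store layout, the function name, and the fixed coefficients are assumptions, whereas in the described model the coefficients are predicted at each timestep from climate data, watershed attributes, and the learned encoding. Evaporation and other losses are ignored.

```python
def step_stores(stores, precipitation, split, outflow_coeff):
    """Advance a set of linear stores by one timestep.

    stores         current storage in each store
    precipitation  water arriving this timestep
    split          fraction of precipitation routed to each store (sums to 1)
    outflow_coeff  fraction of each store released this timestep (0 to 1)

    Returns (new_stores, streamflow).
    """
    # Distribute incoming precipitation between the stores.
    filled = [s + precipitation * f for s, f in zip(stores, split)]
    # Each store releases a fixed fraction of its contents.
    outflows = [k * s for k, s in zip(outflow_coeff, filled)]
    new_stores = [s - q for s, q in zip(filled, outflows)]
    return new_stores, sum(outflows)


# Hypothetical values: a fast-draining store and a slow-draining store.
stores = [10.0, 50.0]
precipitation = 4.0
split = [0.75, 0.25]
outflow_coeff = [0.5, 0.1]

new_stores, streamflow = step_stores(stores, precipitation, split, outflow_coeff)
print(new_stores, streamflow)
```

As long as the split fractions sum to 1, water is conserved: the storage after the step plus the streamflow equals the storage before the step plus the precipitation.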