With the increasing use of wearable electrocardiogram (ECG) monitoring devices, it is necessary to develop models and algorithms that can analyze the large amounts of ECG data obtained in real time. Accurate ECG delineation is key to assisting cardiologists in diagnosing cardiac diseases. The main objective of this study is to design a delineation model based on the encoder-decoder structure to detect different heartbeat waveforms, including P-waves, QRS complexes, T-waves, and No waves (NW), as well as the onset and offset of these waveforms. First, a standard dilated convolution module (SDCM) was introduced into the encoder path so that the model could extract more informative ECG signal features. Subsequently, bidirectional long short-term memory (BiLSTM) was added to the encoding structure to capture temporal features. Moreover, the feature sets of the ECG signals at each level of the encoder path were connected to the decoder for multi-scale decoding, mitigating the information loss caused by pooling during encoding. Finally, the proposed model was trained and tested on both the QT and LU databases, and it achieved accurate results compared to other state-of-the-art methods. On the QT database, the average accuracy of ECG waveform classification was 96.90%, and on the LU database it was 95.40%. In addition, average F1 values of 99.58% and 97.05% were achieved in the ECG delineation task on the QT and LU databases, respectively. The results show that the proposed ECG_SegNet model has good flexibility and reliability when applied to ECG delineation, and it is a reliable method for analyzing ECG signals in real time.
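The dilated convolution mentioned in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the SDCM from the paper, whose layer configuration is not given here; the function names `dilated_conv1d` and `receptive_field` are hypothetical and the kernel values are arbitrary. It only shows how spacing the kernel taps apart widens the stretch of signal each output sample sees.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation form, as used in deep learning)."""
    # Distance between the first and last kernel tap on the input axis.
    span = (len(kernel) - 1) * dilation
    out = []
    for start in range(len(signal) - span):
        # Each tap k reads the input sample located k * dilation steps after `start`.
        out.append(sum(w * signal[start + k * dilation] for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    ecg_like = [1, 2, 3, 4, 5]          # toy samples, not real ECG data
    print(dilated_conv1d(ecg_like, [1, 0, -1], dilation=1))
    print(dilated_conv1d(ecg_like, [1, 0, -1], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
```

Stacking layers with growing dilation rates enlarges the receptive field without pooling, which is one reason dilated convolutions are often paired with encoder-decoder segmentation models.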
Regional rainfall-runoff modeling is a classic and significant research topic in hydrological sciences. Currently, the predominant modeling approach is developing data-driven models. This study proposes a rainfall-runoff model named ED-TimesNet (encoder-decoder-based TimesNet), which consists of convolutional neural networks. It transforms a one-dimensional time series into a two-dimensional matrix based on frequency-domain partitioning rules and subsequently employs a two-dimensional visual backbone to learn both local and global features of the hydrological time series. Compared to LSTM-based models and Transformer models, this model learns both intra-period and inter-period variations in hydrological series, simultaneously focusing on the relationships between adjacent and non-adjacent time points. It alleviates the temporal ambiguity problem inherent in attention mechanisms. This research validates the performance of the ED-TimesNet model in regional rainfall-runoff modeling tasks using the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset. The model achieves a median and mean NSE of 0.8049 and 0.7808, respectively, across 448 basins, outperforming the benchmark LSTM, VIC, and mHM models, and achieving comparable performance to the Transformer model. This paper does not address the model's performance on ungauged basins. The method of predicting runoff based on the periodic features of hydrological data provides a novel perspective for hydrological sciences.
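Two ideas in the abstract above lend themselves to a short sketch: the Nash-Sutcliffe efficiency (NSE) used as the score, and the reshaping of a one-dimensional series into a two-dimensional matrix. The code below is an illustration under assumptions, not the ED-TimesNet implementation: the period is passed in by hand rather than derived from the frequency domain, zero padding is an arbitrary choice, and the names `nse` and `fold_by_period` are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def fold_by_period(series, period):
    """Reshape a 1D series into rows of length `period` (zero-padded at the end).

    Within a row, neighbours are adjacent time points (intra-period variation);
    down a column, neighbours are one period apart (inter-period variation).
    """
    pad = (-len(series)) % period
    padded = list(series) + [0.0] * pad
    return [padded[i:i + period] for i in range(0, len(padded), period)]


if __name__ == "__main__":
    flow_obs = [1.0, 2.0, 3.0]           # toy values, not CAMELS data
    print(nse(flow_obs, [1.0, 2.0, 3.0]))
    print(nse(flow_obs, [2.0, 2.0, 2.0]))
    print(fold_by_period([1, 2, 3, 4, 5], 2))
```

An NSE of 1 means a perfect match, and a value of 0 means the simulation is no better than predicting the observed mean.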
Regular detection and repair for lining cracks are necessary to guarantee the safety and stability of *** development of computer vision has greatly promoted structural health *** study proposes a novel encoder–decoder structure, CrackRecNet, for semantic segmentation of lining segment cracks by integrating improved VGG-19 into the U-Net *** image acquisition equipment is designed based on a camera, 3-dimensional printing (3DP) bracket and two laser rangefinders. A tunnel concrete structure crack (TCSC) image data set, containing images collected from a double-shield tunnel boring machine (TBM) tunnel in China, was *** data preprocessing operations, such as brightness adjustment, pixel resolution adjustment, flipping, splitting and annotation, 2880 image samples with pixel resolution of 448×448 were *** model was implemented by PyTorch in PyCharm processed with 4 NVIDIA TITAN V *** the experiments, the proposed CrackRecNet showed better prediction performance than U-Net, TernausNet, and *** paper also discusses GPU parallel acceleration effect and the crack maximum width quantification.
Traveling shortest path planning, encompassing the Traveling Salesman Problem (TSP) in graph theory, holds profound significance. The motivation for addressing the TSP stems from its critical application in real-world scenarios, such as logistics, where optimal routing can substantially reduce costs and improve service efficiency. Furthermore, TSP-like challenges play a pivotal role in assisting travelers to chart the optimal itinerary, encompassing all landmarks in the least distance or time and concluding at the departure site. This optimization not only streamlines travel routes but also economizes time and energy, ensuring maximal sightseeing within a confined timeframe. Recognizing the limitations of current solutions in achieving high efficiency and accuracy simultaneously, we propose an innovative Association & Integration-based encoder-decoder structure tailored for solving the Traveling Salesman Problem, i.e., A&*** proposed structure comprises four blocks: the information linkage space, dual-path integration encoder, node encoder, and representation decoder. Specifically, the information linkage space constructs associations among hidden information between input sequence samples. The dual-path integration encoder extracts and merges the original representations of the sequence with associated representations. The node encoder extracts current sequence representations, while the representation decoder block computes the probability distribution of sequence samples, completing the combinatorial optimization of the entire sequence. In the experimental evaluation, we utilized three different metrics: Average Tour Length (ATL), Optimality Gap (OG), and Evaluation Time (ET). We compared the proposed method with classical approximation methods and various state-of-the-art deep learning approaches. The experimental results show that our A&I-ED-TSP structure achieved the best ATLs of 5.704, 12.770, 17.981, 21.979, and 25.293 for TSP instances of TSP50, TS
The ionosphere is vital for satellite navigation and radio communication, but observational limitations necessitate ionospheric forecasting. The least squares collocation (LSC) method is commonly used for global navigation satellite system (GNSS)-based global ionospheric forecasting, though its accuracy and stability need improvement. This study introduces two optimized models based on the ConvLSTM cell with an encoder-decoder structure to enhance forecasting performance. Using seven years of historical data, the model provides stable forecasts for the following year. Tests from 2015 to 2020 show that optimization reduces root mean square error (RMSE) by 10.159%-16.363% compared to the unoptimized method. The encoder-decoder ConvLSTM-B model achieves the best performance, lowering RMSE by 2.031%-8.547% compared to the ConvLSTM-A model. These results highlight the effectiveness of the proposed approach in improving ionospheric forecast accuracy.
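The percentage figures in the abstract above are relative RMSE reductions. A minimal sketch of that arithmetic follows; the function names are hypothetical and the numbers in the usage lines are made up for illustration, not taken from the study.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between a forecast and reference values."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def relative_reduction_percent(baseline_rmse, improved_rmse):
    """Percentage by which `improved_rmse` is lower than `baseline_rmse`."""
    return 100.0 * (baseline_rmse - improved_rmse) / baseline_rmse


if __name__ == "__main__":
    # Illustrative values only.
    print(rmse([0.0, 0.0], [3.0, 4.0]))
    print(relative_reduction_percent(2.0, 1.8))
```

A positive result means the second model has the lower error; a negative result means it is worse than the baseline.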
In road extraction from remote sensing images, the road environment is complex and roads are often occluded by trees, buildings, and other objects, which makes it difficult to extract practical (continuous and complete) road information. To address these problems, we propose a joint attention encoder-decoder network (JAED-Net) for road extraction from remote sensing images. First, JAED-Net adopts a modified residual network as the backbone for road feature extraction. A joint attention module is added to the encoder to enhance the network's ability to learn and express road features. Then, strip convolution is added to the decoder so that the network retains more spatial features, such as the width and connectivity of roads, during upsampling. Finally, because the ratio of road to background pixels in remote sensing images is unbalanced, a hybrid weighted loss function is introduced to train the network and keep training stable. Experimental validation of the proposed network is performed on three publicly available datasets.
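The abstract above does not say which terms make up its hybrid weighted loss. As a sketch of the general idea only, the code below combines a class-weighted binary cross-entropy with a Dice term, a common pairing for imbalanced segmentation. The combination, the `pos_weight` and `alpha` parameters, and all function names are assumptions and should not be read as the JAED-Net loss.

```python
import math


def weighted_bce(probs, labels, pos_weight):
    """Binary cross-entropy where road (label 1) pixels are scaled by `pos_weight`."""
    eps = 1e-7  # clip probabilities so log() never receives 0
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def dice_loss(probs, labels, smooth=1.0):
    """1 - Dice coefficient; depends on overlap, so it is less sensitive to class ratio."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * intersection + smooth) / (sum(probs) + sum(labels) + smooth)


def hybrid_loss(probs, labels, pos_weight, alpha=0.5):
    """Assumed mix: alpha * weighted BCE + (1 - alpha) * Dice loss."""
    return alpha * weighted_bce(probs, labels, pos_weight) + (1 - alpha) * dice_loss(probs, labels)


if __name__ == "__main__":
    labels = [1, 0, 0, 0]                # one road pixel among background
    probs = [0.1, 0.1, 0.1, 0.1]         # model misses the road pixel
    print(hybrid_loss(probs, labels, pos_weight=1.0))
    print(hybrid_loss(probs, labels, pos_weight=5.0))
```

Raising `pos_weight` increases the penalty for missed road pixels, which counteracts the tendency to predict background everywhere.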
ISBN: (print) 9798350359329; 9798350359312
Encoders are widely used in image captioning, but captions generated by current methods may miss target objects and may not match the image content closely enough. To solve these problems, we propose a coarse-to-fine image captioning method based on a dual encoder-decoder framework, which provides a mechanism for discovering and correcting omissions and enables the model to generate a complete image description. First, an image feature extractor is designed that extracts both global and local information from the image to obtain a richer image representation. Second, a dual encoder-decoder framework is designed, consisting of a coarse-grained encoder-decoder and a fine-grained encoder-decoder. The coarse-grained encoder-decoder requires only the original image features as input, which are processed by a transformer to produce a coarse text description. In addition, an image feature auto-enhancement module is proposed to detect objects missing from the coarse text and enhance their feature expression. Finally, the fine-grained encoder-decoder takes both the image features and the coarse caption as input and generates the final fine-grained caption after multi-modal information fusion. Experimental results on the MSCOCO dataset show that the proposed method outperforms previous image captioning methods, achieving a BLEU-4 score of 39.7 and a CIDEr-A score of 121.6.
Current surveillance systems used on long-span bridges have monitoring dead zones, resulting in incomplete vehicle trajectories, and abnormal vehicle trajectories occurring within these dead zones are highly likely to lead to traffic accidents. Obtaining complete vehicle trajectories within dead zones is challenging, yet crucial for identifying abnormal trajectories. To tackle this challenge, this article presents a joint trajectory completion (JTC) framework, which enables the completion of vehicle trajectories in dead zones with similar vehicle appearances and diverse camera views. An appearance-spatiotemporal fusion (ASTF) module is utilized to integrate motion spatiotemporal and appearance information to enhance vehicle representations. A Transformer-based bi-directional trajectory completion (TBTC) module with two encoder-decoder structures is further constructed to complete trajectories. Finally, a mathematical assessment strategy is established according to traffic rules to identify abnormal trajectories. Extensive experiments on the Hangzhou Bay dataset demonstrate the effectiveness of the proposed JTC method, which achieves an improvement in mAP of 7.4% for vehicle re-identification (ReID) and reduces the average RMSE by at least 0.2123 m for completion of vehicle trajectories. The completion of trajectories within dead zones facilitates the subsequent effective identification of abnormal trajectories and rapid accident warnings.
Enabled by hierarchical convolutions and nonlinear mappings, recent action recognition studies have continuously boosted performance with spatiotemporal modelling. In general, motion cues are essential in video-oriented tasks, while existing approaches aggregate the spatial and temporal signatures via specially designed modules in the middle or output stages. To highlight the benefit provided by temporal motion, in this paper, we propose a simple but effective MOTion Estimator (MOTE) to generate the motion patterns from every single frame, avoiding complex dense-frame input. In particular, MOTE follows an encoder-decoder structure, which takes the short-term motion features generated by the pretrained dense-frame network as the learning target. The spatial information of a single frame is utilized to estimate the instantaneous motion appearance. It can support the expression of vulnerable regions, such as the 'hand' in 'waving hands,' which would otherwise be suppressed in the feature maps as the 'hand' suffers from motion blur. The training process of MOTE is independent of the action recognition system. Therefore, the trained MOTE can be transplanted to the input end of existing action recognition methods to provide instantaneous motion estimation as feature enhancement according to practical requirements. Our experiments performed on Something-Something V1, V2, Kinetics-400, and Diving48 verify the effectiveness of the proposed method.
The non-local network provides a pioneering approach for capturing long-range dependency by aggregating query-specific global context into each query location; however, it applies identical weights to each channel of the feature maps and ignores the differences between feature channels. We design a novel tensor attention module (TAM), which integrates context information along the spatial and channel dimensions by introducing a learnable bias parameter tensor, so that the feature at each location of each channel can aggregate the features from all other locations. Motivated by SE-Net, we propose a novel second-order covariance attention module (SCAM) to enhance the feature correlation between different channel maps through second-order statistics and a local cross-channel interaction strategy. We take the encoder-decoder segmentation network DeepLabv3+ as the baseline and add the attention modules TAM and SCAM to the encoder for semantic segmentation (TCNet). Experimental results on the PASCAL VOC 2012 and Cityscapes datasets show that the proposed network performs better than other state-of-the-art segmentation networks.
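The second-order statistics that SCAM relies on can be pictured as a channel-by-channel covariance matrix of the feature maps. The sketch below computes only that matrix, using the population form (division by the number of spatial positions). How SCAM turns the covariance into attention weights is not described in the abstract, so that step is left out. The function name `channel_covariance` and the toy input are hypothetical.

```python
def channel_covariance(features):
    """Covariance between channels.

    `features` is a list of C channels, each a flat list of N spatial values.
    Returns a C x C matrix (list of lists), normalised by N.
    """
    n = len(features[0])
    centered = []
    for channel in features:
        mean = sum(channel) / n
        centered.append([value - mean for value in channel])
    # Entry (i, j) measures how channels i and j vary together across positions.
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
        for ci in centered
    ]


if __name__ == "__main__":
    toy_features = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]   # 2 channels, 3 positions
    for row in channel_covariance(toy_features):
        print(row)
```

The matrix is symmetric, and its diagonal holds each channel's variance; first-order approaches such as global average pooling keep only the per-channel means and discard this cross-channel information.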