Tensor completion aims at filling in the missing elements of an incomplete tensor based on its partial observations, which is a popular approach for image inpainting. Most existing methods for visual data recovery can be categorized into traditional optimization-based and neural-network-based methods. The former usually adopt a low-rank assumption to handle this ill-posed problem, enjoying good interpretability and generalization. However, as visual data are only approximately low rank, handcrafted low-rank priors may not capture the complex details properly, limiting the recovery performance. Neural-network-based methods, despite their impressive performance in image inpainting, require sufficient training data for parameter learning, and their generalization ability on unseen data is a concern. In this paper, combining the advantages of these two distinct approaches, we propose a tensor Completion neural network (CNet) for visual data completion. The CNet consists of two parts, namely, the encoder and the decoder. The encoder is designed by exploiting the CANDECOMP/PARAFAC decomposition to produce a low-rank embedding of the target tensor, whose mechanism is interpretable. To compensate for the drawback of the low-rank constraint, a decoder consisting of several convolutional layers is introduced to refine the low-rank embedding. The CNet only uses the observations of the incomplete tensor to recover its missing entries and is thus free from large training datasets. Extensive experiments on inpainting color images, grayscale video sequences, hyperspectral images, color video sequences, and light field images are conducted to showcase the superiority of CNet over state-of-the-art methods in terms of restoration performance.
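The CP-based encoder followed by a convolutional decoder lends itself to a compact sketch. The following is a minimal illustration in PyTorch, not the authors' released code, assuming a third-order HxWxC target tensor; the rank, channel widths, and layer depths are placeholder choices:

```python
import torch
import torch.nn as nn

class CPEncoder(nn.Module):
    """Low-rank (CANDECOMP/PARAFAC) embedding of an H x W x C tensor."""
    def __init__(self, height, width, channels, rank=32):
        super().__init__()
        # One learnable factor matrix per tensor mode.
        self.A = nn.Parameter(torch.randn(height, rank) * 0.01)
        self.B = nn.Parameter(torch.randn(width, rank) * 0.01)
        self.C = nn.Parameter(torch.randn(channels, rank) * 0.01)

    def forward(self):
        # Sum of rank-1 terms: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r].
        return torch.einsum('ir,jr,kr->ijk', self.A, self.B, self.C)

class ConvDecoder(nn.Module):
    """Refines the low-rank embedding with a few convolutional layers."""
    def __init__(self, channels=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def completion_loss(pred, target, mask):
    """Fit only the observed entries of the incomplete tensor."""
    return ((pred - target)[mask.bool()] ** 2).mean()

# Example wiring (shapes are illustrative): the CP reconstruction is
# permuted to NCHW before refinement by the decoder.
#   x_lr  = encoder()                                  # (H, W, C)
#   x_hat = decoder(x_lr.permute(2, 0, 1)[None])       # (1, C, H, W)
```

In this reading, training minimizes completion_loss between the decoder output and the observed entries only, which is what frees the model from external training data.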
The accurate detection of mesoscale convective systems (MCSs) is crucial for meteorological monitoring due to their potential to cause significant destruction through severe weather phenomena, such as hail, thunderstorms, and heavy rainfall. However, existing methods for MCS detection mostly target single-frame detection, which considers only static characteristics and ignores the temporal evolution over the life cycle of an MCS. In this article, we propose a novel encoder-decoder neural network named mesoscale convective system detection network (MCSDNet) to detect MCS regions. MCSDNet has a simple architecture and is easy to extend. Different from previous models, MCSDNet targets multi-frame detection and leverages multiscale spatiotemporal information in remote sensing imagery (RSI). To the best of our knowledge, this is the first work to utilize multiscale spatiotemporal information to detect MCS regions. First, we design a multiscale spatiotemporal information module to extract multilevel semantics from different encoder levels, which enables our model to extract more detailed spatiotemporal features. Second, the spatiotemporal mix unit (STMU), a dual spatiotemporal attention mechanism, is introduced to MCSDNet to capture both intraframe and interframe features. Finally, we present the MCS remote sensing image (MCSRSI) dataset, the first publicly available dataset for multi-frame MCS detection based on the FY-4A satellite. We also conduct several experiments on MCSRSI and find that the proposed MCSDNet achieves the best performance on the MCS detection task compared with other baseline methods. We hope that the combination of our open-access dataset and promising results will encourage future research on MCS detection and provide a robust framework for related tasks in atmospheric science. Our code is available at: https://***/250HandsomeLiang/***
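The abstract does not detail the internals of the STMU; purely as a hedged sketch of what a dual intraframe/interframe attention over a short frame sequence could look like (channel counts and head numbers are assumptions, and channels must be divisible by the number of heads), one possible PyTorch form is:

```python
import torch
import torch.nn as nn

class DualSpatioTemporalAttention(nn.Module):
    """Illustrative dual attention: intraframe (spatial) then interframe (temporal)."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.temporal_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):
        # x: (B, T, C, H, W) - a short sequence of feature maps.
        b, t, c, h, w = x.shape
        # Intraframe: attend over the H*W spatial positions of each frame.
        s = x.reshape(b * t, c, h * w).transpose(1, 2)            # (B*T, HW, C)
        s = s + self.spatial_attn(s, s, s, need_weights=False)[0]
        # Interframe: attend over the T frames at each spatial position.
        v = s.reshape(b, t, h * w, c).permute(0, 2, 1, 3).reshape(b * h * w, t, c)
        v = v + self.temporal_attn(v, v, v, need_weights=False)[0]
        return v.reshape(b, h * w, t, c).permute(0, 2, 3, 1).reshape(b, t, c, h, w)
```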
ISBN:
(Print) 9783031761966; 9783031761973
Recently, point cloud semantic segmentation has played an important role in real-world applications such as autonomous driving and robotics. In this context, while recognized as an efficient semantic segmentation model with a good balance between performance and complexity, SqueezeSegV2 is still too heavy for resource-constrained devices. In this paper, we propose Lite-GrSeg, a compact and effective semantic segmentation model inspired by SqueezeSegV2. Lite-GrSeg adopts a cutting-edge architecture that augments SqueezeSegV2 with group convolution and spatially separable convolution to reduce the model's complexity. Additionally, Lite-GrSeg introduces a novel structure called the Spatial Context Aggregation Module (Spatial-CAM) to enhance the model's discriminability. In simulations benchmarked on the PandaSet dataset, Lite-GrSeg significantly reduces computational complexity and model size while presenting competitive segmentation accuracy compared to SqueezeSegV2, making it a compelling choice for lightweight applications and opening up possibilities for deployment on resource-constrained IoT devices.
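A hedged sketch of the kind of lightweight block the abstract describes, replacing a full 3x3 convolution with a grouped 1x1 projection and a spatially separable 3x1/1x3 pair, is shown below; the exact layer arrangement in Lite-GrSeg is not specified in the abstract, so channel counts and group sizes here are illustrative (in_ch and out_ch are assumed divisible by groups):

```python
import torch.nn as nn

class LiteGroupSeparableBlock(nn.Module):
    """Illustrative lightweight block: grouped 1x1 convolution followed by a
    spatially separable 3x1 / 1x3 pair in place of a full 3x3 convolution."""
    def __init__(self, in_ch, out_ch, groups=4):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, out_ch, kernel_size=1, groups=groups)
        self.conv3x1 = nn.Conv2d(out_ch, out_ch, kernel_size=(3, 1), padding=(1, 0))
        self.conv1x3 = nn.Conv2d(out_ch, out_ch, kernel_size=(1, 3), padding=(0, 1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.reduce(x))
        return self.act(self.conv1x3(self.conv3x1(x)))

# A full 3x3 convolution costs ~9 * C_in * C_out multiplies per pixel; the grouped
# 1x1 plus (3x1 + 1x3) pair costs roughly C_in * C_out / groups + 6 * C_out^2.
```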
As one of the most important smart grid features, non-intrusive load monitoring (NILM) has become a practical technology for identifying users' energy consumption behavior. Conventional studies are usually based on the assumption that only one appliance is active or that the signature database of all appliances is already known, and existing deep learning-based algorithms need to train a model for each target appliance. This paper, however, proposes an energy disaggregation network (EDNet) with a deep encoder-decoder architecture to remove these unrealistic assumptions and reduce the size of the network, achieving latency-free NILM with only one model. First, the blind source separation and mask mechanisms used in speech recognition are creatively adopted for energy disaggregation. Then, the on/off states of each target appliance are detected based on the results of energy disaggregation. Finally, a personalized signature database with detailed states is constructed based on dynamic time warping (DTW) using the energy disaggregation and state detection results, removing NILM's dependence on prior information. Full comparisons with previous work show that our proposed algorithms outperform state-of-the-art methods, meaning that the load consumption behavior of residential users can be monitored with high accuracy without sub-metered information or other prior knowledge. Furthermore, the proposed EDNet has significantly fewer parameters, moving NILM toward offline and real-time load monitoring.
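As a rough illustration of the mask mechanism borrowed from speech separation (the actual EDNet architecture is not specified here), an encoder-decoder can predict one per-appliance mask over the aggregate reading and multiply it back onto the mains signal; layer sizes below are placeholders:

```python
import torch
import torch.nn as nn

class MaskDisaggregator(nn.Module):
    """Illustrative mask mechanism: predict one mask per appliance over the
    aggregate power sequence, as in mask-based speech separation."""
    def __init__(self, n_appliances, hidden=64):
        super().__init__()
        self.encoder = nn.Conv1d(1, hidden, kernel_size=9, padding=4)
        self.decoder = nn.Conv1d(hidden, n_appliances, kernel_size=9, padding=4)

    def forward(self, aggregate):
        # aggregate: (B, 1, T) mains reading.
        feats = torch.relu(self.encoder(aggregate))
        masks = torch.sigmoid(self.decoder(feats))   # (B, n_appliances, T), in [0, 1]
        return masks * aggregate                     # per-appliance power estimates

# On/off state detection can then threshold each estimated trace, e.g.
#   states = (per_appliance_power > on_threshold)
```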
Background: Osteoarthritis (OA) is a common degenerative joint inflammation that may lead to disability. Although OA is not lethal, the disease remarkably affects patients' mobility and daily lives. Detecting OA at an early stage allows for early intervention and may slow down disease progression. Introduction: Magnetic resonance imaging is a useful technique for visualizing soft tissues within the knee joint. Cartilage delineation in magnetic resonance (MR) images helps in understanding disease progression. Convolutional neural networks (CNNs) have shown promising results in computer vision tasks, and various encoder-decoder-based segmentation neural networks have been introduced in the last few years. However, the performance of such networks is unknown in the context of cartilage delineation. Methods: This study trained and compared 10 encoder-decoder-based CNNs in performing cartilage delineation from knee MR images. The knee MR images were obtained from the Osteoarthritis Initiative (OAI). The benchmarking process compares the CNNs based on physical specifications and segmentation performance. Results: LadderNet has the fewest trainable parameters, with a model size of 5 MB. UNetVanilla achieved the best performance, with 0.8369, 0.9108, and 0.9097 on JSC, DSC, and MCC, respectively. Conclusion: UNetVanilla can serve as a benchmark for cartilage delineation in knee MR images, while LadderNet serves as an alternative if there are hardware limitations during production.
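The three reported metrics are standard; for reference, a small helper computing JSC (Jaccard), DSC (Dice), and MCC from binary masks could look as follows (empty-mask edge cases are not handled):

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """JSC (Jaccard), DSC (Dice), and MCC for a pair of binary masks."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    tp = float(np.logical_and(pred, truth).sum())
    fp = float(np.logical_and(pred, ~truth).sum())
    fn = float(np.logical_and(~pred, truth).sum())
    tn = float(np.logical_and(~pred, ~truth).sum())
    jsc = tp / (tp + fp + fn)
    dsc = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return jsc, dsc, mcc
```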
ISBN:
(Digital) 9789887581581
ISBN:
(Print) 9798350366907
In this paper, we present a deep reinforcement learning (DRL) based strategy for optimizing the scheduling of satellite on-orbit services. The orbital maneuvers require the servicing satellite to consecutively rendezvous with multiple targets to execute its on-orbit missions. The principal aim of our optimization approach is to determine the most advantageous servicing sequence, thereby minimizing the overall cost incurred by propulsion maneuvers. To address this challenge, we introduce an attention-based encoder-decoder neural network and train its parameters using the REINFORCE algorithm with a greedy rollout baseline. Experimental results across diverse scenarios validate the efficacy and superiority of the proposed algorithm. The chief contribution of this work lies in formulating the satellite on-orbit service scheduling optimization problem as an extended traveling salesman problem and in the resulting DRL-based methodology.
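The training scheme named here, REINFORCE with a greedy rollout baseline, can be sketched as follows; the `model`/`baseline_model` interface (sampled vs. greedy decoding, returned cost and log-probability) is an assumption for illustration, not the paper's API:

```python
import torch

def reinforce_with_rollout_baseline(model, baseline_model, instances):
    """Illustrative REINFORCE step with a greedy-rollout baseline: the gradient
    signal is (sampled cost - greedy baseline cost) * log-prob of the sample."""
    # Assumed interface: the model returns the tour cost and the summed
    # log-probability of the decoded servicing sequence for each instance.
    cost, log_prob = model(instances, decode="sample")
    with torch.no_grad():
        baseline_cost, _ = baseline_model(instances, decode="greedy")
    advantage = cost - baseline_cost        # positive => worse than the baseline
    loss = (advantage * log_prob).mean()
    loss.backward()
    return loss.item()
```

The baseline network is typically a frozen copy of the policy that is refreshed when the policy significantly outperforms it, which keeps the advantage estimate low-variance without learning a separate critic.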
To keep their code up-to-date with the newest functionalities as well as bug fixes offered by third-party libraries, developers often need to replace an old version of third-party libraries (TPLs) with a newer one. However, choosing a suitable version for a library to be upgraded is complex and susceptible to error. So far, Dependabot is the only tool that supports library upgrades; however, it targets only security fixes and analyzes libraries singularly, without considering the whole set of related libraries. In this work, we propose DeepLib as a practical approach to learning upgrades for third-party libraries that have been performed by similar clients. Such upgrades are considered safe, i.e., they do not trigger any conflict, since in the training clients the libraries already co-exist without causing any compatibility or dependency issues. In this way, the upgrades provided by DeepLib allow developers to maintain a harmonious relationship with other libraries. By mining the development history of projects, we build migration matrices to train deep neural networks. Once trained, the networks are used to forecast the subsequent versions of the related libraries, exploiting the well-founded background of the machine translation domain. As input, DeepLib accepts a set of library versions and returns a set of future versions to which developers should upgrade the libraries. The framework has been evaluated on two real-world datasets curated from the Maven Central Repository. The results show promising outcomes: DeepLib can recommend the next version for a single library as well as for a set of libraries under investigation. At its best performance, DeepLib gains a perfect match for several libraries, earning an accuracy of 1.0.
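The abstract frames the recommendation as a machine-translation-style problem over library versions; a hedged sketch of one possible encoder-decoder formulation is below. The paper does not state which network family is used, so the GRU layers, shared embedding, and sizes are assumptions:

```python
import torch.nn as nn

class VersionSeq2Seq(nn.Module):
    """Illustrative sequence-to-sequence model: encode the current set of
    library:version tokens, decode the recommended upgraded versions."""
    def __init__(self, vocab_size, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_tokens, tgt_tokens):
        # src_tokens / tgt_tokens: (B, L) integer ids of library:version tokens.
        _, state = self.encoder(self.embed(src_tokens))
        dec_out, _ = self.decoder(self.embed(tgt_tokens), state)
        return self.out(dec_out)   # logits over candidate versions per position
```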
Recently, deep reinforcement learning (RL) technologies have been considered a feasible solution for tackling combinatorial problems in various research and engineering areas. Motivated by this recent success of RL-based approaches, in this paper, we focus on how to utilize RL technologies in the context of real-time system research. Specifically, we first formulate the problem of fixed-priority assignment for multiprocessor real-time scheduling, which has long been considered challenging in the real-time system community, as a combinatorial problem. We then propose the RL-based priority assignment model Panda, which employs (i) a taskset embedding mechanism driven by attention-based encoder-decoder deep neural networks, enabling efficient extraction of useful features from the dynamic relations of periodic tasks. We also present two optimization schemes tailored to adopting RL for real-time task scheduling problems: (ii) response time analysis (RTA)-based policy gradient RL and guided learning schemes, which facilitate the training of the Panda model. To the best of our knowledge, our approach is the first to employ RL for real-time task scheduling. Through various experiments, we show that Panda is competitive with well-known heuristic algorithms for real-time task scheduling on a multiprocessor platform, and it often outperforms them in large-scale non-trivial settings, e.g., achieving an average 7.7% improvement in schedulability ratio for a testing configuration of 64-sized tasksets and an 8-processor platform.
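The RTA used in scheme (ii) refers to the classic fixed-point response-time recurrence for fixed-priority preemptive scheduling; as a reference sketch for the single-processor case (Panda itself targets a multiprocessor platform, where such a test would be applied per processor after task assignment):

```python
import math

def response_time_analysis(tasks):
    """Classic RTA: R_i = C_i + sum over higher-priority tasks j of
    ceil(R_i / T_j) * C_j, iterated to a fixed point.
    `tasks` is a list of (C, T) pairs ordered from highest to lowest priority;
    returns True iff every task meets its implicit deadline D_i = T_i."""
    for i, (c_i, t_i) in enumerate(tasks):
        r = c_i
        while True:
            interference = sum(math.ceil(r / t_j) * c_j for c_j, t_j in tasks[:i])
            r_next = c_i + interference
            if r_next > t_i:
                return False          # task i misses its deadline
            if r_next == r:
                break                 # fixed point reached
            r = r_next
    return True

# Example: three tasks (C, T) = (1, 4), (2, 6), (3, 12) are schedulable.
assert response_time_analysis([(1, 4), (2, 6), (3, 12)])
```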
ISBN:
(Print) 9781728129433
FPGA-based deep learning accelerators have become important for high-throughput and low-power inference at the edge. In this paper, we develop a computing-in-memory (CIM) accelerator using a binary SegNet (BSEG) for real-time scene text recognition (STR) at the edge. The accelerator performs highly efficient pixel-wise character classification under the CIM architecture, with massive bit-level parallelism as well as an optimized pipeline for low latency on the critical path. The BSEG is obtained during training with a small model size of 2.1 MB and a high classification accuracy of over 90% on the ICDAR-03 and ICDAR-13 datasets. The RTL-level FPGA accelerator processes STR with an energy efficiency of 351.7 GOPs/W and a throughput of 307 fps, handling one frame of 128x32 pixels with a latency of 3.875 ms.
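A binary network maps multiply-accumulate operations to XNOR-popcount arithmetic, which is what makes bit-level CIM parallelism attractive; a small numeric illustration of that identity (not the accelerator's RTL) is:

```python
import numpy as np

def binary_dot(weights_pm1, activations_pm1):
    """With values in {-1, +1} packed as bits {0, 1}, a dot product of length N
    equals 2 * popcount(XNOR(w, a)) - N, so multiplies become bitwise ops."""
    n = len(weights_pm1)
    w_bits = np.asarray(weights_pm1) > 0
    a_bits = np.asarray(activations_pm1) > 0
    xnor = ~(w_bits ^ a_bits)                # bitwise XNOR on boolean arrays
    return 2 * int(xnor.sum()) - n

# Sanity check against the ordinary dot product:
assert binary_dot([1, -1, 1], [1, 1, -1]) == int(np.dot([1, -1, 1], [1, 1, -1]))
```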
ISBN:
(Print) 9781450376822
To improve automation, increase efficiency, and maintain high quality in steel production, applying modern machine learning techniques to help detect steel defects has been a research focus in the steel industry, as an unprecedented revolution in image semantic segmentation has been witnessed in the past few years. In the traditional production process of steel materials, localizing and classifying surface defects on a steel sheet manually is inefficient and error-prone. Therefore, achieving automated detection of steel surface defects at the image pixel level is a key challenge and an urgent, critical issue to be addressed. In this paper, to accomplish this task, we apply a series of machine learning algorithms for real-time semantic segmentation, utilizing neural networks with encoder-decoder architectures based on U-Net and the feature pyramid network (FPN). The image dataset of steel defects is provided by Severstal, the largest steel company in Russia, through a featured code competition in the Kaggle community. The results show that an ensemble of several neural networks with encoder-decoder architectures performs well in terms of both time cost and segmentation accuracy. Our algorithms achieve Dice coefficients over 0.915 and 0.905 at a speed of over 1.5 images per second on the public and private test sets on the Kaggle platform, respectively, placing in the top 2% among all teams in the competition.
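As an illustrative sketch of the ensembling step (the competition models themselves are not reproduced here), per-pixel probabilities from the U-Net- and FPN-style networks can simply be averaged and thresholded into a binary defect mask, which is then scored with the Dice coefficient as in the metrics helper above; the model list, input shape, and threshold below are assumptions:

```python
import torch

def ensemble_predict(models, image, threshold=0.5):
    """Average per-pixel probabilities across segmentation models, then
    threshold into a binary defect mask. `image` is a (B, C, H, W) tensor."""
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(m(image)) for m in models]).mean(dim=0)
    return (probs > threshold).float()
```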