In aquaponics, where fish and plants coexist in a symbiotic environment, closely monitoring nitrate levels in the water is crucial because of their profound impact on both fish and plant health. Traditional nitrate measurement methods are often time-consuming and costly. Various approaches, including first-principles models, IoT-based sensors, and machine learning-based soft sensors, have been attempted to address this challenge. However, these efforts face obstacles such as expensive sensors, infrequent data collection, multistage data processing with a limited range of sensor types, and the need for regular maintenance such as cleaning and calibration. In addition, varied environmental conditions affect sensor suitability across different water environments, and some existing machine learning-based soft sensors have proven inaccurate. In response, soft sensors, especially deep learning-based ones, have gained prominence in industrial applications for their adaptability and accuracy, providing real-time insight into complex processes without requiring expensive hardware. This study introduces a solution based on Long Short-Term Memory (LSTM) networks, a deep learning architecture known for capturing complex temporal patterns and therefore well suited to modeling and predicting changes in nitrate concentration in aquaponics. The model was trained on extensive data collected from various aquaponic ponds. In evaluation, it achieved an MSE of 0.00074 and an R-squared score of 0.98, showing potential for scaling up to commercial applications, benefiting aquaponics operations, supporting researchers, and enhancing the sustainability and productivity of aquaponic systems.
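As a rough illustration of the modeling setup described in this abstract, the sketch below shows an LSTM regressor that maps a sliding window of past water-quality readings to the next nitrate concentration, trained with MSE (the metric reported above). The window length, number of sensor features, and layer sizes are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code): an LSTM regressor mapping a window
# of past water-quality readings to the next nitrate value. Window length,
# feature count, and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class NitrateLSTM(nn.Module):
    def __init__(self, n_features: int = 4, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)      # predict one nitrate value

    def forward(self, x):                          # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])            # use the last time step

model = NitrateLSTM()
criterion = nn.MSELoss()                           # matches the MSE metric reported above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random stand-in data
x = torch.randn(32, 24, 4)                         # 32 windows of 24 readings x 4 sensors
y = torch.randn(32, 1)                             # next-step nitrate (normalized)
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```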
There exist three approaches to multilingual and crosslingual automatic speech recognition (MCL-ASR): supervised pretraining with phonetic transcription, supervised pretraining with graphemic transcription, and self-supervised pretraining. We find that pretraining with phonetic supervision has so far been underappreciated for MCL-ASR, although conceptually it is more advantageous for information sharing between different languages. This paper explores pretraining with weak phonetic supervision for data-efficient MCL-ASR, an approach we call Whistle. We relax the requirement of gold-standard, human-validated phonetic transcripts and obtain International Phonetic Alphabet (IPA) based transcriptions by leveraging the LanguageNet grapheme-to-phoneme (G2P) models. We construct a common experimental setup based on the CommonVoice dataset, called CV-Lang10, with 10 seen languages and 2 unseen languages. A set of experiments is conducted on CV-Lang10 to compare, as fairly as possible, the three approaches for MCL-ASR under this common setup. The experiments demonstrate the advantages of phoneme-based models (Whistle) for MCL-ASR in terms of speech recognition for seen languages, crosslingual performance for unseen languages with different amounts of few-shot data, overcoming catastrophic forgetting, and training efficiency. We find that when training data is more limited, phoneme supervision achieves better results than subword supervision and self-supervision, thereby providing higher data efficiency.
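To make the idea of weak phonetic supervision concrete, the sketch below converts grapheme transcripts into IPA phoneme targets via a grapheme-to-phoneme lookup. The tiny pronunciation table and the function name are hypothetical stand-ins for the LanguageNet G2P models the paper actually uses; only the general idea of a shared IPA output space is taken from the abstract.

```python
# Illustrative sketch only: turning grapheme transcripts into IPA phoneme
# targets for supervised pretraining. The toy pronunciation table below is a
# hypothetical stand-in for the LanguageNet G2P models used in the paper.
from typing import Dict, List

TOY_G2P: Dict[str, List[str]] = {
    "hello": ["h", "ə", "l", "oʊ"],
    "world": ["w", "ɜ", "l", "d"],
}

def graphemes_to_ipa(sentence: str, g2p: Dict[str, List[str]]) -> List[str]:
    """Map a whitespace-tokenized sentence to a flat IPA phoneme sequence."""
    phonemes: List[str] = []
    for word in sentence.lower().split():
        phonemes.extend(g2p.get(word, ["<unk>"]))  # unknown words get a placeholder
    return phonemes

# Phoneme labels like these would then supervise the acoustic encoder,
# letting different languages share a common IPA output space.
print(graphemes_to_ipa("Hello world", TOY_G2P))
# ['h', 'ə', 'l', 'oʊ', 'w', 'ɜ', 'l', 'd']
```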
The multimodal task of Visual Question Answering (VQA), encompassing elements of computer vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input. Over time, the sco...
Continuous subgraph matching (CSM) is a critical task for analyzing dynamic graphs and has a wide range of applications, such as merchant fraud detection, cyber-attack hunting, and rumor detection. Although many effic...
In recent times, following the paradigm of DETR (DEtection TRansformer), query-based end-to-end instance segmentation (QEIS) methods have exhibited superior performance compared to CNN-based models, particularly when ...
In classic reinforcement learning algorithms, agents make decisions at discrete, fixed time intervals. The duration between decisions becomes a crucial hyperparameter: setting it too short may increase the problem's difficulty by requiring the agent to make numerous decisions to achieve its goal, while setting it too long can result in the agent losing control over the system. However, physical systems do not necessarily require a constant control frequency, and for learning agents it is often preferable to operate at a low frequency when possible and a high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), in which the agent chooses options as sub-policies of variable duration. These options are time-continuous and can interact with the system at any desired frequency, providing a smooth change of actions. We demonstrate the effectiveness of CTCO by comparing its performance to classical RL and temporal-abstraction RL methods on simulated continuous control tasks with various action-cycle times. We show that our algorithm's performance is not affected by the choice of environment interaction frequency. Furthermore, we demonstrate the efficacy of CTCO in facilitating exploration in a real-world visual reaching task with sparse rewards on a 7-DOF robotic arm.
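The sketch below illustrates the control-loop idea behind variable-duration options as described in this abstract: a high-level policy picks an option (here reduced to a target action) together with a duration, and a low-level loop interpolates smoothly toward it at whatever interaction period the system runs at. This is an interpretation of the abstract under simplifying assumptions, not the CTCO implementation; the policy, dynamics, and interpolation scheme are stand-ins.

```python
# Sketch of variable-duration options: the high-level policy picks an option
# (a target action) and a duration; the low-level loop interpolates smoothly
# toward it at an arbitrary interaction period dt. Not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def high_level_policy(state):
    """Hypothetical policy head: returns (target_action, duration_seconds)."""
    target = rng.uniform(-1.0, 1.0, size=state.shape)
    duration = rng.uniform(0.1, 1.0)               # variable option length
    return target, duration

def rollout(env_step, state, action, episode_seconds=5.0, dt=0.01):
    """Run options back to back; dt is the (arbitrary) interaction period."""
    t = 0.0
    while t < episode_seconds:
        target, duration = high_level_policy(state)
        start, elapsed = action.copy(), 0.0
        while elapsed < duration and t < episode_seconds:
            alpha = min(elapsed / duration, 1.0)
            action = (1 - alpha) * start + alpha * target   # smooth action change
            state = env_step(state, action, dt)
            elapsed += dt
            t += dt
    return state

# Stand-in environment: simple integrator dynamics over a 7-dimensional action.
final_state = rollout(lambda s, a, dt: s + dt * a,
                      state=np.zeros(7), action=np.zeros(7))
```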
In skeleton-based action recognition, Graph Convolutional Networks model human skeletal joints as vertices and connect them through an adjacency matrix, which can be seen as a local attention mask. However, in most ex...
ISBN (digital): 9798350353006
ISBN (print): 9798350353013
In-context learning provides a new perspective for multi-task modeling for vision and NLP. Under this setting, the model can perceive tasks from prompts and accomplish them without any extra task-specific head predictions or model fine-tuning. However, skeleton sequence modeling via in-context learning remains unexplored. Directly applying existing in-context models from other areas onto skeleton sequences fails due to the similarity between inter-frame and cross-task poses, which makes it exceptionally hard to perceive the task correctly from a subtle context. To address this challenge, we propose Skeleton-in-Context (SiC), an effective framework for in-context skeleton sequence modeling. Our SiC is able to handle multiple skeleton-based tasks simultaneously after a single training process and accomplish each task from context according to the given prompt. It can further generalize to new, unseen tasks according to customized prompts. To facilitate context perception, we additionally propose a task-unified prompt, which adaptively learns tasks of different natures, such as partial joint-level generation, sequence-level prediction, or 2D-to-3D motion prediction. We conduct extensive experiments to evaluate the effectiveness of our SiC on multiple tasks, including motion prediction, pose estimation, joint completion, and future pose estimation. We also evaluate its generalization capability on unseen tasks such as motion-in-between. These experiments show that our model achieves state-of-the-art multi-task performance and even outperforms single-task methods on certain tasks.
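As a very rough illustration of in-context skeleton modeling, the sketch below assembles a prompt (an example input paired with its target) together with a query sequence into a single input, so that a model could infer the task from context. The tensor shapes and the concatenation scheme are illustrative assumptions, not the Skeleton-in-Context (SiC) architecture or its task-unified prompt.

```python
# Hypothetical sketch of in-context input assembly for skeleton sequences:
# a prompt (example input paired with its target) is concatenated with the
# query so a single model can infer the task from context. Shapes and the
# pairing scheme are illustrative, not the SiC implementation.
import torch

T, J, C = 16, 17, 3                      # frames, joints, coordinates (assumed)

prompt_input  = torch.randn(T, J, C)     # e.g. a partial or noisy pose sequence
prompt_target = torch.randn(T, J, C)     # its ground-truth counterpart
query_input   = torch.randn(T, J, C)     # the sequence we actually want solved

def build_in_context_input(p_in, p_tgt, q_in):
    """Stack prompt and query along time so the task is implied by context."""
    prompt = torch.cat([p_in, p_tgt], dim=0)       # (2T, J, C)
    return torch.cat([prompt, q_in], dim=0)        # (3T, J, C)

tokens = build_in_context_input(prompt_input, prompt_target, query_input)
print(tokens.shape)                      # torch.Size([48, 17, 3])
```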
Advanced Air Mobility (AAM) is a growing field that demands accurate modeling of legal concepts and restrictions in navigating intelligent vehicles. In addition, any implementation of AAM needs to face the challenges ...
ISBN (digital): 9798350370249
ISBN (print): 9798350370270
Image inpainting is a domain in which researchers have shown considerable interest, and with deep learning techniques, realistic problems become interesting and challenging. In image inpainting, a corrupted facial image with missing or significant holes can be restored and compared to the original image to judge whether it looks real or fake. Beyond repairing image texture and capturing high-level abstract properties, inpainting can also recover semantic content such as human faces. Among image-inpainting models, attention-based models with features learned through semantic approaches and progressive networks have become particularly popular. The proposed model introduces (i) attention blocks in each decoder layer of a U-Net architecture and (ii) a hybrid loss function combining Mean Square Error (MSE) and Mean Absolute Error (MAE). The proposed attention-based U-Net improved SSIM and PSNR by 0.1067 and 13.63, respectively, compared to previous approaches.
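The hybrid loss mentioned above lends itself to a short sketch: a weighted sum of MSE and MAE between the restored and original images. The equal 0.5/0.5 weighting below is an assumption for illustration; the abstract does not state the mixing coefficients.

```python
# Minimal sketch of a hybrid reconstruction loss combining MSE and MAE.
# The 0.5/0.5 weighting is an assumption, not a value from the paper.
import torch
import torch.nn as nn

class HybridLoss(nn.Module):
    def __init__(self, w_mse: float = 0.5, w_mae: float = 0.5):
        super().__init__()
        self.mse, self.mae = nn.MSELoss(), nn.L1Loss()
        self.w_mse, self.w_mae = w_mse, w_mae

    def forward(self, restored, original):
        return (self.w_mse * self.mse(restored, original)
                + self.w_mae * self.mae(restored, original))

criterion = HybridLoss()
restored = torch.rand(1, 3, 256, 256)    # inpainted output (stand-in)
original = torch.rand(1, 3, 256, 256)    # ground-truth face image (stand-in)
print(criterion(restored, original).item())
```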