This paper is concerned with a finite-horizon inverse control problem, which has the goal of reconstructing, from observations, the possibly non-convex and non-stationary cost driving the actions of an agent. In this ...
Sequence data are more commonly seen than other types of data (e.g., transaction data) in real-world applications. For the mining task from sequence data, several problems have been formulated, such as sequential patt...
This paper investigates the application of end-to-end (E2E) learning for joint optimization of pulse-shaper and receiver filter to reduce intersymbol interference (ISI) in bandwidth-limited communication systems. We i...
Thermoelectric generators (TEGs) are used in many applications, such as computers and automobiles, to generate electricity from exhaust gases. The waste heat recovered from the system can be used to generate...
Exceptional point (EP)-based optical sensors exhibit exceptional sensitivity but poor detectivity. Slightly off EP operation boosts detectivity without much loss in sensitivity. We experimentally demonstrate a high-de...
We consider the problem of estimating the possibly non-convex cost of an agent by observing its interactions with a nonlinear, non-stationary and stochastic environment. For this inverse problem, we give a result that allows the cost to be estimated by solving a convex optimization problem. To obtain this result, we also tackle a forward problem. This leads to the formulation of a finite-horizon optimal control problem, for which we show convexity and find the optimal solution. Our approach leverages certain probabilistic descriptions that can be obtained from data and/or from first principles. The effectiveness of our results, which are turned into an algorithm, is illustrated via simulations on the problem of estimating the cost of an agent that is stabilizing the unstable equilibrium of a pendulum.
This paper proposes an alternative detection framework for multiple sclerosis (MS) and idiopathic acute transverse myelitis (ATM) within the 6G-enabled Internet of Medical Things (IoMT) environment. The developed fra...
The content of visual and audio scenes is multi-faceted, such that a video stream can be paired with various audio streams and vice versa. Therefore, in the video-to-audio generation task, it is imperative to introduce steering approaches for controlling the generated audio. While video-to-audio generation is a well-established generative task, existing methods lack such controllability. In this work, we propose VATT, a multi-modal generative framework that takes a video and an optional text prompt as input, and generates audio and an optional textual description (caption) of the audio. Such a framework has two unique advantages: i) the video-to-audio generation process can be refined and controlled via text, which complements the context of the visual information, and ii) the model can suggest what audio to generate for the video by generating audio captions. VATT consists of two key modules: VATT Converter, an LLM that has been fine-tuned for instructions and includes a projection layer that maps video features to the LLM vector space, and VATT Audio, a bi-directional transformer that generates audio tokens from visual frames and from an optional text prompt using iterative parallel decoding. A pretrained neural codec then converts the audio tokens and the text prompt into a waveform. Our experiments show that, when compared with existing video-to-audio generation methods on objective metrics, using benchmarks such as the VGGSound audiovisual dataset, VATT achieves competitive performance when the audio caption is not provided. When the audio caption is provided as a prompt, VATT achieves even more refined performance (with the lowest KLD score of 1.41). Furthermore, subjective studies that asked participants to choose the most compatible generated audio for a given silent video show that audio generated by VATT Audio was, on average, preferred over audio generated by existing methods. VATT enables controllable video-to-audio generation through text as well as suggest
Cluster analysis, also known as clustering, is a multivariate data mining technique whose goal is to classify objects based on a set of selected features. Nonnegative matrix factorization (NMF) and K-means are the m...