Grasp planning, and more specifically grasp space exploration for adaptive and underactuated multifingered grippers, is still an open issue in robotics. This article describes an efficient procedure for exploring the grasp space of such grippers that aims at generating reliable grasps given a known object pose. It also assesses the performance of this procedure and compares it to more commonly used grasp space exploration methods. The method relies on a limited dataset of human-specified expert grasps and uses variational autoencoders to learn intrinsic grasp features together with an analytic grasp quality metric in a computationally compact way. It is evaluated both in simulation and on a real setup for the specific and complex case of adaptive and underactuated multifingered grasping. The proposed grasp planner reaches a grasp success rate of 99.54% over 7000 simulated trials and successfully plans stable and reliable grasps on the real setup, with no failed grasp reported in around 30 trials. It also shows a significantly higher grasp success rate than other grasp space exploration methods.
The amount of blood under the surface of the skin is controlled by the autonomic nervous system and directly influences facial skin temperature. Classification models have been used to estimate various physiological and psychological states of the human body from facial skin temperature. Because anomalous samples are difficult to collect, an anomaly detection method is required to monitor facial skin temperature. The normal state of facial skin temperature fluctuates; hence, diurnal variation should be considered when applying anomaly detection methods. In a previous study, an anomaly detection method was applied to facial skin temperature with diurnal variation taken into account, and the normal and anomalous states were measured 16 times at 1-h intervals. A variational autoencoder (VAE) was trained on the normal-state data to construct an anomaly detection model. However, in many cases, anomalous states were not detected: the mean AUC (area under the receiver operating characteristic curve) over the 16 experiments was 0.57 with the model of the previous study. The application of thermal images and VAE training is yet to be comprehensively studied. In this study, we improved anomaly detection accuracy for facial skin temperature with diurnal variation by optimizing how the thermal images are applied and the structure of the model. The mean AUC of the proposed model over the 16 experiments was 0.96.
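To make the AUC figures in this abstract concrete, here is a minimal sketch of how an AUC can be computed from anomaly scores. It assumes each sample has a scalar score (for instance a VAE reconstruction error) that is expected to be higher for anomalous samples. The function name `roc_auc` and the numbers are hypothetical and are not taken from the study.

```python
def roc_auc(normal_scores, anomalous_scores):
    """AUC as the probability that an anomalous sample outscores a normal one.

    This is the Mann-Whitney formulation: every (anomalous, normal) pair is
    compared, and a tie counts as half a win.
    """
    wins = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous_scores) * len(normal_scores))


# Hypothetical reconstruction errors (illustrative only, not study data).
normal = [0.10, 0.12, 0.15, 0.20]
anomalous = [0.14, 0.25, 0.30]
auc = roc_auc(normal, anomalous)  # 10 of the 12 pairs are ranked correctly
```

An AUC near 0.5 means the scores barely separate the two states, while a value near 1.0 means almost every anomalous sample is ranked above every normal one.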
We present our submission to the Extreme Value Analysis 2021 Data Challenge in which teams were asked to accurately predict distributions of wildfire frequency and size within spatio-temporal regions of missing data. For this competition, we developed a variant of the powerful variational autoencoder models, which we call Conditional Missing data Importance-Weighted autoencoder (CMIWAE). Our deep latent variable generative model requires little to no feature engineering and does not necessarily rely on the specifics of scoring in the Data Challenge. It is fully trained on incomplete data, with the single objective to maximize log-likelihood of the observed wildfire information. We mitigate the effects of the relatively low number of training samples by stochastic sampling from a variational latent variable distribution, as well as by ensembling a set of CMIWAE models trained and validated on different splits of the provided data.
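The importance-weighted objective behind this family of models can be sketched as follows. This is a generic illustration of the importance-weighted bound, not the CMIWAE implementation. It assumes that for one data point we already have, for each of K latent samples, the log joint density of the observed entries and the latent sample, and the log density of the variational distribution. The names `log_mean_exp` and `iw_bound` are hypothetical.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def iw_bound(log_p_joint, log_q):
    """Importance-weighted lower bound on log p(x_observed) from K samples.

    log_p_joint[k] = log p(x_observed, z_k), summed over observed entries only
    log_q[k]       = log q(z_k | x_observed)
    """
    log_weights = [lp - lq for lp, lq in zip(log_p_joint, log_q)]
    return log_mean_exp(log_weights)


# With K = 1 this reduces to a single-sample ELBO estimate.
single = iw_bound([-3.0], [-1.0])
```

Restricting the log joint to the observed entries is what allows training on incomplete data in this sketch: missing values simply contribute no likelihood term.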
Deep generative models yielding transition metal complexes (TMCs) remain scarce despite the key role of these compounds in industrial catalytic processes, anticancer therapies, and the energy transition. Compared to drug discovery within the chemical space of organic molecules, TMCs pose further challenges, including the encoding of chemical bonds of higher complexity and the need to optimize multiple properties. In this work, we developed a generative model for the inverse design of transition metal ligands and complexes, based on the junction tree variational autoencoder (JT-VAE). After implementing a SMILES-based encoding of the metal-ligand bonds, the model was trained with the tmQMg-L ligand library, allowing for the generation of thousands of novel, highly diverse monodentate (κ1) and bidentate (κ2) ligands, including imines, phosphines, and carbenes. Further, the generated ligands were labeled with two target properties reflecting the stability and electron density of the associated homoleptic iridium TMCs: the HOMO-LUMO gap (ε) and the charge of the metal center (q_Ir). This data was used to implement a conditional model that generated ligands from a prompt, with the single or dual objective of optimizing either or both of the ε and q_Ir properties, and allowing for chemical interpretation based on the optimization trajectories. The optimizations also had an impact on other chemical properties, including ligand dissociation energies and oxidative addition barriers. A similar model was implemented to condition ligand generation by solubility and steric bulk.
ISBN (print): 9789819757787; 9789819757794
Top-N recommendation is widely accepted as an effective method in personalized services that serve users with different interests. However, an analysis of state-of-the-art (SOTA) methods shows that their performance differs significantly across users with diverse activity levels, which seriously damages the service quality of personalized recommendation. Existing studies pay little attention to this issue: they simply assume that the preferences of all users follow a common probability distribution and then use a fixed schema (e.g., one latent vector) to model the user representation. This assumption makes it hard for existing models to accommodate users with diverse activity levels. In this work, we propose a variational Kernel Density Estimation (VKDE) model, a non-parametric estimation method that aims to fit arbitrary preference distributions for users. VKDE constructs the user's (global) preference distribution collectively from multiple local distributions. We propose a variational kernel function to infer a user's one-faceted interests and generate each local distribution. A sampling strategy for one-faceted user interests is further proposed to reduce training complexity while preserving recommendation effectiveness. Our experimental results on three public datasets show that VKDE outperforms SOTA methods and greatly improves accuracy for users with diverse activity levels.
Realizing general inverse design could greatly accelerate the discovery of new materials with user-defined properties. However, state-of-the-art generative models tend to be limited to a specific composition or crystal structure. Herein, we present a framework capable of general inverse design (not limited to a given set of elements or crystal structures), featuring a generalized invertible representation that encodes crystals in both real and reciprocal space, and a property-structured latent space from a variational autoencoder (VAE). In three design cases, the framework generates 142 new crystals with user-defined formation energies, bandgap, thermoelectric (TE) power factor, and combinations thereof. These generated crystals, absent from the training database, are validated by first-principles calculations. The success rates (number of first-principles-validated target-satisfying crystals / number of designed crystals) range between 7.1% and 38.9%. These results represent a significant step toward property-driven general inverse design using generative models, although practical challenges remain when coupled with experimental synthesis.
ISBN (print): 9798350344868; 9798350344851
Point clouds allow for the representation of 3D multimedia content as a set of disconnected points in space. Their inherently irregular geometric nature poses a challenge to efficient compression, a critical operation for both storage and transmission. This paper proposes a VAE-inspired codec tailored for dynamic point cloud geometry compression, taking advantage of a temporal autoregressive hyperprior to enhance compression performance. Specifically, features derived from adjacent point cloud frames help build a hyperprior for conditional entropy coding. Sparse convolutions are leveraged to reach higher computational efficiency than 3D dense convolutions. Remarkably, the proposed approach achieves an average 60.2% BD-rate gain against the contemporary V-PCC compression standard from MPEG.
ISBN (print): 9798350390599; 9798350390582
Event cameras are advantageous for tasks that require vision sensors with low latency and sparse output responses. However, the development of deep network algorithms using event cameras has been slow because of the lack of large labelled event camera datasets for network training. This paper reports a method for creating new labelled event datasets by using a text-to-X model, where X is one or multiple output modalities; in this work, X is events. Our proposed text-to-events model produces synthetic event frames directly from text prompts. It uses an autoencoder which is trained to produce sparse event frames representing event camera outputs. By combining the pretrained autoencoder with a diffusion model architecture, the new text-to-events model is able to generate smooth synthetic event streams of moving objects. The autoencoder was first trained on an event camera dataset of diverse scenes. In the combined training with the diffusion model, the DVS gesture dataset was used. We demonstrate that the model can generate realistic event sequences of human gestures prompted by different text statements. The classification accuracy of the generated sequences, using a classifier trained on the real dataset, ranges from 42% to 92%, depending on the gesture group. The results demonstrate the capability of this method in synthesizing event datasets.
ISBN (print): 9798350354966; 9798350354959
Respiratory diseases are one of the most common causes of death worldwide. At present, digital stethoscopes are valuable tools for diagnosing respiratory diseases, but they face limitations in storage, computation, and transmission capabilities. Therefore, developing respiratory sound compression algorithms is crucial. In this work, we propose TranscoderQVAE (TQVAE). TQVAE combines a variational autoencoder, a non-uniform quantizer, and a Transformer to compress and recover respiratory sounds. Based on the concept of compressive sensing, we use an encoder to reduce the number of data points and a non-uniform quantizer to compress the signal to 4 bits before it is stored, which reduces the storage space. Finally, we use a simplified Transformer, named Transcoder, as a decoder to recover the signals. A denoising algorithm based on the wavelet transform is then applied to the recovered signal to remove noise. In experiments on 3554 pieces of data, the correlation coefficient between the recovered signal and the raw signal reaches 0.98, and the compression ratio reaches up to 256. Other evaluation metrics are also favorable. The algorithm execution time is less than 13 seconds per signal.
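The abstract does not say which non-uniform quantizer TQVAE uses. As a minimal sketch of the general idea, the example below quantizes samples in [-1, 1] to 4 bits (16 codes) with mu-law companding, a well-known non-uniform scheme that is swapped in here purely as an assumption. The constant `MU` and the helper names `encode` and `decode` are hypothetical.

```python
import math

MU = 255.0    # companding strength (assumed value, not from the paper)
LEVELS = 16   # 4 bits give 2**4 distinct codes


def encode(x):
    """Map a sample in [-1, 1] to an integer code in 0..15."""
    x = max(-1.0, min(1.0, x))
    # Compress the amplitude logarithmically so small values get finer steps.
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return min(LEVELS - 1, int((y + 1.0) / 2.0 * LEVELS))


def decode(code):
    """Map a code to the centre of its bin, then undo the companding."""
    y = (code + 0.5) / LEVELS * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


codes = [encode(x) for x in (-1.0, -0.01, 0.0, 0.01, 1.0)]
```

Because the companding curve is steep near zero, adjacent codes near zero decode to values that are much closer together than adjacent codes near full scale, which is what makes the quantizer non-uniform.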
ISBN (print): 9798400706028
This paper introduces the Generative Sample Map (GESAM), a novel two-stage unsupervised learning framework capable of generating high-quality and expressive audio samples for music production. Recent generative approaches based on language models rely on text prompts as conditions. However, fine nuances in musical audio samples can hardly be described in the modality of text. To address this shortcoming, we propose to learn a highly descriptive latent 2D audio map with a variational autoencoder (VAE); this map is then used to condition a Transformer model. We demonstrate the Transformer model's ability to achieve high generation quality and compare its performance against two baseline models. By selecting points on the map, which compresses the manifold of the audio training set into 2D, we enable a more natural interaction with the model. We showcase this capability through an interactive demo interface, which is accessible on our website https://***/gesam/.