Generative AI refers to algorithms and techniques designed to generate text, images, videos, or other data, typically in response to prompts. These algorithms leverage large generative models that learn the patterns and structures of the media data (text, images, or videos) provided during training and then generate new media data that have analogous characteristics. Much of the recent research has gone into applying generative AI for text and 2-D image data. However, generative AI for 3-D models, especially 3-D point cloud data (PCD), has compelling applications in virtual reality content generation, gaming, and product design and manufacturing, but it introduces a multitude of research challenges.
To address the problems of low accuracy and short prediction horizons in traditional statistical models for PM2.5 concentration prediction, a PM2.5 concentration prediction method based on deep learning in an Internet of Things air monitoring system is proposed. First, the spatiotemporal correlation of the data from each station in the Internet of Things monitoring system is analyzed, and cubic spline interpolation is used to fill in missing data. Then, the temporal attention over the input data is obtained through an attention mechanism, and a feature encoder encodes the data to obtain intermediate features. Finally, the intermediate features are fused with the historical PM2.5 concentration, and the predicted value is obtained through a feature decoder. Using the proposed model to predict the PM2.5 concentration in Beijing, the experimental results show that the long-term PM2.5 predictions are very close to the true values, with an RMSE of 17.93 μg/m³ and an MAE of 11.52 μg/m³, both better than the comparison models. The model is therefore suitable for multivariate long-time-series forecasting scenarios.
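The abstract describes the pipeline only at a high level, so the following is a minimal sketch of what such a temporal-attention encoder-decoder forecaster might look like in PyTorch; the layer sizes, forecast horizon, and feature count are assumptions for illustration, not values from the paper.

```python
# Minimal sketch (assumed architecture): a GRU encoder-decoder with temporal
# attention for multivariate PM2.5 forecasting.
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.score = nn.Linear(hidden * 2, 1)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (B, H), enc_outputs: (B, T, H)
        T = enc_outputs.size(1)
        query = dec_state.unsqueeze(1).expand(-1, T, -1)
        weights = torch.softmax(
            self.score(torch.cat([query, enc_outputs], dim=-1)).squeeze(-1), dim=1)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)  # (B, H)
        return context, weights

class Seq2SeqPM25(nn.Module):
    def __init__(self, n_features, hidden=64, horizon=24):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.attention = TemporalAttention(hidden)
        self.decoder = nn.GRUCell(1 + hidden, hidden)  # previous PM2.5 value + context
        self.out = nn.Linear(hidden, 1)

    def forward(self, x, last_pm25):
        # x: (B, T, n_features) station history; last_pm25: (B, 1)
        enc_out, h = self.encoder(x)
        state, prev, preds = h.squeeze(0), last_pm25, []
        for _ in range(self.horizon):
            context, _ = self.attention(state, enc_out)
            state = self.decoder(torch.cat([prev, context], dim=-1), state)
            prev = self.out(state)
            preds.append(prev)
        return torch.cat(preds, dim=1)  # (B, horizon)

model = Seq2SeqPM25(n_features=6)
x = torch.randn(8, 48, 6)              # 48 hourly readings, 6 pollutant/weather variables
y_hat = model(x, torch.randn(8, 1))
print(y_hat.shape)                      # torch.Size([8, 24])
```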
Over the past few years, researchers have shown great interest in sentiment analysis and summarization of documents. The primary reason is that huge volumes of information are available in textual format, and this data has proven helpful for real-world applications and challenges. The sentiment analysis of a document helps the user comprehend the content's emotional intent. Abstractive summarization algorithms generate a condensed version of the text, which can then be used to determine the emotion represented in the text using sentiment analysis. Recent research in abstractive summarization concentrates on neural network-based models rather than conjunction-based approaches, which might improve the overall efficiency. Neural network models such as the attention mechanism have been tried on complex tasks with promising results. The proposed work presents a novel framework that incorporates a part-of-speech (POS) tagging feature into the word embedding layer, which is then used as the input to the attention mechanism. With the POS feature being part of the input layer, this framework is capable of dealing with words containing contextual and morphological information. The relevance of POS tagging here is due to its strong reliance on the language's syntactic, contextual, and morphological information. The three main elements of the work are pre-processing, the POS tagging feature in the embedding phase, and its incorporation into the attention mechanism. The word embedding provides the semantic concept of the word, while the POS tags indicate how significant the words are in the context of the content, which corresponds to the syntactic information. The proposed work was carried out in Malayalam, one of the prominent Indian languages. A widely used and accepted English-language dataset was translated into Malayalam for conducting the experiments. The proposed framework achieves a ROUGE score of 28, outperforming the baseline models.
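As a concrete illustration of feeding a POS feature into the embedding layer, the sketch below concatenates a POS-tag embedding with the word embedding before an encoder whose outputs would feed the attention mechanism; the vocabulary size, tagset size, and dimensions are placeholders, not values from the paper.

```python
# Minimal sketch (assumed layer shapes): word + POS-tag embeddings concatenated
# into a single input vector for an attention-based summarization encoder.
import torch
import torch.nn as nn

class WordPosEmbedding(nn.Module):
    def __init__(self, vocab_size, tagset_size, word_dim=256, pos_dim=32):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.pos_emb = nn.Embedding(tagset_size, pos_dim)

    def forward(self, token_ids, pos_ids):
        # token_ids, pos_ids: (B, T) -> (B, T, word_dim + pos_dim)
        return torch.cat([self.word_emb(token_ids), self.pos_emb(pos_ids)], dim=-1)

embed = WordPosEmbedding(vocab_size=30000, tagset_size=17)
encoder = nn.LSTM(256 + 32, 128, batch_first=True, bidirectional=True)
tokens = torch.randint(0, 30000, (4, 50))
tags = torch.randint(0, 17, (4, 50))
enc_out, _ = encoder(embed(tokens, tags))   # (4, 50, 256) feeds the attention layer
print(enc_out.shape)
```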
ISBN (digital): 9798350387780
ISBN (print): 9798350387780; 9798350387797
Crack detection is an important part of building structural health monitoring. However, traditional convolution struggles to capture the characteristics of tiny concrete cracks and their complex topology, which ultimately leads to misclassification in crack segmentation. To address these problems, a concrete crack detection network based on local topology and global group attention (LTGGNet) is proposed. In the encoding part, a deformable local attention (DLA) module is designed to improve the network's ability to extract tiny crack features and complex topologies. It uses deformable convolution to extract crack regional features and topology information, and strengthens the network's ability to discriminate the pixel features of tiny cracks. In the decoding part, a global grouping context aggregation (GGC) module is proposed to enhance the network's global modeling ability for capturing crack structures in images. It uses the self-attention mechanism to establish a correlation representation of the global information and strengthen the long-range dependency of the topology, thereby achieving accurate localization and recovery of cracks. The experimental results show that the proposed network achieves an IoU of 75.84% and an F1-Score of 86.26% on the public crack dataset Deepcrack, and 59.20% and 74.37%, respectively, on Crack500. Compared with the comparison networks, the proposed network has a better concrete crack segmentation effect. The ablation experiments further verify the effectiveness of the network.
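For readers unfamiliar with deformable convolution, the following is a minimal sketch of the kind of offset-predicting block the DLA module describes, built on torchvision's DeformConv2d; the channel sizes and block structure are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (assumed design): a deformable-convolution block where a small
# conv predicts sampling offsets so thin crack structures can be followed.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableCrackBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # 2 offsets (x, y) per kernel position
        self.offset = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Offsets let the kernel sample along the crack's actual topology
        # instead of a rigid 3x3 grid.
        return self.act(self.deform(x, self.offset(x)))

block = DeformableCrackBlock(64, 64)
feat = torch.randn(2, 64, 128, 128)
print(block(feat).shape)    # torch.Size([2, 64, 128, 128])
```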
With the increasing usage of face recognition algorithms, it is well established that external artifacts and makeup accessories can be applied to different facial features such as the eyes, nose, mouth, and cheek, to obfuscate one's identity or to impersonate someone else's identity. Recognizing faces in the presence of these artifacts comprises the problem of disguised face recognition, which is one of the most arduous covariates of face recognition. The challenge is exacerbated when disguised faces are captured in real-time environments, with low resolution images. To address the challenge of disguised face recognition, this paper first proposes a novel multi-objective encoder-decoder network, termed DED-Net. DED-Net attempts to learn the class variations in the feature space generated by both disguised and non-disguised images, using a combination of Mahalanobis and Cosine distance metrics, along with Mutual Information based supervision. The DED-Net is then extended to learn from the local and global features of both disguised and non-disguised face images for efficient face recognition, and the complete framework is termed the Disguise Resilient (D-Res) framework. The efficacy of the proposed framework has been demonstrated on two real-world benchmark datasets: the Disguised Faces in the Wild (DFW) 2018 and DFW2019 competition datasets. In addition, this research also emphasizes the importance of recognizing disguised faces in low resolution settings and proposes three experimental protocols to simulate real-world surveillance scenarios. To this effect, benchmark results have been shown on seven protocols for three low resolution settings (32 x 32, 24 x 24, and 16 x 16) of the two DFW benchmark datasets. The results demonstrate the superior performance of the D-Res framework in comparison with benchmark algorithms. For example, an improvement of around 3% is observed on the Overall protocol of the DFW2019 dataset, where the D-Res framework achieves 96.3%.
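The objective that mixes Mahalanobis and Cosine distance metrics can be sketched roughly as below; this is one reading of such a combined loss with a simplified learnable diagonal metric, not the authors' code, and all dimensions and the weighting are placeholders.

```python
# Minimal sketch (assumed loss): a cosine-distance term plus a Mahalanobis-style
# term over pairs of disguised / non-disguised embeddings of the same identity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedMetricLoss(nn.Module):
    def __init__(self, dim, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        # Learnable diagonal precision; softplus keeps it positive.
        self.log_prec = nn.Parameter(torch.zeros(dim))

    def forward(self, emb_a, emb_b):
        # emb_a: disguised embeddings, emb_b: matching non-disguised embeddings, both (B, D)
        cos_dist = 1.0 - F.cosine_similarity(emb_a, emb_b, dim=-1)
        diff = emb_a - emb_b
        maha = torch.sqrt((diff.pow(2) * F.softplus(self.log_prec)).sum(-1) + 1e-8)
        return (self.alpha * cos_dist + (1 - self.alpha) * maha).mean()

loss_fn = CombinedMetricLoss(dim=512)
a, b = torch.randn(16, 512), torch.randn(16, 512)
print(loss_fn(a, b).item())
```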
ISBN (print): 9798350315943
Assertion inference techniques aim at automatically inferring sets of program assertions that capture the exhibited software behavior, often by generating and filtering assertions through dynamic test executions and mutation testing. Although powerful, such techniques are computationally expensive due to the large number of mutants that require execution. In this study, we introduce the notion of Assertion Inferring Mutants and demonstrate that these mutants are sufficient for assertion inference while corresponding to a small subset (12.95%) of the entire mutant set. Moreover, these mutants are significantly different (71.59%) from the Subsuming Mutants frequently cited in the mutation testing literature. We also show that Assertion Inferring Mutants can be statically approximated via a learning-based method. Given the widespread adoption of the encoder-decoder architecture for prediction tasks, we demonstrate that such a model predicts Assertion Inferring Mutants with 0.79 precision and 0.49 recall. Its evaluation on 46 projects shows that it enables comparable inference capability (missing only 12.49% of assertions) to a complete mutation analysis, while significantly reducing the execution cost (achieving 46.29 times faster inference). Moreover, it enables assertion inference techniques to scale to subjects where complete mutation testing is prohibitively expensive and other mutant selection strategies do not lead to acceptable assertion inference.
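A rough sketch of how a predicted set of Assertion Inferring Mutants might be used to prune the mutant pool before the expensive test executions, and how precision and recall would be measured, is given below; the Mutant data model and the classifier interface are hypothetical, not part of the paper's tooling.

```python
# Minimal sketch (hypothetical data model): keep only mutants a trained model
# predicts to be assertion-inferring, then score the prediction quality.
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

@dataclass
class Mutant:
    mutant_id: int
    diff: str          # textual diff of the mutated code

def select_inferring_mutants(mutants: List[Mutant],
                             predict: Callable[[str], float],
                             threshold: float = 0.5) -> List[Mutant]:
    """Keep only mutants the model predicts to be assertion-inferring."""
    return [m for m in mutants if predict(m.diff) >= threshold]

def precision_recall(selected: List[Mutant],
                     truly_inferring: Set[int]) -> Tuple[float, float]:
    chosen = {m.mutant_id for m in selected}
    tp = len(chosen & truly_inferring)
    precision = tp / len(chosen) if chosen else 0.0
    recall = tp / len(truly_inferring) if truly_inferring else 0.0
    return precision, recall
```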
ISBN (print): 9781665443371
In this paper, we present a novel approach for lane detection and segmentation using generative models. Traditionally, discriminative models have been employed to classify pixels semantically on a road. We model the probability distribution of lanes and road symbols by training a generative adversarial network. Based on the learned probability distribution, context-aware lanes and road signs are generated for a given image and are further quantized to the nearest class label. The proposed method has been tested on the BDD100K and Baidu ApolloScape datasets, performs better than the state of the art, and exhibits robustness to adverse conditions by generating lanes in faded and occluded scenarios.
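The "quantized to the nearest class label" step can be read as snapping each generated pixel to the closest class colour; the sketch below shows that post-processing with an illustrative three-class palette, which is an assumption rather than the paper's actual label set.

```python
# Minimal sketch (assumed post-processing): map each pixel of the generated
# lane / road-symbol image to the nearest class colour in a palette.
import numpy as np

PALETTE = np.array([
    [0, 0, 0],        # background
    [255, 255, 255],  # lane marking
    [255, 0, 0],      # road symbol
], dtype=np.float32)

def quantize_to_labels(generated_rgb: np.ndarray) -> np.ndarray:
    """generated_rgb: (H, W, 3) float image from the generator, values in [0, 255]."""
    flat = generated_rgb.reshape(-1, 1, 3).astype(np.float32)
    dists = np.linalg.norm(flat - PALETTE[None, :, :], axis=-1)    # (H*W, n_classes)
    return dists.argmin(axis=-1).reshape(generated_rgb.shape[:2])  # (H, W) label map

fake = np.random.rand(256, 512, 3) * 255
labels = quantize_to_labels(fake)
print(labels.shape, labels.max())
```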
Third-generation DNA sequencers provided by Oxford Nanopore Technologies (ONT) produce a series of samples of the electrical current in the nanopore. Such a time series is used to detect the sequence of nucleotides. The task of translating current values into nucleotide symbols is called basecalling. Various solutions for basecalling have already been proposed. The earlier ones were based on Hidden Markov Models, but the best ones use neural networks or other machine learning models. Unfortunately, the accuracy achieved is still lower than that of competing sequencing techniques, such as Illumina's. Basecallers differ in the type of input data: currently, most of them work on raw data straight from the sequencer (a time series of current values), but the approach of using event data is also explored. Event data is obtained by preprocessing the raw data and dividing it into segments, each described by several features computed from the raw values within the segment. We propose a novel basecaller that jointly processes raw and event data. We define basecalling as a sequence-to-sequence translation, and we use a machine learning model based on an encoder-decoder architecture of recurrent neural networks. Our model incorporates twin encoders and an attention mechanism. We tested our solution on simulated and real datasets and compare the full model's accuracy with that of its components: processing only raw or only event data. We also compare our solution with the existing ONT basecaller, Guppy. The results of numerical experiments show that joint raw and event data processing provides better basecalling accuracy than processing each data type separately. We implement the method in an application called Ravvent, freely available under the MIT licence.
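The twin-encoder-plus-attention wiring can be sketched as follows; the hidden sizes, event feature count, and the use of multi-head attention are assumptions made for illustration, not details confirmed by the Ravvent paper.

```python
# Minimal sketch (assumed wiring): twin GRU encoders over raw current samples
# and event features; their outputs are concatenated along time and attended
# over by the base-calling decoder's queries.
import torch
import torch.nn as nn

class TwinEncoder(nn.Module):
    def __init__(self, event_features=4, hidden=96):
        super().__init__()
        self.raw_enc = nn.GRU(1, hidden, batch_first=True, bidirectional=True)
        self.event_enc = nn.GRU(event_features, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)

    def forward(self, raw, events, dec_queries):
        # raw: (B, T_raw, 1), events: (B, T_evt, F), dec_queries: (B, L, 2*hidden)
        raw_out, _ = self.raw_enc(raw)
        evt_out, _ = self.event_enc(events)
        memory = torch.cat([raw_out, evt_out], dim=1)   # joint attention memory
        ctx, _ = self.attn(dec_queries, memory, memory)
        return ctx                                      # feeds the nucleotide decoder

enc = TwinEncoder()
ctx = enc(torch.randn(2, 400, 1), torch.randn(2, 40, 4), torch.randn(2, 60, 192))
print(ctx.shape)    # torch.Size([2, 60, 192])
```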
Objective: Controlling blood glucose in the euglycemic range is the main goal of developing a closed-loop insulin delivery system for type 1 diabetes patients. The closed-loop system delivers the insulin dose determined from glucose predictions through the use of computational algorithms. A computationally efficient and accurate model that can capture the physiological nonlinear dynamics is critical for developing an efficient closed-loop system. Methods: Four data-driven models are compared, including different neural network architectures, a reservoir computing model, and a novel linear regression approach. Model predictions are evaluated over continuous 30 and 60 min time horizons using real-world data from wearable sensor measurements, a continuous glucose monitor, and self-reported events through mobile applications. The four data-driven models are trained on data from 12 contributors covering around 32 days; 8 days of data are used for validation, with an additional 10 days for out-of-sample testing. Model performance was evaluated by the root mean squared error and the mean absolute error. Results: A neural network model using an encoder-decoder architecture has the most stable performance and is able to recover missing dynamics over short time intervals. Regression models performed better at the longer prediction horizon (i.e., 60 min) and with lower computational cost. Significance: The performance of several distinct models was tested on individual-level data from a type 1 diabetes data set. These results may enable a feasible solution with low computational cost for the time-dependent adjustment of the artificial pancreas for diabetes patients.
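The two reported metrics are standard; a minimal sketch of computing RMSE and MAE at the 30 and 60 min horizons is shown below, with made-up glucose values used purely for illustration.

```python
# Minimal sketch (generic evaluation, not the study's code): RMSE and MAE for
# glucose predictions at two prediction horizons.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Each entry holds (true, predicted) CGM values in mg/dL for that horizon (illustrative).
horizons = {30: (np.array([110.0, 140.0, 95.0]), np.array([118.0, 132.0, 99.0])),
            60: (np.array([120.0, 150.0, 90.0]), np.array([104.0, 138.0, 101.0]))}
for minutes, (y_true, y_pred) in horizons.items():
    print(f"{minutes} min: RMSE={rmse(y_true, y_pred):.1f}, MAE={mae(y_true, y_pred):.1f}")
```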
Paraphrase Generation is one of the most important and challenging tasks in the field of Natural Language Generation. Paraphrasing techniques help to identify or to extract/generate phrases and sentences conveying a similar meaning. The paraphrasing task can be bifurcated into two sub-tasks, namely Paraphrase Identification (PI) and Paraphrase Generation (PG). Most existing state-of-the-art systems have the potential to solve only one problem at a time. This paper proposes a light-weight unified model that can simultaneously classify whether a given pair of sentences are paraphrases of each other and generate multiple paraphrases for an input sentence. The Paraphrase Generation module aims to generate fluent and semantically similar paraphrases, and the Paraphrase Identification system aims to classify whether a pair of sentences are paraphrases of each other or not. The proposed approach uses an amalgamation of data sampling, or data variety, with a granular fine-tuned Text-To-Text Transfer Transformer (T5) model. This paper proposes a unified approach that aims to solve the problems of Paraphrase Identification and Generation by using carefully selected data points and a fine-tuned T5 model. The highlight of this study is that the same light-weight model trained with the objective of Paraphrase Generation can also be used for solving the Paraphrase Identification task. Hence, the proposed system is light-weight in terms of the model's size along with the data used to train the model, which facilitates quick learning without having to compromise on the results. The proposed system is evaluated against the popular evaluation metrics BLEU (BiLingual Evaluation Understudy), ROUGE (Recall-Oriented Understudy for Gisting Evaluation), METEOR, WER (Word Error Rate), and GLEU (Google-BLEU) for Paraphrase Generation, and the classification metrics accuracy, precision, recall, and F1-score for Paraphrase Identification.
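Since the generation side rests on a fine-tuned T5, a minimal inference sketch with the Hugging Face transformers API is shown below; the checkpoint name and the "paraphrase:" prefix are stand-ins, as the paper's fine-tuned weights and prompt format are not given here.

```python
# Minimal sketch (inference only, placeholder checkpoint): generating several
# paraphrases from a fine-tuned T5 model with beam search.
from transformers import T5ForConditionalGeneration, T5TokenizerFast

model_name = "t5-small"                      # stand-in for the fine-tuned checkpoint
tokenizer = T5TokenizerFast.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

sentence = "The committee approved the new budget yesterday."
inputs = tokenizer("paraphrase: " + sentence, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5, num_return_sequences=3,
                         max_length=64, early_stopping=True)
for out in outputs:
    print(tokenizer.decode(out, skip_special_tokens=True))
```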