Search Results - Inner Mongolia University Library

VSCA: A Sentence Matching Model Incorporating Visual Perception

COGNITIVE COMPUTATION, 2023, Vol. 15, No. 1, pp. 323-336

Authors: Zhang, Zhe; Xiao, Guangli; Qian, Yurong; Ma, Mengnan; Leng, Hongyong; Zhang, Tao. Affiliations: XinJiang Univ Sch Software Urumqi 830000 Peoples R China; XinJiang Univ Xinjiang Uygur Autonomous Reg Key Lab Signal Detec Urumqi 830046 Peoples R China; Beijing Inst Technol IT Acad Beijing 100081 Peoples R China

Stacking multiple layers of attention networks can significantly improve a model's performance. However, this also increases the model's time and space complexity, making it difficult for the model to capture detailed information on the underlying features. We propose a novel sentence matching model (VSCA) that uses a new attention mechanism based on variational autoencoders (VAE), which exploits the contextual information in sentences to construct a basic attention feature map and combines it with VAE to generate multiple sets of related attention feature maps for fusion. Furthermore, VSCA introduces a spatial attention mechanism that combines visual perception to capture multilevel semantic information. The experimental results show that our proposed model outperforms pretrained models such as BERT on the LCQMC dataset and performs well on the PAWS-X data. Our work consists of two parts. The first part compares the proposed sentence matching model with state-of-the-art pretrained models such as BERT. The second part conducts innovative research on applying VAE and spatial attention mechanisms in NLP. The experimental results on the related datasets show that the proposed method has satisfactory performance, and VSCA can capture rich attentional information and detailed information with less time and space complexity. This work provides insights into the application of VAE and spatial attention mechanisms in NLP.

Keywords: Natural language processing; Sentence matching; variational autoencoder; Spatial attention

Performance-Based Generative Design for Parametric Modeling of Engineering Structures Using Deep Conditional Generative Models

AUTOMATION IN CONSTRUCTION, 2023, Vol. 156

Authors: Bucher, Martin Juan Jose; Kraus, Michael Anton; Rust, Romana; Tang, Siyu. Affiliations: Swiss Fed Inst Technol Stefano Franscini Pl CH-8093 Zurich Switzerland

Parametric Modeling, Generative Design, and Performance-Based Design have gained increasing attention in the AEC field as a way to create a wide range of design variants while focusing on performance attributes rather than building codes. However, the relationships between design parameters and performance attributes are often very complex, resulting in a highly iterative and unguided process. In this paper, we argue that a more goal-oriented design process is enabled by an inverse formulation that starts with performance attributes instead of design parameters. A Deep Conditional Generative Design workflow is proposed that takes a set of performance attributes and partially defined design features as input and produces a complete set of design parameters as output. A model architecture based on a Conditional variational autoencoder is presented along with different approximate posteriors, and evaluated on four different case studies. Compared to Genetic Algorithms, our method proves superior when utilizing a pre-trained model.

Keywords: Deep generative modeling; Performance-based design; Generative design; variational autoencoder; Deep generative design; Artificial intelligence

ChartNavigator: An Interactive Pattern Identification and Annotation Framework for Charts

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, Vol. 35, No. 2, pp. 1258-1269

Authors: Zhang, Tianye; Feng, Haozhe; Chen, Wei; Chen, Zexian; Zheng, Wenting; Luo, Xiaonan; Huang, Wenqi; Tung, Anthony. Affiliations: Zhejiang Univ State Key Lab CAD & CG Hangzhou 310058 Peoples R China; Zhejiang Univ China Southern Power Grid Joint Res Hangzhou 310058 Peoples R China; Guilin Univ Elect Technol Guilin 541214 Peoples R China; China Southern Power Grid Digital Grid Res Inst Guangzhou 510670 Peoples R China; Natl Univ Singapore Dept Comp Sci Singapore 117417 Singapore

Patterns in charts refer to interesting visual features or forms. Identifying patterns not only helps analysts understand the 'shape' of the data but also supports better and faster decision-making. Existing solutions for identifying patterns in charts require a large number of labeled data instances, making it intractable without user supervision. In this paper, we propose ChartNavigator, an interactive pattern identification and annotation framework for unlabeled visualization charts. ChartNavigator leverages a novel chart-sensitive deep factor model to map patterns into a low-dimensional factor representation space, and facilitates rich analysis with the derived representations. We design and implement a visual interface to support efficient identification and annotation of potential patterns in charts. Evaluations with multiple datasets show that our approach outperforms the baseline models in identifying and annotating patterns.

Keywords: Visualization; Data models; Annotations; Data visualization; Solid modeling; Inference algorithms; Estimation; Pattern identification; chart; variational autoencoder; user interaction; visual analysis

Combining High-Throughput Imaging in Visible and SWIR wavelengths for In-Situ Porosity Prediction in Laser Powder Bed Fusion

Conference on Laser 3D Manufacturing XI

Authors: Ahar, Ayyoub; Vandecasteele, Mathieu; Booth, Brian G.; De Grave, Kurt; Verhees, Dries; Philips, Wilfried; Bey-Temsamani, Abdellatif. Affiliations: Flanders Make B-3920 Lommel Belgium; Univ Ghent Imec TELIN IPI B-3000 Leuven Belgium

ISBN: (print) 9781510670136; 9781510670129

Laser powder bed fusion is at the forefront of manufacturing metallic objects, particularly those with complex geometries or those produced in limited quantities. However, this 3D printing method is susceptible to several printing defects due to the complexities of using a high-power laser with ultra-fast actuation. Accurate online print defect detection is therefore in high demand, and this defect detection must maintain a low computational profile to enable low-latency process intervention. In this work, we propose a low-latency LPBF defect detection algorithm based on fusion of images from high-speed cameras in the visible and short-wave infrared (SWIR) spectrum ranges. First, we design an experiment to print an object while both imposing porosity defects on the print, and recording the laser's melt pool with the high-speed cameras. We then train variational autoencoders on images from both cameras to extract and fuse two sets of corresponding features. The melt pool recordings are then annotated with pore densities extracted from the printed object's CT scan. These annotations are then used to train and evaluate the ability of a fast neural network model to predict the occurrence of porosity from the fused features. We compare the prediction performance of our sensor fused model with models trained on image features from each camera separately. We observe that the SWIR imaging is sensitive to keyhole porosity while the visible-range optical camera is sensitive to lack-of-fusion porosity. By fusing features from both cameras, we are able to accurately predict both pore types, thus outperforming both single camera systems.

Keywords: LPBF; In-situ Monitoring; Lack-of-Fusion; Keyhole Porosity; Additive Manufacturing; SWIR; variational autoencoder; Sensor Fusion

EASE-DR: Enhanced Sentence Embeddings for Dense Retrieval

47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)

Authors: Zhou, Xixi; Gao, Yang; Jie, Xin; Cai, Xiaoxu; Bu, Jiajun; Wang, Haishuai. Affiliations: Zhejiang Univ Hangzhou Peoples R China

ISBN: (print) 9798400704314

Recent neural information retrieval models using dense text representations generated by pre-trained models commonly face two issues. First, a pre-trained model (e.g., BERT) usually truncates a long document before producing its representation, which may lose important semantic information. Second, although pre-trained models like BERT have been widely used to generate sentence embeddings, a substantial body of literature has shown that they often confine sentence embeddings to a homogeneous and narrow space, a problem known as representation anisotropy, which hurts the quality of dense vector retrieval. In this paper, we split the query and the document in information retrieval into two sets of natural sentences and generate their sentence embeddings with BERT, the most popular pre-trained model. Before aggregating the sentence embeddings into full embedding representations of the input query and document, we alleviate the usual representation degeneration problem of BERT sentence embeddings: we sample the variational auto-encoder's latent space distribution to obtain isotropic sentence embeddings and use supervised contrastive learning to make the distribution of these sentence embeddings in the representation space more uniform. Our proposed model is trained and optimized for both the query and the document in the aspects described above. Our model performs well when evaluated on three widely studied neural information retrieval datasets.

Keywords: Information Retrieval; variational autoencoder; Supervised Contrastive Learning

Enhancing Facial Reconstruction Using Graph Attention Networks

IEEE ACCESS, 2023, Vol. 11, pp. 136680-136691

Authors: Lee, Hyeong Geun; Hur, Jee Sic; Yoon, Yeo Chan; Kim, Soo Kyun. Affiliations: Jeju Natl Univ Dept Comp Engn Jeju 63243 South Korea; Jeju Natl Univ Dept Artificial Intelligence Jeju 63243 South Korea

Traditionally, research on three-dimensional (3D) facial reconstruction has focused heavily on methods that use 3D Morphable Models (3DMMs) based on principal component analysis (PCA). Because such methods are linear, they are robust to external noise. However, the PCA method has limitations when restoring faces that deviate from the training data distribution, particularly when recovering fine details. By contrast, restoration methods utilizing Graph Convolution Networks (GCN) offer the advantages of non-linearity and direct regression of vertex coordinates and colors. However, GCN-based approaches can be prone to overfitting, making them less stable. This study presents a face restoration approach that aims to regress the vertex coordinates and colors of a 3D face model directly from a single in-the-wild 2D facial image. This method demonstrates greater stability and higher accuracy compared to conventional techniques. In addition, Graph Attention Networks (GAT) enhance the restoration performance while separating the networks responsible for facial shape and color, reducing noise caused by interference between different data attributes. Through experiments, we identify the best-performing network structures and training methods and demonstrate improved performance compared to existing approaches.

Keywords: Graph convolution network; graph attention network; convolution neural network; variational autoencoder; facial reconstruction

Enhancing Load Forecasting with VAE-GAN-Based Data Cleaning for Electric Vehicle Charging Loads

29th International Conference on Database Systems for Advanced Applications (DASFAA)

Authors: Zhang, Wensi; Lei, Shuya; Jiang, Yuqing; Yao, Tiechui; Wang, Yishen; Sun, Zhiqing. Affiliations: State Grid Smart Grid Res Inst Co Ltd Beijing Peoples R China; State Grid Hangzhou Power Supply Co Hangzhou Peoples R China

ISBN: (print) 9789819609130; 9789819609147

With the popularization of environmental protection ideas, people increasingly value low-carbon lifestyles and a low-carbon economy. Electric vehicles play a crucial role in this transformation to reduce carbon emissions. However, integrating electric vehicles into the power grid poses challenges, especially the possibility of destructive load peaks, which may endanger the stability and safety of the power grid. Accurately predicting the load of electric vehicles and managing grid scheduling are crucial for solving this problem. The current solutions are mainly divided into two categories: statistics-based methods and machine learning-based methods. Statistical methods require large amounts of long-term data for modeling, making data collection a significant challenge. Machine learning-based methods have good long-term prediction performance on high-quality data, but they do not perform well in terms of short-term prediction accuracy. To overcome these obstacles, a comprehensive electric vehicle charging load prediction framework is proposed, which uses an innovative variational autoencoder generative adversarial network (VAE-GAN) for data processing, Principal Component Analysis (PCA) for feature extraction, and an improved CNN-GRU model for prediction. The experimental results show that the accuracy of short-term power load prediction is significantly improved, which verifies the effectiveness of the framework in processing small-sample load data and provides advanced tools for intelligent management of electric vehicle charging stations.

Keywords: load forecasting; electric vehicles; data cleaning; variational autoencoder; generate adversarial networks; gated recurrent network

Real-time suspicious detection framework for financial data streams

International Journal of Information Technology (Singapore), 2025, pp. 1-17

Authors: Gadimov, Elshan; Birihanu, Ermiyas. Affiliations: Department of Data Science and Engineering Faculty of Informatics Eötvös Loránd University Budapest Hungary

Money laundering hides the origin of illegal money by making it seem legal. Detecting suspicious activity quickly in financial data is key to stopping fraud and money laundering. Real-time detection is a popular approach because it is fast and detects illegal activities in financial institutions' systems efficiently. However, handling massive, distributed data streams makes real-time efficiency and effectiveness hard to achieve, so a data stream framework is needed to handle these challenges efficiently. The main goal of this study is to propose a real-time suspicious detection framework that helps financial institutions combat money laundering effectively. The proposed model comprises two components: a distributed computing architecture based on Docker containers to enhance flexibility, migration capabilities, and customization, and a suspicious detection module employing the autoencoder method. To determine whether there is any suspicious activity in the system, the proposed model uses the reconstruction error, which is the difference between the original input data and the data reconstructed by the proposed model. To evaluate the proposed model, we used real-world data from a financial institution and synthetic data generated from the real-world data. The study demonstrates that the proposed real-time detection framework outperforms traditional methods in identifying anomalous transactions. It also explores the importance and limitations of using both real-world and generated data. Our code is publicly available: https://***/Ermiyas21/Real-Time-Suspicious-Detection-Framework-for-financial-data. © The Author(s) 2025.

Keywords: Anti-money laundering (AML); autoencoders; Generative adversarial Network; Real-time; variational autoencoder
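
The reconstruction-error test described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, the mean squared error is one possible choice of "difference", and the threshold rule (mean plus k standard deviations of the errors on normal data) is an assumption made for the example.

```python
from statistics import mean, pstdev

def reconstruction_error(x, x_hat):
    """Mean squared difference between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def toy_reconstruct(x):
    """Hypothetical stand-in for a trained autoencoder: it projects a
    two-feature input onto the direction (1, 1), so it only reproduces
    inputs whose two features are roughly equal."""
    m = (x[0] + x[1]) / 2
    return [m, m]

def fit_threshold(normal_samples, k=3.0):
    """Assumed rule: threshold = mean + k * std of errors on normal data."""
    errors = [reconstruction_error(x, toy_reconstruct(x)) for x in normal_samples]
    return mean(errors) + k * pstdev(errors)

def is_suspicious(x, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, toy_reconstruct(x)) > threshold

# Hypothetical "normal" records used only to calibrate the threshold.
normal = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.1], [4.0, 4.0]]
threshold = fit_threshold(normal)

print(is_suspicious([2.5, 2.6], threshold))  # resembles the normal records
print(is_suspicious([1.0, 9.0], threshold))  # reconstructs poorly
```

A sample that the model reconstructs poorly is flagged; in a streaming setting the same check would run once per incoming record.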

CVML-Pose: Convolutional VAE Based Multi-Level Network for Object 3D Pose Estimation

IEEE ACCESS, 2023, Vol. 11, pp. 13830-13845

Authors: Zhao, Jianyu; Sanderson, Edward; Matuszewski, Bogdan J. J. Affiliations: Univ Cent Lancashire Comp Vis & Machine Learning CVML Grp Preston PR1 2HE England

Most vision-based 3D pose estimation approaches rely on knowledge of the object's 3D model or on depth measurements, and often require time-consuming iterative refinement to improve accuracy. These requirements can be seen as limiting factors for broader real-life applications. The main motivation for this paper is to address these limitations. To this end, a novel Convolutional variational Auto-Encoder based Multi-Level Network for object 3D pose estimation (CVML-Pose) method is proposed. Unlike most other methods, the proposed CVML-Pose implicitly learns an object's 3D pose from only RGB images encoded in its latent space without knowing the object's 3D model, depth information, or performing a post-refinement. CVML-Pose consists of two main modules: (i) CVML-AE, a convolutional variational autoencoder whose role is to extract features from RGB images, and (ii) Multi-Layer Perceptron and K-Nearest Neighbor regressors mapping the latent variables to object 3D pose including, respectively, rotation and translation. The proposed CVML-Pose has been evaluated on the LineMod and LineMod-Occlusion benchmark datasets. It has been shown to outperform other methods based on latent representations and achieves comparable results to the state-of-the-art, but without use of a 3D model or depth measurements. Utilizing the t-Distributed Stochastic Neighbor Embedding algorithm, the CVML-Pose latent space is shown to successfully represent objects' category and topology. This opens up a prospect of integrated estimation of pose and other attributes (possibly also including surface finish or shape variations), which, with real-time processing due to the absence of iterative refinement, can facilitate various robotic applications. Code available: https://***/JZhao12/CVML-Pose.

Keywords: 3D pose estimation; deep learning; variational autoencoder; synthetic data
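
The abstract above maps latent variables to pose with regressors; a K-Nearest Neighbor regressor of that kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_regress`, the 2-D latent vectors, the translation targets, the Euclidean distance, and `k=3` are all assumptions made for the example.

```python
import math

def knn_regress(query, latents, targets, k=3):
    """Predict a target vector by averaging the targets of the k training
    latent vectors closest (Euclidean distance) to `query`."""
    order = sorted(range(len(latents)),
                   key=lambda i: math.dist(query, latents[i]))
    nearest = order[:k]
    dim = len(targets[0])
    return [sum(targets[i][d] for i in nearest) / len(nearest)
            for d in range(dim)]

# Hypothetical training data: latent codes and (x, y, z) translations.
train_latents = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [1.0, 0.0]]
train_translations = [[0.0, 0.0, 1.0], [0.0, 0.0, 2.0],
                      [9.0, 9.0, 9.0], [0.0, 0.0, 3.0]]

# The three nearest codes to the origin exclude the outlier at (5, 5).
predicted = knn_regress([0.0, 0.0], train_latents, train_translations, k=3)
print(predicted)
```

In practice the latent codes would come from the encoder of a trained variational autoencoder rather than being written by hand.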

PredLife: Predicting Fine-Grained Future Activity Patterns

IEEE TRANSACTIONS ON BIG DATA, 2023, Vol. 9, No. 6, pp. 1658-1669

Authors: Li, Wenjing; Shi, Xiaodan; Huang, Dou; Shen, Xudong; Chen, Jinyu; Kobayashi, Hill Hiroki; Zhang, Haoran; Song, Xuan; Shibasaki, Ryosuke. Affiliations: Univ Tokyo Ctr Spatial Informat Sci Kashiwa Chiba 2770882 Japan; Univ Tokyo Informat Technol Ctr Kashiwa Chiba 2770882 Japan; Locat Mind Inc Tokyo 1010048 Japan; Peking Univ Sch Urban Planning & Design Shenzhen 518055 Guangdong Peoples R China; Southern Univ Sci & Technol Southern Univ Sci & Technol Univ Tokyo Joint Res Dept Comp & Engn Shenzhen 518055 Guangdong Peoples R China; Malardalens Univ Sch Business Soc & Technol S-72220 Vasteras Sweden; Univ Tokyo Interfac Initiat Informat Studies Tokyo 1138654 Japan; Reitaku Univ Kashiwa Chiba 2770065 Japan

Activity pattern prediction is a critical part of urban computing, urban planning, intelligent transportation, and related fields. Based on a dataset with more than 10 million GPS trajectory records collected by mobile sensors, this research proposes a CNN-BiLSTM-VAE-ATT-based encoder-decoder model for fine-grained individual activity sequence prediction. The model combines long-term and short-term dependencies crosswise and also considers the randomness, diversity, and uncertainty of individual activity patterns. The proposed model achieves higher accuracy than the ten baselines. It can generate highly diverse results while approximating the distribution of the original activity patterns. Moreover, the model is interpretable: it reveals the importance of time dependencies in activity pattern prediction.

Keywords: Activity pattern prediction; Human mobility; Big GPS data; variational autoencoder; LSTM
