Search results - Inner Mongolia University Library

A data-driven approach to estimating post-discovery parameters of unexplored oilfields

Petroleum, 2023, vol. 9, no. 2, pp. 285-300

Authors: Fransiscus Pratikto, Sapto Indratno, Kadarsah Suryadi, Djoko Santoso. Industrial Engineering Department, Bandung Institute of Technology, Jl Ganesha 10, Bandung 40132, Indonesia; Mathematics Department, Bandung Institute of Technology, Jl Ganesha 10, Bandung 40132, Indonesia; Geophysical Engineering Department, Bandung Institute of Technology, Jl Ganesha 10, Bandung 40132, Indonesia; University Center of Excellence on Artificial Intelligence for Vision, Natural Language Processing & Big Data Analytics (U-CoE AI-VLB), Institut Teknologi Bandung, Bandung 40132, West Java, Indonesia

Consider a typical situation where an investor is considering acquiring an unexplored *** oilfield has undergone a preliminary geological and geophysical study in which pre-discovery data such as lithology, depth, depositional system, diagenetic overprint, structural compartmentalization, and trap type are *** this situation, investors usually estimate production rates using a volumetric approach. A more accurate estimation of production rates can be obtained using analytical methods, which require additional data such as net pay, porosity, oil formation volume factor, permeability, viscosity, and *** call these data post-discovery parameters because they are only available after discovery through exploration drilling. A data-driven approach to estimating post-discovery parameters of an unexplored oilfield is developed based on its pre-discovery data by learning from proven reservoir *** the Gaussian mixture model, and a data-driven reservoir typology based on the joint probability distribution of post-discovery parameters is *** came up with 12 reservoir ***, an artificial neural network classification model with the resilient backpropagation algorithm is used to find relationships between pre-discovery data and reservoir *** on k-fold cross-validation with k = 10, the accuracy of the classification model is stable with an average of 87.9%. With our approach, an investor considering acquiring an unexplored oilfield can classify the oilfield's reservoir into a particular type and estimate its post-discovery parameters' joint probability *** investor can incorporate this information into a valuation model to calculate the production rates more accurately, estimate the oilfield's value and risk, and make an informed acquisition decision accordingly.
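
The abstract reports accuracy averaged over k-fold cross-validation with k = 10. As a rough illustration of that evaluation protocol only, here is a minimal sketch in plain Python. The function names `k_fold_indices` and `cross_validate`, and the `train_and_score` callback, are hypothetical and not taken from the paper, which does not describe its implementation.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Run k-fold cross-validation; return per-fold scores and their mean.

    train_and_score(train_idx, test_idx) is a caller-supplied (hypothetical)
    callback that fits a model on train_idx and returns accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Training set is every fold except the held-out one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return scores, sum(scores) / len(scores)
```

In use, the callback would wrap whatever classifier is being evaluated; the mean of the ten fold scores is the kind of averaged figure the abstract quotes.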

Keywords: Data-driven; Pre-discovery data; Post-discovery parameters; Gaussian mixture model; Artificial neural network

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

arXiv, 2025

Authors: Reddy, Arun; Martin, Alexander; Yang, Eugene; Yates, Andrew; Sanders, Kate; Murray, Kenton; Kriz, Reno; de Melo, Celso M.; Van Durme, Benjamin; Chellappa, Rama. Johns Hopkins Applied Physics Laboratory China; Johns Hopkins University, United States; Human Language Technology Center of Excellence; DEVCOM Army Research Laboratory

In this work, we tackle the problem of text-to-video retrieval (T2VR). Inspired by the success of late interaction techniques in text-document, text-image, and text-video retrieval, our approach, Video-ColBERT, introduces a simple and efficient mechanism for fine-grained similarity assessment between queries and videos. Video-ColBERT is built upon three main components: a fine-grained spatial and temporal token-wise interaction, query and visual expansions, and a dual sigmoid loss during training. We find that this interaction and training paradigm leads to strong individual, yet compatible, representations for encoding video content. These representations lead to increases in performance on common text-to-video retrieval benchmarks compared to other bi-encoder methods. Copyright © 2025, The Authors. All rights reserved.
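
The abstract does not spell out the similarity function. As a hedged illustration of what "late interaction" usually means, the sketch below implements the generic ColBERT-style MaxSim operator: each query token is matched to its most similar video token, and the per-token maxima are summed. This is an assumption about the general technique, not Video-ColBERT's exact spatial and temporal formulation; the function names and the toy 2-dimensional embeddings are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens, video_tokens):
    """Generic late-interaction score.

    For every query token embedding, take the maximum cosine similarity
    against all video token embeddings, then sum over query tokens.
    """
    q = [l2_normalize(t) for t in query_tokens]
    v = [l2_normalize(t) for t in video_tokens]
    return sum(max(dot(qt, vt) for vt in v) for qt in q)


# Toy example: rank two "videos" for one two-token query.
query = [[1.0, 0.0], [0.0, 1.0]]
video_a = [[1.0, 0.0], [0.0, 2.0]]  # has a close match for each query token
video_b = [[1.0, 1.0]]              # one token, partially matching both
ranking = sorted(
    [("video_a", maxsim_score(query, video_a)),
     ("video_b", maxsim_score(query, video_b))],
    key=lambda item: item[1],
    reverse=True,
)
```

Because queries and videos are encoded independently and only interact through this cheap token-wise comparison, video token embeddings can be precomputed, which is the usual motivation for late interaction in bi-encoder retrieval.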

Keywords: Image retrieval

Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer

arXiv, 2023

Authors: Salesky, Elizabeth; Verma, Neha; Koehn, Philipp; Post, Matt. Johns Hopkins University, United States; Human Language Technology Center of Excellence, United States; Microsoft, United States

We introduce and demonstrate how to effectively train multilingual machine translation models with pixel representations. We experiment with two different data settings with a variety of language and script coverage, demonstrating improved performance compared to subword embeddings. We explore various properties of pixel representations such as parameter sharing within and across scripts to better understand where they lead to positive transfer. We observe that these properties not only enable seamless cross-lingual transfer to unseen scripts, but make pixel representations more data-efficient than alternatives such as vocabulary expansion. We hope this work contributes to more extensible multilingual models for all languages and scripts. Copyright © 2023, The Authors. All rights reserved.

Keywords: Pixels

Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream End-to-End ASR

IEEE Spoken Language Technology Workshop

Authors: Ruizhi Li, Gregory Sell, Hynek Hermansky. Center for Language and Speech Processing, The Johns Hopkins University, USA; Human Language Technology Center of Excellence, The Johns Hopkins University, USA

ISBN: (digital) 9781728170664

ISBN: (print) 9781728170671

Performance degradation of an Automatic Speech Recognition (ASR) system is commonly observed when the test acoustic condition differs from the training condition. Hence, it is essential to make ASR systems robust against various environmental distortions, such as background noise and reverberation. In a multi-stream paradigm, improving robustness means handling a variety of unseen single-stream conditions as well as inter-stream dynamics. Previously, a practical two-stage training strategy was proposed within multi-stream end-to-end ASR, where Stage-2 formulates the multi-stream model with features from the Stage-1 Universal Feature Extractor (UFE). In this paper, as an extension, we introduce a two-stage augmentation scheme focusing on mismatch scenarios: Stage-1 Augmentation aims to address single-stream input varieties with data augmentation techniques; Stage-2 Time Masking applies temporal masks on UFE features of randomly selected streams to simulate diverse stream combinations. During inference, we also present adaptive Connectionist Temporal Classification (CTC) fusion with the help of hierarchical attention mechanisms. Experiments have been conducted on two datasets, DIRHA and AMI, as a multi-stream scenario. Compared with the previous training strategy, substantial improvements are reported, with relative word error rate reductions of 29.7% to 59.3% across several unseen stream combinations.
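
The Stage-2 Time Masking step described above, masking a span of frames in the features of randomly selected streams, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the function name `time_mask_streams`, the parameters `max_mask_len` and `stream_prob`, the single contiguous mask per stream, and the zero fill value are all hypothetical choices, since the abstract does not give the paper's mask widths or selection policy.

```python
import random


def time_mask_streams(streams, max_mask_len=3, stream_prob=0.5, seed=None):
    """Zero out one contiguous span of frames in randomly chosen streams.

    streams: list of streams; each stream is a list of frames, and each
             frame is a list of floats (one feature vector per time step).
    Returns a masked copy; the input is left unmodified.
    """
    rng = random.Random(seed)
    masked = []
    for frames in streams:
        frames = [list(f) for f in frames]  # copy so the caller's data is untouched
        # Assumption: each stream is selected independently with stream_prob.
        if frames and rng.random() < stream_prob:
            length = rng.randint(1, min(max_mask_len, len(frames)))
            start = rng.randint(0, len(frames) - length)
            for t in range(start, start + length):
                frames[t] = [0.0] * len(frames[t])  # assumed fill value: zeros
        masked.append(frames)
    return masked


# Two toy streams with 5 and 4 frames of 2-dimensional features.
streams = [[[1.0, 2.0] for _ in range(5)], [[3.0, 4.0] for _ in range(4)]]
augmented = time_mask_streams(streams, max_mask_len=2, stream_prob=1.0, seed=1)
```

Masking different streams on different training examples exposes the model to many combinations of partially missing streams, which is the effect the abstract attributes to this stage.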

Keywords: Training; Error analysis; Focusing; Feature extraction; Robustness; Reverberation; Noise measurement

Exploring Prompt-based Multi-task Learning for Multimodal Dialog State Tracking and Immersive Multimodal Conversation

11th Dialog System Technology Challenge, DSTC 2023

Authors: Chen, Yirong; Li, Ya; Wang, Tao; Xing, Xiaofen; Xu, Xiangmin; Liu, Quan; Liu, Cong; Hu, Guoping. Guangdong Provincial Key Laboratory of Human Digital Twin, School of EE, South China University of Technology, Guangzhou, China; iFLYTEK Research, Hefei, China; Pazhou Lab., Guangzhou, China; School of Future Technology, South China University of Technology, Guangzhou, China; State Key Laboratory of Cognitive Intelligence, Hefei, China; National Engineering Research Center of Speech and Language Information Processing, Hefei, China

With the rise of the metaverse, immersive multimodal conversation has attracted growing attention from researchers. Multimodal contexts will become more important for human-computer interaction in the metaverse, especially in the shopping domain. Unlike traditional conversation tasks, immersive multimodal conversation poses challenges such as multimodal ambiguous candidate identification and multimodal coreference resolution, which make dialog state tracking and response generation more difficult, as described in the SIMMC 2.1 challenge, a part of DSTC11. In particular, as the number of objects in the scene increases, the difficulty increases dramatically. We propose PMTLED (Prompt-based Multi-Task Learning Encoder-Decoder), in which different subtasks use different prompts so that the model tends to focus on the current subtask. We placed first in ambiguous candidate identification and were runner-up in multimodal coreference resolution (MM-Coref), multimodal dialog state tracking (MM-DST), and assistant response generation. Our code and model are made publicly available at https://***/scutcyr/dstc11-simmc2.1-scut-bds-lab. © 2023 Association for Computational Linguistics.

Keywords: Human-computer interaction

Ambiguous Images With Human Judgments for Robust Visual Event Classification

arXiv, 2022

Authors: Sanders, Kate; Kriz, Reno; Liu, Anqi; Van Durme, Benjamin. Johns Hopkins University, Human Language Technology Center of Excellence, United States

Contemporary vision benchmarks predominantly consider tasks on which humans can achieve near-perfect performance. However, humans are frequently presented with visual data that they cannot classify with 100% certainty, and models trained on standard vision benchmarks achieve low performance when evaluated on this data. To address this issue, we introduce a procedure for creating datasets of ambiguous images and use it to produce SQUID-E ("Squidy"), a collection of noisy images extracted from videos. All images are annotated with ground truth values and a test set is annotated with human uncertainty judgments. We use this dataset to characterize human uncertainty in vision tasks and evaluate existing visual event classification models. Experimental results suggest that existing vision models are not sufficiently equipped to provide meaningful outputs for ambiguous images and that datasets of this nature can be used to assess and improve such models through model training and direct evaluation of model calibration. These findings motivate large-scale ambiguous dataset creation and further research focusing on noisy visual data. © 2022, CC BY.

Keywords: Image enhancement

Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models

arXiv, 2021

Authors: Wiesner, Matthew; Raj, Desh; Khudanpur, Sanjeev. Human Language Technology Center of Excellence, Johns Hopkins University, United States; Center for Language and Speech Processing, Johns Hopkins University, United States

Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models. We demonstrate how universal phoneset acoustic models can leverage cross-lingual supervision to improve transfer of pretrained self-supervised representations to new languages. We also show how target-language text can be used to enable and improve fine-tuning with the lattice-free maximum mutual information (LF-MMI) objective. In three low-resource languages these techniques greatly improved few-shot learning performance. © 2021, CC BY.

Keywords: Learning systems

Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Nanxin Chen, Piotr Żelasko, Jesús Villalba, Najim Dehak. Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD

This paper introduces a novel method to diagnose the source-target attention in state-of-the-art end-to-end speech recognition models with joint connectionist temporal classification (CTC) and attention training. Our method is based on the fact that both CTC and source-target attention act on the same encoder representations. To understand the functionality of the attention, CTC is applied to compute the token posteriors given the attention outputs. We found that the source-target attention heads are able to predict several tokens ahead of the current one. Inspired by this observation, we propose a new regularization method that leverages CTC to make source-target attention more focused on the frames corresponding to the output token being predicted by the decoder. Experiments reveal stable relative improvements of up to 7% and 13% with the proposed regularization on TED-LIUM 2 and Librispeech, respectively.

Keywords: Training; Adaptation models; Conferences; Computational modeling; Speech recognition; Signal processing; Acoustics

Radically old way of computing spectra: Applications in end-to-end ASR

arXiv, 2021

Authors: Sadhu, Samik; Hermansky, Hynek. Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

We propose a technique to compute spectrograms using Frequency Domain Linear Prediction (FDLP) that uses all-pole models to fit the squared Hilbert envelope of speech in different frequency sub-bands. The spectrogram of a complete speech utterance is computed by overlap-add of contiguous all-pole model responses. A long context window of 1.5 seconds allows us to capture the low frequency temporal modulations of speech in the spectrogram. For an end-to-end automatic speech recognition task, the FDLP spectrogram performs on par with the standard mel spectrogram features for clean read speech training and test data. For more realistic speech data with train-test domain mismatches or reverberations, FDLP spectrogram shows up to 25% and 22% relative WER improvements over mel spectrogram respectively. Copyright © 2021, The Authors. All rights reserved.

Keywords: Speech recognition