Search results - Inner Mongolia University Library

Unlocking the Secrets Behind Advanced Artificial Intelligence Language Models in Deidentifying Chinese-English Mixed Clinical Text: Development and Validation Study

Journal of Medical Internet Research, 2024, vol. 26, no. 1, p. e48443

Authors: Lee, You-Qian; Chen, Ching-Tai; Chen, Chien-Chang; Lee, Chung-Hong; Chen, Peitsz; Wu, Chi-Shin; Dai, Hong-Jie

Affiliations: Dialogue System Technical Department, Asustek Computer Inc, Taipei, Taiwan; Intelligent System Laboratory, Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan; Center for Precision Health Research, Asia University, Taichung, Taiwan; Electromagnetic Sensing Control and AI Computing System Laboratory, Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Knowledge Discovery and Data Mining Lab, Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan; Department of Chemical Engineering, Feng Chia University, Taichung, Taiwan; National Center for Geriatrics and Welfare Research, National Health Research Institutes, Zhunan, Taiwan; National Institute of Cancer Research, National Health Research Institutes, Tainan, Taiwan; School of Post-Baccalaureate Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan; Center for Big Data Research, Kaohsiung Medical University, Kaohsiung, Taiwan

Background: The widespread use of electronic health records in the clinical and biomedical fields makes the removal of protected health information (PHI) essential to maintain privacy. However, a significant portion of information is recorded in unstructured textual forms, posing a challenge for deidentification. In multilingual countries, medical records could be written in a mixture of more than one language, referred to as code mixing. Most current clinical natural language processing techniques are designed for monolingual text, and there is a need to address the deidentification of code-mixed text. Objective: The aim of this study was to investigate the effectiveness and underlying mechanism of fine-tuned pretrained language models (PLMs) in identifying PHI in the code-mixed context. Additionally, we aimed to evaluate the potential of prompting large language models (LLMs) for recognizing PHI in a zero-shot manner. Methods: We compiled the first clinical code-mixed deidentification data set consisting of text written in Chinese and English. We explored the effectiveness of fine-tuned PLMs for recognizing PHI in code-mixed content, with a focus on whether PLMs exploit naming regularity and mention coverage to achieve superior performance, by probing the developed models’ outputs to examine their decision-making process. Furthermore, we investigated the potential of prompt-based in-context learning of LLMs for recognizing PHI in code-mixed text. Results: The developed methods were evaluated on a code-mixed deidentification corpus of 1700 discharge summaries. We observed that different PHI types had preferences in their occurrences within the different types of language-mixed sentences, and PLMs could effectively recognize PHI by exploiting the learned name regularity. However, the models may exhibit suboptimal results when regularity is weak or mentions contain unknown words that the representations cannot generate well. We also found that the availability of cod

Keywords: ChatGPT; code mixing; deidentification; electronic health record; large language model; pretrained language model

Audio Segmentation Techniques and Applications Based on Deep Learning

Scientific Programming, 2022, vol. 2022, no. 0

Authors: Aggarwal, Shruti; Vasukidevi, G.; Selvakanmani, S.; Pant, Bhaskar; Kaur, Kiranjeet; Verma, Amit; Binegde, Geleta Negasa

Affiliations: Department of Computer Science and Engineering, Thapar University, Punjab, Patiala 147004, India; Department of Science and Humanities, R.M.K. College of Engineering and Technology, R.S.M. Nagar, Tamil Nadu, Puduvoyal, India; Department of Artificial Intelligence and Data Science, Velammal Institute of Technology, Velammal Knowledge Park, Tamil Nadu, Chennai, India; Department of Computer Science and Engineering, Graphic Era Deemed to be University, Bell Road, Clement Town, Uttarakhand, Dehradun 248002, India; University Center for Research and Development, Chandigarh University, Punjab, Ajitgarh, India; Department of Computer Science, College of Engineering and Technology, Mettu University, Metu, Ethiopia

Audio processing has become an inseparable part of modern applications in domains ranging from health care to speech-controlled devices. In automated audio segmentation, deep learning plays a vital role. In this article, we discuss audio segmentation based on deep learning. Audio segmentation divides the digital audio signal into a sequence of segments or frames and then classifies these into various classes such as speech recognition, music, or noise. Segmentation plays an important role in audio signal processing. The most important aspect is to secure a large amount of high-quality data when training a deep learning network. In this study, various application areas, citation records, documents published year-wise, and source-wise analysis are computed using the Scopus and Web of Science (WoS) databases. The analysis presented in this paper supports and establishes the significance of deep learning techniques in audio segmentation. Copyright © 2022 Shruti Aggarwal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Keywords: Speech recognition

Convergence of sequences: A survey

arXiv, 2021

Authors: Franci, Barbara; Grammatico, Sergio

Affiliations: Department of Data Science and Knowledge Engineering, Maastricht University, Maastricht, Netherlands; Delft Center for Systems and Control, Delft University of Technology, Delft, Netherlands

Convergent sequences of real numbers play a fundamental role in many different problems in system theory, e.g., in Lyapunov stability analysis, as well as in optimization theory and computational game theory. In this survey, we provide an overview of the literature on convergence theorems and their connection with Fejér monotonicity in the deterministic and stochastic settings, and we show how to exploit these results. © 2021, CC BY.

Keywords: Surveys

Beware of the simulated DAG! Causal discovery benchmarks may be easy to game

arXiv, 2021

Authors: Reisach, Alexander G.; Seiler, Christof; Weichwald, Sebastian

Affiliations: Department of Mathematical Sciences, University of Copenhagen, Denmark; Department of Data Science and Knowledge Engineering, Maastricht University, Netherlands; Mathematics Centre Maastricht, Maastricht University, Netherlands

Simulated DAG models may exhibit properties that, perhaps inadvertently, render their structure identifiable and unexpectedly affect structure learning algorithms. Here, we show that marginal variance tends to increase along the causal order for generically sampled additive noise models. We introduce varsortability as a measure of the agreement between the order of increasing marginal variance and the causal order. For commonly sampled graphs and model parameters, we show that the remarkable performance of some continuous structure learning algorithms can be explained by high varsortability and matched by a simple baseline method. Yet, this performance may not transfer to real-world data where varsortability may be moderate or dependent on the choice of measurement scales. On standardized data, the same algorithms fail to identify the ground-truth DAG or its Markov equivalence class. While standardization removes the pattern in marginal variance, we show that data generating processes that incur high varsortability also leave a distinct covariance pattern that may be exploited even after standardization. Our findings challenge the significance of generic benchmarks with independently drawn parameters. The code is available at https://***/Scriddie/Varsortability. Copyright © 2021, The Authors. All rights reserved.

Keywords: Standardization
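
As a rough illustration of the varsortability measure described in the abstract above, the sketch below computes, for a known DAG, the fraction of directed paths whose end node has a larger marginal variance than their start node, counting ties as one half. The function name `varsortability`, the adjacency convention (`adj[i][j]` nonzero means an edge i -> j), the absolute tolerance, and the three-node chain used as toy data are all assumptions made for this example; the authors' own code is in the repository named in the abstract.

```python
import random
from statistics import mean, pstdev, pvariance


def varsortability(columns, adj, tol=1e-9):
    """Fraction of directed paths i -> ... -> j with Var(X_i) < Var(X_j).

    columns: list of d lists, one per variable (layout assumed for this sketch).
    adj: d x d nested list, adj[i][j] != 0 means an edge i -> j.
    Variances that differ by at most tol count as a tie (one half).
    """
    d = len(adj)
    var = [pvariance(col) for col in columns]
    edges = [[1 if adj[i][j] else 0 for j in range(d)] for i in range(d)]
    reach = [row[:] for row in edges]  # reach[i][j] = 1 if a length-k path exists
    agree, total = 0.0, 0
    for _ in range(d - 1):  # path lengths 1 .. d-1 suffice in a DAG
        for i in range(d):
            for j in range(d):
                if not reach[i][j]:
                    continue
                total += 1
                diff = var[j] - var[i]
                if diff > tol:
                    agree += 1.0
                elif abs(diff) <= tol:
                    agree += 0.5
        # Extend every path by one more edge.
        reach = [[1 if any(reach[i][k] and edges[k][j] for k in range(d)) else 0
                  for j in range(d)] for i in range(d)]
    return agree / total if total else float("nan")


# Toy additive noise model: X0 -> X1 -> X2 with unit weights and unit noise,
# so the marginal variance grows along the causal order.
random.seed(0)
n = 2000
x0 = [random.gauss(0, 1) for _ in range(n)]
x1 = [a + random.gauss(0, 1) for a in x0]
x2 = [b + random.gauss(0, 1) for b in x1]
adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

raw_score = varsortability([x0, x1, x2], adj)

# Standardizing each column removes the variance pattern: every pair ties.
standardized = [[(v - mean(col)) / pstdev(col) for v in col] for col in (x0, x1, x2)]
std_score = varsortability(standardized, adj)
```

On this toy chain the raw data is fully variance-sortable, while the standardized data scores one half because all variances tie, which mirrors the abstract's point that standardization removes the pattern in marginal variance.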

Block-Structured Optimization for Subgraph Detection in Interdependent Networks

arXiv, 2022

Authors: Jie, Fei; Wang, Chunpai; Chen, Feng; Li, Lei; Wu, Xindong

Affiliations: Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, Ministry of Education, Hefei, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Department of Computer Science, University at Albany – SUNY, Albany, NY, United States; Erik Jonsson School of Engineering & Computer Science, The University of Texas at Dallas, Dallas, TX, United States; Mininglamp Academy of Sciences, Mininglamp Technologies, Beijing, China; Institute of Big Knowledge Science, Hefei University of Technology, Hefei, China

We propose a generalized framework for block-structured nonconvex optimization, which can be applied to structured subgraph detection in interdependent networks, such as multi-layer networks, temporal networks, networks of networks, and many others. Specifically, we design an effective, efficient, and parallelizable projection algorithm, namely Graph Block-structured Gradient Projection (GBGP), to optimize a general non-linear function subject to graph-structured constraints. We prove that our algorithm: 1) runs in nearly linear time in the network size; 2) enjoys a theoretical approximation guarantee. Moreover, we demonstrate how our framework can be applied to two very practical applications and conduct comprehensive experiments to show the effectiveness and efficiency of our proposed algorithm. Copyright © 2022, The Authors. All rights reserved.

Keywords: Functions

SmaAt-UNet: Precipitation Nowcasting using a Small Attention-UNet Architecture

arXiv, 2020

Authors: Trebing, Kevin; Mehrkanoon, Siamak

Affiliations: Department of Data Science and Knowledge Engineering, Maastricht University, Netherlands

Weather forecasting is dominated by numerical weather prediction, which tries to model accurately the physical properties of the atmosphere. A downside of numerical weather prediction is that it lacks the ability to produce short-term forecasts using the latest available information. By using a data-driven neural network approach we show that it is possible to produce an accurate precipitation nowcast. To this end, we propose SmaAt-UNet, an efficient convolutional neural network based on the well-known UNet architecture equipped with attention modules and depthwise-separable convolutions. We evaluate our approach on a real-life dataset using precipitation maps from the region of the Netherlands. The experimental results show that in terms of accuracy the proposed model is comparable to other examined models while only using a quarter of the trainable parameters. Copyright © 2020, The Authors. All rights reserved.

Keywords: Weather forecasting
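
To give a feel for why the depthwise-separable convolutions mentioned in the abstract above reduce the parameter count, the sketch below compares the number of weights in a single standard convolution with a depthwise convolution followed by a 1 x 1 pointwise convolution. The channel sizes and kernel size are hypothetical values chosen for illustration, bias terms are ignored, and the result is a per-layer figure only; it does not reproduce the whole-model "quarter of the trainable parameters" figure reported in the abstract.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(64, 128, 3)                   # 3*3*64*128 = 73728
separable = depthwise_separable_params(64, 128, 3)   # 576 + 8192 = 8768
ratio = separable / standard
```

For this assumed layer the separable variant needs roughly 12% of the weights of the standard convolution; the saving for a full network depends on which layers are replaced and on the other modules it contains.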

On sorting with a network of two stacks

19th Symposium on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems, ATMOS 2019

Authors: Mihalák, Matúš; Pont, Marc

Affiliations: Department of Data Science and Knowledge Engineering, Maastricht University, Netherlands

ISBN: 9783959771283 (print)

Sorting with stacks is a collection of problems that deal with sorting a sequence of numbers by pushing and popping the numbers to and from a given set of stacks. Multiple concrete decision or optimization questions are formed by restricting the access to the stacks. The motivation comes, e.g., from shunting train wagons in shunting yards, shunting trams in depots, or in stacking cargo containers on cargo ships or storage yards in transshipment terminals. We consider the problem of sorting a permutation of n integers 1, 2, ..., n using k ≥ 2 stacks. In this problem, elements from the input sequence are pushed one-by-one (in the order of the elements in the sequence) to one of the k stacks. At any time, an element from a stack can be popped and pushed to another stack; such an operation is called a shuffle. Also, at any time, an element can be popped from a stack and placed to the output sequence. We can only place the elements to the output in the increasing order of their value such that at the end the output is the ordered sequence of the elements. The problem asks to minimize the number of shuffles in the process. It is known that for k ≥ 4, the problem is NP-hard, and that there is no approximation algorithm unless P=NP. For k ≥ 3, it is known that at most O(n log n) shuffles are needed for any input sequence. For the case when k = 2, there exist input sequences that require Ω(n^(2−ε)) shuffles, for any ε > 0. Nothing substantially more is known for the case of k = 2. In this paper, we study the following variant of the problem with k = 2 stacks: no shuffle and no placement to the output sequence can happen before every element is in one of the two stacks. We show that our problem can be seen as the MinUnCut problem by providing a polynomial-time reduction, and thus we show that there exists a randomized O(√(log n))-approximation algorithm and a deterministic O(log n)-approximation algorithm for our problem. © Matúš Mihalák and Marc Pont; licensed under Creative Comm

Keywords: Optimization
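
The two-stack variant described in the abstract above (all elements are pushed first, then shuffles and outputs follow) can be simulated in a few lines. The sketch below is only an illustration and not the MinUnCut-based approximation algorithm from the paper: `count_shuffles` unloads with a simple rule assumed here (move everything lying above the next required element to the other stack), and `best_assignment_cost` brute-forces all stack assignments under that rule, so it yields an upper bound on the optimum and is practical only for very short sequences. Both function names are hypothetical.

```python
from itertools import product


def count_shuffles(sequence, assignment):
    """Count shuffles for a fixed push assignment (0 or 1 per element).

    Unloading rule (an assumption of this sketch): to output the next value,
    move every element lying above it onto the other stack, one shuffle each.
    """
    stacks = ([], [])
    for value, s in zip(sequence, assignment):
        stacks[s].append(value)          # push phase: input order is kept
    shuffles = 0
    for target in sorted(sequence):      # output must be in increasing order
        s = 0 if target in stacks[0] else 1
        while stacks[s][-1] != target:
            stacks[1 - s].append(stacks[s].pop())
            shuffles += 1
        stacks[s].pop()                  # place target in the output
    return shuffles


def best_assignment_cost(sequence):
    """Minimum shuffle count over all 2^n assignments under the rule above."""
    return min(count_shuffles(sequence, a)
               for a in product((0, 1), repeat=len(sequence)))


single_stack_cost = count_shuffles([3, 1, 2], [0, 0, 0])
best_cost = best_assignment_cost([3, 1, 2])
```

For the input 3, 1, 2, pushing everything onto one stack costs one shuffle, whereas splitting the input into two decreasing subsequences (3, 2 on one stack and 1 on the other) needs none.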

Deep multi-stations weather forecasting: Explainable recurrent convolutional neural networks

arXiv, 2020

Authors: Abdellaoui, Ismail Alaoui; Mehrkanoon, Siamak

Affiliations: Department of Data Science and Knowledge Engineering, Maastricht University, Netherlands

Deep learning applied to weather forecasting has started gaining popularity because of the progress achieved by data-driven models. The present paper compares two different deep learning architectures to perform weather prediction on daily data gathered from 18 cities across Europe and spanning a period of 15 years. We propose the Deep Attention Unistream Multistream (DAUM) networks that investigate different types of input representations (i.e. tensorial unistream vs. multistream) as well as the incorporation of the attention mechanism. In particular, we show that adding a self-attention block within the models increases the overall forecasting performance. Furthermore, visualization techniques such as occlusion analysis and score maximization are used to give additional insight into the most important features and cities for predicting a particular target feature of target cities. © 2020, CC BY-SA.

Keywords: Weather forecasting

A generative policy gradient approach for learning to play text-based adventure games

31st Benelux Conference on Artificial Intelligence and the 28th Belgian Dutch Conference on Machine Learning, BNAIC/BENELEARN 2019

Authors: Raab, René; Driessens, Kurt

Affiliations: Department of Data Science and Knowledge Engineering, Maastricht University, Netherlands