Search results - Inner Mongolia University Library

6th International Conference on Communication and Computational Technologies, ICCCT 2024

Authors: Vasudevan, Vibhav; Ramakrishnan, Srinivas; Seth, Utkarsh; Shreya, M.B.; Shylaja, S.S. Affiliation: Center for Data Science and Applied Machine Learning, RR Campus, Karnataka, Bengaluru, India

ISBN: (print) 9789819774258

Video anomaly detection (VAD) is a demanding task because the very definition of anomalies in videos is inherently inconclusive and also due to the high manpower required to supervise lengthy videos. This research paper introduces a novel method for anomaly detection in videos. It utilizes the concurrent output of two deep learning models: the Convolutional Autoencoder (Conv-AE) for anomaly detection based on reconstruction errors and the Convolutional Long Short-Term Memory (ConvLSTM) for future frame prediction. The Conv-AE detects anomalies by capitalizing on its excellent spatial learning capabilities and the ConvLSTM model is helpful owing to its powerful temporal modeling abilities. By running these two models in parallel and normalizing the results obtained from both, we found that our combined model (CAELSTM) gave satisfactory results (AUROC) for two of the most prevalent datasets in this field of VAD, namely CUHK Avenue (77.44%) and Ped2 (87.31%), showcasing its promising performance. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
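The score fusion described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says that the two models' results are normalized and combined, so the min-max normalization, the equal weighting, and the names `min_max` and `fuse_scores` are all assumptions.

```python
def min_max(values):
    """Scale a list of per-frame errors to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant sequence carries no ranking information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse_scores(recon_errors, pred_errors, weight=0.5):
    """Combine Conv-AE reconstruction errors and ConvLSTM prediction errors.

    `weight` is a hypothetical mixing parameter (assumed, not from the paper):
    0.5 gives both models equal influence on the per-frame anomaly score.
    """
    recon = min_max(recon_errors)
    pred = min_max(pred_errors)
    return [weight * r + (1.0 - weight) * p for r, p in zip(recon, pred)]


if __name__ == "__main__":
    # Toy per-frame errors; higher fused score means "more anomalous".
    print(fuse_scores([1.0, 2.0, 3.0], [10.0, 30.0, 20.0]))
```

Normalizing each stream separately keeps one model's error scale from dominating the other before the scores are thresholded or fed to an AUROC computation.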

Keywords: Video analysis

Optimized Automated Stock Trading using DQN and Double DQN

2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems, IACIS 2024

Authors: Bharadwaj, Gurudutt S; Pratap, David; Darapaneni, Narayana. Affiliations: Pes University, Department of Computer Science, Bengaluru, India; Great Learning, Department of Data Science and Machine Learning, Bengaluru, India

ISBN: (print) 9798350360660

Stock portfolio management involves managing the buying, holding, and selling decisions for the various stocks in the portfolio. Prior work has used reinforcement learning (RL) based actor-critic methods such as Deep Deterministic Policy Gradient (DDPG) for asset allocation problems. Here an attempt has been made to use solely critic-based value function methods, namely Deep Q-Network (DQN) and Double DQN, for estimating Q-values of market actions. An optimized portfolio management algorithm is then designed to balance trades across a basket of stocks. Five stocks of different price ranges are chosen from the NYSE and NASDAQ stock exchanges. The average cumulative percentage return provided by DQN was 55% on testing data, with an average Maximum Drawdown (MDD) of 2.5%. With Double DQN, it was 71% on testing data, with an average MDD of 2.83%. These results were significantly better than those of a traditional Buy and Hold strategy. © 2024 IEEE.
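The difference between the two value-based methods named in the abstract lies in how the bootstrap target is formed. The sketch below shows the standard textbook targets, not the paper's trading system; the three-action layout (buy, hold, sell), the discount factor, and the function names are assumptions for illustration.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for the actions [buy, hold, sell] in the next state.
    online = [0.2, 0.9, 0.1]
    target = [0.5, 0.3, 0.8]
    print(dqn_target(1.0, target, gamma=0.5))
    print(double_dqn_target(1.0, online, target, gamma=0.5))
```

In this toy case the DQN target uses the largest target-network value, while Double DQN uses the target-network value of the action the online network prefers, so its target is lower.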

Keywords: Reinforcement learning

AgentFusion: A Multi-Agent Approach to Accurate Text Generation

2024 International Conference on Electrical and Computer Engineering Researches, ICECER 2024

Authors: Saeid, Yasser; Kopinski, Thomas. Affiliation: South Westphalia University of Applied Sciences, Machine Learning - Data Science, Meschede, Germany

ISBN: (print) 9798331539733

The rise of large language models (LLMs) like ChatGPT has significantly transformed the field of natural language processing (NLP). These models are now central to many companies' operations due to their capabilities in generating human-like text, understanding context, and responding to queries with high fluency. However, LLMs are not without flaws. They sometimes generate inaccurate or even completely fabricated information, a phenomenon known as 'hallucination.' This issue underscores the increasing importance of Retrieval-Augmented Generation (RAG) systems, which aim to enhance the accuracy of LLM outputs by incorporating data from external sources. RAG systems are especially valuable when dealing with non-English content, such as German-language tasks, where high-quality data retrieval is crucial for achieving accurate results. Developing an effective RAG system, however, is complex and requires careful consideration of several critical elements. In this paper, we present a new approach to RAG, which we refer to as an 'agentic RAG' system. This system utilizes three distinct agents that collaborate to optimize the output. We rigorously tested this system across various types of embeddings and benchmarked its performance against GPT-4. Our results indicate that the agentic RAG system significantly improves accuracy, particularly for German-language content, achieving a 24/25 accuracy score in tasks related to reactor decommissioning, finance, and sports domains. These results demonstrate the broad applicability and performance gains of agentic RAG systems in multilingual NLP tasks. © 2024 IEEE.
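The retrieval step that any RAG system relies on can be sketched as a nearest-neighbor search over embeddings. This is a generic illustration only: the abstract does not describe the three agents or the embedding models, so the toy two-dimensional vectors, the cosine-similarity ranking, and the names `cosine` and `retrieve` are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, documents, k=2):
    """Return the texts of the k documents most similar to the query.

    `documents` is a list of (text, embedding) pairs; in a real system the
    embeddings would come from an embedding model rather than being hand-written.
    """
    ranked = sorted(documents, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    docs = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(retrieve([1.0, 0.1], docs, k=2))
```

The retrieved passages would then be placed in the prompt so that the generator answers from external data rather than from its parameters alone.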

Keywords: Semantics

Video understanding: Tagging of videos through self attentive learnable key descriptors

11th International Symposium on Electronic Systems Devices and Computing, ESDC 2023

Authors: Darapaneni, Narayana; Paduri, Anwesh Reddy; Thomas, Dinu; Jisha, C.U.; Shrivastava, Abhinao; Biradar, Seema. Affiliation: Pes University, Data Science and Machine Learning, Bangalore, India

ISBN: (print) 9781665455725

User-generated content (UGC) videos have grown exponentially. Billions of videos are uploaded, played, and exchanged between different actors. In this context, automatic video content classification has become a critical and challenging problem, especially in areas such as video-based search and recommendation. In this work we extract frame-level visual and audio features; the pre-extracted features are then converted into a compact video-level representation effectively and efficiently. We aim to classify each video into a set of categories with high accuracy. From the literature survey, we identified that video tagging is a problem that has not yet reached maturity, and that much research is under way in this area. Clustering-based video description methodologies are observed to show better results than temporal algorithms. We also identified that the majority of SOTA techniques use the VLAD (Vector of Locally Aggregated Descriptors) technique to extract video features and make the codebook learnable through adjustments introduced in NetVLAD. The key descriptors are mostly noisy, and many of them are insignificant. In this work we aim to cascade a Self-Attention Block on NetVLAD that can extract the significant descriptors and filter out the noise. The YouTube 8M dataset will be used for training the model, and performance will be compared with other SOTA techniques. As in similar works, model performance will be measured by the GAP metric (Global Average Precision) over the predicted labels of all videos. We aim to achieve a GAP score close to 85% for this work. © 2023 IEEE.
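The GAP metric mentioned in the abstract can be sketched as an average precision computed over one pooled, confidence-ranked list of predictions. This is a minimal sketch under assumptions: the function name and input layout are hypothetical, and the pooling of a fixed number of top predictions per video (commonly 20 in the YouTube-8M challenge evaluation) is left to the caller.

```python
def global_average_precision(predictions, total_positives=None):
    """Compute GAP over pooled (confidence, is_correct) pairs from all videos.

    `total_positives` is the number of ground-truth labels across all videos;
    if omitted, it is assumed to equal the number of correct pairs in the pool.
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    if total_positives is None:
        total_positives = sum(1 for _, correct in ranked if correct)
    if total_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, (_, correct) in enumerate(ranked, start=1):
        if correct:
            hits += 1
            # Precision at this rank, counted only where recall increases.
            precision_sum += hits / rank
    return precision_sum / total_positives


if __name__ == "__main__":
    pooled = [(0.9, True), (0.8, False), (0.7, True)]
    print(global_average_precision(pooled))
```

Because all videos share one ranked list, a confident wrong label on any video lowers the precision of every correct label ranked below it.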

Keywords: Video recording

Video Label Enhancing and Standardization through Transcription and WikiId Mapping Techniques

11th International Symposium on Electronic Systems Devices and Computing, ESDC 2023

Authors: Thomas, Dinu; Pratap, David; Sudha, B.G. Affiliation: Pes University, Data Science and Machine Learning, Bangalore, India

ISBN: (print) 9781665455725

The volume of video content surpasses that of all other content types on the internet. According to reports from different sources, video traffic accounted for 82% of internet usage in 2022. Video is going to become more important in the years to come for user engagement, advertising and marketing, news, education, etc. Video information retrieval becomes an important problem to solve in this context. An accurate and fast video tagging system can support good content recommendations for end users. It also helps to audit content automatically, so that platforms can control content that is politically and morally harmful. There are currently few fast or cost-effective mechanisms to tag user-generated videos. Manual tagging is a costly and highly time-consuming task. A delay in indexing videos such as news and sports reduces their freshness and relevance. Deep learning techniques have reached maturity for content such as text and images, but this is not the case for videos. Deep learning models need more resources to deal with videos because of their multi-modal nature and temporal behavior. Apart from that, not many large-scale video datasets are available at this moment. Youtube-8M is the largest publicly available dataset as of now. Much research has been done on the Youtube-8M dataset. From our study, all of these works share a potential limitation. For example, Youtube-8M has only around 3.8K video labels, which do not cover all real-world tags. It does not cover the new domains created along with the surge in content traffic. This study aims to handle this problem of tag creation through the different methods available, thereby enhancing the labels to a much wider set. This work also aims to produce a scalable tagging pipeline that uses multiple retrieval mechanisms and combines their results. The work aims to standardize the retrieved tokens across languages. This work creates a dataset as an outcome from 'Wikidata', which can be used for any NLP

Keywords: Standardization

Hybrid speech enhancement in modulation domain

Multimedia Tools and Applications, 2024, pp. 1-31

Authors: P S, Praveen Kumar; H S, Jayanna. Affiliations: Technical Lead-Machine Learning, Merit Data and Technology, Tamil Nadu, Chennai, India; Department of Information Science and Engineering, Siddaganga Institute of Technology, Karnataka, Tumkur, India

This paper presents a comprehensive study on speech enhancement (SE) techniques, particularly focusing on the utilization of the discrete cosine transform (DCT) in the modulation domain (MD) in combination with the minimum mean square error (MMSE) and Kalman filter (KF) algorithms. The research investigates the efficacy of these techniques in improving speech quality and intelligibility under various noisy conditions, such as babble, drilling, horn, train, and traffic noises, across different signal-to-noise ratio (SNR) levels. Additionally, the study evaluates the performance of recurrent neural network (RNN) algorithms alongside traditional SE approaches. The study highlights the advantages of employing DCT in the modulation domain over the traditional discrete Fourier transform (DFT) in SE applications. Experimental results demonstrate significant improvements in objective quality metrics, including Perceptual Evaluation of Speech Quality (PESQ), composite measures, and short-time objective intelligibility (STOI), when implementing SE algorithms using the discrete cosine transform in the modulation domain. The proposed hybrid speech enhancement techniques, leveraging minimum mean square error and Kalman filter in combination with the discrete cosine transform in the modulation domain, outperform individual speech enhancement algorithms, showcasing superior noise reduction capabilities. This paper contributes to the advancement of speech enhancement methodologies, particularly in real-world noisy environments, and underscores the effectiveness of discrete cosine transform-based approaches in enhancing speech quality and intelligibility. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Recurrent neural networks

Image Recognition for Wildlife Conservation

2025 IEEE International Conference on Computational, Communication and Information Technology, ICCCIT 2025

Authors: Charanya, P.; Sridharan, S.; Yuvan Shankar, S.; Sriram, M.; Yashvanth, K.P. Affiliations: Department of Artificial Intelligence and Machine Learning, Coimbatore, India; Department of Artificial Intelligence and Data Science, Coimbatore, India

ISBN: (digital) 9798331512965

ISBN: (print) 9798331512965

This study adopts an empirical approach to evaluate the efficacy of image processing methods in conservation efforts for animals. The initial phase involves the collection of data from various sources within the natural ecosystem, including images captured through camera traps, drones, and other monitoring devices. This comprehensive and diverse dataset is then processed using advanced image processing tools. A key strength of this methodology lies in its use of machine learning algorithms, particularly YOLO v8 (You Only Look Once) and Support Vector Machines (SVMs). These techniques are suitable for wildlife conservation because of their capacity to analyze complex visual data and detect patterns indicative of species presence, behavior, and habitat conditions. The paper outlines the application of YOLO v8 and SVMs to the collected image data, enabling accurate identification, real-time detection, and classification of wildlife species. This capability is essential for monitoring populations, tracking movements, and identifying potential threats, thus facilitating timely conservation actions. Furthermore, the research extends its evaluation by comparing the results of the machine learning approach with traditional ecological survey methods, offering a comprehensive assessment of the effectiveness of image processing in wildlife conservation. © 2025 IEEE.

Keywords: Invertebrates

Privacy Preserving Data Imputation via Multi-party Computation for Medical Applications

2024 IEEE International Conference on E-Health Networking, Application and Services, HealthCom 2024

Authors: Jentsch, Julia; Ünal, Ali Burak; Mağara, Şeyma Selcan; Akgün, Mete. Affiliation: Department of Computer Science, Medical Data Privacy and Privacy Preserving Machine Learning, Tübingen, Germany

ISBN: (digital) 9798350350548

ISBN: (print) 9798350350548

Handling missing data is crucial in machine learning, but many datasets contain gaps due to errors or non-response. Unlike traditional methods such as listwise deletion, which are simple but inadequate, the literature offers more sophisticated and effective methods, thereby improving sample size and accuracy. However, these methods require accessing the whole dataset, which contradicts the privacy regulations when the data is distributed among multiple sources. Especially in the medical and healthcare domain, such access reveals sensitive information about patients. This study addresses privacy-preserving imputation methods for sensitive data using secure multi-party computation, enabling secure computations without revealing any party's sensitive information. In this study, we realized the mean, median, regression, and kNN imputation methods in a privacy-preserving way. We specifically target the medical and healthcare domains considering the significance of protection of the patient data, showcasing our methods on a diabetes dataset. Experiments on the diabetes dataset validated the correctness of our privacy-preserving imputation methods, yielding the largest error around 3 × 10^-3, closely matching plaintext methods. We also analyzed the scalability of our methods to varying numbers of samples, showing their applicability to real-world healthcare problems. Our analysis demonstrated that all our methods scale linearly with the number of samples. Except for kNN, the runtime of all our methods indicates that they can be utilized for large datasets. © 2024 IEEE.
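The idea of computing a mean for imputation without any party revealing its own values can be sketched with additive secret sharing. This is a simplified illustration, not the protocol used in the paper: the modulus, the fixed-point `scale`, and all function names are assumptions, and this sketch reveals the global sum and count when the result is reconstructed.

```python
import secrets

PRIME = 2**61 - 1  # Modulus for the share arithmetic (an assumed choice).


def share(value, n_parties):
    """Split an integer into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares; values above PRIME // 2 are decoded as negatives."""
    total = sum(shares) % PRIME
    return total - PRIME if total > PRIME // 2 else total


def secure_mean(values_per_party, scale=1000):
    """Mean of all observed values, using only shared local sums and counts."""
    n = len(values_per_party)
    # Each party secret-shares its fixed-point local sum and its local count.
    sum_shares = [share(round(sum(v) * scale), n) for v in values_per_party]
    count_shares = [share(len(v), n) for v in values_per_party]
    # Party j adds up the j-th share it received from every party.
    partial_sums = [sum(s[j] for s in sum_shares) % PRIME for j in range(n)]
    partial_counts = [sum(c[j] for c in count_shares) % PRIME for j in range(n)]
    total = reconstruct(partial_sums)
    count = reconstruct(partial_counts)
    return total / scale / count


if __name__ == "__main__":
    # Three hypothetical hospitals holding observed values of one feature.
    print(secure_mean([[5.0, 7.0], [6.0], [8.0, 9.0]]))
```

Each individual share is uniformly random, so no party learns another party's local sum; only the aggregate used to fill the missing entries is recovered.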

Keywords: Differential privacy

Structural Breakpoint Detection in Noisy Time Series Using an Augmented Matrix Profile Index

24th IEEE International Conference on Data Mining Workshops, ICDMW 2024

Authors: Fuchs, Marco. Affiliation: Catholic University of Eichstätt-Ingolstadt, Mathematical Institute for Machine Learning and Data Science, Ingolstadt, Germany

ISBN: (print) 9798331530631

We introduce a method for partitioning a time series into segments. The method extends the recently introduced Fast Low-Cost Semantic Segmentation (FLUSS) algorithm to increase its robustness against noise and to automatically learn two required hyperparameters. FLUSS is part of the Matrix Profile framework and conceptualises the relationship between every subsequence and its nearest-neighbor as an arc spanning over the time series. It then determines segment boundaries by selecting time indexes which are spanned over by significantly fewer arcs than an average index, such that most subsequences on either side of the time index find their closest neighbor at the same side of the index. We extend FLUSS in two directions. First, we consider the Z nearest neighbors of each subsequence to compute its neighbors' concentration along the time series as a measure for how characteristic a sequence itself is for a particular segment. This measure is then used to filter out non-characteristic sequences, aimed at increasing the robustness of the segmentation against noise. Second, FLUSS requires the user to set two important yet hard to determine hyperparameters, namely the number of segments and the length of exclusion zones around the index of an already found segment boundary. We introduce a procedure to learn both hyperparameters implicitly by setting the tolerance parameter of an iterative end-point fit algorithm, which extracts (local) minima from a meta time series denoting the number of arcs that cross a time series index. We test the method on a set of 32 diverse and publicly available univariate time series, where it demonstrates promising results. © 2024 IEEE.
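The arc-counting step of the baseline FLUSS algorithm described above can be sketched as follows. This covers only the baseline idea, not the paper's extensions (Z nearest neighbors, learned hyperparameters). The input `nn_index`, which gives each subsequence's nearest-neighbor position, is assumed to be precomputed, for example from a matrix profile index. The function names and the parabola used as the idealized arc curve are assumptions.

```python
def arc_curve(nn_index):
    """Count, for each position i, the arcs that span the gap between i and i + 1.

    An arc connects subsequence j with its nearest neighbor nn_index[j].
    """
    n = len(nn_index)
    marks = [0] * n
    for j, neighbor in enumerate(nn_index):
        start, end = min(j, neighbor), max(j, neighbor)
        marks[start] += 1  # The arc begins covering positions here...
        marks[end] -= 1    # ...and stops covering them here.
    counts, running = [], 0
    for m in marks:
        running += m
        counts.append(running)
    return counts


def corrected_arc_curve(nn_index):
    """Normalize arc counts by an idealized parabola so edges are not favored."""
    n = len(nn_index)
    counts = arc_curve(nn_index)
    corrected = []
    for i, c in enumerate(counts):
        ideal = 2.0 * i * (n - i) / n  # Expected crossings for random arcs (assumed form).
        corrected.append(1.0 if ideal == 0 else min(c / ideal, 1.0))
    return corrected


if __name__ == "__main__":
    # Toy index: positions 0-3 match each other, positions 4-7 match each other.
    nn = [2, 3, 0, 1, 6, 7, 4, 5]
    print(arc_curve(nn))
    print(corrected_arc_curve(nn))
```

In the toy input no arc spans the gap after position 3, so the corrected curve drops to zero there, marking the segment boundary. Values near the ends of the series are unreliable and are normally excluded.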

Keywords: Semantic Segmentation

EPR-Net: constructing a non-equilibrium potential landscape via a variational force projection formulation

National Science Review, 2024, Vol. 11, No. 7, pp. 135-147

Authors: Yue Zhao; Wei Zhang; Tiejun Li. Affiliations: Center for Data Science, Peking University; Zuse Institute Berlin; Department of Mathematics and Computer Science, Freie Universität Berlin; Laboratory of Mathematics and Applied Mathematics (LMAM) and School of Mathematical Sciences, Peking University; Center for Machine Learning Research, Peking University

We present EPR-Net, a novel and effective deep learning approach that tackles a crucial challenge in biophysics: constructing potential landscapes for high-dimensional non-equilibrium steady-state ***-Net leverages a nice mathematical fact that the desired negative potential gradient is simply the orthogonal projection of the driving force of the underlying dynamics in a weighted inner-product ***, our loss function has an intimate connection with the steady entropy production rate (EPR), enabling simultaneous landscape construction and EPR estimation. We introduce an enhanced learning strategy for systems with small noise, and extend our framework to include dimensionality reduction and the state-dependent diffusion coefficient case in a unified fashion. Comparative evaluations on benchmark problems demonstrate the superior accuracy, effectiveness and robustness of EPR-Net compared to existing methods. We apply our approach to challenging biophysical problems, such as an eight-dimensional (8D) limit cycle and a 52D multi-stability problem, which provide accurate solutions and interesting insights on constructed landscapes. With its versatility and power, EPR-Net offers a promising solution for diverse landscape construction problems in biophysics.

Keywords: high-dimensional potential landscape; non-equilibrium system; entropy production rate; dimensionality reduction; deep learning
