Search Results - Inner Mongolia University Library

An optimized multi-scale convolutional autoencoder for efficient abnormal event detection using RGB, depth and optical flow data

Multimedia Tools and Applications, 2025, pp. 1-35

Author: Alqahtani, Abdullah. Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, KSA, Al-Kharj, Saudi Arabia

In this study, we propose a novel framework for detecting abnormal events in surveillance videos, a critical yet challenging task in security applications. This research introduces a robust and efficient solution for video anomaly detection, offering substantial improvements in surveillance systems' ability to detect abnormal events, thereby contributing to enhanced security measures in public spaces. The proposed framework utilizes a Multiscale Convolutional Autoencoder (MSCAE) that processes inputs from RGB, depth, and optical flow video clips, enhancing the detection accuracy in complex scenes characterized by varying object scales, aspect ratios, and occlusions. To address the challenge of noise and preserve edges in video data, we implement a two-pass bilateral smooth filtering method, which is effective for noise-invariant, edge-preserving image smoothing. For object detection within these complex scenes, an enhanced Faster R-CNN model is employed. This model's performance is further refined through transfer learning on a dataset specifically composed of abnormal event videos. We also introduce significant improvements to the region proposal network (RPN) of the Faster R-CNN, particularly in non-maximum suppression (NMS) and anchor generation techniques, to better detect anomalies in diverse and complex environments. Furthermore, the MSCAE is integrated with Long Short-Term Memory (LSTM) neural networks to classify the detected anomalies, creating an end-to-end solution for video anomaly detection. Hyperparameter optimization for our deep learning models is performed using the Chameleon Swarm Algorithm, ensuring optimal model performance. Our framework was rigorously tested on the CUHK Avenue dataset, where it achieved a remarkable 99.5% accuracy, significantly outperforming existing methods and demonstrating the effectiveness of our approach. © The Author(s), under exclusive licence to Springer science+Business Media, LLC, part of Springer Nature 2025.

Keywords: Optical flows
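
The abstract above mentions changes to non-maximum suppression (NMS) in the region proposal network but does not describe them. For orientation only, the following is a minimal sketch of the standard greedy NMS baseline that such changes usually start from; it is not the paper's modified procedure. The function names, the example boxes and the 0.5 threshold are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, visiting boxes from highest score down."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Illustrative data: the second box overlaps the first heavily and is dropped.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
```

Greedy NMS is quadratic in the number of proposals in the worst case, which is one reason detection pipelines often revisit this step.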

IWM-LSTM encoder for abstractive text summarization

Multimedia Tools and Applications, 2025, vol. 84, no. 9, pp. 5883-5904

Authors: Gangundi, Ravindra; Sridhar, Rajeswari. Department of Computer Science and Engineering, National Institute of Technology, Tamilnadu, Tiruchirappalli 620015, India

Sequence-to-sequence models are fundamental building blocks for generating abstractive text summaries and can produce precise and coherent summaries. Recently proposed text summarization models have aimed to enhance summarization performance through copying mechanisms, reinforcement learning, and multiple-level encoders. However, there has been limited research on improving the summarization output by modifying the structure of the long short-term memory (LSTM) cell. We introduce an improved version of LSTM called improved working memory LSTM (IWM-LSTM). IWM-LSTM removes the output gate and enhances the input and forget gates by incorporating cell state information into these gates. In our sequence-to-sequence model for text summarization, we replaced the LSTM encoder with a bi-directional IWM-LSTM, resulting in better summaries with minimal training time and lower computational cost. Additionally, we utilized Bidirectional Encoder Representations from Transformers (BERT) embeddings to enhance the ROUGE score. The CNN/DailyMail dataset is used to train and test the model. The proposed model achieves better Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores than state-of-the-art models. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Embeddings
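
The abstract above states only that IWM-LSTM removes the output gate and feeds cell state information into the input and forget gates; it gives no equations. The sketch below is one possible reading of that description, written for scalar states to keep it short. The peephole-style `w_c` term, the choice `h = tanh(c)`, and every name here are assumptions rather than the authors' formulation.

```python
import math
from dataclasses import dataclass
from typing import Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class GateParams:
    """Hypothetical scalar parameters for one gate."""
    w_x: float  # weight on the current input x_t
    w_h: float  # weight on the previous hidden state h_{t-1}
    w_c: float  # weight on the previous cell state c_{t-1} (assumed peephole)
    b: float    # bias


def cell_step(x: float, h_prev: float, c_prev: float,
              p_in: GateParams, p_forget: GateParams,
              p_cand: GateParams) -> Tuple[float, float]:
    """One step of an LSTM-like cell with no output gate (assumed variant)."""
    # Input and forget gates both see the previous cell state.
    i = sigmoid(p_in.w_x * x + p_in.w_h * h_prev + p_in.w_c * c_prev + p_in.b)
    f = sigmoid(p_forget.w_x * x + p_forget.w_h * h_prev
                + p_forget.w_c * c_prev + p_forget.b)
    # Candidate update uses input and hidden state only; its w_c is ignored.
    g = math.tanh(p_cand.w_x * x + p_cand.w_h * h_prev + p_cand.b)
    c = f * c_prev + i * g
    # Without an output gate, the hidden state is the squashed cell state.
    h = math.tanh(c)
    return h, c


# With all-zero parameters both gates sit at 0.5 and the candidate is 0.
zero = GateParams(0.0, 0.0, 0.0, 0.0)
h_new, c_new = cell_step(x=1.0, h_prev=0.0, c_prev=2.0,
                         p_in=zero, p_forget=zero, p_cand=zero)
```

Dropping the output gate removes one set of weights per cell, which is consistent with the reduced training cost the abstract reports, though the exact savings depend on details not given here.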

RelaxVR: Cybersickness Reduction in Immersive Virtual Reality through Explainable AI and Large Language Models

IEEE Access, 2025, vol. 13, pp. 84689-84712

Authors: Kundu, Ripan Kumar; Hoque, Khaza Anuarul. University of Missouri, Department of Electrical Engineering and Computer Science, Columbia, MO, United States

Virtual reality (VR) systems are susceptible to cybersickness, significantly hindering user immersion. Very recently, researchers introduced explainable artificial intelligence (XAI) enabled methods for detecting and explaining cybersickness features. Since XAI methods can identify the dominant features causing cybersickness, we argue that this knowledge can also guide the dynamic adoption of effective cybersickness reduction strategies, unlike state-of-the-art static approaches that rely on fixed methods. This paper introduces a new dataset MazeSick and RelaxVR, an interactive XAI-guided VR cybersickness reduction framework to predict, explain, and reduce cybersickness. Specifically, we propose an innovative XAI-guided cybersickness reduction engine, which selects the most appropriate reduction technique based on XAI-provided feature importance scores from a cybersickness reduction library of different reduction techniques. We also design an interactive dialogue system powered by large language models, enabling users to engage in the VR simulation via voice commands to understand the reasoning of cybersickness and select effective reduction strategies through natural language interaction. We deployed RelaxVR on a consumer-grade VR headset (e.g., HTC VIVE Pro) and validated it through a user study on cybersickness evaluation, where participants were immersed in a custom-built VR Maze simulation. Our results demonstrate that RelaxVR effectively reduces cybersickness with minimal impact on immersion, and 94% of the users agreed that RelaxVR is highly effective for reducing cybersickness and easier to use through the dialogue engine for selecting effective reduction strategies. © 2013 IEEE.

Keywords: Virtual reality

Machine learning and data analytics driven model for income tax rebate estimation: A neoteric and benign approach

4th International Conference on Computational Methods in Science and Technology, ICCMST 2024

Authors: Sharma, Viney; Kumar, Achal; Singh, Archana; Kumar, Manish. Department of Computer Science & Engineering, Sharda University, Agra, India; Department of Computer Science & Engineering, Anand Engineering College, Agra, India; Department of Computer Science & Engineering, Chandigarh Engineering College, Landran, India

ISBN: (print) 9781032911571

The provision of rebates to needy or underprivileged sections of society has long been practiced in government organizations. The efficacy of such provisions depends on whether the rebate reaches the people for whom it is intended. The biggest problem in awarding relief is that people may not disclose their actual income. This paper proposes a machine learning and data analytics driven model that predicts the financial status of a person given certain information about that person. The striking feature of this study is that the results apply not only to the income tax rebate problem: with minor customization, the model can be extended to other problems where a person's financial status matters. © 2025 the Author(s).

Keywords: Decentralized finance

Malicious iOS Apps Detection Through Multi-Criteria Decision-Making Approach

Informatica (Slovenia), 2025, vol. 49, no. 1, pp. 207-220

Authors: Bhatt, Arpita Jadhav; Sardana, Neetu. Department of Computer Science & Engineering and Information Technology, Jaypee Institute of Information Technology, Indonesia

Smartphones are now part of daily life because they are ubiquitous and can be customized by installing third-party apps. As a result, the threats posed by these apps, which are potentially risky for users’ privacy, have increased. Information on smartphones is perhaps more personal than data stored on desktop computers, making it an easy target for intruders. After Android, the most widely used mobile operating system is Apple’s iOS. Both Android and iOS follow permission-based access control to protect users’ privacy. However, users are unaware whether an app is breaching their privacy. To combat this problem, we propose a hybrid approach to detect malicious iOS apps based on their permissions. In the first phase, weights are assigned to app permissions using a multi-criteria decision-making (MCDM) approach, namely the Analytic Hierarchy Process (AHP); in the second phase, machine learning and ensemble learning techniques are employed to train classifiers for detecting malicious apps. To test the efficacy of the proposed method, a dataset comprising 1150 apps from 12 app categories was used. The results demonstrate that the proposed approach improves the efficacy of detecting malicious iOS apps for the majority of categories. © 2025 Slovene Society Informatika. All rights reserved.

Keywords: Differential privacy
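
The first phase described in the abstract above assigns permission weights with the Analytic Hierarchy Process. As a rough illustration of how AHP turns pairwise judgments into weights, the sketch below uses the row geometric mean method, a common approximation to the principal-eigenvector weights. The permission names and the comparison values are invented for the example and are not taken from the paper.

```python
import math
from typing import Dict, List


def ahp_weights(pairwise: List[List[float]]) -> List[float]:
    """Approximate AHP priority weights via row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    # Normalize so the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical permissions and a reciprocal comparison matrix:
# entry [i][j] says how much riskier permission i is judged than permission j.
permissions = ["location", "contacts", "camera"]
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights: Dict[str, float] = dict(zip(permissions, ahp_weights(pairwise)))
```

In a pipeline like the one described, such weights could then scale the permission features passed to the classifiers in the second phase; how the paper combines them is not stated in the abstract.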

VoteDroid: a new ensemble voting classifier for malware detection based on fine-tuned deep learning models

Multimedia Tools and Applications, 2025, vol. 84, no. 12, pp. 10923-10944

Author: Bakır, Halit. Faculty of Engineering and Natural Sciences, Department of Computer Engineering, Sivas University of Science and Technology, Sivas, Turkey

In this work, VoteDroid, a novel ensemble voting classifier based on fine-tuned deep learning models, is proposed for detecting malicious behavior in Android applications. To this end, we adopt the random search optimization algorithm to decide the structure of the models used as voter classifiers in the ensemble. We specified the potential components that can be used in each model and left the random search algorithm to decide the structure of the model, including how many of each component should be used and where each is placed in the structure. This optimization method was used to build three different deep learning models, namely CNN-ANN, pure CNN, and pure ANN. After selecting the best structure for each DL model, the three selected models were trained and tested using the constructed image dataset. Afterward, we hybridized the three fine-tuned deep learning models to form one ensemble voting classifier with two different working modes, namely MMR (Malware Minority Rule) and LMR (Label Majority Rule). To our knowledge, this is the first time that an ensemble classifier has been fine-tuned and hybridized in this way for malware detection. The results showed that the proposed models were promising, with classification accuracy exceeding 97% in all experiments. © The Author(s) 2024.

Keywords: Android malware
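
The abstract above names two voting modes, MMR (Malware Minority Rule) and LMR (Label Majority Rule), without defining them. The sketch below shows one plausible interpretation for three binary voters: LMR returns the label most voters chose, while MMR flags an app as malware as soon as any single voter does. Both definitions, the label encoding and the function names are assumptions based only on the mode names.

```python
from collections import Counter
from typing import Sequence

MALWARE, BENIGN = 1, 0  # hypothetical label encoding


def label_majority_rule(votes: Sequence[int]) -> int:
    """Return the label chosen by most voters (odd voter count, so no ties)."""
    return Counter(votes).most_common(1)[0][0]


def malware_minority_rule(votes: Sequence[int]) -> int:
    """Flag malware if at least one voter predicts malware (assumed reading)."""
    return MALWARE if MALWARE in votes else BENIGN


# Illustrative votes from three models, e.g. CNN-ANN, pure CNN and pure ANN.
split_vote = [MALWARE, BENIGN, BENIGN]
lmr_result = label_majority_rule(split_vote)
mmr_result = malware_minority_rule(split_vote)
```

Under this reading, MMR trades more false alarms for fewer missed malware samples, while LMR is the more balanced mode.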

Attention enabled viewport selection with graph convolution for omnidirectional visual quality assessment

Multimedia Tools and Applications, 2025, vol. 84, no. 14, pp. 12925-12948

Authors: C, Nandhini; M, Brindha. Department of Computer Science and Engineering, National Institute of Technology, TamilNadu, Tiruchirappalli 620015, India

Omnidirectional images provide an immersive viewing experience in a Virtual Reality (VR) environment, surpassing the limitations of traditional 2D media beyond the conventional screen. This VR technology allows users to interact with visual information in an exciting and engaging manner. However, the storage and transmission requirements for 360-degree panoramic images are substantial, leading to the establishment of compression frameworks. Unfortunately, these frameworks introduce projection distortion and compression artifacts. With the rapid growth of VR applications, it becomes crucial to investigate the quality of the perceptible omnidirectional experience and evaluate the extent of visual degradation caused by compression. In this regard, the viewport plays a significant role in omnidirectional image quality assessment (OIQA), as it directly affects the user’s perceived quality and overall viewing experience. Extracting viewports compatible with users’ viewing behavior plays a crucial role in OIQA. Different users may focus on different regions, and the model’s performance may be sensitive to the chosen viewport extraction strategy. Improper selection of viewports could lead to biased quality predictions. Instead of assessing the entire image, attention can be directed to areas that are more important to the overall quality. Feature extraction is vital in OIQA as it plays a significant role in representing image content that aligns with human perception. Taking this into consideration, the proposed ATtention enabled VIewport Selection (ATVIS-OIQA) employs attention-based viewport selection with Vision Transformers (ViT) for feature extraction. Furthermore, the spatial relationship between the viewports is established using graph convolution, enabling intuitive prediction of the objective visual quality of omnidirectional images. The effectiveness of the proposed model is demonstrated by achieving state-of-the-art results on publicly available benchmark datasets, n

Keywords: Convolution

Hybrid ECD model for firewall tuning and attack detection

International Journal of Wireless and Mobile Computing, 2025, vol. 28, no. 1, pp. 86-102

Authors: Thyagarajan, C.; Vijay Bhanu S., S.V.; Suthir, S. Department of Computer Science and Engineering, Annamalai University, Tamil Nadu, Annamalai Nagar, India

Firewall tuning and attack detection demand rigorous security requirements and domain experts. Improperly configured firewalls may create a false sense of protection. One of the major configuration problems in firewalls is misconfiguration of the access control roles added to the firewall to control network traffic. Furthermore, Software-Defined Networking (SDN) has greatly improved network management. In this research, a hybrid Deep Learning (DL)-based firewall is designed. The request log is sent to the primary firewall, which tracks the network traffic and restricts vulnerabilities and undesirable traffic. The EfficientNet-B3-Attn-2 fused Cascade Neuro-Fuzzy Network (ECD) is developed for network security whenever the primary firewall fails to regulate the network traffic. Furthermore, the devised framework is evaluated in terms of accuracy, sensitivity and specificity metrics, which yield values of 0.885, 0.946 and 0.915. Copyright © 2025 Inderscience Enterprises Ltd.

Keywords: computer system firewalls
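
The abstract above reports accuracy, sensitivity and specificity. For readers less familiar with these metrics, the sketch below computes them from confusion-matrix counts using their standard definitions. The counts are made up for illustration and do not reproduce the values reported in the abstract.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics for a binary attack-versus-normal classifier."""
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Share of real attacks that were detected (true positive rate).
        "sensitivity": tp / (tp + fn),
        # Share of normal traffic that was correctly passed (true negative rate).
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 100 attack samples and 100 normal samples.
result = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

For a firewall, sensitivity measures missed attacks and specificity measures wrongly blocked legitimate traffic, so the two are usually read together rather than through accuracy alone.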

Analyzing Hierarchical Relationships and Quality of Embedding in Latent Space

IEEE Transactions on Artificial Intelligence, 2025, vol. 6, no. 4, pp. 843-858

Authors: Chatterjee, Ankita; Mukherjee, Jayanta; Das, Partha Pratim. Indian Institute of Technology Kharagpur, Department of Computer Science and Engineering, Kharagpur 721302, India

Existing learning models partition the generated representations using hyperplanes, which form well-defined groups of similar embeddings that are uniquely mapped to a particular class. However, in practical applications, the embedding space does not form distinct boundaries to segregate the class representations. There exists interaction among similar classes which cannot be visually determined in high-dimensional space. Moreover, the structure of the latent space remains obscure. As learned representations are frequently reused to reduce the inference time, it is important to analyse how semantically related classes interact among themselves in the latent space. Therefore, we propose a boundary estimation algorithm that minimises the inclusion of other classes in the embedding space to form groups of similar representations, and we compare the quality of these class embeddings for various models in an already encoded space. These groups overlap to denote ambiguous embeddings that cannot be mapped to a particular class with high confidence. The algorithm determines which representations are to be included or discarded to form well-defined regions, separating discriminating, ambiguous and rejected embeddings to depict a particular class. Later, we construct relation trees to evaluate the hierarchical relationships formed among the classes, and compare them with the WordNet ontology using phylogenetic tree comparison methods. © 2020 IEEE.

Keywords: Embeddings

RTSA: A Run-Through Sparse Attention Framework for Video Transformer

IEEE Transactions on Computers, 2025, vol. 74, no. 6, pp. 1949-1962

Authors: Wang, Xuhang; Song, Zhuoran; Qi, Chunyu; Liu, Fangxin; Jiang, Li; Liang, Xiaoyao; Naifeng, Jing. Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai 200240, China

In video understanding tasks, Video Transformer models (VidT) have recently exhibited impressive accuracy improvements on numerous edge devices. However, their deployment poses significant computational challenges for hardware. To address this, pruning has emerged as a promising approach to reduce computation and memory requirements by eliminating unimportant elements from the attention matrix. Unfortunately, existing pruning algorithms optimize only one of the two key modules on VidT's critical path: linear projection or self-attention. Because battery power in edge devices varies, the resolution of the video they generate also changes, so both the linear projection and self-attention stages can become bottlenecks and the existing approaches lack generality. Accordingly, we establish a Run-Through Sparse Attention (RTSA) framework that simultaneously sparsifies and accelerates the two stages. On the algorithm side, unlike current methodologies that conduct sparse linear projection by exploring redundancy within each frame, we extract the extra redundancy that naturally exists between frames. Moreover, for sparse self-attention, existing pruning algorithms often provide sparsity patterns that are either too coarse-grained or too fine-grained, so they struggle to achieve high sparsity, low accuracy loss, and high speedup simultaneously, resulting in either compromised accuracy or reduced efficiency. Thus, we prune the attention matrix at a medium granularity: the sub-vector. The sub-vectors are generated by isolating each column of the attention matrix. On the hardware side, we observe that the use of distinct computational units for sparse linear projection and self-attention results in pipeline imbalances because of the bottleneck transformation between the two stages. To effectively eliminate pipeline stall, we design an RTSA architecture that supports sequential execution of both sparse linear pro

Keywords: Vectors
