In this study, we propose a novel framework for detecting abnormal events in surveillance videos, a critical yet challenging task in security applications. This research introduces a robust and efficient solution for ...
Sequence-to-sequence models are fundamental building blocks for abstractive text summarization and can produce precise and coherent summaries. Various recently proposed text summarization models aimed to ...
Virtual reality (VR) systems are susceptible to cybersickness, significantly hindering user immersion. Very recently, researchers introduced explainable artificial intelligence (XAI) enabled methods for detecting and ...
The provision of rebates to needy and underprivileged sections of society has long been practiced in government organizations. The efficacy of such provisions depends on whether this rebate reaches peopl...
Today, smartphones are part of daily life because they are ubiquitous and can be customized by installing third-party apps. As a result, the threats posed by these apps, which are potentially risky for u...
In this work, VoteDroid, a novel ensemble voting classifier based on fine-tuned deep learning models, is proposed for detecting malicious behavior in Android applications. To this end, we proposed adopting the random...
Omnidirectional images provide an immersive viewing experience in a Virtual Reality (VR) environment, surpassing the limitations of traditional 2D media and the conventional screen. This VR technology allows users to interact with visual information in an exciting and engaging manner. However, the storage and transmission requirements for 360-degree panoramic images are substantial, which has led to the establishment of compression frameworks. Unfortunately, these frameworks introduce projection distortion and compression artifacts. With the rapid growth of VR applications, it becomes crucial to investigate the quality of the perceptible omnidirectional experience and evaluate the extent of visual degradation caused by compression. In this regard, the viewport plays a significant role in omnidirectional image quality assessment (OIQA), as it directly affects the user’s perceived quality and overall viewing experience. Extracting viewports compatible with users’ viewing behavior is therefore crucial in OIQA. Different users may focus on different regions, and the model’s performance may be sensitive to the chosen viewport extraction strategy. Improper selection of viewports could lead to biased quality predictions. Instead of assessing the entire image, attention can be directed to areas that are more important to the overall quality. Feature extraction is also vital in OIQA, as it represents image content in a way that aligns with human perception. Taking this into consideration, the proposed ATtention enabled VIewport Selection (ATVIS-OIQA) employs attention-based viewport selection with Vision Transformers (ViT) for feature extraction. Furthermore, the spatial relationship between the viewports is established using graph convolution, enabling intuitive prediction of the objective visual quality of omnidirectional images. The effectiveness of the proposed model is demonstrated by achieving state-of-the-art results on publicly available benchmark datasets, n
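The two ideas named in this abstract, attention-driven viewport selection and graph convolution over viewports, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the ATVIS-OIQA paper: the function names `select_top_k` and `graph_conv_mean` are hypothetical, the attention scores are assumed to be given, and the graph step uses plain mean aggregation with self-loops and no learned weights.

```python
def select_top_k(scores, k):
    """Return indices of the k candidate viewports with the highest scores.

    scores: one attention score per candidate viewport (assumed given).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def graph_conv_mean(features, adjacency):
    """One mean-aggregation graph convolution step (no learned weights).

    features: one feature vector per selected viewport.
    adjacency: adjacency[i][j] is truthy if viewports i and j are neighbors.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        # Aggregate over the node itself (self-loop) and its neighbors.
        neighbors = [j for j in range(n) if j == i or adjacency[i][j]]
        out.append([
            sum(features[j][d] for j in neighbors) / len(neighbors)
            for d in range(dim)
        ])
    return out


if __name__ == "__main__":
    # Hypothetical scores for four candidate viewports.
    print(select_top_k([0.1, 0.7, 0.4, 0.9], k=2))
    # Three viewports connected in a chain: 0 - 1 - 2.
    feats = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(graph_conv_mean(feats, adj))
```

In a real model the per-viewport features would come from a ViT backbone and the aggregation would be followed by a learned linear map; the sketch only shows how neighboring viewports exchange information.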
Rigorous security requirements and domain experts are necessary for tuning firewalls and detecting attacks. Those firewalls may create a false sense or state of protection if they are improp...
Existing learning models partition the generated representations using hyperplanes, which form well-defined groups of similar embeddings that are uniquely mapped to a particular class. However, in practical applications...
In the realm of video understanding tasks, Video Transformer models (VidT) have recently exhibited impressive accuracy improvements on numerous edge devices. However, their deployment poses significant computational challenges for hardware. To address this, pruning has emerged as a promising approach to reduce computation and memory requirements by eliminating unimportant elements from the attention matrix. Unfortunately, existing pruning algorithms optimize only one of the two key modules on VidT's critical path: linear projection or self-attention. Because battery power varies in edge devices, the video resolution they generate also changes, so both the linear projection and self-attention stages can become bottlenecks, and the existing approaches lack generality. Accordingly, we establish a Run-Through Sparse Attention (RTSA) framework that simultaneously sparsifies and accelerates both stages. On the algorithm side, unlike current methodologies that conduct sparse linear projection by exploring redundancy within each frame, we extract the extra redundancy that naturally exists between frames. Moreover, for sparse self-attention, existing pruning algorithms often provide sparsity patterns that are either too coarse-grained or too fine-grained, so they cannot simultaneously achieve high sparsity, low accuracy loss, and high speedup, resulting in either compromised accuracy or reduced efficiency. Thus, we prune the attention matrix at a medium granularity, the sub-vector. The sub-vectors are generated by isolating each column of the attention matrix. On the hardware side, we observe that the use of distinct computational units for sparse linear projection and self-attention results in pipeline imbalances because the bottleneck shifts between the two stages. To effectively eliminate pipeline stalls, we design an RTSA architecture that supports sequential execution of both sparse linear pro
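As a rough illustration of pruning an attention matrix at sub-vector granularity, the sketch below splits each column into fixed-length runs of rows, scores each run, and zeroes the lowest-scoring ones. The details are assumptions for illustration only and are not the RTSA algorithm: the function name `prune_subvectors`, the sub-vector length `sub_len`, the L1-norm score, and the global `keep_ratio` are all hypothetical choices.

```python
def prune_subvectors(attn, sub_len, keep_ratio):
    """Zero out low-scoring column sub-vectors of an attention matrix.

    attn: attention matrix as a list of rows.
    sub_len: number of consecutive rows in one sub-vector (assumed).
    keep_ratio: fraction of sub-vectors to keep (assumed).
    """
    n_rows, n_cols = len(attn), len(attn[0])
    # Score every sub-vector by the sum of absolute values (L1 norm).
    scored = []
    for col in range(n_cols):
        for start in range(0, n_rows, sub_len):
            rows = range(start, min(start + sub_len, n_rows))
            score = sum(abs(attn[r][col]) for r in rows)
            scored.append((score, col, start))
    # Keep the highest-scoring sub-vectors across the whole matrix.
    n_keep = max(1, round(keep_ratio * len(scored)))
    scored.sort(key=lambda t: t[0], reverse=True)
    kept = {(col, start) for _, col, start in scored[:n_keep]}
    # Copy kept sub-vectors into an all-zero matrix of the same shape.
    pruned = [[0.0] * n_cols for _ in range(n_rows)]
    for col, start in kept:
        for r in range(start, min(start + sub_len, n_rows)):
            pruned[r][col] = attn[r][col]
    return pruned


if __name__ == "__main__":
    attn = [
        [0.9, 0.1],
        [0.8, 0.0],
        [0.1, 0.7],
        [0.0, 0.6],
    ]
    for row in prune_subvectors(attn, sub_len=2, keep_ratio=0.5):
        print(row)
```

Pruning whole sub-vectors rather than single elements keeps the surviving entries in short contiguous runs, which is the kind of regularity hardware can exploit, while still being finer than dropping entire columns.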