ISBN: 9789819787944; 9789819787951 (print)
Early human action prediction aims to predict a complete action sequence from only the initial portion of that sequence. Because a single action usually relies on the coordinated movement of several key body parts, and because the movement amplitude of different body parts varies only minimally at the onset of an action, early human action prediction is highly sensitive to the location of action initiation and to the type of action. Current skeleton-based action prediction methods focus primarily on action classification and have limited ability to discriminate between actions on the basis of their semantic associations. For instance, actions concentrated on elbow joint movement, such as "touching the neck" and "touching the head," are hard to distinguish through classification alone but can be separated through semantic relationships. Incorporating descriptions of specific joint movements can therefore enhance the model's feature extraction ability when differentiating similar actions. This paper introduces an Action Description-Assisted Learning Graph Convolutional Network (ADAL-GCN), which uses large language models as knowledge engines to pre-generate descriptions of the key body parts involved in different actions. These descriptions are then transformed into semantically rich feature vectors through text encoding. Furthermore, the model adopts a lightweight design: it decouples features across the channel and temporal dimensions, consolidates redundant network modules, and strategically migrates computation to improve processing efficiency. Experimental results show that the proposed method significantly improves performance and substantially reduces training time without additional computational overhead.
ISBN: 9798331534691 (digital)
ISBN: 9798331534707 (print)
Individuals with disabilities often encounter barriers, such as the need for physical interaction, when accessing and using traditional computer interfaces. This limited accessibility underscores the need for innovative solutions. At present, Human-Computer Interaction technology, which enables any individual to interact with a system or computer, is widely known in the tech industry, and natural language processing plays a major role in its advancement. This paper surveys recent methods, techniques such as embeddings and Word2Vec, and technologies that help improve Human-Computer Interaction. It highlights the need to integrate advanced natural language processing with Human-Computer Interaction for people with disabilities. The paper also proposes work on a hands-free device navigation system or application for differently abled people.
There are hundreds of methods for analyzing data obtained in mRNA sequencing. Most of them focus on a small number of genes. In this study, we propose an approach that reduces the analysis of several thousan...
With major changes in productivity and the continuous upgrading of consumer demand, it is no longer sustainable for businesses to compete on the basis of low-cost, low-value-added products or services. As a result, it...
ISBN: 9798350349115 (digital)
ISBN: 9798350349122 (print)
Because of their security and convenience, crack detection methods based on image processing techniques (IPT) have gradually become the mainstream in this field. However, the performance of existing methods cannot satisfy practical applications because of severe category imbalance and uncontrolled noise interference. This paper proposes a multiple-feature fusion framework based on hierarchical constraints for crack detection. The framework consists mainly of a weight-shared encoder and three independent decoders, whose outputs are fused to produce the final detection. Specifically, each decoder is constrained by a pseudo label generated from the true crack map with a specific expansion stride, which alleviates the severe category imbalance problem and avoids the risk of model collapse. Furthermore, the fused feature represents the road image well because it ensures that crack samples are detected while averting false alarms. In addition, the backbone of the framework is a lightweight UNet equipped with enhancement modules, which increases structural representation capability, decreases the computational load, and speeds up inference at the same time. Finally, qualitative and quantitative experimental results on three challenging datasets verify the superiority of the proposed framework.
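The pseudo-label step in the abstract above can be illustrated with a minimal sketch. It assumes that "expansion stride" means dilating the binary crack map by a given number of pixels, which the abstract does not confirm; the function names `dilate` and `hierarchical_pseudo_labels` and the stride values `(0, 1, 2)` are hypothetical and are not taken from the paper.

```python
def dilate(mask, stride):
    """Mark every pixel within `stride` pixels (Chebyshev distance) of a crack pixel.

    `mask` is a list of rows holding 0 (background) or 1 (crack).
    Assumption: this square dilation stands in for the paper's "expansion".
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Paint the (2 * stride + 1)-wide square around the crack pixel,
            # clipped to the image borders.
            for yy in range(max(0, y - stride), min(height, y + stride + 1)):
                for xx in range(max(0, x - stride), min(width, x + stride + 1)):
                    out[yy][xx] = 1
    return out


def hierarchical_pseudo_labels(mask, strides=(0, 1, 2)):
    """Return one pseudo label per decoder, each with a different expansion stride."""
    return [dilate(mask, s) for s in strides]


def positive_ratio(mask):
    """Fraction of pixels labelled as crack; a rough view of class balance."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total
```

Under this assumption, a larger stride turns more background pixels into positives, so the coarser pseudo labels are less imbalanced than the true crack map, while the stride-0 label keeps the exact crack location.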
Pre-trained language models (LMs) have become ubiquitous in solving various natural language processing (NLP) tasks. There has been increasing interest in what knowledge these LMs contain and how we can extract that k...
Membership Inference Attacks (MIA) aim to infer whether a target data record has been utilized for model training or not. Existing MIAs designed for large language models (LLMs) can be bifurcated into two types: refer...
Learning binary classifiers from positive and unlabeled data (PUL) is vital in many real-world applications, especially when verifying negative examples is difficult. Despite the impressive empirical performance of re...
ISBN: 9798350354348 (digital)
ISBN: 9798350354355 (print)
Platforms like YouTube have revolutionized the way people distribute and consume information in the video-dominated digital age. However, the sheer volume of videos can make it difficult for viewers to find pertinent information quickly. This research describes a system for summarizing YouTube videos that extracts information using Automatic Speech Recognition (ASR) and Term Frequency-Inverse Document Frequency (TF-IDF) analysis. ASR converts audio to text, and TF-IDF identifies the most important information for brief summaries. Empirical validation supports the optimization of usability through automation and personalization. The technology facilitates navigation through large amounts of video footage by supporting natural language processing (NLP) and information retrieval.
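The TF-IDF step described above can be sketched as a small extractive summarizer. This is an illustration under stated assumptions, not the system from the paper: it assumes the ASR transcript is already available as plain text, treats each sentence as a "document" when computing inverse document frequency, and uses hypothetical names (`split_sentences`, `tokenize`, `tfidf_summary`).

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split a transcript into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase a sentence and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def tfidf_summary(transcript, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(transcript)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: the number of sentences in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sum of TF-IDF weights, with term frequency normalised by sentence length.
        scores.append(
            sum((count / len(tokens)) * math.log(n / df[term])
                for term, count in tf.items())
        )
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `tfidf_summary(transcript, num_sentences=3)` on a transcript string returns up to three sentences. Sentences made of terms that recur throughout the transcript score low, so the summary favours sentences that carry distinctive vocabulary.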