State-of-the-art recommender systems are increasingly focused on optimizing implementation efficiency, such as enabling on-device recommendations under memory constraints. Current methods commonly use lightweight embe...
详细信息
State-of-the-art recommender systems are increasingly focused on optimizing implementation efficiency, such as enabling on-device recommendations under memory constraints. Current methods commonly use lightweight embeddings for users and items or employ compact embeddings to enhance reusability and reduce memory usage. However, these approaches consider only the coarse-grained aspects of embeddings, overlooking subtle semantic nuances. This limitation results in an adversarial degradation of meta-embedding performance, impeding the system's ability to capture intricate relationships between users and items, leading to suboptimal recommendations. To address this, we propose a novel approach to efficiently learn meta-embeddings with varying grained and apply fine-grained meta-embeddings to strengthen the representation of their coarse-grained counterparts. Specifically, we introduce a recommender system based on a graph neural network, where each user and item is represented as a node. These nodes are directly connected to coarse-grained virtual nodes and indirectly linked to fine-grained virtual nodes, facilitating learning of multi-grained semantics. Fine-grained semantics are captured through sparse meta-embeddings, which dynamically balance embedding uniqueness and memory constraints. To ensure their sparseness, we rely on initialization methods such as sparse principal component analysis combined with a soft thresholding activation function. Moreover, we propose a weight-bridging update strategy that aligns coarse-grained meta-embedding with several fine-grained meta-embeddings based on the underlying semantic properties of users and items. Comprehensive experiments demonstrate that our method outperforms existing baselines. The code of our proposal is available at https://***/htyjers/C2F-MetaEmbed.
The Internet of Everything(IoE)based cloud computing is one of the most prominent areas in the digital big data *** approach allows efficient infrastructure to store and access big real-time data and smart IoE service...
详细信息
The Internet of Everything(IoE)based cloud computing is one of the most prominent areas in the digital big data *** approach allows efficient infrastructure to store and access big real-time data and smart IoE services from the *** IoE-based cloud computing services are located at remote locations without the control of the data *** data owners mostly depend on the untrusted Cloud Service Provider(CSP)and do not know the implemented security *** lack of knowledge about security capabilities and control over data raises several security *** Acid(DNA)computing is a biological concept that can improve the security of IoE big *** IoE big data security scheme consists of the Station-to-Station Key Agreement Protocol(StS KAP)and Feistel cipher *** paper proposed a DNA-based cryptographic scheme and access control model(DNACDS)to solve IoE big data security and access *** experimental results illustrated that DNACDS performs better than other DNA-based security *** theoretical security analysis of the DNACDS shows better resistance capabilities.
Most of the search-based software remodularization(SBSR)approaches designed to address the software remodularization problem(SRP)areutilizing only structural information-based coupling and cohesion quality ***,in prac...
详细信息
Most of the search-based software remodularization(SBSR)approaches designed to address the software remodularization problem(SRP)areutilizing only structural information-based coupling and cohesion quality ***,in practice apart from these quality criteria,there require other aspects of coupling and cohesion quality criteria such as lexical and changed-history in designing the modules of the software ***,consideration of limited aspects of software information in the SBSR may generate a sub-optimal modularization ***,such modularization can be good from the quality metrics perspective but may not be acceptable to the *** produce a remodularization solution acceptable from both quality metrics and developers’perspectives,this paper exploited more dimensions of software information to define the quality criteria as modularization ***,these objectives are simultaneously optimized using a tailored manyobjective artificial bee colony(MaABC)to produce a remodularization *** assess the effectiveness of the proposed approach,we applied it over five software *** obtained remodularization solutions are evaluated with the software quality metrics and developers view of *** demonstrate that the proposed software remodularization is an effective approach for generating good quality modularization solutions.
Improving website security to prevent malicious online activities is crucial,and CAPTCHA(Completely Automated Public Turing test to tell computers and Humans Apart)has emerged as a key strategy for distinguishing huma...
详细信息
Improving website security to prevent malicious online activities is crucial,and CAPTCHA(Completely Automated Public Turing test to tell computers and Humans Apart)has emerged as a key strategy for distinguishing human users from automated ***-based CAPTCHAs,designed to be easily decipherable by humans yet challenging for machines,are a common form of this ***,advancements in deep learning have facilitated the creation of models adept at recognizing these text-based CAPTCHAs with surprising *** our comprehensive investigation into CAPTCHA recognition,we have tailored the renowned UpDown image captioning model specifically for this *** approach innovatively combines an encoder to extract both global and local features,significantly boosting the model’s capability to identify complex details within CAPTCHA *** the decoding phase,we have adopted a refined attention mechanism,integrating enhanced visual attention with dual layers of Long Short-Term Memory(LSTM)networks to elevate CAPTCHA recognition *** rigorous testing across four varied datasets,including those from Weibo,BoC,Gregwar,and Captcha 0.3,demonstrates the versatility and effectiveness of our *** results not only highlight the efficiency of our approach but also offer profound insights into its applicability across different CAPTCHA types,contributing to a deeper understanding of CAPTCHA recognition technology.
Voice, motion, and mimicry are naturalistic control modalities that have replaced text or display-driven control in human-computer communication (HCC). Specifically, the vocals contain a lot of knowledge, revealing de...
详细信息
Voice, motion, and mimicry are naturalistic control modalities that have replaced text or display-driven control in human-computer communication (HCC). Specifically, the vocals contain a lot of knowledge, revealing details about the speaker’s goals and desires, as well as their internal condition. Certain vocal characteristics reveal the speaker’s mood, intention, and motivation, while word study assists the speaker’s demand to be understood. Voice emotion recognition has become an essential component of modern HCC networks. Integrating findings from the various disciplines involved in identifying vocal emotions is also challenging. Many sound analysis techniques were developed in the past. Learning about the development of artificial intelligence (AI), and especially Deep Learning (DL) technology, research incorporating real data is becoming increasingly common these days. Thus, this research presents a novel selfish herd optimization-tuned long/short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The RAVDESS public dataset is used to train the suggested SHO-LSTM technique. Mel-frequency cepstral coefficient (MFCC) and wiener filter (WF) techniques are used, respectively, to remove noise and extract features from the data. LSTM and SHO are applied to the extracted data to optimize the LSTM network’s parameters for effective emotion recognition. Python Software was used to execute our proposed framework. In the finding assessment phase, Numerous metrics are used to evaluate the proposed model’s detection capability, Such as F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The suggested approach is tested on a Python platform, and the SHO-LSTM’s outcomes are contrasted with those of other previously conducted research. Based on comparative assessments, our suggested approach outperforms the current approaches in vocal emotion recognition.
Deep learning technology has extensive application in the classification and recognition of medical images. However, several challenges persist in such application, such as the need for acquiring large-scale labeled d...
详细信息
Interpretable visual recognition is essential for decision-making in high-stakes situations. Recent advancements have automated the construction of interpretable models by leveraging Visual Language Models (VLMs) and ...
详细信息
Automated machine learning(AutoML) has achieved remarkable success in automating the non-trivial process of designing machine learning *** the focal areas of AutoML,neural architecture search(NAS) stands out,aiming to...
详细信息
Automated machine learning(AutoML) has achieved remarkable success in automating the non-trivial process of designing machine learning *** the focal areas of AutoML,neural architecture search(NAS) stands out,aiming to systematically explore the complex architecture space to discover the optimal neural architecture configurations without intensive manual *** has demonstrated its capability of dramatic performance improvement across a large number of real-world *** core components in NAS methodologies normally include(ⅰ) defining the appropriate search space,(ⅱ)designing the right search strategy and(ⅲ) developing the effective evaluation *** early NAS endeavors are characterized via groundbreaking architecture designs,the imposed exorbitant computational demands prompt a shift towards more efficient paradigms such as weight sharing and evaluation estimation,***,the introduction of specialized benchmarks has paved the way for standardized comparisons of NAS ***,the adaptability of NAS is evidenced by its capability of extending to diverse datasets,including graphs,tabular data and videos,etc.,each of which requires a tailored *** paper delves into the multifaceted aspects of NAS,elaborating on its recent advances,applications,tools,benchmarks and prospective research directions.
Fog computing is an emerging paradigm that provides services near the end-user. The tremendous increase in IoT devices and big data leads to complexity in fog resource allocation. Inefficient resource allocation can l...
详细信息
Video question answering(VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts...
详细信息
Video question answering(VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models mostly tend to over-rely on the superficial correlations rooted in the dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill the research gap, we propose a robust VideoQA framework that can effectively model the cross-modality fusion and enforce the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting the shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is enforced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than some static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and can be easily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.
暂无评论