Integrated sensing and communication (ISAC) is a promising technique for increasing spectral efficiency and supporting emerging applications by sharing spectrum and hardware between the two functionalities. However, traditional ISAC schemes depend heavily on accurate mathematical models and suffer from high complexity and poor performance in practical scenarios. Recently, artificial intelligence (AI) has emerged as a viable technique to address these issues owing to its powerful learning capabilities, satisfactory generalization, fast inference, and high adaptability to dynamic environments, facilitating a shift in system design from model-driven to data-driven. Intelligent ISAC, which integrates AI into ISAC, has become an active research topic. In this paper, we provide a comprehensive overview of intelligent ISAC, including its motivation, typical applications, recent trends, and challenges. In particular, we first introduce the basic principle of ISAC, followed by its key techniques. Then, an overview of AI and a comparison between model-based and AI-based methods for ISAC are provided. Furthermore, the typical applications of AI in ISAC and the recent trends for AI-enabled ISAC are reviewed. Finally, the future research issues and challenges of intelligent ISAC are discussed.
People-centric activity recognition is one of the most critical technologies in a wide range of real-world applications, including intelligent transportation systems, healthcare services, and brain-computer interfaces. Large-scale data collection and annotation make the application of machine learning algorithms prohibitively expensive when adapting to new tasks. One way of circumventing this limitation is to train the model in a semi-supervised manner that uses a percentage of unlabeled data to reduce the labeling burden in prediction tasks. Despite their appeal, these models often assume that labeled and unlabeled data come from similar distributions, which leads to the domain shift problem when distribution gaps are present. To address these limitations, we propose a novel method for people-centric activity recognition, called domain generalization with semi-supervised learning (DGSSL), that effectively enhances the representation learning and domain alignment capabilities of a model. We first design a new autoregressive discriminator for adversarial training between unlabeled and labeled source domains, extracting domain-specific features to reduce the distribution gaps. Second, we introduce two reconstruction tasks to capture the task-specific features to avoid losing information related to representation learning while maintaining task-specific consistency. Finally, benefiting from the collaborative optimization of these two tasks, the model can accurately predict both the domain and category labels of the source domains for the classification task. We conduct extensive experiments on three real-world sensing datasets. The experimental results show that DGSSL surpasses three state-of-the-art methods in both performance and generalization.
Pedestrian wind flow is a critical factor in designing livable residential environments under growing complex urban *** pedestrian wind flow during the early design stages is essential but currently suffers from inefficiencies in numerical *** learning, particularly generative adversarial networks (GAN), has been increasingly adopted as an alternative method to provide efficient prediction of pedestrian wind ***, existing GAN-based wind flow prediction schemes have limitations due to the lack of considering the spatial and frequency characteristics of wind flow *** study proposes a novel approach termed SFGAN, which embeds spatial and frequency characteristics to enhance pedestrian wind flow *** the spatial domain, Gaussian blur is employed to decompose wind flow into components containing wind speed and distinguished flow edges, which are used as the embedded spatial *** information of wind flow is obtained through discrete wavelet transformation and used as the embedded frequency *** spatial and frequency characteristics of wind flow are jointly utilized to enforce consistency between the predicted wind flow and ground truth during the training phase, thereby leading to enhanced *** results demonstrate that SFGAN clearly improves wind flow prediction, reducing Wind_MAE, Wind_RMSE and the Fréchet Inception Distance (FID) score by 5.35%, 6.52% and 12.30%, compared to the previous best method, *** also analyze the effectiveness of incorporating the spatial and frequency characteristics of wind flow in predicting pedestrian wind *** reduces errors in predicting wind flow at large error intervals and performs well in wake regions and regions surrounding *** enhanced predictions provide a better understanding of performance variability, bringing insights at the early design stage to improve pedestrian wind *** proposed spatial-frequen
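The abstract above mentions obtaining frequency information through a discrete wavelet transform. As a rough illustration only, the sketch below computes a single-level 1D Haar transform in plain Python. The choice of the Haar wavelet, the 1D setting, and the function name `haar_dwt_1d` are assumptions made for this example; the abstract does not state which wavelet or dimensionality SFGAN uses.

```python
import math


def haar_dwt_1d(signal):
    """Single-level orthonormal Haar DWT of an even-length sequence.

    Returns (approx, detail): pairwise scaled sums (low-frequency part)
    and pairwise scaled differences (high-frequency part).
    Hypothetical helper for illustration, not the SFGAN implementation.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


# Example: a flat pair yields zero detail; a jump yields nonzero detail.
approx, detail = haar_dwt_1d([1.0, 1.0, 2.0, 4.0])
```

Because the transform is orthonormal, the summed squares of the two outputs equal the summed squares of the input, which makes the detail coefficients a convenient measure of how much sharp local variation a signal contains.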
Matrix minimization techniques that employ the nuclear norm have gained recognition for their applicability in tasks like image inpainting, clustering, classification, and reconstruction. However, they come with inherent biases and computational burdens, especially when used to relax the rank function, making them less effective and efficient in real-world scenarios. To address these challenges, our research focuses on generalized nonconvex rank regularization problems in robust matrix completion, low-rank representation, and robust matrix regression. We introduce innovative approaches for effective and efficient low-rank matrix learning, grounded in generalized nonconvex rank relaxations inspired by various substitutes for the ℓ0-norm relaxed functions. These relaxations allow us to more accurately capture low-rank structures. Our optimization strategy employs a nonconvex and multi-variable alternating direction method of multipliers, backed by rigorous theoretical analysis for complexity and *** algorithm iteratively updates blocks of variables, ensuring efficient convergence. Additionally, we incorporate the randomized singular value decomposition technique and/or other acceleration strategies to enhance the computational efficiency of our approach, particularly for large-scale constrained minimization problems. In conclusion, our experimental results across a variety of image vision-related application tasks unequivocally demonstrate the superiority of our proposed methodologies in terms of both efficacy and efficiency when compared to most other related learning methods.
Predicting the metastatic direction of primary breast cancer (BC), thus assisting physicians in precise treatment, strict follow-up, and effectively improving the prognosis. The clinical data of 293,946 patients with ...
Language-guided fashion image editing is challenging, as fashion image editing is local and requires high precision, while natural language cannot provide precise visual information for *** this paper, we propose LucIE, a novel unsupervised language-guided local image editing method for fashion *** adopts and modifies a recent text-to-image synthesis network, DF-GAN, as its ***, the synthesis backbone often changes the global structure of the input image, making local image editing *** increase structural consistency between input and edited images, we propose Content-Preserving Fusion Module (CPFM). Different from existing fusion modules, CPFM prevents iterative refinement on visual feature maps and accumulates additive modifications on RGB *** achieves local image editing explicitly with language-guided image segmentation and mask-guided image blending while only using image and text *** on the DeepFashion dataset shows that LucIE achieves state-of-the-art *** with previous methods, images generated by LucIE also exhibit fewer *** provide visualizations and perform ablation studies to validate LucIE and the *** also demonstrate and analyze limitations of LucIE, to provide a better understanding of LucIE.
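The abstract above refers to mask-guided image blending for keeping edits local. The following is a minimal sketch of the general idea only, assuming a single-channel image stored as nested lists and a soft mask with values in [0, 1]. The function name `blend_with_mask` and the data layout are hypothetical and are not taken from LucIE.

```python
def blend_with_mask(original, edited, mask):
    """Per-pixel convex blend of two same-sized single-channel images.

    Where mask is 1 the edited pixel is used; where it is 0 the original
    pixel is kept; values in between mix the two. Illustrative sketch
    only, not the LucIE implementation.
    """
    blended = []
    for row_o, row_e, row_m in zip(original, edited, mask):
        blended.append(
            [m * e + (1.0 - m) * o for o, e, m in zip(row_o, row_e, row_m)]
        )
    return blended


# Example: only the masked pixels change.
original = [[0.0, 0.0], [0.0, 0.0]]
edited = [[1.0, 1.0], [1.0, 1.0]]
mask = [[1.0, 0.0], [0.5, 0.0]]
result = blend_with_mask(original, edited, mask)
```

Pixels outside the mask are copied from the input unchanged, which is what lets a blend of this kind preserve the global structure of the original image.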
Mobile edge computing (MEC) provides edge services to users in a distributed and on-demand *** to the heterogeneity of edge applications, deploying latency- and resource-intensive applications on resource-constrained devices is a key challenge for service providers. This is especially true when underlying edge infrastructures are fault- and error-prone. In this paper, we propose a fault tolerance approach named DFGP for enforcing mobile service fault tolerance in MEC. It synthesizes a generative optimization network (GON) model for predicting resource failure and a deep deterministic policy gradient (DDPG) model for yielding preemptive migration *** show through extensive simulation experiments that DFGP is more effective in fault detection and guaranteeing quality of service, in terms of fault detection accuracy, migration efficiency, task migration time, task scheduling time, and energy consumption than other existing methods.
Behavior-Driven Development (BDD) user stories are widely used in agile methods for capturing user requirements and acceptance criteria due to their simplicity and clarity. However, the concise structure of BDD-based ...
As the adoption of explainable AI (XAI) continues to expand, the urgency to address its privacy implications intensifies. Despite a growing corpus of research in AI privacy and explainability, little attention has been paid to privacy-preserving model explanations. This article presents the first thorough survey of privacy attacks on model explanations and their countermeasures. Our contribution to this field comprises a thorough analysis of research papers with a connected taxonomy that facilitates the categorization of privacy attacks and countermeasures based on the targeted explanations. This work also includes an initial investigation into the causes of privacy leaks. Finally, we discuss unresolved issues and prospective research directions uncovered in our analysis. This survey aims to be a valuable resource for the research community and offers clear insights for those new to this domain. To support ongoing research, we have established an online resource repository, which will be continuously updated with new and relevant findings.
Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that can effectively model the cross-modality fusion and force the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting the shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is enforced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than some static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and readily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.
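The Kullback-Leibler divergence-based perturbation invariance regularization mentioned above can be pictured as a penalty on how far the predicted answer distribution moves when the video is perturbed. The sketch below is one plausible reading, not the authors' formulation: the direction of the divergence (original relative to perturbed), the use of softmax over raw answer logits, and the names `softmax`, `kl_divergence`, and `perturbation_invariance_penalty` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); it is an illustrative choice.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def perturbation_invariance_penalty(logits_original, logits_perturbed):
    """Penalty that grows as the answer distribution for the perturbed
    video drifts from the one for the original video (assumed direction)."""
    p = softmax(logits_original)
    q = softmax(logits_perturbed)
    return kl_divergence(p, q)


# Example: identical predictions incur no penalty; reversed ones do.
same = perturbation_invariance_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = perturbation_invariance_penalty([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In training, a term of this kind would typically be added to the main answer-classification loss with a weighting coefficient; the abstract does not specify that weighting.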