Search results - Inner Mongolia University Library

GENERATIVE ADVERSARIAL NETWORKS FOR SPATIO-SPECTRAL COMPRESSION OF HYPERSPECTRAL IMAGES

arXiv, 2023

Authors: Fuchs, Martin Hermann Paul; Byju, Akshara Preethy; Walda, Alisa; Rasti, Behnood; Demir, Begüm
Affiliations: Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin, Germany; Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri, India; BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany

Deep learning-based hyperspectral image (HSI) compression has recently attracted great attention in remote sensing due to the growth of hyperspectral data archives. Most of the existing models achieve either spectral or spatial compression and do not jointly consider the spatio-spectral redundancies present in HSIs. To address this problem, in this paper, we propose High Fidelity Compression (HiFiC)-based models for spatio-spectral compression of HSIs. In detail, we introduce two new models: i) HiFiC using Squeeze and Excitation (SE) blocks (denoted as HiFiCSE); and ii) HiFiC with 3D convolutions (denoted as HiFiC3D) in the framework of compression of HSIs. We analyze the effectiveness of HiFiCSE and HiFiC3D in compressing the spatio-spectral redundancies with channel attention and inter-dependency analysis. Experimental results show the efficacy of the proposed models in performing spatio-spectral compression, while reconstructing images at reduced bitrates with higher reconstruction quality. The code of the proposed models is publicly available at https://***/rsim/HSI-SSC. © 2023, CC BY.

Keywords: Generative adversarial networks

Optimizing Human Pose Estimation Through Focused Human and Joint Regions

39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025

Authors: Jiao, Yingying; Wang, Zhigang; Liu, Zhenguang; Fan, Shaojing; Wu, Sifan; Wu, Zheqi; Xu, Zhuoyue
Affiliations: College of Computer Science and Technology, Jilin University, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China; College of Computer Science and Technology, Zhejiang Gongshang University, China; The State Key Laboratory of Blockchain and Data Security, Zhejiang University, China; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security, China; School of Computing, National University of Singapore, Singapore

ISBN: (print) 157735897X

Human pose estimation has given rise to a broad spectrum of novel and compelling applications, including action recognition, sports analysis, and surveillance. However, accurate video pose estimation remains an open challenge. One aspect that has been overlooked so far is that existing methods learn motion clues from all pixels rather than focusing on the target human body, making them easily misled and disrupted by unimportant information such as background changes or movements of other people. Additionally, while current Transformer-based pose estimation methods have demonstrated impressive performance with global modeling, they struggle with local context perception and precise positional identification. In this paper, we try to tackle these challenges from three aspects: (1) We propose a bilayer Human-Keypoint Mask module that performs coarse-to-fine visual token refinement, which gradually zooms in on the target human body and keypoints while masking out unimportant figure regions. (2) We further introduce a novel deformable cross attention mechanism and a bidirectional separation strategy to adaptively aggregate spatial and temporal motion clues from constrained surrounding contexts. (3) We mathematically formulate the deformable cross attention, constraining the model to focus solely on the regions centered on the target person's body. Empirically, our method achieves state-of-the-art performance on three large-scale benchmark datasets. A remarkable highlight is that our method achieves an 84.8 mean Average Precision (mAP) on the challenging wrist joint, which significantly outperforms the 81.5 mAP achieved by the current state-of-the-art method on the PoseTrack2017 dataset. Copyright © 2025, Association for the Advancement of Artificial Intelligence (***). All rights reserved.

MA-Unet3+ Segmentation Network of Remote Sensing Image Based on ECA Block

International Conference on Computer Network, Electronic and Automation (ICCNEA)

Authors: Lang Gao; Jianguo Wang; Xin Ye
Affiliations: School of Weapon Science and Technology, Xi'an Technological University, Xi'an, China; Research Institute of Artificial Intelligence and Data Science, Xi'an Technological University, Xi'an, China

Semantic segmentation is a fundamental task and a typical computer vision problem. Although semantic segmentation is developing rapidly, the speed and accuracy of segmentation models still need further improvement. To address the scale differences between target objects and the loss of spatial information in remote sensing image segmentation, a new network, MA-Unet3+, is constructed by improving the original U-Net3+ network and introducing an attention mechanism. In the encoding phase, images of different scales are fused, the full-scale connections are pruned, some skip connections are removed, and attention mechanisms are introduced between each layer. The improved model is compared with some common network models. On the Vaihingen dataset it achieves a mean intersection over union (mIoU) of 78.7%, which is 0.8% better than the U-Net3+ network it improves on; a mean per-class pixel accuracy (MPA) of 92.4%, which is 1.2% better; and a Dice similarity coefficient of 87.3%, which is 0.8% better. These results indicate that MA-Unet3+ outperforms the other algorithms.
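
The three scores reported in this abstract (mIoU, MPA, Dice) can all be derived from a per-class confusion matrix. The sketch below uses the standard textbook definitions; the abstract does not state the paper's exact averaging scheme, so the function name `segmentation_metrics` and the toy 2-class matrix are illustrative assumptions, not the authors' evaluation code.

```python
def segmentation_metrics(confusion):
    """Return (mIoU, MPA, mean Dice) from a square confusion matrix.

    confusion[t][p] counts pixels whose true class is t and predicted class is p.
    Assumes every class occurs at least once, so no denominator is zero.
    """
    n = len(confusion)
    ious, accuracies, dices = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        ious.append(tp / (tp + fp + fn))
        accuracies.append(tp / (tp + fn))
        dices.append(2 * tp / (2 * tp + fp + fn))
    return sum(ious) / n, sum(accuracies) / n, sum(dices) / n


# Hypothetical 2-class example (values are made up for illustration).
toy_confusion = [[8, 2],
                 [1, 9]]
miou, mpa, dice = segmentation_metrics(toy_confusion)
print(f"mIoU={miou:.4f}  MPA={mpa:.4f}  Dice={dice:.4f}")
```

For a single class, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU), which is why the two scores tend to move together in comparisons like the one reported here.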

USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions

arXiv, 2024

Authors: Xu, Jingzehua; Xie, Guanwen; Wang, Xinqi; Ding, Yimian; Zhang, Shuai
Affiliations: Tsinghua Shenzhen International Graduate School, Tsinghua University, China; College of Information Science & Electronic Engineering, Zhejiang University, China; Department of Data Science, New Jersey Institute of Technology, United States

Autonomous underwater vehicles (AUVs) are valuable for ocean exploration due to their flexibility and ability to carry communication and detection units. Nevertheless, AUVs alone often face challenges in harsh and extreme sea conditions. This study introduces an unmanned surface vehicle (USV)–AUV collaboration framework, which includes high-precision multi-AUV positioning using USV path planning via Fisher information matrix optimization and reinforcement learning for multi-AUV cooperative tasks. Applied to a multi-AUV underwater data collection task scenario, extensive simulations validate the framework's feasibility and superior performance, highlighting exceptional coordination and robustness under extreme sea conditions. To accelerate relevant research in this field, we have made simulation code (demo version) available as open source. © 2024, CC BY.

Keywords: Unmanned surface vehicles

ToonTalker: Cross-Domain Face Reenactment

International Conference on Computer Vision (ICCV)

Authors: Yuan Gong; Yong Zhang; Xiaodong Cun; Fei Yin; Yanbo Fan; Xuan Wang; Baoyuan Wu; Yujiu Yang
Affiliations: Shenzhen International Graduate School, Tsinghua University; Tencent AI Lab; Ant Group; The School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen)

We target cross-domain face reenactment in this paper, i.e., driving a cartoon image with the video of a real person and vice versa. Recently, many works have focused on one-shot talking face generation to drive a portrait with a real video, i.e., within-domain reenactment. Straightforwardly applying those methods to cross-domain animation will cause inaccurate expression transfer, blur effects, and even apparent artifacts due to the domain shift between cartoon and real faces. Only a few works attempt to settle cross-domain face reenactment. The most related work, AnimeCeleb [13], requires constructing a dataset with pose vector and cartoon image pairs by animating 3D characters, which makes it inapplicable when no paired data is available. In this paper, we propose a novel method for cross-domain reenactment without paired data. Specifically, we propose a transformer-based framework to align the motions from different domains into a common latent space where motion transfer is conducted via latent code addition. Two domain-specific motion encoders and two learnable motion base memories are used to capture domain properties. A source query transformer and a driving one are exploited to project domain-specific motion to the canonical space. The edited motion is projected back to the domain of the source with a transformer. Moreover, since no paired data is provided, we propose a novel cross-domain training scheme using data from two domains with the designed analogy constraint. Besides, we contribute a cartoon dataset in Disney style. Extensive evaluations demonstrate the superiority of our method over competing methods.

FairTP: A Prolonged Fairness Framework for Traffic Prediction

arXiv, 2024

Authors: Xia, Jiangnan; Yang, Yu; Shen, Jiaxing; Wang, Senzhang; Cao, Jiannong
Affiliations: School of Computer Science and Engineering, Central South University, China; Centre for Learning Teaching and Technology, The Education University of Hong Kong, Hong Kong; School of Data Science, Lingnan University, Hong Kong; The Department of Computing, The Hong Kong Polytechnic University, Hong Kong

Traffic prediction is pivotal in intelligent transportation systems. Existing works mainly focus on improving the overall accuracy, overlooking a crucial problem of whether prediction results will lead to biased decisions by transportation authorities. In practice, the uneven deployment of traffic sensors in different urban areas produces imbalanced data, making the traffic prediction model fail in some areas and leading to unfair regional decision-making that eventually severely affects equity and residents' quality of life. Additionally, existing fairness machine learning models fail to preserve fair traffic prediction for a prolonged time. Although they can achieve fairness at certain time points, such static fairness will be broken as the traffic conditions change. To fill this research gap, we investigate prolonged fair traffic prediction, introduce two novel fairness definitions tailored to dynamic traffic scenarios, and propose a prolonged fairness traffic prediction framework, namely FairTP. We argue that fairness in traffic scenarios changes dynamically over time and across areas. Each traffic sensor or city area has a state that alternates between "sacrifice" and "benefit" based on its prediction accuracy (high accuracy indicates the "benefit" state). Prolonged fairness is achieved when the overall states of sensors similar within a given ***, we first define region-based static fairness and sensor-based dynamic fairness. Next, we design a state identification module in FairTP to discriminate between states of "sacrifice" or "benefit" to enable prolonged fairness-aware traffic predictions. Lastly, a state-guided balanced sampling strategy is designed to select training examples to promote prediction fairness further, mitigating the performance disparities among regions with imbalanced traffic sensors. Extensive experiments in two real-world datasets show that FairTP significantly improves prediction fairness without causing much accuracy degrada

Keywords: Intelligent systems

Algebraic Geometrical Analysis of Metropolis Algorithm When Parameters Are Non-identifiable

arXiv, 2024

Authors: Nagata, Kenji; Mototake, Yoh-Ichi
Affiliations: Center for Basic Research on Materials, National Institute for Materials Science, Tsukuba, Ibaraki 305-0044, Japan; Graduate School of Social Data Science, Hitotsubashi University, Kunitachi, Tokyo 186-8601, Japan

The Metropolis algorithm is one of the Markov chain Monte Carlo (MCMC) methods that realize sampling from the target probability distribution. In this paper, we are concerned with the sampling from the distribution in non-identifiable cases that involve models with Fisher information matrices that may fail to be invertible. The theoretical adjustment of the step size, which is the variance of the candidate distribution, is difficult for non-identifiable cases. In this study, to establish such a principle, the average acceptance rate, which is used as a guideline to optimize the step size in the MCMC method, was analytically derived in non-identifiable cases. The optimization principle for the step size was developed from the viewpoint of the average acceptance rate. In addition, we performed numerical experiments on some specific target distributions to verify the effectiveness of our theoretical results. Copyright © 2024, The Authors. All rights reserved.

Keywords: Fisher information matrix
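
The quantity this abstract centers on, the average acceptance rate as a function of the step size (the proposal variance), can be estimated empirically. The sketch below is a plain random-walk Metropolis sampler on a toy target chosen for illustration only: the density depends on the parameters `a` and `b` solely through their product, so the parameters are non-identifiable. The target, the box constraint, the constant `n=50`, and all function names are assumptions; none of it is taken from the paper, and it does not reproduce the paper's analytical result.

```python
import math
import random


def log_target(a, b, n=50):
    """Unnormalized log-density of a toy non-identifiable target (an assumption).

    Only the product a*b is constrained, so the set a*b = 0 is a whole
    cross-shaped ridge rather than a single point.
    """
    if abs(a) > 1.0 or abs(b) > 1.0:
        return -math.inf          # flat prior restricted to the box [-1, 1]^2
    return -0.5 * n * (a * b) ** 2


def average_acceptance_rate(step_variance, n_steps=20000, seed=0):
    """Run random-walk Metropolis and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    sigma = math.sqrt(step_variance)   # step size is given as a variance
    a, b = 0.1, 0.1
    current = log_target(a, b)
    accepted = 0
    for _ in range(n_steps):
        a_new = a + rng.gauss(0.0, sigma)
        b_new = b + rng.gauss(0.0, sigma)
        proposed = log_target(a_new, b_new)
        # Metropolis rule: accept with probability min(1, p(new) / p(old)).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            a, b, current = a_new, b_new, proposed
            accepted += 1
    return accepted / n_steps


for variance in (1e-4, 1e-2, 1.0):
    print(variance, average_acceptance_rate(variance))
```

Sweeping the variance like this gives the empirical counterpart of an acceptance-rate curve: very small steps are almost always accepted but explore slowly, while very large steps are mostly rejected.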

Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL

arXiv, 2024

Authors: Liu, Geling; Tan, Yunzhi; Zhong, Ruichao; Xie, Yuanzhen; Zhao, Lingchen; Wang, Qian; Hu, Bo; Li, Zang
Affiliations: School of Cyber Science and Engineering, Wuhan University, China; Big Data and AI Platform Department, Tencent, China; Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, China

Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy levels of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% compared to baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks. Copyright © 2024, The Authors. All rights reserved.

Keywords: Adversarial machine learning

A note on the asymptotic uniformity of Markov chains with random rates

arXiv, 2025

Authors: Calvert, Jacob; Den Hollander, Frank; Randall, Dana
Affiliations: Institute for Data Engineering and Science, Georgia Institute of Technology, Atlanta, GA 30332, United States; Mathematical Institute, Leiden University, Einsteinweg 55, 2333 CC Leiden, Netherlands; School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, United States

The stationary distribution of a continuous-time Markov chain is generally a complicated function of its transition rates. However, we show that if the transition rates are i.i.d. random variables with a common distribution satisfying certain tail conditions, then the resulting stationary distribution is close in total variation distance to the distribution that is proportional to the inverse of the exit rates of the states. This result, which generalizes and makes a precise prediction in [CBV+21], constitutes the first rigorous validation of an emerging physical theory of order in nonequilibrium systems. The proof entails showing that the stationary distribution of the corresponding "jump chain," i.e., the discrete-time Markov chain with transition probabilities given by the normalized transition rates, is asymptotically uniform as the number of states grows, which settles a question raised in [BCC12] under certain *** Codes 60J27, 82C05 © 2025, CC BY.

Keywords: Markov chains
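
The relationship described in this abstract can be checked numerically on a small example. If ν is the stationary distribution of the jump chain and q_i is the exit rate of state i, then the stationary distribution of the continuous-time chain is proportional to ν_i / q_i, so a nearly uniform ν implies a stationary distribution close to the one proportional to 1 / q_i. The sketch below draws i.i.d. rates from Uniform(0.5, 1.5), which is an illustrative assumption and not necessarily a distribution covered by the paper's tail conditions; the function names and the choice of 100 states are also hypothetical.

```python
import random


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def total_variation(p, q):
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


def jump_chain_stationary(rates, iterations=100):
    """Power iteration for the jump chain with P[i][j] = rates[i][j] / exit_rate[i]."""
    n = len(rates)
    exit_rates = [sum(row) for row in rates]
    P = [[rates[i][j] / exit_rates[i] for j in range(n)] for i in range(n)]
    nu = [1.0 / n] * n
    for _ in range(iterations):
        nu = [sum(nu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return nu, exit_rates


def compare(n=100, seed=0):
    """Return (TV(pi, inverse-exit-rate law), TV(nu, uniform)) for random rates."""
    rng = random.Random(seed)
    # i.i.d. off-diagonal rates; the Uniform(0.5, 1.5) law is an assumption.
    rates = [[0.0 if i == j else rng.uniform(0.5, 1.5) for j in range(n)]
             for i in range(n)]
    nu, exit_rates = jump_chain_stationary(rates)
    pi = normalize([nu[i] / exit_rates[i] for i in range(n)])
    inverse_exit = normalize([1.0 / q for q in exit_rates])
    uniform = [1.0 / n] * n
    return total_variation(pi, inverse_exit), total_variation(nu, uniform)


tv_pi, tv_nu = compare()
print(f"TV(pi, 1/q law) = {tv_pi:.4f}, TV(nu, uniform) = {tv_nu:.4f}")
```

Rerunning `compare` with larger `n` gives a rough feel for how both distances behave as the number of states grows, although a simulation of this kind is no substitute for the proof.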