Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Jin, Peng; Zhu, Bo; Yuan, Li; Yan, Shuicheng. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; Kunlun 2050 Research & Skywork AI, Singapore

In this work, we aim to simultaneously enhance the effectiveness and efficiency of Mixture-of-Experts (MoE) methods. To achieve this, we propose MoE++, a general and heterogeneous MoE framework that integrates both Feed-Forward Network (FFN) and zero-computation experts. Specifically, we introduce three types of zero-computation experts: the zero expert, copy expert, and constant expert, which correspond to discard, skip, and replace operations, respectively. This design offers three key advantages: (i) Low Computing Overhead: Unlike the uniform mixing mechanism for all tokens within vanilla MoE, MoE++ allows each token to engage with a dynamic number of FFNs, be adjusted by constant vectors, or even skip the MoE layer entirely. (ii) High Performance: By enabling simple tokens to utilize fewer FFN experts, MoE++ allows more experts to focus on challenging tokens, thereby unlocking greater performance potential than vanilla MoE. (iii) Deployment Friendly: Given that zero-computation experts have negligible parameters, we can deploy all zero-computation experts on each GPU, eliminating the significant communication overhead and expert load imbalance associated with FFN experts distributed across different GPUs. Moreover, we leverage gating residuals, enabling each token to consider the pathway taken in the previous layer when selecting the appropriate experts. Extensive experimental results demonstrate that MoE++ achieves better performance while delivering 1.1∼2.1× expert forward throughput compared to a vanilla MoE model of the same size, which lays a solid foundation for developing advanced and efficient MoE-related models. Copyright © 2024, The Authors. All rights reserved.
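The three zero-computation experts named in this abstract can be illustrated with a minimal sketch. It assumes the simplest reading of "discard, skip, and replace": the zero expert returns a zero vector, the copy expert returns its input unchanged, and the constant expert returns a fixed vector. The function names, the list-based vectors, the `constant` values, and the gate weights are all hypothetical and chosen for illustration; this is not the authors' implementation, which may, for instance, learn the constant vector and mix it with the input.

```python
from typing import List

Vector = List[float]


def zero_expert(token: Vector) -> Vector:
    """Discard: contribute nothing for this token."""
    return [0.0] * len(token)


def copy_expert(token: Vector) -> Vector:
    """Skip: pass the token through unchanged."""
    return list(token)


def constant_expert(token: Vector, constant: Vector) -> Vector:
    """Replace: substitute a fixed vector (assumed given here, not learned)."""
    if len(constant) != len(token):
        raise ValueError("constant must match the token dimension")
    return list(constant)


def combine(outputs: List[Vector], gates: List[float]) -> Vector:
    """Gate-weighted sum of the selected experts' outputs."""
    dim = len(outputs[0])
    return [sum(g * out[i] for g, out in zip(gates, outputs)) for i in range(dim)]


token = [1.0, -2.0, 3.0]
constant = [0.5, 0.5, 0.5]  # hypothetical constant vector
mixed = combine(
    [copy_expert(token), constant_expert(token, constant)],
    gates=[0.75, 0.25],  # hypothetical router weights
)
```

None of these experts performs a matrix multiplication, which is the sense in which they add negligible compute next to an FFN expert.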


Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

arXiv, 2024

Authors: Jin, Peng; Li, Hao; Cheng, Zesen; Li, Kehan; Yu, Runyi; Liu, Chang; Ji, Xiangyang; Yuan, Li; Chen, Jie. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, China; Department of Automation and BNRist, Tsinghua University, Beijing, China

Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily focus on the direct synthesis of global motions while neglecting the importance of generating and controlling local actions. In this paper, we propose the local action-guided motion diffusion model, which facilitates global motion generation by utilizing local actions as fine-grained control signals. Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis. During the diffusion process for synthesizing global motion, we calculate the local-action gradient to provide conditional guidance. This local-to-global paradigm reduces the complexity associated with direct global motion generation and promotes motion diversity via sampling diverse actions as conditions. Extensive experiments on two human motion datasets, i.e., HumanML3D and KIT, demonstrate the effectiveness of our method. Furthermore, our method provides flexibility in seamlessly combining various local actions and continuous guiding weight adjustment, accommodating diverse user preferences, which may hold potential significance for the community. The project page is available at https://***/GuidedMotion-project/. Copyright © 2024, The Authors. All rights reserved.

Keywords: Blending


Act as you wish: fine-grained control of motion diffusion model with hierarchical semantic graphs

Proceedings of the 37th International Conference on Neural Information Processing Systems

Authors: Peng Jin; Yang Wu; Yanbo Fan; Zhongqian Sun; Yang Wei; Li Yuan. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China and AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; Tencent AI Lab, China; School of Electronic and Computer Engineering, Peking University, Shenzhen, China and Peng Cheng Laboratory, Shenzhen, China and AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China

Most text-driven human motion generation methods employ sequential modeling approaches, e.g., transformer, to extract sentence-level text representations automatically and implicitly for human motion synthesis. However, these compact text representations may overemphasize the action names at the expense of other important properties and lack fine-grained details to guide the synthesis of subtly distinct motion. In this paper, we propose hierarchical semantic graphs for fine-grained control over motion generation. Specifically, we disentangle motion descriptions into hierarchical semantic graphs including three levels of motions, actions, and specifics. Such global-to-local structures facilitate a comprehensive understanding of motion description and fine-grained control of motion generation. Correspondingly, to leverage the coarse-to-fine topology of hierarchical semantic graphs, we decompose the text-to-motion diffusion process into three semantic levels, which correspond to capturing the overall motion, local actions, and action specifics. Extensive experiments on two benchmark human motion datasets, including HumanML3D and KIT, with superior performances, justify the efficacy of our method. More encouragingly, by modifying the edge weights of hierarchical semantic graphs, our method can continuously refine the generated motion, which may have a far-reaching impact on the community. Code and pre-trained weights are available at https://***/jpthu17/GraphMotion.


Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse

IEEE Transactions on Mobile Computing, 2025

Authors: Liu, Guangyuan; Du, Hongyang; Wang, Jiacheng; Niyato, Dusit; Kim, Dong In. Affiliations: Nanyang Technological University, College of Computing and Data Science, Energy Research Institute @ NTU, Interdisciplinary Graduate Program, Singapore; University of Hong Kong, Department of Electrical and Electronic Engineering, Hong Kong, Hong Kong; Nanyang Technological University, College of Computing and Data Science, Singapore; Sungkyunkwan University, Department of Electrical and Computer Engineering, Republic of Korea

The rapid advancement of immersive technologies has propelled the development of the Metaverse, where the convergence of virtual and physical realities necessitates the generation of high-quality, photorealistic images to enhance user experience. However, generating these images, especially through Generative Diffusion Models (GDMs), in mobile edge computing environments presents significant challenges due to the limited computing resources of edge devices and the dynamic nature of wireless networks. This paper proposes a novel framework that integrates contract-inspired contest theory, Deep Reinforcement Learning (DRL), and GDMs to optimize image generation in these resource-constrained environments. The framework addresses the critical challenges of resource allocation and semantic data transmission quality by incentivizing edge devices to efficiently transmit high-quality semantic data, which is essential for creating realistic and immersive images. The use of contest and contract theory ensures that edge devices are motivated to allocate resources effectively, while DRL dynamically adjusts to network conditions, optimizing the overall image generation process. Experimental results demonstrate that the proposed approach not only improves the quality of generated images but also achieves superior convergence speed and stability compared to traditional methods. This makes the framework particularly effective for optimizing complex resource allocation tasks in mobile edge Metaverse applications, offering enhanced performance and efficiency in creating immersive virtual environments. © 2002-2012 IEEE.

Keywords: Resource allocation


WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation

arXiv, 2023

Authors: Cheng, Zesen; Jin, Peng; Li, Hao; Li, Kehan; Li, Siheng; Ji, Xiangyang; Liu, Chang; Chen, Jie. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; Peng Cheng Laboratory, Shenzhen, China; Tsinghua University, Beijing, China

The top-down and bottom-up methods are two mainstreams of referring segmentation, while both methods have their own intrinsic weaknesses. Top-down methods are chiefly disturbed by Polar Negative (PN) errors owing to the lack of fine-grained cross-modal alignment. Bottom-up methods are mainly perturbed by Inferior Positive (IP) errors due to the lack of prior object information. Nevertheless, we discover that two types of methods are highly complementary for restraining respective weaknesses but the direct average combination leads to harmful interference. In this context, we build Win-win Cooperation (WiCo) to exploit complementary nature of two types of methods on both interaction and integration aspects for achieving a win-win improvement. For the interaction aspect, Complementary Feature Interaction (CFI) provides fine-grained information to top-down branch and introduces prior object information to bottom-up branch for complementary feature enhancement. For the integration aspect, Gaussian Scoring Integration (GSI) models the gaussian performance distributions of two branches and weighted integrates results by sampling confident scores from the distributions. With our WiCo, several prominent top-down and bottom-up combinations achieve remarkable improvements on three common datasets with reasonable extra costs, which justifies effectiveness and generality of our method. Copyright © 2023, The Authors. All rights reserved.

Keywords: Integration


Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

arXiv, 2023

Authors: Jin, Peng; Wu, Yang; Fan, Yanbo; Sun, Zhongqian; Wei, Yang; Yuan, Li. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China; Tencent AI Lab, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China

Most text-driven human motion generation methods employ sequential modeling approaches, e.g., transformer, to extract sentence-level text representations automatically and implicitly for human motion synthesis. However, these compact text representations may overemphasize the action names at the expense of other important properties and lack fine-grained details to guide the synthesis of subtly distinct motion. In this paper, we propose hierarchical semantic graphs for fine-grained control over motion generation. Specifically, we disentangle motion descriptions into hierarchical semantic graphs including three levels of motions, actions, and specifics. Such global-to-local structures facilitate a comprehensive understanding of motion description and fine-grained control of motion generation. Correspondingly, to leverage the coarse-to-fine topology of hierarchical semantic graphs, we decompose the text-to-motion diffusion process into three semantic levels, which correspond to capturing the overall motion, local actions, and action specifics. Extensive experiments on two benchmark human motion datasets, including HumanML3D and KIT, with superior performances, justify the efficacy of our method. More encouragingly, by modifying the edge weights of hierarchical semantic graphs, our method can continuously refine the generated motion, which may have a far-reaching impact on the community. Code and pre-training weights are available at https://***/jpthu17/GraphMotion. Copyright © 2023, The Authors. All rights reserved.

Keywords: Topology


Discover and align taxonomic context priors for open-world semi-supervised learning

Proceedings of the 37th International Conference on Neural Information Processing Systems

Authors: Yu Wang; Zhun Zhong; Pengchong Qiao; Xuxin Cheng; Xiawu Zheng; Chang Liu; Nicu Sebe; Rongrong Ji; Jie Chen. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China and AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; School of Computer Science, University of Nottingham, United Kingdom; School of Electronic and Computer Engineering, Peking University, Shenzhen, China and Department of Information Engineering and Computer Science, University of Trento, Italy; School of Electronic and Computer Engineering, Peking University, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China and Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University; Department of Automation, Tsinghua University, Beijing, China; Department of Information Engineering and Computer Science, University of Trento, Italy; School of Electronic and Computer Engineering, Peking University, Shenzhen, China and Peng Cheng Laboratory, Shenzhen, China and AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China

Open-world Semi-Supervised Learning (OSSL) is a realistic and challenging task, aiming to classify unlabeled samples from both seen and novel classes using partially labeled samples from the seen classes. Previous works typically explore the relationship of samples as priors on the pre-defined single-granularity labels to help novel class recognition. In fact, classes follow a taxonomy and samples can be classified at multiple levels of granularity, which contains more underlying relationships for supervision. We thus argue that learning with single-granularity labels results in sub-optimal representation learning and inaccurate pseudo labels, especially with unknown classes. In this paper, we take the initiative to explore and propose a uniformed framework, called Taxonomic context prIors Discovering and Aligning (TIDA), which exploits the relationship of samples under various granularity. It allows us to discover multi-granularity semantic concepts as taxonomic context priors (i.e., sub-class, target-class, and super-class), and then collaboratively leverage them to enhance representation learning and improve the quality of pseudo labels. Specifically, TIDA comprises two components: i) A taxonomic context discovery module that constructs a set of hierarchical prototypes in the latent space to discover the underlying taxonomic context priors; ii) A taxonomic context-based prediction alignment module that enforces consistency across hierarchical predictions to build the reliable relationship between classes among various granularity and provide additional supervision. We demonstrate that these two components are mutually beneficial for an effective OSSL framework, which is theoretically explained from the perspective of the EM algorithm. Extensive experiments on seven commonly used datasets show that TIDA can significantly improve the performance and achieve a new state of the art. The source codes are publicly available at https://***/rain305f/TIDA.


Deep peak property learning for efficient chiral molecules ECD spectra prediction

arXiv, 2024

Authors: Li, Hao; Long, Da; Yuan, Li; Tian, Yonghong; Wang, Xinchang; Mo, Fanyang. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, China; College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China; Peng Cheng Laboratory, Shenzhen, China; School of Materials Science and Engineering, Peking University, Beijing, China

Chiral molecule assignation is crucial for asymmetric catalysis, functional materials, and the drug industry. The conventional approach requires theoretical calculations of electronic circular dichroism (ECD) spectra, which is time-consuming and costly. To speed up this process, we have incorporated deep learning techniques for the ECD prediction. We first set up a large-scale dataset of Chiral Molecular ECD spectra (CMCDS) with calculated ECD spectra. We further develop the ECDFormer model, a Transformer-based model to learn the chiral molecular representations and predict corresponding ECD spectra with improved efficiency and accuracy. Unlike other models for spectrum prediction, our ECDFormer creatively focused on peak properties rather than the whole spectrum sequence for prediction, inspired by the scenario of chiral molecule assignation. Specifically, ECDFormer predicts the peak properties, including number, position, and symbol, then renders the ECD spectra from these peak properties, which significantly outperforms other models in ECD prediction. Our ECDFormer reduces the time of acquiring ECD spectra from 1-100 hours per molecule to 1.5 s. Copyright © 2024, The Authors. All rights reserved.
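The idea of rendering a spectrum from peak properties can be sketched as follows. The abstract does not say how the rendering is done, so this example assumes a simple Gaussian line shape of fixed width centred on each predicted position and multiplied by its sign. The function name, the `width` parameter, and the example peaks are hypothetical and do not come from ECDFormer.

```python
import math
from typing import List, Tuple

Peak = Tuple[float, int]  # (position in nm, sign: +1 or -1)


def render_spectrum(
    peaks: List[Peak], wavelengths: List[float], width: float = 10.0
) -> List[float]:
    """Sum one signed Gaussian per peak (assumed line shape, unit height)."""
    return [
        sum(
            sign * math.exp(-((w - pos) ** 2) / (2.0 * width ** 2))
            for pos, sign in peaks
        )
        for w in wavelengths
    ]


# Two hypothetical peaks: a positive band at 220 nm and a negative band at 260 nm.
peaks = [(220.0, +1), (260.0, -1)]
wavelengths = [float(w) for w in range(200, 301)]  # 200..300 nm in 1 nm steps
curve = render_spectrum(peaks, wavelengths)
```

Under this assumption, the number of peaks is the length of `peaks`, and the curve crosses zero midway between two peaks of opposite sign.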

Keywords: Forecasting


Aligning Instance Brownian Bridge with Texts for Open-Vocabulary Video Instance Segmentation

39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025

Authors: Cheng, Zesen; Li, Kehan; Hao, Li; Jin, Peng; Zheng, Xiawu; Liu, Chang; Chen, Jie. Affiliations: School of Electronic and Computer Engineering, Peking University, Shenzhen, China; Pengcheng Laboratory, Shenzhen, China; AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; Tsinghua University, Beijing, China; Xiamen University, Xiamen, China

ISBN: (print) 157735897X

Temporally locating objects with arbitrary class texts is the primary pursuit of open-vocabulary Video Instance Segmentation (VIS). Because of the insufficient vocabulary of video data, previous methods leverage image-text pretraining model for recognizing object instances by separately aligning each frame with class texts. As a result, the separation breaks the instance movement context of videos and requires a lot of inference overhead. To tackle these issues, we propose Bridge-Text Alignment (BTA) to link frame-level instance representations as a Brownian Bridge. On one hand, we can calculate the global descriptor of a Brownian bridge for capturing instance dynamics, which enables extra considering temporal information rather than only static information of each frame for aligning with texts. On the other hand, according to the goal-conditioned property of Brownian bridge, we can estimate the middle frame features via the start and the end frame features so the global feature calculation of a Brownian bridge only needs to infer a few frames, which largely reduces inference overhead. We term our overall pipeline as BriVIS. Following training settings of previous works, BriVIS surpasses the SOTA (OV2Seg) by a clear margin. For example, on the challenging large-vocabulary datasets (BURST, LVVIS), BriVIS achieves 5.7 and 20.9 mAP, which exhibits +2.2∼+6.7 mAP improvement compared to OV2Seg. Furthermore, after training via BTA, using only the head and the tail frames for alignment improves the speed by 32% (2.77 → 1.88 s/iter) while just decreasing the performance by 0.2 mAP (21.1 → 20.9 mAP). © 2025, Association for the Advancement of Artificial Intelligence (***). All rights reserved.
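The goal-conditioned property mentioned in this abstract rests on a standard fact: for a Brownian bridge pinned at value a at time 0 and value b at time T, the expected value at an intermediate time t is the linear interpolation a + (t/T)(b - a), and the variance is proportional to t(T - t)/T, vanishing at both endpoints. A minimal sketch of that estimate, applied per feature dimension, is given below. The function names, the list-based features, and the `sigma2` scale are illustrative assumptions, not BriVIS code.

```python
from typing import List


def bridge_mean(
    start: List[float], end: List[float], t: float, total: float
) -> List[float]:
    """Expected Brownian-bridge value at time t, pinned at `start` (t=0) and `end` (t=total)."""
    if total <= 0 or not 0 <= t <= total:
        raise ValueError("require total > 0 and 0 <= t <= total")
    alpha = t / total
    return [a + alpha * (b - a) for a, b in zip(start, end)]


def bridge_variance(t: float, total: float, sigma2: float = 1.0) -> float:
    """Per-dimension variance at time t; zero at both pinned endpoints."""
    if total <= 0 or not 0 <= t <= total:
        raise ValueError("require total > 0 and 0 <= t <= total")
    return sigma2 * t * (total - t) / total


# Hypothetical 2-D features for the first and last frame of a 5-frame clip (t = 0..4).
first_frame = [0.0, 2.0]
last_frame = [4.0, 6.0]
estimated_frame_1 = bridge_mean(first_frame, last_frame, t=1, total=4)
```

The uncertainty is largest at the midpoint of the clip and shrinks toward the two frames that are actually observed.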

Keywords: Brownian movement


Ultrasound Stimulation Potentiates Management of Diabetic Hyperglycemia

Ultrasound in Medicine and Biology, 2023, Vol. 49, No. 5, pp. 1259-1267

Authors: Chang, Chia-Hsuan; Fan, Kang-Chih; Cheng, Yuan-Pin; Chen, Jung-Chih; Chen, Gin-Shin. Affiliations: Graduate Degree Program of the College of Electrical and Computer Engineering, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan; Institute of Biomedical Engineering and Nanomedicine, National Health Research Institutes, Miaoli County, Taiwan; Division of Endocrinology and Metabolism, Department of Internal Medicine, National Taiwan University Hospital Hsinchu Branch, Hsinchu City, Taiwan; Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei City, Taiwan; Electronic Systems Research Division, National Chung-Shan Institute of Technology, Taoyuan City, Taiwan; Institute of Biomedical Engineering, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan; Department of Electrical and Computer Engineering, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan; Catholic Mercy Hospital, Catholic Mercy Medical Foundation, Hsinchu County, Taiwan; Medical Device Innovation & Translation Center, National Yang Ming Chiao Tung University, Hsinchu City, Taiwan

Objective: Glucose homeostasis is the only way to manage diabetic progression as all medications used do not cure diabetes. This study was aimed at verifying the feasibility of lowering glucose with non-invasive ultrasonic stimulation. Methods: The ultrasonic device was homemade and controlled via a mobile application on the smartphone. Diabetes was induced in Sprague–Dawley rats through high-fat diets followed by streptozotocin injection. The treated acupoint CV12 was at the middle of the xiphoid and umbilicus of the diabetic rats. Parameters of ultrasonic stimulation were an operating frequency of 1 MHz, pulse repetition frequency of 15 Hz, duty cycle of 10% and sonication time of 30 min for a single treatment. Discussion: The diabetic rats exhibited a significant decrease of 11.5% ± 3.6% in blood glucose in 5 min of ultrasonic stimulation (p ... © 2023 World Federation for Ultrasound in Medicine & Biology
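The stimulation parameters quoted in this abstract imply a few derived timing figures, worked out in the short sketch below. It assumes the usual definition of duty cycle as the fraction of each pulse repetition period during which the transducer is emitting; the variable names are illustrative and the derived values are not reported in the abstract itself.

```python
# Parameters as stated in the abstract.
carrier_hz = 1_000_000.0  # operating frequency: 1 MHz
prf_hz = 15.0             # pulse repetition frequency: 15 Hz
duty_cycle = 0.10         # 10 %
sonication_s = 30 * 60    # 30 min per treatment, in seconds

# Derived quantities (assuming duty cycle = on-time / repetition period).
period_s = 1.0 / prf_hz                     # about 66.7 ms between burst starts
burst_s = duty_cycle * period_s             # about 6.67 ms of emission per burst
cycles_per_burst = carrier_hz * burst_s     # about 6,667 carrier cycles per burst
total_on_time_s = duty_cycle * sonication_s # about 180 s of emission per treatment
```

Under that assumption, a single 30 min treatment delivers roughly 3 min of actual ultrasound emission.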

Keywords: Glucose
