Search Results - Inner Mongolia University Library

39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025

Authors: Lyu, Xingyu; Xu, Qianqian; Yang, Zhiyong; Lyu, Shaojie; Huang, Qingming. Affiliations: Key Lab. of Intelligent Information Processing, Institute of Computing Tech., CAS, China; School of Computer Science and Tech., University of Chinese Academy of Sciences, China; Tencent Corporate, China; BDKM, University of Chinese Academy of Sciences, China

ISBN: (print) 157735897X

Real-world datasets often exhibit a long-tailed distribution, where the vast majority of classes, known as tail classes, have only a few samples. Traditional methods tend to overfit on these tail classes. Recently, a new approach called Imbalanced SAM (ImbSAM) was proposed to leverage the generalization benefits of Sharpness-Aware Minimization (SAM) for long-tailed distributions. Its main strategy is to enhance the smoothness of the loss function for tail classes only. However, we argue that improving generalization in long-tail scenarios requires a careful balance between head and tail classes. We show that neither SAM nor ImbSAM alone can fully achieve this balance. For SAM, we prove that although it enhances the model's generalization ability by escaping saddle points in the overall loss landscape, it does not effectively do so for tail-class losses. Conversely, while ImbSAM is more effective at avoiding saddle points in tail classes, the head classes are trained insufficiently, resulting in significant performance drops. Based on these insights, we propose Stage-wise Saddle Escaping SAM (SSE-SAM), which uses the complementary strengths of ImbSAM and SAM in a phased approach. Initially, SSE-SAM follows the majority samples to avoid saddle points of the head-class loss. During the later phase, it focuses on tail classes to help them escape saddle points. Our experiments confirm that SSE-SAM is better at escaping saddle points on both head and tail classes, and shows performance improvements. Copyright © 2025, Association for the Advancement of Artificial Intelligence (***). All rights reserved.

Deep Learning Super-Resolution-Based Channel Completion for Massive MISO Systems

IEEE Signal Processing Letters, 2025, vol. 32, pp. 2254-2258

Authors: Zu, Keke; He, Yuhan; Chen, Hongyang; Zheng, Yu; Haardt, Martin. Affiliations: Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Zhejiang, China; Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China; Research Center for Graph Computing, Zhejiang Lab, Hangzhou, China; JD Intelligent Cities Research, Beijing, China; Communications Research Lab, Ilmenau University of Technology, Ilmenau, Germany

With the deployment of large-scale antenna arrays, the already limited time-frequency resources are becoming increasingly scarce. In this study, we propose a novel Laplacian Pyramid Channel Completion Network (LPCCNet) designed for channel completion, thereby reducing the demand for time-frequency resources in massive MIMO systems. Compared with existing network models, the proposed LPCCNet, by employing a progressive upsampling architecture, effectively mitigates aliasing effects, suppresses error propagation, and achieves a substantial reduction in computational complexity. The simulation results show that LPCCNet achieves a superior channel completion quality compared to existing methods, particularly in rapidly time-varying scenarios. © 1994-2012 IEEE.

Keywords: Channel estimation; Feature extraction; Image reconstruction; Interpolation; Training; Superresolution; Filters; Vectors; Convolutional neural networks; Time-frequency analysis

EFSA: Towards Event-Level Financial Sentiment Analysis

arXiv, 2024

Authors: Chen, Tianyu; Zhang, Yiming; Yu, Guoxin; Zhang, Dapeng; Zeng, Li; He, Qing; Ao, Xiang. Affiliations: Beijing, China; Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou, China; School of IoT Engineering, Jiangsu Vocational College of Information Technology, Wuxi, China; Information Technology Department I, Shenzhen Stock Exchange, China

In this paper, we extend financial sentiment analysis (FSA) to the event level, since events usually serve as the subject of the sentiment in financial text. Though extracting events from financial text may be conducive to accurate sentiment predictions, it poses specialized challenges due to the length and discontinuity of events in a financial text. To this end, we reconceptualize event extraction as a classification task by designing a categorization comprising coarse-grained and fine-grained event categories. Under this setting, we formulate the Event-Level Financial Sentiment Analysis (EFSA for short) task, which outputs quintuples consisting of (company, industry, coarse-grained event, fine-grained event, sentiment) from financial text. A large-scale Chinese dataset containing 12,160 news articles and 13,725 quintuples is publicized as a brand new testbed for our task. A four-hop Chain-of-Thought LLM-based approach is devised for this task. Systematic investigations are conducted on our dataset, and the empirical results demonstrate the benchmarking scores of existing methods and show that our proposed method can reach the current state of the art. Our dataset and framework implementation are available at https://***/cty1934/EFSA. Copyright © 2024, The Authors. All rights reserved.

Keywords: Large datasets

Sequence-Aware Online Container Scheduling with Reinforcement Learning in Parked Vehicle Edge Computing

IEEE Transactions on Vehicular Technology, 2025

Authors: Wu, Jianqiu; Guo, Jianxiong; Tang, Zhiqing; Luo, Chuanwen; Wang, Tian; Jia, Weijia. Affiliations: Beijing Normal-Hong Kong Baptist University, Guangdong Key Lab of AI and Multi-Modal Data Processing, Department of Computer Science, Zhuhai 519087, China; Beijing Normal University, Advanced Institute of Natural Sciences, Zhuhai 519087, China; Beijing Normal-Hong Kong Baptist University, Guangdong Key Lab of AI and Multi-Modal Data Processing, Zhuhai 519087, China; Beijing Forestry University, School of Information Science and Technology, Beijing 100083, China; Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China

Intelligent vehicles, often parked for long periods, are ideally suited to serve as computational nodes that expand the Mobile Edge Computing (MEC) infrastructure, with containerization significantly enhancing the system's load balancing, self-healing, resource isolation, and security. However, fluctuations in task demand and frequent container image downloads during peak hours create high loads on containerized nodes, as multiple mobile devices offload tasks simultaneously, leading to significant processing delays. Many existing studies make the simplifying assumption of predefined task arrival patterns, which overlooks this issue and leads to suboptimal decisions. In this paper, we consider a Parked Vehicles (PVs)-extended MEC scenario, where multiple devices request services on PVs functioning as edge servers, all controlled by a central base station. Task arrivals follow observed patterns based on long-term trends, such as peak and off-peak periods, resembling realistic arrival patterns rather than predefined ones. To optimize task offloading by identifying these patterns, we propose the Sequence-Aware Task Scheduling (SATS) algorithm, a policy gradient-based deep reinforcement learning approach that integrates Transformer and LSTM architectures to capture patterns in time-series task arrivals and relationships between nodes in a collaborative and containerized environment, thereby enhancing the efficiency of online task scheduling. The primary objective of SATS is to optimize the task offloading policy and minimize delay and energy consumption for all devices and PVs. Extensive numerical comparisons against baselines demonstrate the effectiveness and advantages of our algorithm. © 1967-2012 IEEE.

Keywords: Mobile edge computing

Rethink Video Retrieval Representation for Video Captioning

SSRN, 2024

Authors: Tian, Mingkai; Li, Guorong; Qi, Yuankai; Wang, Shuhui; Sheng, Quan Z.; Huang, Qingming. Affiliations: School of Computer Science and Technology, Key Lab of Big Data Mining and Knowledge Management, University of Chinese Academy of Sciences, China; School of Computing, Macquarie University, Australia; Key Laboratory of Intelligent Information Processing, Institute of Computer Technology, Chinese Academy of Sciences, China

Video captioning, a challenging task targeting the automatic generation of accurate and comprehensive descriptions based on video content, has witnessed substantial success recently driven by bridging video representations and textual semantics. Inspired by the nature of the video retrieval task, which learns visual features strongly related to text queries, we propose to take advantage of visual representation learning from the video retrieval framework to tackle video captioning tasks. However, a simple direct application of video retrieval models performs poorly due to the weak ability to capture sufficient video details and temporal information required for video captioning. To increase the attention on details, we propose a learnable token shift module, which flexibly captures subtle movements in local regions across the temporal sequence. Furthermore, we devise a Refineformer, which learns to integrate local video tokens strongly related to desired captions via a cross-attention mechanism. Extensive experiments on MSVD, MSR-VTT and VATEX demonstrate the favorable performance of our method. © 2024, The Authors. All rights reserved.

Keywords: Semantics

YOLOv9-SPE: A Novel Deep Learning Method for Classroom Student Behavior Detection

SSRN, 2024

Authors: Liu, Junxiu; Wu, Fengxiang; Lu, Baoshan; Qin, Sheng; Fu, Qiang; Luo, Yuling. Affiliations: Guangxi Key Lab of Brain-inspired Computing and Intelligent Chips, School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, China; Guangxi Wireless Broadband Communication and Signal Processing Key Laboratory, Guilin University of Electronic Technology, Guilin 541004, China

Deep learning can significantly enhance student behavior detection in classrooms. However, challenges such as small target recognition, blurry data, occlusion, and multi-scale detection persist. To address these, we propose YOLOv9-SPE, an improved model based on YOLOv9, incorporating Space-to-Depth Convolution (SPD-Conv), Partial Convolution (PConv), and an Efficient Multi-Scale Attention (EMA) Module. SPD-Conv enhances small target recognition and improves feature extraction for blurry data, while PConv, adapted from FasterNet, addresses occlusion issues. EMA is integrated to handle multi-scale targets in classroom student behavior detection. Experimental results on a student behavior dataset show that YOLOv9-SPE achieves a 2.1% improvement in mean average precision over the original YOLOv9. These enhancements provide a more accurate and efficient method for monitoring and analyzing student behavior in educational settings. © 2024, The Authors. All rights reserved.

Keywords: Contrastive Learning

Online Learning Behavior Analysis and Prediction Based on Spiking Neural Networks

Journal of Social Computing, 2024, vol. 5, no. 2, pp. 180-193

Authors: Yanjing Li; Xiaowei Wang; Fukun Chen; Bingxu Zhao; Qiang Fu. Affiliations: Institute of Education Science Research, Heilongjiang University, Harbin 150080, China; School of Cyberspace Security, Shandong University of Political Science and Law, Jinan 250014, China; School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China; Faculty of Electrical and Computer Engineering, University of Victoria, Victoria V8P 5C2, Canada; Guangxi Key Lab of Brain-Inspired Computing and Intelligent Chips, School of Electronic and Information Engineering, Guangxi Normal University, Guilin 541004, China

The vast amount of data generated by large-scale open online course platforms provides a solid foundation for the analysis of learning behavior in the field of *** study utilizes the historical and final learning behavior data of over 300,000 learners from 17 courses offered on the edX platform by Harvard University and the Massachusetts Institute of Technology during the 2012-2013 academic *** have developed a spiking neural network to predict learning outcomes, and analyzed the correlation between learning behavior and outcomes, aiming to identify key learning behaviors that significantly impact these *** goal is to monitor learning progress, provide targeted references for evaluating and improving learning effectiveness, and implement intervention measures *** results demonstrate that the prediction model based on online learning behavior using a spiking neural network achieves an accuracy of 99.80%. The learning behaviors that predominantly affect learning effectiveness are found to be students' academic performance and level of participation.

Keywords: online learning; learning outcomes prediction; learning behavior analysis; spiking neural network

MotionEditor: Editing Video Motion via Content-Aware Diffusion

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Shuyuan Tu; Qi Dai; Zhi-Qi Cheng; Han Hu; Xintong Han; Zuxuan Wu; Yu-Gang Jiang. Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University; Shanghai Collaborative Innovation Center of Intelligent Visual Computing; Microsoft Research Asia; Carnegie Mellon University; Huya Inc.

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

Existing diffusion-based video editing models have made gorgeous advances for editing attributes of a source video over time but struggle to manipulate the motion information while preserving the original protagonist's appearance and background. To address this, we propose MotionEditor, the first diffusion model for video motion editing. MotionEditor incorporates a novel content-aware motion adapter into ControlNet to capture temporal motion correspondence. While ControlNet enables direct generation based on skeleton poses, it encounters challenges when modifying the source motion in the inverted noise due to contradictory signals between the noise (source) and the condition (reference). Our adapter complements ControlNet by involving source content to transfer adapted control signals seamlessly. Further, we build up a two-branch architecture (a reconstruction branch and an editing branch) with a high-fidelity attention injection mechanism facilitating branch interaction. This mechanism enables the editing branch to query the key and value from the reconstruction branch in a decoupled manner, making the editing branch retain the original background and protagonist appearance. We also propose a skeleton alignment algorithm to address the discrepancies in pose size and position. Experiments demonstrate the promising motion editing ability of MotionEditor, both qualitatively and quantitatively. To the best of our knowledge, MotionEditor is the first to use diffusion models specifically for video motion editing, considering the original dynamic background and camera movement.

Keywords: Adaptation models; Computer vision; Heuristic algorithms; Noise; Dynamics; Diffusion models; Controllability

SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM

arXiv, 2024

Authors: Lyu, Xingyu; Xu, Qianqian; Yang, Zhiyong; Lyu, Shaojie; Huang, Qingming. Affiliations: Key Lab. of Intelligent Information Processing, Institute of Computing Tech., CAS, China; School of Computer Science and Tech., University of Chinese Academy of Sciences, China; Tencent Corporate, China; BDKM, University of Chinese Academy of Sciences, China

Real-world datasets often exhibit a long-tailed distribution, where the vast majority of classes, known as tail classes, have only a few samples. Traditional methods tend to overfit on these tail classes. Recently, a new approach called Imbalanced SAM (ImbSAM) was proposed to leverage the generalization benefits of Sharpness-Aware Minimization (SAM) for long-tailed distributions. Its main strategy is to enhance the smoothness of the loss function for tail classes only. However, we argue that improving generalization in long-tail scenarios requires a careful balance between head and tail classes. We show that neither SAM nor ImbSAM alone can fully achieve this balance. For SAM, we prove that although it enhances the model's generalization ability by escaping saddle points in the overall loss landscape, it does not effectively do so for tail-class losses. Conversely, while ImbSAM is more effective at avoiding saddle points in tail classes, the head classes are trained insufficiently, resulting in significant performance drops. Based on these insights, we propose Stage-wise Saddle Escaping SAM (SSE-SAM), which uses the complementary strengths of ImbSAM and SAM in a phased approach. Initially, SSE-SAM follows the majority samples to avoid saddle points of the head-class loss. During the later phase, it focuses on tail classes to help them escape saddle points. Our experiments confirm that SSE-SAM is better at escaping saddle points on both head and tail classes, and shows performance improvements. Copyright © 2024, The Authors. All rights reserved.

LCGen: mining in low-certainty generation for view-consistent text-to-3D

Proceedings of the 38th International Conference on Neural Information Processing Systems

Authors: Zeng Tao; Tong Yang; Junxiong Lin; Xinji Mai; Haoran Wang; Beining Wang; Enyu Zhou; Yan Wang; Wenqiang Zhang. Affiliations: Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering & Technology, Fudan University, Shanghai, China; Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China; School of Computer Science, Fudan University, Shanghai, China; Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China and Engineering Research Center of AI & Robotics, Ministry of Education, Academy for Engineering & Technology, Fudan University, Shanghai, China

ISBN: (print) 9798331314385

The Janus Problem is a common issue in SDS-based text-to-3D methods. Due to the view encoding approach and 2D diffusion prior guidance, the 3D representation model tends to learn content with higher certainty from each perspective, leading to view inconsistency. In this work, we first model and analyze the problem, visualizing the specific causes of the Janus Problem, which are associated with discrete view encoding and shared priors in 2D lifting. Based on this, we further propose the LCGen method, which guides text-to-3D generation to obtain different priors with different certainty from various viewpoints, aiding view-consistent generation. Experiments show that our LCGen method can be directly applied to different SDS-based text-to-3D methods, alleviating the Janus Problem without introducing additional information, imposing an excessive training burden, or compromising the generation quality. The project page is https://***/zeng-tao/LCGen.
